
This document is an addendum to the "Builder Builder" documentation. Its purpose is to explain in detail the THM2PBB command added to the most recent version of the application.
THM2PBB existed prior to Builder Builder and is distributed as a command line program. The purpose of this tool is to parse THML encoded files and create HTML files that are PBB ready. Bible references, footnotes, page numbers, etc. are all translated into PBB tags and the overall project is divided into a set of files that are manageable by the PBB compiler.
THML is short for "Theological Markup Language". It is a type of XML encoding that is specifically designed for theological documents. THML contains tags that specify things like scriptural references, ancient languages, verse structure, and so on, as well as tags that specify standard literary information like glossaries or footnotes. THML was developed by Calvin College for the documents archived on its CCEL web site (http://www.ccel.org). You can read more about THML at: http://www.ccel.org/ThML/ThML1.04.htm.
The original idea behind Builder Builder was to provide a useful html/PBB editor to go along with thm2pbb. THM2PBB does a good job of creating project files that are 'almost' ready to compile. The problem with the created files mostly involves HTML headers. In PBB documents it is important to distribute headers evenly and with a proper hierarchy because it is this header distribution that determines the structure of the contents panel in the finished Libronix resource. The problem with THML documents is that they are not constructed with PBB in mind and the distribution of headers is mostly a function of the whim of the document designer. Some documents don't use HTML header tags at all. Others use far too many from a PBB perspective. The bottom line is that once a user has created a set of PBB project files with thm2pbb there is invariably a fair amount of work to do to neaten things up and optimize the look for a Libronix resource window.
The first versions of Builder Builder that were distributed did not have the thm2pbb functionality built into them. It occurred to me that, past a certain point, the program was useful just as an HTML editor with PBB tagging utilities that are missing from other editors. Also, as time went on, the project grew larger and larger and, after a while, it became unclear whether I was ever going to be able to finish it. So I said to myself, "half a PBB editor is better than no PBB editor", and I released it. Since then additions to the program have been released on the fly. This latest release finally builds the 'thm2pbb' functionality right into the program so that users can do everything with one utility. "Builder Builder" is still a work in progress. With God's help, we'll get there eventually.
THM2PBB is invoked from a single command in the "Project" menu on the main menu bar.

Since this is a command that will only be used very occasionally (once per project) there is no toolbar button allotted for it. Executing the command will bring up a small dialog that will allow the user to choose whether to load a THML document from the web or locally from disk.

Selecting local will bring up a standard "Open File" dialog. The "Remote" button will open the Builder Builder "Load URL" dialog. If loading remotely from the CCEL site the user may need to log in to the site (Site registration is now required to gain access to CCEL files. There is no cost for registration). The user must also go through a CCEL donations page before access to the THML file is allowed. Both of these actions can be done from the "Load URL" window (if you have to log in you will need to navigate back to your document afterwards).
Open URL Dialog at the CCEL Donations Page

You must select a donation amount and click on "Continue" before you can get access to the THML file. If you click the "Load" button before the proper URL is loaded into the "URL" window you will end up loading the 'Donations' page or an error page into the XML document. Once a "membership_type" item has been appended to the URL string in the "URL" window you are ready to click "Load".
Correct URL - Click "Load"

Incorrect URL - Do not Click "Load"

Please consider making a donation to CCEL for the files that you download (if you can manage it). This archive has been a great benefit to the Christian community.
If the THML file is accessed remotely a copy of the file will be saved in the project directory. The saved file will have a file extension of ".thm" regardless of the extension used on the CCEL site. If the user needs to run thm2pbb more than once to generate an optimal PBB setup then the local file may be used in subsequent executions to avoid continually having to access the document through the browser.
Whatever method you use to access the THML document, once loading begins, a simple progress dialog will appear. There is no progress bar at this stage, just simple messages indicating the processing stages. Loading the document and calculating the initial parameters can take anywhere from a few seconds to a couple of minutes, depending on the access method and the size and complexity of the document.

Once the document is loaded the following THM2PBB Setup dialog will appear.
THM2PBB Setup Dialog


The "Project Name" is the name that will be placed in the "Title" element of each generated HTML file. It is also the name of the folder that will be created that will contain the generated project files. The project name always defaults to the file name of the THML file being parsed (without the extension) but it can be changed to anything the user wants.
The "Project Folder" contains the path of the folder that will be created to hold the generated files. The path to this folder defaults to the default "Builder Builder PBB" folder. The user can change this path by clicking on the button to the right of the 'Project Folder' edit box window. The folder that the user selects from the resulting "Select Folder" dialog is the folder in which the new 'Project Folder' will be created, 'not' the folder where the project files will be stored. If the user wants the generated files to go into an existing folder then the "Project Name" box must contain the name of that folder.
In the sample above the folder npnf106 would be created on my desktop and the project files placed in that. If I pressed the "Select Folder" button and selected "C:\mydocs" from the dialog then the Project folder window would be updated to read "C:\mydocs\npnf106" and that folder would be created. If I then changed the project name to read "My PBB Project" then the window would again be updated to read "C:\mydocs\My PBB Project" and the folder "C:\mydocs\My PBB Project" would be created.
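The way the "Project Folder" path follows the selected folder and the project name can be sketched in Python. This is only an illustration of the behaviour described above; the function name `project_folder` is hypothetical and is not part of Builder Builder.

```python
from pathlib import PureWindowsPath

def project_folder(selected_folder: str, project_name: str) -> str:
    """Join the folder picked in "Select Folder" with the project name.

    Hypothetical helper: the selected folder is the parent, and the
    project name becomes the folder that is actually created.
    """
    return str(PureWindowsPath(selected_folder) / project_name)

print(project_folder(r"C:\mydocs", "npnf106"))
print(project_folder(r"C:\mydocs", "My PBB Project"))
```

To send output into an existing folder, the project name would have to equal that folder's name and the selected folder would be its parent.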
The "Root File Name" is the name that will be used in all the generated file names. The generated files all have a 4 digit prefix followed by an underscore, the root file name and an 'html' extension. Therefore, if the starting file prefix number is 0 then the generated files in the instance above will be "0000_npnf106.html", "0001_npnf106.html", and so on. The root file name can be changed by the user but only alphanumeric characters and underscores will be accepted. If other characters are encountered in the file name when the edit box loses focus then it will revert back to the name that was previously in the box.
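The naming rule above can be sketched in a few lines of Python. The helper names `is_valid_root_name` and `output_file_name` are hypothetical and only mirror the rule as described here; the sketch also assumes that "alphanumeric" means plain ASCII letters and digits.

```python
import re

# Assumption of this sketch: letters, digits and underscores are ASCII only.
_ROOT_NAME = re.compile(r"[A-Za-z0-9_]+")

def is_valid_root_name(name: str) -> bool:
    """True if the name contains only letters, digits and underscores."""
    return _ROOT_NAME.fullmatch(name) is not None

def output_file_name(prefix: int, root: str) -> str:
    """Build a name such as '0000_npnf106.html' (4 digit prefix, root, .html)."""
    if not is_valid_root_name(root):
        raise ValueError(f"invalid root file name: {root!r}")
    return f"{prefix:04d}_{root}.html"

print(output_file_name(0, "npnf106"))
print(output_file_name(1, "npnf106"))
```

In the dialog an invalid name is not reported as an error; the edit box simply reverts to its previous contents. The exception here just stands in for that rejection.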
If the user sets the 'Project Folder' to an existing folder and thm2pbb detects valid PBB project files in that folder then a button labeled "Append to Project" will appear to the right of the 'Root File Name' box.

If this button is clicked then the "Root File Name" will change to whatever the root file name of the existing project is. Also, the starting file prefix (see thm2pbb parameters) will be updated to be the prefix of the last file in the project folder plus one.
E.G.
Suppose the user sets the "Project Folder" to an existing folder that contains project files with file names "0000_lxx.html", "0001_lxx.html", .. and so on. Further, suppose the last project file in this folder is "0122_lxx.html". In that case clicking the "Append to Project" button in the sample image above will change the "Root File Name" box contents from "npnf204" to "lxx" and the "Starting File Prefix" will be set to '123'. If "Run" is then executed new project files will be generated starting at "0123_lxx.html" and forward.
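The "Append to Project" calculation in this example can be sketched in Python as follows. The function `append_settings` is hypothetical, and the file name pattern is an assumption based on the naming scheme described earlier; how thm2pbb actually detects valid project files is not documented here.

```python
import re

# Assumed pattern: 4 digit prefix, underscore, root file name, ".html".
_PROJECT_FILE = re.compile(r"(\d{4})_([A-Za-z0-9_]+)\.html")

def append_settings(file_names):
    """Return (root_file_name, next_prefix) for an existing project.

    Returns None when no file name looks like a generated project file.
    """
    matches = [m for m in map(_PROJECT_FILE.fullmatch, file_names) if m]
    if not matches:
        return None
    last = max(matches, key=lambda m: int(m.group(1)))
    return last.group(2), int(last.group(1)) + 1

existing = ["0000_lxx.html", "0121_lxx.html", "0122_lxx.html", "notes.txt"]
print(append_settings(existing))
```

With the files from the example the sketch yields the root name "lxx" and a starting prefix of 123, so the next generated file would be "0123_lxx.html".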
The "Append to Project" button is actually a toggle. Clicking it will cause the button label to change to "Create New Project". Clicking it again will reset the fields back to the original "Root File Name" and a "File Prefix" of 0. The button label will also be reset on the second click.
If the user runs the utility on an existing project folder with the same "Root File Name" but doesn't "Append" the new files (e.g. running the command twice on the same THML file with identical parameters) then the old files will be overwritten. Old files that are not overwritten will not be deleted on successive executions of the command. The deletion of extraneous files must be managed by the user manually.

The Project Parameters determine file name numbering, file sizes and quantity.

The Starting file prefix simply specifies what number the output file name numbering will start at. If it is left at "0000" (default) then the file names will be something like "0000_project.html", "0001_project.html", .. etc. If this number is changed to something higher (like, for instance '0100') then the numbering will start at the value specified (e.g. "0100_project.html", "0101_project.html", .. ) .
The File Division parameters determine how many files the project will be divided into and the sizes of the files. THML logically divides a document's contents with hierarchical DIV tags. These tags are numbered "Div1, Div2, Div3, .. and so on". The number of div levels contained in any THML document is arbitrary.
THM2PBB utilizes the div levels to determine the file boundaries of the output. Div0 means all of the HTML output will go into one large file (This is useful for small books without a lot of PBB markup. e.g. Chesterton's fictional works). If Div1 is set then every spot in the THML file where a Div1 tag is encountered will close the current output file and create a new one. If Div2 is selected then every instance of a Div2 tag will create a new file, and so on.
The "Equal to/Less Than or Equal To" combo box modifies the file division parameter. If it is set to "Equal To" (=) then the output file boundaries will only occur at div level tags specified by the "Div Level" selection. If it is set to "Less Than or Equal To" (<=) then a new file will be generated wherever any div level tag up to and including the selected div level is encountered. A low div level can produce files that are too large for a satisfactory PBB resource. A high div level with the combo box set to "<=" will produce a lot of small files, some of them empty (these can just be deleted from your project, however your page numbering could be affected). Even in the latter case, however, some files may be generated that are larger than the user would like and would need to be divided manually into smaller files. Unfortunately this is determined by the amount of content contained within the div tags in the THML document and can't be avoided.
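The boundary rule for the two combo box settings can be written down as a small Python sketch. The function `starts_new_file` and its parameters are hypothetical names used only for illustration; the sketch models the rule as described above, not the actual thm2pbb code.

```python
def starts_new_file(tag_level: int, div_level: int, mode: str) -> bool:
    """Decide whether a DivN tag starts a new output file.

    tag_level is the N of the DivN tag just encountered, div_level is the
    "Div Level" selection, and mode is "=" or "<=".
    """
    if div_level == 0:
        return False  # Div0: all output goes into one large file
    if mode == "=":
        return tag_level == div_level
    if mode == "<=":
        return 1 <= tag_level <= div_level
    raise ValueError(f"unknown mode: {mode!r}")

levels = [1, 2, 3, 2, 1, 2]  # div tags in document order
print([starts_new_file(n, 2, "=") for n in levels])
print([starts_new_file(n, 2, "<=") for n in levels])
```

With "Div2" selected, the "=" setting splits only at the Div2 tags, while "<=" splits at both the Div1 and Div2 tags and therefore produces more files.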
The "Min File Size" box sets the minimum file size (in kilobytes) for thm2pbb output. During the parsing process, if a div tag is encountered that matches the div level parameters, it will be ignored if the current file size is less than the value set in this box. Setting this value to a quantity that is consistent with a good PBB file size will eliminate the problem of empty or very small files when a high file division level is selected.
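The effect of "Min File Size" can be sketched in Python. The function `split_sections` is hypothetical; it models each THML section as a (div level, size in kilobytes) pair and only handles the "Equal To" setting. As a further assumption, the very first tag never closes an empty file.

```python
def split_sections(sections, div_level, min_kb):
    """Group (div_level, size_kb) sections into output file sizes.

    A matching div tag is ignored while the current file is still
    smaller than min_kb, so its content stays in the current file.
    """
    files, current = [], 0
    for level, size_kb in sections:
        if level == div_level and current > 0 and current >= min_kb:
            files.append(current)  # close the current file
            current = 0
        current += size_kb
    if current > 0:
        files.append(current)
    return files

sections = [(1, 5), (1, 10), (1, 40), (1, 8), (1, 30)]
print(split_sections(sections, div_level=1, min_kb=0))
print(split_sections(sections, div_level=1, min_kb=30))
```

With a minimum of 0 every Div1 section becomes its own file (5, 10, 40, 8 and 30 kb). With a minimum of 30 kb the small sections are merged forward and only two files result (55 and 38 kb).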
Estimated file divisions and file sizes can be viewed in the thm2pbb Setup Dialog browser window. These views can aid the user in deciding an optimal file division setup for the current job. See the section entitled "THM2PBB Browser Window" for details.
The internal defaults for file division parameters are set as follows:
I have found that these settings are optimal for most circumstances. These defaults can be changed in the THM2PBB tab of the main preferences dialog. Some experimenting might discover default values that generally work better for the individual user. However this is not an exact science and there will be times when, depending on the content of the THML document being parsed, adjustments would need to be made to achieve an optimal distribution of files. On the other hand it may turn out to be easier in most cases for the user just to use the built in defaults (or personal defaults) and adjust the file sizes manually afterwards.

The distribution and hierarchy of HTML header tags in PBB source documents is very important in determining the structure of the contents pane in the PBB generated resource. Unfortunately, THML documents are not created with PBB in mind and the distribution of headers in those documents is very arbitrary. Some documents don't use headers at all because the creator prefers paragraph styles to format the visual output. Some use headers far too liberally and, if preserved, the extraneous headers would create a terrible jumble in the PBB generated contents pane.
THM2PBB provides two facilities to aid the user in correcting this problem. The "Remove Headers" button and the "Add Headers" button both generate dialogs that allow the user to manipulate which THML headers are kept, which are deleted and which are added. The "Remove Headers" dialog allows you to select which headers are removed from the output. The "Add Headers" dialog allows the user to specify paragraph styles (CSS Classes) that will be changed into HTML headers.

When the "Remove Headers" button is clicked the dialog pictured above will appear. This is a modal dialog so the user must click the "OK" button before proceeding with any other actions. Just check the headers that should be removed from the PBB output and click "OK". The command line version of thm2pbb has a "Remove Headers" option that, if specified, removes "all" THML embedded header tags from the HTML output. This version is enhanced a little bit to allow the user to selectively remove headers by type. For instance, the THML file might have a nice header distribution for PBB purposes but have H1 headers throughout because the document title is repeated every time a new section appears. Usually a PBB only wants one H1 header (or perhaps two) to designate the beginning and title of the resource. Removing just the "H1" headers will preserve the rest of the header distribution while ridding the document of extraneous "H1"s.
The Setup Dialog browser window will display samples of each header type that exists in the THML source as well as the quantity of each that exists. This information can aid the user in determining which headers, if any, should be removed from the HTML output. See the section "THM2PBB Browser Window" for details.

Clicking the "Add Headers" button will produce the "CSS Class -> Header" dialog pictured above. It is a non modal dialog. This allows the user to keep the dialog open so that the browser window may be scrolled to view the various CSS class samples (see "THM2PBB Browser Window") while "class -> header" transformations are entered in the dialog.
To enter a transformation select the paragraph class that is to be transformed from the "classes" dropdown, select the header type from the "headers" dropdown and click "Add". Each CSS class may only be transformed once but any number of classes may be changed to the same header type (e.g. both the classes 'bbook' and 'bref' may be transformed into "H3" elements if the user wishes). To remove a transformation just click on the entry in the list box to select it and click the "Remove" button.
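The rules for the transformation list (one header per class, many classes per header) can be modelled with a small Python class. `HeaderTransformations` is a hypothetical name, and the set of header types H1 to H6 is an assumption; the actual contents of the "headers" dropdown are not listed in this document.

```python
class HeaderTransformations:
    """Hypothetical model of the "CSS Class -> Header" list box."""

    HEADERS = ("H1", "H2", "H3", "H4", "H5", "H6")  # assumed dropdown contents

    def __init__(self):
        self._by_class = {}

    def add(self, css_class: str, header: str) -> None:
        """Add a transformation; a class may only be transformed once."""
        if header not in self.HEADERS:
            raise ValueError(f"unknown header type: {header!r}")
        if css_class in self._by_class:
            raise ValueError(f"class {css_class!r} is already transformed")
        self._by_class[css_class] = header

    def remove(self, css_class: str) -> None:
        """Remove a transformation if it exists."""
        self._by_class.pop(css_class, None)

    def as_dict(self) -> dict:
        return dict(self._by_class)

table = HeaderTransformations()
table.add("bbook", "H3")
table.add("bref", "H3")  # several classes may share one header type
print(table.as_dict())
```

Keying the table by class name is what enforces the "only once per class" rule, while nothing prevents two keys from sharing the same header value.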
Admittedly the class samples in the browser window do not give the user a very good view of how things will turn out in the final output when these "paragraph class -> header" transformations are used. In most cases it may be prudent to generate the project once, view the various files in the editor, keeping track of which styles would best be changed to headers, and then run it again with the transformations entered. However, in some documents the desirable transformations are obvious and a single run should do the trick. In a lot of cases the user won't want to use this functionality at all.
The "CSS Classes -> Headers" function may be a little antiquated now that the output is presented in the Builder Builder editor and the "Replace Dialog" can do the same thing while the user can see what's going on. On the other hand, Builder Builder is not finished and the planned "Project Management" features are missing. Therefore you can't do "Replace All" throughout all of the project files yet. It still may be easiest in some cases to do it with thm2pbb for the time being.

THM2PBB has four boolean options that determine what is preserved and what is removed from the final output when the document is processed. Defaults for these options can be set in the THM2PBB tab of the main preferences dialog.
| Option | Description |
|---|---|
| Keep THML Paragraph Classes | THML files usually have style sheet information embedded in them. The document layouts that you see in the CCEL pages are determined by the classes defined in these style sheets. It may not be desirable for a PBB builder to preserve these styles. In those cases where the styles are not wanted the class attributes are just clutter. If this option is turned off then the class attributes that are in the THML file will not be transferred to the HTML output. |
| Keep HTML Table and List Attributes | If the THML contains tables and/or lists then turning off this option will remove the attributes that determine the formatting of those elements. |
| Keep HTML Font Tags | This option will cause thm2pbb to preserve all font tags that are embedded in the THML document. Turn it off if you want to remove the default CCEL font styles or if you want all font styles to be controlled by CSS classes only. |
| Include Page Numbers | Some THML documents have page number information in them and some don't. If they do then the page numbers will be transferred to the PBB HTML output as PBB page milestone tags. If this option is set then THM2PBB will instead place page number tags throughout the book at regular intervals (about the number of lines contained in an average hard cover book page), beginning at the page number specified in the edit box to the right of the "Include Page Numbers" check box, and the original page numbers embedded in the THML document will be ignored. The "Headers and Classes" page that is presented in the web view window in the dialog has a line near the top that will tell the user whether or not the THML file has page numbers. Don't worry if the page numbering starts with a roman numeral designation. These are recognized as valid page numbers in Libronix. |

Most of the real estate in the thm2pbb Setup dialog is taken up by a browser window. This window is a simple web browser and its purpose is to provide the user with information about the THML document that is being processed. In the left hand portion of the dialog just above the "Parameters" group there are two buttons, "View THML Headers & Classes" and "View THML File Divisions". The "Headers and Classes" view is the one that is displayed when the dialog first appears. Clicking these buttons switches the browser between the two views.

The headers and classes view displays information about headers and CSS styles that are embedded in the THML document. This is the view that the user will see when the setup dialog first appears.
The first line under the main header tells the user whether or not the THML file contains page numbers and, if so, what the starting page number is. This aids the user in deciding whether or not to manually number pages in the HTML output files. In the example above the THML document contains page numbers starting at page "i". It is o.k. to retain the roman numeral style page numbers if you wish. Neither PBB nor Libronix minds lettered page numbers. Of course documents that have roman numeral style page numbers probably switch to regular arabic numerals at some point. Mixing like that is also o.k. It won't affect your PBB generated resource at all. You will simply see the page number style change from one to the other as you scroll through your resource.
The section under the page numbering info contains the "Header" information. Each header type that has at least one occurrence in the THML document is listed here. A sample from the THML document and the total number of headers of that type are presented. The samples shown are taken from the very first instance of the header type within the THML document. In the sample image above the first (and only) occurrence of an H1 header is the heading for "Indexes". That should tell you that the author of the THML document uses HTML headers only sparingly and only towards the end of the document. This particular example uses mostly CSS paragraph and span classes to format the text.
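The kind of inventory shown in this view (a count per header type plus the first sample of each) can be sketched with Python's standard `html.parser` module. The class `HeaderInventory` is hypothetical and is not how Builder Builder is implemented; it also assumes that headers are not nested inside one another.

```python
from html.parser import HTMLParser

class HeaderInventory(HTMLParser):
    """Count each header type and keep the text of its first occurrence."""

    HEADER_TAGS = {"h1", "h2", "h3", "h4", "h5", "h6"}

    def __init__(self):
        super().__init__()
        self.counts = {}    # e.g. {"h2": 2}
        self.samples = {}   # first text seen for each header type
        self._open = None   # header tag currently being read, if any
        self._text = []

    def handle_starttag(self, tag, attrs):
        if tag in self.HEADER_TAGS and self._open is None:
            self._open = tag
            self._text = []

    def handle_data(self, data):
        if self._open is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == self._open:
            self.counts[tag] = self.counts.get(tag, 0) + 1
            self.samples.setdefault(tag, "".join(self._text).strip())
            self._open = None

doc = "<h1>Indexes</h1><p class='bbook'>text</p><h2>First</h2><h2>Second</h2>"
inventory = HeaderInventory()
inventory.feed(doc)
inventory.close()
print(inventory.counts)
print(inventory.samples)
```

A header type with a high count and a repetitive sample (such as a book title) is a likely candidate for the "Remove Headers" dialog.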
Lower down in the "Headers and Classes View" is a section that contains the same sort of information for the CSS classes that are embedded in the THML document.

Each table in this section displays one CSS class along with a sample of text formatted with that class, demonstrating how the class will look in a browser (or Libronix) window. Again, the sample text for each class comes from the first instance of that class found in the THML document.
Using the information on this page, the converter can remove one or more header types or change paragraphs into headers. See "Headers and CSS Classes".

The "File Divisions View" presents the user with information about approximate file sizes and quantity for each combination of div level and "Equal to / Less Than or Equal To" setting. The file sizes and the number of files produced are only estimates based on the contents within the THML file (which in the end differs from the HTML output fairly significantly). So they are not accurate but they are close enough to give the user an idea of what will be produced with each configuration. This view will change under the following circumstances:
The file sizes are based on a "Minimum File Size" quantity of "0" and are never adjusted in the View panel. It is up to the user to estimate an optimal selection based on the "Min File Size" that is used. In the sample above you wouldn't use the "div level / Equal To or Less Than" selection shown if you could avoid it because some of the files listed are rather large for PBB. On the other hand, if you were to use this selection, you wouldn't use a "Min File Size" of 100kb because, in that case, the smaller files would just be appended to the beginning of the larger ones. From the information viewable in the image above a "Minimum File Size" of around 30kb would keep the distribution fairly even.
For more information see the section on "File Divisions"

Once the user is satisfied with all of the options and parameters the conversion process is invoked by clicking the "RUN" button. Once the process starts you can't go back, however there is no reason why the THM2PBB process can't be run on the same THML file more than once. As long as the user uses the same file/folder options (and the "Append to Project" option is not used) then subsequent conversions will just overwrite the existing files in the project folder. If the project settings are changed in later conversions such that a smaller number of files is created then the project folder will contain extraneous files from earlier conversions. It is up to the user to remove unwanted files before compiling with PBB.
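Finding leftover files after a second, smaller conversion can be sketched in Python. The function `extraneous_files` is hypothetical and only lists candidates; it assumes the file naming scheme described earlier and treats any project file numbered beyond the last file of the latest run as a leftover. Nothing is deleted.

```python
import re

def extraneous_files(folder_files, root, first_prefix, file_count):
    """List project files numbered beyond the last file just written.

    folder_files: names found in the project folder
    root:         root file name of the project
    first_prefix: starting file prefix of the latest run
    file_count:   number of files the latest run produced
    """
    pattern = re.compile(r"(\d{4})_" + re.escape(root) + r"\.html")
    limit = first_prefix + file_count  # first prefix NOT written this run
    stale = []
    for name in sorted(folder_files):
        match = pattern.fullmatch(name)
        if match and int(match.group(1)) >= limit:
            stale.append(name)
    return stale

folder = ["0000_lxx.html", "0001_lxx.html", "0002_lxx.html",
          "0003_lxx.html", "0004_lxx.html", "notes.txt"]
print(extraneous_files(folder, "lxx", first_prefix=0, file_count=3))
```

Here an earlier run produced five files and the latest run only three, so "0003_lxx.html" and "0004_lxx.html" are reported as candidates for manual removal.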
While the conversion is ongoing a progress dialog (complete with progress bar this time) will appear. I have found that even the largest THML files don't take much longer than a minute or so to complete. When the conversion is completed Builder Builder will open the first file from the new project into the editor window.
At this point your PBB project will be just about done. Most likely some work will need to be done on the title page. Some cleanup and rearrangement of headers throughout the project will also probably be in order. You may need to split a file or two into smaller files. All of this depends on the THML source that was used to generate the files and the parameters and options chosen during the THM2PBB setup.
During the PBB compile you may find some bible links that don't work (or don't link the way you want them to). The problem with non-working bible links has been mostly eliminated with this version of THM2PBB. However this can still occur because of (for instance) book abbreviations that Libronix doesn't recognize. The "way" things link is mostly a function of how the THML author decided that they should be linked. For instance a reference in the text like "John 3:16, John 3:23" may look to you like the whole thing should link to a verse range. Sometimes it will and sometimes it won't (i.e. it will yield two references instead, one to each verse). It all depends on how the "Scripref" tag in the THML document is encoded. Unfortunately corrections like this can only be done one instance at a time with the editor. On the other hand, I think most PBB authors will find these circumstances too few and far between and the projects themselves a bit too large to worry about little things like this. Virtually all of your bible references are going to work exactly how you expect them to work, every time.
For users who have used the command line version of thm2pbb there are some differences in this version that are worth noting.
Be Blessed.
John
http://www3.telus.net/jmccomb
jmccomb1@telus.net