The most recent version of this document can always be found online.
The preprocessor was originally introduced to allow users more flexibility in the HTML they generate.
The pre-processor allows AscToHTM and AscToRTF to be used as authoring tools, as opposed to simple text conversion or migration tools.
Preprocessor lines are not normally output to the HTML or RTF generated. Instead they are used to modify the conversion process in a number of ways.
The pre-processor can be used to mark sections in your document so that the program will correctly process them as you wish.
Examples include :-
This directive is used to divide the document up into named sections that may then be conditionally included/excluded from a particular conversion.
BEGIN/END_TABLE
BEGIN/END_DELIMITED_TABLE
BEGIN/END_COMMA_DELIMITED_TABLE
BEGIN/END_USER_TABLE (new in version 5.0)
These pairs of directives are used to bracket tables of various types in the source text. The software will attempt to detect plain text tables, but if this goes wrong, adding these commands can correct the analysis.
Within these tables you can use other TABLE pre-processor commands to tailor the HTML generated (see "The TABLE commands").
Used to mark up a contents list in the source document. The software will attempt to automatically detect the presence and location of any contents list in the document, but the algorithm can be problematic, and only really works for numbered headings.
Delimits a section of raw HTML code to be copied to the output file unchanged.
BEGIN/END_CODE
BEGIN/END_DIAGRAM
BEGIN/END_PRE
Delimits sections of pre-formatted text. CODE refers to software samples whilst DIAGRAM refers to ASCII art. PRE is the more general "pre-formatted" text, although currently all 3 have the same implementation.
Delimits text that should be ignored. This could be anything from comments to copyright statements in the original source file that shouldn't appear in the converted document.
Certain directives can be used to alter the document properties. Often these affect how the document will be searched and indexed.
In HTML these mostly lead to tags in the <HEAD>..</HEAD> of each page. Often these tags produce no visible effect.
In RTF these lead to fields in the document properties being filled in.
Examples include :-
TITLE
DESCRIPTION
KEYWORDS
STYLE_SHEET (HTML Only)
The DESCRIPTION and KEYWORDS commands may be continued on subsequent lines provided they also begin with the same $_$_<command> directive.
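As a rough sketch of the continuation rule, the lines below spread a description and a keyword list over two lines each. The title, description and keywords are invented for illustration, and the layout of directive followed by a space and then the value is assumed from the other directive examples in this manual.

```
$_$_TITLE Widget user guide
$_$_DESCRIPTION An introduction to installing and
$_$_DESCRIPTION configuring the widget.
$_$_KEYWORDS widget, installation,
$_$_KEYWORDS configuration
```

Each continuation line repeats its own directive, so a DESCRIPTION line is continued by another DESCRIPTION line and a KEYWORDS line by another KEYWORDS line.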
A large number of one-line directives exist. Those for tables are listed in the section on "The TABLE commands". Others include :-
CONTENTS_LIST
HTML_LINE
INCLUDE
LINERULE
NAVIGATION_BAR
TOC
A large number of in-line tags are available. These can be used to produce a number of useful effects. They include :-
BR (line break)
GOTO
HYPERLINK
TIMESTAMP
SPACES
SUPER and SUB
VARIABLE
These directives are used to tailor the HTML generated in any tables the software creates. They are placed either
At the top of the file
Directives placed here become defaults for the whole file, and will replace any policies that have been set (see the section on "Table Generation" in the AscToHTM manual).
Inside a BEGIN_TABLE ... END_TABLE section
Directives placed here will apply only to the table marked up by these commands (see 7.1.2).
The table commands are described (naturally enough) in the following table.
Directive | Value | Effect |
---|---|---|
TABLE_ALIGN | Align | Specifies the alignment of the whole table. |
TABLE_BGCOLOR | Colour | Colour of background |
TABLE_BORDER | Number | Size of border. 0 = None |
TABLE_BORDERCOLOR | Colour | Colour of border |
TABLE_CAPTION | Text | Table caption. Added centred at the top |
TABLE_CELL_ALIGN | Align | Specifies the default alignment of cells. Left, right or center |
TABLE_CELLSPACING | Number | Spacing between cells. |
TABLE_CELLPADDING | Number | Padding inside each cell |
TABLE_COLO(U)R_ROWS | (none) | If present this specifies that the odd and even rows of the table should be coloured differently. See also the "Colour data rows" policy. |
TABLE_CONVERT_XREFS | (none) | If present, indicates that any section cross-references in the table may be converted to hyperlinks (see also the policy line "Convert TABLE X-refs to links") |
TABLE_EVEN_ROW_COLO(U)R | Colour | When data rows are to be coloured this specifies the colour of the even numbered rows. |
TABLE_HEADER_ROWS | Number | Number of header rows. These will be placed in <TH> .. </TH> markup |
TABLE_HEADER_COLS | Number | Number of header columns. These will be marked up in bold |
TABLE_IGNORE_HEADER | (none) | If present, indicates that the first few lines (i.e. the header) should be ignored when calculating the column structure of the table. See also policy "Ignore table header during analysis" |
TABLE_LAYOUT | Layout | Explicit structure of table in terms of number of columns and their widths. See also policy "Default TABLE layout" |
TABLE_MAY_BE_SPARSE | (none) | If present, indicates that the TABLE may be sparse (see also the policy "Expect sparse tables") |
TABLE_MIN_COLUMN_SEPARATION | Number | Number of spaces to be taken as a column separator when analysing the table (see also the policy "Minimum TABLE column separation"). |
TABLE_ODD_ROW_COLO(U)R | Colour | When data rows are to be coloured this specifies the colour of the odd numbered rows. |
TABLE_WIDTH | Text | The width of the table (see also the policy "Default TABLE width") |
Colours should be HTML colours, which will be placed in the various attributes of the <BODY> tag and other tags. The program simply transcribes your value into the output file.
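To make the second placement concrete, here is a minimal sketch of a plain text table bracketed by BEGIN_TABLE and END_TABLE, with a few TABLE directives that apply only to that table. The table contents and the chosen values are invented for illustration, and the layout of directive followed by a space and then the value is an assumption based on the other directive examples in this manual.

```
$_$_BEGIN_TABLE
$_$_TABLE_BORDER 2
$_$_TABLE_HEADER_ROWS 1
$_$_TABLE_CAPTION Quarterly sales
Region     Q1     Q2
North      10     12
South       8      9
$_$_END_TABLE
```

Placing the same TABLE directives at the top of the file instead would make them the defaults for every table in the document.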
- NOTE
- This feature has the potential to cause mayhem, and as such is offered to users on an "as is" basis. That is, we offer no support for getting this feature to have the effect a user may desire. That said, it's one of the most useful tags we know :-)
This directive allows you to change a particular policy in part of a document. This is a potentially powerful feature, allowing you to tailor the conversion of your file in different sections of that file, or to embed the policy particular to a file in commands inserted at the top of the file itself.
The syntax of the command line is
$_$_CHANGE_POLICY <Policy Line>
where <Policy Line> is a policy line as it would appear in a policy file, and (usually) as it appears in the Policy manual.
For example the following would all be valid directives
$_$_CHANGE_POLICY Background Colour : red
$_$_CHANGE_POLICY Ignore multiple blank lines : Yes
How and when they take effect, however, will depend on the policy.
For example, the background colour would only take effect if splitting the file up, and only on the next file generation. This works, BTW, so if anyone wants to split a file into many pages, all different colours, then be my guest.
There are many caveats to this behaviour :-
- not all policies are supported
Not all policies may be changed in this way. In particular policies that open other policy files are not supported. Even if a policy is "changed", it does not follow that changing the policy will have an effect.
- analysis policies
It is unlikely that this feature can be sensibly used to influence the analysis of a file, other than when placed at the top of the file. Used in such a manner it is simply an alternative to using a separate policy file.
- output policies
Output policies are referenced at different times. Only those that are referenced after the line is read from the source file may be influenced, thus things like output file name may have no effect.
- toggleable policies
Not all policies, once changed, can be changed back. This is particularly true of policies that contain values to be added to a list. This is an issue that may be addressed in later versions.
- unpredictable behaviour
Messing with policies can cause unpredictable behaviour. For example if you alter the section splitting parameters, then the chances of a section cross-reference elsewhere in the document being calculated as a correct hyperlink diminish.
That's why this feature is offered UNSUPPORTED.
- readahead buffer
To further complicate matters, the software uses a readahead, write behind buffer which means that you may need to experiment with the placing of your policy change to within 40 lines (the size of the buffer).
This problem has been alleviated since version 3.2.
Using pre-processor tags you can define "blocks" of text known as "definition blocks".
Definition blocks allow blocks of output to be defined out of sequence, that is the content is defined in one location, and then may be instantiated at a number of different locations.
A definition block has the form
$_$_DEFINE_BLOCK <block name>
.. text that forms the block ..
$_$_END_BLOCK
The text inside the block may contain in-line tags, but it cannot contain any other tag directives.
To invoke a block use the EMBED_BLOCK or INSERT_BLOCK commands.
One tag that is particularly useful inside blocks is the VARIABLE tag. You can define variables throughout the document and then quote them inside a define block.
A possible example of use would be the addition of "page" footers. You could define the text that goes inside a page footer, and include in it a variable called PAGE_NUMBER. You can then re-define the PAGE_NUMBER and output a new page boundary with the commands
$_$_DEFINE_VARIABLE PAGE_NUMBER 21
$_$_INSERT_BLOCK PAGE_FOOTER
having previously defined a PAGE_FOOTER block.
It should perhaps be pointed out that "pages" are anathema to HTML, but should you want this feature this is a possible implementation.
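Putting the pieces together, a page footer scheme might be laid out roughly as follows. This is only a sketch: the page text is a placeholder, and the line inside the block stands in for footer text that quotes PAGE_NUMBER with the VARIABLE in-line tag, whose exact syntax is not shown in this section.

```
$_$_DEFINE_BLOCK PAGE_FOOTER
.. footer text that quotes PAGE_NUMBER using the VARIABLE in-line tag ..
$_$_END_BLOCK

.. text of page 21 ..
$_$_DEFINE_VARIABLE PAGE_NUMBER 21
$_$_INSERT_BLOCK PAGE_FOOTER

.. text of page 22 ..
$_$_DEFINE_VARIABLE PAGE_NUMBER 22
$_$_INSERT_BLOCK PAGE_FOOTER
```

The block is defined once, and each INSERT_BLOCK instantiates it with whatever value PAGE_NUMBER holds at that point.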
Some tags accept colour values. These values should be HTML colours which - for example - may be placed in the various attributes of the <BODY> tag.
You can enter any value acceptable to HTML. Normally a value is expressed as a 6-digit hexadecimal value in the range 000000 (black) to FFFFFF (white), but certain colours such as "white", "blue", "red" etc may also be recognised by HTML. The software (AscToHTM) simply transcribes your value into the output file. The list of colours recognised in the HTML standard is
Colour | HTML Hex value |
---|---|
Black | #000000 |
Silver | #C0C0C0 |
Gray | #808080 |
White | #FFFFFF |
Maroon | #800000 |
Red | #FF0000 |
Purple | #800080 |
Fuchsia | #FF00FF |
Green | #008000 |
Lime | #00FF00 |
Olive | #808000 |
Yellow | #FFFF00 |
Navy | #000080 |
Blue | #0000FF |
Teal | #008080 |
Aqua | #00FFFF |
Only these values will be converted by the software to the equivalent names. Other names exist outside the standard which may not be universally supported.
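As an illustration, a colour can be given either as a hex value or as one of the standard names. The two lines below are a sketch using TABLE directives from this manual. The particular values are arbitrary, and the leading "#" on the hex value follows the list above rather than any stated requirement of the software, which simply copies the value through.

```
$_$_TABLE_BGCOLOR #C0C0C0
$_$_TABLE_BORDERCOLOR Navy
```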
As far as possible tags support both British English and American English spellings. This mainly occurs with the word "colour" (or "color"), so for example the directives
$_$_TABLE_ODD_ROW_COLOUR ....
and
$_$_TABLE_ODD_ROW_COLOR ....
are equivalent.
Converted from a single text file by AscToHTM © 1997-2004 John A. Fotheringham