Documentation for the AscToPDF conversion utility
The latest version of these files is available online at http://www.jafsoft.com/doco/docindex.html
The pre-processor allows authors to add special lines to the source document to customise the conversion. It is most useful when you intend to regenerate a PDF regularly from a master text document.
The pre-processor is described more fully in a separate document, the Tag manual.
The pre-processor works by giving the software hints and instructions on how to process the text. During the analysis process the software reads the source files line-by-line. The pre-processor recognises special keywords in two ways: as directives and as in-line tags.
In both cases the tag or directive cannot be split over multiple lines: directives must be on a line by themselves, and in-line tags must be wholly contained on a single line.
Contents of this section
Pre-processor Directives
Pre-processor In-line tags
Document commands
- Pre-processor command: DESCRIPTION
- Pre-processor command: KEYWORDS
- Pre-processor command: TITLE
Section delimiters
- Pre-processor command: ALLOW and DISALLOW
- Pre-processor command: CONTENTS
- Pre-processor command: CODE
- Pre-processor command: COMMA_DELIMITED_TABLE
- Pre-processor command: DELIMITED_TABLE
- Pre-processor command: DIAGRAM
- Pre-processor command: IGNORE
- Pre-processor command: PRE
- Pre-processor command: TABLE
- Pre-processor command: SECTION
Other commands
- Pre-processor command: BR
- Pre-processor command: CHANGE_POLICY
- Pre-processor command: FILENAME
- Pre-processor command: IGNORE_THIS
- Pre-processor command: INCLUDE
- Pre-processor command: PAGE
- Pre-processor command: VERSION
"Directives" consist of a single line in the source file beginning with the string "$_$_" followed by a recognised keyword and any additional "attributes" that the directive supports.
In-line tags, as the name implies, can occur anywhere in the source lines. They are enclosed between the special strings "[[" and "]]". Between these strings the tag consists of a keyword and then any attributes that tag supports.
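As a rough illustration of the two forms, the following Python sketch classifies a source line. It is not AscToPDF's own parser: the function names and regular expressions are assumptions made for this example, based only on the syntax described above.

```python
import re

# Directive: the whole line starts with "$_$_", followed by a keyword
# and optional attributes (assumed here to run to the end of the line).
DIRECTIVE_RE = re.compile(r"^\$_\$_([A-Za-z_]+)\s*(.*)$")

# In-line tag: "[[" keyword attributes "]]", wholly contained on one line.
INLINE_TAG_RE = re.compile(r"\[\[([A-Za-z_]+)\s*(.*?)\]\]")


def parse_directive(line):
    """Return (keyword, attributes) if the line is a directive, else None."""
    match = DIRECTIVE_RE.match(line.rstrip("\n"))
    if match is None:
        return None
    return match.group(1), match.group(2).strip()


def find_inline_tags(line):
    """Return a list of (keyword, attributes), one per in-line tag on the line."""
    return [(kw, attrs.strip()) for kw, attrs in INLINE_TAG_RE.findall(line)]
```

Because the directive pattern is anchored at the start of the line, a "$_$_" string that appears part-way through a line is not treated as a directive in this sketch.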
These commands are used to control tags placed in the document information section of the created PDF page(s).
Not yet implemented in this release.
DESCRIPTION | Add a description in the document properties
KEYWORDS | Add keywords to the document properties
TITLE | Add a Title to the PDF pages
You can specify a description of your page to be added to the document information section of your file by adding a line of the form
$_$_DESCRIPTION <rest of line is used as a description>
This takes precedence over any description added via a policy file.
Not yet implemented
You can specify keywords that describe the contents of your page to be added to the document information section of your file by adding a line of the form
$_$_KEYWORDS <rest of line is used as a list of keywords>
This takes precedence over any keywords added via a policy file.
Not yet implemented
You can specify the TITLE to be added to your PDF document in the document information section, by adding a line of the form
$_$_TITLE <rest of line is used as a title>
This title takes precedence over any title added via a policy file.
Not yet implemented
These commands mark the start and end of various sections in your document
ALLOW and DISALLOW | Enable and disable certain types of detection
CONTENTS | Mark a section as the contents list
CODE | Mark a section as a C-like code sample
DIAGRAM | Mark a section as a diagram or ASCII Art
IGNORE | Ignore a section of the document
PRE | Mark a section as pre-formatted text
TABLE | Mark a section as a table
COMMA_DELIMITED_TABLE | Mark a section as a comma-delimited data table
DELIMITED_TABLE | Mark a section as a tab-delimited data table
SECTION | Mark the start of a user-specified section
AscToPDF will automatically try to detect various typographical features. You can turn this behaviour on and off in different sections by using the ALLOW and DISALLOW commands. This can be used, for example, to prevent a numbered list being wrongly detected as a numbered heading and vice versa.
The syntax for both commands is the same, namely
ALLOW/DISALLOW <comma-separated list of keywords>
Where the recognised keywords are as follows
Headings | Enables or disables the search for lines that could be treated as headings
Lists | Enables or disables the search for lines that could be regarded as list items (either unordered bullets, or alphabetic or numeric list points)
All | Set (enable) all of the above
Reset | Reset (disable) all of the above
In each case the tag will simply add or subtract from the current list of allowable features. To aid control, two special keywords "all" and "reset" are available for inclusion in the list. "Reset" will disable all options, thus
$_$_ALLOW reset, Headings
will have the effect of disabling everything (the "reset") and then adding "Headings" to the allowed list. In this respect "ALLOW all" and "DISALLOW reset" are identical commands.
Below is an example in which the DISALLOW tag is used to prevent numbered lines being regarded as lists or headings. The ALLOW tag at the end switches back to the default behaviour, so if there are any lists or numbered headings elsewhere in the document they will still be detected.
$_$_DISALLOW headings
...
1. Whatever this line is, it isn't a heading
...
$_$_DISALLOW headings,lists
...
2. Whatever this line is, it isn't a heading or a list item
...
$_$_ALLOW reset
You can mark up a section of your document as a contents list. To do this use matching BEGIN_CONTENTS and END_CONTENTS commands as follows:
$_$_BEGIN_CONTENTS
...
$_$_END_CONTENTS
AscToPDF will then attempt to treat the enclosed text as a contents list.
See comments on contents policies
You can mark up a section of your document as being a piece of sample C-like code. To do this use matching BEGIN_CODE and END_CODE commands as follows:
$_$_BEGIN_CODE
...
$_$_END_CODE
AscToPDF will then mark up the enclosed text in fixed width fonts.
See comments on pre-formatted text
These commands delimit a table of comma-delimited data
Syntax:
$_$_BEGIN_COMMA_DELIMITED_TABLE
...
(lines of comma-delimited data)
...
$_$_END_COMMA_DELIMITED_TABLE
The BEGIN_COMMA_DELIMITED_TABLE ... END_COMMA_DELIMITED_TABLE directives can be used to delimit a series of comma-delimited data values that should be interpreted as a table (e.g. data originally exported from a spreadsheet such as Excel).
See comments in Pre-processor command: TABLE
These directives delimit a table of delimited data
Syntax:
$_$_BEGIN_DELIMITED_TABLE [<delimiter>]
...
(lines of delimited data)
...
$_$_END_DELIMITED_TABLE
where
<delimiter> | The delimiter character to use. If omitted, the default is tab-delimited. The delimiter can be any character except a comma. For comma-delimited tables use the COMMA_DELIMITED_TABLE command instead.
The BEGIN_DELIMITED_TABLE ... END_DELIMITED_TABLE directives can be used to delimit a series of delimited data values that should be interpreted as a table (e.g. data originally exported from a spreadsheet such as Excel).
See comments in Pre-processor command: TABLE
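If the data starts life in a program rather than a file, the directives can be written around it automatically. The Python sketch below is one possible way to do that. The function name is hypothetical, and it assumes that no cell value contains the delimiter, since this section does not describe any quoting rules.

```python
def delimited_table_block(rows, delimiter="\t"):
    """Wrap rows of cell values in the matching BEGIN/END table directives."""
    if len(delimiter) != 1:
        raise ValueError("delimiter must be a single character")
    if delimiter == ",":
        # Commas are not accepted by DELIMITED_TABLE, so switch commands.
        begin = "$_$_BEGIN_COMMA_DELIMITED_TABLE"
        end = "$_$_END_COMMA_DELIMITED_TABLE"
    else:
        begin = "$_$_BEGIN_DELIMITED_TABLE"
        end = "$_$_END_DELIMITED_TABLE"
        if delimiter != "\t":
            # Tab is the default, so only other delimiters are named.
            begin += " " + delimiter
    lines = [begin]
    for row in rows:
        lines.append(delimiter.join(str(cell) for cell in row))
    lines.append(end)
    return lines
```

For example, `delimited_table_block(rows, "|")` starts the block with `$_$_BEGIN_DELIMITED_TABLE |`, while passing a comma switches to the COMMA_DELIMITED_TABLE directives.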
You can mark up a section of your document as being a diagram or a piece of ASCII art. To do this use matching BEGIN_DIAGRAM and END_DIAGRAM commands as follows:
$_$_BEGIN_DIAGRAM
...
$_$_END_DIAGRAM
AscToPDF will then mark up the enclosed text in fixed width fonts.
See comments on pre-formatted text
You can mark up a section in your document that you want ignored in the output. This can be used to store change history information or whatever you want.
Syntax:
$_$_BEGIN_IGNORE
...
(text to be ignored)
...
$_$_END_IGNORE
This markup can be used to delimit a section to be wholly ignored. Any markup and tags in the ignored section will have no effect.
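The effect can be pictured with a short Python sketch that drops ignored blocks from a list of source lines. It is only an illustration of the behaviour described above, not the program's implementation, and the function name is hypothetical.

```python
def strip_ignored(lines):
    """Drop everything from BEGIN_IGNORE to END_IGNORE, directives included."""
    kept = []
    ignoring = False
    for line in lines:
        if line.startswith("$_$_BEGIN_IGNORE"):
            ignoring = True
        elif line.startswith("$_$_END_IGNORE"):
            ignoring = False
        elif not ignoring:
            kept.append(line)
        # Lines inside the block are skipped, even if they look like directives.
    return kept
```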
You can mark up a section of your document as being pre-formatted text. To do this use matching BEGIN_PRE and END_PRE commands as follows:
$_$_BEGIN_PRE
...
$_$_END_PRE
AscToPDF will then mark up the enclosed text in fixed width fonts.
See comments on pre-formatted text
You can mark up a section of your document as being a text table. To do this use matching BEGIN_TABLE and END_TABLE commands as follows:
$_$_BEGIN_TABLE
...
$_$_END_TABLE
General comments on marking up tables
AscToPDF has some ability to auto-detect tables (see comments on pre-formatted text), but this can be error prone. Marking up tables removes a lot of the ambiguity and so can give better results.
For tables of delimited data (as opposed to plain text tables) you should use the DELIMITED_TABLE and COMMA_DELIMITED_TABLE commands.
Note that in each case the presence of these directives overrides any value set in the policy file, as those policy values only refer to the auto-detection of tables. Placing markup in the source forces the text to be treated as a table.
You can mark up sections of your document as being named sections. By default text belongs to a section called "all".
To do so, insert a SECTION command at the start of each section as follows:
$_$_SECTION <name>
...
All following text will be marked as belonging to the named section until another SECTION command is encountered. AscToPDF will only copy across those sections named in the allowable sections policy, and any text in "all" sections. In this way you can generate variants of your document for different audiences (e.g. Internet and Intranet).
If you want the rest of your document to be included in all conversions, insert an "all" SECTION command as follows:
$_$_SECTION all
...
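The selection rule can be sketched in Python as follows. This is a simplified model for illustration only: the function name is hypothetical, section names are compared exactly as written, and a SECTION line with no name is assumed to mean "all".

```python
def select_sections(lines, allowed):
    """Keep text in the allowed sections, plus any text in the "all" section."""
    allowed = set(allowed) | {"all"}
    current = "all"  # text belongs to "all" until a SECTION command says otherwise
    kept = []
    for line in lines:
        parts = line.split()
        if parts and parts[0] == "$_$_SECTION":
            # Assumption: a missing name falls back to "all".
            current = parts[1] if len(parts) > 1 else "all"
            continue  # the directive itself is not copied to the output
        if current in allowed:
            kept.append(line)
    return kept
```

Running the same source through `select_sections(lines, ["internet"])` and `select_sections(lines, ["intranet"])` would give the two audience-specific variants, with the "all" text present in both.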
BR | Insert a line break
CHANGE_POLICY | Dynamically vary policies through the input file
FILENAME | Output the original filename
IGNORE_THIS | Ignore some text in the source
INCLUDE | Include an external file into the source
PAGE | Create a page boundary at this location
VERSION | Output the program version used in this conversion
The BR command tells the software to output a line break at this point. By default, all lines flow together to form a paragraph. This command can be used, for example, in address lines to make sure each line is placed on a new line.
This option allows you to embed policy lines in the source document. This can be used to avoid the need for separate policy files, or to change the policy at different locations within the document (although the effects can sometimes be unpredictable).
The syntax is
$_$_CHANGE_POLICY <policy line as in policy file>
For example placing
$_$_CHANGE_POLICY Convert mailto links : yes
would make all subsequent email addresses be converted into working hyperlinks. By adding several lines of this type you can toggle this behaviour on and off, controlling which email links become hyperlinks and which do not.
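When the source is assembled by a script, the toggling described above can be generated rather than typed. The Python sketch below shows one way to do so. The helper names are hypothetical, and the value "no" is assumed to be the counterpart of the "yes" shown in the example.

```python
POLICY = "Convert mailto links"


def policy_line(name, value):
    """Build a CHANGE_POLICY directive from a policy name and value."""
    return "$_$_CHANGE_POLICY {} : {}".format(name, value)


def with_mailto_links(lines):
    """Surround lines with directives that switch mailto conversion on, then off."""
    # Assumption: "no" turns the policy back off.
    return [policy_line(POLICY, "yes")] + list(lines) + [policy_line(POLICY, "no")]
```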
The FILENAME in-line tag substitutes the name of the file being converted.
Syntax:
[[FILENAME]]
The tag will be replaced by the name of the file being converted. This facilitates the construction of sentences like
"This file was converted from [[FILENAME]] at [[TIMESTAMP]] "
which becomes
"This file was converted from asctopdf.txt at 17-Apr-2006"
IGNORE_THIS is an in-line tag whose contents are ignored. It can be used for comments.
Syntax:
[[IGNORE_THIS <anything_you_like>]]
This tag is ignored. It is replaced by a single space in the output stream. It could be used to add a brief comment to your source that would not appear in the output.
See also the IGNORE command
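The substitutions performed by the FILENAME, TIMESTAMP and IGNORE_THIS tags can be mimicked in a few lines of Python. This is a sketch for illustration, not the program's code: the function name is hypothetical, and the date format is inferred from the "17-Apr-2006" example rather than taken from any specification.

```python
import re
from datetime import datetime

IGNORE_THIS_RE = re.compile(r"\[\[IGNORE_THIS\b.*?\]\]")


def expand_inline_tags(line, filename, when):
    """Expand FILENAME, TIMESTAMP and IGNORE_THIS tags in one source line."""
    stamp = when.strftime("%d-%b-%Y")  # date format assumed from the example
    line = line.replace("[[FILENAME]]", filename)
    line = line.replace("[[TIMESTAMP]]", stamp)
    # An IGNORE_THIS tag is replaced by a single space.
    return IGNORE_THIS_RE.sub(" ", line)
```

Note that `%b` produces an English month abbreviation only under the default locale.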
You can include one source file in another by using the INCLUDE command as follows:
$_$_INCLUDE filename
Make sure the file is accessible from wherever AscToPDF is run, or in the same directory as the original source file. AscToPDF will read the file on each pass, treating its contents as part of the main file for both analysis and conversion purposes.
Note that the include file should be plain text, which will be converted as normal for the document. It may contain other pre-processor commands, including further INCLUDE commands, up to a limit of 9 levels. Be careful not to set up include loops (i.e. a includes b, b includes c, c includes a, etc.).
Include files like this can be a useful way of embedding standard disclaimers etc., and complement the use of headers and footers.
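To check a set of files for loops or excessive nesting before conversion, something like the following Python sketch could be used. It is a simplified model, not AscToPDF's behaviour: the function name is hypothetical, a dictionary stands in for files on disk, the main file is counted as level 0, and raising an error is an assumption, since this section does not say how the program reacts.

```python
MAX_INCLUDE_DEPTH = 9  # nesting limit quoted in the text above


def expand_includes(name, files, depth=0, active=()):
    """Return the lines of `name` with INCLUDE directives expanded in place.

    `files` maps file names to their text, standing in for disk access.
    `active` holds the chain of files currently being expanded.
    """
    if name in active:
        raise ValueError("include loop: " + " -> ".join(active + (name,)))
    if depth > MAX_INCLUDE_DEPTH:
        raise ValueError("includes nested more than %d levels" % MAX_INCLUDE_DEPTH)
    result = []
    for line in files[name].splitlines():
        parts = line.split(None, 1)
        if len(parts) == 2 and parts[0] == "$_$_INCLUDE":
            # Recurse into the included file, one level deeper.
            result.extend(
                expand_includes(parts[1].strip(), files, depth + 1, active + (name,))
            )
        else:
            result.append(line)
    return result
```

Tracking the chain of active files, rather than every file seen so far, lets the same disclaimer be included from several places without being reported as a loop.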
The syntax is
$_$_PAGE
This signals a page boundary. In PDF generation a page break will be generated at this point.
This in-line tag adds a description of the program name/version used to convert the files (e.g. "AscToPDF 2.1")
Syntax:
[[VERSION]]
The tag is replaced in the output file by the name and version of the program that performed the conversion.
Converted from a single text file by AscToHTM © 2006 John A Fotheringham