Documentation for the AscToHTM Text to HTML converter

Policies supported by AscToHTM

This chapter details all the policy options supported by AscToHTM.

An Alphabetical list of policies available in AscToHTM is provided below. The policies themselves are grouped according to the area of functionality they support, as follows :-

Analysis policies

'What to look for' policies
General analysis policies
Bullet policies
Contents list policies
File structure policies
Headings policies
Pre-formatted text policies
Table analysis policies

Output policies

Added HTML policies
Advanced HTML policies
Colour policies
CSS policies
Directory page policies
File generation policies
File splitting policies
Font policies
Frames policies
Hyperlink policies
Pre-processor policies
Style policies
Table generation policies
Configuration file policies

Other policies

Settings policies
Policies set elsewhere in the user interface
Policies that need to be added manually

Alphabetical list of policies available in AscToHTM

The following is a list of all the policies supported by this version of AscToHTM:-

Policy name	Default value	GUI location
Active Link Colour		Colour policies
Add <BR> to lines with URLs	No	Hyperlink policies
Add contents frame if possible	Yes	Frames policies
Add contents list	No	Contents List policies
Add emphasis and bold markup	Yes	Style policies
Add Frame border	Yes	Frames policies
Add mail headers to contents list	Yes	Contents List Policies
Add navigation bar	No	File Splitting policies
Add NOFRAMES links	Yes	Frames policies
Allow automatic centring	No	Style policies
Allow definitions inside PRE	Yes	Pre-processor policies
Allow email beginning with numbers	No	Hyperlink policies
Attempt TABLE generation	Yes	Table analysis policies
Author URL		Added HTML policies
Automatic centring tolerance	2	Style policies
Background Colour		Colour policies
Background Image		Added HTML policies
Base FONT Size in pts	10	(deprecated)
Bookmark URL		Added HTML policies
Break up long HTML lines	Yes	File generation policies
Bullet char		Bullet policies
Center first heading	No	Headings policies
Character encoding		Advanced HTML policies
Check domain name syntax	Yes	Hyperlink policies
Check indentation for consistency	Yes	Headings policies
Colour data rows	No	Table Generation Policies
Column boundaries have zero width	No	Table analysis policies
Column merging factor	5	Table analysis policies
Comment generation code	0	Advanced HTML policies
Contents frame background colour		Frame colours dialog
Contents frame text colour		Frame colours dialog
Contents Frame width	0	Frames policies
Contents style code	0	Policies that need to be added manually
Convert TABLE X-refs to links	No	Table Generation Policies
Copyright URL		Added HTML policies
Could be blank line separated	Yes	Table analysis policies
Create FTP links	Yes	Hyperlink policies
Create Gopher links	Yes	Hyperlink policies
Create hyperlinks	Yes	Hyperlink policies
Create mailto links	Yes	Hyperlink policies
Create NEWS links	Yes	Hyperlink policies
Create Telnet links	Yes	Hyperlink policies
Cross-refs at level	2	Hyperlink policies
Default Font	Times New Roman,,10	Font policies
Default TABLE alignment	(none)	Table Generation Policies
Default TABLE border colour		Table Generation Policies
Default TABLE border size	0	Table Generation Policies
Default TABLE caption		Policies that need to be added manually
Default TABLE cell alignment	(none)	Table Generation Policies
Default TABLE cell padding	2	Table Generation Policies
Default TABLE cell spacing	2	Table Generation Policies
Default TABLE colour		Table Generation Policies
Default TABLE delimiter character		Policies set elsewhere in the user interface
Default TABLE even row colour	E0F0E0	Table Generation Policies
Default TABLE header cols	0	Policies that need to be added manually
Default TABLE header rows	0	Table Generation Policies
Default table html attributes		Table Generation Policies
Default table html cell attributes		Table Generation Policies
Default TABLE layout		Table analysis policies
Default TABLE odd row colour	F0F0F0	Table Generation Policies
Default TABLE width		Table Generation Policies
Definition char	'-' (weak)	General analysis policies
Definitions file		Configuration file Policies
Delta FONT size in pts	0	(deprecated)
Directory Description		Directory Page Policies
Directory filename		Directory Page Policies
Directory footer file		Directory Page Policies
Directory header file		Directory Page Policies
Directory Keywords		Directory Page Policies
Directory return hyperlink text	Directory	Directory Page Policies
Directory Script file		Directory Page Policies
Directory Title		Directory Page Policies
Display messages	Yes	Settings Policies
Document Base URL		Advanced HTML policies
Document Description		Added HTML policies
Document Keywords		Added HTML policies
Document Style Sheet		CSS policies
Document Title		Added HTML policies
DOS filename root		File generation policies
Error reporting level	5	Settings Policies
Expect alphabetic bullets	Yes	Bullet policies
Expect blank lines between paras	Yes	General analysis policies
Expect Capitalised Headings	Yes	Headings policies
Expect code samples	No	File Structure policies
Expect Contents List	Yes	Contents List policies
Expect embedded Headings	Yes	Headings policies
Expect multiple COLSPAN values	No	Table analysis policies
Expect Numbered bullets	Yes	Bullet policies
Expect Numbered Headings	Yes	Headings policies
Expect Roman Numeral bullets	Yes	Bullet policies
Expect Second Word Headings	No	Policies that need to be added manually
Expect sparse tables	No	Table analysis policies
Expect Underlined Headings	Yes	Headings policies
External contents list filename		Contents List policies
First frame page number	1	Frames policies
First Section Number	1	Policies that need to be added manually
Fixed Font	Courier,,8	Font policies
Footer frame background colour		Frame colours dialog
Footer Frame depth	0	Frames policies
Footer frame text colour		Frame colours dialog
Fragments file		Configuration file Policies
Generate diagnostics files	No	File generation policies
Generate external contents file	No	Contents List policies
Hanging paragraph position(s)	0	General analysis policies
Header frame background colour		Frame colours dialog
Header Frame depth	0	Frames policies
Header frame text colour		Frame colours dialog
Heading Font	Arial,,10	Font policies
Heading key phrases		Headings policies
Headings Colour		Style policies
Highlight Definition Text	No	Style policies
HTML footer file		Advanced HTML policies
HTML header file		Advanced HTML policies
HTML Script file		Advanced HTML policies
Hyperlinks on numbers	No	Contents List policies
Ignore multiple blank lines	No	Style policies
Ignore table header during analysis	No	Table analysis policies
Include document section(s)	(none)	Pre-processor policies
Indent headings in Directory	Yes	Directory Page Policies
Indent position(s)	0	General analysis policies
Input file contains DOS characters	No	File Structure policies
Input file contains Japanese characters	No	File Structure policies
Input file contains Mime encoding	No	File Structure policies
Input file contains PCL codes	No	File Structure policies
Input file contains UNICODE characters	No	File Structure policies
Input file has change bars	No	File Structure policies
Input file has page markers	No	File Structure policies
Input file is double spaced	No	File Structure policies
Keep it simple	No	File Structure policies
Keep original table lines in output	No	Table Generation Policies
Largest allowed <Hn> tag	2	Style policies
Lines to ignore at end of file	0	File Structure policies
Lines to ignore at start of file	0	File Structure policies
Link Dictionary file		Configuration file Policies
Look for this and that emphasis	Yes	'What to look for' policies
Look for bullets	Yes	'What to look for' policies and Bullet policies
Look for character encodings	Yes	'What to look for' policies
Look for diagrams	Yes	'What to look for' policies
Look for hanging paragraphs	Yes	'What to look for' policies
Look for horizontal rulers	Yes	'What to look for' policies
Look for indentation	Yes	'What to look for' policies
Look for MAIL and USENET headers	Yes	'What to look for' policies
Look for preformatted text	Yes	'What to look for' policies
Look for quoted text	Yes	'What to look for' policies
Look for short lines	Yes	'What to look for' policies
Look for underlined text	Yes	'What to look for' policies
Look for white space	Yes	'What to look for' policies
Make Directory	No	Directory Page Policies
Max length of candidate table lines	256	Table analysis policies
Max length of heading links	0	Hyperlink policies
Maximum level to show in contents	0	Contents List policies
Maximum number of files per index page	0	Directory Page Policies
Min chapter size	8	General analysis policies
Min HTML file size	-1	File Splitting policies
Minimise HTML file size	No	File Splitting policies
Minimum automatic <PRE> size	-1	Pre-formatted text policies
Minimum ruler length	4	'What to look for' policies
Minimum TABLE column separation	1	Table analysis policies
New browser window name	other	Hyperlink policies
New frame link window name	_top	Frames policies
New Paragraph Offset	-1	General analysis policies
NOFRAMES link URL		Frames policies
Number of levels in contents frame	0	Frames policies
Number of words to include in filename	1	File Splitting policies
Omit <HEAD> and <BODY> from output	No	Advanced HTML policies
Only allow explicit FTP links	Yes	Hyperlink policies
Only allow pages to be viewed in frames	Yes	Frames policies
Only use known groups	Yes	Hyperlink policies
Open frame links in new window	Yes	Frames policies
Open links in new browser window	No	Hyperlink policies
Output file extension		File generation policies
Output frame name	*_frame.html	Frames policies
Output policy file	No	File generation policies
Output policy filename		File generation policies
Page marker size (in lines)	0	File Structure policies
Page Width	80	General analysis policies
Place document in frames	No	Frames policies
Preserve file structure using <PRE>	No	File generation policies
Preserve line structure	No	File generation policies
Preserve New Paragraph Offset	Yes	General analysis policies
Preserve underlining of headings	Yes	Style policies
Recognize '-' as a bullet	Yes	Bullet policies
Recognize 'o' as a bullet	Yes	Bullet policies
Recognised USENET groups		Hyperlink policies
Search for definitions	Yes	General analysis policies
Short line length	-1	General analysis policies
Shortcut Icon URL		Added HTML policies
Show file titles in directory	Yes	Directory Page Policies
Smallest allowed <Hn> tag	5	Style policies
Split at page markers	No	File Splitting policies
Split level	-1	File Splitting policies
Suppress all colour markup	No	Colour policies
Suppress INFO messages	No	Settings Policies
Suppress program ERROR messages	No	Settings Policies
Suppress TAG ERROR messages	No	Settings Policies
Suppress URL messages	Yes	Settings Policies
Suppress WARNING messages	No	Settings Policies
TAB size	8	General analysis policies
Table extending factor	5	Table analysis policies
Table Font	Times New Roman,,10	Font policies
Table of contents Font	Arial,,10	Font policies
Text Colour		Colour policies
Text command file	Yes	Configuration file Policies
Text justification	LEFT	File Structure policies
Treat each line as a paragraph	No	File generation policies
Unvisited Link Colour		Colour policies
Use <CODE>..</CODE> markup	No	Policies that need to be added manually
Use <DL> markup for defn. paras	Yes	Style policies
Use <EM> and <STRONG> markup	Yes	Style policies
Use <P> markup for paragraphs	Yes	Style policies
Use any existing contents list	Yes	Contents List policies
Use CSS to implement fonts	Yes	Font policies
Use DOS filenames	No	File generation policies
Use first heading as Title	No	Added HTML policies
Use .HTM extension	No	File generation policies
Use first line as heading	No	Headings policies
Use first line as Title	No	Added HTML policies
Use main footer in footer frame	Yes	Frames policies
Use main header in header frame	Yes	Frames policies
Use numbered sequence in file names	No	File Splitting policies
Use Preprocessor	Yes	Pre-processor policies
Visited Link Colour		Colour policies

Analysis policies

Menu location: Configuration Options -> Analysis Policies

Analysis policies are usually calculated by AscToHTM by making a first pass through your document. The resulting policies are then used during the second, conversion pass to categorise all input lines so that they may be correctly converted to HTML.

You should only need to change these policies if the analysis fails.

'What to look for' policies
Bullets
Contents
File generation
Headings Policies
General Analysis
Pre-formatted text
Table analysis

'What to look for' policies

Menu location: Configuration Options -> Analysis Policies -> What to look for

These policies act as "broad stroke" policies enabling or disabling areas of functionality within the software by telling it what to look for and to try to detect.

For example you can tell the program whether or not to bother looking for patterns of indentation, bullets, or numbered lists. In many cases if you enable a policy you can further fine tune the conversion details on other policy sheets.

Look for indentation
Look for paragraphs
Look for hanging paragraphs
Look for short lines
Look for horizontal rulers
- Minimum ruler length
Look for bullets and numbered lists
Look for definitions
Look for quoted text
Look for emphasis
Look for underlined text
Look for mail and USENET headers
Look for character encodings
Look for regions of preformatted text
Look for diagrams

Look for indentation

Menu location: Configuration Options -> Analysis Policies -> What to look for

AscToHTM can attempt to detect the indentation pattern of your document and replicate it in the output file. If you choose to disable this policy, all your text will be output with no indentation at all.

If the program is wrongly indenting your files, you can try adjusting the pattern of indentation on the General Analysis tabbed policy sheet.

Look for white space

Menu location: Configuration Options -> Analysis Policies -> What to look for

By default AscToHTM will attempt to look for paragraphs in your source. Usually a paragraph boundary is signalled by a blank line between paragraphs, a leading indent on the first line of each paragraph, or (in extreme cases) a short line at the end of a paragraph.

If you don't want AscToHTM to detect paragraphs, disable this policy.

If AscToHTM is wrongly detecting paragraphs, try adjusting the paragraph analysis policies on the General Analysis tabbed policy sheet.

Look for hanging paragraphs

(New in version 5.0)

Menu location: Configuration Options -> Analysis Policies -> What to look for

By default AscToHTM will attempt to look for paragraphs whose first line is offset relative to the rest of the paragraph. Where such an offset is detected, lines at different indentations will be treated as a single paragraph. Where detection fails, the first line of a paragraph may be treated as a separate indented paragraph.

If you don't want AscToHTM to detect these offsets, disable this policy.

See the discussion in Hanging paragraph indent detection.

Look for short lines

Menu location: Configuration Options -> Analysis Policies -> What to look for

By default AscToHTM will attempt to detect short lines and preserve their structure by adding a line break. Disabling this will cause short lines to be merged into the surrounding paragraph's text.

If AscToHTM is wrongly handling your short lines, you can adjust the short line cutoff point or the page width (which is used in short line detection) in the Sizes section of the General Analysis tabbed policy sheet.

Look for horizontal rulers

Menu location: Configuration Options -> Analysis Policies -> What to look for

By default AscToHTM will treat a series of hyphens, minus signs or equals signs on the same line as a horizontal rule. (On occasion such a line might instead be regarded as underlining a heading on the previous line.)

You can disable this if you wish, or you can specify how many "line" characters it takes to make a horizontal rule.

Minimum ruler length

Menu location: Configuration Options -> Analysis Policies -> What to look for

Specifies the minimum length of any line to be considered as a horizontal rule when the program is looking for such lines (see Look for horizontal rulers).

Lines in the source text that consist only of "rule-like" characters, and which exceed this minimum length will be marked up as Rules, rather than as the original character sequence.
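
As a rough illustration of the rule described above, the following Python sketch tests whether a line would qualify as a horizontal rule. The function name `is_ruler`, the exact set of "rule-like" characters, and the choice to accept a line of exactly the minimum length are assumptions made for illustration; this is not AscToHTM's actual implementation.

```python
RULE_CHARS = set("-=")  # assumed "rule-like" characters (hyphens and equals signs)


def is_ruler(line: str, min_length: int = 4) -> bool:
    """Return True if the line holds only rule-like characters and is
    at least min_length characters long (4 is the documented default)."""
    text = line.strip()
    return len(text) >= min_length and all(ch in RULE_CHARS for ch in text)


print(is_ruler("--------"))    # True
print(is_ruler("---"))         # False: shorter than the minimum
print(is_ruler("-- text --"))  # False: contains other characters
```

Raising `min_length` in this sketch mirrors raising the Minimum ruler length policy: fewer short runs of hyphens are treated as rules.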

Look for bullets and numbered lists

Menu location: Configuration Options -> Analysis Policies -> What to look for

By default AscToHTM will try to detect bullet points and numbered lists. This can sometimes go wrong if you have lines that look to the program like bullet points.

You can disable this behaviour should you wish. Alternatively you can fine tune the detection of bullets on the bullet analysis tabbed policy sheet.

Look for definitions

Menu location: Configuration Options -> Analysis Policies -> What to look for

By default AscToHTM will try to detect definitions and notes, usually in the form of a single word and a hanging paragraph.

This can often go wrong, so you can use this policy to disable this feature.

Look for quoted text

Menu location: Configuration Options -> Analysis Policies -> What to look for

By default AscToHTM will try to identify "quoted" lines. Quoted lines are lines that have had a single character (often ">" or "!") inserted at the start. This is common practice when quoting email in a reply. AscToHTM places such text in italics.

You can disable this behaviour should you wish.
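
The idea can be sketched in a few lines of Python. The helper name `mark_quoted` and the use of the `<I>` tag for italics are assumptions for illustration only; the real program may recognise other quote characters and emit different markup.

```python
from html import escape

QUOTE_CHARS = (">", "!")  # the two characters named in the text above


def mark_quoted(line: str) -> str:
    """Wrap a quoted line in italic markup; leave other lines unchanged.
    The text is HTML-escaped so that '>' survives in the output."""
    if line.lstrip().startswith(QUOTE_CHARS):
        return "<I>" + escape(line) + "</I>"
    return escape(line)


print(mark_quoted("> As I said before"))
print(mark_quoted("This is my reply"))
```

A sketch like this also shows why the policy can misfire: any ordinary line that happens to begin with "!" would be treated as quoted.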

Look for this and that emphasis

Menu location: Configuration Options -> Analysis Policies -> What to look for

AscToHTM will try to look for text that has been marked up with asterisks and underscores to signify bold and italic text. For example

*This is bold* and _this is italic_

becomes

This is bold and this is italic

This appears on-screen as "Look for emphasis"
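
A minimal Python sketch of this kind of substitution is shown below. The function name `add_emphasis`, the regular expressions, and the choice of `<B>` and `<I>` tags are assumptions for illustration; the tags AscToHTM actually emits depend on its style policies.

```python
import re

# Markers only count when they are not glued to a neighbouring word
# character, so names such as snake_case_name are left alone.
BOLD = re.compile(r"(?<!\w)\*(\S(?:.*?\S)?)\*(?!\w)")
ITALIC = re.compile(r"(?<!\w)_(\S(?:.*?\S)?)_(?!\w)")


def add_emphasis(text: str) -> str:
    """Convert *bold* and _italic_ plain-text markers to HTML tags."""
    text = BOLD.sub(r"<B>\1</B>", text)
    return ITALIC.sub(r"<I>\1</I>", text)


print(add_emphasis("*This is bold* and _this is italic_"))
# <B>This is bold</B> and <I>this is italic</I>
```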

Look for underlined text

Menu location: Configuration Options -> Analysis Policies -> What to look for

AscToHTM will try to detect where a line of text has been "underlined" by following it with a row of dashes, hyphens, equals signs etc. of the same length. This text will then be regarded as a candidate for being an underlined heading or, if those are not allowed, underlined text.

If you have tables and reports, you may want to switch this policy off since the line at the end of a table may appear to under- or over-line the last line of text in the table.
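
The check can be pictured with the Python sketch below. The name `is_underlined`, the set of underline characters, and the requirement that both lines share the same indentation are assumptions for illustration, not the program's actual test.

```python
UNDERLINE_CHARS = set("-=")  # assumed; the text lists dashes, hyphens, equals signs etc.


def is_underlined(line: str, next_line: str) -> bool:
    """Return True if next_line is a same-length row of one underline
    character sitting directly beneath the text of line."""
    text = line.rstrip()
    under = next_line.rstrip()
    marks = under.lstrip()
    if not text.strip() or not marks:
        return False
    uniform = len(set(marks)) == 1 and marks[0] in UNDERLINE_CHARS
    same_indent = (len(under) - len(marks)) == (len(text) - len(text.lstrip()))
    return uniform and same_indent and len(under) == len(text)


print(is_underlined("Chapter One", "==========="))  # True
print(is_underlined("Total   42", "-" * 10))        # True, yet this is a table row
```

The second call shows the table problem mentioned above: the closing line of a table can match the last data row just as well as a real underline matches a heading.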

Look for mail and USENET headers

Menu location: Configuration Options -> Analysis Policies -> What to look for

AscToHTM will try to look for email and USENET headers. Where these are recognised they can be simplified so that only the To, From and Subject lines are shown in the output.

You can disable this behaviour should you wish.

Look for character encodings

Menu location: Configuration Options -> Analysis Policies -> What to look for

Specifies whether or not the software should attempt to detect alternative character sets, for example those used for languages such as Greek, Turkish and Chinese.

The software does this by doing a statistical analysis on the characters used in the source file. This process isn't perfect, and when it fails you will need to manually set the correct character set using the Character encoding policy.

If you find the program is wrongly detecting the character set, disable this policy and/or manually set it using the Character encoding policy.

Look for preformatted text

Menu location: Configuration Options -> Analysis Policies -> What to look for

By default AscToHTM will try to identify regions of preformatted text. Once identified AscToHTM will try to decide if it's a diagram, table or some other form of preformatted text. If it thinks it's a table it will attempt to place the text in an appropriate table structure.

You can disable the search for preformatted text, or if you allow preformatted text, disable table generation. (This may be appropriate if you have a large number of ASCII diagrams in your text).

The search for preformatted text can be refined via the Pre-formatted text and Table analysis tabbed policy sheets.

The output of tables can be fine-tuned via the output policy Tables tabbed policy sheet.

Look for diagrams

Menu location: Configuration Options -> Analysis Policies -> What to look for

Specifies whether or not regions of preformatted text that are detected should be considered as candidate diagrams. Text that contains a number of characters such as "|", "-", ">" and "<" may be considered to be an ASCII diagram.

If you find the program is wrongly treating tables as diagrams then disable this policy.

General analysis policies

Menu location: Configuration options -> Analysis Policies -> General Layout

These policies aid AscToHTM's analysis by describing in detail the layout of the document being converted.

Sizes

Page Width
TAB Size
Short line length
Min Chapter Size

Paragraphs

Blank lines between paragraphs
New paragraph offset

Definitions

Search for definitions in source text
Definition paragraph indent levels
recognize hyphen characters
recognize colon characters
Other definition characters

Layout

Indentation levels

Page Width

Menu location: Configuration options -> Analysis Policies -> General Layout

This indicates the width (in characters) of your nominal output page. This width is calculated from the observed line lengths in the original document.

This width is used in short line calculation, and determining whether a given line contains a definition term or not (definition character near the start of the line).

In documents that contain line feeds this should be automatically detected.

In other documents you may need to set this manually.

TAB size

Menu location: Configuration options -> Analysis Policies -> General Layout

This indicates the size (in characters) of your tabs. AscToHTM converts all tabs to spaces before analysis. By default a tab size of 8 characters is assumed.

The tab size can influence the analysis of paragraph indentations and other layout. Provided they are used consistently there shouldn't be a problem. However where tabs and spaces are used in combination, mistakes can arise.

This is particularly true in tables of data. AscToHTM does not expect tab-separated table cells, instead converting the tabs to spaces and analysing the results.

If your source document was created with an editor that uses a different tab size, change this value if you start to experience strange layout conversion problems.
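
The effect of the tab size on indentation can be seen with Python's built-in `str.expandtabs`, which performs the same kind of tab-to-space conversion. The helper name `indent_after_tabs` is hypothetical; the sketch only illustrates the principle and says nothing about how AscToHTM is implemented.

```python
def indent_after_tabs(line: str, tab_size: int = 8) -> int:
    """Expand tabs to spaces, then count the leading spaces."""
    expanded = line.expandtabs(tab_size)
    return len(expanded) - len(expanded.lstrip(" "))


line = "\tName\tValue"
for tab_size in (4, 8):
    print(tab_size, indent_after_tabs(line, tab_size), repr(line.expandtabs(tab_size)))
# 4 4 '    Name    Value'
# 8 8 '        Name    Value'
```

The same source line lands at indent 4 or indent 8 depending on the tab size, which is why a mismatch with the editor that created the file can upset indentation and table analysis.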

Short Line Length

Menu location: Configuration options -> Analysis Policies -> General Layout

This policy is used to determine what is a "short line". Short lines are treated specially by AscToHTM, which adds a line break (<BR>) on the end. They can also be used to detect ends of paragraphs in those documents that don't have blank lines between paragraphs.

Normally AscToHTM will determine whether or not a line is short by comparing it to the page width, given the current context.

The default value is 0 characters (indicating a comparison to Page Width should be used). Set this to any value you like. A value of 80 is likely to make every line in your original document have a <BR> tag on the end.
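
The decision can be sketched in Python as follows. The function name `is_short_line` and, in particular, the `fraction` of the page width used when no explicit length is set are assumptions invented for this illustration; the document does not state how AscToHTM compares a line to the Page Width.

```python
def is_short_line(line: str, page_width: int = 80,
                  short_line_length: int = 0, fraction: float = 0.5) -> bool:
    """Return True if the line counts as "short".

    A positive short_line_length is used as an explicit cutoff;
    otherwise the line is compared to an assumed fraction of page_width.
    """
    length = len(line.rstrip())
    if length == 0:
        return False  # blank lines separate paragraphs; they are not short lines
    if short_line_length > 0:
        return length < short_line_length
    return length < page_width * fraction


print(is_short_line("Dear Sir,"))                       # True
print(is_short_line("x" * 70))                          # False
print(is_short_line("x" * 70, short_line_length=80))    # True
```

The last call shows why a cutoff of 80 on an 80-column document would mark almost every line as short.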

Min Chapter Size

Menu location: Configuration options -> Analysis Policies -> General Layout

This policy tells AscToHTM what the smallest chapter size may be. This is used when trying to determine if a numbered line is a chapter heading. AscToHTM tries to avoid treating numbered lists as a series of small chapters using this policy.

The default value is 8 lines. Change this only if you suspect small chapters are being ignored, or large list items are being treated as chapter headings.

Expect blank lines between paras

Menu location: Configuration options -> Analysis Policies -> General Layout

AscToHTM can detect whether or not it should expect blank lines between paragraphs. Documents without blank lines between paragraphs will be harder to convert, and errors are more likely. Unfortunately text documents exported from Word for Windows often have this property.

Where there are no blank lines, AscToHTM relies on spotting the last line of a paragraph (usually shorter), and (in some documents) the presence of a hanging indent at the start of each new paragraph.

This should be automatically detected.

New Paragraph Offset

Menu location: Configuration options -> Analysis Policies -> General Layout

Some documents start the first line of a new paragraph with an offset of a number of characters. This is especially true in text files saved from Word for Windows documents.

AscToHTM can sometimes confuse such paragraphs as being two different levels of indentation. Use this policy to eliminate such confusion.

This should be automatically detected.

Search for definitions

Menu location: Configuration options -> Analysis Policies -> General Layout

This policy can be used to disable the search for definitions. Sometimes the search leads to unexpected results, with text that is not part of a definition being treated as such. In such cases you can adjust the definition policies, but if this still fails, use this policy to disable the search completely.

Hanging paragraph position(s)

Menu location: Configuration options -> Analysis Policies -> General Layout

This policy identifies the indentations used for the follow-on text in definition paragraphs. These indentation levels need not be the same as the indentation levels used for normal text, though of course often they are.

This should be detected automatically, but if your document has only a few examples it's possible AscToHTM will ignore them. In such cases you may need to set this policy manually.

This appears on-screen as "Definition paragraph indent levels"

Recognize hyphen characters

Menu location: Configuration options -> Analysis Policies -> General Layout

This policy specifies whether or not hyphen (-) characters are used in one-line definitions.

If the hyphen character only occurs in definitions, then set the nearby always flag, otherwise AscToHTM will have to guess whether a particular character is part of a definition or not. This is sometimes a source of conversion errors.

If this policy is selected, it will result in a suitable Definition Char line being added to the policy file.

This should be detected automatically.

Recognize colon characters

Menu location: Configuration options -> Analysis Policies -> General Layout

This policy specifies whether or not colon (:) characters are used in one-line definitions.

If the colon character only occurs in definitions, then set the nearby always flag, otherwise AscToHTM will have to guess whether a particular character is part of a definition or not. This is sometimes a source of conversion errors.

If this policy is selected, it will result in a suitable Definition Char line being added to the policy file.

This should be detected automatically.

Definition Char

Menu location: Configuration options -> Analysis Policies -> General Layout

This policy specifies a character that may be used in one-line definitions.

This may be detected automatically, but more likely you'll need to specify it yourself.

This specifies the characters used to detect "definitions". A definition line is a single line that appears to be defining something. Usually this is a line with either a colon (:) or an equals sign (=) in it. For example

        IMHO = In my humble opinion
        Address : Somewhere over the rainbow.

The character can be marked as "strong" or "weak". Strong means such characters always signal a definition. Weak means they only sometimes do this, depending on the position relative to the Page Width.

For the above example you'd expect policy lines in the form

        Definition Char : '=' (strong)
        Definition Char : ':' (strong)

This appears on-screen as "Other definition characters", and each character entered into that field will get its own "Definition Char" policy line.

The user interface presents this more cleanly than the text in the policy file does, and the latter may be changed in later versions.
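
The strong/weak distinction can be illustrated with the Python sketch below. The function name `find_definition`, the `DEFAULT_CHARS` mapping, and the rule that a weak character only counts within the first quarter of the Page Width are all assumptions for illustration; the document says only that weak characters depend on their position relative to the Page Width.

```python
DEFAULT_CHARS = {"=": "strong", ":": "strong", "-": "weak"}  # illustrative settings


def find_definition(line, page_width=80, chars=DEFAULT_CHARS):
    """Return (term, meaning) if the line looks like a one-line definition."""
    weak_limit = page_width // 4  # assumed meaning of "near the start of the line"
    for pos, ch in enumerate(line):
        strength = chars.get(ch)
        if strength is None:
            continue
        term, meaning = line[:pos].strip(), line[pos + 1:].strip()
        if not term or not meaning:
            continue  # need text on both sides of the character
        if strength == "strong" or pos < weak_limit:
            return term, meaning
    return None


print(find_definition("        IMHO = In my humble opinion"))
print(find_definition("        Address : Somewhere over the rainbow."))
```

A sketch this simple would also treat a hyphenated word near the start of a line as a weak definition, which hints at why weak characters are described as a source of conversion errors.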

Indent position(s)

Menu location: Configuration options -> Analysis Policies -> General Layout

AscToHTM recognises multiple levels of indentation. This policy shows the character levels at which indentation has been detected.

AscToHTM converts all tab characters into multiple spaces on input. These indentation positions are the positions that result after that conversion. Depending on your tab settings these might not be exactly the positions you would expect.

Normally these levels are correctly detected automatically, but should you wish to set them manually you may need to experiment slightly to see how AscToHTM has handled your tabs.
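
When experimenting, it can help to see which positions result from a given tab size. The Python sketch below collects the indentation positions used in a list of lines after tab expansion. The function name `indent_positions` and the `min_count` threshold for ignoring one-off indents are assumptions for illustration, not a description of AscToHTM's analysis pass.

```python
from collections import Counter


def indent_positions(lines, tab_size=8, min_count=2):
    """Return the sorted indent positions used by at least min_count lines."""
    counts = Counter()
    for line in lines:
        expanded = line.expandtabs(tab_size)
        if expanded.strip():  # ignore blank lines
            counts[len(expanded) - len(expanded.lstrip(" "))] += 1
    return sorted(pos for pos, n in counts.items() if n >= min_count)


sample = ["Intro", "More intro", "\tItem", "        Item", "    Sub", "    Sub two", "  once"]
print(indent_positions(sample))              # [0, 4, 8]
print(indent_positions(sample, tab_size=4))  # [0, 4]
```

With a tab size of 4 the tabbed line merges into the 4-column level and the lone 8-column line falls below the threshold, so the detected positions change.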

Bullet policies

Menu location: Conversion Options -> Analysis Policies -> Bullets and lists

AscToHTM should be able to detect the use of bullets on a reasonably sized document. These policies describe the type of bullets expected.

Automatically detect bullets and numbered lists

Expected Bullet types

numbered bullets
alphabetic bullets
roman numeral bullets

Bullet characters

recognize hyphen character as a bullet point
recognize an "o" character as a bullet point
Other bullet point characters

See also the discussion on Bullets and list detection

Look for bullets

Menu location: Conversion Options -> Analysis Policies -> Bullets and lists

This policy states whether or not the program should attempt to automatically detect bullets and numbered lists. This should normally be left on unless your document has no such features, but the program (wrongly) thinks it has.

This appears on the Bullets dialog as "Automatically detect bullets and numbered lists", but is identical to the "Look for bullets" policy on the 'What to look for' policies tabbed property sheet

Expect Numbered bullets

Menu location: Conversion Options -> Analysis Policies -> Bullets and lists

This policy states whether or not numbered bullet points are expected. The numbered bullets can be followed by any punctuation, thus 1., 2) and (3) will all be recognised, but HTML will not necessarily support this in the markup produced.

This should be automatically detected.

Expect alphabetic bullets

Menu location: Conversion Options -> Analysis Policies -> Bullets and lists

This policy states whether or not alphabetic bullet points are expected. The alphabetic bullets can be followed by any punctuation, thus a., b) and (c) will all be recognised, but HTML will not necessarily support this in the markup produced.

Both upper and lower case bullets are recognised (and supported in the markup).

This should be automatically detected.

Expect roman numeral bullets

Menu location: Conversion Options -> Analysis Policies -> Bullets and lists

This policy states whether or not roman numeral bullet points are expected. The roman numeral bullets can be followed by any punctuation, thus i., ii) and (iii) will all be recognised, but HTML will not necessarily support this in the markup produced.

Both upper and lower case bullets are recognised (and supported in the markup), although the range of roman numeral values supported is limited.

This should be automatically detected.
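
The three bullet styles described above can be told apart with simple patterns, as in the Python sketch below. The function name `bullet_type`, the regular expressions, and the restriction of roman numerals to the letters i, v and x are assumptions for illustration. Note that a bullet such as "i." is both alphabetic and roman; this sketch simply prefers roman, whereas a real converter would need to look at neighbouring bullets to decide.

```python
import re

# Each pattern allows an optional "(" before the marker and "." or ")" after it.
PATTERNS = [
    ("numbered", re.compile(r"^\s*\(?\d+[.)]\s+")),
    ("roman", re.compile(r"^\s*\(?(?:[ivx]+|[IVX]+)[.)]\s+")),
    ("alphabetic", re.compile(r"^\s*\(?(?:[a-z]|[A-Z])[.)]\s+")),
]


def bullet_type(line: str):
    """Return the name of the first bullet style the line matches, or None."""
    for name, pattern in PATTERNS:
        if pattern.match(line):
            return name
    return None


for text in ("1. First", "2) Second", "(3) Third", "b) Beta", "(iii) Third", "Plain text"):
    print(repr(text), bullet_type(text))
```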

Recognize '-' as a bullet

Menu location: Conversion Options -> Analysis Policies -> Bullets and lists

This policy states whether or not bullet points starting with the hyphen character '-' are expected.

This appears on-screen as "recognize hyphen character as a bullet point"

This should be automatically detected.

Recognize 'o' as a bullet

Menu location: Conversion Options -> Analysis Policies -> Bullets and lists

This policy states whether or not bullet points starting with the lower case 'o' are expected.

This appears on-screen as "recognize 'o' character as a bullet point"

This should be automatically detected.

Bullet Char

Menu location: Conversion Options -> Analysis Policies -> Bullets and lists

This policy lists any other characters that are to be recognised as bullet characters.

This appears on-screen as "Other bullet point characters"

Each bullet character entered will appear in the policy file as its own "Bullet Char" line as follows:-

        Bullet Char : '<'

This should be automatically detected, but may sometimes need to be manually entered.
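
If you need to check which bullet characters a policy file defines, the lines can be picked out with a few lines of Python. This is only a sketch: it assumes every policy line has the "name : value" form of the single example shown above, and the function name bullet_chars is hypothetical.

```python
def bullet_chars(policy_text):
    """Collect the characters given on "Bullet Char" lines of a policy file."""
    chars = []
    for line in policy_text.splitlines():
        # Split at the first colon only, so a ':' bullet character survives.
        name, separator, value = line.partition(":")
        if separator and name.strip() == "Bullet Char":
            chars.append(value.strip().strip("'"))
    return chars

sample = "Bullet Char : '<'\nBullet Char : '+'\nAdd contents list : Yes"
print(bullet_chars(sample))
```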

Contents list policies

Menu location: Conversion options -> Analysis Policies -> Contents List
Menu location: Conversion options -> Output Policies -> Contents List

This dialog shows both analysis and output policies connected with contents list detection and generation.

Diagnostics

Expect contents list

Output

Add contents list
Number of levels to show in contents list
Calculate contents list :-
- Generate contents from observed headings
- Generate contents from document's contents list
Place contents in an external file
- External contents list filename
Contents list hyperlinks
Add mail headers to contents list

Expect contents list

Menu location: Conversion options -> Analysis Policies -> Contents List

This policy specifies whether or not the document already contains a contents list. If it does, AscToHTM will attempt to convert the existing list into a series of hyperlinks.

This should be detected automatically, but occasionally you will need to set this policy manually.

See the discussion on contents list generation

Add contents list

Menu location: Conversion options -> Output Policies -> Contents List

This policy specifies whether or not AscToHTM should generate a contents list. This list will be generated either from an existing list in the source document, or from the observed headings.

A link to "contents list" will be added to the main HTML file.

See the discussion on contents list generation

Use any existing contents list

Menu location: Conversion options -> Output Policies -> Contents List

This policy specifies whether or not you wish to use any existing contents list found in the source document. If you disable this option, any contents found will be discarded. A contents list will only be added if you select the Add contents list policy.

This option is not fully supported, as it depends on the software correctly detecting the contents list, and then matching the contents lines with headings in the document.

If this option is disabled (the default) any contents list will be created from the headings detected in the document (this is the recommended option).

On-screen this appears as a choice between calculating the contents from observed headings, or from the contents list in the document.

See the discussion on contents list generation

Generate external contents file

Menu location: Conversion options -> Output Policies -> Contents List

This policy specifies that an external contents list file should be generated. This is not always possible.

See the discussion on contents list generation

External contents list filename

Menu location: Conversion options -> Output Policies -> Contents List

When Generate external contents file is enabled, this specifies the name of the external contents list file.

The default will be "contents_<input_filename>.html"

Hyperlinks on numbers

Menu location: Conversion options -> Output Policies -> Contents List

This policy specifies whether the hyperlink in a contents list entry should be placed on the section number, or the section title.

This appears on-screen as a choice between placing the link on the number or the title, but in the policy file this is simply a "Yes/No" option on placing the link on the number.

This is only relevant for documents with numbered section headings.

Maximum level to show in contents

Menu location: Conversion options -> Output Policies -> Contents List

This policy specifies the maximum level of heading that should be shown in any generated contents list. This appears on-screen as "Number of levels to show in contents list".

This is only relevant for documents with numbered section headings.

Add mail headers to contents list

(New in version 5.0)

Menu location: Conversion options -> Output Policies -> Contents List

This policy specifies whether or not mail headers should be included in any generated contents list. For a mail digest this should be yes, but for larger documents with only a couple of mail headers the answer may be no.

File structure policies

Menu location: Conversion Options -> Analysis Policies -> File Structure

These policies aid AscToHTM's analysis by describing some of the file structure that would affect the analysis.

Expect only a simple layout

Expected File contents

'Expect "C"-code samples'
Contains DOS characters
Contains UNICODE characters
Contains PCL printer codes
Contains non-European (e.g. Japanese) characters
Contains mime-encoded quotable characters
File has change bars
File has Page markers
Page marker size (in lines)

Text Attributes

Text justification
File is double spaced

Text to ignore

Number of lines to ignore at start of document

Number of lines to ignore at end of document

Keep it simple

Menu location: Conversion Options -> Analysis Policies -> File Structure

AscToHTM puts a lot of effort into detecting overall structure such as headings etc.

In documents that don't have any such structure, AscToHTM is liable to convert any line with a number at the start into a heading.

To prevent this, you can mark the document as simple, that is with no global structure. In a simple document AscToHTM will attempt far less analysis.

This appears on-screen as "Expect only a simple layout"

AscToHTM attempts to automatically identify simple documents, but you may still need to set this policy manually.

Expect Code samples

Menu location: Conversion Options -> Analysis Policies -> File Structure

AscToHTM can markup C-like code fragments in <PRE>...</PRE> tags to preserve the layout and readability of the quoted code.

This may be automatically detected, but occasionally needs to be manually corrected.

Input file contains DOS characters

Menu location: Conversion Options -> Analysis Policies -> File Structure

AscToHTM can convert files that use the DOS (OEM) character set. By default the file is assumed to be in the ANSI character set, but some files may have originated under DOS.

This may be automatically detected, but usually needs to be manually set.
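
The difference between the two character sets is easy to see by decoding the same byte both ways. The Python sketch below assumes that "DOS (OEM)" means code page 437 and that "ANSI" means Windows-1252; other DOS code pages exist, so treat these two names as illustrative choices rather than the ones AscToHTM necessarily uses.

```python
# The same byte means different things in the two character sets.
raw = b"caf\x82"

as_dos = raw.decode("cp437")    # DOS (OEM) reading: 0x82 is e-acute
as_ansi = raw.decode("cp1252")  # ANSI reading: 0x82 is a low quotation mark

print(as_dos)
print(as_ansi)
```

If accented letters in your output appear as unrelated punctuation, the file was probably written under DOS and this policy should be set.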

Input file contains UNICODE characters

(New in version 5.0)

Menu location: Conversion Options -> Analysis Policies -> File Structure

AscToHTM has some ability to detect files that contain Unicode characters or which are themselves encoded using Unicode. Where Unicode is detected on input, a UTF-8 file will be created as output, and extra care will be taken when processing the file not to split any of the multi-byte character values.

This should be automatically detected, but may sometimes need to be set manually.

See Working with Unicode

Input file contains PCL codes

(New in version 5.0)

Menu location: Conversion Options -> Analysis Policies -> File Structure

Indicates that the input file contains PCL printer codes. When set, the program will make whatever sensible use it can of these codes, otherwise they will be removed.

Please note that the PCL printer codes offer a rich command language that may be used to drive graphical printers. As such the emulation possibilities in a text converter are limited, and it is quite likely that files that make heavy use of such codes will fail dramatically to convert.

That said, those codes that are not recognised will be eliminated from the output.

Input file contains Japanese characters

Menu location: Conversion Options -> Analysis Policies -> File Structure

AscToHTM is designed to convert ASCII files, and is biased to parsing the structure of files written in European languages.

However, this policy can tell the program that there are some non-ASCII characters (Japanese, Chinese, Korean etc) present. When this is the case certain conversions of exotic characters to HTML entities are disabled so that there is some prospect of the HTML still being readable in a browser.

There is a limited ability for this policy to be auto-detected.

Input file contains MIME encoding

Menu location: Conversion Options -> Analysis Policies -> File Structure

AscToHTM can convert mime-encoded quotable characters. These will usually appear in files that were originally part of an email message. Such files use the "=" character to escape special characters. So for example "=20" should be interpreted as a space.

This appears on-screen as "Contains mime-encoded quotable characters"

This may be automatically detected in files where the "=" is used to break up long lines, but more usually you will need to manually set this.
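
For reference, the same decoding can be reproduced with the quopri module from the Python standard library. This sketch only illustrates the encoding itself; it says nothing about how AscToHTM implements the conversion.

```python
import quopri

# "=20" is an escaped space, "=" at the end of a line is a soft line break,
# and "=E9" is a single escaped byte.
encoded = b"Hello=20World, this line is soft-=\nwrapped. Caf=E9"

decoded = quopri.decodestring(encoded)
print(decoded)
```

The trailing "=" used to break up long lines is the feature the automatic detection relies on, which is why files without long lines usually need the policy set manually.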

Input file has change bars

Menu location: Conversion Options -> Analysis Policies -> File Structure

AscToHTM can strip out change bars in documents that contain them. Change bars are usually a vertical bar '|' placed in the leftmost or rightmost column.

Currently this is not automatically detected, and so will need to be manually switched on.
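
As a rough illustration of what stripping change bars involves, the Python sketch below removes a '|' found in the first column or at the end of a line. The function name strip_change_bars is hypothetical, and the approach is deliberately naive: it would also remove genuine '|' characters in those positions, such as the borders of a text table.

```python
def strip_change_bars(lines):
    """Remove a '|' change bar from the leftmost or rightmost column."""
    cleaned = []
    for line in lines:
        if line.startswith("|"):
            # Replace the bar with a space so the indentation is unchanged.
            line = " " + line[1:]
        stripped = line.rstrip()
        if stripped.endswith("|"):
            line = stripped[:-1].rstrip()
        cleaned.append(line)
    return cleaned

print(strip_change_bars(["| changed line", "unchanged", "edited here   |"]))
```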

Input file has page markers

Menu location: Conversion Options -> Analysis Policies -> File Structure

AscToHTM has a limited ability to remove page markers. These are normally a few lines following a form feed (FF) character, containing page numbers etc. This will commonly occur with files generated from older software packages.

Page marker size (in lines)

Menu location: Conversion Options -> Analysis Policies -> File Structure

The number of lines after each form feed (FF) that should be ignored. These lines will not be copied to the output.
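
The effect of these two page marker policies can be sketched in Python as follows. The function name drop_page_markers is hypothetical, and the sketch assumes that the line containing the form feed counts as the first ignored line; the text above does not say how AscToHTM counts it.

```python
def drop_page_markers(lines, marker_size):
    """Discard marker_size lines starting at each line that holds a form feed."""
    kept = []
    skip = 0
    for line in lines:
        if "\f" in line:
            skip = marker_size  # assumption: the FF line itself is counted
        if skip > 0:
            skip -= 1
            continue
        kept.append(line)
    return kept

page = ["end of page 1", "\fReport   Page 2", "----", "start of page 2"]
print(drop_page_markers(page, 2))
```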

Text Justification

Menu location: Conversion Options -> Analysis Policies -> File Structure

AscToHTM recognises documents that are left justified (default), right justified, centred or both left and right justified (confusingly known as "justified").

The program cannot currently mark up the text in a matching style, but this policy is important in the analysis. For example "justified" documents are padded with extra white space which could be interpreted as pre-formatted text were the document not recognised as being justified.

Normally this policy is correctly detected automatically.

Input file is double spaced

Menu location: Conversion Options -> Analysis Policies -> File Structure

AscToHTM will normally treat a blank line as a break between paragraphs. Some files have extra CR/LF characters (usually if they've come from a different computer, or from a printer package). In such cases AscToHTM will see every second line as blank, and this will affect the analysis, usually by turning each line of data into a separate paragraph.

If you have such a file, use this policy to mark the file as double spaced to get better results.
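
The idea behind this policy can be sketched in Python as shown below. The function name undouble is hypothetical, and the sketch assumes the extra blank lines always fall on every second line of the file.

```python
def undouble(lines):
    """Drop every second line if all of those lines are blank."""
    every_second = lines[1::2]
    if every_second and all(not line.strip() for line in every_second):
        return lines[0::2]
    return lines  # not double spaced, leave the text alone

spaced = ["a", "", "b", "", "", "", "c", ""]
print(undouble(spaced))
```

Note that a genuine paragraph break in a double spaced file shows up as three blank lines in a row, and survives as a single blank line once the extra lines are removed.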

Lines to ignore at start of file

(New in version 5.0)

Menu location: Conversion Options -> Analysis Policies -> File Structure

This specifies how many lines from the input files should be ignored at the start of the file. These lines will be discarded from the output.

This can be useful when converting a file copied from a news feed or similar source that adds a small data header to the file.

Lines to ignore at end of file

(New in version 5.0)

Menu location: Conversion Options -> Analysis Policies -> File Structure

This specifies how many lines from the input files should be ignored at the end of the file. Up to 40 lines may be ignored in this way. These lines will be discarded from the output.

This can be useful when converting a file copied from a news feed or similar source that adds a small data footer to the file.

Headings policies

Menu location: Conversion options -> Analysis Policies -> Headings

These policies determine the headings structure that the document is expected to have. Normally these are calculated correctly by AscToHTM, but due to the complexity of heading detection, you may sometimes need to correct the analysis.

At the top of the dialog you can specify what type of headings you expect to see. Any combination is allowed, although usually documents use just one type of heading.

Expect Numbered headings
Expect Underlined headings
Expect Capitalised headings
Expect Embedded headings
Heading Key phrases
Use first line as heading
Center first heading
Check indentations of headings are consistent

If numbered headings are expected, it may be possible to expect headings at multiple levels, and to also expect a contents list. Each level of heading will have its own set of policies which are shown on this dialog. The policies are shown in text form, but are edited via the Advanced Headings Policies Dialog.

Note: This area of functionality is continually under review.

See also the discussion in detecting headings and section titles.

Expect numbered headings

Menu location: Conversion options -> Analysis Policies -> Headings

This policy specifies whether or not numbered headings are expected in the document.

Numbered headings may be found at multiple levels, and their details may be edited via the Advanced Headings Policies Dialog

This should be calculated correctly by AscToHTM, but is prone to error, as the analysis can be confused by numbered bullets and the like. In such cases you may need to set this policy manually.

Expect underlined headings

Menu location: Conversion options -> Analysis Policies -> Headings

This policy specifies whether or not underlined headings are expected. Note, where the headings themselves are numbered, the underlining will be taken into account, and you should set the expect numbered headings policy instead.

AscToHTM uses the character in the underlining to determine the heading level, thus text underlined with equals signs is given prominence over text with single underline characters such as minus signs, tildes or underscores.
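
The Python sketch below shows one way underlining could be mapped to a heading level. The mapping of '=' to level 1 and the other characters to level 2, the minimum underline length of three characters and the function name underlined_heading are all assumptions for illustration; the text above only states that equals signs are given more prominence.

```python
# Assumed mapping: "=" outranks the single underline characters.
UNDERLINE_LEVELS = {"=": 1, "-": 2, "~": 2, "_": 2}

def underlined_heading(text_line, next_line):
    """Return (level, title) if next_line underlines text_line, else None."""
    title = text_line.strip()
    underline = next_line.strip()
    # An underline is taken to be three or more copies of one character.
    if not title or len(underline) < 3 or len(set(underline)) != 1:
        return None
    level = UNDERLINE_LEVELS.get(underline[0])
    if level is None:
        return None
    return (level, title)

print(underlined_heading("Introduction", "============"))
print(underlined_heading("Details", "-------"))
```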

Expect capitalised headings

Menu location: Conversion options -> Analysis Policies -> Headings

This policy specifies whether or not CAPITALISED headings are expected. Note, where the headings themselves are numbered, this policy need not be set, and you should set the expect numbered headings policy instead.

Expect Embedded headings

(New in version 5.0)

Menu location: Conversion options -> Analysis Policies -> Headings

This policy specifies whether or not "embedded" headings are expected. An embedded heading is one where the heading forms the first sentence of the first paragraph of the section itself. The program can only detect this if the heading is a complete sentence or phrase in UPPER CASE at the start of the paragraph.

An example might be :-

        RULE 15.6 – TERMS AND CONDITIONS.  The terms and conditions that apply
        are described in this section.  They will apply...

In this case the text "RULE 15.6 – TERMS AND CONDITIONS" will be treated as the embedded heading.

At present such headings are not auto-detected, so you need to switch this policy on manually.
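
The rule described above (an upper case first sentence at the start of a paragraph) can be sketched in Python as follows. The function name split_embedded_heading and the choice of "a full stop followed by white space" as the end of the sentence are assumptions for illustration, not a description of AscToHTM's own test.

```python
import re

def split_embedded_heading(paragraph):
    """Return (heading, rest) if the first sentence is in capitals, else None."""
    # The first full stop followed by white space ends the first sentence,
    # so the "." inside a number such as 15.6 is not treated as a break.
    match = re.search(r"\.\s+", paragraph)
    if match is None:
        return None
    heading = paragraph[:match.start()]
    # isupper() ignores digits and punctuation but needs at least one letter.
    if heading.isupper():
        return heading, paragraph[match.end():]
    return None

text = "RULE 15.6 – TERMS AND CONDITIONS.  The terms and conditions that apply"
print(split_embedded_heading(text))
```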

Heading Key phrases

(New in version 5.0)

Menu location: Conversion options -> Analysis Policies -> Headings

If specified, then any line that begins with one of the key phrases will be regarded as a heading. The syntax is

      <details>, <details>...

where each set of details is

      <details> = <phrases>, [<heading_level>]

and

      <phrases> = <phrase_1> [|<phrase_2>]

That is, each set of <details> can optionally specify a <heading_level>. If omitted this will default to 1,2,3 for the first, second, third set of details etc. Note, this is a logical heading level, and will be apparent in the contents list.

Each set of <details> must supply a set of <phrases>, and each set of phrases must have at least one phrase, with extra phrases added if wanted, separated by vertical bars.

So for example

      Part, Chapter, Section

would treat lines beginning with the words "Part", "Chapter" and "Section" as level 1,2, and 3 headings.

The key phrases are case-sensitive in order to reduce the likelihood of false matches with lines that just happen to have these phrases at the start of the line. So

      PART|Part, Chapter, Section

Would allow either "PART" or "Part" to be matched.

      "PART|Part,1" , "Chapter,2" , "Section,2"

Would make lines beginning with "Part" level-1 headings, while both "Chapter" and "Section" would become level 2. This would be the same as

      "PART|Part,1" , "Chapter|Section,2"

Note, spaces may form part of a match phrase but, because of their use in the syntax, commas and vertical bars may not.

If false matches occur, (e.g. the word "Part" appears in the body of the text) edit the source text so that the offending word is no longer at the start of the line.
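
To make the syntax concrete, here is a Python sketch of a parser for the examples above. It is an illustration only: the function names parse_key_phrases and heading_level are hypothetical, and the handling of quotes (quoted items are used when present, otherwise the value is split on commas) is inferred from the examples rather than taken from the program.

```python
import re

def parse_key_phrases(policy_value):
    """Turn a Heading Key phrases value into a list of (phrases, level)."""
    quoted = re.findall(r'"([^"]*)"', policy_value)
    items = quoted if quoted else policy_value.split(",")
    rules = []
    for position, item in enumerate(items, start=1):
        phrases, _, level = item.partition(",")
        # An omitted level defaults to the position of the item: 1, 2, 3...
        level = int(level) if level.strip() else position
        rules.append(([p.strip() for p in phrases.split("|")], level))
    return rules

def heading_level(line, rules):
    """Return the level of the first rule whose phrase starts the line."""
    for phrases, level in rules:
        # The comparison is case-sensitive, as described above.
        if any(line.startswith(phrase) for phrase in phrases):
            return level
    return None

rules = parse_key_phrases('"PART|Part,1" , "Chapter|Section,2"')
print(rules)
print(heading_level("Chapter 3 Results", rules))
```

Because matching is case-sensitive, a line beginning "part one" is not treated as a heading by either function unless "part" is added as an alternative phrase.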

Use first line as heading

(New in version 5.0)

Menu location: Conversion options -> Analysis Policies -> Headings

When this option is selected, the first line in the document will be treated as a heading. This can be a useful option to select when the first line of your document is a document title line, but doesn't conform to the headings style used in the rest of the document.

Center first heading

(New in version 5.0)

Menu location: Conversion options -> Analysis Policies -> Headings

When this option is selected, the first heading in the document is centred. This may be an appropriate choice when the first heading is in fact to be treated as a document title.

Check indentation for consistency

Menu location: Conversion options -> Analysis Policies -> Headings

The program performs a number of consistency checks when detecting headings. Amongst these is a check that all headings of the same type occur at the same indentation. This check can help distinguish between numbered headings and numbered lists.

However, if you have numbered headings that are at different indentations - e.g. because they are centred on the page - then this check will cause them to be rejected as headings. In such cases you can manually disable this check.

This appears on-screen as "Check indentations of headings are consistent"
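
A consistency check of this kind might look like the following Python sketch. It keeps only the candidate headings that share the most common indentation; the function name consistent_headings and the "most common indentation wins" rule are assumptions for illustration, not the program's actual test.

```python
from collections import Counter

def consistent_headings(candidates):
    """Keep the candidate heading lines found at the most common indentation."""
    indents = [len(line) - len(line.lstrip(" ")) for line in candidates]
    if not indents:
        return []
    expected, _ = Counter(indents).most_common(1)[0]
    return [line for line, indent in zip(candidates, indents) if indent == expected]

lines = ["1. Intro", "2. Method", "   1. a list item", "3. Results"]
print(consistent_headings(lines))
```

Centred headings would each have a different indentation, so a check like this would reject most of them, which is the situation in which the policy should be disabled.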

Advanced headings policies dialog

Menu location: Conversion options -> Analysis Policies -> Headings -> Advanced Options

This dialog is reached through one of the edit buttons on the main Headings Policies dialog. This allows you to edit details of a particular type or level of heading.

Position of section number on the line

Indentation of heading lines
Heading prefix words

Section number formatting

Heading numbering scheme
Heading separator characters
Heading trailing letters

Bracketing

Heading bracket characters

Indentation of heading lines

Menu location: Conversion options -> Analysis Policies -> Headings -> Advanced Options

AscToHTM uses checks on indentation levels to reject lines with numbers on that could be confused with headers.

This is the indentation level (in characters) at which headings of this type are expected to be found.

Heading prefix words

Menu location: Conversion options -> Analysis Policies -> Headings -> Advanced Options

Some documents put words like "chapter", "subject" and "section" in front of the section number. These are known as prefix words.

Heading numbering scheme

Menu location: Conversion options -> Analysis Policies -> Headings -> Advanced Options

This is the numbering scheme expected for headings at this level. At present AscToHTM can't cope with mixed types like "II-2.b".

This may be addressed in later versions.

Heading separator characters

Menu location: Conversion options -> Analysis Policies -> Headings -> Advanced Options

This shows the separator expected between parts of the heading number.

Not currently supported

Heading trailing letters

Menu location: Conversion options -> Analysis Policies -> Headings -> Advanced Options

This shows whether we expect trailing letters after the section number, as in "1.1b".

Not currently supported

Heading bracket characters

Menu location: Conversion options -> Analysis Policies -> Headings -> Advanced Options

This shows what bracket characters (if any) we expect before and after the section number as in "[2.2]" or "3.2.1)".

Not currently supported

Pre-formatted text policies

Menu location: Conversion options -> Analysis Policies -> Pre-formatted text

These policies specify how AscToHTM detects pre-formatted text.

Detecting pre-formatted regions

Minimum automatic <PRE> size

See the section on pre-formatted text for more details.

Minimum automatic <PRE> size

Menu location: Conversion options -> Analysis Policies -> Pre-formatted text

This policy specifies the minimum number of consecutive pre-formatted lines that must be detected before <PRE>...</PRE> markup is applied.

AscToHTM detects heavily formatted lines, and then looks at their neighbours to see if they too could be part of a pre-formatted text.

Once a group of lines is identified, it will only be marked up as pre-formatted if the minimum is exceeded.

The default value is 0. Set this value larger if AscToHTM is marking text as pre-formatted when it shouldn't do.
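
The grouping step can be sketched in Python as follows. The function name pre_regions is hypothetical, and the sketch treats a group as qualifying when it has at least the minimum number of lines; whether AscToHTM requires the minimum to be met or strictly exceeded is not clear from the description above.

```python
from itertools import groupby

def pre_regions(flags, minimum):
    """flags[i] is True when line i looks pre-formatted.

    Returns (start, end) index pairs, end exclusive, for each qualifying run.
    """
    regions = []
    index = 0
    for is_pre, run in groupby(flags):
        length = len(list(run))
        if is_pre and length >= minimum:  # assumption: "at least" the minimum
            regions.append((index, index + length))
        index += length
    return regions

flags = [True, False, True, True, True, False]
print(pre_regions(flags, 0))
print(pre_regions(flags, 3))
```

Raising the minimum from 0 to 3 drops the isolated first line while keeping the three-line group, which is the effect wanted when stray lines are being marked up as pre-formatted.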

Table analysis policies

Menu location: Conversion options -> Analysis policies -> Tables

These policies specify how AscToHTM detects possible tables and analyses the data in them into columns and rows.

Attempt TABLE generation

Detection

Extend preformatted regions
Maximum length of line to consider as a candidate table line

Analysing rows

Could table have blank lines between rows

Analysing columns

Table Layout
- Column boundaries have zero width
Is the table expected to have sparse columns
Ignore table header when analysing columns
Expect multiple COLSPAN values
Minimum number of spaces between table columns
Merge together "poor" columns

See the section on pre-formatted text for more details.

Attempt TABLE generation

Menu location: Conversion options -> Analysis policies -> Tables

This policy specifies whether or not you want HTML table generation attempted for regions of apparently pre-formatted text. AscToHTM will attempt to analyse such regions, preferring to fit them into a HTML table. However, if this is not possible, or if AscToHTM decides the pre-formatted region is something else (like a diagram or a piece of code) then a HTML table will not be generated.

Disabling this policy tells AscToHTM not to attempt this analysis, usually leading to pre-formatted text being placed in simple fixed width font markup instead.

Table extending factor

Menu location: Conversion options -> Analysis policies -> Tables

When the program encounters a strongly formatted line, it examines the adjacent lines to see if they too could form part of the same preformatted region.

This policy specifies the extent to which strongly preformatted lines should be "extended" to include adjacent lines as part of the same preformatted region. If set to 10, then all adjacent lines up to the next page break or section heading will be treated as part of the same region. When set to 1 only those lines that are clearly heavily formatted themselves will be included.

This appears on-screen as "Extend preformatted regions"

Max length of candidate table lines

(New in version 5.0)

Menu location: Conversion options -> Analysis policies -> Tables

This specifies the maximum length of line that could be considered as a possible table line. Since some files output one paragraph per line, it is possible for some text to contain a lot of the "markers" used to detect table lines, and to be wrongly diagnosed as part of a table. This value is set to prevent that happening.

The value should be set as low as is reasonable for your files. However, the current default value of 256 has been set to cope with observed legacy reports, which were often designed to be printed on line printer paper.

Could be blank line separated

(New in version 5.0)

Menu location: Conversion options -> Analysis policies -> Tables

This option specifies whether or not tables are expected to have blank lines between rows. If they are, the software will be more likely to merge the text for adjacent source lines into a single row in the output table.

Default Table Layout

(New in version 5.0)

Menu location: Conversion options -> Analysis policies -> Tables

This option allows you to specify the default table layout for all tables in the document. The layout specifies the number of columns and their end positions. If there are no gaps between columns in the data then you will need to enable the Column boundaries have zero width option.

This is the default layout and will normally be applied to all tables in the document. If a document has multiple tables you are better off either using the pre-processor to mark up the source text and supplying TABLE_LAYOUT commands, or supplying a "Layout" component in a Table Definition File.

The format of the Table Layout policy is the same as that described in the discussion of the TABLE_LAYOUT pre-processor command.

See also TDF line: Layout

Column boundaries have zero width

(New in version 5.0)

Menu location: Conversion options -> Analysis policies -> Tables

When a Default TABLE layout is set, this specifies whether or not the boundaries between columns have zero size. By default the software assumes there is a single-character boundary (either a space or a delimiter character) between columns.

Sometimes data comes from a source with no character between data columns. This policy should be set in such cases.
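
The difference the zero width setting makes can be seen in the Python sketch below, which cuts a line into cells at a list of column end positions. The function name split_columns and the plain list of end positions are hypothetical; the real layout format is the one described for the TABLE_LAYOUT pre-processor command.

```python
def split_columns(line, end_positions, zero_width=False):
    """Cut a line into cells, given the end position of each column."""
    cells = []
    start = 0
    # With a normal boundary one character is skipped between columns.
    gap = 0 if zero_width else 1
    for end in end_positions:
        cells.append(line[start:end].strip())
        start = end + gap
    return cells

print(split_columns("AB 123 XYZ", [2, 6, 10]))
print(split_columns("AB123XYZ", [2, 5, 8], zero_width=True))
```

Applying the default single-character boundary to data that has no gaps would drop the first character of every column after the first, which is why the policy needs to be set for such data.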

Expect sparse tables

Menu location: Conversion options -> Analysis policies -> Tables

This policy is used to tell AscToHTM that you expect your tables to be quite sparse in places. This can affect AscToHTM's analysis, as the algorithms are liable to merge "empty" columns with their less empty neighbours.

Enabling this policy will usually result in your tables having more, emptier, columns.

Ignore table header during analysis

Menu location: Conversion options -> Analysis policies -> Tables

This policy specifies that the table header should be ignored when analysing the column structure of the table.

In some tables (usually "reports") the header can be quite complex, with titles spanning multiple columns, whereas the body of the table is much more structured.

In such cases including the table header in the analysis can lead to errors, so enabling this policy can simplify the analysis giving better chances of success.

Expect multiple COLSPAN values

(New in version 5.0)

Menu location: Conversion options -> Analysis policies -> Tables

When auto-detecting the column structure of tables, the software has to decide where the column boundaries are, and in doing this it looks for character positions that appear to contain mostly blank or non-text characters.

The analysis can have problems if the table has a lot of data that spans multiple columns (in HTML this would require a COLSPAN value to merge the cells together). Tables with complex headers (e.g. financial tables merging results from many sources) or with internal headers often have this feature.

If you have such tables and find that the analysis is failing to detect your columns (commonly merging columns together), try enabling this option which will relax the definition of what constitutes such a table.

In such cases you should also review your use of the Column merging factor and Ignore table header during analysis policies.

This appears on-screen as "Expect cells that span multiple columns"

Minimum TABLE column separation

Menu location: Conversion options -> Analysis policies -> Tables

This policy specifies the minimum number of spaces that should be interpreted as a gap between columns in a potential table. The default value is 1, but this value can sometimes lead to too many columns, especially in small tables. Larger values may lead to columns being merged together.

This appears on-screen as "Minimum number of spaces between table columns"
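
The trade-off can be illustrated by splitting a single line on runs of spaces. This Python sketch is much simpler than the real analysis, which looks at the whole table rather than one line, and the function name split_on_gaps is hypothetical.

```python
import re

def split_on_gaps(line, min_spaces=1):
    """Split a line wherever at least min_spaces spaces occur together."""
    width = max(1, min_spaces)  # a width of zero would split every character
    return re.split(r" {%d,}" % width, line.strip())

row = "New York   42   Yes"
print(split_on_gaps(row, 1))
print(split_on_gaps(row, 2))
```

With a separation of 1 the two-word value "New York" is split into two columns, while a separation of 2 keeps it together.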

Column merging factor

Menu location: Conversion options -> Analysis policies -> Tables

Once the program has detected the column layout of a table, it reviews how well the data can be fitted into these columns. If too many cells in a column are empty, or if too many cells "span" multiple columns, then the columns are deemed to be "poor", and may be merged together to form fewer, wider columns.

This factor determines the extent to which columns should be merged. A value of 10 means columns should be merged together whenever there is any doubt. Use this if you are getting too many columns. A value of 1 means columns should never be merged. Use this if you are getting too few columns.

This appears on-screen as "Merge together "poor" columns"

Note, this policy can't guarantee you will get the correct column structure, but it does give you a chance to influence the logic.

Output policies

Menu location: Conversion options -> Output policies

These policies are used to control the output to HTML. Generally these policies allow you to decide how the resulting HTML should look in a manner that cannot be inferred from the original document.

File generation
Title description etc.
Document colours
Contents lists
Frames
Headers footers etc
Tables
Hyperlinks
Pre-processor
CSS
Fonts
HTML styling

Added HTML policies

Menu location: Conversion options -> Output policies -> Title description etc.

These policies allow you to "add value" to the HTML produced. That is, to specify details that cannot be inferred from the original text.

Title

Use first heading as title
Use first line as title
Default document title

Meta tags

Document description
Document keywords
Author URL
Bookmark URL
Copyright URL
Shortcut Icon URL

Other

Background image

NOTE: Although these policies allow you to specify special header and footer files, in later versions of the software we recommend the use of HTML fragments instead (see Using HTML fragments). HTML fragments are usually placed in a separate file, which can be selected from the Advanced HTML options dialog by pressing the button further down this screen.

A button will take you to the Advanced HTML options dialog

Use first heading as title

Menu location: Conversion options -> Output policies -> Title description etc.

When this option is selected, the first heading detected in the document will also be used as the document's title, i.e. as the text that goes into the <TITLE>...</TITLE> tags in the <HEAD>...</HEAD> section of the generated HTML pages.

Note, this relies on the program correctly detecting headings, and in particular the first heading. If the first heading is also the first line, you may instead want to consider the use first line as heading and/or use first line as title policies.

If no heading is encountered the document title is used instead.

Use first line as title

Menu location: Conversion options -> Output policies -> Title description etc.

When this option is selected, the first line in the document will be treated as the document title, i.e. as the text that goes into the <TITLE>...</TITLE> tags in the <HEAD>...</HEAD> section of the generated HTML pages.

This can be a useful option to select when the first line of your document is a document title line.

If you also want the first line to appear in the output as a heading, select the use first line as heading option

This will only take effect if the first heading is not to be used, or no heading has been found.

Document Title

Menu location: Conversion options -> Output policies -> Title description etc.

This specifies the text to be used by default in the <TITLE>...</TITLE> tags in the <HEAD>...</HEAD> section of the generated HTML pages. The default value is

Converted from [[filename]]

where [[filename]] gets replaced by the original filename (see Pre-processor command: filename).

This value will only be used if the first heading is not to be used (or is not found), and if the first line is not to be used.

If you want this to be the title, make sure the "Use first heading" and "use first line" options are disabled.
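The order in which the three title policies are consulted can be sketched as below. This is only an illustration in Python of the precedence described in this section; the function and parameter names are hypothetical and are not part of AscToHTM.

```python
def choose_title(first_heading, first_line, document_title,
                 use_first_heading, use_first_line):
    """Pick the <TITLE> text using the precedence described above.

    All names are hypothetical; this models the documented order only.
    """
    # 1. The first heading wins if that policy is on and a heading was found.
    if use_first_heading and first_heading:
        return first_heading
    # 2. Otherwise the first line is used, if that policy is on.
    if use_first_line and first_line:
        return first_line
    # 3. Otherwise fall back to the Document Title policy value.
    return document_title
```

With both "use first" options disabled, the Document Title value is always the one returned.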

Document Description

Menu location: Conversion options -> Output policies -> Title description etc.

This specifies text that will be inserted into a <META> tag in the <HEAD> portion of your HTML page. Search engines use this description when indexing your page, and it may be shown as the summary of your page when people search for it.

This text is not normally visible in a browser.

Document Keywords

Menu location: Conversion options -> Output policies -> Title description etc.

This specifies text that will be inserted into a <META> tag in the <HEAD> portion of your HTML page. Search engines will use these keywords when indexing your page. You should add here any keywords that summarise the contents of your page, and in particular include words not actually in the document, or alternative spellings (e.g. American versus UK English).

This text is not normally visible in a browser.
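As a rough illustration of what description and keyword <META> tags look like, the Python sketch below builds them in the conventional NAME/CONTENT form. The exact markup AscToHTM writes is not stated here, so treat the tag layout and the helper name meta_tag as assumptions.

```python
from html import escape


def meta_tag(name, content):
    """Build a <META> tag in the conventional NAME/CONTENT form.

    Hypothetical helper: quote=True makes sure double quotes in the
    value cannot break out of the attribute.
    """
    return '<META NAME="%s" CONTENT="%s">' % (
        escape(name, quote=True),
        escape(content, quote=True),
    )


print(meta_tag("description", "A guide to converting text files"))
print(meta_tag("keywords", "colour, color, text, HTML"))
```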

Author URL

(New in version 5.0)

Menu location: Conversion options -> Output policies -> Title description etc.

Identifies the URL of the author of this document. A META tag will be added to the HTML so that those browsers that can display this information can use it.

Bookmark URL

(New in version 5.0)

Menu location: Conversion options -> Output policies -> Title description etc.

Identifies the URL to be used as the bookmark for this document. A META tag will be added to the HTML so that those browsers that understand this information can use it.

Copyright URL

(New in version 5.0)

Menu location: Conversion options -> Output policies -> Title description etc.

Identifies a URL describing the copyright notice for this document. A META tag will be added to the HTML so that those browsers that can display this information can use it.

Shortcut Icon URL

(New in version 5.0)

Menu location: Conversion options -> Output policies -> Title description etc.

Identifies the URL of any icon to be used as a shortcut for this page. A META tag will be added to the HTML so that those browsers that support shortcut icons can use it.

Background Image

Menu location: Conversion options -> Output policies -> Title description etc.

This specifies the image file to be drawn as background on your HTML page. This is the text inserted into the <BODY BACKGROUND=value> tag in the generated HTML.

The value entered should be the absolute or relative URL of the file to be used. Usually this is simply the name of a .GIF or .JPG file that will be placed in the same directory as the created .HTML files.

Advanced HTML policies

Menu location: Conversion options -> Output policies -> Title description etc. (and then press the "Advanced options >>>" button)

These policies allow you to control more advanced features that influence HTML generation.

HTML Options

Omit <HEAD> and <BODY> from output
Allow "generated by AscToHTM" comments
Base HREF URL

Character Set

Character Encoding

Include files
These policies allow you to insert files into your output HTML at particular locations. This can be useful to give your pages a common "look and feel".

HTML Header file
HTML Footer file
Script file (copied to the <HEAD> section)

Omit <HEAD> and <BODY> from output

Menu location: Conversion options -> Output policies -> Title description etc. (and then press the "Advanced options >>>" button)

This policy specifies that the <HTML>, <HEAD> and <BODY> tags should be omitted from the HTML generated. This will end up producing HTML that is not, technically, a correct standalone document, although many browsers will display it correctly.

You should only select this option if your intention is to use the generated HTML as part of a larger document, in which case these tags shouldn't be duplicated. This policy is enabled automatically whenever output to Windows clipboard is selected.

Comment generation code

Menu location: Conversion options -> Output policies -> Title description etc. (and then press the "Advanced options >>>" button)

Used to control the placing of comments in the generated HTML advertising the fact that AscToHTM was used to generate the file. By default the program will add a META tag marking ASCTOHTM as the generator and comments at the top and bottom of the file identifying the program.

The value should be set as follows

0 META tag and comments added

1 META tag added

2 neither added

This appears on-screen as 'Allow "generated by AscToHTM" comments'. When enabled this sets a value of 0. When disabled the value is 2.
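The meaning of the three values, and how the on-screen checkbox maps onto them, can be summarised in a short Python sketch. The function names are hypothetical; only the value meanings come from the list above.

```python
def generator_markup(code):
    """Return (add_meta_tag, add_comments) for a comment generation code."""
    if code == 0:
        return (True, True)    # META tag and comments added
    if code == 1:
        return (True, False)   # META tag only
    if code == 2:
        return (False, False)  # neither added
    raise ValueError("comment generation code must be 0, 1 or 2")


def code_from_checkbox(enabled):
    """The on-screen option can only select 0 (enabled) or 2 (disabled)."""
    return 0 if enabled else 2
```

Note that the value 1 cannot be reached from the checkbox in this model; it would have to be set some other way, for example in a policy file.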

NOTE:: This feature is only enabled in the registered version of the software. You cannot use it in the shareware version.

Document Base URL

Menu location: Conversion options -> Output policies -> Title description etc. (and then press the "Advanced options >>>" button)

This specifies the URL to be placed in any <BASE> tag. This URL is used as a baseline for resolving any relative links. In HTML you might want this if the page is going to refer to images, style sheets or cgi scripts using URLs that are correct relative to location 1, but incorrect when the page is moved to location 2, e.g. to be used as a mirror on a second site.

Of course, if you were to also move the images, style sheets and cgi-scripts to the second site so that all relative links remained correct, you wouldn't need this policy.
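To see how a base URL changes the way relative links are resolved, the following Python sketch uses the standard urllib.parse.urljoin function, which applies the same resolution rules a browser would. The URLs are made-up examples, not values used by AscToHTM.

```python
from urllib.parse import urljoin

# Hypothetical original location of the page ("location 1").
base_url = "http://www.example.com/docs/index.html"

# Relative links as they might appear in the page.
image_link = "images/logo.gif"
style_link = "../styles/site.css"

# With a <BASE> tag naming base_url, these resolve to the same
# absolute URLs wherever the page itself is later copied to.
print(urljoin(base_url, image_link))
print(urljoin(base_url, style_link))
```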

Character encoding

Menu location: Conversion options -> Output policies -> Title description etc. (and then press the "Advanced options >>>" button)

The software has the ability to detect if the source file uses a character set other than ASCII. This can then be used to set the character set of the HTML generated so that it will display correctly in a browser.

The character set should be detected automatically, but if it isn't, or if it is detected wrongly, you can use this to change it.

If left blank, the default will be "ascii".

Character set	Language	Auto-detected?
ascii	English/Western	Yes
big5	Chinese	Yes
csGB2312	Chinese	-
csShiftJIS	Japanese	-
csbig5	Chinese	-
euc-kr	Korean	-
gb2312	Chinese	Yes
iso-2022-kr	Korean	Yes
iso-8859-1	Western Europe	-
iso-8859-10	Scandinavia	-
iso-8859-2	Eastern Europe	-
iso-8859-3	Turkish	-
iso-8859-4	Baltic states	-
iso-8859-5	Cyrillic	-
iso-8859-6	Arabic	-
iso-8859-7	Greek	-
iso-8859-8	Hebrew	Yes
iso-8859-9	Latin + Turkish	-
koi8-r	Cyrillic	Yes
ms_Kanji	Japanese	-
shift_jis	Japanese	-
tis-620	Thai	Yes
UTF-8	Unicode	-
windows-1251	Cyrillic	Yes
windows-1253	Greek	Yes
windows-1254	Turkish	Yes
windows-1256	Arabic	Yes
x-euc-jp	Japanese	-
x-euc-tw	Chinese	-
x-sjis	Japanese	Yes

HTML Header file

Menu location: Conversion options -> Output policies -> Title description etc. (and then press the "Advanced options >>>" button)

This specifies the name of a text file whose contents will be copied into the generated HTML page(s). The contents of this file will be inserted just after the <BODY> tag, and before any of the converted files contents are copied across to the HTML file.

This file should be accessible at conversion time, and will normally contain some standard text or HTML that you want including at the top of the visible page.

Note:: Although still supported, in later versions it is recommended that you instead define a HTML_HEADER HTML fragment in an external definition file.

HTML footer file

Menu location: Conversion options -> Output policies -> Title description etc. (and then press the "Advanced options >>>" button)

This specifies the name of a text file whose contents will be copied into the generated HTML page(s). The contents of this file will be inserted just before the closing </BODY> tag, and after all of the converted file's contents have been copied across to the HTML file.

This file should be accessible at conversion time, and will normally contain some standard text or HTML that you want including at the bottom of the visible page.

Note:: Although still supported, in later versions it is recommended that you instead define a HTML_FOOTER HTML fragment in an external definition file.

HTML Script file

Menu location: Conversion options -> Output policies -> Title description etc. (and then press the "Advanced options >>>" button)

This specifies the name of a text file whose contents will be copied into the generated HTML page(s). The contents of this file will be inserted into the <HEAD>...</HEAD> section of the generated HTML file.

This file should be accessible at conversion time, and will normally contain some HTML that you want including in the HTML <HEAD>...</HEAD> section.

Note, tags inserted at this point are not normally visible, and as such this file should only contain relevant HTML such as <META> tags or JavaScript... hence the name.

Note:: Although still supported, in later versions it is recommended that you instead define a HTML_SCRIPT HTML fragment in an external definition file.
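The positions at which the header, footer and script files are inserted can be pictured with the Python sketch below. It is a simplified model only: the function assemble_page and its arguments are hypothetical, and the real output contains more markup than is shown here.

```python
def assemble_page(title, body_html, header="", footer="", script=""):
    """Join the parts of a page in the order described above.

    header, footer and script stand for the contents of the three
    include files; each is skipped if empty.
    """
    parts = ["<HTML>", "<HEAD>", "<TITLE>%s</TITLE>" % title]
    if script:
        parts.append(script)      # script file goes inside <HEAD>
    parts += ["</HEAD>", "<BODY>"]
    if header:
        parts.append(header)      # header file goes just after <BODY>
    parts.append(body_html)       # the converted document itself
    if footer:
        parts.append(footer)      # footer file goes just before </BODY>
    parts += ["</BODY>", "</HTML>"]
    return "\n".join(parts)
```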

Colour policies

Menu location: Conversion Options -> Output Policies -> Document Colours

These policies allow you to choose the colour of various components of your HTML pages.

HTML allows you to control the colours of several components by specifying attributes in the <BODY> tag. These include :-

Text colour (TEXT attribute)
Background colour (BGCOLOR)
Hyperlink colour (LINK)
Visited hyperlinks (VLINK)
Active hyperlinks (ALINK)

These policies should be given a valid HTML colour code value

You can also choose

Suppress all colour markup

Suppress all colour markup

(New in version 5.0)

Menu location: Conversion Options -> Output Policies -> Document Colours

This option may be of interest to those generating pages to be viewed by the visually impaired who can use their browser to set the colours that suit them best.

HTML Colours

Some policies specify the colours of various HTML components. Some of these are set in the <BODY> tag, and others elsewhere in the generated file.

You can use the browse buttons to choose from a colour palette. Colours in HTML are specified as 6-character RRGGBB values, with each component taking a hexadecimal value in the range 00-FF. Thus 000000 is black, FFFFFF is white and 550000 is a dark red.

A few colours (some of the combinations of "00", "80" and "FF" values) may be specified by name. The list of colours recognised in the HTML standard is

Colour	HTML Hex value
Black	#000000
Silver	#C0C0C0
Gray	#808080
White	#FFFFFF
Maroon	#800000
Red	#FF0000
Purple	#800080
Fuchsia	#FF00FF
Green	#008000
Lime	#00FF00
Olive	#808000
Yellow	#FFFF00
Navy	#000080
Blue	#0000FF
Teal	#008080
Aqua	#00FFFF

Only these values will be converted by the software to the equivalent names. Other names exist outside the standard which may not be universally supported.
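As an illustration of the RRGGBB format and the name lookup described above, here is a small Python sketch. The helper names are hypothetical, and the lookup simply mirrors the 16-colour table; it is not AscToHTM's own code.

```python
# The 16 standard colour names from the table above, keyed by RRGGBB value.
HTML_COLOUR_NAMES = {
    "000000": "Black", "C0C0C0": "Silver", "808080": "Gray", "FFFFFF": "White",
    "800000": "Maroon", "FF0000": "Red", "800080": "Purple", "FF00FF": "Fuchsia",
    "008000": "Green", "00FF00": "Lime", "808000": "Olive", "FFFF00": "Yellow",
    "000080": "Navy", "0000FF": "Blue", "008080": "Teal", "00FFFF": "Aqua",
}


def split_rgb(value):
    """Split an RRGGBB string (optional leading '#') into three 0-255 ints."""
    hex_digits = value.lstrip("#")
    if len(hex_digits) != 6:
        raise ValueError("expected a 6-character RRGGBB value")
    return tuple(int(hex_digits[i:i + 2], 16) for i in (0, 2, 4))


def colour_name(value):
    """Return the standard name for a colour, or None if it has none."""
    return HTML_COLOUR_NAMES.get(value.lstrip("#").upper())
```

For example, the dark red 550000 splits into a red component of hexadecimal 55 (decimal 85) with no green or blue, and has no standard name.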

Active Link Colour

Menu location: Conversion Options -> Output Policies -> Document Colours

Identifies the colour of "active" hyperlinks, that is the colour of the hyperlink just as it is being selected. This value is a HTML colour that is used to set the ALINK attribute of the <BODY> tag.

Background Colour

Menu location: Conversion Options -> Output Policies -> Document Colours

Identifies the background colour of the HTML page(s) created. This value is a HTML colour that is used to set the BGCOLOR attribute of the <BODY> tag. If omitted, AscToHTM defaults to a white background (I find Gray too dull)

Text Colour

Menu location: Conversion Options -> Output Policies -> Document Colours

Identifies the text colour of the HTML page(s) created. This value is a HTML colour that is used to set the TEXT attribute of the <BODY> tag.

Unvisited Link Colour

Menu location: Conversion Options -> Output Policies -> Document Colours

Identifies the colour of "unvisited" hyperlinks, that is the colour of the hyperlink before it is selected for the first time. This value is a HTML colour that is used to set the LINK attribute of the <BODY> tag.

Visited Link Colour

Menu location: Conversion Options -> Output Policies -> Document Colours

Identifies the colour of "visited" hyperlinks, that is the colour of the hyperlink after it has been selected at least once. This value is a HTML colour that is used to set the VLINK attribute of the <BODY> tag.

CSS policies

Menu location: Conversion options -> Output policies -> Cascading style sheets (CSS)

Cascading Style Sheets (CSS) allow styles to be added to HTML pages. They are supported in the Version 4 browsers, although the support can be variable.

Document Style Sheet

Document Style Sheet

Menu location: Conversion options -> Output policies -> Cascading style sheets (CSS)

Style sheets are a feature recently added to HTML. They are supported by Internet Explorer and Netscape 4.0 and above. They may not be supported by earlier browsers.

Style sheets allow you to specify the colours and fonts that you want added to your paragraphs, headers etc in your HTML. Normally you store all this detail in a cascading style sheet (.css) file and use a line of HTML in your <HEAD> section. In this way a uniform style can be added to all pages on your site and easily changed.

This policy allows you to specify the URL of any style sheet you want to use. AscToHTM will then add the requisite HTML to the <HEAD> portion of your document.

Directory page policies

Menu location: Conversion options -> Output policies -> File Generation -> Directory Page

When converting many files at once, AscToHTM can generate a Directory Page, which lists all the files converted, complete with hyperlinks to the files and their contents.

Generate a directory page
- Indent contents lines
- Show file titles in directory
Directory filename
Split directory into pages
- Maximum number of files per index page

Details
These details will go into the <HEAD>...</HEAD> of the directory page.

Directory title
Directory description
Directory keywords
Text on hyperlink back to directory page

Include files
These policies allow you to insert files containing HTML into your output directory page at particular locations.

Header HTML file
Footer HTML file
Script file (copied to the <HEAD> section)

Split directory into pages

(New in version 5.0)

Menu location: Conversion options -> Output policies -> File Generation -> Directory Page

When enabled the directory "page" will be split into several smaller, linked pages. This can be useful when building an index for a large number of files all being converted at once.

This option will only take effect if the Maximum number of files per index page is set to be greater than 0. There is no policy line for this option as such, instead the value of Maximum number of files per index page is simply set to 0 if this option is disabled.

Maximum number of files per index page

(New in version 5.0)

Menu location: Conversion options -> Output policies -> File Generation -> Directory Page

This specifies how many files should have their contents added to a directory page before a new page is started. A value of 0 indicates that no splitting of the directory page is wanted.

So for example if you elected to build a directory whilst converting 100 files in a single pass, a value of 20 would mean that you'd end up with 5 index pages all linked together, each detailing the contents list of 20 files.
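The arithmetic in this example can be sketched in Python as follows. The function name is hypothetical, and rounding up when the files do not divide evenly is an assumption, since the text only gives an exact-division case.

```python
import math


def index_page_count(file_count, max_files_per_page):
    """Number of directory pages needed for file_count files.

    A maximum of 0 means the directory page is not split.
    Assumes a final, partly filled page is created for any remainder.
    """
    if max_files_per_page <= 0:
        return 1
    return max(1, math.ceil(file_count / max_files_per_page))


print(index_page_count(100, 20))
```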

Make Directory

Menu location: Conversion options -> Output policies -> File Generation -> Directory Page

This policy specifies whether or not you want a directory page created when converting multiple files.

This appears on-screen as "Generate a directory page"

Indent headings in Directory

Menu location: Conversion options -> Output policies -> File Generation -> Directory Page

This policy specifies whether or not AscToHTM should indent the hyperlinks to the sections contained in the documents being converted. If this is disabled, then all the contents hyperlinks will be indented one level relative to file titles.

This appears on-screen as "Indent contents lines"

Directory filename

Menu location: Conversion options -> Output policies -> File Generation -> Directory Page

This policy specifies the name of the index file. If a file of this name already exists it will be overwritten.

Show file titles in directory

Menu location: Conversion options -> Output policies -> File Generation -> Directory Page

This policy specifies whether or not the directory page should include a title line for each file being converted. This will either be the title of each file, or the filename.

Directory title

Menu location: Conversion options -> Output policies -> File Generation -> Directory Page

This specifies the text to be used by default in the <TITLE>...</TITLE> tags in the <HEAD>...</HEAD> section of the generated HTML pages.

Directory description

Menu location: Conversion options -> Output policies -> File Generation -> Directory Page

This specifies text that will be inserted into a <META> tag in the <HEAD> portion of your directory page. This description is used by search engines when indexing your page, and may be shown as the summary of your page when people search for it.

This text is not normally visible in a browser.

Directory keywords

Menu location: Conversion options -> Output policies -> File Generation -> Directory Page

This specifies text that will be inserted into a <META> tag in the <HEAD> portion of your directory page. These keywords will be used by search engines when indexing your page. You should add here any keywords that summarise the contents of your page, and in particular include words not actually in the document, or alternative spellings (e.g. American versus UK English).

This text is not normally visible in a browser.

Directory return hyperlink text

Menu location: Conversion options -> Output policies -> File Generation -> Directory Page

This is the text shown on the hyperlink back to the directory page. This link is inserted into each HTML document generated to allow navigation back to the directory index.

This appears on-screen as "Text on hyperlink back to directory page"

The default value is "Directory".

Directory header file

Menu location: Conversion options -> Output policies -> File Generation -> Directory Page

This specifies the name of a text file whose contents will be copied into the directory page. The contents of this file will be inserted just after the <BODY> tag, and before any generated contents of the directory page.

This file should be accessible at conversion time, and will normally contain some standard text or HTML that you want including at the top of the visible page.

If omitted the same header file as used for the individual HTML files will be used.

Directory footer file

Menu location: Conversion options -> Output policies -> File Generation -> Directory Page

This specifies the name of a text file whose contents will be copied into the directory page. The contents of this file will be inserted just before the closing </BODY> tag, and after all of the generated contents of the directory page.

This file should be accessible at conversion time, and will normally contain some standard text or HTML that you want including at the bottom of the visible page.

If omitted the same footer file as used for the individual HTML files will be used.

Directory script file

Menu location: Conversion options -> Output policies -> File Generation -> Directory Page

This specifies the name of a text file whose contents will be copied into the directory page. The contents of this file will be inserted into the <HEAD>...</HEAD> section of the directory page.

This file should be accessible at conversion time, and will normally contain some HTML that you want including in the HTML <HEAD>...</HEAD> section.

Note, tags inserted at this point are not normally visible, and as such this file should only contain relevant HTML such as <META> tags or JavaScript... hence the name.

File generation policies

Menu location: Conversion options -> Output policies -> File generation -> General

These policies determine some general options for the HTML file generation.

DOS Compatibility

Generate DOS-compatible filenames
DOS filename root

File extensions

File extensions
Alternative file extension

Diagnostics Files

Generate log files
Generate sample policy file
- Output policy filename

Line and file structures

Preserve file structure using <PRE>
Preserve line structure
Treat each line as a paragraph
Break up long HTML lines

Preserve file structure using <PRE>

Menu location: Conversion options -> Output policies -> File generation -> General

This policy can be used to place the whole file inside <PRE>...</PRE> markup. This will use a mono-spaced font that preserves the line structure and the relative spacing of characters.

When this is enabled almost all of the program's other conversion options will be disabled. You should only really use this if your document has a lot of formatting that the program is failing to understand.

This policy needs to be set manually where wanted.

Preserve Line structure

Menu location: Conversion options -> Output policies -> File generation -> General

This policy specifies that the line structure of the original document should be preserved, rather than just the paragraph structure.

If enabled the lines in the output document will match those of the original document, and the text will not automatically be adjusted if you widen your window. On large monitors this will give the text an "A4" look and feel.

This policy needs to be set manually where wanted.

Treat each line as a paragraph

Menu location: Conversion options -> Output policies -> File generation -> General

Some files do not break large paragraphs into smaller lines, but instead place the whole paragraph on a single line. This is especially true if the source file was created by a text editor that relied on word wrap (such as Notepad or Word).

These files often have no blank lines between paragraphs, which makes detecting where paragraphs begin and end more difficult.

In such files this policy can be enabled so that each "line of text" in the source file will be treated as a separate paragraph.

This policy cannot be automatically detected, and so needs to be set manually where wanted.
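The effect of this policy can be pictured with the Python sketch below, which turns every non-blank source line into its own paragraph. It is a simplified model with a hypothetical function name; the real conversion also applies the program's other analysis to each line.

```python
from html import escape


def lines_to_paragraphs(text):
    """Wrap each non-blank line of text in its own <P>...</P> element."""
    paragraphs = []
    for line in text.splitlines():
        stripped = line.strip()
        if stripped:
            # Escape &, < and > so the text is safe to place in HTML.
            paragraphs.append("<P>%s</P>" % escape(stripped, quote=False))
    return "\n".join(paragraphs)
```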

Break up long HTML lines

(New in version 5.0)

Menu location: Conversion options -> Output policies -> File generation -> General

Specifies that long HTML lines should be broken into smaller ones. This attempts to make the HTML more readable, should you need to edit it afterwards.

However, there is the possibility that the insertion of newline characters into the output could affect how the HTML is displayed. This is because most of the browsers have bugs in their parsing of newlines. These problems are most pronounced when using non-standard (12pt) font sizes.

If you experience such problems, try disabling this policy.

Generate diagnostics files

Menu location: Conversion options -> Output policies -> File generation -> General

This policy allows you to specify the generation of some diagnostics files. AscToHTM will generate 3 files with the following extensions:

.lis1	A line-by-line summary of how AscToHTM analysed the source file during the analysis pass
.lis	A line-by-line summary of how AscToHTM analysed the source file during the output pass
.stats	A statistics file

The .lis file will give the best description of how the source file has been converted. The differences between the .lis1 and .lis files can be slight, and are down to the fact that on the output pass more rigorous attention is applied to the policies.

Any error messages generated during the conversion are inserted into the .lis file at the offending line. This will help you determine how relevant they are.

This appears on-screen as "Generate log files"

Output policy file

Menu location: Conversion options -> Output policies -> File generation -> General

This policy allows you to generate a policy file containing all the policies used during the output pass. This will help you understand how AscToHTM has interpreted your document, and may help in determining where the analysis may have gone wrong and need correcting.

Note, this file will contain all the policies used, and as such is probably not suited for use as an input policy.

This appears on-screen as "Generate sample policy file"

See the discussion in Generating a policy file for your document

Output policy filename

Menu location: Conversion options -> Output policies -> File generation -> General

When output policy file is selected, this allows you to specify the name of the policy file you want created.

Use .HTM extension

Menu location: Conversion options -> Output policies -> File generation -> General

This specifies the type of file extension to be used for the generated files. Note, if DOS-compatible filenames are chosen ".HTM" will be used, otherwise the options are ".html", ".htm" or other.

On-screen this appears as a radio button for selecting file extensions. If .HTM is selected this policy is set to true, otherwise it is false. If "Other" is selected, the extension is taken from the Output file extension policy (which appears on-screen as "Alternative file extension").

Output file extension

Menu location: Conversion options -> Output policies -> File generation -> General

This specifies that an alternative file extension (e.g. ".shtml") should be used for the generated files.

This appears on-screen as "Alternative file extension"

Use DOS filenames

Menu location: Conversion options -> Output policies -> File generation -> General

By default AscToHTM uses the input filename to generate the output filename. Often this name is not DOS-compatible.

If you set this flag AscToHTM will ensure all filenames are DOS compatible as follows:

A five-letter filename root will be calculated. This will either be the first 5 letters of the input filename, or the DOS file root specified in policy. If generating many files, the section numbers N, N.N, N.N.N are appended in the form _N, _N_N etc.

If the resulting filename exceeds 8 characters a unique name is generated consisting of 3 letters and 5 numbers. The 3 letters are the first three letters of the DOS root. The extension .htm is added.
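The naming rules above can be sketched in Python as follows. This is an approximation under stated assumptions: the function name is hypothetical, and the text does not say how the 5 numbers in the fallback name are chosen, so a simple counter (the sequence argument) is used here.

```python
def dos_filename(input_name, section="", dos_root="", sequence=1):
    """Sketch of the DOS-compatible naming rules described above.

    section  -- a section number such as "1.2" (empty for the first file)
    dos_root -- the DOS filename root policy, if set
    sequence -- hypothetical counter for the 3-letters-plus-5-digits fallback
    """
    stem = input_name.rsplit(".", 1)[0]
    # Five-letter root: the policy value if given, else the input filename.
    root = (dos_root or stem)[:5]
    name = root
    if section:
        # Section numbers N.N become _N_N.
        name += "_" + section.replace(".", "_")
    if len(name) > 8:
        # Too long for DOS: 3 letters of the root plus 5 digits.
        name = root[:3] + "%05d" % sequence
    return name + ".htm"
```

Under these assumptions, document.txt gives docum.htm, section 1 gives docum_1.htm, and section 1.2 (which would exceed 8 characters) falls back to a name such as doc00001.htm.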

This appears on-screen as "Generate DOS-compatible filenames"

DOS filename root

Menu location: Conversion options -> Output policies -> File generation -> General

If DOS compatibility is selected, this is the up-to-5 letter DOS file root used in generating filenames.

If omitted the first 5 characters of the input filename are used instead.

File splitting policies

Menu location: Conversion options -> Output policies -> File generation -> Splitting large files

These policies determine how - if at all - the source document is to be converted into one or more output HTML files. AscToHTM can split the output at heading boundaries, and ensure that hyperlinks work across the suite of output HTML files.

In addition, a navigation bar can be added to each page, taking you to the next and previous HTML document, and back to the contents list.

Split into several HTML files

HTML file creation

Split at which heading level
Minimum output file size
Minimise HTML file size
Add navigation bar
Split at page markers

File naming

Number of words to include in filename
Use numbered sequence in file names

Split level

Menu location: Conversion options -> Output policies -> File generation -> Splitting large files

If the output file is being split into several HTML files, this policy determines at which heading level the file is split. Splits will occur at each heading at the specified level or above.

For example, a value of 2 will create a new page starting on section 1 and section 1.1, but section 1.1.1 will not create a new page, and will thus be copied into the page containing section 1.1.

The HTML pages created will have the section number appended to the file root, thus in this case a file source.txt would create files source_1.html, source_1_1.html etc.
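The split rule and the resulting filenames can be illustrated with the Python sketch below. The function names are hypothetical, and the ".html" extension is just the one used in the example above.

```python
def starts_new_page(section_number, split_level):
    """True if a heading such as "1.1" starts a new page at this split level.

    A split level of 0 means no splitting takes place.
    """
    if split_level <= 0:
        return False
    depth = section_number.count(".") + 1   # "1.1.1" has depth 3
    return depth <= split_level


def split_filename(source_name, section_number, extension=".html"):
    """Append the section number to the file root, e.g. source_1_1.html."""
    root = source_name.rsplit(".", 1)[0]
    return root + "_" + section_number.replace(".", "_") + extension
```

With a split level of 2, sections 1 and 1.1 each start a new page, while section 1.1.1 stays on the page for section 1.1.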

This appears on-screen as the two options "Split into several HTML pages" and "split at what heading level". If the first is unchecked, the value is set to 0, meaning no splitting will occur.

This should be set manually, or loaded from a policy file.

See the discussion in Splitting large files into a linked set of smaller files

Min HTML file size

Menu location: Conversion options -> Output policies -> File generation -> Splitting large files

If the output file is being split into several HTML files, this policy determines the minimum number of lines to be added to each HTML file. This is sometimes useful when a sub-section follows immediately after a heading. By setting this value large enough you can avoid having an HTML created with only the major section heading in it.

This should be set manually, or loaded from a policy file.

This appears on-screen as "Minimum output file size"

See the discussion in Splitting large files into a linked set of smaller files

Add navigation bar

Menu location: Conversion options -> Output policies -> File generation -> Splitting large files

If the output file is being split into several HTML files, this policy determines whether or not a navigation bar should be added at the foot of each HTML page generated. This navigation bar will include

a link to the previous HTML page (if any)
a link to the next HTML page (if any)
a link to the contents list (if any) at the start of the next section

This should be set manually, or loaded from a policy file.

See the discussion in Splitting large files into a linked set of smaller files

Minimise HTML file size

Menu location: Conversion options -> Output policies -> File generation -> Splitting large files

This policy indicates that the created HTML files should be optimised for size. This is done by eliminating unnecessary white space in the output. AscToHTM adds white space to make the HTML more readable.

The choice here is to sacrifice the readability and editability of the HTML to produce a file that will download faster. Typically the files are around 10% smaller, but in a file that consists of many tables the savings can be more like 30%.

Split at page markers

(New in version 5.0)

Menu location: Conversion options -> Output policies -> File generation -> Splitting large files

Specifies that the document should be split into smaller HTML pages at the page markers.

The first file created normally has a name that matches the source file. Subsequent files append the page number (_1, _2...)

Number of words to include in filename

(New in version 5.0)

Menu location: Conversion options -> Output policies -> File generation -> Splitting large files

When splitting a large document into files a new file will be created each time a suitable heading is reached. If the heading is underlined or capitalised (as opposed to numbered), the first few words of the heading will be included in the filename.

This policy determines how many words from the title should be used.
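
As a rough illustration of this policy, the sketch below builds a filename from the first few words of a heading. The function name heading_to_filename is hypothetical, and the lower-casing, the underscore separator and the character filtering are assumptions of this sketch, not documented AscToHTM behaviour.

```python
import re


def heading_to_filename(heading: str, max_words: int, extension: str = ".html") -> str:
    """Build a filename from the first max_words words of a heading (illustrative only)."""
    # Keep only runs of letters and digits so the result is safe to use as a filename.
    words = re.findall(r"[A-Za-z0-9]+", heading)
    chosen = words[:max_words]
    # Lower-casing and joining with underscores are assumptions of this sketch.
    return "_".join(word.lower() for word in chosen) + extension


print(heading_to_filename("Policies supported by AscToHTM", 2))
```

A smaller word count gives shorter names, at the cost of a greater chance that two headings produce the same filename.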

Use numbered sequence in file names

(New in version 5.0)

Menu location: Conversion options -> Output policies -> File generation -> Splitting large files

When splitting a large document into files this option allows you to name your output files using a numbering sequence. Previously the file names would attempt to come from the first heading and it was possible to get duplicates.

Enabling this option means that when converting filename.txt, the output will be split into filename_1.html, filename_2.html and so on, giving more consistent and orderly filenames.
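
The numbered naming scheme can be sketched as follows. The function name numbered_output_names is hypothetical; only the filename_1.html, filename_2.html pattern comes from the description above.

```python
from pathlib import Path
from typing import List


def numbered_output_names(source: str, page_count: int) -> List[str]:
    """Return the numbered output filenames for a split document (illustrative only)."""
    stem = Path(source).stem  # "filename.txt" -> "filename"
    return [f"{stem}_{number}.html" for number in range(1, page_count + 1)]


print(numbered_output_names("filename.txt", 3))
```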

Font policies

Menu location: Conversion options -> Output policies -> Fonts

Fonts

Normal text	Default font
Headings	Heading Font
Text in tables	Table font
Table of contents	TOC Font
Fixed-pitch text	Fixed font

Implementation

Font Implementation

Default font

Menu location: Conversion options -> Output policies -> Fonts

This specifies the default font to be used. It may be edited via a normal Windows Font selection dialog.

Heading font

(New in version 5.0)

Menu location: Conversion options -> Output policies -> Fonts

This specifies the default font to be used for headings. The actual headings will be based on this font family, but will be made larger and/or italic according to the level of each heading.

It may be edited via the Font selection dialog.

Table font

(New in version 5.0)

Menu location: Conversion options -> Output policies -> Fonts

This specifies the default font to be used inside tables. This will default to the Default Font, but you may want to set it smaller in order to fit wide tables on the page.

It may be edited via the Font selection dialog.

Table of contents Font

(New in version 5.0)

Menu location: Conversion options -> Output policies -> Fonts

This specifies the default font used in any generated Table of Contents. The font family specified will be used, but the different levels of heading in the list will be given different sizes and italics, just as in a default Word document.

It may be edited via the Font selection dialog.

This policy appears on-screen as "TOC font"

Fixed font

(New in version 5.0)

Menu location: Conversion options -> Output policies -> Fonts

This specifies the default font to be used for ASCII art and diagrams and other portions of text where the spacing is to be preserved. For this a mono-spaced font such as Courier is usually used. The Font size is also usually set a bit smaller at 8pt. This is to ensure that an 80-character "line" in the original document will fit on a page in the output document.

It may be edited via the Font selection dialog.

The Font Selection Dialog

Each of the font values may be chosen using the font selection dialog. The selected font is shown as a comma-separated list containing :-

The font name
The font characteristics ("Regular", "bold", "italic" or "bold italic")
The font size (in pts)

Depending on the font concerned some of these attributes may be ignored. For example when selecting the font for headings the sizes of different headings are different, based on the selected value.

Use CSS to implement fonts

Menu location: Conversion options -> Output policies -> Fonts

This specifies whether or not CSS should be used to implement any selected fonts for the document. This is enabled by default.

This appears on-screen as a choice of "implementation" as follows

Using Cascading Style sheets (CSS). In this case the desired font is implemented by adding a <STYLE>..</STYLE> section to the head of each HTML page generated.
Using <FONT>...</FONT> markup.

In HTML 4.0 onwards the use of CSS is encouraged (indeed in Strict HTML 4.0 the <FONT> tag is no longer permitted). CSS has many advantages, chief amongst which are reduced file sizes and the clean separation of style from content, allowing users to substitute their own styles (e.g. larger fonts, clearer colours etc).

CSS is supported to various degrees by all of the version 4 and above browsers. The CSS generated by AscToHTM lies well within the capabilities of all these browsers.

By contrast using <FONT> tags leads to larger files, less user control and is actually not permitted in strict HTML 4.0 and above. That said it is supported by earlier browsers.

We advise you use CSS (the default behaviour), and only consider switching to using <FONT> tags if you experience problems, or have a significant proportion of your target audience using older (V3 and earlier) browsers.

Base font size (in pts)

Deprecated in version 5.0

This specifies the default font size to be used for the document. The size is specified in points, with 12pt corresponding to HTML size 3.

Delta font size (in pts)

Deprecated in version 5.0

This specifies an increment in font sizes to be applied to the document. Each <FONT> tag generated will have its size adjusted by this amount. This can be used to make text universally larger or smaller.

Frames policies

Menu location: Conversion options -> Output policies -> Frames

From version 4 onwards AscToHTM supports placing the generated HTML in FRAMES. These policies can be used to fine-tune the process.

Policies that apply to frames generation in general

Place document in frames
File frame name
Open frame links in new window
- New window name
Add frame border
Add NOFRAMES links
- NOFRAMES link URL

Frame colours

A button will take you to the Frame colours dialog.

"Header" frame policies

Frame depth
Use any header defined for main pages

"Contents" frame policies

Add contents frame if possible
Number of contents levels
Frame width

"Main" frame policies

Split into several HTML files
Split at what level
First page to show in main frame

"Footer" frame policies

Frame depth
Use any footer defined for main pages
Only allow pages to be viewed in frames

For a fuller discussion see Generating a set of Frames.

Place document in frames

Menu location: Conversion options -> Output policies -> Frames

This policy allows you to specify that you want your document output in HTML frames. The program will create a HTML page with the necessary <FRAMESET> tags.

Output frame name

Menu location: Conversion options -> Output policies -> Frames

This specifies the name of the HTML file which will contain the FRAMESET tags that define the layout and contents of the FRAMES. If omitted the default behaviour is to append "_frame" to the filename. For example myfile.txt would generate

        myfile_frame.html               FRAMESET file
        myfile_contents_frame.html      (optional) contents frame
        myfile_header_frame.html        (optional) header frame
        myfile_footer_frame.html        (optional) footer frame

and then

        myfile.html                     first main frame page
        myfile_1.html                   (optional) second main frame page
        ...                             ...
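
The default naming scheme listed above can be sketched as follows. The function name frame_file_names and the dictionary keys are hypothetical. How a user-supplied frame name affects the other file names is not stated here, so this sketch assumes it only replaces the FRAMESET file name.

```python
from pathlib import Path
from typing import Dict


def frame_file_names(source: str, frame_name: str = "") -> Dict[str, str]:
    """Return the file names generated for a FRAMES conversion (illustrative only)."""
    stem = Path(source).stem  # "myfile.txt" -> "myfile"
    return {
        # If no frame name is supplied, "_frame" is appended to the source name.
        "frameset": frame_name or f"{stem}_frame.html",
        "contents": f"{stem}_contents_frame.html",
        "header": f"{stem}_header_frame.html",
        "footer": f"{stem}_footer_frame.html",
        "first_main_page": f"{stem}.html",
    }


for role, name in frame_file_names("myfile.txt").items():
    print(role, name)
```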

Open frame links in new window

Menu location: Conversion options -> Output policies -> Frames

This specifies that whenever you click on an "External" hyperlink inside a set of generated FRAMES, a new window should be opened, rather than displaying the selected page inside the current FRAMESET.

Note, the program distinguishes between "internal" links to other parts of the same converted document (be they in a different HTML page or whatever) and "external" links to URLs nothing to do with the current conversion. Thus a link to www.jafsoft.com will create a new window, but a link to section 3.2 will not.

New frame link window name

Menu location: Conversion options -> Output policies -> Frames

When an "external" hyperlink is clicked on from inside a generated FRAMESET and the policy Open frame links in new window is enabled, this is the name of the browser window to be used as the target for the link.

You can specify any name you like, but the default is the reserved browser window name "_top", which will cause the current window to be used, thereby replacing the whole FRAMESET currently on display (this will re-appear if you press the BACK button).

Add frame border

Menu location: Conversion options -> Output policies -> Frames

This specifies whether or not you want visible borders on your frames. If the borders are visible then users will be able to select and move them to resize the frames to a size that suits them.

Add NOFRAMES links

Menu location: Conversion options -> Output policies -> Frames

Specifies that a NOFRAMES link should be added to the Contents frame. You can choose the target of this link using the NOFRAMES link URL policy.

The NOFRAMES link will target the "_top" browser window, thereby replacing the current FRAMESET by the single page selected without creating new browser windows.

NOFRAMES links are a useful courtesy to users who dislike FRAMES, and they are also search-engine friendly.

Note, this link is a visible NOFRAMES link. In addition to this the software will in any case create a non-visible <NOFRAMES> tag that will allow users whose browsers do not support FRAMES to access your content.

NOFRAMES link URL

Menu location: Conversion options -> Output policies -> Frames

When the Add NOFRAMES links policy is enabled, this is the URL that is used in the visible NOFRAMES hyperlink added to the contents frame.

If left blank (the default) this link will point to the first HTML page that would normally be displayed in the main frame.

You should only really need to use this if you have created an independent set of non-FRAMES pages that you'd like to link to instead.

Header frame depth

Menu location: Conversion options -> Output policies -> Frames

If a header frame is generated, this specifies its depth. The default is 100 pixels. To specify a percentage, add the % sign on the end, e.g. "20%"

Use main header in header frame

Menu location: Conversion options -> Output policies -> Frames

When this policy is set the software will attempt to move any HTML header defined for each page into a header frame. This allows the same policy file to be used for both FRAMES and non-FRAMES generation.

In non-FRAMES generation the supplied header will be included at the top of each page generated.

If this policy is enabled, then in FRAMES generation the supplied header will be placed in a header frame, and omitted from each page generated. This means the header will always be visible in the header frame, as the generated pages are displayed and scrolled in the main frame.

Further modification of this behaviour is possible by using the HTML fragments feature to define specific HTML fragments to be used in particular frames.

Add contents frame if possible

Menu location: Conversion options -> Output policies -> Frames

This specifies that, if possible, a contents frame should be placed on the left of the screen to hold a generated contents list. Alternatively a CONTENTS_FRAME HTML fragment can be defined.

Generated contents lists are only possible when the program recognises headings inside your text file, so you may need to get that working first.

If no content for the contents frame is possible, a WARNING message is generated.

Number of levels in contents frame

Menu location: Conversion options -> Output policies -> Frames

When a generated contents list is placed in a contents frame, this policy specifies how many levels of heading you want shown there.

In large, complex documents there can be up to 4 levels of heading. Whilst in a normal contents list you'd want to see most of these, in a FRAMES document space (width) is tight and you don't really want to have to scroll the left frame. In such cases you may want to limit the amount of detail in the contents frame.

The default value "0" is interpreted as "all", i.e. a full contents list will be generated.
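
The "0 means all" rule can be sketched as follows. The function name limit_contents and the (level, title) representation of contents entries are assumptions made for illustration.

```python
from typing import List, Tuple

ContentsEntry = Tuple[int, str]  # (heading level, heading text)


def limit_contents(entries: List[ContentsEntry], max_levels: int) -> List[ContentsEntry]:
    """Keep only headings down to max_levels; 0 is interpreted as "all" (illustrative only)."""
    if max_levels == 0:
        return list(entries)
    return [(level, title) for level, title in entries if level <= max_levels]


contents = [(1, "Output policies"), (2, "Frames policies"), (3, "Add frame border")]
print(limit_contents(contents, 2))
```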

Contents frame width

Menu location: Conversion options -> Output policies -> Frames

If a contents frame is generated, this specifies its width. The default is 200 pixels. To specify a percentage, add the % sign on the end, e.g. "30%"
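
To show how a pixel or percentage value might end up in the generated markup, here is a minimal sketch. The function name frameset_cols is hypothetical, and the exact FRAMESET tag AscToHTM writes is an assumption. The "*" is standard HTML frameset syntax for "the remaining space".

```python
def frameset_cols(contents_width: str = "200") -> str:
    """Build a FRAMESET tag with a left-hand contents frame (illustrative only)."""
    width = contents_width.strip()
    # A trailing % marks a percentage; otherwise the value is taken as pixels.
    number = width[:-1] if width.endswith("%") else width
    if not number.isdigit():
        raise ValueError(f"not a pixel or percentage value: {contents_width!r}")
    return f'<FRAMESET COLS="{width},*">'


print(frameset_cols())       # default width of 200 pixels
print(frameset_cols("30%"))  # percentage width
```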

First frame page number

Menu location: Conversion options -> Output policies -> Frames

This specifies the "page number" of the main document that you initially want shown in the main frame of the FRAMESET.

When converting large documents you can choose to split them into many smaller pages. Often such documents have a contents list as the sole content of the first page, and inside a FRAMES document this has the unfortunate side-effect of displaying a contents list in the main frame next to a contents list in the contents frame. In such cases you might choose to start by initially displaying the second page in the main frame. That's what this policy is for.

This appears on-screen as "First page to show in 'main' frame"

Footer frame depth

Menu location: Conversion options -> Output policies -> Frames

If a footer frame is generated, this specifies its depth. The default is 100 pixels. To specify a percentage, add the % sign on the end, e.g. "20%"

Use main footer in footer frame

Menu location: Conversion options -> Output policies -> Frames

When this policy is set the software will attempt to move any HTML footer defined for each page into a footer frame. This allows the same policy file to be used for both FRAMES and non-FRAMES generation.

In non-FRAMES generation the supplied footer will be included at the foot of each page generated.

If this policy is enabled, then in FRAMES generation the supplied footer will be placed in a footer frame, and omitted from each page generated. This means the footer will always be visible in the footer frame, as the generated pages are displayed and scrolled in the main frame.

Further modification of this behaviour is possible by using the HTML fragments feature to define specific HTML fragments to be used in particular frames.

Only allow pages to be viewed in frames

Menu location: Conversion options -> Output policies -> Frames

This specifies that you only want your frame pages to be viewed inside frames. If this is selected some JavaScript is added to the page to force a refresh in frames if a browser should come direct to one of the pages. You might want this if a search engine indexes your individual pages. If that happens visitors may come to a single page and not see the surrounding frames.

Frame colours dialog

Menu location: Conversion options -> Output policies -> Frames
(then press the "Edit frame colours >>>" button)

This dialog allows you to choose the colours used in each frame. By default all the frames will default to the colours of the main frame, which in turn will use the colours usually set for non-frames production.

Contents frame background colour
Contents frame text colour
Footer frame background colour
Footer frame text colour
Header frame background colour
Header frame text colour

Contents frame background colour

Menu location: Conversion options -> Output policies -> Frames
(then press the "Edit frame colours >>>" button)

Only applies to FRAMES production when HTML is being generated

See discussion in Header frame background colour.

Contents frame text colour

Menu location: Conversion options -> Output policies -> Frames
(then press the "Edit frame colours >>>" button)

Only applies to FRAMES production when HTML is being generated

See discussion in Header frame background colour.

Footer frame background colour

Menu location: Conversion options -> Output policies -> Frames
(then press the "Edit frame colours >>>" button)

Only applies to FRAMES production when HTML is being generated

See discussion in Header frame background colour.

Footer frame text colour

Menu location: Conversion options -> Output policies -> Frames
(then press the "Edit frame colours >>>" button)

Only applies to FRAMES production when HTML is being generated

See discussion in Header frame background colour.

Header frame background colour

Menu location: Conversion options -> Output policies -> Frames
(then press the "Edit frame colours >>>" button)

Only applies to FRAMES production when HTML is being generated

The program allows you to independently set the background and text colour of each frame in a FRAMESET. The default behaviour is that each frame will match the colours set for the main document using the policies

Background Colour
Text Colour

You can override this behaviour using the frame colouring policies

Header frame background colour
Header frame text colour
Contents frame background colour
Contents frame text colour
Footer frame background colour
Footer frame text colour

Header frame text colour

Menu location: Conversion options -> Output policies -> Frames
(then press the "Edit frame colours >>>" button)

Only applies to FRAMES production when HTML is being generated

See discussion in Header frame background colour.

Hyperlink policies

Menu location: Conversion options -> Output policies -> Hyperlinks

Add hyperlinks

http:// and www references
Convert Email references
- Allow email addresses that begin with a number
Check domain name syntax
Convert Gopher references
Convert Telnet references
Convert FTP references
- Convert "weak" FTP references
Convert USENET newsgroup references
- Convert only recognised USENET newsgroups
- Additional hierarchies to recognize
Add <BR> to lines with URLs
Max length of heading links

Hyperlinks to other section numbers

Convert cross-references to other sections

Browser behaviour when hyperlink selected

Open links in new browser window
New browser window name

See also the comments in the adding hyperlinks section.

Create hyperlinks

Menu location: Conversion options -> Output policies -> Hyperlinks

This specifies that all valid "http" and www references that are found should be turned into active hyperlinks.

Such hyperlinks may sometimes get confused by surrounding punctuation characters.

This appears on-screen as "http:// and www references"

Create mailto links

Menu location: Conversion options -> Output policies -> Hyperlinks

This specifies that all valid email addresses that are found should be turned into active "mailto" hyperlinks.

AscToHTM has no way of checking email addresses, so "made up" addresses will also get converted, although the domain name will be validated.

An extra option allows email addresses beginning with a number to be accepted. Often USENET message IDs have an email format, but start with a number, so by default these are not converted to email hyperlinks.

This appears on-screen as "Convert email references"

Allow email beginning with numbers

Menu location: Conversion options -> Output policies -> Hyperlinks

This specifies whether or not email addresses that begin with numbers are allowed.

The program has no way of validating email addresses. Often documents - especially Usenet posts and the like - contain message IDs that look like email addresses but aren't. These usually begin with a number, and so by default the program will ignore "addresses" in this form.

On the other hand some ISPs (e.g. older CompuServe accounts) allow email addresses that start with numbers. You should toggle this policy according to which is more appropriate for your documents.

This appears on-screen as "Allow email addresses that begin with a number"
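
The effect of this policy can be sketched as follows. The regular expression and the function name find_emails are assumptions made for illustration; the pattern AscToHTM actually uses is not documented here, and the example.com addresses are invented.

```python
import re
from typing import List

# A deliberately simple address pattern: local part, "@", dotted domain name.
EMAIL_PATTERN = re.compile(r"\b[A-Za-z0-9._%+-]+@[A-Za-z0-9-]+(?:\.[A-Za-z0-9-]+)+\b")


def find_emails(text: str, allow_leading_digit: bool = False) -> List[str]:
    """Return candidate email addresses, optionally skipping those that start with a digit."""
    found = []
    for match in EMAIL_PATTERN.finditer(text):
        address = match.group(0)
        # Message IDs often look like addresses but begin with a number.
        if address[0].isdigit() and not allow_leading_digit:
            continue
        found.append(address)
    return found


sample = "Mail user@example.com about 20010101.1234@example.com"
print(find_emails(sample))
print(find_emails(sample, allow_leading_digit=True))
```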

Create FTP links

Menu location: Conversion options -> Output policies -> Hyperlinks

This specifies that all FTP addresses that are found should be turned into active hyperlinks.

These will usually start with "ftp://" or be a domain name starting "ftp.".

However, quite often FTP sites have domain names that don't start with "ftp." but do end in a recognised domain type such as ".com". An extra option allows the program to convert such "weak" or implicit FTP references into FTP links. See Only allow explicit FTP links.

This appears on-screen as "Convert FTP references"

Only allow explicit FTP links

Menu location: Conversion options -> Output policies -> Hyperlinks

This specifies that all "internet" addresses which don't start with "www." or "ftp." should be regarded as FTP sites.

Often FTP sites have domain names that don't start with "ftp." but do end in a recognised domain type such as ".com". For example rtfm.mit.edu is a well known archive.

This appears on-screen as "Convert "weak" FTP references"

Create Gopher links

(New in version 5.0)

Menu location: Conversion options -> Output policies -> Hyperlinks

This specifies that all gopher addresses that are found should be turned into active hyperlinks.

These will usually start with "gopher://".

This appears on-screen as "Convert Gopher references"

Create Telnet links

(New in version 5.0)

Menu location: Conversion options -> Output policies -> Hyperlinks

This specifies that all telnet addresses that are found should be turned into active hyperlinks.

These will usually start with "telnet://".

This appears on-screen as "Convert Telnet references"

Check domain name syntax

Menu location: Conversion options -> Output policies -> Hyperlinks

This specifies whether or not potential URLs should have their "domain name" checked against the known domain name structures (i.e. whether it ends in .com, .org, .co.uk etc). Having this switched on reduces the likelihood of invalid URLs being turned into clickable links that don't go anywhere. Note, the software doesn't check the domain exists, only that the domain name obeys the known rules.

You might want to switch this off if your document contains URLs that don't use standard domain names (e.g. they are inside an Intranet).
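
A syntax-only check of this kind might look like the sketch below. The list of endings contains only the three named above, and the function name has_known_domain_ending is hypothetical; the rules AscToHTM really applies are not listed here.

```python
from typing import Sequence

# Only the endings mentioned in the text; a real list would be much longer.
KNOWN_ENDINGS = (".com", ".org", ".co.uk")


def has_known_domain_ending(host: str, endings: Sequence[str] = KNOWN_ENDINGS) -> bool:
    """Check the form of a host name only; no lookup is made to see if it exists."""
    host = host.lower().rstrip(".")
    # Require something in front of the ending, so ".com" alone is rejected.
    return any(host.endswith(ending) and len(host) > len(ending) for ending in endings)


print(has_known_domain_ending("www.jafsoft.com"))
print(has_known_domain_ending("intranet-server"))
```

The second host name fails the check, which is the Intranet case in which you might switch this policy off.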

Create NEWS links

Menu location: Conversion options -> Output policies -> Hyperlinks

This specifies that AscToHTM should attempt to identify Usenet newsgroup names and turn them into active "news" hyperlinks.

AscToHTM has no way of checking newsgroup names, so by default it will only convert names in recognised hierarchies such as alt., comp., rec. etc.

This appears on-screen as "Convert USENET newsgroup references"

Only use known groups

Menu location: Conversion options -> Output policies -> Hyperlinks

This specifies that when detecting Usenet newsgroup names, AscToHTM should only convert names in recognised hierarchies such as alt., comp., rec. etc. You can get the program to recognize additional hierarchies.

This appears on-screen as "Convert only recognised USENET newsgroups"

Recognised USENET groups

Menu location: Conversion options -> Output policies -> Hyperlinks

This specifies that when detecting Usenet newsgroup names, AscToHTM should allow "newsgroups" in these hierarchies in addition to the standard hierarchies such as alt., comp., rec. etc.

This appears on-screen as "Additional hierarchies to recognize"
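
The interaction between the newsgroup policies can be sketched as follows. The pattern, the function name find_newsgroups and the short list of standard hierarchies (only those named above) are assumptions of this sketch; "uk" is used purely as an example of an additional hierarchy.

```python
import re
from typing import Iterable, List

# Only the hierarchies named in the text; the real built-in list is longer.
STANDARD_HIERARCHIES = {"alt", "comp", "rec"}

# A dotted lower-case name such as comp.lang.python; group 1 is the hierarchy.
NEWSGROUP_PATTERN = re.compile(r"\b([a-z][a-z0-9]*)((?:\.[a-z0-9]+)+)\b")


def find_newsgroups(text: str, only_known: bool = True,
                    extra_hierarchies: Iterable[str] = ()) -> List[str]:
    """Return dotted names that look like newsgroups (illustrative only)."""
    known = STANDARD_HIERARCHIES | {h.rstrip(".").lower() for h in extra_hierarchies}
    result = []
    for match in NEWSGROUP_PATTERN.finditer(text):
        if only_known and match.group(1) not in known:
            continue
        result.append(match.group(0))
    return result


sample = "See comp.lang.python and uk.test for details."
print(find_newsgroups(sample))
print(find_newsgroups(sample, extra_hierarchies=["uk."]))
```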

Cross-refs at level

Menu location: Conversion options -> Output policies -> Hyperlinks

This specifies the level of heading at which AscToHTM should turn references to section numbers in the main text into hyperlinks to those sections.

This is only possible for numbered sections.

If selected, you should specify the level at which such cross-references should start. A value of "1" will attempt to convert all numbers N, N.N... to hyperlinks. A value of "2" will attempt to convert N.N, N.N.N... etc.

This policy is quite prone to error (e.g. Windows 3.1 often becomes a hyperlink to section 3.1). Consequently lower values are more error prone. A value of "2" is set by default.

Later versions may address this problem.

This appears on-screen as "Convert cross-references to other sections"
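
The level rule, and the "Windows 3.1" problem, can be illustrated with a small sketch. The regular expression and the function name find_section_references are assumptions; AscToHTM's own detection is not documented here and presumably also checks that the section exists.

```python
import re
from typing import List


def find_section_references(text: str, level: int = 2) -> List[str]:
    """Return numbers that look like section references at or below the given level."""
    # Level 1 accepts N, N.N, ...; level 2 requires at least N.N; and so on.
    pattern = re.compile(r"\b\d+(?:\.\d+){%d,}\b" % (level - 1))
    return pattern.findall(text)


sample = "Windows 3.1 users should read section 3.2.1"
print(find_section_references(sample, level=2))
print(find_section_references(sample, level=3))
```

At level 2 the "3.1" in "Windows 3.1" is picked up as a candidate reference, which is the kind of false match described above. Raising the level removes it, at the cost of missing genuine two-level references.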

Add <BR> to lines with URLs

Menu location: Conversion options -> Output policies -> Hyperlinks

This specifies that when a line is found to have a URL on it, a <BR> will be added to the output line. This will preserve the line structure of the original document. This can be useful where you have a list of URLs, one-per-line. It's less useful for a document where URLs are embedded in normal text.
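
A minimal sketch of this behaviour, assuming a simple test for "http" and "www" references; the pattern and the function name add_br_to_url_line are hypothetical.

```python
import re

# A simple test for http/https and www references; an assumption of this sketch.
URL_PATTERN = re.compile(r"(?:https?://|www\.)\S+")


def add_br_to_url_line(line: str) -> str:
    """Append <BR> to a line that contains a URL, leaving other lines unchanged."""
    return line + "<BR>" if URL_PATTERN.search(line) else line


for source_line in ["www.jafsoft.com", "A line of normal text"]:
    print(add_br_to_url_line(source_line))
```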

Max length of heading links

(New in version 5.0)

Menu location: Conversion options -> Output policies -> Hyperlinks

This option sets a limit on the length of headings used as titles for the generated HTML pages, or as hyperlinks in generated contents lists or directory pages.

When set to a non-zero value the heading text will be truncated back to the last sentence-ending period or, failing that, the last space. In other words the program attempts to truncate at a word boundary. When a link is truncated, the ellipsis "..." is added to indicate this.
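
The truncation rule can be sketched as follows. The function name truncate_heading is hypothetical. Adding the ellipsis only when the cut falls at a word boundary (and not when it falls at the end of a sentence) is an assumption of this sketch.

```python
def truncate_heading(heading: str, max_length: int) -> str:
    """Shorten a heading to at most max_length characters (illustrative only)."""
    # A limit of zero means "no limit".
    if max_length <= 0 or len(heading) <= max_length:
        return heading
    window = heading[:max_length]
    # Prefer to cut back to the end of the last complete sentence.
    cut = window.rfind(". ")
    if cut != -1:
        return window[:cut + 1]
    # Failing that, cut back to the last space and mark the truncation.
    cut = window.rfind(" ")
    if cut == -1:
        cut = max_length
    return window[:cut].rstrip() + "..."


print(truncate_heading("This is the first sentence. And here is a second one", 40))
print(truncate_heading("Policies supported by AscToHTM", 20))
```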

This option can be useful when headings are very long, or when options such as Use first heading as title, Use first line as heading or Use first line as Title are used on a file that places each paragraph on a single line (i.e. no line breaks). In such cases the whole first paragraph would be treated as a title/heading.

Open links in new browser window

Menu location: Conversion options -> Output policies -> Hyperlinks

This specifies that where hyperlinks are added to the HTML, the target window for these links should be a new window. This means that when you click on these links a new window is opened, and the original window continues to display the original document. If you don't select this policy, the new document replaces the old document in the current browser window.

New browser window name

Menu location: Conversion options -> Output policies -> Hyperlinks

When the policy Open links in new browser window is enabled, this specifies the name of the browser window to be used as the target for those links.

Pre-processor policies

Menu location: Conversion options -> Output policies -> Pre-processor

A few policies can be used to control the use of the pre-processor. For details of how the pre-processor works and the tags available see the Tag manual that comes as part of the HTML documentation

Use Preprocessor

Include document section(s)

Allow definitions inside a user <PRE> section

Use Preprocessor

Menu location: Conversion options -> Output policies -> Pre-processor

When enabled the pre-processor is activated. You would only ever want to de-activate it to see what difference not processing any pre-processor commands would make.

Include document section(s)

Menu location: Conversion options -> Output policies -> Pre-processor

This is a comma-separated list of the SECTIONs you want included in your document. This only applies if you've made use of the SECTION command to mark up parts of your document to be conditionally output during the conversion.

Allow definitions inside PRE

Menu location: Conversion options -> Output policies -> Pre-processor

This option specifies whether AscToHTM should search for definition lines inside user-specified pre-formatted text.

This is only relevant should you wish definitions to be highlighted in such cases.

Style policies

Menu location: Conversion options -> Output policies -> HTML styling

Miscellaneous

Allow automatic centring
- Centring Tolerance (in characters)
Ignore multiple blank lines

Definitions

Highlight definition terms
Use <DL> markup for defn. paras

Headings

Largest allowed <Hn> tag
Headings colour
Preserve underlining of headings

Emphasis

Markup emphasised text
- Use <EM> and <STRONG> markup

Paragraphs

Use <P> markup for paragraphs
Preserve any first line indentation

See the discussions on detecting paragraphs, definitions, headings, and emphasis characters.

Style sheet URL

Menu location: Conversion options -> Output policies -> HTML styling

Style sheets are a feature recently added to HTML. They are supported by Internet Explorer and Netscape 4.0 and above. They may not be supported by earlier browsers.

This policy allows you to specify the URL of any style sheet you want to use. AscToHTM will then add the requisite HTML to the <HEAD> portion of your document.

Highlight definition text

Menu location: Conversion options -> Output policies -> HTML styling

This policy specifies whether or not the definition term in a definition should be highlighted in bold.

See definitions for more details

Use <DL> markup for defn. paras

Menu location: Conversion options -> Output policies -> HTML styling

This policy specifies whether or not definition paragraphs should be marked up using <DL>...</DL> markup.

<DL> markup puts the definition term on one line, and the definition text as an indented paragraph immediately below.

See definitions for more details

Largest allowed <Hn> tag

Menu location: Conversion options -> Output policies -> HTML styling

This policy (together with Smallest allowed <Hn> tag) specifies the range of values that may be used in <Hn> markup on headings. In HTML, <H1> gives the largest heading, <H2>... give smaller headings, thus the "largest" <Hn> value will actually be smaller than the "smallest" value.

These policies only apply to numbered headings that are detected.
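
One way to picture the two limits is the sketch below. The mapping from section level to <Hn> value (top-level sections get the largest tag, deeper sections get progressively smaller ones until the smallest is reached) is an assumption made for illustration, as is the function name heading_tag.

```python
def heading_tag(level: int, largest: int, smallest: int) -> str:
    """Choose an <Hn> tag name for a numbered heading level (illustrative only)."""
    # The "largest" heading has the smaller n, e.g. largest=2 and smallest=4.
    if not 1 <= largest <= smallest <= 6:
        raise ValueError("expected 1 <= largest <= smallest <= 6")
    n = min(largest + level - 1, smallest)
    return f"H{n}"


for section_level in (1, 2, 5):
    print(section_level, heading_tag(section_level, largest=2, smallest=4))
```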

Smallest allowed <Hn> tag

Menu location: Conversion options -> Output policies -> HTML styling

See discussion in Largest allowed <Hn> tag

Headings colour

Menu location: Conversion options -> Output policies -> HTML styling

This policy identifies the HTML COLOUR to be used for all headings in the document.

Preserve underlining of headings

Menu location: Conversion options -> Output policies -> HTML styling

This policy specifies that where headings are underlined in the source text then the heading should be underlined in the generated HTML.

Sometimes headings are underlined for emphasis in plain text, but once recognised as a heading you don't want the underlining itself preserved.

Allow automatic centring

Menu location: Conversion options -> Output policies -> HTML styling

When enabled the software will attempt centred text detection.

This appears on-screen as "Enable automatic centring"

Automatic centring tolerance

Menu location: Conversion options -> Output policies -> HTML styling

When centred text detection is enabled, this specifies how much off-centre text can be and still be considered as centred text. Text is compared to the page width, taking into account any left hand indentation.

If you make this value larger, more text will be considered to be centred and will be centred in the output, although only blocks of text that are wholly centred (all lines fall within the specified tolerance) will be regarded as centred text in the output.
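
The tolerance test can be sketched as follows. Measuring "off-centre" as the difference between the left and right margins, and ignoring any left-hand indentation of the surrounding text, are simplifying assumptions of this sketch; the function names are hypothetical. The default of 2 matches the default listed for this policy.

```python
from typing import Iterable


def is_centred(line: str, page_width: int, tolerance: int = 2) -> bool:
    """Treat a line as centred if its left and right margins differ by at most tolerance."""
    stripped = line.rstrip()
    text = stripped.lstrip()
    if not text:
        return False
    left_margin = len(stripped) - len(text)
    right_margin = page_width - len(stripped)
    return abs(left_margin - right_margin) <= tolerance


def block_is_centred(lines: Iterable[str], page_width: int, tolerance: int = 2) -> bool:
    """A block only counts as centred if every one of its lines is centred."""
    return all(is_centred(line, page_width, tolerance) for line in lines)


print(is_centred(" " * 35 + "AscToHTM", page_width=80))
print(is_centred("AscToHTM", page_width=80))
```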

Ignore multiple blank lines

Menu location: Conversion options -> Output policies -> HTML styling

When enabled multiple blank lines in the input will not be converted to multiple blank lines in the output. This can be desirable when converting a document that has been "paged" and so had extra blank lines added to space out the sections, and this spacing makes no sense and is unwanted in the HTML.

If not selected, AscToHTM will add extra <BR> markup for each additional blank line.
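
With the policy enabled, the effect on the input is roughly that of the sketch below, which reduces each run of blank lines to a single blank line. The function name collapse_blank_lines is hypothetical.

```python
from typing import Iterable, List


def collapse_blank_lines(lines: Iterable[str]) -> List[str]:
    """Reduce every run of blank lines to a single blank line (illustrative only)."""
    result = []
    previous_blank = False
    for line in lines:
        blank = not line.strip()
        # Skip a blank line that directly follows another blank line.
        if blank and previous_blank:
            continue
        result.append(line)
        previous_blank = blank
    return result


print(collapse_blank_lines(["Section one", "", "", "", "Section two"]))
```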

Add emphasis and bold markup

Menu location: Conversion options -> Output policies -> HTML styling

This policy determines whether or not AscToHTM should attempt to convert any potential emphasis characters encountered.

This appears on-screen as "Markup emphasis characters"

See the emphasis section for more details.

Use <EM> and <STRONG> markup

(New in version 5.0)

Menu location: Conversion options -> Output policies -> HTML styling

Specifies whether <EM> and <STRONG> markup should be used in preference to <I> and <B> when placing text in italic or bold markup.

Strictly speaking <EM> and <STRONG> should be used as these are "logical" as opposed to "physical" markup.

The preference has become a near religious decision which I avoid, by giving you this policy so you can choose (if you really care)

Use <P> markup for paragraphs

Menu location: Conversion options -> Output policies -> HTML styling

This policy determines whether <P> or <BR> markup should be used for paragraphs. <P> markup is the default, and is normally interpreted in browsers as occupying more white space.

If you don't want so much white space used, disable this policy.

Preserve new paragraph offset

Menu location: Conversion options -> Output policies -> HTML styling

When enabled, any first-line indentation detected for paragraphs will be preserved. Often paragraphs indent the first line by a few spaces. Where the software detects this you have the choice as to whether an indentation should be preserved in the output.

This appears on-screen as "Preserve any first line indentation"

Table generation policies

Menu location: Conversion options -> Output policies -> Tables

AscToHTM can attempt to generate HTML tables automatically wherever a section of pre-formatted text is encountered. This feature can be disabled in those cases where the "tables" generated are unappealing. In such cases the pre-formatted text will be marked up using <PRE>...</PRE> tags instead.

See also the discussion about pre-formatted text

This page allows you to change the following:-

Default <TABLE> tag attributes

If HTML TABLE generation is attempted, these attributes are used as defaults for the tables that are created. These defaults can be overridden on a table-by-table basis by using the various pre-processor TABLE commands in the source document.

Number of header rows
Table border size
Table width
CELLPADDING
CELLSPACING

Colours

Table colour
Border colour
Colour odd and even rows differently
- Odd row colour
- Even row colour

Alignment

Table alignment
Default TABLE cell alignment

Other Conversion options

Convert cross-references inside tables to hyperlinks
Keep row separator lines in the output
Default table html attributes
Default table html cell attributes

Default TABLE header rows

Menu location: Conversion options -> Output policies -> Tables

This policy specifies how many lines should be regarded as the header of a table. AscToHTM can attempt to detect this, and it may not be the case that all tables in the same file have the same header size.

In such cases leave this value as 0.

This appears on-screen as "Number of header rows"

Default TABLE border size

Menu location: Conversion options -> Output policies -> Tables

This policy specifies the value to be given to the BORDER attribute of the <TABLE> tag.

This attribute determines the size of the table borders. If set to 0, no border will be added to the table.

Default TABLE border colour

Menu location: Conversion options -> Output policies -> Tables

This policy specifies the value to be used in the BORDERCOLOR attribute of any created <TABLE> tags. Not all browsers support this attribute.

The colours should be specified as valid HTML colours, or selected from the colour palette using the browse buttons.

See the discussion about HTML colours

Default TABLE colour

Menu location: Conversion options -> Output policies -> Tables

This policy specifies the value to be used in the BGCOLOR attribute of any created <TABLE> tags. Not all browsers support this attribute.

The colours should be specified as valid HTML colours, or selected from the colour palette using the browse buttons.

See the discussion about HTML colours

Colour data rows

Menu location: Conversion options -> Output policies -> Tables

These policies allow you to specify whether or not you want odd and even data rows of the table coloured differently. The table header will be coloured either according to the document's colour, or - if you've specified it - the table colour.

See the discussion about HTML colours

Default TABLE even row colour

Menu location: Conversion options -> Output policies -> Tables

When Colour data rows is enabled, this specifies the colour to be used as background on even numbered data rows in the table.

Default TABLE odd row colour

Menu location: Conversion options -> Output policies -> Tables

When Colour data rows is enabled, this specifies the colour to be used as background on odd numbered data rows in the table.
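
A minimal Python sketch of alternating row colours is given here for illustration. It is not AscToHTM's implementation: the function name colour_rows is hypothetical, and it assumes data rows are counted from 1 and that the colour is applied through a BGCOLOR attribute on each <TR> tag.

```python
from html import escape


def colour_rows(rows, odd_colour, even_colour):
    """Build <TR> markup, alternating BGCOLOR between odd and even data rows."""
    html_rows = []
    for number, cells in enumerate(rows, start=1):  # assumed: rows count from 1
        colour = odd_colour if number % 2 == 1 else even_colour
        tds = "".join("<TD>{}</TD>".format(escape(cell)) for cell in cells)
        html_rows.append('<TR BGCOLOR="{}">{}</TR>'.format(colour, tds))
    return html_rows
```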

Default TABLE width

Menu location: Conversion options -> Output policies -> Tables

This policy specifies the value to be used for the WIDTH attribute of each generated <TABLE> tag. Table widths are specified in pixels, or as a percentage of the total screen width, thus "350" and "75%" would both be valid values.

Note, this value will be used for all tables generated in the document, so it is only suitable for use when you have a number of similar width tables.
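
The two accepted forms of width value can be checked with a small Python sketch. The name is_valid_table_width is hypothetical, and the rule shown (digits, optionally followed by a percent sign) is an assumption based on the examples above rather than AscToHTM's actual validation.

```python
import re

# One or more digits, optionally followed by a percent sign.
WIDTH_PATTERN = re.compile(r"\d+%?")


def is_valid_table_width(value):
    """Return True for values such as "350" (pixels) or "75%" (percentage)."""
    return WIDTH_PATTERN.fullmatch(value.strip()) is not None
```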

Default TABLE cell padding

Menu location: Conversion options -> Output policies -> Tables

This policy specifies the value to be used in the CELLPADDING attribute of any created <TABLE> tags.

The CELLPADDING attribute specifies the amount of white space to be added to each table cell as padding round the cell's contents.

Default TABLE cell spacing

Menu location: Conversion options -> Output policies -> Tables

This policy specifies the value to be used in the CELLSPACING attribute of any created <TABLE> tags.

The CELLSPACING attribute specifies the amount of white space to be added between cells.
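
To show how the border, width, cell padding and cell spacing defaults end up on the tag, here is a hedged Python sketch. The function table_open_tag is hypothetical, and the exact attribute order and quoting used by AscToHTM may differ.

```python
def table_open_tag(border=None, width=None, cellpadding=None, cellspacing=None):
    """Build an opening <TABLE> tag from whichever defaults have been set."""
    settings = [
        ("BORDER", border),
        ("WIDTH", width),
        ("CELLPADDING", cellpadding),
        ("CELLSPACING", cellspacing),
    ]
    # Leave out any attribute that has no value.
    attributes = ['{}="{}"'.format(name, value)
                  for name, value in settings if value is not None]
    return "<TABLE" + "".join(" " + a for a in attributes) + ">"
```

Note that a border of 0 is still written out, since 0 means "no border" rather than "not set".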

Default TABLE alignment

Menu location: Conversion options -> Output policies -> Tables

By default the software will attempt to automatically calculate the alignment of a table within a document, and in most cases will simply left align the table, possibly with a margin where one is detected.

This policy can be used to overrule that process and set the alignment for all tables in the document (e.g. to centre all tables).

To exert more control over particular columns in particular tables you should consider using a Table definition file

Default TABLE cell alignment

Menu location: Conversion options -> Output policies -> Tables

By default the software will attempt to automatically calculate the alignment of data inside each cell of a table. This will look at the placement of the data, and the type of data (e.g. numerical data is right justified).

This policy can be used to overrule that process and force a particular alignment. When set it will apply to all cells in all detected tables.

To exert more control over particular columns in particular tables you should consider using a Table definition file

Convert TABLE X-refs to links

Menu location: Conversion options -> Output policies -> Tables

This policy determines whether or not numbers in a table should be regarded as references to numbered sections in the document. If they are, then the number will be converted to a hyperlink to the numbered section. If they are not, the number will be left unconverted.

This appears on-screen as "Convert cross-references inside a table"

Keep original table lines in output

(New in version 5.0)

Menu location: Conversion options -> Output policies -> Tables

The converter can look for horizontal lines in the input text of a table as signs of row separators. Normally when these are detected, the program will use them to decide how to allocate the input text into rows, but will then discard the lines themselves from the output.

Sometimes, however, such lines are useful, and on occasion they may be misinterpreted. Either way, this policy allows you to specify that any horizontal lines found in the input should be replicated in the output, and not discarded.

This appears on-screen as "Keep row separator lines in the output"

Default table html attributes

(New in version 5.0)

Menu location: Conversion options -> Output policies -> Tables

This option specifies any HTML attributes that should be added to each <TABLE> tag that is created. This can be used to customize the table's behaviour and appearance.

For example you can add the necessary JavaScript attributes to add the ability to sort the table.

Default table html cell attributes

(New in version 5.0)

Menu location: Conversion options -> Output policies -> Tables

This option specifies any HTML attributes that should be added to each cell in the tables that are created, that is, to the <TH> and <TD> tags.

This can be used to customize the table's behaviour and appearance. For example you can add the necessary JavaScript attributes to add the ability to sort the table.

Configuration file policies

Menu Location: Conversion options -> Configuration files

These policies identify the locations of external configuration files used by AscToHTM.

HTML fragments file
Link Dictionary file
Table Definitions file
Text command file

Fragments file

Menu location: Conversion options -> Configuration files -> HTML fragments file

The HTML fragments feature allows you to customise the HTML generated by the software by defining pieces of HTML that the software can copy at various points into the output HTML. These "fragments" can be placed in-line within the source file, but are usually best placed in a separate file, in which case this policy is used to identify the location of this "HTML fragments file".

Link Dictionary File

Menu location: Conversion options -> Configuration files -> Link Dictionary

The Link Dictionary feature allows you to define text that should be converted into hyperlinks during the conversion.

This policy specifies the name of a dictionary file to be used. Once a file has been selected, the "Edit" button on this dialog will take you to the Link Dictionary Edit Dialog.

Definitions file

Menu location: Conversion options -> Configuration files -> Table Definitions File

The Table Definition File (TDF) feature allows you to define certain types of table and how they should be processed during the conversion.

This policy identifies the external configuration file that holds all your Table definition details and TDF commands.

Text command file

Menu location: Conversion options -> Configuration files -> Text Commands File

The Text Commands feature allows you to define simple text manipulations that can be performed on the text on input into the conversion.

This policy identifies the external configuration file that holds all your Text Command definitions.

Other policies

In addition to the policies set via the "Conversion Options" menu, there are policies that can be set which correspond to options elsewhere in the user interface, and a few that can't be accessed via the user interface, but which can be edited manually into any policy file that is used.

Policies set via the Settings Menu
Policies set elsewhere in the user interface
Policies that need to be added manually

Settings policies

Menu location: Settings -> Diagnostics

These policies are equivalent to options accessible via the Settings menu.

Error reporting level
Suppress INFO messages
Suppress program ERROR messages
Suppress TAG ERROR messages
Suppress URL messages
Suppress WARNING messages

Display Messages

Menu location: Settings -> Diagnostics

Specifies that all display and informational messages should be suppressed. If selected, you will get no messages displayed at all, although these can still be directed to a .log file (e.g. by using the /LOG command qualifier)

This policy can't be set via the user interface, but can be edited into a policy file should you wish.

This policy is equivalent to the /SILENT command qualifier.

Error reporting level

Menu location: Settings -> Diagnostics

Specifies the level of error reporting you want during the conversion. The program can generate a variety of messages of varying severity to inform you of the decisions it's made. These messages can be useful in explaining why a conversion has gone wrong, but are less interesting at other times.

Whilst all of these messages are copied into any diagnostic .lis files created (see Generate diagnostics files) regardless of severity, you can use this policy to choose the level of reporting you want to see on your screen.

The value is nominally in the range 1-10 with a value of 1 showing few messages and a value of 10 showing almost all messages. The default value is 5.

Suppress INFO messages

Menu location: Settings -> Diagnostics

Suppresses all "INFO" messages from the display. These are messages deemed to be "informational".

The severity (reporting level) will match how important or useful the message is.

Together with the Error Reporting Level policy, similar suppression policies and the /SILENT command qualifier, you have a fair degree of control of the level of reporting made visible during conversion.

Suppress TAG ERROR messages

Menu location: Settings -> Diagnostics

Suppresses all "TAG ERROR" messages from the display. These are messages reporting failure to implement inline tags and directives (if any) placed in your source text.

However, you should really fix up all your tag errors if possible.

Suppress URL messages

Menu location: Settings -> Diagnostics

Suppresses all "URL" messages from the display. These are messages that report the detection of URLs in the document being converted.

Suppress WARNING messages

Menu location: Settings -> Diagnostics

Suppresses all "WARNING" messages from the display. These are messages usually reporting that the program has possibly done something you may not have wanted it to, or not done something you might have wanted it to. For example if lines that could contain emphasis or headings are rejected, a warning will be issued.

The severity (reporting level) will match the seriousness of the warning.

Suppress PROGRAM ERROR messages

Menu location: Settings -> Diagnostics

Suppresses all "PROGRAM ERROR" messages from the display. These are messages indicating that the program itself thinks it's done something wrong.

There's not much you can do about such errors, other than ignore them, or report them to the author so he can fix the software.

Policies set elsewhere in the user interface

Some policies are saved into the policy file, but are not accessed via the "Conversion Options" menu.

These include:-

Default TABLE delimiter character

Default TABLE delimiter character

(New in version 5.0)

This policy is used to save the value of the delimiter character selected on the main screen whenever the input file type has been set to "other-delimited table", indicating that the input file is to be treated as a single, character-delimited, data table.
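
Treating a file as a single character-delimited table can be sketched in Python as below. This is only an illustration: delimited_to_table is a hypothetical name, and the real conversion also applies the table policies described earlier.

```python
from html import escape


def delimited_to_table(text, delimiter):
    """Convert character-delimited text into a simple HTML table."""
    lines = ["<TABLE>"]
    for row in text.splitlines():
        if not row.strip():
            continue  # ignore blank lines
        cells = "".join("<TD>{}</TD>".format(escape(cell.strip()))
                        for cell in row.split(delimiter))
        lines.append("<TR>{}</TR>".format(cells))
    lines.append("</TABLE>")
    return "\n".join(lines)
```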

Policies that need to be added manually

Some policies cannot currently be set via the User Interface. Instead, these policies can only be added by opening a policy file in a text editor and adding the options by hand.

See Changing policies by editing a Policy file

Such policies include:

Tables

Default TABLE caption
Default TABLE header cols

Code samples

Use <CODE>..</CODE> markup

Contents lists

Contents style code

Headings

Expect Second Word Headings
First Section Number

Default TABLE caption

Specifies the caption to be applied to generated tables. However, since this will be applied to all generated tables, this is less useful than placing individual TABLE_CAPTION directives in your source text.

Default TABLE header cols

This tells AscToHTM how many columns in each table should be highlighted as "header" columns using <B> ... </B> markup inside the table cells.

Normally this is 0.

Use <CODE>..</CODE> markup

This policy specifies what sort of HTML markup should be used where a code fragment has been detected or marked up.

HTML provides a special <CODE> .. </CODE> markup that is suited to variables being quoted in text, but is not well suited for marking up several lines. In particular, you have to add non-breaking spaces and <BR> markup if you wish to correctly lay out several lines. This makes the resultant HTML harder to read and much larger.

For this reason, AscToHTM defaults to <PRE> .. </PRE> markup for code fragments. The only reason for reversing this might be if you needed the <CODE> tag for a particular style sheet to take effect.
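
The size difference between the two styles can be seen in a small Python sketch. The functions as_pre and as_code are hypothetical and only approximate the kind of markup involved; they are not taken from AscToHTM.

```python
from html import escape


def as_pre(lines):
    """Mark up code lines with <PRE>; the layout is kept as typed."""
    return "<PRE>\n" + "\n".join(escape(line) for line in lines) + "\n</PRE>"


def as_code(lines):
    """Mark up code lines with <CODE>; spaces and line breaks must be encoded."""
    encoded = [escape(line).replace(" ", "&nbsp;") for line in lines]
    return "<CODE>" + "<BR>\n".join(encoded) + "</CODE>"
```

For an indented two-line fragment, the <CODE> form is more than twice as long as the <PRE> form, because every space becomes a non-breaking space entity.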

Contents style code

Specifies a "style" to be applied to the contents list. There aren't too many options at present.

Possible values are

0 AscToHTM "Classic". Contents line is bolded

1 As above, but not bolded.

Expect Second Word Headings

This is a reserved policy that may be removed at some time in the future. When enabled, this policy indicates that the heading may be a numbered heading, but that the number is in the second (or perhaps third) word in the heading.

For example in the case of the following heading

        Section 2.3.4 - This is the next section

the section number "2.3.4" isn't at the start of the line, but instead follows the word "Section". This can still be counted as a numbered heading, with the result that it can be checked to see that it is in sequence, and so on.
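
One way to picture this kind of detection is the following Python sketch. It is an assumption about the general approach, not AscToHTM's actual logic; the name find_section_number and the three-word limit are chosen here for illustration.

```python
import re

# A section number such as "2", "2.3" or "2.3.4".
SECTION_NUMBER = re.compile(r"\d+(\.\d+)*")


def find_section_number(heading, max_words=3):
    """Return the section number found in the first few words, or None."""
    for word in heading.split()[:max_words]:
        candidate = word.rstrip(".:")  # allow "2.3.4." or "2.3.4:"
        if SECTION_NUMBER.fullmatch(candidate):
            return candidate
    return None
```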

First Section Number

Indicates the chapter number of the first numbered section in a file containing numbered sections (see Expect numbered headings).

Normally this starts at 0 or 1, but if it starts higher, then the program will reject headers as being out of sequence, and fail to detect the presence or absence of contents lists correctly.
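
Why the starting number matters can be shown with a deliberately simplified Python sketch. The sequence rule used here (each chapter number must equal the next expected value) is an assumption for illustration, and accept_chapters is a hypothetical name; the real checks are more involved.

```python
def accept_chapters(chapter_numbers, first_section_number=1):
    """Return the chapter numbers that continue an unbroken sequence."""
    accepted = []
    expected = first_section_number
    for number in chapter_numbers:
        if number == expected:
            accepted.append(number)
            expected += 1
        # Anything else is treated as out of sequence and rejected.
    return accepted
```

A document whose chapters run 5, 6, 7 has every heading rejected under the default start value, but all are accepted once the start is set to 5.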

Only for the advanced/foolhardy user.

Normal text	Default font
Headings	Heading Font
Text in tables	Table font
Table of contents	TOC Font
Fixed-pitch text	Fixed font

myfile_frame.html	FRAMESET file
myfile_contents_frame.html	(optional) contents frame
myfile_header_frame.html	(optional) header frame
myfile_footer_frame.html	(optional) footer frame
