Documentation for the AscToHTM conversion utility : Change History

Documentation for the AscToHTM Text to HTML converter

Change History

Contents of this section

Version 5.0 (November 2004)

New functions in version 5.0

Ability to fully control table generation
Ability to "tag" your own tables for greater accuracy
Input text manipulation and labelling using "Text commands"
Support for non-ASCII character types and character encodings
Support for Definition Blocks
Support for comma-delimited and tab-delimited tables

New policies in version 5.0

New added HTML policies
New general analysis policies
New configuration file policies
New contents list policies
New file, page, paragraph and line structure policies
New file splitting policies
New font policies
New heading policies
New hyperlink policies
New styling policies
New table analysis policies
New table generation policies
New 'what to look for' policies
Other new policies

New programs in version 5.0

API version now available
New utility A2HDETAG

Other changes in version 5.0

New Pre-processor tags
Changes to the Windows version
Changes to the command line version
Changes to document analysis
Changes to documentation
Other new options

Version 4.1 (August 2001)

New functions in version 4.1
Other Changes in version 4.1

Version 4.0 (May 2001)

New functions in version 4.0
Other changes in version 4.0

Version 3.3 (June 2000)

New functions in version 3.3
Other changes in version 3.3

Version 3.2 (October 1999)

New functions in version 3.2
Other changes in version 3.2

Version 3.0 (August 1998)

New functions in version 3.0
Other changes in version 3.0

Version 2.3 (April 1998)

New functions in version 2.3
Other changes in version 2.3

Version 2.2 (Feb 1998)

New functions in version 2.2
Other changes in version 2.2

Version 2.1 (never officially released)

New functions in version 2.1
Other changes in version 2.1

Version 2.0 (October 1997)

New functions in version 2.0
Other changes in version 2.0

Version 1.1 (August 1997)

New functions in version 1.1
Other changes in version 1.1

Version 1.05 (July 1997)

New functions in version 1.05
Other changes in version 1.05

Version 1.04 (July 1997)

New functions in version 1.04
Other changes in version 1.04

Version 1.01 (April 1997)

Version 5.0 (November 2004)

Version 5.0 is the first major update to AscToHTM in 3 years. As such it contains a large number of enhancements and changes from the previous version 4.1.

New functions in version 5.0. Several major new features are added in this release to support more advanced conversions.
New policies in version 5.0. A large number of new policy options have been added in this version. Others that were not previously accessible via the user interface now are, and are fully documented for the first time.
API version now available. An API is now available under separate license to developers wishing to add conversion abilities to their own software.
Other changes in version 5.0

New functions in version 5.0

Several major new features are added in version 5.0.

Ability to fully control table generation. This includes the ability to define alarm states for individual cells in a table, and the ability to represent the data in a column as a series of bar graphs.
Ability to tag your own tables for greater accuracy. Ideal when the text files are generated by software that knows the table structure, as this bypasses all of AscToHTM's analysis (world-beating though that is!).
Input text manipulation and labelling using "Text commands". Perform text manipulation on the input text to label lines as headings or achieve other effects.
Support for non-ASCII character types and character encodings
Support for Definition Blocks. Define standard blocks of text to place into your document at key points (e.g. page footers, disclaimers etc).
Support for comma-delimited and tab-delimited tables

Ability to fully control table generation

To aid in processing tables, the program now allows you to identify various table structures by specifying various match conditions. Each time the software encounters a candidate table, it tests this against the match conditions to see if the "table" is of a known type.

For each table you can specify its structure, and various formatting rules to be used in its conversion. These structure and formatting definitions can be shared between multiple table types for your convenience.

All the table types, structures and formatting rules should be placed in an external text file, known as a Table Definition File (or TDF for short). A new policy allows you to identify which Table Definition File is to be used, and you can select this from the new Config File Location menu.

When a table matches a known table definition it is possible to:-

specify the layout of the table to ensure it is correctly converted into columns
customise the <TABLE> tag, and each cell in specified columns to have the CSS ID and HTML attributes of your choice added. This facilitates the fine tuning of appearance and the use of JavaScript on an almost cell-by-cell basis.
define "alarm" conditions for cells in a specified column, and have cells that are in alarm change colour or be placed in a different CSS class, so that they can be marked out through your stylesheet definitions. See Defining table alarms
define a "barchart" for cells in a given column, so that each cell's value is displayed as a bar sized to represent the value of the cell. See Defining table graphs and barcharts
selectively extract data from the table for use elsewhere in the HTML. See Locating data in a table

For full details see Using Table Definition Files (TDF).

Ability to "tag" your own tables for greater accuracy

The program now supports Tagged Table commands. These commands allow you to completely markup a table, specifying the column details, the row details and the contents of each table cell.

This approach can be used by those who want complete control over how their tables are constructed, or who are generating text files from a source which knows the table layout and can explicitly state it.

By using the tagged approach, you avoid the prospect of the program making mistakes when analysing the layout of the table.

As an example of using tagged table commands, the following sequence in the source file

        $_$_BEGIN_USER_TABLE C,1 in
        $_$_COLUMN_DETAILS 1,,,L, 2 in
        $_$_COLUMN_DETAILS 2,,,C, 1 ins
        $_$_TABLE_BORDER 1

        $_$_NEW_ROW HEAD
        $_$_NEW_CELL
        Substance (units)
        $_$_NEW_CELL
        Year
        Sampled

        $_$_NEW_ROW DATA
        $_$_NEW_CELL
        Alpha emitters (pCi/L)
        $_$_NEW_CELL
        1999

        $_$_NEW_ROW DATA
        $_$_NEW_CELL
        Asbestos (MFL)
        $_$_NEW_CELL
        1993
        $_$_END_TABLE

becomes

Substance (units)	Year Sampled
Alpha emitters (pCi/L)	1999
Asbestos (MFL)	1993
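
Software that already knows its table layout can emit these commands mechanically. The following is a minimal Python sketch of that idea and is not part of AscToHTM: the function name tagged_table and its parameters are hypothetical, and the preamble lines are copied verbatim from the example above rather than generated, because their argument syntax is not described here.

```python
def tagged_table(preamble, header, rows):
    """Return tagged-table source lines for one header row and the data rows.

    preamble: command lines placed before the first row (copied verbatim).
    header:   list of cell texts for the HEAD row.
    rows:     list of lists of cell texts for the DATA rows.
    """
    lines = list(preamble)
    for kind, cells in [("HEAD", header)] + [("DATA", row) for row in rows]:
        lines.append("")  # blank line before each row, as in the example
        lines.append(f"$_$_NEW_ROW {kind}")
        for cell in cells:
            lines.append("$_$_NEW_CELL")
            lines.append(str(cell))
    lines.append("$_$_END_TABLE")
    return lines


# Preamble taken unchanged from the example in the text.
preamble = [
    "$_$_BEGIN_USER_TABLE C,1 in",
    "$_$_COLUMN_DETAILS 1,,,L, 2 in",
    "$_$_COLUMN_DETAILS 2,,,C, 1 ins",
    "$_$_TABLE_BORDER 1",
]
table = tagged_table(
    preamble,
    ["Substance (units)", "Year Sampled"],
    [["Alpha emitters (pCi/L)", 1999], ["Asbestos (MFL)", 1993]],
)
print("\n".join(table))
```

Writing the cell text on its own line after each $_$_NEW_CELL mirrors the example; whether other layouts are accepted is not something this sketch assumes.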

Input text manipulation and labelling using "Text commands"

The program now allows you to apply "text commands" to the input text, before it is converted. There are several commands possible, which allow you to identify lines in the input text that should be ignored, and text in the input file that should be removed or replaced.

You can also use commands to tell the software how to interpret certain types of line. For example to say which lines are headings, and which should be regarded as bullet points etc. The Text Commands to be used should be placed in an external Text Command File. A new policy allows you to identify which Text Command File is to be used, and you can select this from the new Config File Location menu.

For full details see Using Text Command Files

Support for non-ASCII character types and character encodings

Non-latin and Unicode character sets

Some support has been added for non-latin character sets. The character set names are based on those used in HTML charsets.

Support has been added for auto-detecting the character set used, but this is far from foolproof. If you are using non-latin character sets you may need to set the character set manually.

It is not possible at present to support multiple character sets in one document (unless you are using Unicode)

To support this feature the following policies have been added

The character encoding policy allows the character encoding of a document to be set. The software has limited ability to detect Japanese ("x-sjis") and Cyrillic ("koi-8") text, but in some cases the encoding will need to be set manually.

The auto-detect of character sets can be switched off by using the Look for character encodings policy. You might want to do this if the software wrongly suspects your document is a non-latin character set.

See Working with Unicode.

Other special characters

Added support for parsing files with some MIME-encoded quotable strings in them. The new policy Input file contains MIME encoding can be found under Analysis->File Structure. At present there is some (very limited) auto-detect for this feature.

Added support for documents with change bars. By default change bars are stripped out and the changed text is coloured red; this behaviour may be changed in later versions. Added the new policy Input file has change bars which can be found under Analysis->File Structure.

Added support for converting DOS characters. The new policy Input file contains DOS characters can be found under Analysis->File Structure.

There is a limited auto-detect of DOS characters when diagrams are present.

Added Input file contains PCL codes policy. Again there is a limited ability to detect these codes. A few of the PCL codes are interpreted. Most are just discarded.

Improved handling of VT escape characters. These are either removed from the output or converted to "line" characters

Support for Definition Blocks

Definition blocks allow you to define blocks of text that you may then insert at any point in the text (e.g. to give an "end of page" effect). You can also "define variables" whose value is then inserted wherever a VARIABLE tag is used.

This feature, though supported by the core analysis engine, is expected to be used more by the users of the AscToHTM converter.

The pre-processor commands involved are:

Support for comma-delimited and tab-delimited tables

Pre-processor commands have been added to allow you to mark up a section of comma-delimited or tab-delimited data you want turning into a table.

The new pre-processor directives are the COMMA_DELIMITED_TABLE command and the DELIMITED_TABLE command.

New /COMMA and /TABBED command line qualifiers that allow comma-delimited and tab-delimited files to be converted into tables.
New /TABLE command line qualifier that allows the input file to be treated as a single plain text table.

In addition to this, the software now has the ability to automatically detect tab-delimited data tables.
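
The documentation does not describe how tab-delimited data is detected. Purely as an illustration of the kind of test involved, the Python sketch below treats a block as tab-delimited when every non-blank line contains the same non-zero number of tabs. The function name, the min_rows parameter and the rule itself are assumptions, not AscToHTM's actual analysis.

```python
def looks_tab_delimited(lines, min_rows=2):
    """Guess whether a block of lines is tab-delimited data.

    Assumed rule: at least min_rows non-blank lines, all with the same
    non-zero tab count. Real data with empty trailing cells may not fit.
    """
    rows = [line for line in lines if line.strip()]
    if len(rows) < min_rows:
        return False
    counts = {line.count("\t") for line in rows}
    # One distinct tab count, and that count is greater than zero.
    return len(counts) == 1 and counts.pop() > 0


sample = ["Substance\tYear", "Asbestos\t1993", "", "Radon\t1999"]
print(looks_tab_delimited(sample))
```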

New policies in version 5.0

A large number of new policy options have been added in version 5.0, and several that weren't previously accessible via the user interface can now be accessed.

New added HTML policies
New configuration file policies
New contents list policies
New file, page, paragraph and line structure policies
New file splitting policies
New font policies
New heading policies
New hyperlink policies
New table analysis policies
New table generation policies
Other new policies

New added HTML policies

The "added HTML" section acquires a number of new policies that allow you to create appropriate META tags in the HTML <HEAD> sections

Added Author URL policy to allow the author URL for this document to be defined.
Added Bookmark URL policy to allow the bookmark URL for this document to be defined.
Added Copyright URL policy to allow the URL of any copyright notice to be defined.
Added Shortcut Icon URL policy to allow the URL of any shortcut icon to be defined.

New general analysis policies

Added Hanging paragraph position(s) policy. This allows the indentation positions used for follow-on paragraphs in definitions to be specified.

New configuration file policies

In version 5.0 there is a new menu option under Conversion options pointing to Configuration Files, that is, files loaded in addition to the policy file to control different aspects of the conversion. The choices made on that menu are also saved as options in your policy file.

The Text command file allows you to define text manipulations that should be done to the source text as it is read during the conversion. This allows you to ignore certain lines, mark them out as headings, or perform substitutions on the input text. See Using Text Command Files.
The Definitions file allows you to define table definitions. Tables that match the supplied definitions can have their layout and format tightly controlled. See Using Table Definition Files (TDF)
The Fragments File allows you to define snippets of HTML that will be copied into the output at key points. See Using HTML fragments
The Link Dictionary File allows you to define text that should be converted into hyperlinks. See Using link dictionary files

New contents list policies

Added policy Add mail headers to contents list to control whether or not email headers which are turned into headings should be included in the contents list

New file, page, paragraph and line structure policies

File structure

Added Lines to ignore at start of file and Lines to ignore at end of file policies to allow lines at the start and end of the source file to be discarded. This can be useful if your source text comes from a third-party source that adds extra, unwanted lines.

Added auto-detect of double spaced files (files where every second line is blank). This will set the Input file is double spaced policy whenever double-spaced text is detected (unless the policy has already been set).
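
As a rough illustration of what "every second line is blank" means in practice, here is a small Python sketch. It is not the program's own detection logic; the function names and the minimum of four lines are assumptions made for the example.

```python
def is_double_spaced(lines):
    """True if lines alternate strictly between text and blank lines."""
    if len(lines) < 4:
        return False  # assumed minimum: too short to judge reliably
    text, gaps = lines[0::2], lines[1::2]
    return all(l.strip() for l in text) and not any(l.strip() for l in gaps)


def undouble(lines):
    """Drop the alternating blank lines when the input is double spaced."""
    return lines[0::2] if is_double_spaced(lines) else lines


print(undouble(["First line", "", "Second line", ""]))
```

A real document would also contain genuine paragraph breaks, so a practical detector would need to tolerate some exceptions to the strict alternation used here.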

Added Input file contains UNICODE characters policy. When enabled the program will create a UTF-8 output file.

Page structure

Added PAGE command. This marks a page boundary. In the HTML this creates an <HR> page separator.

Page markers

Added Input file has page markers and Page marker size (in lines) policies. These allow you to identify that the file has page markers containing form feeds and that the first so many lines after the form feed should be discarded.

Paragraph structure

Added Preserve new paragraph offset policy. In documents where a first line offset is detected at the start of each paragraph you can elect to have this preserved in the output.

Line structure

Added options to allow more control over how the original document's file structure should be preserved

Added Treat each line as a paragraph policy. If this option is selected, every line in the source file is treated as a paragraph. This may be suitable if the file has been authored using an editor that wraps the lines (i.e. doesn't put in hard breaks) and which doesn't add blank lines between paragraphs.

Added Preserve line structure policy. If this option is selected a line break is added to every line, thereby preserving the line structure of the original.
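
The difference between these two line-structure options can be pictured with the following Python sketch. The exact markup AscToHTM emits is not shown in this document, so the <P> and <BR> output below is an assumption, as are the function names.

```python
from html import escape


def each_line_as_paragraph(lines):
    """Wrap every non-blank source line in its own paragraph."""
    return [f"<P>{escape(line)}</P>" for line in lines if line.strip()]


def preserve_line_structure(lines):
    """Keep the original line breaks by ending every line with a break."""
    return [f"{escape(line)}<BR>" for line in lines]


source = ["Fish & chips", "Served daily"]
print(each_line_as_paragraph(source))
print(preserve_line_structure(source))
```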

File generation

Added Break up long HTML lines policy. If this option is selected, the output HTML will be broken into smaller lines to make it more readable.

New file splitting policies

Added policies to allow greater control over splitting large files into a set of smaller linked HTML pages

Added Number of words to include in filename policy to allow some control over file naming when deriving filenames from underlined headings.
Added Use numbered sequence in file names policy to allow generated files to be numbered, rather than named.
Added Split at page markers policy to allow files to be split at page markers.
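
To show how the two naming options differ, here is a hedged Python sketch. The naming rules used (lower case, underscore separators, a three-digit sequence number, the .html extension) are illustrative assumptions and may not match the names AscToHTM actually generates.

```python
import re


def filename_from_heading(heading, max_words=3, ext=".html"):
    """Build a file name from the first max_words words of a heading."""
    words = re.findall(r"[A-Za-z0-9]+", heading.lower())
    return "_".join(words[:max_words]) + ext


def numbered_filename(stem, index, ext=".html"):
    """Build a file name from a stem and a sequence number."""
    return f"{stem}_{index:03d}{ext}"


print(filename_from_heading("New functions in version 5.0", max_words=2))
print(numbered_filename("doc", 7))
```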

New font policies

Added policies to allow different fonts to be applied to different types of text as follows

Type of text	Policy
Normal text	Default font
Headings	Heading Font
Text in tables	Table font
Table of contents	Table of contents Font
Fixed-pitch text	Fixed font

New heading policies

There are two new heading types that can be supported :-

Added support for embedded headings with the Expect embedded headings policy (see Embedded heading detection). These are "headings" that are embedded as the first sentence in a paragraph.
Added support for headings that start with particular words or phrases via the Heading key phrases policy (see Key phrase headings for more on this).

Also added :-

Added the policy Check indentation for consistency so that it could be disabled in documents where headings were centred (and thus all at different indentations)
Added support for headings that span up to 3 lines. Previously the limit was 2.

New hyperlink policies

Added Max length of heading links to limit the display length of any hyperlink derived from an observed heading.
Added the policy Create Gopher links to toggle the conversion of gopher links into hyperlinks.
Added the policy Create Telnet links to toggle the conversion of telnet links into hyperlinks.
Added Check domain name syntax policy to toggle the checking of domain name syntax in detected URLs. You can now switch this off to allow intranet links to be accepted.
Changed hyperlink detection to only allow explicit FTP URLs and email addresses that don't start with numbers. These behaviours can be reversed using the new policies Only allow explicit FTP links and Allow email beginning with numbers, both of which are on the Output->Hyperlinks tab.
Added support for new top level domains (.info, .biz etc)
Added support for the "snews://" secure news server protocol
URLs of the form http://username@domain_name/... are now supported
Added support for "obfuscated" URLs such as

http://3640005069/
http://7934972365/
http://0330.0366.0021.0315/
http://%6c%6f%63%6b%65%72%67%6e%6f%6d%65%2e%63%6f%6d/

Although the display text is left unchanged, the hyperlink will point to a non-obfuscated URL (either the domain name, or an IP address). This is because obfuscated URLs such as these are often used by spammers, and the author has no intention of allowing his software to aid spammers in their goals.

If someone cares to give me a valid reason for using such URLs I may reconsider this behaviour.
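
The four sample URLs above can be decoded by hand, and the Python sketch below shows one way to do so. It is an illustration only, not AscToHTM's code: the function name is hypothetical, and it assumes that a single large decimal host wraps modulo 2^32 and that dotted parts with a leading zero are octal.

```python
from urllib.parse import unquote, urlsplit


def deobfuscate_host(url):
    """Return the plain host (domain name or dotted IP) hidden in a URL."""
    host = unquote(urlsplit(url).hostname or "")
    parts = host.split(".")
    if len(parts) == 1 and host.isdigit():
        # Single decimal number: treat it as a 32-bit IPv4 address.
        value = int(host) % 2**32
        return ".".join(str((value >> shift) & 0xFF) for shift in (24, 16, 8, 0))
    if len(parts) == 4 and all(p.isdigit() for p in parts):
        # Dotted form: a leading zero marks an octal component.
        octets = [int(p, 8) if len(p) > 1 and p.startswith("0") else int(p)
                  for p in parts]
        return ".".join(str(o) for o in octets)
    return host  # percent-decoding alone was enough


for url in ("http://3640005069/",
            "http://7934972365/",
            "http://0330.0366.0021.0315/",
            "http://%6c%6f%63%6b%65%72%67%6e%6f%6d%65%2e%63%6f%6d/"):
    print(url, "->", deobfuscate_host(url))
```

Under those assumptions the first three samples all resolve to the same address, 216.246.17.205, and the fourth decodes to lockergnome.com.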

New styling policies

Added Allow automatic centring and Automatic centring tolerance policies. These allow you to look for text that is centred and to specify a tolerance used in this detection.
Added Use <EM> and <STRONG> markup policy to allow a choice of markup for phrases highlighted in asterisks or underscores.
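
The role of a centring tolerance can be illustrated with a short Python sketch. How AscToHTM measures centring is not documented here, so the rule below (left and right margins within tolerance characters of each other on an assumed 80-column page) and the function name are assumptions.

```python
def is_centred(line, page_width=80, tolerance=2):
    """Guess whether a line of text is centred on the page."""
    text = line.rstrip()
    if not text.strip():
        return False  # blank lines are never centred
    left = len(text) - len(text.lstrip())   # leading spaces
    right = page_width - len(text)          # space left after the text
    return left > 0 and abs(left - right) <= tolerance


print(is_centred(" " * 36 + "Heading"))
print(is_centred(" " * 10 + "Heading"))
```

A larger tolerance accepts lines that are only roughly centred, at the cost of more false matches on ordinary indented text.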

New table analysis policies

Added Default TABLE layout policy (also the pre-processor TABLE_LAYOUT command). This allows you to specify the number of columns in each table, and the attributes of each column, specifically the character position that marks the end of each column. Rather than use this policy, it is probably better to use the related pre-processor TABLE_LAYOUT command in the source text on a per-table basis.
Added Ignore table header during analysis policy (also the pre-processor TABLE_IGNORE_HEADER command). This specifies that table headers should be ignored when columns are being auto-detected. Some tables have complex headers that confuse the analysis, and this policy can be used to have them ignored.
Added Table extending factor policy. This controls the degree to which pre-formatted lines should be expanded into adjacent text.
Added Column merging factor policy, which controls the degree to which columns that don't appear to be very clear should be "merged" together.
Added Could be blank line separated policy. This indicates that tables could be using blank lines to separate rows of data. This affects the analysis and detection of the table's extent.
Added Column boundaries have zero width policy for tables that have no separator character between columns. Can be useful for some tables generated by software
Added Look for diagrams policy. Can be used to stop complex tables being wrongly interpreted as "diagrams"
Added Default TABLE cell alignment and Default TABLE alignment policies to allow you to set the default alignments of data within table cells and of tables on the page.
Added Column boundaries have zero width policy to allow for tables generated with a fixed layout and no gap between columns.
Added Default TABLE delimiter character policy to allow the delimiter used for delimited data tables to be specified (when not comma or tab)
Added Expect multiple COLSPAN values policy to allow you to identify that tables may have many cells which span columns, and to therefore relax the criteria when searching for columns in a table.
Added Max length of candidate table lines policy to allow you to limit the length of input line that could be considered for inclusion in a candidate table.

New table generation policies

Added Keep original table lines in output policy to allow lines that separated rows in the input data to be maintained in the output.
Added Default table html attributes policy to allow the definition of HTML attributes to be added to the <TABLE> tags that are generated
Added Default table html cell attributes policy to allow the definition of HTML attributes to be added to the <TH> and <TD> tags that are generated

New 'what to look for' policies

Added Look for underlined text policy. This allows text detected as underlined (other than headings) to be underlined in the output.
Added Look for hanging paragraphs and Minimum ruler length policies. These allow horizontal lines in the source to be turned into <HR> tags.

Other new policies

The following policies can't be accessed via the User Interface, but are listed here for completeness.

New programs in version 5.0

As well as the Windows and console versions, AscToHTM is also available (under separate licence) in API form so that developers can harness the power of AscToHTM's conversion abilities for use in their own software.

Also, the new utility A2HDETAG allows documents marked up using the AscToHTM pre-processor to be converted into plain text files (e.g. for sending out as newsletters).

API version now available

As with all JafSoft converters, AscToHTM is available under separate license as an Application Programming Interface (API). This API allows software developers to harness the powerful abilities of AscToHTM from within their own software products.

The API is written in C++, and is supplied as either a library or a DLL under Windows. As such it can easily be invoked from C, C++ and Visual Basic software and has also been successfully invoked from inside Java and C# programs.

New utility A2HDETAG

For users who register, there is a new, separate command line utility called A2HDETAG available so they can "de-tag" their source files of all AscToHTM pre-processor tags, leaving a plain text fit for publishing, e.g. on Usenet.

In conjunction with this, new BEGIN_ASCII ... END_ASCII pre-processor tags have been added. These identify text that will be copied to the output of A2HDETAG. It is ignored in all other conversions, and is intended to allow alternative text to be placed in text and HTML versions of a document.

Other changes in version 5.0

Other changes include :-

New Pre-processor tags
Changes to the Windows version
Changes to the command line version
Changes to document analysis
Changes to documentation
Other new options

New Pre-processor tags

Added several new pre-processor in-line tags :-

FILENAME	outputs name of file being converted
FRACTION	outputs a fraction
VERSION	outputs AscToHTM program name and version number
IGNORE	multi-line text to be ignored
IGNORE_THIS	in-line text to be ignored

Changes to the Windows version

The main screen now allows access to Policy file selection. Previously this was only available on the menu structure. The Menu structure has been left unchanged, meaning you now have two ways of choosing your policy files.

The main screen now allows you to search sub folders when using wildcards.

The main screen also allows you to specify the File conversion type. You can choose to treat the input file as a number of different table types (e.g. tab-delimited data).

You no longer get prompted to "save policy" just because you pressed OK on one of the policy sheets. Now this only happens when something has been changed.

The main menu now has a "check for updates" option. If you select this you'll be taken to the JafSoft website where you'll be told if any newer versions of the software have been released.

Program now remembers positions of windows from one invocation to the next.

The user interface is now available in Italian, French and Swedish.

Changes to the command line version

The command line now allows multiple filespecs, separated by spaces. The policy file is now recognised by its .pol extension, rather than by being the second argument.
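
The effect of this rule on argument handling can be sketched in Python as follows. This is an illustration rather than the program's parser: the function name is hypothetical, and treating any argument that starts with "/" as a qualifier is an assumption.

```python
def split_arguments(args):
    """Sort command line arguments into filespecs, policy files and qualifiers."""
    filespecs, policy_files, qualifiers = [], [], []
    for arg in args:
        if arg.startswith("/"):
            qualifiers.append(arg)        # e.g. /TABLE (assumed convention)
        elif arg.lower().endswith(".pol"):
            policy_files.append(arg)      # identified by extension, not position
        else:
            filespecs.append(arg)
    return filespecs, policy_files, qualifiers


print(split_arguments(["intro.txt", "chap*.txt", "house_style.pol", "/TABLE"]))
```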

Changes to document analysis

More changes on bullet characters, in particular to disallow 'O' (upper case) from becoming a bullet character through analysis. This really doesn't work in Portuguese documents :-) 'o' (lower case) may still be detected. If upper case 'O' is wanted this can still be manually switched on.

Horizontal lines are now implemented as line rules whose length attempts to approximate the original (e.g. 50% or whatever). Previously lines would become full width.
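
A rule whose width approximates the original line could be computed along the following lines. This Python sketch assumes an 80-column page and an <HR WIDTH="..."> attribute; neither the page width nor the exact tag AscToHTM writes is stated in this document, and the function names are hypothetical.

```python
def rule_width_percent(line, page_width=80):
    """Express the length of a drawn line as a percentage of the page width."""
    length = len(line.strip())
    percent = round(100 * length / page_width)
    return max(1, min(100, percent))  # clamp to a sensible range


def rule_tag(line, page_width=80):
    """Build a horizontal rule tag sized to match the source line."""
    return f'<HR WIDTH="{rule_width_percent(line, page_width)}%">'


print(rule_tag("-" * 40))
```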

Bookmark names from filename are now lower case (to reduce possible mismatches)

Shareware version now expires after 30 days + 5 uses. This will allow people to use the software on 5 different days after the first 30 days, giving people more time to evaluate the software at their leisure.

Now strip out leading and trailing "---" from heading text to make them more presentable in HTML

Changed emphasis handling to allow hyphenated parts to be emphasised independently, e.g. either the "pre" or the "formatted" in pre-formatted.

Fine-tuned the detection of whether or not a file has an in-situ contents list

The LINKPOINT pre-processor tag can now be used as a directive as well as an in-line tag. (see the Tag manual for details).

Increased maximum width allowed for input lines in tables to 5.0 (after encountering a sample at 165). Lines longer than this are still disregarded as candidate table lines.

Improved analysis for tables using bar ('|') column separators

Improved detection of ASCII art diagrams.

Improved handling of heavily indented blocks of text. Previously these were (poorly) rendered as tables. Now the tables more accurately preserve the large indentation (see Text block detection).

The software will now automatically detect where a table is in fact tab-delimited data. Where detected, it will then use that tab structure to calculate columns.

Changes to documentation

This document has been completely re-written. It is converted from a single text file into the HTML pages, an RTF file and the Windows Help file using the AscToHTM and AscToRTF programs. You can view the source file for this document as file "AscToHTM.txt".

The Tag manual describes the tagging systems available to JafSoft conversion utilities. Note that not all of the tags described there are relevant (or supported) in HTML generation. However many are common between the converters, should you wish to convert the same text file into other formats

A "Table manual" is under production to explain how to get the most from tables in your conversions. This is expected to appear some time after AscToHTM 5.0 is released.

Other new options

Added the "Suppress URL messages" option to the Diagnostic settings. When enabled all URLs, email addresses etc will be listed in the log file. Since this file can be saved to disk, this is one way of identifying all the candidate hyperlinks from your text file.
The new ALLOW and DISALLOW tags allow you to enable or disable the search for headings and lists within sections of the document. This helps with eliminating faulty analysis that confuses numbered lists with headings, or treats lines of text ALL IN CAPITALS as headings.

Version 4.1 (August 2001)

Version 4.1 was a major update from the previous release 4.0.

New functions in version 4.1

New /TABLE command line qualifier that allows the input file to be treated as a single plain text table

Added support for HEAD_SCRIPT HTML fragment. This allows HTML to be defined that can be copied into the <HEAD> of a document. This can include <META> tags or <SCRIPT>...</SCRIPT> sections.

Added Swedish interface. Many thanks to Dan Sverraby.

Added new policy Only allow pages to be viewed in frames

New utility A2HDETAG is available to registered users so they can "de-tag" their source files to remove all AscToHTM pre-processor tags, leaving a plain text fit for publishing, e.g. on Usenet.

Added BEGIN_ASCII ... END_ASCII pre-processor tags. These identify text that will be copied to the output of A2HDETAG. It is ignored in all other conversions, and is intended to allow alternative text to be placed in text and HTML versions of a document.

Added character encoding policy to allow the character encoding of a document to be set. The software has limited ability to detect Japanese ("x-sjis") and Cyrillic ("koi-8") text, but in some cases this will need to be set.

The auto-detect of character sets can be switched off by using the Look for character encodings policy

Added policies to allow different fonts to be applied to different types of text as follows

Type of text	Policy
Normal text	Default font
Headings	Heading Font
Text in tables	Table font
Table of contents	TOC Font
Fixed-pitch text	Fixed font

The "Default Font" policy existed previously; the other four policies are new in this version.

Added PAGE directive. This marks a page boundary. In HTML this simply results in an <HR> tag, since HTML doesn't really support pages. This may be expanded in future to allow page numbers and the like to be displayed.

Other Changes in version 4.1

Windows version

Loading a policy file that sets the "Place document in frames" policy will now toggle the Conversion type.

You no longer get prompted to "save policy" just because you pressed OK on one of the policy sheets. Now this only happens when something has been changed.

The main menu now has a "check for updates" option. If you select this you'll be taken to the JafSoft website where you'll be told if any newer versions of the software have been released.

Documentation

The list of bug fixes is removed from this document and is now to be found on-line at http://www.jafsoft.com/doco/asctohtm_bug_history.html

All versions

Added support for HTML fragment files to $_$_INCLUDE other HTML fragment files. This allows common fragments to be shared.

Fine-tuned the detection of whether or not a file has an in-situ contents list

When Frames generation is selected the default "Split level" is set to 1 instead of 2. This means you'll get fewer files generated and - depending on the type of headings you have - no splitting may occur unless you manually increase the split level.

The LINKPOINT pre-processor tag can now be used as a directive as well as an in-line tag. (see the Tag manual for details).

Added a "Range" attribute to the CONTENTS_LIST tag. This allows mini-contents lists to be generated which contain only entries for a part of the document, rather than the whole document, e.g. for just a single chapter. This should help those who want to split large files into pages and to have a mini-contents list for each section.

Improved handling of VT escape characters. These are either removed from the output or converted to "line" characters

Added auto-detect of double spaced files (files where every second line is blank). This will set the Input file is double spaced policy whenever double-spaced text is detected (unless the policy has already been set).

Version 4.0 (May 2001)

Version 4.0 represents a major update over the previous version 3.3.

New functions in version 4.0

API version

For those wishing to call AscToHTM programmatically, an API has been developed. This is sold under separate license. Contact info<at>jafsoft.com (replace "<at>" by "@") if you're interested.

Linux version

A Linux command line version will soon be available. Beta versions have been tested, and I hope to do a Linux command line release just after version 4 is released.

Windows version

You can now choose from the main screen whether you want your HTML output as one or more HTML file(s), sent to the Windows Clipboard (see Output to the Windows clipboard), or turned into a set of HTML frames (see Frames).

Program now remembers positions of windows from one invocation to the next.

The user interface is now available in Italian.

All versions

Version 4 introduces frames support (see Frames). This introduces a large number of supporting policies :-

Place document in frames
Output frame name

Header Frame depth
Footer Frame depth
Contents Frame width

Use main header in header frame
Use main footer in footer frame

Add contents frame if possible
Add Frame border

Open frame links in new window
New frame link window name

Add NOFRAMES links
NOFRAMES link URL

Number of levels in contents frame

First frame page number

Header frame background colour
Header frame text colour
Contents frame background colour
Contents frame text colour
Footer frame background colour
Footer frame text colour

Added HTML fragments feature, with HTML fragments file policy and DEFINE_HTML_FRAGMENT, RESET_HTML_FRAGMENT pre-processor commands. This allows you to define HTML fragments that can be used to replace the standard HTML generated by the program. This allows you to customize headers, footers, horizontal rules, contents lists, navigation bars and more.

Added support for URL parsing, including :-

new top level domains (.info, .biz etc) are supported

the "snews://" secure news server protocol type is now supported

URLs of the form http://username@domain_name/... are now supported

Added Check domain name syntax policy

Added Create Telnet links policy

Added support for "obfuscated" URLs such as

http://3640005069/
http://7934972365/
http://0330.0366.0021.0315/
http://%6c%6f%63%6b%65%72%67%6e%6f%6d%65%2e%63%6f%6d/

Although the display text is left unchanged, the hyperlink will point to a non-obfuscated URL (either the domain name, or an IP address). This is because obfuscated URLs such as these are often used by spammers, and the author has no intention of allowing his software to aid spammers in their goals.

If someone cares to give me a valid reason for using such URLs I may reconsider this behaviour.
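
To show what "non-obfuscated" means for the four sample URLs above, here is a small Python sketch. It is an illustration only, not the converter's code; the function name is hypothetical, and treating over-large decimal values as wrapping modulo 2^32 is an assumption about how such hosts were resolved.

```python
import re
from urllib.parse import unquote


def deobfuscate_host(host):
    """Return a readable form of an obfuscated URL host (illustrative only)."""
    host = unquote(host)  # undo %xx escapes, e.g. "%6c%6f" -> "lo"
    if re.fullmatch(r"\d+", host):
        # A single decimal number is read as a 32-bit IPv4 address.
        # Assumption: values above 2**32 wrap around.
        value = int(host) % 2**32
        return ".".join(str((value >> shift) & 0xFF) for shift in (24, 16, 8, 0))
    parts = host.split(".")
    if len(parts) == 4 and all(re.fullmatch(r"0[0-7]+", p) for p in parts):
        # Four parts with leading zeros are read as octal octets.
        return ".".join(str(int(p, 8)) for p in parts)
    return host
```

Under these assumptions the first three sample hosts all decode to the same address, 216.246.17.205, and the percent-encoded one decodes to lockergnome.com.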

Added support for embedded headings with the Expect embedded headings policy (see embedded heading detection). These are "headings" that are embedded as the first sentence in a paragraph.

Added support for headings that start with particular words or phrases via the Heading key phrases policy (see section on detecting key phrase headings).

New /COMMA and /TABBED command line qualifiers that allow comma-delimited and tab-delimited files to be converted into tables.

Added Check indentation for consistency policy to allow checking of headings to be relaxed (e.g. when they're centred on the page).

Added Look for diagrams policy

Added Input file contains PCL codes policy

Added Input file contains Japanese characters support.

Added Preserve new paragraph offset policy

Added Omit <HEAD> and <BODY> from output policy

Added Document Base URL policy

Added Comment generation code policy

Added Number of words to include in filename policy to allow filenames to be generated from the first few words of the title when splitting documents with underlined or capitalised headings at each heading.

Added Lines to ignore at end of file and Lines to ignore at start of file policies to allow lines at the start and end of the source file to be discarded. This can be useful if your source text comes from a third party source that adds extra, unwanted lines.

Added Suppress all colour markup policy

Added Column boundaries have zero width policy

Other changes in version 4.0

Windows version

On some systems DDE doesn't always work properly. This would cause the program to hang when it attempted to display results. In such cases you would need to stop the program from the task manager. In version 4 the program will now detect when this has happened and disable use of DDE next time it runs.

NOTE: DDE won't work with Netscape 6.0 (it doesn't support it)

Added the policy Suppress URL messages to the Settings | Diagnostics menu option. When enabled all URLs, email addresses etc will be listed in the log file. Since this file can be saved to disk, this is one way of identifying all the candidate hyperlinks from your text file.

All versions

Improved analysis for tables using bar ('|') column separators

Improved detection of ASCII art diagrams.

Improved handling of heavily indented blocks of text. Previously these were (poorly) rendered as tables. Now the tables more accurately preserve the large indentation (see Text blocks).

The first three words of an underlined heading are now used to generate the filename. Previously only the first word was used, leading to less meaningful names, with more chances of duplication.

VMS command line now allows multiple filespecs, separated by spaces. Policy file must now be a .pol file, rather than the second argument.

Anchor names from filename are now lower case (to reduce possible mismatches)

Shareware version now expires after 30 days + 5 uses. This will allow people to use the software on 5 different days after the first 30 days, giving people more time to evaluate the software at their leisure.

Now strip out leading and trailing "---" from heading text to make them more presentable in HTML or RTF

Added support for headings that span up to 3 lines; previously the limit was 2.

Changed heading to allow <H4> markup to be used. Previously "level 4" headings would get <H3> markup since anything smaller would end up smaller than the main text. With the advent of CSS style sheets this should be less of a problem.

Changed emphasis handling to allow hyphenated parts to be emphasised independently, e.g. *pre*-formatted or pre-*formatted*.
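
Two of the changes listed above, stripping "---" from heading text and building filenames from the first three words of an underlined heading, can be sketched in Python as follows. This is only a sketch under stated assumptions: the function names, the underscore separator, the lower-casing and the ".html" extension are all hypothetical choices, not the program's documented behaviour.

```python
import re


def clean_heading(text):
    """Strip leading and trailing runs of '-' (and spaces) from heading text."""
    # Note: this removes any number of dashes, not only exactly three.
    return text.strip().strip("-").strip()


def filename_from_heading(heading, words=3, extension=".html"):
    """Build a filename from the first few words of a heading (assumed scheme)."""
    tokens = re.findall(r"[A-Za-z0-9]+", clean_heading(heading))
    stem = "_".join(t.lower() for t in tokens[:words])
    # Fall back to a fixed stem if the heading contains no usable words.
    return (stem or "section") + extension
```

With `words=1` the sketch reproduces the older single-word behaviour, which makes it easy to see why longer headings sharing a first word used to collide.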

Version 3.3 (June 2000)

The AscToHTM 3.3 release follows 6 "micro-releases" announced via the updates page on the Web. As such it will appear as a small step forward over 3.2.06, but in fact it offers a fair amount of new functionality over version 3.2.

Major changes in version 3.3 include :-

Support for fonts. You can now choose a font for the whole document. By default this is implemented using CSS, but you can elect to use <FONT> tags should you prefer.

Enhanced Language support. The Spanish and German interfaces added in the last version have had Portuguese added. Also a new feature allows you to save the interface to a "language skin" text file which may be edited and then reloaded. Using this feature we can now offer

American English (simply a spell-checked UK English file)

"Babelfish" French. A French translation from http://babelfish.altavista.com/

"Babelfish" Italian. An Italian translation from Babelfish.

If anyone wants to correct these files and send them back to me, feel free.

More table generation controls. Several new controls have been added to give you more control over the detection, analysis and generation of tables in the text.

Support for comma and tab delimited tables. Pre-processor commands have been added to allow you to mark up a section of comma-delimited or tab-delimited data you want turning into a table.

Support for preserving file/line structures. You can now elect to preserve the original line structure of a file, or to place the whole file in <PRE> markup (which is a little defeatist, but has its uses)

Support for non-standard characters. The program can now recognize, to a limited extent, DOS line-drawing characters, MIME-encoded text and text documents with "change bars" in them.

New "Tag manual". The Using the pre-processor and in-line tags sections of this document have now been re-merged and their contents largely moved to a new document called the Tag manual.

New functions in version 3.3

Fonts

The default font for the whole document can now be set via the Default font policy. Headings will also adopt the selected font, and will scale with the selected font size, although the <H1> headers are slightly smaller than the default.

You can choose to have the fonts implemented using <FONT> tags or CSS (e.g. according to your target audience) using the Use CSS to implement fonts policy.

Tables

Added several new policies and tags to help with table analysis. Policies added include

Default TABLE layout
(also pre-processor tag TABLE_LAYOUT)

This allows you to specify the number of columns in each table, and the attributes of each column, specifically the character position that marks the end of each column. Rather than use this policy, it is probably better to use the related directive $_$_TABLE_LAYOUT in the source text on a per-table basis.

Default TABLE alignment
(also pre-processor tag TABLE_ALIGN)

Allows the alignment of the table to be specified (left, right, center)

Ignore table header during analysis
(also pre-processor tag TABLE_IGNORE_HEADER)

Specifies that table headers should be ignored when columns are being auto-detected. Some tables have complex headers that confuse the analysis. This policy can be used to help them be ignored.

Table extending factor

Controls the degree to which pre-formatted lines should be expanded into adjacent text.

Column merging factor

Controls the degree to which columns which don't appear to be very clear should be "merged" together

Could be blank line separated

Indicates that tables could be using blank lines to separate rows of data. This affects the analysis and detection of the table's extent.

Added support for embedding comma-delimited and tab-delimited table data in your source file (e.g. data exported from Excel and the like).

The new pre-processor directives :-

BEGIN/END_COMMA_DELIMITED_TABLE
BEGIN/END_DELIMITED_TABLE
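
To give a feel for what turning delimited data into a table involves, here is a minimal Python sketch. It is not how AscToHTM itself does it: the function name is hypothetical, treating the first row as a header row is an assumption, and the generated markup is deliberately bare.

```python
import csv
import html
import io


def delimited_to_table(text, delimiter=","):
    """Render delimited text as a simple HTML table; first row becomes headers."""
    # csv.reader copes with quoted fields that contain the delimiter.
    rows = [r for r in csv.reader(io.StringIO(text), delimiter=delimiter) if r]
    out = ["<TABLE>"]
    for i, row in enumerate(rows):
        tag = "TH" if i == 0 else "TD"
        cells = "".join(f"<{tag}>{html.escape(c.strip())}</{tag}>" for c in row)
        out.append(f"<TR>{cells}</TR>")
    out.append("</TABLE>")
    return "\n".join(out)
```

Passing `delimiter="\t"` handles tab-delimited data in the same way, which is why the two cases can share one code path in a sketch like this.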

Other

Added options to allow more control over how the original document's file structure should be preserved

Treat each line as a paragraph

If this option is selected, every line in the source file is treated as a paragraph. This may be suitable if the file has been authored using an editor that wraps the lines (i.e. doesn't put in hard breaks) and which doesn't add blank lines between paragraphs.

Preserve line structure

If this option is selected a <BR> is added to every line, thereby preserving the line structure of the original and giving the resulting HTML file an "A4 look" that hugs the left margin regardless of how wide the window is made.

Preserve file structure using <PRE>

If this option is selected the whole document is placed in <PRE> markup, and very few conversions are attempted. This is really a "last resort" option that you may want to use if the file has complex structures which the program is failing to understand. This option was added for a customer who wanted to convert all 2800 RFCs without having to manually correct each one.

Added support for parsing files with some Mime-encoded quotable strings in them. The new policy Input file contains mime encoding can be found under Analysis->File structure. At present there is some (very limited) auto-detect for this feature.

Added support for documents with change bars. By default change bars are stripped out and the changed text is coloured red; this behaviour may change in later versions. Added the new policy Input file has change bars, which can be found under Analysis->File Structure.

Added support for converting DOS characters. The new policy Input file contains DOS characters can be found under Analysis->File Structure.

There is a limited auto-detect of DOS characters when diagrams are present.

Changed hyperlink detection to only allow explicit FTP URLs and email addresses that don't start with numbers. These behaviours can be reversed using the new policies Only allow explicit FTP links and Allow email beginning with numbers, both of which are on the Output->Hyperlinks tab.

Added the policy Create gopher links to toggle the conversion of gopher links into hyperlinks.

Added the policy Check indentation for consistency so that it could be disabled in documents where headings were centred (and thus all at different indentations)

Added several new pre-processor in-line tags :-

FILENAME - output name of converted file
FRACTION - output a fraction
VERSION - output program version number

IGNORE_THIS - for comments in the source code

Added policy to allow selection of which version of HTML should be generated. Policy is "HTML version to be targeted".

Only "HTML 3.2" and "HTML 4.0 Transitional" are currently supported.

Other changes in version 3.3

Windows

The main screen now allows access to Policy file selection. Previously this was only available on the menu structure. The Menu structure has been left unchanged, meaning you now have two ways of choosing your policy files.

All

The contents list styling has been changed slightly. For example only the major section headings are now shown in bold. People were complaining :-)

Now add BORDER=0 attribute to tables with no border, rather than just omitting the attribute. This is a workaround for a bug in Netscape where a gap appears where a border would be when coloured rows are selected.

Support for IE 3.0 as the browser of choice is added, by allowing the filename rather than file URL to be passed to the browser. To do this disable the "file://localhost/" option on the Settings->Viewers dialog screen.

More changes on bullet characters, in particular to disallow 'O' (upper case) from becoming a bullet character through analysis. This really doesn't work in Portuguese documents :-) 'o' (lower case) may still be detected. If upper case 'O' is wanted this can still be manually switched on.

Increased maximum width allowed in tables to 200 (after encountering a sample at 165). Lines longer than this are disregarded as candidate table lines.

Introduction of German and Portuguese user interface, with extension of the Spanish user interface.

Horizontal lines are now implemented as <HR> tags whose length attempts to approximate the original (e.g. 50% or whatever). Previously lines would become full width.

Chapters 7 and 8 of this document were merged into a single chapter 7 (about the pre-processor). Most of that material has now been moved to the new Tag manual. Subsequent chapters have thus been renumbered which may lead to invalid references to chapter 11... especially if you keep old versions of the doco lying around.

Also reversed the order of sections in this "Change History" section

Version 3.2 (October 1999)

(Version 3.1 was never released, but a release of AscToTab occurred sometime after version 3.0, and so in keeping with the policy of synchronizing version numbers that was labelled version 3.1)

Over a year after the last release, version 3.2 is a major upgrade, but is only given a minor version number change because the remainder of the functionality produced in that time will be revealed in version 4.0.

Version 3.2 starts to prepare the groundwork for Cascading Style Sheet (CSS) and general font support that will be introduced in version 4.0. This has required a fairly radical change to the type of HTML code generated and how this is put together.

For example the HTML is now more standards compliant (this is now a stated goal of the software, although I can't always promise full compliance; see Standards compliance), and as an aid towards CSS support "optional" end tags such as </P> are now being placed in the generated HTML.

Note that the use of the <FONT> tag is deprecated in HTML 4.0, and if you choose to add FONT markup to your pages they'll become much bigger, especially if they contain tables. This is because the HTML standard requires the FONT tag to continually be re-expressed to achieve the right appearance in all browsers (believe me, I only accepted this through bitter experience and grudgingly).

Major changes in version 3.2 include :-

The program now always makes three passes through the document - previously it only did this if a contents list was requested. This may make the conversion a little slower. The middle pass calculates how the file will be split into sections, where all the hyperlinks should point to and what the contents list should be. This approach should be less error prone than previously.

New "overview" options (see 'What to look for' Policies). These allow you to easily enable and disable the program's search for certain features.

Introduction of in-line tagging (see in-line tags). These allow you to get more out of your conversion by inserting commands into your source text.

Addition of DDE support (in Windows)

New and improved command line options, and full command line support built into the Windows version

Improved message filtering. Each message is now labelled according to its type (information, warning etc), and may be optionally suppressed or filtered by severity. A new /SILENT command qualifier allows complete suppression of messages.

Improved log file capability

Added support for mail and USENET headers

(Limited) support added for stripping out page markers, converting "double spaced" files, and converting .prn and VT escape sequences. This functionality may be improved in later versions.

New options to colour the odd and even rows of tables differently (see Table generation policies and 7.1.4)

New functions in version 3.2

Windows Version

Added "Save" option to status dialog, so that the messages can be saved into a .log file

Added DDE support to display results in existing browser window

Full drag and drop support added. You can now drag files onto the program when it is visible.

New "browse for directory" buttons added.

More menu options added to make finding policies easier.

All versions

Now support tab-delimited tables

Support for stripping out mail and USENET headers

New pre-processor directives :-

BEGIN/END_DELIMITED_TABLE section delimiters

BEGIN/END_IGNORE command

CONTENTS_LIST command

NAVIGATION_BAR command

LINERULE command

TOC command

New and improved command line qualifiers

/CONSOLE

/LIST

filename

/SILENT

(improved) <filespec>. You can now specify the log filename

(improved) filename. You can now specify the created policy filename

New overview "look for" analysis policies :-

Look for indentation

Look for white space

Look for short lines

Look for quoted text

Look for preformatted text

Look for MAIL and USENET headers

Look for horizontal rulers and Minimum ruler length

Look for bullets

Look for hanging paragraphs

Other new analysis policies :-

Input file has page markers and Page marker size (in lines)

Input file is double spaced

recognize '-' as a bullet

recognize 'o' as a bullet

New diagnostic policies :-

"Monitor tag generation"

Display messages policy and /SILENT qualifier

Suppress INFO messages,

Suppress TAG ERROR messages

Suppress URL messages

Suppress WARNING messages

Suppress program ERROR messages

Other new output policies :-

Maximum level to show in contents

Preserve underlining of headings

Use <EM> and <STRONG> markup

Colour data rows and related policies (see TABLE generation policies).

Default TABLE cell alignment and TABLE_CELL_ALIGN directive

Suppress all colour markup

Open links in new browser window and new browser window name

Break up long HTML lines

Other changes in version 3.2

On the web site, and documentation

A dedicated site www.jafsoft.com now deals with AscToHTM and related products.

An updates page has been added to the Web site. This will list all the updates available for AscToHTM, although in most cases you'll need to be a registered user to receive details of how to obtain the update.

An AscToHTM FAQ has been added to the web site. It's not finished yet (what part of the web is?), but it may help answer some of your questions.

Created a new document called "The Policy manual". This replaces what was becoming the largest section of this document.

Windows version

The Windows help file now has a better Index. It also has a full contents list as a topic, showing you the structure of the RTF file used to generate the Help file. Unfortunately I've been unable to hyperlink this topic.

The Windows version now "remembers" which options page you were on so that each time you go back there the same sheet is shown.

The Windows version is now "statically linked" against the necessary .DLLs. This makes the program slightly larger, but makes the download smaller as it is no longer necessary to ship .DLLs with the program. This makes overall version management simpler.

VMS version

The VMS version now converts all filenames to lower case internally. This is so that all hyperlinks and references to the file are in lower case, making them more Internet-friendly and portable to other systems.

All versions

Changes to the tagging to aid standards compliance and CSS support. This includes the addition of the </P> tag, which was previously omitted. These changes have introduced slight differences in the amount of vertical white space produced in places.

Improvements have been made to the file splitting algorithms. In particular

The program will no longer generate two output pages with the same name. Where duplicate names are detected, the second file is given a generated name, usually by appending "_n" (n=1,2,3...) to the filename. All hyperlinks pointing to sections in the duplicate file will be adjusted accordingly.

A file with underlined headings can now be split into pages at the heading boundaries. The subsequent pages have _U1, _U2... appended to the name of the first page.

Local links (i.e. to anchors in the same file) are now recognised as such, and the filename is omitted.

This should make it easier to rename files after production without breaking local hyperlinks. Links to/from other files would still stop working though.

Link names for underlined or capitalised headings that are more than 60 characters long are now truncated. They are given a link name derived from the first 30 characters of the section name with a unique identifier tagged on the end. This avoids long link names being split over two or more lines and becoming unusable.

Allow relative links to subtract out filename (e.g. in contents list) when target is in same file
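
The duplicate-filename and long-link-name rules described above can be sketched in Python as follows. This is an illustration under assumptions rather than the program's implementation: the function names are hypothetical, the form of the "unique identifier" is invented for the example, and the sketch does not check whether a generated name clashes with a name that already exists.

```python
def unique_filenames(names):
    """Give repeated names an "_n" suffix (n = 1, 2, 3...); first use is kept."""
    seen = {}
    result = []
    for name in names:
        stem, dot, ext = name.rpartition(".")
        if not dot:  # no extension present
            stem, ext = name, ""
        count = seen.get(name, 0)
        seen[name] = count + 1
        if count == 0:
            result.append(name)
        else:
            result.append(f"{stem}_{count}{dot}{ext}")
    return result


def link_name(heading, index, limit=60, keep=30):
    """Truncate link names longer than `limit` to `keep` characters plus an id."""
    name = "_".join(heading.split())
    if len(name) <= limit:
        return name
    # Assumption: the unique identifier is simply a running section index.
    return f"{name[:keep]}_{index}"
```

In a real conversion every hyperlink that targets a renamed file or a truncated link name would also have to be rewritten, which is what the intermediate pass through the document makes possible.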

Can now recognize URLs with commas in them, such as http://cgi.pathfinder.com/netly/opinion/0,1042,1692,00.html, in addition to comma separated lists of URLs.

The KEYWORDS, DESCRIPTION and TITLE pre-processor commands can now be multi-line. This allows long lists of keywords to be placed over several lines (each beginning with the command), making them easier to manage.

The default name for the directory index file is now "dirindex.html" rather than "index.html" to prevent overwriting of any existing index file.

Program now always does a "contents pass". Benefits of this are

can now generate in situ contents lists /contents bars

can now generate navigation bars wherever wanted

can now eliminate duplicate filename generation

can check hyperlink cross references are correct

Improved table/diagram recognition

Now support conversion of tab-delimited data into tables, provided it's placed inside BEGIN/END_DELIMITED_TABLE directives

Relaxed indentation test on "n.n" headings. Heading can now be 2 characters to the left, or 1 character to the right of the expected position

Now recognize use of asterisk and underscore combined to produce bold-italic emphasis. Previously only asterisk (bold) and underscore (italic) by themselves were recognised.

Now recognize "]" as a possible "quoting" character.

Now recognize '+' as an underlining character

Improved error reporting when file errors occur. The program will now abort the conversion on error, instead of continuing and reporting errors for each line.

Now detect read-only output directories and abort conversion. This would occur if you tried to convert a file on CD.

Definitions now use <DL compact> offering a more-faithful rendition of the original text

Underlined heading and text will now be rendered as underlined by default. Previously this either promoted the previous line to be a heading, or was drawn as a line.

Improved handling of first line indents on paragraphs. Now these are preserved in the output by the inclusion of &nbsp; characters, and the error whereby the following line was deemed to be a different indentation (and thus acquire a <BLOCKQUOTE>) has been largely solved.

Introduction of the TEXT in-line tag now allows numbers like "Windows 3.1" to be protected from conversion into a hyperlink to section 3.1.

Version 3.0 (August 1998)

There are a fair number of small changes in functionality over V2.3, together with a fair number of bug fixes and refined algorithms. A lot of development during this time was directed towards the production of a text-to-RTF converter (AscToRTF) using the same analysis engine. Consequently there are a lot of changes "under the bonnet".

The main functional change has been the revamp of the Windows User Interface. A new section (4.1.2) has been added to this document describing the Windows interface in some detail. The changes include :-

the button bar is replaced by a proper Windows menu, allowing easier access to the program's functions.

under the Help menu a link to the HTML documentation shipped with the software is now provided.

the policy sheets are now "non-modal". This means you no longer have to dismiss them in order to do a conversion, you can leave them up whilst the conversion is going on, making it easier to go through the convert-change policy-convert cycle.

New functions in version 3.0

Windows Version

Major re-structuring of the user interface

Program's Help options now provide access to the online and offline versions of the HTML doco. A lot of people were downloading the software and then picking up a version of the doco, unaware they already had it. Don't you people read README.TXT files or what? :-)

All Versions

New Search for Definitions policy

New TAB size policy

New Expect sparse tables policy and TABLE_MAY_BE_SPARSE pre-processor command

New Add <BR> to lines with URLs policy

New Output file extension policy

New Minimise HTML file size policy

New Headings colour policy. Eventually I hope to add a whole suite of heading styling options, as these have been requested by a number of people.

New Convert TABLE X-refs to links policy and TABLE_CONVERT_XREFS pre-processor command

New CHANGE_POLICY pre-processor command

New Error reporting level policy

Other changes in version 3.0

Improved Windows interface

Empty lines in a table cell now get an extra &nbsp; added, in addition to the <BR>. This is to compensate for a bug in Internet Explorer 3 which would ignore the <BR> otherwise, leading to alignment errors.

Now treat phrases in which all the words are connected by underscores, with underscores at both ends as well, as underlined, e.g. _this_type_of_thing_

Improved handling of tables with long URLs in them. Previously these would not be recognised as part of a table. Increased "long line" limit inside tables to 110 characters

Improved error reporting/handling

Report unrecognised pre-processor lines

Report results of table analysis (e.g. if diagrams are detected)

Report failure to find requested files

Abort conversion if can't find requested policy file

Improved detection of "mal-formed" tables. Previously this was over-cautious, especially on short tables.

Now add a trailing "/" to www etc URLs if none present (e.g. www.jafsoft.com). This is a more correct URL, which should be accessed slightly more efficiently.

Now recognize "....." underlining, although why people do this is beyond me :)

Improved contents list detection in short documents with only level one headings, and documents with a chapter "0".

Improved headings detection in small files. Made this less trigger happy.

Improved code detection, and now add bold emphasis of C++ like comments inside a code section

No longer allow "{" and "}" to be detected as probable bullet characters when code is expected

I've produced (with help from antipodean friends) an icon for files converted by AscToHTM. It's called a2hlogo.gif. Feel free to use it should you wish on any pages created with AscToHTM.

An example piece of HTML code would be

        <A HREF="http://www.jafsoft.com/asctohtm/?from=doco">
        <IMG SRC="a2hlogo.jpg" WIDTH=100 HEIGHT=36 BORDER=0
        ALT="Converted by AscToHTM"></A>

With the introduction of the Add <BR> to lines with URLs policy this behaviour is no longer default. That is, if you do want <BR> added at the end of all lines containing URLs you will need to switch this behaviour on using the new policy.

With the introduction of the Convert TABLE X-refs to links policy this behaviour is no longer default. That is, if you do want section links inside your tables, you will need to switch this behaviour on using the new policy.

".htm" files are now created with a lowercase extension, unless the Use DOS filenames policy is selected

Version 2.3 (April 1998)

Minor bug fixes and upgraded functionality over V2.2. The main functional changes have been

The introduction of wildcard support to allow conversion of multiple files at once.

(related to the above) the introduction of the Directory Page feature that allows the generation of a hyperlinked document spanning all the files in a directory.

Major re-write of the contents-list generating routines. The program now makes a third, intermediate, pass through the document to analyse the contents structure. This means that contents lists are now placed at the top of the HTML file by default, rather than in a separate file as previously - though that behaviour is still supported if wanted.

This approach is expected to pay further dividends in later releases.

New functions in version 2.3

Windows Version

Added a "Perform simple conversion" tick box on the front panel. This does exactly the same as the Keep it simple policy.

Improved the Headings dialog to allow headings policies to be more easily edited now.

Pre-processor document sections now working.

All versions

Wildcard support has been added (see Using Wildcards).

Major re-writing of contents list generation has occurred. Includes new Use any existing contents list and Generate external contents file. More changes are expected here in later versions.

New Directory Page feature. Supporting policies include:-

Make Directory
Directory filename
Show file titles in Directory
Indent headings in Directory
Directory title
Directory keywords
Directory description
Directory return hyperlink text
Directory Script file
Directory header file
Directory footer file

New Minimum TABLE column separation policy and TABLE_MIN_COLUMN_SEPARATION pre-processor command to allow some tuning of table analysis.

New Use first heading as title policy

New Use first line as title policy

New Recognised USENET groups policy

New Automatic centring tolerance policy

New Use <P> markup for paragraphs policy to allow choice of either <P> or <BR> markup to be used for paragraphs.

New Default table width policy and TABLE_WIDTH pre-processor command to allow table widths to be specified as percentages

New pre-processor command HTML_LINE

Other changes in version 2.3

Reinstated some of the "error" messages removed in the last version, to do with section numbering. This should make it more visible when the section heading analysis goes wrong.

Added error reporting to file open. You should now get an error message if the program fails to find/open a file somewhere.

Now support headings down to 5 levels (previously this was 4). Note, if you only have a couple at this level, the program may still ignore them as statistically insignificant.

Removed certain policies (such as "generate policy file") from the output when generating a full policy file. This is because, when they were read back in, they could cause problems.

The "Include document section" policy is now renamed to "Include document section(s)" reflecting the fact that you can now enter multiple values on one line, rather than requiring multiple lines with one value each as previously.

Major re-structuring of, and additions to, the HTML markup produced section to make it more coherent and up to date. Some of the sections marked as new in this version are simply the documentation catching up on the features added in earlier releases.

Sometimes I just work too hard :^)

Version 2.2 (Feb 1998)

First major release after V2.0 (when AscToHTM first went fully-Windowed). Major change this time has been the introduction of TABLE generating algorithms. These were first made available as a separate freeware utility AscToTab.

This version was reviewed by ZDNet and awarded 5 stars, their highest award.

New functions in version 2.2

Table generation

This is the biggest change in this version. AscToHTM now incorporates the technology first introduced in AscToTab. To support this the detection of pre-formatted text has been improved, new policies added, and new pre-processor commands added.

New policies include :-

Attempt TABLE generation
Default TABLE border size
Default TABLE header rows
Default TABLE header cols
Default TABLE cell spacing
Default TABLE cell padding
Default TABLE colour
Default TABLE border colour
Default TABLE caption

New Pre-processor commands include :-

BEGIN/END_CODE
BEGIN/END_DIAGRAM
BEGIN/END_TABLE
TABLE_BORDER
TABLE_BORDERCOLOR
TABLE_BGCOLOR
TABLE_CAPTION
TABLE_CELLSPACING
TABLE_CELLPADDING
TABLE_HEADER_ROWS
TABLE_HEADER_COLS

Other changes

Added pre-processor BEGIN/END_CODE commands to allow sections of code samples to be identified and distinguished from tables

Added pre-processor BEGIN/END_DIAGRAM commands to allow diagrams and sections of ASCII art to be identified and distinguished from tables

Other changes in version 2.2

Documentation

Added the "Policy Dictionary" (since superseded by the Policy manual), and renumbered the document accordingly.

All versions

"tables/pre-formatted text"

Various improvements to detecting the start and end of pre-formatted regions of text.

Shareware now expires after 30 days, rather than after a fixed date.

Headings policies have been revised. Still more work to be done in this area.

Slight improvement in detection of centred text. Still not good enough to offer as a default though (too prone to errors).

Added section on saving/using policy files

Shareware version now adds nag lines at top and bottom of the page, instead of just the top.

A number of improvements in code sample detection

Reduced number of "error" messages reported. These may be made optional in a later version, and are still placed in the diagnostic files if these are created.

Windows version

Added a "Settings" dialog to allow you to configure various aspects of how the program runs, such as which browser to view files with and which policy file to use as default.

New /COMMA and /TABBED command line qualifiers that allow comma-delimited and tab-delimited files to be converted into tables.
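The idea behind converting delimited files into tables can be sketched as follows. This is only an illustration in Python, not the program's own code: the function name delimited_to_table is hypothetical, and the exact markup AscToHTM emits for such tables is not described in this section.

```python
import csv
import html
import io

def delimited_to_table(text: str, delimiter: str = ",") -> str:
    """Turn comma- or tab-delimited text into a bare HTML table (hypothetical helper)."""
    rows = csv.reader(io.StringIO(text), delimiter=delimiter)
    lines = ["<TABLE>"]
    for row in rows:
        # Escape each cell so characters such as < and & are safe in HTML.
        cells = "".join("<TD>%s</TD>" % html.escape(cell) for cell in row)
        lines.append("<TR>%s</TR>" % cells)
    lines.append("</TABLE>")
    return "\n".join(lines)

# Comma-delimited input; pass delimiter="\t" for tab-delimited input.
print(delimited_to_table("Name,Age\nAnn,34"))
```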

Version 2.1 (never officially released)

V2.1 was never officially released, but much of this functionality "crept out" as the shareware version was updated. Some of these versions were shown as V2.01 instead of V2.1. There's nothing like a bit of consistency (and yeah, this was nothing like a bit of consistency).

New functions in version 2.1

New Document keywords policy and pre-processor KEYWORDS command.

New Document description policy and pre-processor DESCRIPTION command.

New Hyperlinks on Numbers contents policy

New Document style sheet policy and pre-processor STYLE_SHEET command.

Other changes in version 2.1

All versions

Now recognize domain names without a protocol specified (such as http:// or ftp:// etc.) that end in standard domains (e.g. .edu, .net, .org etc) as probable FTP sites. This allows references to sites like rtfm.mit.edu to be correctly turned into hyperlinks.
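A minimal sketch of this kind of detection is shown below, assuming a simple regular-expression approach. It is not the program's actual algorithm: the STANDARD_TLDS list, the link_bare_domains function and the choice of an ftp:// prefix for the generated link are all assumptions made for illustration.

```python
import re

# Hypothetical list of "standard" domain endings; the real list is not given here.
STANDARD_TLDS = ("edu", "net", "org", "com", "gov")

# Match dotted names ending in a standard domain that are not already part of
# a URL or an email address (no word character, '/', '@', '.' or '-' before).
BARE_DOMAIN = re.compile(
    r"(?<![\w/@.-])((?:[a-z0-9-]+\.)+(?:%s))(?![\w-])" % "|".join(STANDARD_TLDS),
    re.IGNORECASE,
)

def link_bare_domains(text: str, scheme: str = "ftp") -> str:
    """Wrap protocol-less domain names in a hyperlink using the given scheme."""
    def replace(match: "re.Match[str]") -> str:
        name = match.group(1)
        return '<a href="%s://%s/">%s</a>' % (scheme, name, name)
    return BARE_DOMAIN.sub(replace, text)

print(link_bare_domains("See rtfm.mit.edu for the FAQ archive"))
```

Names that already carry a protocol, such as http://www.w3.org/, are left untouched by this sketch.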

Some renumbering of this document has occurred

Quoted text is now marked up using <em>..</em> markup

Windows version

Now stores data in the Registry under the HKEY_CURRENT_USER root with a "\Software\JafSoft\AscToHTM\..." key

Now supports "most recently used" lists for both policy files and files to be converted. These are accessed via a drop-down Combo box.

Now remembers last source directory each time the program is run. This is used as the initial directory next time the Browse button is pressed.

The filenames now include the path. This is to allow the most recently used (MRU) file drop-down list to function correctly.

Version 2.0 (October 1997)

Version 2.0 marks the production of the first fully-windowed version for Windows 95/NT. This took a few months to be produced, so a fair number of other features have been added over this time.

New functions in version 2.0

New Output policy filename policy

New Use .HTM extension policy

New Generate diagnostics files policy

New External contents list filename policy

New Use <DL> markup for defn. paras policy

New Ignore multiple blank lines policy

New Look for this and that emphasis policy

New Allow definitions inside PRE policy

New Pre-processor BEGIN/END_CONTENTS command

New Pre-processor BEGIN/END_HTML command

New Pre-processor TITLE command

New Pre-processor INCLUDE command

Other changes in version 2.0

White space immediately adjacent to PRE sections now ignored.

Changed anchor names to contain no spaces (makes URLs easier to quote)
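One way to derive a space-free anchor name from heading text is sketched below. The underscore substitution and the anchor_name function are assumptions for illustration only; the naming scheme the program actually uses is not described here.

```python
import re

def anchor_name(heading: str) -> str:
    """Build an anchor name with no spaces from heading text (hypothetical scheme)."""
    # Trim the ends, then collapse each run of whitespace to one underscore.
    return re.sub(r"\s+", "_", heading.strip())

name = anchor_name("New functions in version 2.0")
print('<A NAME="%s">' % name)
```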

Title defaults to "Converted from filename" instead of "No title" (see also 7.1.2)

Introduced some support for the use of ctrl-H (backspace) in Unix documents to underline and highlight words

Automated "simple" file detection now attempted

Automated "code samples" detection now attempted

Some policies have been renamed as follows :-

Was                         Now
---                         ---
Expect Numbered sections    Expect Numbered Headings
HTML header                 HTML header file
HTML footer                 HTML footer file

The policy section headings have been renamed as well. This may cause "ignored policy line" messages when old policy files are used.

Version 1.1 (August 1997)

New functions in version 1.1

Added an Only use known groups policy to improve accuracy of newsgroup hyperlink detection.

Added more document colour policies

Added a /POLICY and Output Policy file option to make the generation of an output policy file optional

Added pre-processor support for user-formatted sections

Other changes in version 1.1

Indentation is now done using <BLOCKQUOTE> markup.

Changed default background colour to white.

Generation of a .pol file is no longer default

The use of <PRE> ... </PRE> to mark up user-formatted text is replaced by the new pre-processor commands BEGIN/END_PRE

Re-write of section 4.1

Improved error reporting. The .LIS file created if the /DEBUG qualifier is used now has error and information messages included in it.

Version 1.05 (July 1997)

New functions in version 1.05

Added an Output policy file policy. This allows the suppression of output policy files where not wanted.

Added an Expect code samples policy. This helps in technical documents that include samples of C code.

Added pre-processor support to allow variant documents to be produced (see Pre-processor policies and Using the pre-processor)

Other changes in version 1.05

Policies now accept "Yes/No" as well as "True/False". "Yes/No" is now the default when outputting policies.
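The behaviour described above can be sketched as a small parser and formatter. This is an illustration only: the function names are hypothetical, and treating the values as case-insensitive is an assumption not confirmed by this section.

```python
def parse_policy_bool(value: str) -> bool:
    """Accept "Yes"/"No" as well as "True"/"False" (case-insensitivity is assumed)."""
    normalized = value.strip().lower()
    if normalized in ("yes", "true"):
        return True
    if normalized in ("no", "false"):
        return False
    raise ValueError("not a recognised policy value: %r" % value)

def format_policy_bool(flag: bool) -> str:
    """Write policies out using the "Yes"/"No" form, the default on output."""
    return "Yes" if flag else "No"

# An old-style "True" value is read in and written back as "Yes".
print(format_policy_bool(parse_policy_bool("True")))
```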

Shareware version now limited to processing the first 500 lines only.

Lines with email addresses no longer have <BR>'s forced on the end. Lines with http, ftp and news links still do. This will become fully configurable in later versions.

Version 1.04 (July 1997)

New functions in version 1.04

Added policy Minimum automatic <PRE> size. This replaces the policy "Allow automatic 1-line <PRE>"

Added policies Largest allowed <Hn> tag and Smallest allowed <Hn> tag to allow control over generated heading sizes.

Added policy Short line length

Added batch processing to allow multiple files to be converted at the same time.

Other changes in version 1.04

Created a 16-bit DOS version

VMS version now available as freeware.

Added "SendTo" tips for Windows 95/NT users section to the documentation.

Version 1.01 (April 1997)

Added the /CONTENTS qualifiers.

Added the /SIMPLE qualifier.


