Documentation for the AscToHTM conversion utility
This documentation can be downloaded as part of the documentation set in .zip format (370k)
- New /TABLE (see 4.2.2.13) command line qualifier that allows the input file to be treated as a single plain text table
- Added support for HEAD_SCRIPT HTML fragment. This allows HTML to be defined that can be copied into the <HEAD> of a document. This can include <META> tags or <SCRIPT>...</SCRIPT> sections.
- Added Swedish interface. Many thanks to Dan Sverraby.
- Added new policy "Only allow pages to be viewed in frames"
- New utility A2HDETAG is available to registered users so they can "de-tag" their source files to remove all AscToHTM pre-processor tags, leaving a plain text fit for publishing, e.g. on Usenet.
- Added BEGIN_ASCII ... END_ASCII pre-processor tags. These identify text that will be copied to the output of A2HDETAG. It is ignored in all other conversions, and is intended to allow alternative text to be placed in text and HTML versions of a document.
- Added "character encoding" policy to allow the character encoding of a document to be set. The software has limited ability to detect Japanese ("x-sjis") and Cyrillic ("koi-8") text, but in some cases the encoding will need to be set manually.
The auto-detect of character sets can be switched off by using the "Look for character encoding" policy
- Added policies to allow different fonts to be applied to different types of text as follows
  Normal text          "Default font"
  Headings             "Heading Font"
  Text in tables       "Table font"
  Table of contents    "TOC Font"
  Fixed-pitch text     "Fixed font"
The "Default Font" policy existed previously; the other four policies are new in this version.
- Added "PAGE" directive. This marks a page boundary. In HTML this simply results in a <HR> tag, since HTML doesn't really support pages. This may be expanded in future to allow page numbers and the like to be displayed.
Windows version
- Loading a policy file with the "Place document in frames" policy set will now toggle the Conversion type
- You no longer get prompted to "save policy" just because you pressed OK on one of the policy sheets. Now this only happens when something has been changed.
- The main menu now has a "check for updates" option. If you select this you'll be taken to the JafSoft website where you'll be told if any newer versions of the software have been released.
Documentation
- The list of bug fixes is removed from this document and is now to be found on-line at http://www.jafsoft.com/doco/asctohtm_bug_history.html
All versions
- Added support for HTML fragment files to $_$_INCLUDE other HTML fragment files. This allows common fragments to be shared.
- Fine-tuned the detection of whether or not a file has an in-situ contents list
- When Frames generation is selected the default "Split level" is set to 1 instead of 2. This means you'll get fewer files generated and - depending on the type of headings you have - no splitting may occur unless you manually increase the split level.
- The "LINKPOINT" pre-processor tag can now be used as a directive as well as an in-line tag. (see the Tag manual for details).
- Added a "Range" attribute to the "CONTENTS_LIST" tag. This allows mini-contents lists to be generated which contain only entries for a part of the document, rather than the whole document, e.g. for just a single chapter. This should help those who want to split large files into pages and to have a mini-contents list for each section.
- Improved handling of VT escape characters. These are either removed from the output or converted to "line" characters
- Added auto-detect of double spaced files (files where every second line is blank). This will set the "Input file is double spaced" policy whenever double-spaced text is detected (unless the policy has already been set).
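The double-spaced check described above could work roughly as follows. This is a minimal sketch in Python, taking "every second line is blank" literally; the function name is_double_spaced is hypothetical, and the program's own test is not documented here and is probably more tolerant.

```python
def is_double_spaced(lines):
    """Return True if every second line is blank and the others are not.

    Hypothetical helper: a strict reading of "every second line is blank".
    """
    # Ignore trailing blank lines so a final newline doesn't affect the result.
    while lines and not lines[-1].strip():
        lines = lines[:-1]
    if len(lines) < 3:
        return False  # too short to judge either way
    text_lines = lines[0::2]  # 1st, 3rd, 5th ... lines should hold text
    gap_lines = lines[1::2]   # 2nd, 4th, 6th ... lines should be blank
    return all(l.strip() for l in text_lines) and not any(l.strip() for l in gap_lines)
```

A caller would only apply the result when the "Input file is double spaced" policy has not already been set by the user.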
API version
- For those wishing to call AscToHTM programmatically, an API has been developed. This is sold under separate license. Contact sales@jafsoft.com if you're interested.
Linux version
- A Linux command line version will soon be available. Beta versions have been tested, and I hope to do a Linux command line release just after version 4 is released.
Windows version
- You can now choose from the main screen whether you want your HTML output as one or more HTML file(s), sent to the Windows Clipboard (see 3.4.5 and 4.1.1.4), or turned into a set of HTML frames (see Frames).
- Program now remembers positions of windows from one invocation to the next.
- The user interface is now available in Italian.
All versions
- Version 4 introduces frames support (see Frames). This introduces a large number of supporting policies :-
"Place document in frames"
"Output frame name"
"Header Frame depth"
"Footer Frame depth"
"Contents Frame width"
"Use main header in header frame"
"Use main footer in footer frame"
"Add contents frame if possible"
"Add Frame border"
"Open frame links in new window"
"New frame link window name"
"Add NOFRAMES links"
"NOFRAMES link URL"
"Number of levels in contents frame"
"Header frame background colour"
"Header frame text colour"
"Contents frame background colour"
"Contents frame text colour"
"Footer frame background colour"
"Footer frame text colour"
- Added HTML fragments feature, with "HTML fragments file" policy and DEFINE_HTML_FRAGMENT, RESET_HTML_FRAGMENT pre-processor commands. This allows you to define HTML fragments that can be used to replace the standard HTML generated by the program. This allows you to customize headers, footers, horizontal rules, contents lists, navigation bars and more.
- Added support for URL parsing, including :-
- new top level domains (.info, .biz etc) are supported
- the "snews://" secure news server protocol type is now supported
- URLs of the form http://username@domain_name/... are now supported
- Added "Check domain name syntax" policy
- Added "Create Telnet links" policy
- Added support for "obfuscated" URLs such as
http://3640005069/
http://7934972365/
http://0330.0366.0021.0315/
http://%6c%6f%63%6b%65%72%67%6e%6f%6d%65%2e%63%6f%6d/
Although the display text is left unchanged, the hyperlink will point to a non-obfuscated URL (either the domain name, or an IP address). This is because obfuscated URLs such as these are often used by spammers, and the author has no intention of allowing his software to aid spammers in their goals.
If someone cares to give me a valid reason for using such URLs I may reconsider this behaviour.
- Added support for embedded headings with the "Expect embedded headings" policy (see 5.4.4). These are "headings" that are embedded as the first sentence in a paragraph.
- Added support for headings that start with particular words or phrases via the "Heading key phrases" policy (see 5.4.5).
- New /COMMA (see 4.2.2.1) and /TABBED (see 4.2.2.12) command line qualifiers that allow comma delimited and tab delimited files to be converted into tables.
- Added "Check indentation for consistency" policy to allow checking of headings to be relaxed (e.g. when they're centred on the page).
- Added "Look for diagrams" policy
- Added "Input file contains PCL codes" policy
- Added "Input file contains Japanese characters" support.
- Added "Preserve new paragraph offset" policy
- Added "Omit <HEAD> and <BODY> from output" policy
- Added "Document Base URL" policy
- Added "Comment generation code" policy
- Added "Number of words to include in filename" policy to allow filenames to be generated from the first few words of the title when splitting documents with underlined or capitalised headings at each heading.
- Added "Lines to ignore at end of file" and "Lines to ignore at start of file" policies to allow lines at the start and end of the source file to be discarded. This can be useful if your source text is coming from a third party source that adds extra, unwanted, lines.
- Added "Suppress all colour markup" policy
- Added "Suppress all font markup" policy
- Added "Mirror margins" policy (RTF only)
- Added "First line indentation (in blocks)" policy
- Added "Column boundaries have zero width" policy
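The obfuscated URL forms listed above (a single decimal number, dotted octal, and percent-encoded host names) can be decoded along the following lines. This is a sketch only: the function name deobfuscate_host is hypothetical, and the wrap modulo 2^32 for oversized numbers is an assumption about how such addresses were interpreted, not a description of the program's own code.

```python
import re
from urllib.parse import unquote

def deobfuscate_host(host):
    """Return a readable host for an obfuscated one (illustrative sketch)."""
    host = unquote(host)  # undo %xx escapes, e.g. "%63%6f%6d" -> "com"
    if re.fullmatch(r"[0-9]+", host):
        # One big decimal number: assume it wraps to a 32-bit IPv4 address.
        value = int(host) % 2**32
        return ".".join(str((value >> shift) & 0xFF) for shift in (24, 16, 8, 0))
    parts = host.split(".")
    if len(parts) == 4 and all(re.fullmatch(r"0[0-7]+", p) for p in parts):
        # Dotted quad where every component is written in octal.
        return ".".join(str(int(p, 8)) for p in parts)
    return host  # anything else is left as it is
```

Under these assumptions the three numeric examples above all decode to the same address, 216.246.17.205, and the percent-encoded example decodes to lockergnome.com.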
Windows version
- On some systems DDE doesn't always work properly. This would cause the program to hang when it attempted to display results. In such cases you would need to stop the program from the task manager. In version 4 the program will now detect when this has happened and disable use of DDE next time it runs.
NOTE: DDE won't work with Netscape 6.0 (it doesn't support it)
- Added the policy "Suppress URL messages" to the Settings | Diagnostics menu option. When disabled all URLs, email addresses etc will be listed in the log file. Since this file can be saved to disk, this is one way of identifying all the candidate hyperlinks from your text file.
All versions
- Improved analysis for tables using bar ('|') column separators
- Improved detection of ASCII art diagrams.
- Improved handling of heavily indented blocks of text. Previously these were (poorly) rendered as tables. Now the tables more accurately preserve the large indentation (see 5.5.3.4).
- The first three words of an underlined heading are now used to generate the filename. Previously only the first word was used, leading to less meaningful names, with more chances of duplication.
- VMS command line now allows multiple filespecs, separated by spaces. Policy file must now be a .pol file, rather than the second argument.
- Anchor names from filename are now lower case (to reduce possible mismatches)
- Shareware version now expires after 30 days + 5 uses. This will allow people to use the software on 5 different days after the first 30 days, giving people more time to evaluate the software at their leisure.
- Now strip out leading and trailing "---" from heading text to make them more presentable in HTML or RTF
- Added support for headings that span up to 3 lines; previously this was only 2.
- Changed heading to allow <H4> markup to be used. Previously "level 4" headings would get <H3> markup since anything smaller would end up smaller than the main text. With the advent of CSS style sheets this should be less of a problem.
- Changed emphasis handling to allow hyphenated parts to be emphasised independently, e.g. *pre*-formatted or pre-*formatted*.
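To illustrate the heading-derived filename changes above, here is a rough Python sketch that strips leading and trailing "---" from a heading and builds a lower-case name from its first three words. The function name, the underscore separator and the character filtering are assumptions made for the example; only the "first three words", "lower case" and "strip ---" rules come from the notes above.

```python
import re

def name_from_heading(heading, max_words=3):
    """Build a name from the first few words of a heading (illustrative only)."""
    # Drop leading/trailing runs of hyphens such as "--- Title ---".
    text = heading.strip().strip("-").strip()
    # Keep runs of letters and digits; everything else separates words.
    words = re.findall(r"[A-Za-z0-9]+", text)
    return "_".join(w.lower() for w in words[:max_words])
```

With max_words=1 this reproduces the older "first word only" behaviour, which shows why duplicates were more likely before the change.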
The AscToHTM 3.3 release follows 6 "micro-releases" announced via the updates page on the Web. As such it will appear as a small step forward over 3.2.06, but in fact it offers a fair amount of new functionality over version 3.2
Major changes in version 3.3 include :-
- Support for fonts. You can now choose a font for the whole document. By default this is implemented using CSS, but you can elect to use <FONT> tags should you prefer.
- Enhanced language support. Portuguese has been added to the Spanish and German interfaces introduced in the last version. Also a new feature allows you to save the interface to a "language skin" text file which may be edited and then reloaded. Using this feature we can now offer
- American English (simply a spell-checked UK English file)
- "Babelfish" French. A French translation from http://babelfish.altavista.com/
- "Babelfish" Italian. An Italian translation from Babelfish.
If anyone wants to correct these files and send them back to me, feel free.
- More table generation controls. Several new controls have been added to give you more control over the detection, analysis and generation of tables in the text.
- Support for comma and tab delimited tables. Pre-processor commands have been added to allow you to mark up a section of comma-delimited or tab-delimited data you want turning into a table.
- Support for preserving file/line structures. You can now elect to preserve the original line structure of a file, or to place the whole file in <PRE> markup (which is a little defeatist, but has its uses)
- Support for non-standard characters. The program can now recognise, to a limited extent, DOS line-drawing characters, MIME-encoded text and text documents with "change bars" in them.
- New "Tag manual". The "Using the preprocessor" and "In-line tags" sections of this document have now been re-merged and their contents largely moved to a new document called the Tag manual.
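As an illustration of what the comma and tab delimited table support listed above does, the sketch below turns a block of delimited lines into HTML table markup. It is written in Python for brevity and is not the program's own code; the function name and the exact markup emitted are assumptions.

```python
import csv
from html import escape

def delimited_to_html(lines, delimiter=","):
    """Convert delimited text lines into a simple HTML table (a sketch)."""
    rows = list(csv.reader(lines, delimiter=delimiter))
    out = ["<TABLE>"]
    for row in rows:
        # Escape each cell so characters like "<" survive as text.
        cells = "".join("<TD>{}</TD>".format(escape(cell.strip())) for cell in row)
        out.append("<TR>{}</TR>".format(cells))
    out.append("</TABLE>")
    return "\n".join(out)
```

Passing delimiter="\t" handles the tab-delimited case in the same way.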
Fonts
- The default font for the whole document can now be set via the "Default font" policy. Headings will also adopt the selected font, and will scale with the selected font size, although the <H1> headers are slightly smaller than the default.
You can choose to have the fonts implemented using <FONT> tags or CSS (e.g. according to your target audience) using the "Use CSS to implement fonts" policy.
Definition Blocks
- Definition blocks allow you to define blocks of text that you may then insert at any point in the text (e.g. to give an "end of page" effect). You can also "define variables" whose value is then inserted wherever a VARIABLE tag is used.
The pre-processor commands involved are:
"DEFINE/END_BLOCK and RESET_BLOCK"
"DEFINE_VARIABLE"
"EMBED_BLOCK"
"INSERT_BLOCK"
"VARIABLE"
Tables
- Added several new policies and tags to help with table analysis. Policies added include
"Default TABLE layout"
(also pre-processor tag "TABLE_LAYOUT") This allows you to specify the number of columns in each table, and the attributes of each column, specifically the character position that marks the end of each column. Rather than use this policy, it is probably better to use the related directive $_$_TABLE_LAYOUT in the source text on a per-table basis.
"Default TABLE alignment"
(also pre-processor tag "TABLE_ALIGN") Allows the alignment of the table to be specified (left, right, center)
"Ignore table header during analysis"
(also pre-processor tag "TABLE_IGNORE_HEADER") Specifies that table headers should be ignored when columns are being auto-detected. Some tables have complex headers that confuse the analysis. This policy can be used to help them be ignored.
Controls the degree to which pre-formatted lines should be expanded into adjacent text.
Controls the degree to which columns which don't appear to be very clear should be "merged" together
"Tables could be blank line separated"
Indicates that tables could be using blank lines to separate rows of data. This affects the analysis and detection of the table's extent.
- Added support for embedding comma-delimited and tab-delimited table data in your source file (e.g. data exported from Excel and the like).
The new pre-processor directives :-
"BEGIN/END_COMMA_DELIMITED_TABLE"
"BEGIN/END_DELIMITED_TABLE"
Other
- Added options to allow more control over how the original document's file structure should be preserved
"Treat each line as a paragraph"
If this option is selected, every line in the source file is treated as a paragraph. This may be suitable if the file has been authored using an editor that wraps the lines (i.e. doesn't put in hard breaks) and which doesn't add blank lines between paragraphs.
If this option is selected a <BR> is added to every line, thereby preserving the line structure of the original and giving the resulting HTML file an "A4 look" that hugs the left margin regardless of how wide the window is made.
"Preserve file structure using <PRE>"
If this option is selected the whole document is placed in <PRE> markup, and very few conversions are attempted. This is really a "last resort" option that you may want to use if the file has complex structures which the program is failing to understand. This option was added for a customer who wanted to convert all 2800 RFCs without having to manually correct each one.
- Added support for parsing files with some Mime-encoded quotable strings in them. The new policy "Input file contains mime encoding" can be found under Analysis->File structure. At present there is some (very limited) auto-detect for this feature.
- Added support for documents with change bars. By default change bars are stripped out, and the changed text coloured red; this behaviour may be changed in later versions. Added the new policy "Input file has change bars" which can be found under Analysis->File Structure.
- Added support for converting DOS characters. The new policy "Input file contains DOS characters" can be found under Analysis->File Structure.
There is a limited auto-detect of DOS characters when diagrams are present.
- Changed hyperlink detection to only allow explicit FTP URLs and email addresses that don't start with numbers. These behaviours can be reversed using the new policies "Only allow explicit FTP links" and "Allow email beginning with numbers", both of which are on the Output->Hyperlinks tab.
- Added the policy "Create gopher links" to toggle the conversion of gopher links into hyperlinks.
- Added the policy "Check indentation for consistency" so that it could be disabled in documents where headings were centred (and thus all at different indentations)
- Added several new pre-processor in-line tags :-
"FILENAME" - output name of converted file
"FRACTION" - output a fraction
"VERSION" - output program version number
"IGNORE_THIS" - for comments in the source code
- Added policy to allow selection of which version of HTML should be generated. Policy is "HTML version to be targeted".
Only "HTML 3.2" and "HTML 4.0 Transitional" are currently supported.
Windows
- The main screen now allows access to Policy file selection. Previously this was only available on the menu structure. The Menu structure has been left unchanged, meaning you now have two ways of choosing your policy files.
All
- The contents list styling has been changed slightly. For example only the major section headings are now shown in bold. People were complaining :-)
- Now add BORDER=0 attribute to tables with no border, rather than just omitting the attribute. This is a workaround for a bug in Netscape where a gap appears where a border would be when coloured rows are selected.
- Support for IE 3.0 as the browser of choice is added, by allowing the filename rather than file URL to be passed to the browser. To do this disable the "file://localhost/" option on the Settings->Viewers dialog screen.
- More changes on bullet characters, in particular to disallow 'O' (upper case) from becoming a bullet character through analysis. This really doesn't work in Portuguese documents :-) 'o' (lower case) may still be detected. If upper case 'O' is wanted this can still be manually switched on.
- Increased maximum width allowed in tables to 200 (after encountering a sample at 165). Lines longer than this are disregarded as candidate table lines.
- Introduction of German and Portuguese user interface, with extension of the Spanish user interface.
- Horizontal lines are now implemented as <HR> tags whose length attempts to approximate the original (e.g. 50% or whatever). Previously lines would become full width.
- Chapters 7 and 8 of this document were merged into a single chapter 7 (about the pre-processor). Most of that material has now been moved to the new Tag manual. Subsequent chapters have thus been renumbered which may lead to invalid references to chapter 11... especially if you keep old versions of the doco lying around.
Also reversed the order of sections in this "Change History" section
(Version 3.1 was never released, but a release of AscToTab occurred sometime after version 3.0, and so in keeping with the policy of synchronizing version numbers that was labelled version 3.1)
Over a year after the last release, version 3.2 is a major upgrade, but is only given a minor version number change because the remainder of the functionality produced in that time will be revealed in version 4.0.
Version 3.2 starts to prepare the groundwork for Cascading Style Sheet (CSS) and general font support that will be introduced in version 4.0. This has required a fairly radical change to the type of HTML code generated and how this is put together.
For example the HTML is now more standards compliant (this is now a stated goal of the software, although I can't always promise full compliance; see 1.1.4), and as an aid towards CSS support "optional" end tags such as </P> are now being placed in the generated HTML.
Note that the use of the <FONT> tag is deprecated in HTML 4.0, and if you choose to add FONT markup to your pages they'll become much bigger, especially if they contain tables. This is because the HTML standard requires the FONT tag to continually be re-expressed to achieve the right appearance in all browsers (believe me, I only accepted this through bitter experience and grudgingly).
Major changes in version 3.2 include :-
- The program now always makes three passes through the document - previously it only did this if a contents list was requested (see 3.3). This may make the conversion a little slower. The middle pass calculates how the file will be split into sections, where all the hyperlinks should point to and what the contents list should be. This approach should be less error prone than previously.
- New "overview" options (see 6.2.1). These allow you to easily enable and disable the program's search for certain features.
- Introduction of in-line tagging (see In-line tags). These allow you to get more out of your conversion by inserting commands into your source text.
- Addition of DDE support (in Windows) (see 4.1.3.4)
- New and improved command line options, and full command line support built into the Windows version (see 4.2.2)
- Improved message filtering. Each message is now labelled according to its type (information, warning etc), and may be optionally suppressed or filtered by severity. A new /SILENT command qualifier (see 4.2.2.10) allows complete suppression of messages.
- Improved log file capability (see 4.3.4)
- Added support for mail and USENET headers (see 5.4.7)
- (Limited) support added for stripping out page markers, converting "double spaced" files, and converting .prn and VT escape sequences. This functionality may be improved in later versions.
- New options to colour the odd and even rows of tables differently (see Table generation policies and 7.1.4)
Windows Version
- Added "Save" option to status dialog, so that the messages can be saved into a .log file
- Added DDE support to display results in existing browser window
- Full drag and drop support added. You can now drag files onto the program when it is visible.
- New "browse for directory" buttons added.
- More menu options added to make finding policies easier.
All versions
- Now support tab-delimited tables (mainly for AscToTab) (see 7.1.1)
- Support for stripping out mail and USENET headers (see 5.4.7)
- New pre-processor directives :-
- "BEGIN/END_DELIMITED_TABLE" section delimiters
- "BEGIN/END_IGNORE" command
- "CONTENTS_LIST" command
- "NAVIGATION_BAR" command
- "LINERULE" command
- "TOC" command
- New and improved command line qualifiers
- New overview "look for" analysis policies :-
- "Look for indentation"
- "Look for paragraphs"
- "Look for short lines"
- "Look for quoted text"
- "Look for preformatted text"
- "Look for mail headers"
- "Look for horizontal rules" and "Minimum ruler length"
- "Look for MAIL and USENET headers"
- "Look for bullets"
- "Look for hanging paragraphs"
- "Look for white space"
- Other new analysis policies :-
- New diagnostic policies :-
- "Monitor tag generation"
- "Display messages" policy and /SILENT qualifier (see 4.2.2.10)
- "Suppress INFO messages",
- "Suppress TAG ERROR messages"
- "Suppress URL messages"
- "Suppress WARNING messages"
- "Suppress program ERROR messages"
- Other new output policies :-
- "Create a log file" and "Output log filename"
- "Maximum level to show in contents"
- "Preserve underlining of headings"
- "Use <EM> and <STRONG> markup"
- "Colour data rows" and related policies (see TABLE generation policies).
- "Default TABLE cell alignment" and "TABLE_CELL_ALIGN" directive
- "Suppress all colour markup"
- "Open link in new browser window" and "new browser window name"
- "Break up long HTML lines"
On the web site, and documentation
- A dedicated site www.jafsoft.com now deals with AscToHTM and related products.
- An updates page has been added to the Web site. This will list all the updates available for AscToHTM, although in most cases you'll need to be a registered user to receive details for you to obtain the update.
- An AscToHTM FAQ has been added to the web site. It's not finished yet (what part of the web is?), but it may help answer some of your questions.
- Created a new document called "The Policy manual". This replaces what was becoming the largest section of this document.
Windows version
- The Windows help file now has a better Index. It also has a full contents list as a topic, showing you the structure of the RTF file used to generate the Help file. Unfortunately I've been unable to hyperlink this topic.
- The Windows version now "remembers" which options page you were on so that each time you go back there the same sheet is shown.
- The Windows version is now "statically linked" against the necessary .DLLs. This makes the program slightly larger, but makes the download smaller as it is no longer necessary to ship .DLLs with the program. This makes overall version management simpler.
VMS version
- The VMS version now converts all filenames to lower case internally. This is so that all hyperlinks and references to the file are in lower case, making them more Internet-friendly and portable to other systems.
All versions
- Changes to the tagging to aid standards compliance and CSS support. This includes the addition of the </P> tag which was previously omitted. These changes have introduced slight differences in the amount of vertical white spacing produced in places.
- Improvements have been made to the file splitting algorithms. In particular
- The program will no longer generate two output pages with the same name. Where duplicate names are detected, the second file is given a generated name, usually by appending "_n" (n=1,2,3...) to the filename. All hyperlinks pointing to sections in the duplicate file will be adjusted accordingly.
- A file with underlined headings can now be split into pages at the heading boundaries. The subsequent pages have _U1, _U2... appended to the name of the first page.
- Local links (i.e. to anchors in the same file) are now recognised as such, and the filename is omitted.
This should make it easier to rename files after production without breaking local hyperlinks. Links to/from other files would still stop working though.
- Link names for underlined or capitalised headings that are more than 60 characters long are now truncated. They are given a link name derived from the first 30 characters of the section name with a unique identifier tagged on the end. This avoids long link names being split over two or more lines and becoming unusable.
- Allow relative links to subtract out filename (e.g. in contents list) when target is in same file
- Can now recognise URLs with commas in them, such as http://cgi.pathfinder.com/netly/opinion/0,1042,1692,00.html, in addition to comma separated lists of URLs.
- The KEYWORDS, DESCRIPTION and TITLE pre-processor commands can now be multi-line. This allows long lists of keywords to be placed over several lines (each beginning with the command), making them easier to manage.
- The default name for the directory index file is now "dirindex.html" rather than "index.html" to prevent overwriting of any existing index file.
- Program now always does a "contents pass". Benefits of this are
- can now generate in situ contents lists /contents bars
- can now generate navigation bars wherever wanted
- can now eliminate duplicate filename generation
- can check hyperlink cross references are correct
- Improved table/diagram recognition
- Now support conversion of tab-delimited data into tables, provided it's placed inside "BEGIN/END_DELIMITED_TABLE" directives
- Relaxed indentation test on "n.n" headings. Heading can now be 2 characters to the left, or 1 character to the right of the expected position
- Now recognise use of asterisk and underscore combined to produce bold-italic emphasis. Previously only asterisk (bold) and underscore (italic) by themselves were recognised.
- Now recognise "]" as a possible "quoting" character.
- Now recognise '+' as an underlining character
- Improved error reporting when file errors occur. The program will now abort the conversion on error, instead of continuing and reporting errors for each line.
- Now detect read-only output directories and abort conversion. This would occur if you tried to convert a file on CD.
- Definitions now use <DL compact> offering a more-faithful rendition of the original text
- Underlined heading and text will now be rendered as underlined by default. Previously this either promoted the previous line to be a heading, or was drawn as a line.
- Improved handling of first line indents on paragraphs. Now these are preserved in the output by the inclusion of &nbsp; characters, and the error whereby the following line was deemed to be a different indentation (and thus acquire a <BLOCKQUOTE>) has been largely solved.
- Introduction of the TEXT in-line tag (see 7.2.2) now allows numbers like Windows 3.1 to be protected from conversion into a hyperlink to section 3.1.
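The duplicate-name rule described in the file-splitting notes above can be sketched as follows. The helper name unique_name and the placement of the suffix before the file extension are assumptions; the notes only say that "_n" (n=1,2,3...) is usually appended to the filename.

```python
import os

def unique_name(filename, used):
    """Return filename, or a "_n" variant if it is already in `used`.

    `used` is a set of names generated so far; it is updated in place.
    """
    candidate = filename
    stem, ext = os.path.splitext(filename)
    n = 0
    while candidate in used:
        n += 1
        candidate = "{}_{}{}".format(stem, n, ext)
    used.add(candidate)
    return candidate
```

Any hyperlinks that pointed at the duplicate would then need to be rewritten to the returned name, which is the adjustment the notes above describe.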
There are a fair number of small changes in functionality over V2.3, together with a fair number of bugfixes and refined algorithms. A lot of development during this time was directed towards the production of a text-to-RTF converter (AscToRTF) using the same analysis engine. Consequently there are a lot of changes "under the bonnet".
The main functional change has been the revamp of the Windows User Interface. A new section (4.1.2) has been added to this document describing the Windows interface in some detail. The changes include :-
- the button bar is replaced by a proper Windows menu, allowing easier access to the program's functions.
- under the Help menu a link to the HTML documentation shipped with the software is now provided.
- the policy sheets are now "non-modal". This means you no longer have to dismiss them in order to do a conversion, you can leave them up whilst the conversion is going on, making it easier to go through the convert-change policy-convert cycle.
Windows Version
- Major re-structuring of the user interface (see 4.1.2)
- Program's Help options now provide access to the online and offline versions of the HTML doco. A lot of people were downloading the software and then picking up a version of the doco, unaware that they already had it. Don't you people read README.TXT files or what?
All Versions
- New "Search for Definitions" policy
- New "TAB size" policy
- New "Expect sparse tables" policy and "TABLE_MAY_BE_SPARSE" pre-processor command
- New "Add <BR> to lines with URLs" policy
- New "Output file extension" policy
- New "Minimise HTML file size" policy
- New "Headings colour" policy. Eventually I hope to add a whole suite of heading styling options, as these have been requested by a number of people.
- New "Convert TABLE X-refs to links" policy and "TABLE_CONVERT_XREFS" pre-processor command
- New "CHANGE_POLICY" pre-processor command
- New "Error reporting level" policy
- Improved Windows interface
- Empty lines in a table cell now get an extra &nbsp; added, in addition to the <BR>. This is to compensate for a bug in Internet Explorer 3 which would ignore the <BR> otherwise, leading to alignment errors.
- Now treat phrases with all the words connected by underscores, and with underscores at both ends as well, as underlined, e.g. _this_type_of_thing_
- Improved handling of tables with long urls in them. Previously these would not be recognised as part of a table. Increased "long line" limit inside tables to 110 characters
- Improved error reporting/handling
- Report unrecognised pre-processor lines
- Report results of table analysis (e.g. if diagrams are detected)
- Report failure to find requested files
- Abort conversion if can't find requested policy file
- Improved detection of "mal-formed" tables. Previously this was over-cautious, especially on short tables.
- Now add a trailing "/" to www etc URLs if none present (e.g. www.jafsoft.com). This is a more correct URL, which should be accessed slightly more efficiently.
- Now recognise "....." underlining, although why people do this is beyond me :)
- Improved contents list detection in short documents with only level one headings, and documents with a chapter "0".
- Improved headings detection in small files. Made this less trigger happy.
- Improved code detection, and now add bold emphasis of C++ like comments inside a code section
- No longer allow "{" and "}" to be detected as probable bullet characters when code is expected
- I've produced (with help from antipodean friends) an icon for files converted by AscToHTM. It's called a2hlogo.gif. Feel free to use it should you wish on any pages created with AscToHTM.
An example piece of HTML code would be
<A HREF="http://www.jafsoft.com/asctohtm/?from=doco"> <IMG SRC="a2hlogo.jpg" WIDTH=100 HEIGHT=36 BORDER=0 ALT="Converted by AscToHTM"></A>
- With the introduction of the "Add <BR> to lines with URLs" policy this behaviour is no longer default. That is, if you do want <BR> added at the end of all lines containing URLs you will need to switch this behaviour on using the new policy.
- With the introduction of the "Convert TABLE X-refs to links" policy this behaviour is no longer default. That is, if you do want section links inside your tables, you will need to switch this behaviour on using the new policy.
- ".htm" files are now created with a lowercase extension, unless the "Use DOS filenames" policy is selected
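As an illustration of the trailing "/" change listed above (e.g. www.jafsoft.com), the rule can be sketched in Python as follows. This is a minimal sketch and not AscToHTM's own code: the function name add_trailing_slash and the pattern used to spot a bare host name are assumptions made for the example.

```python
import re

# Hypothetical pattern: an optional protocol followed by a host name only,
# i.e. no "/" and no path after the host.
BARE_HOST = re.compile(r"^(?:[a-z]+://)?[A-Za-z0-9.-]+$")


def add_trailing_slash(url: str) -> str:
    """Return url with "/" appended when it names only a host."""
    if BARE_HOST.match(url):
        return url + "/"
    # Anything that already has a "/" after the host is left alone.
    return url
```

Addresses that already end in "/" or that carry a path are returned unchanged, so the function can be applied to every detected URL without special-casing.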
Minor bugfixes and upgraded functionality over V2.2. The main functional changes have been
- The introduction of wildcard support to allow conversion of multiple files at once.
- (related to the above) the introduction of the Directory Page feature that allows the generation of a hyperlinked document spanning all the files in a directory.
- Major re-write of the contents-list generating routines. The program now makes a third, intermediate, pass through the document to analyse the contents structure. This means that contents lists are now placed at the top of the HTML file by default, rather than in a separate file as previously, though that behaviour is still supported if wanted. This approach is expected to pay further dividends in later releases.
Windows Version
- Added a "Perform simple conversion" tick box on the front panel. This does exactly the same as the "Keep it simple" policy.
- Improved the Headings dialog to allow headings policies to be more easily edited now.
- Pre-processor document sections now working.
All versions
- Wildcard support has been added (see 4.3.3.1).
- Major re-writing of contents list generation has occurred (see 3.4.2). Includes new "Use any existing contents list" and "Generate external contents file" policies. More changes are expected here in later versions.
- New Directory Page feature. Supporting policies include:-
"Make Directory"
"Directory filename"
"Show file titles in Directory"
"Indent headings in Directory"
"Directory title"
"Directory keywords"
"Directory description"
"Directory return hyperlink text"
"Directory Script file"
"Directory header file"
"Directory footer file"
- New "Minimum TABLE column separation" policy and "TABLE_MIN_COLUMN_SEPARATION" pre-processor command to allow some tuning of table analysis.
- New "Use first heading as title" policy
- New "Use first line as title" policy
- New "Recognised USENET groups" policy
- New "Automatic centring tolerance" policy
- New "Use <P> markup for paragraphs" policy to allow choice of either <P> or <BR> markup to be used for paragraphs.
- New "Default table width" policy and "TABLE_WIDTH" pre-processor command to allow table widths to be specified as percentages
- New pre-processor command "HTML_LINE"
- Reinstated some of the "error" messages removed in the last version, to do with section numbering. This should make it more visible when the section heading analysis goes wrong.
- Added error reporting to file open. You should now get an error message if the program fails to find/open a file somewhere.
- Now support headings down to 5 levels (previously this was 4). Note, if you only have a couple at this level, the program may still ignore them as statistically insignificant.
- Removed certain policies (such as "generate policy file") from the output when generating a full policy file. This is because, when they were read back in, they could cause problems.
- The "Include document section" policy is now renamed to "Include document section(s)" reflecting the fact that you can now enter multiple values on one line, rather than requiring multiple lines with one value each as previously.
- Major re-structuring and additions to HTML markup produced to make the section more coherent and up to date. Some of the sections marked as new in this version are simply the documentation catching up on the features added in earlier releases.
Sometimes I just work too hard :^)
First major release after V2.0 (when AscToHTM first went fully-Windowed). Major change this time has been the introduction of TABLE generating algorithms. These were first made available as a separate freeware utility AscToTab.
This version was reviewed by ZDNet and awarded 5 stars, their highest award.
Table generation
This is the biggest change in this version. AscToHTM now incorporates the technology first introduced in AscToTab. To support this the detection of pre-formatted text has been improved, new policies added, and new preprocessor commands added.
New policies include :-
"Attempt TABLE generation"
"Default TABLE border size"
"Default TABLE header rows"
"Default TABLE header cols"
"Default TABLE cell spacing"
"Default TABLE cell padding"
"Default TABLE colour"
"Default TABLE border colour"
"Default TABLE caption"
New Pre-processor commands include :-
"BEGIN/END_CODE"
"BEGIN/END_DIAGRAM"
"BEGIN/END_TABLE"
"TABLE_BORDER"
"TABLE_BORDERCOLOR"
"TABLE_BGCOLOR"
"TABLE_CAPTION"
"TABLE_CELLSPACING"
"TABLE_CELLPADDING"
"TABLE_HEADER_ROWS"
"TABLE_HEADER_COLS"
Other changes
- Added a policy to allow <CODE> markup to be used for code fragments in the document (see HTML Styling Policies)
- Added pre-processor "BEGIN/END_CODE" commands to allow sections of code samples to be identified and distinguished from tables
- Added pre-processor "BEGIN/END_DIAGRAM" commands to allow diagrams and sections of ASCII art to be identified and distinguished from tables
Documentation
- Added the "Policy Dictionary" (since superseded by the Policy manual), and renumbered the document accordingly.
All versions
- "tables/pre-formatted text"
- Various improvements to detecting the start and end of pre-formatted regions of text.
- Shareware now expires after 30 days, rather than after a fixed date.
- Headings policies have been revised. Still more work to be done in this area.
- Slight improvement in detection of centred text. Still not good enough to offer as a default though (too prone to errors).
- Added section on saving/using policy files (see 6.5)
- Shareware version now adds nag lines at top and bottom of the page, instead of just the top.
- A number of improvements in code sample detection
- Reduced number of "error" messages reported. These may be made optional in a later version, and are still placed in the diagnostic files if these are created.
Windows version
- Now added a "Settings" dialog to allow you to configure various aspects of how the program runs such as what browser to view files with, what policy file to use as default etc, etc.
- New /COMMA (see 4.2.2.1) and /TABBED (see 4.2.2.12) command line qualifiers that allow comma delimited and tab delimited files to be converted into tables.
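To illustrate the kind of conversion the /COMMA and /TABBED qualifiers are described as performing, here is a minimal Python sketch that turns delimited text into an HTML table. It is not AscToHTM's implementation: the function name delimited_to_table and the exact markup emitted are assumptions made for the example.

```python
import csv
import html
import io


def delimited_to_table(text: str, delimiter: str = ",") -> str:
    """Render comma or tab delimited text as a bare HTML table (sketch)."""
    out = ["<TABLE>"]
    for row in csv.reader(io.StringIO(text), delimiter=delimiter):
        if not row:
            continue  # skip blank lines rather than emit empty rows
        # Escape each cell so characters such as "<" survive as text.
        cells = "".join(f"<TD>{html.escape(cell.strip())}</TD>" for cell in row)
        out.append(f"<TR>{cells}</TR>")
    out.append("</TABLE>")
    return "\n".join(out)
```

Calling delimited_to_table(text) handles comma delimited input, and delimited_to_table(text, "\t") handles tab delimited input.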
V2.1 was never officially released, but much of this functionality "crept out" as the shareware version was updated. Some of these versions were shown as V2.01 instead of V2.1. There's nothing like a bit of consistency (and yeah, this was nothing like a bit of consistency).
- New "Document keywords" policy and pre-processor "KEYWORDS" command.
- New "Document description" policy and pre-processor "DESCRIPTION" command.
- New "Hyperlinks on Numbers" contents policy
- New "Document style sheet" policy and pre-processor "STYLE_SHEET" command.
All versions
- Now recognise domain names without a protocol specified (such as http:// or ftp:// etc.) that end in standard domains (e.g. .edu, .net, .org etc) as probable FTP sites. This allows references to sites like rtfm.mit.edu to be correctly turned into hyperlinks.
- Some renumbering of this document has occurred
- Quoted text is now marked up using <em>..</em> markup
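The recognition of protocol-less domain names described above (e.g. rtfm.mit.edu) could be approximated along the following lines. This is a rough Python sketch under stated assumptions, not the program's actual logic: the list of "standard" endings contains only the three named in the text, and the names HOST and link_bare_hosts are hypothetical.

```python
import re

# Assumed list of endings; the text above names .edu, .net and .org.
STANDARD_DOMAINS = ("edu", "net", "org")

# A dotted host name ending in one of the standard domains that is not part
# of an email address, an existing URL, or a longer dotted name.
HOST = re.compile(
    r"(?<![\w@/.-])((?:[A-Za-z0-9-]+\.)+(?:" + "|".join(STANDARD_DOMAINS) + r"))"
    r"(?![\w-]|\.\w)"
)


def link_bare_hosts(text: str) -> str:
    """Wrap bare host names in FTP hyperlinks, as probable FTP sites."""
    return HOST.sub(r'<A HREF="ftp://\1/">\1</A>', text)
```

The lookbehind keeps email addresses and addresses that already carry a protocol from being linked a second time.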
Windows version
- Now stores data in the Registry under the HKEY_CURRENT_USER root with a "\Software\JafSoft\AscToHTM\..." key
- Now supports "most recently used" lists for both policy files and files to be converted. These are accessed via a drop-down Combo box.
- Now remembers last source directory each time the program is run. This is used as the initial directory next time the Browse button is pressed.
- The filenames now include the path. This is to allow the most recently used (MRU) file drop-down list to function correctly.
Version 2.0 marks the production of the first fully-windowed version for Windows 95/NT. This took a few months to be produced, so a fair number of other features have been added over this time.
- New "Output policy filename" policy
- New "Use .HTM extension" policy
- New "Generate diagnostics files" policy
- New "External contents list filename" policy
- New "Use <DL> markup for defn. paras" policy
- New "Ignore multiple blank lines" policy
- New "Search for emphasis" policy
- New "Allow definitions inside PRE" policy
- New Pre-processor "BEGIN/END_CONTENTS" command
- New Pre-processor "BEGIN/END_HTML" command
- New Pre-processor "TITLE" command
- New Pre-processor "INCLUDE" command
- White space immediately adjacent to PRE sections now ignored.
- Changed anchor names to contain no spaces (makes URLs easier to quote)
- Title defaults to "Converted from filename" instead of "No title" (see also 7.1.2)
- Introduced some support for use of ctrl-H (backspace) in Unix documents to underline and highlight words
- Automated "simple" file detection now attempted
- Automated "code samples" detection now attempted
- Some policies have been renamed as follows :-
Was                          Now
Expect Numbered sections     Expect Numbered Headings
HTML header                  HTML header file
HTML footer                  HTML footer file
- The policy section headings have been renamed as well. This may cause "ignored policy line" messages when old policy files are used.
- Added a "Only use known groups" policy to improve accuracy of newsgroup hyperlink detection.
- Added more document colour policies
- Added a /POLICY and "Output Policy file" option (see 4.2.2.9) to make the generation of an output policy file optional
- Added preprocessor support for user-formatted sections (see 7.1.1)
- Indentation is now done using <BLOCKQUOTE> markup.
- Changed default background colour to white.
- Generation of a .pol file is no longer default (see 4.2.2.9)
- The use of <PRE> ... </PRE> to mark up user-formatted text is replaced by the new preprocessor commands "BEGIN/END_PRE"
- Re-write of section 4.1
- Improved error reporting. The .LIS file created if the /DEBUG qualifier is used (see 4.2.2.4) now has error and information messages included in it.
- Added an "Output directory" policy. This allows redirection of output to a directory different from that containing the source files.
- Note:
- This functionality may not be available in the shareware version of the software.
- Added an "Output policy" policy. This allows the suppression of output policy files where not wanted.
- Added an "Expect code samples" policy. This helps in technical documents that include samples of C code.
- Added preprocessor support to allow variant documents to be produced (see Preprocessor policies and Using the preprocessor)
- Policies now accept "Yes/No" as well as "True/False". "Yes/No" is now the default when outputting policies.
- Shareware version is now limited to processing the first 500 lines only.
- Lines with email addresses no longer have <BR>'s forced on the end. Lines with http, ftp and news links still do. This will become fully configurable in later versions.
- Added policy "Minimum automatic <PRE> size". This replaces the policy "Allow automatic 1-line <PRE>"
- Added policies "Largest allowed <Hn> tag" and "Smallest allowed <Hn> tag" to allow control over generated heading sizes.
- Added policy "Short line length"
- Added Batch processing to allow multiple files to be converted at the same time. (see 4.3.3.2)
- Created a 16-bit DOS version
- VMS version now available as freeware.
- Added "SendTo" tips for Windows 95/NT users section to the documentation (see 4.4.4)
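The ctrl-H (backspace) support mentioned in the list above relates to the overstrike convention used by Unix formatters, where "_", backspace, character marks an underlined character and a character struck over itself marks bold. A minimal Python sketch of decoding that convention follows. It is an illustration only: the function name convert_overstrikes and the choice of <U> and <B> as output markup are assumptions, not a statement of what AscToHTM emits.

```python
import re

BS = "\x08"  # ctrl-H, the backspace character

# "_", backspace, char repeated: a run of underlined characters.
UNDERLINE = re.compile(r"(?:_\x08[^\x08])+")
# char, backspace, same char repeated: a run of bold characters.
BOLD = re.compile(r"(?:([^\x08])\x08\1)+")


def convert_overstrikes(text: str) -> str:
    """Replace backspace overstrike runs with <U> and <B> markup (sketch)."""
    # Each overstruck character occupies three positions in the input,
    # so slicing with a step of 3 recovers the visible characters.
    text = UNDERLINE.sub(lambda m: "<U>" + m.group(0)[2::3] + "</U>", text)
    return BOLD.sub(lambda m: "<B>" + m.group(0)[0::3] + "</B>", text)
```

Text containing no backspace characters passes through unchanged.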
New functions
Converted from a single text file by AscToHTM © 1997-2001 John A Fotheringham