As well as detecting features present in the source text, the software allows you to add features that you would expect in the output file but that can't be inferred from the input.
These include the following.
AscToHTM can calculate - or be told - the title of a document. This will be placed in the document properties section in the header of each HTML file produced.
The title is calculated by trying a series of algorithms in a fixed order. If one algorithm returns a value, the subsequent ones are ignored.
AscToHTM can detect the presence of a contents list in the original document, or it can insert a field code that will generate a contents list from the headings that it observes.
There are a number of policies that give you control over how and where a contents list is generated (see contents list policies).
Contents lists placement
By default the contents list will be placed at the top of the output file. You can cause contents lists to be placed wherever you want by using the CONTENTS_LIST pre-processor command.
Contents list detection
AscToHTM can detect contents lists in a number of ways.
This is often a hit-and-miss procedure, and is liable to error.
Should the analysis fail, you can attempt to correct it via the Contents lists policies.
AscToHTM can be made to add HTML headers and footers to each page of HTML generated. Although there are policies that allow you to specify special header and footer files, in later versions of the software we recommend the use of HTML fragments.
Using a HTML Fragments file, you can define the reserved fragment names HTML_HEADER and HTML_FOOTER and these will be copied into the output. You can even embed HTML fragment tags into these definitions, to allow for some customisation. Other reserved names allow you to customize the headers and footers when converting files into a set of frames.
For example the definition
$_$_DEFINE_HTML_FRAGMENT HTML_FOOTER
<HR>
<P>© JafSoft 2004</P>
<P>Converted by AscToHTM 5.0 from this source file on 14-Dec-2004</P>
$_$_END_BLOCK
defines a fragment that will add a horizontal rule at the end of the page, followed by a copyright notice and a line of text showing the version number of the conversion program used, a link to the original text source file (assumed to be a local link) and the date the conversion was performed.
See Using HTML fragments for a fuller description.
From version 4 onwards the program allows you to define "HTML fragments", that is fragments of HTML that can be used by the software to override the standard HTML that it produces. This allows you to customise the headers, footers, horizontal rules, contents list and more.
See HTML fragments
By default AscToHTM creates a single HTML file. However it is possible to get the software to split large files into smaller files, all linked together. For this to be possible the program has to first detect headings in the file. Once this is done, you can choose at what level of heading you want to split the file into pages.
This feature is described more fully in the AscToHTM FAQ that is part of the HTML documentation.
The program can only split into files at headings it recognises (see "Detecting Headings"). Before splitting the file you first need to check that the program is correctly determining where the headings are, and what type they are.
Headings can be numbered, capitalised or underlined. To tell if the program is correctly detecting the headings
Note that the split level is set to 1 to split at "chapter" headings, 2 to split at "chapter and major section" headings etc.
Underlined headings tend to start at level 2, depending on the underline character (see "Underlined heading detection").
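The effect of the split level can be sketched in a few lines of Python. This is only an illustration of the rule described above, not the program's own code: the function name split_into_pages and the representation of headings as (level, title) pairs are assumptions made for the example.

```python
def split_into_pages(headings, split_level):
    """Group (level, title) headings into pages.

    Hypothetical model: level 1 = chapter, 2 = major section, and so on.
    A new page starts at every heading whose level is <= split_level.
    """
    pages = []
    for level, title in headings:
        # Start a new page at a split heading (or if no page exists yet).
        if level <= split_level or not pages:
            pages.append([])
        pages[-1].append(title)
    return pages


headings = [(1, "Intro"), (2, "Scope"), (2, "Terms"), (1, "Usage")]
print(split_into_pages(headings, 1))
print(split_into_pages(headings, 2))
```

With a split level of 1 the sample headings fall into two pages (one per chapter). With a split level of 2 every chapter and major section starts its own page.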
Hopefully this will give you some pointers, but if you still can't get it to work, please mail a copy of the source file (and any policy file you're using) to info<at>jafsoft.com and I'll see what I can advise.
You can use the Conversion Type to select the option of placing your document into a set of HTML frames. This will consist of a master document containing the necessary <FRAMESET> tags to define the frames, and then a number of supporting documents. The main conversion files will be created as before, and will be displayed in the main frame.
The program has the ability to generate a set of frames from your source file. The program works to a model set of frames as shown below, but you have a great degree of control over how the frames are laid out, and what their contents are.
+------------------------------------------------------------+
|                        Header frame                        |
|                         (optional)                         |
+-------------+----------------------------------------------+
| NOFRAMES    |                                              |
|   link      |                                              |
|             |                                              |
|             |                                              |
|             |                                              |
| Contents    |                     Main                     |
|  Frame      |                     Frame                    |
| (optional)  |                                              |
|             |                                              |
|             |                                              |
|             |                                              |
|             |                                              |
|             |                                              |
+-------------+----------------------------------------------+
|                        Footer frame                        |
|                         (optional)                         |
+------------------------------------------------------------+
Frames are implemented under HTML by having a document that describes the frame layout by using one or more nested <FRAMESET> tags. These tags group together <FRAME> tags that identify other HTML files that describe the contents of the individual frames or panes. The HTML page containing the <FRAMESET> doesn't normally contain any visible content. The source of this HTML page looks something like this :-
<FRAMESET ROWS="110,*,90">
  <FRAME NAME="header" SRC="header.html">
  <FRAMESET COLS="260,*">
    <FRAME NAME="contents" SRC="contents.html">
    <FRAME NAME="main" SRC="main.html">
  </FRAMESET>
  <FRAME NAME="footer" SRC="footer.html">
  <NOFRAMES>
    <BODY>
      <p>This browser does not support FRAMES</p>
      <p>Visit <A TARGET="_top" HREF="noframes_main.html">this link</A></p>
    </BODY>
  </NOFRAMES>
</FRAMESET>
This example produces a layout similar to that shown in the diagram in the Frames Overview. There are four frames as follows :-
- "header" at the top of the screen with content taken from the HTML page header.html
- "footer" at the bottom of the screen with content taken from the HTML page footer.html
- the two frames "contents" and "main" side by side in the middle of the screen, between the "header" and "footer" frames. The "contents" frame is on the left, the "main" frame on the right. The contents of these frames are held in the html files "contents.html" and "main.html".
The <NOFRAMES> tag describes the content to be displayed if the browser doesn't support frames. This is less common now, but is still important as many search engines don't understand frames, and will only index the pages linked to in the <NOFRAMES> tag.
In HTML the frame names and source file names can be whatever you like. AscToHTM uses the frame names "header", "footer", "contents" and "main", but will vary the source file names according to the name of your input file.
Depending on the details of your conversion, not all of the above frames are generated, in which case the <FRAMESET> tags will look slightly different.
You don't need to worry about any of this as AscToHTM will determine what layout is required and will generate the necessary HTML <FRAMESET> code.
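To illustrate how the layout changes when some frames are absent, here is a rough Python sketch. It is not the program's actual generator: the function build_frameset is hypothetical, the sizes 110, 260 and 90 are simply copied from the sample above, the file names assume the "myfile" style naming used in this manual, and the <NOFRAMES> section is left out for brevity.

```python
def build_frameset(base, header=False, contents=False, footer=False):
    """Return illustrative <FRAMESET> markup for the chosen optional frames."""
    main = f'<FRAME NAME="main" SRC="{base}.html">'

    # Middle band: contents + main side by side, or main on its own.
    if contents:
        middle = [
            '<FRAMESET COLS="260,*">',
            f'  <FRAME NAME="contents" SRC="{base}_contents_frame.html">',
            f"  {main}",
            "</FRAMESET>",
        ]
    else:
        middle = [main]

    # Stack the optional header and footer above and below the middle band.
    rows, body = [], []
    if header:
        rows.append("110")
        body.append(f'<FRAME NAME="header" SRC="{base}_header_frame.html">')
    rows.append("*")
    body.extend(middle)
    if footer:
        rows.append("90")
        body.append(f'<FRAME NAME="footer" SRC="{base}_footer_frame.html">')

    # An outer ROWS frameset is only needed when it has more than one row,
    # or when there is no inner COLS frameset to act as the outermost tag.
    if len(rows) > 1 or not contents:
        row_spec = ",".join(rows)
        lines = [f'<FRAMESET ROWS="{row_spec}">']
        lines += ["  " + line for line in body]
        lines.append("</FRAMESET>")
    else:
        lines = body
    return "\n".join(lines)


print(build_frameset("myfile", header=True, contents=True, footer=True))
print(build_frameset("myfile", contents=True))
```

The first call reproduces the four-frame structure of the sample. The second has no header or footer, so the sketch emits only the COLS frameset.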
By default if you convert a file called "myfile.txt" the files created are named as follows:-
myfile_frame.html           - Master <FRAMESET> file
myfile_header_frame.html    - "header" source file.
myfile_contents_frame.html  - "contents" source file.
myfile_footer_frame.html    - "footer" source file.
myfile.html                 - "main" source file.
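If you need to refer to these files from a script (for example to upload them), the default naming pattern can be reproduced as below. This is a small sketch based only on the names listed above; the function frame_file_names and its dictionary keys are made up for the example, and the pattern may not hold if you change the output filename.

```python
from pathlib import Path


def frame_file_names(source):
    """Map each frame role to the default file name for a given source file."""
    stem = Path(source).stem  # "myfile.txt" -> "myfile"
    return {
        "master": f"{stem}_frame.html",
        "header": f"{stem}_header_frame.html",
        "contents": f"{stem}_contents_frame.html",
        "footer": f"{stem}_footer_frame.html",
        "main": f"{stem}.html",
    }


for role, name in frame_file_names("myfile.txt").items():
    print(role, name)
```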
The "main" frame will contain the conversion of your source file. If you elect to split a document into many pages, then this will show the start page (which will have links to any next/previous page).
See also Splitting up large files
If your document has recognised headings, then the program is able to generate a contents list (see 5.6.2). In such cases a "contents" frame is generated and the contents list is placed in a file called "myfile_contents_frame.html".
If no contents list can be generated, then no contents frame is created unless you supply a CONTENTS_FRAME HTML fragment to be used as the contents of the "contents" frame.
The contents frame is placed to the left of the main frame. It will include a hyperlink labelled "NOFRAMES" (see NOFRAMES tag and NOFRAMES link) and the generated contents list. This hyperlink is different from the <NOFRAMES> tag described in The master <FRAMESET> document.
You can use policies (see Using policies to control the frame structure) to suppress the creation of a contents frame or to control the following:-
- width of the frame
- colours of background and text
- number of levels shown in the generated contents list
- whether a "NOFRAMES" link is shown, and what URL it links to
You can also customize the frame's appearance using the following HTML fragments (see Using HTML fragments to override frame contents)
- CONTENTS_FRAME
- START_TOC / END_TOC
See also Using HTML fragments to override frame contents
The software cannot "detect" headers and footers in your source text, so you will only get a header or footer frame if you supply the HTML yourself. Header and footer frames can be useful as they provide you with the opportunity to supply titles, navigation links or copyright notices that are always visible.
Prior to version 4 the software already had the ability to add HTML headers and footers to each page generated using HTML supplied in separate files identified by policy values. From version 4 onwards HTML fragments may also be used.
It's expected that you may want to convert the same source into both frames and non-frames forms, using the same policy file. Given this, the program has the ability to "promote" the HTML headers and footers used in non-frames production into their own always-visible frames. Equally, there may be times when this behaviour is not wanted.
The relationships between headers and footers used in non-frames conversion and those used in frames-based conversion are quite complex. In the following sections we describe how headers (footers) are calculated. The logic is described for headers, but applies equally well to footers if you make the necessary name changes.
In non-frames conversion each page created will get a HTML header if either a header file is identified by a policy value, or the HTML_HEADER fragment is defined.
If both are set, the HTML_HEADER fragment is used in preference.
The selected header is referred to as the "standard" header in the discussion in the next two sections.
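The preference rule can be written out as a short Python sketch. It is an illustration only: the function name, the use of a dictionary to hold fragment definitions, and the policy_header_html argument (standing for HTML read from a header file named by policy) are assumptions for the example.

```python
def standard_header(fragments, policy_header_html=None):
    """Pick the "standard" header for non-frames conversion.

    fragments          -- dict of fragment name -> HTML (hypothetical model)
    policy_header_html -- HTML from a header file named by policy, or None
    """
    # The HTML_HEADER fragment wins if both sources are present.
    if "HTML_HEADER" in fragments:
        return fragments["HTML_HEADER"]
    return policy_header_html  # may be None: the page gets no header


print(standard_header({"HTML_HEADER": "<H1>From fragment</H1>"}, "<H1>From file</H1>"))
```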
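In frames conversion the HTML header added to each page in the main frame is determined by three things: whether the MAIN_FRAME_HEADER fragment is defined, whether there is a "standard" header, and whether the "Use main header in header frame" policy is set.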
If the fragment MAIN_FRAME_HEADER is defined, then that is used.
If the fragment MAIN_FRAME_HEADER is not defined, and there is no "standard" header, then the main frame gets no HTML header.
If the fragment MAIN_FRAME_HEADER is not defined, and the policy is not set then the "standard" header is used as in non-frames conversion.
If the fragment MAIN_FRAME_HEADER is not defined, and the policy is set then the "standard" header is promoted into its own "header" frame, and the main frame gets no HTML header.
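The four rules above can be condensed into one function. This Python sketch is an illustration of the logic as described, not the program's code; the function and argument names are invented, and promote_policy_set stands for whichever policy controls promotion of the header into its own frame.

```python
def main_frame_header(fragments, standard, promote_policy_set):
    """Return the HTML header for pages in the main frame, or None.

    fragments          -- dict of fragment name -> HTML (hypothetical model)
    standard           -- the "standard" header HTML, or None if there is none
    promote_policy_set -- True if the header is to be promoted to its own frame
    """
    if "MAIN_FRAME_HEADER" in fragments:
        return fragments["MAIN_FRAME_HEADER"]  # explicit override
    if standard is None:
        return None                            # nothing to use
    if promote_policy_set:
        return None                            # header moves to the header frame
    return standard                            # same as non-frames conversion


print(main_frame_header({}, "<H1>Title</H1>", promote_policy_set=False))
```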
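In frames conversion whether or not a "header" frame is created is determined by three things: whether the HEADER_FRAME fragment is defined, whether there is a "standard" header, and whether the "Use main header in header frame" policy is set.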
If the fragment HEADER_FRAME is defined, then that is used as the contents of a "header" frame.
If the fragment HEADER_FRAME is not defined, and there is no "standard" header, then no "header" frame is created.
If the fragment HEADER_FRAME is not defined, and the policy is not set, then no "header" frame is created.
If the fragment HEADER_FRAME is not defined, and the policy is set, then the "standard" header is used as the contents of the "header" frame. In other words "standard" header is promoted from the "main" frame into its own "header" frame.
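The header frame rules can be sketched in the same way. Again this is only an illustration under the same assumptions (invented names, fragments held in a dictionary); a return value of None means no "header" frame is created.

```python
def header_frame_contents(fragments, standard, promote_policy_set):
    """Return the HTML for the "header" frame, or None if no frame is made.

    fragments          -- dict of fragment name -> HTML (hypothetical model)
    standard           -- the "standard" header HTML, or None if there is none
    promote_policy_set -- True if the header is to be promoted to its own frame
    """
    if "HEADER_FRAME" in fragments:
        return fragments["HEADER_FRAME"]       # explicit frame contents
    if standard is None or not promote_policy_set:
        return None                            # no header frame
    return standard                            # promoted from the main frame


print(header_frame_contents({}, "<H1>Title</H1>", promote_policy_set=True))
```

Taken together with the main frame rules, a promoted "standard" header appears in exactly one place: the header frame when the policy is set, the main frame pages when it is not.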
A large number of policies influence frames generation. These are described more fully in the Policy manual.
general
contents frame
- Add contents frame if possible
- Add NOFRAMES links
- NOFRAMES link URL
- Number of levels in contents frame
- Contents Frame width
- Contents frame background colour
- Contents frame text colour
main frame
A number of file generation policies affect the main frame's appearance, including :-
header and footer frames
- Use main header in header frame
- Header Frame depth
- Header frame background colour
- Header frame text colour
- Use main footer in footer frame
- Footer Frame depth
- Footer frame background colour
- Footer frame text colour
HTML fragments were introduced in version 4 as a means of allowing users to customize some of the HTML generated by the software. This feature is heavily used in frames generation.
The fragment names used in frames production include
HEADER_FRAME        If defined, this fragment is used as the contents of
                    a header frame at the top of the screen

FOOTER_FRAME        If defined, this fragment is used as the contents of
                    a footer frame at the bottom of the screen

CONTENTS_FRAME      If defined, this fragment is used as the contents of
                    the "contents" frame on the left of the screen.
                    If not defined the "contents" frame will contain a
                    generated contents list

MAIN_FRAME_FOOTER   If defined, this fragment is used as the HTML footer
                    of each page that appears in the main frame,
                    overriding any HTML_FOOTER or value defined via
                    policy file.

MAIN_FRAME_HEADER   If defined, this fragment is used as the HTML header
                    of each page that appears in the main frame,
                    overriding any HTML_HEADER or value defined via
                    policy file.
Other HTML fragments may have an effect. For example :-
START_TOC           A fragment to be output before any generated
                    table of contents. If not defined the default
                    behaviour is to output the title "Table of Contents"

END_TOC             A fragment to be output after any generated table
                    of contents. If not defined the default behaviour
                    is to simply put out a horizontal rule <HR>
See also Using HTML fragments
There are several reasons why providing a non-frames alternative to your pages is a good idea. These include
- Not all browsers support frames. This is rarer these days, but there are still people who use text-based or non-visual browsers that can get confused by frames.
- Not all people like frames. This is understating it, as many people loathe frames. This is because frames pages are hard to bookmark and the navigation can confuse some people.
- Many search engines won't access the HTML pages used inside frames. This means your pages will go un-indexed, making it hard for people to find them.
To help with these problems the software supplies a <NOFRAMES> tag in the main <FRAMESET> document, and a visible "NOFRAMES" hyperlink in the contents frame.
The program can place a hyperlink in the contents frame. This link is labelled "NOFRAMES" and will link to the first main page. This will allow users who don't like frames to view your pages in a non-frames window. You can control this link to a limited extent using policies.
See also Using policies to control the frame structure
HTML provides a <NOFRAMES> tag whose contents are displayed by any browser that doesn't support the <FRAMESET> tag. The program will automatically generate a <NOFRAMES> tag that displays a message saying the page requires frames, and offering a link to the first main page. This will allow users with non-frames browsers, and search engines, to access your main pages.
You should consider whether or not your pages are suitable for both frames and non-frames viewing. If they are, then you can use the first page displayed in the main frame as your NOFRAMES hyperlink target. This is, in fact, the default behaviour.
There are a number of reasons that you might want to maintain two sets of pages :-
- You don't want to have the non-frames version split into as many small pages as the frames version (different Split level policy values)
- You want to place different headers and footers on the two versions to allow for different methods of navigation.
If you do want two sets of files, simply convert the file twice, once with and once without frames generation selected. You can either move the files into different directories, or change the output filename for one of the sets. Other than these changes you should be able to use the same policy file.
If you create two sets of files, make sure you set the NOFRAMES link URL policy to point to the first non-frames HTML page.
You can use the Conversion Type to select the option of placing the generated HTML onto the Windows clipboard, ready for use in other Windows applications.
In this case the HTML generated will omit the <HTML>, <HEAD> and <BODY> tags as these are not suitable when pasting into an existing HTML document.
Using AscToHTM in this way can be a very powerful technique which allows you to merge converted HTML with more traditionally authored content.
This approach becomes even more powerful if you use a Clipboard extender like ClipMate to remember and organise everything copied to the clipboard. You could convert a few files, and then use ClipMate to recall the pasted HTML at your leisure for insertion into your other HTML.
ClipMate is produced by ThornSoft and can be downloaded from their website at http://www.thornsoft.com/
Converted from a single text file by AscToHTM © 1997-2004 John A Fotheringham