The latest version of these files is available online at http://www.jafsoft.com/doco/docindex.html
Easy. You register the software (see "registration and updates"), or you remove them by hand. "Nag" lines only appear in unregistered trial copies of the software. If you register, these are removed.
Easy. You register the software (see "registration and updates").
The case is only adjusted in unregistered trial copies of the software, either after the line limit is reached, or after the 30 day trial has expired. The case is adjusted so that you can still evaluate whether the conversion has produced the right type of HTML, but since the text is now all in the wrong case and has had letters substituted, the HTML is of little use to you.
This is intended as an incentive to register.
That said, you will find pages on the web that have been converted in this manner.
The program is detecting a "definition". Definitions are usually keywords with a following colon ":" or hyphen "-", e.g. "text:"
You can see this more easily if you go to Output-Style and toggle the "highlight definition term" option... the definition term (to the left of the definition character) is then highlighted in bold.
If the definition spreads over 2 "lines", then a definition paragraph is created, giving the effect you see.
If you have created your file using an editor that doesn't output line breaks, then only long paragraphs will appear to the program as 2 or more "lines". In such cases only the longer paragraphs will be detected as "definition paragraphs"; the rest are detected as "definition lines", even though they're displayed in a browser as many lines. If you view the file in NotePad you'll see how the program sees it.
To stop this you have a number of options.
- switch this off. This will stop all attempts to spot "definition" lines.
- switch this off. This will stop anything with a colon (:) being recognized as a definition.
- disable this. The definitions will still be recognized, but the <DL> markup won't be used.
Sometimes AscToHTM will produce HTML with words broken - usually over two lines. This can happen if your text file has been edited using a program (like NotePad) that doesn't place line breaks in the output.
AscToHTM is line-orientated (see 2.1.2). Programs like NotePad place an entire paragraph on a single "line", or on lines of a fixed length (e.g. 1000 characters).
AscToHTM places an implicit space at the end of each line it reads. This is to ensure you don't get the word at the end of one line merged with that at the start of the next.
However, in files with fixed length "lines", large paragraphs will be broken arbitrarily, with the result that a space (and possibly a <BR>) will be inserted into the middle of a word.
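For example, suppose a fixed-length "line" happens to end in the middle of a word (line lengths shortened here for illustration):

    ...and this text is an exam
    ple of what can go wrong...

The implicit space turns "example" into "exam ple" in the HTML output.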
You can avoid this by breaking your text into smaller paragraphs, passing your file through an editor that wraps differently prior to conversion, or selecting any "save with line breaks" option you have.
The software will add a line break to "short" lines, or - sometimes - to lines with hyperlinks in them.
You can edit your text to prevent the line being short, or you can use policies to alter the calculation of short lines. Use the "Policy Manual" to read about the following policies
- "Add <BR> to lines with URLs"
- "Look for short lines"
- "Short line length "
- "Page Width"
Do you mean line structure, or do you really mean paragraph structure?
The program looks for "short lines". Short lines can mark the last line in a paragraph, but more usually indicate an intentionally short line. The calculation of what is a short line and what isn't can be complex, as it depends on the length of the line compared to the estimated width of the page.
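As an illustration only (a sketch under my own assumptions - the real calculation is more involved, and the threshold shown is invented, not the program's actual value):

    # Illustrative sketch of a "short line" test. The 0.6 threshold is an
    # assumed value for illustration, not AscToHTM's actual figure.
    def is_short_line(line, estimated_page_width):
        return len(line.rstrip()) < 0.6 * estimated_page_width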
You have a number of options :-
See also "how do I preserve one URL per line?"
Usually because you had lots of white space in your original document. If that is the case, then you can set the policy
Ignore multiple blank lines : Yes
to reduce this effect.
Some people complain that there are blank lines between paragraphs, or between changes in indentation. Often this is the vertical spacing inserted by default in HTML. This can only be controlled in later versions of HTML which support HTML 4.0 and Cascading Style Sheets (CSS)
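For example, with CSS the default vertical spacing around paragraphs can be reduced with a rule such as this (a minimal sketch; the selectors you need may differ):

    P { margin-top: 0; margin-bottom: 0 }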
Occasionally certain combinations of features lead to an extra line of space.
Well, at time of writing, I know of a 56,000 line file (3Mb) which was converted into a single (4Mb) HTML file. Of course, it was also converted into a suite of 300 smaller, linked files weighing in at 5Mb of HTML.
This file represented 1,100 pages when printed out.
I do sometimes wonder if anyone ever reads files that big though.
Since version 4.1 the short answer is "probably".
Although the software has no ability to understand documents written this way, and was designed to cope with the ASCII character set, from version 4.0 onwards it is possible to manually set the "charset" used. This tells the HTML browser how to interpret the characters. Whether or not you see the page correctly then depends on the browsers and fonts installed on the viewer's machine.
In version 4.1 some auto-detection of character sets has been added. This can usually detect which character encoding is being used. You can switch this behaviour off should you wish, and you can also set the correct charset by hand.
See the policies "Character encoding" and "Auto-detect character encoding".
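By way of illustration only (a sketch of the general idea, not AscToHTM's actual algorithm), auto-detection amounts to guessing an encoding from the raw bytes and then declaring it to the browser:

    # Illustrative sketch: try a couple of likely encodings, fall back to
    # Latin-1 (which accepts any byte). Not AscToHTM's actual algorithm.
    def guess_charset(raw_bytes):
        for charset in ("ascii", "utf-8"):
            try:
                raw_bytes.decode(charset)
                return charset
            except UnicodeDecodeError:
                pass
        return "iso-8859-1"

    # The chosen charset is then declared in the generated HTML, e.g.
    # <META http-equiv="Content-Type" content="text/html; charset=...">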
Under Windows the software usually tries to display the results files in your browser or viewer of choice. To prevent multiple instances of the browser being launched, DDE is used. DDE is a Windows mechanism that allows requests to be passed from one program to another; in this case the software is asking the browser to display the HTML just created.
Some users have reported problems with DDE - especially under Windows Millennium. When this occurs any program - including AscToHTM - will hang whenever it attempts to use DDE... you notice it first with AscToHTM because it uses DDE all the time. When this happens you will need to use the Task Manager to kill the program.
You can solve this problem by using the Settings -> Viewers for results menu option to disable the use of DDE.
From version 4 onwards the software will detect when this has happened, and will disable its use of DDE next time it is run. You can re-enable this (e.g. after a reboot has cleared the problem) under the Settings->Viewers menu option.
Note, this is a workaround and not a solution. When DDE stops working on your system other programs will still have problems, e.g. when you click on a hyperlink inside your email client.
Sadly I don't know a solution for the DDE problem. Sometimes rebooting helps - initially at least - sometimes stopping a few applications helps. Sometimes it doesn't. :-(
Because it wasn't designed to. No, really.
The software is designed to convert ASCII text into HTML - that is, plain, unformatted documents. Word and other wp packages use binary formats that contain formatting codes embedded in the text (or in some cases the text is embedded in the codes :-).
Even RTF, which is a text file, is so heavily laden with formatting information that it could never be read as normal text (look at it in Notepad and you'll soon see what I mean).
Why the omission? Well, like I said, that was never the intention of this program. I always took the view that, in time, the authors of those wp packages would introduce "export as HTML" options that would preserve all the formatting, and in general this is what has happened. To my mind writing such a program is "easy".
My software tackles the much more difficult task of inferring structure where none is explicitly marked. In other words trying to "read" a plain text file and to determine the structure intended by the author.
See also "rtf-to-html converter etc?".
You can't. Unlike Netscape versions up to and including 4.7, Netscape 6.0 doesn't support DDE in its initial release under Windows.
By itself, no.
AscToHTM can only really produce relatively "static", mostly-text web pages. To add any dynamic contents and graphics you'd effectively need to add the relevant HTML yourself, so the answer is essentially "no".
Adding a shopping cart is actually fairly tricky. You either have to install the software yourself, or sign up with an ISP that will do this for you. Most such systems require a database (of items being sold). Having not dealt much with such systems myself I can't really advise on a web authoring tool (which is what AscToHTM is) that would integrate seamlessly with a shopping cart system.
My advice would be to identify an ISP that offers shopping cart functionality and see what methods they offer for web authoring.
I wish you luck.
At present you can't. The windows version won't respond to user input while a conversion is in progress, meaning that its windows will not refresh. Normally this isn't a problem, but in large conversions this can be a little disconcerting.
Fixing this is on the "to do" list.
Here's an overview of how the software works, this will give you a flavour for the complexity of the issues that need to be addressed.
The software first looks for pre-formatted regions of text. It does this by
Once a preformatted region is identified, analysis is performed to see whether this is a table, diagram, code sample or something else. This decision depends on
If nothing fits then this text is output "as normal", except that the line structure is preserved to hopefully retain the original meaning.
If the software decides a table is possible, it
Once a tentative set of column boundaries has been identified, the following steps are repeated
Finally, having settled on a column structure the software
Naturally any one of these steps can go wrong, leading to less than perfect results.
The program has mechanisms (via policies and preprocessor commands) to
- Influence the attempt to look for tables
- Influence the attempt to extend tables (steps (1)-(3))
- Influence the decision as to what a preformatted region is (steps (4)-(6))
- Influence the column analysis (steps (7)-(11))
- Influence the header size and column alignment (steps (12)-(15))
Read the table sections in the "Tag Manual" and "Policy Manual" for more details.
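To make the column analysis more concrete, here is a much-simplified sketch (my own illustration, not the program's actual code) of how candidate column boundaries can be found by looking for character positions that are blank on every line of a pre-formatted region:

    # Much-simplified illustration: a candidate column "gap" is a character
    # position that is blank in every line of the region. Runs of adjacent
    # gap positions suggest column boundaries. The real analysis is far
    # more involved, as the steps above suggest.
    def column_gaps(lines):
        width = max(len(line) for line in lines)
        padded = [line.ljust(width) for line in lines]
        return [col for col in range(width)
                if all(line[col] == " " for line in padded)]

    rows = ["Hole No.    X        Y",
            "   1      3.2500   5.0150",
            "   2      1.2500   3.1250"]
    print(column_gaps(rows))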
The software will attempt to detect regions of "pre-formatted" text. Once detected it will attempt to place such regions in tables, or if that fails sometimes in <PRE>...</PRE> markup.
Lines with lots of horizontal white space or "table characters" (such as "|", "-", "+") are all candidates for being pre-formatted, especially where several of these lines occur.
This often causes people's .sigs from email to be placed in a table-like structure.
You can alter whether or not a series of lines is detected as preformatted with the policies
Look for preformatted text : No
Minimum automatic <PRE> size : 4
The first disables the search for pre-formatted text completely. The second policy states that only groups of 4 or more lines may be regarded as preformatted. That would prevent most 3-line .sigs being treated that way.
If you have pre-formatted text, but don't want it placed in tables (either because it's not tabular, or because the software doesn't get the table analysis quite right), you can prevent pre-formatted regions being placed in tables via the policy
Attempt TABLE generation : No
First read "how does the program detect and analyse tables?" for an overview of how tables are detected.
If you're not getting tables this is either because they are not being detected, or because, having been detected, they are deemed not to be "table-like". Look at the HTML code to see if there are any comments around your table indicating how it's been processed.
If the table is not being detected this could be because
If all this fails, edit the source to add preprocessor commands around the table as follows
$_$_BEGIN_TABLE
...(your table lines)...
$_$_END_TABLE
First read "how does the program detect and analyse tables?" for an introduction to how tables columns are analysed.
The short answer is "the analysis went wrong". Explaining why it went wrong is almost impossible in a general way. Some things to consider
Often the table extent is correct, but the analysis of the table has gone wrong.
If all this fails you can explicitly tell the software what the table layout is by using either the TABLE_LAYOUT preprocessor command, or the "Default TABLE layout" policy. Only use the policy if all tables in the same source file have the same layout.
The software removed them because it thought they would look wrong as characters. The lines are usually replaced by a non-zero BORDER value and/or some <HR> tags placed in cells.
One tip. If you insert a line of dashes after the header like so...
    Basic Dimensions
Hole No.    X        Y
-------------------------
   1      3.2500   5.0150
   2      1.2500   3.1250
etc.....
The program should recognize this as a heading, and modify the HTML accordingly (placing it in bold).
Alternatively you can tell the program (via the policy options or preprocessor commands) that the file has 2 lines of headers.
(see the example table in 3.3.6)
The spanning of "Basic Dimensions" over the other columns can be hit and miss. Basically if you have a space where the column gap is expected, the text will be split into cells; if you don't, then the text will be placed in a cell with a COLSPAN value that spans several cells.
For example
         | space aligns with column "gap"
         v
    Basic Dimensions
Hole No.    X        Y
-------------------------
   1      3.2500   5.0150
   2      1.2500   3.1250
etc.....
In this case you'd get "Basic" in column 1 and "Dimensions" spanning columns 2 and 3. If you edit this slightly as follows then the "Basic Dimensions" will span all 3 columns
       | space no longer aligns with column "gap"
       v
  Basic Dimensions
Hole No.    X        Y
-------------------------
   1      3.2500   5.0150
   2      1.2500   3.1250
etc.....
It's a bit of a black art.
Sometimes when the table is wrong, it's a good idea to set the BORDER size to 0 (again via the policy options) to make things look not so bad. It's a fudge, but a useful one to know.
The program can attempt to recognize five types of headings:
Numbered headings. These are lines that begin with section numbers. To reduce errors, numbers must be broadly in sequence and headings at the same level should have the same indentation. Words like "Chapter" may appear before the number, but may confuse the analysis when present.
Capitalised headings. These are lines that are ALL IN UPPERCASE.
Underlined headings. These are lines which are followed by a line consisting solely of "underline" characters such as underscore, minus, equals etc. The length of the "underline" line must closely match the length of the line it is underlining.
Embedded headings. These are headings embedded as the first sentence of the first paragraph in the section. The heading will be a single all-UPPERCASE sentence. Unlike the other headings, the program will place these as bold text, rather than using heading markup. You will need to manually enable the search for such headings; it is not enabled by default.
Key phrase headings. These are lines in the source file that begin with user-specified words (e.g. "Chapter", "Appendix" etc.) The list of words and phrases to be spotted is case-sensitive and will need to be set via the "Heading key phrases" policy.
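As a rough illustration of the numbered headings case (a sketch under my own assumptions, not the program's real rules):

    import re

    # Illustrative sketch: a candidate numbered heading starts with a
    # dotted section number. The real checks ("broadly in sequence",
    # consistent indentation) are considerably more subtle than this.
    HEADING = re.compile(r"^(\s*)(\d+(?:\.\d+)*)\s+\S")

    def heading_number(line):
        m = HEADING.match(line)
        return m.group(2).split(".") if m else None

    print(heading_number("3.11 Some section title"))   # ['3', '11']
    print(heading_number("some ordinary text"))        # None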
The program is biased towards finding numbered headings, but will allow for a combination. It's quite possible for the analysis to get confused, especially when
To tell if the program is correctly detecting the headings, look at
Conversion Options -> Analysis policies -> headings
after the conversion.
Depending on what is going wrong do one or more of the following :-
This is a failure of analysis. The program looks for a possible contents list at the top of the file before the main document (sometimes in the first section).
If your file has no contents list, but the program wrongly expects one, then as it encounters the headings it will mark these up as contents lines.
To prevent this, set the analysis policy
Expect contents list : No
to "no". Or add a preprocessor line to the top of your file as follows
$_$_CHANGE_POLICY Expect contents list : No
Either a failure of analysis, or an error in your document. The software checks that headings "obey policy" and are in sequence. If you get your numbering sequence wrong, or if you place the heading line at a radically different indentation to all the others, then the software will reject this as a heading line, in which case the number may well be turned into a hyperlink.
If it's an error in your document, fix the error.
For example, a common problem is numbered lists inside sections. If the list numbers occur at the same level of indentation as the level 1 section headings, then eventually a number on the list will be accepted as the next "in sequence" header. For example, in a section numbered 3.11, any list containing the number 4 will have the "4" treated as the start of the next chapter. If section "3.12" comes next, the change in section number from 4 back to 3.12 will be rejected as "too small", and so all sections will be ignored until section 4.1 is reached.
The solution here is to edit the source and indent the numbered list so that it cannot be confused with the true headers. Alternatively, change it to an alphabetic, roman numeral or bulleted list.
Another possible cause is that the software hasn't recognized this level of heading as being statistically significant (e.g. if you only have 2 level 4 headings (n.n.n.n) in a large document). In this case you'll need to correct the headings policy, which is a sadly messy affair.
The software can have problems distinguishing between
1 This is chapter one
and
1. This is list item number one.
To try and get it right it checks the sequence number, and the indentation of the line. However problems can still occur if a list item with the right number appears at the correct indentation in a section.
If possible, try to place chapter headings and list items at different indentations.
In extreme cases, the list items will confuse the software into thinking they are the headings. In such a case you'd need to change the policy file to say what the headings are, with lines of the form
We have 2 recognized headings
Heading level 0 = "" N at indent 0
Heading level 1 = "" N.N at indent 0
(this may change in later versions).
The software can detect numbered headings. Any lines that begin with numbers are checked to see if they are the next heading. This check includes checking the number is (nearly) in sequence, and that the line is (nearly) at the right indentation.
If the line meets these criteria, it is likely to become the next heading, often causing the real heading to be ignored, and sometimes completely upsetting the numbering sequence.
You can fix this by editing the source so that the "number" either occurs at the end of the previous line, or has a different indentation to that expected for headings.
The software prefers numbered headings to underlined or capitalised headings. If you have both, you may need to switch the underlined headings on via the policy
Expect underlined headings : Yes
If the program is looking for underlined headings (see "Why are underlined headings not recognized?") then the only reason for this is that the "underlining" is of a radically different length to the line being underlined. Problems can also occur for long lines that get broken.
Edit your source to
- place the whole heading on one line
- make the underlining the same length
The level of heading associated with an underlined heading depends on the underline character as follows:-
'****'                   level 1
'====', '////'           level 2
'----', '____', '~~~~'   level 3
'....'                   level 4
The actual markup that each heading gets may depend on your policies. In particular level 3 and level 4 headings may be given the same size markup to prevent the level 4 heading becoming smaller than the text it is heading. However the logical difference will be maintained, e.g. in a generated contents list, or when choosing the level of heading at which to split large files into many HTML pages.
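In sketch form, that mapping is simply (illustrative only):

    # Underline character to heading level, as described above.
    UNDERLINE_LEVEL = {"*": 1, "=": 2, "/": 2,
                       "-": 3, "_": 3, "~": 3, ".": 4}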
A couple of possible reasons :-
- a numbered list is confusing the software. This is the same problem as "why are the numbers of my headings coming out as hyperlinks?"
- Some of your headings are "failing" the checks applied. See the discussion in "how does the program recognize headings?"
One of the reasons for "failure" is that - for consistency - headings must be in sequence and at the same indentation. This is an attempt to prevent numbers that appear at the start of a line by chance from being treated as the wrong headings.
If some headings aren't close enough to the calculated indent then they won't be recognised as headings. If a few headings are discarded then later headings that are at the correct indentation are discarded as being "out of sequence".
If you're authoring from scratch then the easiest solution is to edit all the headings to have the same indent. Alternatively disable the policy "Check indentation for consistency".
The software attempts to recognize all URLs, but the problem is that - especially near the end of the URL - punctuation characters can occur. The software then has difficulty distinguishing a comma-separated list of URLs from a URL with a series of commas in it (as beloved at C|Net).
This algorithm is being improved over time, but there's not much more you can do than manually fix it, and report the problem to the author who will pull out a bit more hair in exasperation :)
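To give a flavour of the problem (an illustrative sketch only, not the actual algorithm): trailing punctuation usually needs to be trimmed from a candidate URL, but doing so breaks genuine URLs that really do end in such characters:

    # Illustrative sketch: trim punctuation that more often belongs to the
    # surrounding sentence than to the URL itself. This mis-handles the
    # rarer URLs that genuinely end in these characters.
    def trim_url(candidate):
        return candidate.rstrip(".,;:!?)")

    print(trim_url("http://www.jafsoft.com/,"))   # http://www.jafsoft.com/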
To avoid errors the program will only recognize newsgroups in the "big 7" hierarchies. Otherwise filenames like "command.com" might become unwanted references to fictional newsgroups.
This means that uk.telecom won't be recognized, although if you place "news:" in front of it like this: news:uk.telecom, then it is recognized.
If you want to make "uk." recognized as a valid news hierarchy, then set the policy
recognized USENET groups : uk
Then any word beginning "uk." may become a newsgroup link.
The program will only convert numbers that match known numbered sections into hyperlinks. If the number is a genuine section heading, then the chances are that this level of heading has not been detected. This has happened in large documents which contained only 2 level 5 headings. In such documents you may need to manually add the extra level to your policy file.
Another limit is that the program won't convert level 1 heading references, because the error rate is usually too high. For example if I say "1, 2, 3" it's unlikely I want this to become hyperlinks to chapters 1, 2 and 3.
In a numbered document numbers of the form n.n may well become hyperlinks to that section of the document. This can cause "Windows 3.1" to become a hyperlink to section 3.1 if such a section exists in your document.
You can either insert some character (such as "V" to make "V3.1"), place the number inside a protective pre-processor TEXT tag as follows
[[TEXT 3.1]]
or disable this feature entirely via the policy
Cross-refs at level : 3
(which means only "level 3" headings such as n.n.n will be turned into links), or
Cross-refs at level : (none)
which should disable the behaviour.
The software will sometimes break long lines to make the HTML more readable. If this happens in the middle of a hyperlink, the browser reads the end of line as a space in the URL.
You can fix this by editing the output text so that the HREF="<url>" part of the file is all on the same line.
This "feature" may be fixed in later versions of AscToHTM.
Some files contain lists of URLs, with one URL per line. By default the software will not normally preserve this structure because long lines are usually concatenated into a single paragraph.
You can change this behaviour using the option on the Output policies -> Hyperlinks policy sheet.
See also "why isn't the software preserving my line structure?"
First time I looked it was nearly 200, recently the number is approaching 250. They kind of sneak up on you, I guess. The "Policy Manual" gives a pretty comprehensive description of what each one does and where it can be found. Last time I checked that file was 5000 lines of text before conversion to HTML.
People complain that there are too many policies, but then they say "couldn't you add an option to ...", and so it goes. Organizing these policies in a logical manner is a fairly difficult problem, and if anyone has any bright ideas I'm listening. In recent versions I added overview policies to make things easier to locate or to switch off en masse.
Make sure you're using an "incremental" policy file, rather than a full one. You can do this by viewing the .pol file in a text editor. An "incremental" policy file will only contain lines for the policies you've changed. A full policy file will contain all possible policies.
If you load a "full" policy file you prevent the program intelligently adjusting to the particular file being converted. If this happens either edit out the lines you don't want from your policy file, or reset the policies to their defaults and create a new policy file from scratch.
(see 1.7)
The program can't distinguish between indented paragraphs and paragraphs that are intended as follow-on paragraphs from some bullet point or list item.
This means that whilst the first paragraph (the one with the bullet point) is indented as a result of being placed inside appropriate list markup, the second and subsequent paragraphs are just treated as indented text.
The bullet point will be indented one level deeper than the text position of the bullet. The follow-on paragraph will be indented according to its own indentation position compared to the prevailing documentation pattern. Ideally this will be one level deeper than the text position of the bullet.
Occasionally the two result in different indentations. The solutions are either to
- Review your indent position(s) policy with a view to adjusting the values to give the right amount of indentation to the follow-on paragraphs. Sometimes adding an extra level to match the indentation of the follow-on paragraph is all that's necessary.
- Edit your source text slightly, adjusting the indent of either the list items or follow-on paragraphs until the two match.
HTML doesn't allow the numbering to be marked up explicitly. Instead you can only use a START attribute in the <OL> tag to get the right first number which is incremented each time a <LI> tag is seen.
Some browsers don't implement the START attribute, and so they always restart numbering at 1.
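For example, a numbered list resuming at item 4 can only be expressed like this (standard HTML, shown for illustration):

    <OL START=4>
    <LI>the fourth item
    <LI>the fifth item
    </OL>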
There's not much I can do about this problem.
I've also seen a bug in Opera V3.5 where any tag (such as <EM>) placed between the <OL> and the <LI> causes the numbering to increment, and another Opera bug where a <FONT> tag between the <OL> and <LI> tags causes all that text to not be displayed. Neither should be a problem here, as both involve illegal HTML markup - and we try very hard not to generate any of that!
If there's any other problem of this sort please email info<at>support.com with details.
There are a number of ways:-
- If the file already has a contents list this may be detected if the sections are numbered, and the contents lines will be turned into links to the sections concerned.
- You can force the addition of a contents list using the policies under the menu at
Conversion Options -> Output Policies -> Contents List
A hyperlinked contents list will be generated from the headings that the program detects. This list will be placed at the top of the first file.
- If you don't want the generated list to be placed at the top of the file, insert the preprocessor command $_$_CONTENTS_LIST at the location(s) you want. This command takes arguments that allow a limited number of formatting options. It can also be limited in scope, so you can, if you wish, add a $_$_CONTENTS_LIST to each chapter in your document.
First read "how does the program recognize headings?".
If you're generating a contents list from the observed headings, then any missing headings are either because
- (a) the program didn't recognize the headings
- (b) the policy "Maximum level to show in contents" has been set to a value that excludes the desired heading.
If you're converting an in-situ contents list, then only (a) is likely to apply, in which case you need to ensure the program recognizes your headings.
There used to be a problem whereby the software would add hyperlinks to sections that didn't exist, or would point to the wrong file when a large file was being split into many smaller files.
Both problems should now be fixed, so if you encounter this problem, contact info<at>support.com.
Emphasis markup can be achieved by placing asterisks (*) or underscores (_) in pairs around words or phrases. The matching pair can be over a few lines, but cannot span a blank line. Asterisks and underscores can be nested.
Asterisks generate bold markup, underscores generate italic markup, and combining these generates bold, italic markup.
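For example (illustrative; the exact tags generated may differ):

    *bold phrase*            becomes   <B>bold phrase</B>
    _italic phrase_          becomes   <I>italic phrase</I>
    *_bold italic phrase_*   becomes   <B><I>bold italic phrase</I></B>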
If you wrap a phrase in underscores, and replace all the spaces by underscores like_this, then the result will be underlined like this, and not in italics.
The algorithm copes reasonably well with normal punctuation, but if you use some unanticipated punctuation, it may not be recognized!&%@!
You can have a phrase that spans a couple of lines that contains another phrase of a different type in the middle of it, but you can't have two phrases of the same type nested that way. Be reasonable :-)
Phrases that span a blank line are not permitted. You'll need to end the markup before the blank line, and re-start it afterward. This is to reduce the chances of false matches.
The link dictionary allows you to add hyperlinks to particular words or phrases. You can choose the phrase to be matched, the text to be displayed and the URL to be linked to.
This can help when building a site by converting multiple text files. For example the whole www.jafsoft.com site is built from text files, and extensive use of a link dictionary is made to add links from one page to another.
Known problems include
- if the "match text" matches part of the URL the program may get confused. Try to keep them different.
- if the "match text" of one link is a substring of another the program will get confused
- if a link is repeated on the same line only the first occurrence is converted (fixed post V3.0)
- if the "match text" spans two lines it won't be detected.
One tip is to place brackets round the [match text] in your source file... this not only makes the chances of a false match less likely, but also makes it clearer in the source files where the hyperlinks will be.
The Link Dictionary support in the Windows version of the software is a little quirky. Apologies for that.
The way it should work is that you click on the "add new link definition" button.
I realize now that this is counterintuitive, and will probably address this in the next release.
If you save your policy, each link appears as a line of the form
Link definition: "match text" = "display text" + "URL"
e.g.
Link definition: "jaf" = "John Fotheringham" + "http://www.jafsoft.com/"
The whole definition must fit on one line.
You may find it easier to open your .pol file in a text editor and add these by hand.
For more information see the section "Processing several files at once" in the main documentation. The software supports wildcards, and console versions are available to registered users which are better suited for batch conversions.
In the shareware versions no more than 5 files may be converted at once. This limit is absent in the registered version (see "what's the most files I can convert at one go?").
If you only want a few files converted, then the simplest way is to drag and drop those files onto the program. You can either drag files onto the program's icon on the desktop, or onto the program itself.
If you drag files onto the program's icon there is a limit with this approach of around 10 files. This limit arises because the filenames are concatenated to make a command string, and this seems to have a Windows-imposed limit of 255 characters. This problem may be solved in later versions.
The same limit doesn't seem to apply when you drag files onto the open program.
Alternatively you can browse to select the files you want converting.
If you want to convert many files in the same directory, then just type in a wildcard like "*.txt" into the name of the files to be converted.
Registered users of the software can get a console version of the software. This can accept wildcards on the command line, and is more suited for batch conversion, e.g. from inside windows batch files (for example it won't grab focus when executed).
If you want to convert many files in different directories, either invoke the console version multiple times using a different wildcard for each directory, converting one directory at a time, or investigate the use of a steering command file when running from the command line. See the main documentation for details.
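For example, a hypothetical console session might look like this (the executable name and exact syntax here are assumptions on my part; see the main documentation for the real details):

    C:\docs> AscToHTM *.txt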
The largest number of files converted at one time using the wildcard function was reported to be around 2000. A week later someone contacted me with around 3000 files to be converted. A few weeks after that someone was claiming 7000. If you'd like to claim a higher number, let me know.
Theoretically the only limit is your disk space. The program operates on a flat memory model so that the memory used is largely independent of the number of files converted, or the size of the files being converted.
Such conversions are a testament to the program's stability and efficient use of system resources. That said, if possible we recommend you break the conversion into smaller runs to reduce your risks :-)
The program can only split into files at headings it recognises (see "how does the program recognize headings?"). You first need to check that the program is correctly determining where the headings are, and what type they are.
Headings can be numbered, capitalised or underlined. To tell if the program is correctly detecting the headings, look at
Conversion Options -> Analysis policies -> headings
Once the headings are being correctly diagnosed, you can switch on file splitting using the policies under
Conversion Options -> output policies -> file generation
Note that the "split level" is set to 1 to split at "chapter" headings, 2 to split at "chapter and major section" headings etc.
Underlined headings tend to start at level 2, depending on the underline character (see "How do I control the header level of underlined headings?")
Hopefully this will give you some pointers, but if you still can't get it to work, please mail a copy of the source file (and any policy file you're using) to info<at>support.com and I'll see what I can advise.
Prior to version 4 there was a bug which meant the policy "Add navigation bar" was being ignored when splitting files (the only time it was used). This is now fixed.
However also available in version 4 is a new "HTML fragments" feature that allows you to customize some of the HTML generated by the software. This includes the navigation bars so that, for example, if you wanted to suppress just the top navigation bar, you could define the fragment NAVBAR_TOP to be empty.
See "customizing the HTML created by the software" and the "Tag Manual" for more details.
The software attempts to detect pre-formatted text in your files and, when it finds some, attempts to turn these into tables. In many cases having detected some pre-formatted text it recognises that it cannot make a table and so resorts to using <PRE>...</PRE> markup instead (in RTF it uses a courier font), giving a "mal-formed table" error message.
These <PRE> sections actually work quite well for some documents, but in other cases they would be better not handled this way.
Happily the solution is simple. On the menu go to
Conversion Options -> Analysis policies -> What to look for
and disable "pre-formatted regions of text".
No.
My converters convert from plain ASCII text into HTML or RTF. Their "unique selling point" is that they intelligently work out the structure of the text file.
However other people provide other converters.
There are a number of html->text converters, and on top of that Netscape has a good "save as text" feature. Or you can import the HTML into Word and use Word's save as text features (although in my opinion these are inferior to Netscape's).
If you visit my ZDNet listing at http://www.hotfiles.com/?000M96 and click on the "related links" you'll see a number of converters listed.
There are at least two RTF-to-HTML converters called RTF2HTML and RTFtoHTML and of course Word for Windows offers this capability (it doesn't suit everyone though).
In fact, here are four products:-
RTFtoHTML can be found at http://www.sunpack.com/RTF/
RTF2HTML can be found at http://www.xwebware.com/products/rtf2html/
RTF-2-HTML can be found at http://www.easybyte.com/rtf2html.com
IRun RTF converter (free) can be found at http://www.pilotltd.com/irun/index.html
Yet another Word converter can be found at http://www.yawcpro.com/