Documentation for the AscToRTF conversion utility
The latest version of these files is available online at http://www.jafsoft.com/doco/docindex.html
AscToRTF has the ability to detect and analyse tables. However, if you want to convert a file containing multiple tables, or to convert multiple files containing a variety of tables, it may be useful to create a "Table Definition File" (TDF). In this file you can define multiple table types, and each type can be associated with a particular structure and a particular format.
By associating a table type with a structure, you are telling the software how the table is laid out, thereby simplifying its task and reducing the likelihood of analysis errors.
By associating a table type with a format, you can control various aspects of how the final table looks. This feature was mainly added for HTML conversion where, for example, you can associate CSS styles with columns, allowing tables of a known type to have their columns formatted through an external stylesheet. In RTF conversion there are far fewer formatting options available.
TDF support was added for a future version of AscToHTM, where it also allows you to define features that fine-tune the HTML created by AscToHTM. These features are not described here, but will be described in full in the appropriate section of the Table manual.
Contents of this section
Loading the Table Definition File
Sample Table Definition File
Defining a Table Type
TDF line: Table type
TDF line: Must contain
TDF line: Use Structure
TDF line: Use Format
Defining a Table Structure
TDF line: Table structure
TDF line: Layout
Defining a Table Format
TDF line: Table format
TDF line: Column names
TDF line: Caption
TDF line: Table colour
TDF line: Border colour
TDF line: Colour data rows
TDF line: Border size
TDF line: HTML attributes
TDF line: HTML cell attributes
TDF line: Show columns
All the table definition information should be placed in a "Table Definitions File" (TDF). By convention this file should have a ".inc" extension, but this isn't a requirement.
To have the file loaded, you will need to set a policy line in your policy file of the form
[config files]
Definitions file : table_definitions.inc
This will normally appear in the "[config files]" section of your policy file, but again this isn't strictly necessary.
Once this option appears in your policy file, the table definitions file will be loaded before the conversion process starts. During the conversion each table that is found will be tested against the known table types, and if a match is made, any associated structure and formatting will be applied.
You can select the Table Definition File that you want from the Config File locations menu option.
Here is a sample Table Definition File (TDF). This example is primarily geared towards creating an HTML table with coloured alarm states from a financial results table.
table type : TradeDate
must contain : TradeDate Momentum SinceSig
use format : F_TradeDate
use structure : S_TradeDate

table structure : S_TradeDate
Layout : 6,7,39,51,61,71,80

table format : F_TradeDate
column names : ,,,Momentum,SinceSig
column format : name=Momentum alarms="Momentum_red,Momentum_green"
column format : name=SinceSig alarms="SinceSig_red,SinceSig_green"
alarm : name=Momentum_red type=LE value=0.0 alarm_color=red
alarm : name=Momentum_green type=gt value=0.0 alarm_color=green
alarm : name=SinceSig_red type=LE Value=0.0 alarm_color=red
alarm : name=SinceSig_green type=gt value=0.0 alarm_color=green
HTML attributes : onMouseover="changeto(event, 'white')"
This example consists of three sections: a table type section, a table structure section and a table format section.
Although this example has just one of each section type, in more complex conversions you might have multiple Table Types sharing just a handful of Table Structures and Table Formats.
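To make the section structure concrete, here is a minimal Python sketch that groups the lines of a definition file into sections. It is not how AscToRTF itself reads the file. The function name parse_tdf is hypothetical, and the sketch assumes that keywords are case-insensitive and that a new section begins at each "table type", "table structure" or "table format" line.

```python
def parse_tdf(text):
    """Group 'keyword : value' lines into sections.

    Assumptions (not confirmed by the manual): keywords are
    case-insensitive, and a section runs until the next
    'table type', 'table structure' or 'table format' line.
    """
    section_starts = {"table type", "table structure", "table format"}
    sections = []
    for raw in text.splitlines():
        line = raw.strip()
        if not line or ":" not in line:
            continue  # skip blank or unrecognised lines
        keyword, value = line.split(":", 1)  # split at the first colon only
        keyword = " ".join(keyword.lower().split())
        value = value.strip()
        if keyword in section_starts:
            sections.append({"kind": keyword, "name": value, "lines": []})
        elif sections:
            sections[-1]["lines"].append((keyword, value))
    return sections


sample = """\
table type : TradeDate
must contain : TradeDate Momentum SinceSig
use format : F_TradeDate
use structure : S_TradeDate

table structure : S_TradeDate
Layout : 6,7,39,51,61,71,80

table format : F_TradeDate
column names : ,,,Momentum,SinceSig
"""

sections = parse_tdf(sample)
for section in sections:
    print(section["kind"], "->", section["name"], len(section["lines"]))
```

Each section keeps its own name, so several table types can later refer to the same structure or format by that name.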
Whenever the software identifies a candidate table, it will check against the Table Definitions File (TDF) to see if this is a table of a known type. If it is, it will apply any defined structure or formatting for that type to the candidate table.
Table types are defined by having a series of table type sections something like the following in your definition file:
table type : Rank
must contain : Rank ROC5 ROC10 ROC21
use format : format_rank
use structure : structure_rank
Each table type section starts with the table type line, and is followed by a number of additional definition lines that apply to this table type.
NOTE: The use format and use structure lines are optional, although you would want at least one to be present for the table type to have any real use.
Each table type section must contain at least one "identifier" line. Identifier lines are those that specify conditions by which tables of this type can be identified.
NOTE: At present the only identifier line supported is the "must contain" line.
The identifier line(s) specify conditions that are used to test each table to see if it could be of this type. Commonly the test is a search for text that is present in the header of all tables of this type.
If multiple table type sections are defined, then the first table type whose identifiers match the table is used, so the ordering of table type sections within the definition file can be important.
For example :-
table type : type_1
must contain : Rank
...

table type : type_2
must contain : date Rank
...
In the above example any table that contains a line of text containing "date rank" would match both table types. Since type_1 is listed first, all such tables would be treated as type_1. In this case it would be better to define type_2 first as below
table type : type_2
must contain : date Rank
...

table type : type_1
must contain : Rank
...
With the above definitions, a table containing "date rank" would be assigned to type_2, while one which simply contained "rank" would be type_1.
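The effect of ordering can be illustrated with a short Python sketch of a first-match-wins lookup. This is only an illustration of the rule described here, not the software's actual code. The function name first_matching_type and the sample table lines are made up, and the match is assumed to be a simple case-insensitive substring test.

```python
def first_matching_type(table_lines, table_types):
    """Return the name of the first table type whose text is found.

    table_types is an ordered list of (type_name, must_contain_text),
    in the same order as the sections appear in the definition file.
    """
    for type_name, wanted in table_types:
        wanted = " ".join(wanted.split()).lower()
        for line in table_lines:
            if wanted in " ".join(line.split()).lower():
                return type_name  # first match wins, later types are ignored
    return None


# Made-up table lines for illustration.
table = ["Date      Rank   ROC5", "01-Jan    3      1.2"]

general_first = [("type_1", "Rank"), ("type_2", "date Rank")]
specific_first = [("type_2", "date Rank"), ("type_1", "Rank")]

print(first_matching_type(table, general_first))   # type_1
print(first_matching_type(table, specific_first))  # type_2
```

Listing the more specific type first is what lets it be reached at all; with the general type first, the specific one never matches.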
The presence of a "table type" line in the definition file marks the start of a new table type definition. The line has the format
table type : <type_name>
where <type_name> is a unique name for this type of table, and shouldn't contain any spaces.
The "must contain" line is an identifier line. That is, it is a condition applied against a candidate table to see if it matches this table type.
The line has the format :
must contain : <text_to_be_matched>
The candidate table is tested line-by-line, so the text to be matched must be wholly contained within one line of the source text. When testing the match expression the case of the text being tested is ignored, and all multiple white space is treated as a single space. So the match text "Date rank" will match the word "date" followed by any number of spaces and tabs, followed by "rank".
A good use of this line would be to match the known headers in a table, e.g.
must contain : Date Temperature Max Min
would match a table with headers "Date", "temperature", "MAX" and "MIN" if they were in that order, regardless of the spacing.
NOTE: At present only one "must contain" line per table type is supported.
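The matching rule can be sketched in Python as follows. This is a minimal sketch of the behaviour described in this section (line-by-line, case ignored, runs of white space treated as one space), not the program's real implementation. The helper names collapse and must_contain and the sample table lines are hypothetical.

```python
def collapse(text):
    """Lower-case the text and squeeze runs of spaces and tabs to one space."""
    return " ".join(text.split()).lower()


def must_contain(table_lines, match_text):
    """True if any single line of the table contains the match text."""
    wanted = collapse(match_text)
    return any(wanted in collapse(line) for line in table_lines)


# Made-up tables: one with the headers on a single line, one split over two.
one_line_header = ["Date \t temperature    MAX   MIN", "01-Jan   12   15   9"]
split_header = ["Date   temperature", "MAX   MIN"]

print(must_contain(one_line_header, "Date Temperature Max Min"))  # True
print(must_contain(split_header, "Date Temperature Max Min"))     # False
```

The second call fails because the text to be matched must be wholly contained within one line of the source.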
The "use structure" line specifies that if a table matches the identifying conditions for this table type, then its structure should be interpreted according to the rules specified in the named table structure section.
This line has the form
use structure : <structure_name>
where <structure_name> is the name of a table structure section defined elsewhere in the definition file. The <structure_name> should contain no spaces.
The same structure may be shared by multiple table types, that is several table type sections could all end up having the same "use structure" line.
If the <structure_name> given doesn't match a structure defined elsewhere in the definition file, then the software will report this error each time it matches a table of this type and then fails to locate the named structure. That is, it gives a runtime error, rather than reporting the error when the definition file is loaded.
NOTE: This may change in later versions.
See Defining a table structure
The "use format" line specifies that if a table matches the identifying conditions for this table type, then it should be formatted according to the rules specified in the named table format section.
This line has the form
use format : <format_name>
where <format_name> is the name of a table format section defined elsewhere in the definition file. The <format_name> should contain no spaces.
The same format may be shared by multiple table types, that is several table type sections could all end up having the same "use format" line.
If the <format_name> given doesn't match a format defined elsewhere in the definition file, then the software will report this error each time it matches a table of this type and then fails to locate the named format. That is, it gives a runtime error, rather than reporting the error when the definition file is loaded.
NOTE: This may change in later versions.
Table structures are defined by having a table structure section something like the following in your definition file:
table structure : Percent_funds
Layout : 6,31,41,51,61,71,81
Each table structure section starts with the table structure line, and is followed by a Layout line.
The presence of a "table structure" line in the definition file marks the start of a new table structure definition. The line has the format
table structure : <structure_name>
where <structure_name> is a unique name for this structure, and shouldn't contain any spaces. The <structure_name> is the name used in any "Use structure" statement in a table type section.
The Layout line in a table structure section describes how the matched table should be interpreted into columns. The format of the Layout line is the same as that used by the TABLE_LAYOUT pre-processor command.
Layout : 6,7,39,51,61,71,80
For example, the above line specifies that the data comes in 6 columns, which end at the column positions 7, 39, 51, 61, 71 and 80.
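As an illustration of how such a Layout value might be applied, the following Python sketch reads the column count and end positions and then cuts a line of text at those positions. It is a sketch under assumptions, not the program's actual logic: the function names parse_layout and split_line are hypothetical, the end positions are assumed to be 1-based and inclusive, and the data row is made up.

```python
def parse_layout(value):
    """Turn '6,7,39,...' into a list of column end positions.

    The first number is the column count; the rest are the positions
    at which each column ends.
    """
    numbers = [int(n) for n in value.split(",")]
    count, ends = numbers[0], numbers[1:]
    if count != len(ends):
        raise ValueError(f"expected {count} end positions, got {len(ends)}")
    return ends


def split_line(line, ends):
    """Cut one line of text into cells at the given end positions.

    Assumption: positions are 1-based and inclusive, so a column ending
    at position 7 covers the first 7 characters of the line.
    """
    cells, start = [], 0
    for end in ends:
        cells.append(line[start:end].strip())
        start = end
    return cells


ends = parse_layout("6,7,39,51,61,71,80")

# Build a made-up row that is padded to the layout above.
fields = ["1", "Acme Growth Fund", "12.5", "3.1", "-0.4", "7.0"]
widths = [end - start for start, end in zip([0] + ends, ends)]
line = "".join(field.ljust(width) for field, width in zip(fields, widths))

print(split_line(line, ends))
# ['1', 'Acme Growth Fund', '12.5', '3.1', '-0.4', '7.0']
```

Checking the column count against the number of end positions catches a mistyped Layout line early.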
Table formats are defined by having a table format section something like the following in your definition file:
table format : Rank
column names : ,,,ROC5,ROC10,ROC21
caption : Rankings
table colour : #E1D1C1
border colour : #E2E2E2
Colour data rows : Yes
border size : 2
HTML attributes : onmouseover='click'
HTML cell attributes : nowrap
show columns : no,no,no,,,,,,,,
The presence of a "table format" line in the definition file marks the start of a new table format definition. The line has the format
table format : <format_name>
where <format_name> is a unique name for this format, and shouldn't contain any spaces.
The <format_name> is the name used in any "Use format" statement in a table type section.
The "Column Names" line allows you to give each column a name. This name should be unique, and can be referenced in more advanced configuration lines.
However these options are not available in RTF generation.
The "Caption" line allows you to add a caption to any table that matched the table type conditions.
However this option is not yet implemented in RTF generation.
The "Table Colour" line allows you to specify a background colour for the table.
However this option is not yet implemented in RTF generation.
The "Border Colour" line allows you to specify a border colour for the table.
However this option is not yet implemented in RTF generation.
The "Colour data rows" line allows you to specify that odd and even rows of the table should be coloured differently to allow the table to be more easily understood.
However this option is not yet implemented in RTF generation.
The "Border size" line allows you to specify the border size for the table.
However this option is not yet implemented in RTF generation.
The "HTML attributes" line only applies to HTML generation by AscToHTM.
The "HTML cell attributes" line only applies to HTML generation by AscToHTM.
The format of the "show columns" line is
show columns : <yes_no>,<yes_no>,...
where each <yes_no> is either "yes", "no" or blank. If blank, it's taken to be "yes" for all columns.
This line specifies whether or not each column should be included in the output. If your table has been exported in CSV format from a spreadsheet it may contain "work" columns that you don't want shown. This option allows you to specify that.
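A short Python sketch shows how a "show columns" value could be applied to a row of cells. It is an illustration only. The function names are hypothetical, the row is made up, blank entries are read as "yes" as described above, and columns beyond the end of the list are assumed to be shown.

```python
def parse_show_columns(value):
    """Turn 'no,no,no,,,' into a list of booleans, one per column.

    A blank entry is read as 'yes'; only an explicit 'no' hides a column.
    """
    return [entry.strip().lower() != "no" for entry in value.split(",")]


def visible_cells(row, show):
    """Keep only the cells whose column is marked as shown.

    Assumption: columns with no entry in the list are shown.
    """
    return [cell for i, cell in enumerate(row) if i >= len(show) or show[i]]


show = parse_show_columns("no,no,no,,,")

# Made-up row: three "work" columns followed by three data columns.
row = ["w1", "w2", "w3", "1.2", "3.4", "5.6"]

print(visible_cells(row, show))  # ['1.2', '3.4', '5.6']
```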
Converted from a single text file by AscToHTM © 1997-2004 John A Fotheringham