Internal Property Index & Descriptions

See AbiWord Internal Property List.

DTDs for AWML and XHTML with AWML extensions

Proposal for version 2.0 AbiWord file format

At least two bugs in Bugzilla (Bug 2759 & Bug 4213) concern invalid XML where the error is non-conformance to the stated DTD. AbiWord's current DTD is completely inadequate. I mean no offense to the original authors; DTD syntax is deeply scary and AbiWord's file format isn't trivial to figure out, especially with the emergence of new technologies and the increasing adoption of XML as an interoperability language.

While XML is suitable for representing AbiWord's format, AWML has evolved with little real regard for XML conventions, and the current format is ill-defined in too many ways. With AbiWord-2.0 approaching fast, however, the file format is once again stabilizing and the opportunity arises to formalize it with a valid DTD. Further, expressing the format in DTD form exposes weaknesses in the format itself.

Of course, proposing a new standard is all very well, but it's the implementation that counts. People need to be able to rely on the format and to read both new and existing documents without loss of data.

And finally, if the file format is well-documented, this will aid developers who wish to write importers and file-format converters.

New XML Namespace for AWML

One of my pet peeves (and we're currently looking for the origin of that expression, so please give us your theories) is AWML's current namespace URL (xmlns:awml="http://www.abisource.com/awml.dtd"), so I'm taking this opportunity to change it to something a little more elegant. The URI for the new file format is:

xmlns:awml="http://www.abisource.com/2003/"

which, by an uncanny coincidence, is also the URL of this page. Weird.
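
As a rough illustration of what an importer might do with the new namespace, the sketch below reads the namespace declarations from a document and checks whether the awml prefix is bound to the new URI. The sample document and the function names are made up for this example; only the URI comes from the text above.

```python
import io
import xml.etree.ElementTree as ET

# The namespace URI proposed above.
AWML_NS = "http://www.abisource.com/2003/"

# Hypothetical minimal document; only the xmlns:awml declaration matters here.
SAMPLE = (
    '<abiword xmlns:awml="http://www.abisource.com/2003/" '
    'template="false" styles="unlocked">'
    '<section><p>Hello</p></section>'
    '</abiword>'
)


def declared_namespaces(xml_text):
    """Return a {prefix: uri} dict of every namespace declared in xml_text."""
    source = io.BytesIO(xml_text.encode("utf-8"))
    events = ET.iterparse(source, events=("start-ns",))
    return {prefix: uri for _event, (prefix, uri) in events}


def uses_new_awml_namespace(xml_text):
    """True if the awml prefix is bound to the proposed 2.0 namespace URI."""
    return declared_namespaces(xml_text).get("awml") == AWML_NS
```

A document that still carries the old awml.dtd URL would make uses_new_awml_namespace return False, which is one cheap way for a reader to tell the two generations of files apart.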

The DOCTYPE Declaration

For the new DTD we need a new DOCTYPE declaration:

<!DOCTYPE abiword PUBLIC "-//ABISOURCE//DTD AWML 2.0 Strict//EN" "http://www.abisource.com/2003/awml.mod">
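
To show how a reader could recognise the declaration without fetching the DTD, here is a minimal sketch using Python's expat bindings. The sample document and the read_doctype helper are assumptions for illustration; the public and system identifiers are the ones given above.

```python
import xml.parsers.expat

# Hypothetical document that starts with the DOCTYPE declaration shown above.
SAMPLE = (
    '<!DOCTYPE abiword PUBLIC "-//ABISOURCE//DTD AWML 2.0 Strict//EN" '
    '"http://www.abisource.com/2003/awml.mod">\n'
    '<abiword template="false" styles="unlocked">'
    '<section><p>Hello</p></section>'
    '</abiword>'
)


def read_doctype(xml_text):
    """Return the root name, public id and system id from the DOCTYPE."""
    found = {}

    def on_doctype(name, system_id, public_id, has_internal_subset):
        found.update(name=name, system_id=system_id, public_id=public_id)

    parser = xml.parsers.expat.ParserCreate()
    parser.StartDoctypeDeclHandler = on_doctype
    # expat does not load the external DTD here; it only reports the declaration.
    parser.Parse(xml_text, True)
    return found
```

This only identifies the declared document type. Actually validating against the DTD would need a validating parser, which is outside the scope of this sketch.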

Type Groups

Field Types

app_compiledate | app_compiletime | app_id | app_options | app_target | app_ver | char_count | line_count | nbsp_count | para_count | word_count | date | date_ddmmyy | date_dfl | date_doy | date_mdy | date_mmddyy | date_mthdy | date_ntdfl | date_wkday | datetime_custom | endnote_anchor | endnote_ref | file_name | footnote_anchor | footnote_ref | list_label | mail_merge | meta_contributor | meta_coverage | meta_creator | meta_date | meta_description | meta_keywords | meta_language | meta_publisher | meta_rights | meta_subject | meta_title | meta_type | page_count | page_number | page_ref | time | time_ampm | time_epoch | time_miltime | time_zone

Section Types

header | header-even | header-first | header-last | footer | footer-even | footer-first | footer-last
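
The type groups above are written the way a DTD enumeration would list them, so they can be turned mechanically into parameter entities named after the FieldType.datatype and SectionType.datatype references used in the attribute lists. The sketch below is one possible way to do that; the helper names are invented, the exact entity form is an assumption, and only a handful of the field types are included to keep it short.

```python
# Section types copied from the list above.
SECTION_TYPES = (
    "header | header-even | header-first | header-last | "
    "footer | footer-even | footer-first | footer-last"
)

# Only a few of the field types, to keep the sketch short.
SOME_FIELD_TYPES = "page_number | page_count | date | time | file_name"


def parse_type_group(group):
    """Split a 'a | b | c' type group into a set of names."""
    return frozenset(name.strip() for name in group.split("|"))


def entity_declaration(entity_name, group):
    """Build a parameter entity declaration (assumed form) for a type group."""
    names = " | ".join(name.strip() for name in group.split("|"))
    return '<!ENTITY % {} "( {} )">'.format(entity_name, names)


def is_valid_section_type(value):
    """True if value is one of the section types listed above."""
    return value in parse_type_group(SECTION_TYPES)
```

For example, entity_declaration("SectionType.datatype", SECTION_TYPES) yields a declaration that an attribute list could then reference as %SectionType.datatype;.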

AWML element and attribute definitions

a

<!ELEMENT a ( #PCDATA | c | field | image | cbr | pbr | br | bookmark )* >
<!ATTLIST a
    href	%URI.datatype;		#REQUIRED
  >

abiword

<!ELEMENT abiword ( metadata?, revisions?, styles?, ignoredwords?, lists?, pagesize*, section+, data? ) >
<!ATTLIST abiword
    template	( true | false )	#REQUIRED
    styles	( locked | unlocked )	#REQUIRED
    version	CDATA			'unnumbered'
    props	CDATA			''
    fileformat	CDATA			#FIXED		'2.0'
  >

<!-- TODO: xml:space="preserve" -->
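
The content model for abiword fixes both which children may appear and in what order. As a rough sketch of that ordering rule (not a substitute for real DTD validation), the content model can be mirrored as a regular expression over the child element names. The function name and sample documents are made up for illustration.

```python
import re
import xml.etree.ElementTree as ET

# The abiword content model, written as a regex over space-terminated child names:
# metadata?, revisions?, styles?, ignoredwords?, lists?, pagesize*, section+, data?
ABIWORD_MODEL = re.compile(
    r"(metadata )?(revisions )?(styles )?(ignoredwords )?(lists )?"
    r"(pagesize )*(section )+(data )?"
)


def children_match_model(xml_text):
    """True if the root is abiword and its children follow the content model."""
    root = ET.fromstring(xml_text)
    if root.tag != "abiword":
        return False
    sequence = "".join(child.tag + " " for child in root)
    return ABIWORD_MODEL.fullmatch(sequence) is not None


# Hypothetical documents: the first follows the model, the second puts
# styles after section, and the third has no section at all.
GOOD = ('<abiword template="false" styles="unlocked">'
        '<metadata/><styles/><pagesize/><section><p/></section><data/>'
        '</abiword>')
WRONG_ORDER = ('<abiword template="false" styles="unlocked">'
               '<section><p/></section><styles/></abiword>')
NO_SECTION = '<abiword template="false" styles="unlocked"><metadata/></abiword>'
```

This checks only the top level and ignores attributes and the content of the children, so it catches ordering mistakes but nothing deeper.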

bookmark

<!ELEMENT bookmark EMPTY >
<!ATTLIST bookmark
    type	( start | end )	#REQUIRED
    name	ID		#REQUIRED
  >

br

<!ELEMENT br EMPTY >

c

<!ELEMENT c ( #PCDATA | cbr | pbr | br )* >
<!ATTLIST c
    style	CDATA		'inherit'
    props	CDATA		''
  >

<!-- TODO: "type", "endnote-id" & "footnote-id" attributes? -->
<!-- TODO: "style" - implement "inherit" -->

cbr

<!ELEMENT cbr EMPTY >

cell

<!ELEMENT cell ( p )+ >
<!ATTLIST cell
    props	CDATA		''
  >

d

<!ELEMENT d ( #PCDATA ) >
<!ATTLIST d
    name	CDATA		#REQUIRED
    mime-type	CDATA		#REQUIRED
    base64	( yes | no )	#FIXED		'yes'
  >

<!-- TODO: MIME types? -->
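
Since base64 is fixed to 'yes', the text of every d element can be decoded the same way. The sketch below shows one way an importer might collect the data items; the item name, the MIME type and the helper function are hypothetical.

```python
import base64
import xml.etree.ElementTree as ET

# Hypothetical data block holding one small text item.
PAYLOAD = b"Hello, AbiWord!"
SAMPLE = (
    '<data>'
    '<d name="note_1" mime-type="text/plain" base64="yes">'
    + base64.b64encode(PAYLOAD).decode("ascii") +
    '</d>'
    '</data>'
)


def decode_data_items(data_xml):
    """Return {name: (mime_type, raw_bytes)} for every d element."""
    items = {}
    for d in ET.fromstring(data_xml).iter("d"):
        # A non-validating parser will not supply the fixed value, so a
        # missing attribute is treated as "yes" here.
        if d.get("base64", "yes") != "yes":
            raise ValueError("d element is not base64-encoded")
        raw = base64.b64decode(d.text or "")
        items[d.get("name")] = (d.get("mime-type"), raw)
    return items
```

The name key is what an image element's dataid would presumably be matched against, though that link is only hinted at by the TODO on image.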

data

<!ELEMENT data ( d )* >

endnote

<!ELEMENT endnote ( #PCDATA | c | field | br | a )* >
<!ATTLIST endnote
    endnote-id	CDATA		#REQUIRED
  >

field

<!ELEMENT field EMPTY >
<!ATTLIST field
    endnote-id	CDATA		''
    footnote-id	CDATA		''
    style	CDATA		'inherit'
    props	CDATA		''
    type	%FieldType.datatype;	#REQUIRED
  >

<!-- TODO: default 'none' for "endnote-id" & "footnote-id"? -->

foot

<!ELEMENT foot ( #PCDATA | c | field | br | a )* >
<!ATTLIST foot
    footnote-id	CDATA		#REQUIRED
  >

ignoredwords

<!ELEMENT ignoredwords ( iw )* >

image

<!ELEMENT image EMPTY >
<!ATTLIST image
    dataid		CDATA	#REQUIRED
    props		CDATA	''
  >

<!-- TODO: "dataid", or "href"?, should really be of type IDREF -->

iw

<!ELEMENT iw ( #PCDATA ) >

l

<!ELEMENT l EMPTY >
<!ATTLIST l
    id			CDATA	#REQUIRED
    parentid    	CDATA	#REQUIRED
    type        	CDATA	#REQUIRED
    start-value 	CDATA	#REQUIRED
    list-decimal	CDATA	#REQUIRED
    list-delim  	CDATA	#REQUIRED
  >

<!-- TODO: "id" should be ID and "parentid" should be ( IDREF | none ) -->
<!-- TODO: "type" is what? '0' for numbered, '5' for bulleted? -->
<!-- TODO: "start-value" is an integer -->
<!-- TODO: "list-decimal" can be empty or "NULL" (presumably for bulleted lists) -->

lists

<!ELEMENT lists ( l )* >

m

<!ELEMENT m ( #PCDATA ) >
<!ATTLIST m
    key			CDATA	#REQUIRED
  >

metadata

<!ELEMENT metadata ( m )* >

p

<!ELEMENT p ( #PCDATA | c | field | foot | endnote | image | cbr | pbr | br | bookmark | a )* >
<!ATTLIST p
    id		CDATA		''
    parentid	CDATA		'0'
    listid	CDATA		''
    props	CDATA		''
    level	CDATA		#IMPLIED
    style	CDATA		'Normal'
  >

<!-- TODO: "type", "endnote-id" & "footnote-id" attributes? -->
<!-- TODO: "id" should be ID, and "listid" & "parentid" should be ( IDREF | none ) -->
<!-- TODO: need a new element type for lists, perhaps? <li> maybe? -->
<!-- TODO: is "level" needed? if so, restrict it to 0-10; an enumeration cannot be mixed with CDATA -->

pagesize

<!ELEMENT pagesize EMPTY >
<!ATTLIST pagesize
    pagetype	CDATA		#REQUIRED
    orientation	CDATA		#REQUIRED
    width	CDATA		#REQUIRED  
    height	CDATA		#REQUIRED
    units	CDATA		#REQUIRED
    page-scale	CDATA		#REQUIRED
  >

pbr

<!ELEMENT pbr EMPTY >

r

<!ELEMENT r ( #PCDATA ) >
<!ATTLIST r
    id		ID		#REQUIRED
  >

<!-- TODO: "id" - but was that intended? do we need "reviewer", and maybe "date", also? -->

revisions

<!ELEMENT revisions ( r )* >

s

<!ELEMENT s EMPTY >
<!ATTLIST s
    basedon	CDATA		'None'
    name	CDATA		#REQUIRED
    type	( P | C )	#REQUIRED
    props	CDATA		#REQUIRED
    followedby	CDATA		'Normal'
  >
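
The basedon attribute makes styles inherit from one another, ending at 'None'. The sketch below shows one way the effective properties of a style could be computed by walking that chain. It assumes that props holds "name:value" pairs separated by semicolons, which this page does not specify; the style definitions are invented for the example.

```python
import xml.etree.ElementTree as ET

# Hypothetical style definitions; "Heading 1" is based on "Normal".
SAMPLE = (
    '<styles>'
    '<s name="Normal" type="P" '
    'props="font-family:Times New Roman; font-size:12pt"/>'
    '<s name="Heading 1" type="P" basedon="Normal" followedby="Normal" '
    'props="font-size:17pt; font-weight:bold"/>'
    '</styles>'
)


def parse_props(props):
    """Parse 'name:value; name:value' (an assumed syntax) into a dict."""
    result = {}
    for item in props.split(";"):
        name, sep, value = item.partition(":")
        if sep:
            result[name.strip()] = value.strip()
    return result


def effective_props(styles_xml, style_name):
    """Merge props along the basedon chain; nearer styles override their bases."""
    root = ET.fromstring(styles_xml)
    styles = {s.get("name"): s for s in root.iter("s")}
    chain, seen, name = [], set(), style_name
    # Follow basedon until it names no known style ('None') or loops.
    while name in styles and name not in seen:
        seen.add(name)
        chain.append(styles[name])
        name = styles[name].get("basedon", "None")
    merged = {}
    for style in reversed(chain):
        merged.update(parse_props(style.get("props", "")))
    return merged
```

With the sample above, "Heading 1" keeps the font family of "Normal" but overrides its font size.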

section

<!ELEMENT section ( p | table )+ >
<!ATTLIST section
    props	CDATA		''
    type	%SectionType.datatype;	#IMPLIED
    id		ID		#IMPLIED
    header	CDATA		''
    footer	CDATA		''
    num_columns	CDATA		'1'
    column_gap	CDATA		''
  >

<!-- TODO: "header" & "footer" should be IDREF and default to 'none' -->
<!-- TODO: "num_columns" should be a whole number -->
<!-- TODO: "column_gap" should be a positive length (or 'auto'?) -->
<!-- TODO: "type" needs a default - 'flow' maybe? -->
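
Until header and footer become IDREF, nothing in the DTD stops them from pointing at a section that does not exist. The sketch below checks for that by hand. It assumes that a non-empty header or footer value is meant to match the id of another section, as the TODO implies; the sample document is invented.

```python
import xml.etree.ElementTree as ET

# Hypothetical document: the header reference resolves, the footer one does not.
SAMPLE = (
    '<abiword template="false" styles="unlocked">'
    '<section header="h1" footer="f9"><p>Body</p></section>'
    '<section id="h1" type="header"><p>Page header</p></section>'
    '</abiword>'
)


def dangling_section_refs(xml_text):
    """Return (attribute, value) pairs that name no existing section id."""
    sections = list(ET.fromstring(xml_text).iter("section"))
    ids = {s.get("id") for s in sections if s.get("id")}
    problems = []
    for section in sections:
        for attr in ("header", "footer"):
            ref = section.get(attr, "")
            if ref and ref not in ids:
                problems.append((attr, ref))
    return problems
```

Once the attributes are declared as IDREF, a validating parser would report the same problem on its own.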

styles

<!ELEMENT styles ( s )* >

table

<!ELEMENT table ( cell )+ >
<!ATTLIST table
    props	CDATA		''
  >