OpenXML Import

Synopsis: The project's goal is to create an importer plugin for AbiWord that reads documents conforming to Microsoft's OpenXML format (specifically, Word 2007 documents with the .docx extension) and converts them to AbiWord's internal working format.


  • *2007-08-03*: Inline formatting is mostly implemented. Font face, font size, bold, italics, underline, strikethrough, and color/highlight are now imported accurately. Paragraph formatting such as margins, indents (first-line and hanging), and line spacing is also imported correctly. The next feature to implement is style parsing.
  • *2007-06-22*: *First milestone reached!* The warnings have been fixed, and the ListenerState classes have been developed sufficiently to support basic text import. Unicode is supported, provided you have the correct fonts to display the text. The plan for next week is to fix memory leaks, improve the code that iterates over STL containers, and implement basic formatting parsing.
  • *2007-06-15*: The libgsf functions are now available, so a PackageManager wrapper class has been written to offer a convenient interface for parsing Word 2007 documents specifically. It has been tested and is functional. The plan for next week is to implement the ListenerState classes, which will parse the different parts of the package. Also, some warnings have cropped up that should be investigated.
  • *2007-06-08*: The bugs in the data model have been ironed out; it is working and stable. Work has begun on the actual import filter, but a design issue arose concerning the parser for Open packages. It has been decided that the utility functions currently in gnumeric will be used to parse the package. There are plans to integrate these functions into libgsf directly over the weekend, so the plan for next week is to develop the part of the import filter that will use them.
  • *2007-06-01*: The design has been finalized for the most part. A preliminary version of the data model has been implemented, but a problem with polymorphism currently prevents it from being fully functional. The plan for next week is to have a functional data model that supports the most basic structures (sections, paragraphs, text) and to begin work on the actual parser for .docx files.
  • *2007-05-21*: Most of the preliminary work (analysis and design) has been completed. A dummy version of the plugin currently works.
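
As a rough illustration of what the status entries above describe (opening the package, then importing text with basic inline formatting), here is a minimal Python sketch. It is not the plugin's code: the plugin itself is built on libgsf and the ListenerState classes, whereas this sketch uses Python's standard zipfile and ElementTree modules. The function names read_paragraphs and make_sample_docx are hypothetical, the main document part is assumed to sit at word/document.xml (a real reader would locate it through the package relationships), and a bold or italic element is treated as "on" whenever it is present, ignoring any w:val attribute.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

# WordprocessingML main namespace used by Word 2007 documents.
W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
NS = {"w": W}


def read_paragraphs(docx_bytes):
    """Return a list of paragraphs; each paragraph is a list of (text, props) runs."""
    # A .docx file is a zip package; assume the main part is word/document.xml.
    with zipfile.ZipFile(io.BytesIO(docx_bytes)) as package:
        root = ET.fromstring(package.read("word/document.xml"))

    paragraphs = []
    for p in root.iterfind(".//w:p", NS):
        runs = []
        for r in p.iterfind("w:r", NS):
            rpr = r.find("w:rPr", NS)  # run properties, may be absent
            # Simplification: presence of <w:b/> or <w:i/> means "on".
            props = {
                "bold": rpr is not None and rpr.find("w:b", NS) is not None,
                "italic": rpr is not None and rpr.find("w:i", NS) is not None,
            }
            text = "".join(t.text or "" for t in r.iterfind("w:t", NS))
            runs.append((text, props))
        paragraphs.append(runs)
    return paragraphs


def make_sample_docx():
    """Build a tiny in-memory package holding only word/document.xml (illustrative sample)."""
    xml = (
        f'<w:document xmlns:w="{W}"><w:body>'
        "<w:p><w:r><w:t>Plain </w:t></w:r>"
        "<w:r><w:rPr><w:b/></w:rPr><w:t>bold</w:t></w:r></w:p>"
        "</w:body></w:document>"
    )
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as package:
        package.writestr("word/document.xml", xml)
    return buf.getvalue()


if __name__ == "__main__":
    for runs in read_paragraphs(make_sample_docx()):
        print(runs)
```

Keeping package access separate from the XML walk loosely mirrors the split between the PackageManager wrapper and the ListenerState classes mentioned above, though the real classes may be organized quite differently.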

Original Schedule

All dates are, of course, approximate.

  • *Week 1-3 (April 11th to April 29th)*: Documentation phase: Familiarization with the project community, the architecture of AbiWord, how filters are written, and the OpenXML specifications regarding Word documents.
  • *Week 4-5 (April 30th to May 13th)*: Analysis phase: Carefully map each feature of OpenXML to the corresponding feature in AbiWord.
  • *Week 6 (May 14th to May 20th)*: Design phase: Design a backend that translates from OpenXML to AbiWord's data model, and separately design the import filter that will use this backend. The backend would be designed so that it could eventually be extended to work the other way (translating from AbiWord to OpenXML). This would allow import and export filters to share the same backend.
  • *Week 7 (May 21st to May 27th)*: One-week contingency period (for possible delays).
  • *Week 8-10 (May 28th to June 17th)*: Implementation phase begins: The most basic features are implemented (and verified with unit tests).
  • *Week 11-12 (June 18th to July 1st)*: Implementation of the import filter.
  • *Week 13 (July 2nd to July 8th)*: One-week contingency period and additional tests.
  • *July 9th*: Mid-term upload to Google's servers.
  • *Week 14-16 (July 9th to July 29th)*: The remaining features of the backend are implemented and tested.
  • *Week 17-18 (July 30th to August 12th)*: The import filter is finished.
  • *Week 19 (August 13th to August 19th)*: One-week contingency period and additional testing.
  • *August 20th*: Final upload to Google's servers.