AbiWordDevelopment

From AbiWiki

(Difference between revisions)
Jump to: navigation, search
(Plans)
(Contents)
Line 3: Line 3:
Place content here related to [[AbiWord|AbiWord]] Development.
Place content here related to [[AbiWord|AbiWord]] Development.
-
==== Contents ====
+
==== Contents
 +
====
-
==== Plans
+
=== Plans
 +
 
====
====

Revision as of 03:12, 26 March 2008

Contents

AbiWord Development Area

Place content here related to AbiWord Development.

==== Contents

==

=== Plans

==

Planning for futre version, etc. Hardcopy of ml or IRC discussions in general:

Old Plans:


==== TestingFramework

==

We try to setup a Framework to automatically test AbiWord to ease Quality Assurance and development. The first brick of this framework is the UnitTest.

=== Performance Issues.

==

AbiWords performance degrades with document size until it becomes barely usable at around 400 pages with reasonable computers.

I think that there are a few areas where we can substantially improve performance. There are:

==== PieceTable speedups.

==

Abiword has a doubly linked-list of fragments which can be any sort of document structure including paragraph blocks and text structures. Each fragment knows its position in the document and has links to the next and previous fragment.

To speed up searches for character positions we have a cache containing the most recently used fragment and a vector of fragment positions. The combination allows to either find the location immediately and to do binary searches for the location of a given position at speed that decreases as log(N) where N is the number of fragments in the document.

However upon insertions or deletions in the piecetable we need to update every fragment downstream of the change. This operation is order (N) where N is the number of fragments.

For a typical 300 page document Tomas estimates we do about 16000 updates for each insertion/deletion event. That is every time the user presses a key to insert text we need to do 16,000 updates of our document.

A few years ago Joaquin Cuenca Abela developed a prototype Piece Table based on a balanced Red/Black tree structure where all operations (search,insert,delete) require Order(Log(N)) operations.

| Dont forget the possibility of optimizing away the color.

http://e98cuenc.free.fr/wordprocessor/piecetable.html

I estimate that the example only requires 30 operations for Joaquins Piece Table. This is one of the major performance improvements we can make to AbiWord.

| "Take cuencas AbiWord PT dll->rbtree refactoring a step further by rewriting it in a fractal prefetching b+-tree. Then wrap it all up and tie it in to the tree. Part of the wrapping entails maintaining the logical concept of a dll with getPrev/getNext wrappers."

| Note that when Schault described your attempts to go to a more complex, more unified data structure encompassing more of the relational placement of document content as moving in the wrong direction, he wasnt kidding. Note also that having such and such a structure does not necessarily mean creating it, it can also mean having it in a more abstract since, or being able by operative implementation to act as if the data was formally constructed in such a structure. For example, to use a previously brought up possibility, your memory model may emulate two arrays, but that doesnt mean one of them isnt a virtualization, like a virtual image thats a manifestation of but not a realization of its source, manifested by the action of performing operations.

| Another thing to consider is highly efficient property hashing. Tamlin can best explain how this helps, so Ill leave it to him to discuss it here.

| Dont forget that, as Cuenca amongst others has observed in the research youve cited, while optimizations such as highly efficient property hashing can be very helpful, their biggest boon is in the enabling of greater improvements in our layout code, which is absolutely atrocious in its growth factors.

==== The UpdateLayout methods in the fl_SectionLayout classes

==

Every time a user presses a key in AbiWord the PieceTable is modified and new ChangeRecord describing the change is created. These ChangeRecords are both saved in an undo stack in the AbiWord document and pointers to them are broadcast to every view attached to the document via the FL_DocListener class.

The change record has a pointer to fl_BlockLayout classes that contains the changed location. This fl_BlockLayout class is called via the appropriate method and updates itself and the physical runs (fp_Run classes) contained by the block.

It then calls a method called "format" which breaks text runs into lines which fit into the container that contains the block.

At end of every single edit, the method fl_DocLayout::updateLayout(..) is called.

This scans the entire document looking for blocks that need to be updated. We can speed things up enormously by making adding pointers to a local cache of blocks in each section that need to be formatted.

This would speed the process so that instead of requiring N operations for each keypress, it only requires O(1-3) operation.

==== BreakSection

==

This method in fb_ColumnBreaker assembles lines (fp_Line) and tables (fp_TableContainer) onto pages. It needs to iterate through an entire section after a line changes its position or height. The length of time this takes is Order(N) where N is the number of lines.

It might be possible to solve this order(N) problem for a number of cases using Joaquins Red/Black tree but doing so will be very tricky. Especially when taken in conjunction with the complexities of break tables across pages.

A less ambitious goal and one that would make Abiword appear to be faster, is to just update the pages from the beginning of the document (?) until just after the current on screen page during the keystroke processing, then finish the rest of the update during idle between keystrokes.

_Oops—I had trouble understanding Martins original paragraph and made some changes including inserting the phrase "from the beginning of the document" in an attempt to clarify. On further thought ;-), and pending clarification from Martin, I realize he might have meant soemthing like "from the insertion point (for the current change / keystroke)" -- Main.RandyKramer - 22 Mar 2005_

  • Note*: a while back we tried to format the layout only until the end of the screen. This caused huge problems (crashes) because the x,y coordinates of various layout elements get queried and used even when the elements are not on screen (not sure why). Leaving part of the layout update into idle time would have potentially similar consequences; this is probably the way to go, but it might be initially more disruptive than it appears.

==== Table Layout

==

The current table layout algorithm is adapted from the algorithm used by Gtk+. It is very robust amd handles tables nested to arbitary depth. Unfortunately it takes Order N^2 time to layout tables where is the number of cells. Ive implemented a few hacks to speed things up while typing for abiword-2.2 but the fundamental problem persists.

It may well be worth examining the algorithms used by the gecko layout engine which can position complex tables with thousands of cells extremely quickly.

==== Plugin Issues

==

This section describes some of the issues various plugins have, which are not directly bugs.

==== OpenDocument plugin

==

==== AbiCollab Collaboration tool

==

==== Contributors

==

Personal tools