AbiCollab

From AbiWiki

Revision as of 02:53, 17 October 2007 by Maintenance script (Talk)



AbiCollab Collaboration tool


AbiCollab is a feature that lets different users type directly into a remote user's document and make all the formatting changes normally associated with word processing. This allows users to collaborate directly and immediately on the creation of documents. For a broad overview see this document in the Gnome Journal.

The fundamental thing we have to do is make sure the AbiWord documents of all users remain in sync.

The other vital issue is to make AbiWord behave in the ways users expect. When a user types at a particular location in her document, she expects the text to go where the word processor caret was when she pressed the key. In addition, when she clicks undo, she expects her most recent change to be undone.

As you can see from the following, without explicit algorithms to handle internet lag and remote users editing your document, both of these issues will destroy the utility of AbiCollab.

[[MarkGilbertIdeas|Ideas on AbiCollab from Mark Gilbert]]

The problem of "internet lag"

The problem is that, since users can independently edit their own copies of the document, it is very easy for the copies to get out of synchronization.

The most straightforward problem is the internet lag that occurs when two users simply type in different regions of the document. Suppose we have two users, Bob and Jane, who are sharing an AbiCollab session. As they type they create ChangeRecords, which are transmitted to each other over the internet. A ChangeRecord describes exactly where and how the PieceTable is changed.

The fundamental problem is that Jane's and Bob's documents could get out of sync, because it takes a finite time for changes from one user to propagate over the network to the other's computer.

Imagine, for example, that Bob adds a word or character at the end of the document. This forms a ChangeRecord which then travels over the network to Jane. Due to network latency, this may take some time. Meanwhile, Jane also adds a word or character, somewhere before the end of the document, which forms a ChangeRecord of her own. Jane's insertion pushes all the characters after it to larger positions, but Bob's document is not aware of this. Consequently, when Bob's CR arrives at Jane's document, the position it carries is no longer the end of the document: it asks for a character to be inserted one or more places before the end.
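The scenario above can be reproduced with plain strings. This is only an illustrative sketch: it models the document as a Python string rather than a PieceTable, and the `insert` helper and the sample text are assumptions made for the example, not AbiWord code.

```python
def insert(text: str, pos: int, s: str) -> str:
    """Insert s into text at character offset pos."""
    return text[:pos] + s + text[pos:]

base = "Hello world"

# Bob appends "!" at the end of his copy (offset 11).
bob_pos, bob_text = len(base), "!"
bob_doc = insert(base, bob_pos, bob_text)

# Before Bob's change arrives, Jane inserts "big " before "world" (offset 6).
jane_pos, jane_text = 6, "big "
jane_doc = insert(base, jane_pos, jane_text)

# Each side now applies the other's change at the position it was sent with,
# without any correction for lag.
bob_doc = insert(bob_doc, jane_pos, jane_text)
jane_doc = insert(jane_doc, bob_pos, bob_text)

print(bob_doc)   # Hello big world!
print(jane_doc)  # Hello big w!orld
```

Bob's copy is fine because Jane's change was ahead of his, but Jane's copy puts the "!" four characters too early, which is exactly the drift the next section corrects.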

Note that the algorithm described in the Gnome Journal article is not what we use now.

The solution for internet lag

We solve this fundamental problem as follows. Every ChangeRecord gets a unique number that is incremented as each one is created. A proper document thus consists of ChangeRecords numbered 1, 2, 3, 4, 5, for example. Jane's typing then creates a new ChangeRecord numbered 6.

In addition, for every ChangeRecord (CR) it receives, AbiCollab now calculates the size of the change in the document and the position at which that size change takes place. For example, inserting one character has a size difference of +1, deleting one character has a size difference of -1, deleting 10 characters has a size difference of -10, and so on.

This information, together with the ChangeRecord number, is both stored locally and broadcast with the rest of the packet to the remote AbiWord. In addition, the local AbiWord sends the remote AbiWord the number of the last CR it received from the remote AbiWord.
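The bookkeeping attached to each packet can be pictured as a small record. The sketch below is an assumption about shape only: the `Packet` class, its field names and the two helper functions are hypothetical and are not taken from the AbiCollab source.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Packet:
    cr_number: int        # sender's own running CR number
    pos: int              # document position where the change applies
    size_delta: int       # +n for inserting n characters, -n for deleting n
    last_remote_cr: int   # number of the last CR received from the other side

def insert_packet(cr_number: int, pos: int, text: str, last_remote_cr: int) -> Packet:
    """Describe an insertion of text at pos."""
    return Packet(cr_number, pos, len(text), last_remote_cr)

def delete_packet(cr_number: int, pos: int, length: int, last_remote_cr: int) -> Packet:
    """Describe a deletion of length characters starting at pos."""
    return Packet(cr_number, pos, -length, last_remote_cr)

print(insert_packet(6, 42, "a", 3).size_delta)   # 1
print(delete_packet(7, 42, 10, 3).size_delta)    # -10
```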

When the local AbiWord receives a CR packet from the remote AbiWord, it looks at the last CR number the remote AbiWord says it received from the local one. (Remember that each side sends this number out with its packets.)

This number tells the local AbiWord what state the remote AbiWord thought the local document was in when the remote change was made.

The local AbiWord can then look back through the list of CRs it has sent, with their size changes and positions. It adjusts the point of application of the remote AbiWord's change to take account of the local changes made without the remote AbiWord's knowledge.

So Jane's AbiWord also sends the number of the last CR it received from Bob along with each CR. Bob then checks whether Jane had received all the CRs he sent her. If she hadn't, he scans through the CRs he sent but she had not yet received and adjusts the insertion point appropriately. For each ChangeRecord we can calculate how the position of an insertion point ahead of the change would shift, and adjust the position of the insertion point for the received ChangeRecord packet accordingly.
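A minimal sketch of this adjustment follows. The names `SentCR` and `adjust_position` are hypothetical, and two details are assumptions of the sketch rather than statements about AbiCollab: only local changes strictly before the incoming position shift it, and a deletion that swallows the incoming position clamps it to the start of the deleted range. Changes at exactly the same position are the collision case discussed later.

```python
from dataclasses import dataclass
from typing import List

@dataclass
class SentCR:
    number: int   # local CR number
    pos: int      # document position where the local change was applied
    delta: int    # +n for an insert of n characters, -n for a delete

def adjust_position(remote_pos: int, last_seen: int, sent_log: List[SentCR]) -> int:
    """Shift remote_pos to account for local CRs the remote side had not seen.

    last_seen is the number of the last local CR the remote side reports
    having received when it made its change.
    """
    pos = remote_pos
    for cr in sent_log:                  # oldest first
        if cr.number <= last_seen:
            continue                     # remote already knew about this one
        if cr.pos < pos:
            # A change ahead of the insertion point moves it by the size
            # difference, but never to before the change itself.
            pos = max(cr.pos, pos + cr.delta)
    return pos

# A local insert of 4 characters at offset 6 that the remote had not seen
# moves a remote insertion point from 11 to 15.
log = [SentCR(number=1, pos=6, delta=4)]
print(adjust_position(11, last_seen=0, sent_log=log))  # 15
print(adjust_position(11, last_seen=1, sent_log=log))  # 11
```

Each side only needs its own log of sent CRs and the number echoed back by the peer, which is why the scheme relies on pair-wise connections.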

This works perfectly for pair-wise connected users as they each know exactly what they sent and received.

Accordingly, the geometry of a large AbiCollab collaboration consists of a central hub connected to many spokes, i.e. there is one master AbiCollab session and all other sessions connect to the master. In this way every user can be guaranteed to be synchronized with the master and hence with everyone else.

This technique appears to work well and has been tested with 5 simultaneous users.

This technique forms the basis of corrections for internet lag. Next I describe how we handle undos and complex operations on the PieceTable (like inserting a table).

Problems with undo

The issue here is that every change to the document is recorded in the local user's undo stack in the order the changes are received. Before version 2.5.0, the code made no distinction between edits created by the local user and edits by remote users. Consequently, pressing undo could originally undo a change made by a remote user in a different part of the document rather than the most recent change by the local user.

To understand how this was fixed, it is first necessary to explain the original method used by AbiWord for undo/redo.

Right from the start, AbiWord was designed to allow infinite undo/redo within a particular editing session.

This is done by recording a stack of the ChangeRecords created by the user as they edit the document.

| ChangeRecord 0 | |
| ChangeRecord 1 | |
| ChangeRecord 2 | |
| ChangeRecord 3 | |
| ChangeRecord 4 | |
| ChangeRecord 5 | <-undoPos |

This is what the stack looks like once ChangeRecords 0 through 5 have been recorded. If the user now presses "undo", ChangeRecord 5 is returned and its "inverse", a ChangeRecord that undoes the effect of the change, is computed and applied. For example, if ChangeRecord 5 was "insertSpan e at position 5", the inverse would be "deleteSpan e at position 5".

The stack now looks like:

| ChangeRecord 0 | |
| ChangeRecord 1 | |
| ChangeRecord 2 | |
| ChangeRecord 3 | |
| ChangeRecord 4 | <-undoPos |
| ChangeRecord 5 | |

Pressing undo again will read off ChangeRecord 4 and move undoPos one position further down. This can continue until undoPos is before ChangeRecord 0.

Having undone some changes, the user now has the option to "redo" the changes. Doing a "Redo" simply increments the undoPos, and then executes the ChangeRecord found at the new undoPos.

Now suppose the undo stack looks like the following after some set of undo/redos:

| ChangeRecord 0 | |
| ChangeRecord 1 | |
| ChangeRecord 2 | <-undoPos |
| ChangeRecord 3 | |
| ChangeRecord 4 | |
| ChangeRecord 5 | |

If the user makes some new change to the document that does not involve undo/redo, the stack ahead of undoPos is blown away and replaced by the new change records, with undoPos pointing to the most recent ChangeRecord. So if two new ChangeRecords are created, the stack would look like:

| ChangeRecord 0 | |
| ChangeRecord 1 | |
| ChangeRecord 2 | |
| ChangeRecord 6 | |
| ChangeRecord 7 | <-undoPos |
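The single-user undo/redo behaviour described above can be sketched in a few lines. This is an illustration, not AbiWord's implementation: the `UndoStack` class, the `invert` helper and the tuple representation of a ChangeRecord as `(operation, text, position)` are all assumptions of the example.

```python
def invert(cr):
    """Return the ChangeRecord that undoes cr (insertSpan <-> deleteSpan)."""
    op, text, pos = cr
    return ("deleteSpan" if op == "insertSpan" else "insertSpan", text, pos)

class UndoStack:
    def __init__(self):
        self.records = []
        self.undo_pos = -1  # index of the most recent applied record; -1 means none

    def record(self, cr):
        """Store a fresh user change, discarding anything ahead of undo_pos."""
        del self.records[self.undo_pos + 1:]
        self.records.append(cr)
        self.undo_pos = len(self.records) - 1

    def undo(self):
        """Return the inverse of the record at undo_pos and step down, or None."""
        if self.undo_pos < 0:
            return None
        cr = self.records[self.undo_pos]
        self.undo_pos -= 1
        return invert(cr)

    def redo(self):
        """Step up and return the record to re-execute, or None."""
        if self.undo_pos + 1 >= len(self.records):
            return None
        self.undo_pos += 1
        return self.records[self.undo_pos]

stack = UndoStack()
stack.record(("insertSpan", "e", 5))
print(stack.undo())  # ('deleteSpan', 'e', 5)
print(stack.redo())  # ('insertSpan', 'e', 5)
```

Keeping undone records in the list and moving only an index is what makes redo possible; they are dropped only when a fresh change is recorded.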

Changes needed for abicollab

Ok now if we allow a remote user to type in our document our undo stack would resemble this:

| local | ChangeRecord 0 | pos | len | | |
| local | ChangeRecord 1 | pos | len | | |
| local | ChangeRecord 2 | pos | len | | |
| local | ChangeRecord 3 | pos | len | | |
| remote | ChangeRecord 4 | pos | len | | |
| remote | ChangeRecord 5 | pos | len | | |
| remote | ChangeRecord 6 | pos | len | | |
| remote | ChangeRecord 7 | pos | len | | |
| local | ChangeRecord 8 | pos | len | | |
| local | ChangeRecord 9 | pos | len | | |
| local | ChangeRecord 10 | pos | len | | |
| remote | ChangeRecord 11 | pos | len | <-undoPos | <-offset |

So now, if the user presses "undo", the first ChangeRecord it sees is from the remote user. We do not undo the remote user's ChangeRecords; instead we decrement "offset" one position and read off ChangeRecord 10. However, before we compute the inverse and apply the change to the document, we first have to check whether ChangeRecord 11 from the remote user was applied at a location before that of ChangeRecord 10. To do this we scan from offset back up to undoPos and examine the intervening remote CRs. We look to see if:

  • The remote CR was applied before the position of the local change. If so, we adjust the position of application of the local CR to take account of this. This operation is applied to both the local document and all the remote documents.
  • Any remote CR applied to the document after the local CR operates over a document range that overlaps the local CR. If it does, we remove the local CR from the stack as well as all prior CRs. We do this because we cannot determine what the effect of the remote CR will be without undoing all the remote user's CRs.
  • The effect is to disallow undos through a remote user's changes to the document. The local user is free to manually change any portion of the document she wishes, but she can't undo a remote user's changes.
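The checks in the list above can be sketched as follows. This is a simplified illustration under stated assumptions: every entry is treated as an insertion of `length` characters at `pos`, a remote insert at exactly the local position counts as "before", and the `Entry` class and `undo_local` function are hypothetical names, not AbiCollab code.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

@dataclass
class Entry:
    origin: str   # "local" or "remote"
    pos: int      # position where the change was applied
    length: int   # number of characters inserted (inserts only in this sketch)

def undo_local(stack: List[Entry], offset: int) -> Optional[Tuple[int, int]]:
    """Find the local entry to undo, starting at index offset.

    Returns (index, adjusted_pos), or None if no local entry can be undone.
    When a later remote change overlaps the local one, the local entry and
    everything before it are dropped from the stack.
    """
    # Walk down from offset, skipping remote entries.
    i = offset
    while i >= 0 and stack[i].origin != "local":
        i -= 1
    if i < 0:
        return None
    local = stack[i]
    pos = local.pos
    # Examine the remote changes applied after the local one, oldest first.
    for later in stack[i + 1:]:
        if later.origin != "remote":
            continue
        if later.pos <= pos:
            pos += later.length            # remote change was ahead: shift
        elif later.pos < pos + local.length:
            del stack[:i + 1]              # overlap: undo history is lost
            return None
    return i, pos

stack = [Entry("local", 10, 3), Entry("remote", 2, 5)]
print(undo_local(stack, offset=1))  # (0, 15)
```

The inverse of the local change would then be applied at the adjusted position; that step is left out here.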

Redos follow the same pattern as the old method. If offset <> undoPos, offset is first moved one position towards undoPos. If it finds a local CR, it scans through the remote CRs until it reaches the top of the undo stack. As it scans it looks for the same types of CRs as undo does. If a remote CR is entirely before the local CR, the point of application of the local CR is adjusted. If a remote CR overlaps the local CR, the local CR is discarded along with all the later CRs in the stack.

If the local user makes a manual change after some undos, all higher CRs are discarded and the new top of the stack is the most recent local CR.

Editing another user's text

AbiCollab does not prevent any user from making changes to another user's text. The issue with doing so is that the remote user will not be able to "undo" the change, although they can still change the text manually as they wish. The local user is in exactly the same position: she cannot undo a remote user's changes and will find all undo history prior to the remote user's changes lost.

For this reason, and for sociological reasons, it will probably be wise to check with the remote user before making changes to their part of the document.

Handling GLOBs

Many operations in AbiWord require stringing together a collection of individual changes. These must be carried out without interruption and, when being undone, must also be undone without interruption. To handle this, AbiWord implements "GLOBs", which enclose such a set of contiguous changes within special "GLOB" ChangeRecords.

AbiCollab enables GLOBs to be propagated to remote documents by recognizing the opening BeginGLOB CR and then collecting all the CRs until the matching EndGlob is found. It then broadcasts this collection of packets, encased in their GLOB markers, to remote users. When AbiCollab sees an incoming packet starting with a GLOB, it processes all the enclosed CRs just as if they had been initiated in the remote document. Because AbiWord is a single-threaded application, all local changes are frozen until the document has finished processing all the packets in the GLOB.
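The collecting step can be sketched as a function that turns a flat stream of CRs into packets. The string markers below stand in for the real GLOB ChangeRecords, and the `batch_globs` name and the depth counter for nested GLOBs are assumptions of this example rather than a description of the AbiCollab code.

```python
BEGIN_GLOB = "BeginGLOB"   # stand-in for the opening GLOB ChangeRecord
END_GLOB = "EndGlob"       # stand-in for the matching closing ChangeRecord

def batch_globs(change_records):
    """Group CRs into packets so that everything inside a GLOB travels together.

    CRs outside any GLOB become one-element packets. A GLOB that is still
    open when the input ends is held back and not returned.
    """
    packets, current, depth = [], [], 0
    for cr in change_records:
        current.append(cr)
        if cr == BEGIN_GLOB:
            depth += 1
        elif cr == END_GLOB:
            depth -= 1
        if depth == 0:
            packets.append(current)   # complete unit: ready to broadcast
            current = []
    return packets

stream = ["type a", BEGIN_GLOB, "insert table", "insert cell", END_GLOB, "type b"]
for packet in batch_globs(stream):
    print(packet)
```

The receiving side would apply each packet in one go, which matches the requirement that a GLOB is never interleaved with other changes.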

Handling "collisions"

It is possible that users will make changes to overlapping regions of the document without being aware that the remote user has also made a change there. Imagine what would happen if Alice and Bob went to the very end of the document and simultaneously Alice pressed "a" and Bob pressed "b". Which character goes first? The answer is undefined.

This is an example of a collision. The signature of a collision is a CR from a remote user that overlaps a region changed by a local CR which the remote user had not yet received when sending his own CR.

We detect this by scanning back through the change records sent by the local user, as far as the last CR the remote user had received before he sent his CR. If any local CRs in this range overlap the current remote CR, we have detected a collision.
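A minimal sketch of this test is shown below. The `CR` class and both function names are hypothetical, and the overlap rule is an assumption of the sketch: each change is taken to cover the range from `pos` to `pos + length`, and ranges that merely touch are conservatively counted as overlapping, so that two inserts at the same position collide.

```python
from dataclasses import dataclass
from typing import List

@dataclass
class CR:
    number: int   # sender's running CR number
    pos: int      # position where the change applies
    length: int   # number of characters affected

def overlaps(a: CR, b: CR) -> bool:
    """True if the ranges [pos, pos + length] of a and b touch or intersect."""
    return a.pos <= b.pos + b.length and b.pos <= a.pos + a.length

def detect_collision(remote: CR, last_seen: int, sent_log: List[CR]) -> bool:
    """Check the local CRs the remote side had not seen against its CR.

    last_seen is the number of the last local CR the remote user had
    received when he sent `remote`.
    """
    return any(cr.number > last_seen and overlaps(cr, remote) for cr in sent_log)

# Alice typed "a" at offset 20 (local CR 2); Bob's "b" at offset 20 was sent
# when he had only seen local CR 1.
sent = [CR(1, 5, 1), CR(2, 20, 1)]
bob = CR(7, 20, 1)
print(detect_collision(bob, last_seen=1, sent_log=sent))  # True
print(detect_collision(bob, last_seen=2, sent_log=sent))  # False
```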

When such a collision is detected, the local side does not apply the remote CR but marks it, and no further CRs from the remote user are accepted until the inverse of the colliding CR is detected.

Locally, all changes to the local document back to and including the overlapping CR are undone. These undo CRs are broadcast to the remote user, so eventually the inverse of the colliding CR will be sent to the remote user. Note that the remote user will also detect a collision and perform the same set of undos, eventually generating the inverse of its own overlapping CR.

When the inverse of the overlapping CR is detected, normal operation is resumed.

Reset - for when it all goes horribly wrong

At this point in time (January 27th, 2007), with limited testing, AbiCollab appears to work very well. However, it is likely that we have not foreseen all use cases, and hence things will almost certainly go wrong and documents will end up out of synchronization. For this reason, we will allow the Master document at the center of the hub to send out a "reset" signal together with its local version of the document. All the remote users' documents will then be reset to this "Master" version.

We hope that this "reset" feature will not be needed often, and we will improve the algorithm to deal with the cases where it is.


[[AbiWordDevelopment|Back to AbiWord development]]
