Thoughts on text styles, Show Paragraphs, and internal representation

Subject: Thoughts on text styles, Show Paragraphs, and internal representation
From: Jesper Skov
Date: Sat Jun 24 2000 - 15:07:26 CDT

Hi there

This mail is due to some concerns I've had for some time now over the
internal representation of the formatter (Runs/Blocks).

 o I don't know enough of how the fmt and ptbl stuff interacts. There
   may be bad assumptions of how things work/can be made to work.

 o I very much believe in arguing by providing working patches,
   but I live a life restricted to a mere 24 hours a day, so I need to
   know there's at least a reasonable chance something will work (and
   be accepted) before I want to spend time on it. So this is "just"
   talk for now. RFC, more precisely. I'll appreciate constructive
   comments, but stuff that just shoots this down (with arguments) is
   _also_ useful :)


OK, first my motivation, so y'all know what's driving this. I believe
several issues are affected by this, which makes it more than a
trivial hack (and quite possibly beyond me):

 o Code robustness - the work I did recently on the cursor location
   code convinced me that the current mix of Runs which can contain
   the point (cursor location) and Runs which cannot is a bad thing.
   My changes may have fixed some problems, but it was very hard to
   get the code to work as well as it does now (still not perfect,
   actually). This means the code is fragile, and is bound to break
   next time someone sneezes anywhere near it.

 o Paul Rohr brought up the remaining implementation details of an old
   POW, also pointing out some of the problems with zero-length Runs.

 o Show Paragraphs (show codes?) [I'm not much of a WP user myself, so
   this observation may not be correct]: if I enable show codes I
   would expect to see a marker between Runs of different text
   styles. E.g., <bold> or <font 20>. Currently, this information
   lives in the Runs, not as explicit elements in a block, and thus
   cannot easily be displayed. I feel that may become a problem (or
   maybe I'm just a control freak, I don't know).

   This introduces two limitations (that I can think of):

   1) When typing we always inherit the text style from the text to
      the left. So if entering text at the border between two styles,
      and you want the style to the right, you need to manually change
      it before you start to type.

      I would love to be able to enable show codes, move the cursor
      right over the formatting code(s) and start typing with the
      style at the right. [I believe this was possible in WordPerfect
      5, which is the last time I used a WP ;]

   2) There is no way to distinguish between style changes inserted
      automatically and those inserted explicitly by the user. The
      latter type is when the user changes font size, for
      example. The former could be when 'Insert Symbol' changes
      font. This can cause problems, as described in Bug 903 (see )

      If there were explicit formatting style codes in separate Runs
      in the internal representation, a flag could tell whether the
      cursor (and thus what's typed) should inherit the Run's style or
      go back to a previous Run for that information.
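
A minimal sketch of that flag idea, under loud assumptions: the `Run`
struct, the `inheritable` flag and `styleForTyping` are hypothetical names
made up for illustration and are not AbiWord's real classes. It models a
block as a flat list of Runs and walks left from the cursor to the nearest
FMT Run whose style the user asked for.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical model: a block is a flat list of Runs.
enum class RunKind { Content, Fmt };

struct Run {
    RunKind kind;
    std::string payload;  // text for Content Runs, style name for Fmt Runs
    bool inheritable;     // Fmt only: false for automatic changes
                          // (e.g. the font switch done by Insert Symbol)
};

// Style that newly typed text should take when the cursor sits in front
// of runs[pos]. Only Runs to the left of the cursor are considered, and
// automatic (non-inheritable) FMT Runs are skipped.
std::string styleForTyping(const std::vector<Run>& runs, std::size_t pos,
                           const std::string& fallback) {
    std::size_t i = std::min(pos, runs.size());
    while (i > 0) {
        --i;
        const Run& r = runs[i];
        if (r.kind == RunKind::Fmt && r.inheritable) return r.payload;
    }
    return fallback;  // no user-set style to the left of the cursor
}
```

In this model, moving the cursor right over a FMT Run changes `pos`, so
the same lookup also picks up the style to the right once the code has
been stepped over.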

OK, that was fairly structured, I hope. Still reading? Now I'm going
to pour my brain out of one ear (the left): implementation thoughts,
random comments and whatnot. The ride may get a bit rough...

Random thought #1
I'd like a preference to switch between showing codes as magic
graphical characters which hardcore WP users may understand, and
descriptive strings which the rest of the world understands
(e.g. <newline>, <linebreak>, <EOD>, ...).

Avoiding zero-length content Runs

All break Runs can contain point. Computed cursor position depends on
IP offset and hide/show codes mode:

 hide codes: nothing to render, impossible to select, but still
             possible to place cursor just before the break (i.e., at
             the offset of the Run).
 show codes: renders whatever is configured for that break. Can be
             selected (and thus deleted). Cursor can move past
             rendered graphics/text. In other words, the break is
             editable like any other text (and the world is a better
             place for some of us :)
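
The two modes above could be captured in something as small as the
following sketch. `BreakView` and `viewBreak` are hypothetical names, and
counting "caret stops" is just one assumed way to express where the
cursor may rest.

```cpp
#include <string>

// Hypothetical summary of how one break Run behaves in a given mode.
struct BreakView {
    std::string rendered;  // what gets drawn for the break
    bool selectable;       // can it be selected (and thus deleted)?
    int caretStops;        // 1 = only just before the break,
                           // 2 = before and after the rendered code
};

BreakView viewBreak(const std::string& label, bool showCodes) {
    if (!showCodes) {
        // hide codes: nothing drawn, not selectable, cursor only before it
        return {"", false, 1};
    }
    // show codes: draw the configured label, editable like any other text
    return {label, true, 2};
}
```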

Avoiding zero-length formatting Runs
<only me to blame for this one>

Currently FMT Runs have a zero length. Changing this will:

 o Allow display (and editing) of FMT codes [same hide/show codes
   behavior as above]

 o Avoid multiple Runs in a block having the same offset. It bothers
   me, even though it may be benign.

 o Remove the need for a fancy cursor location search algorithm (all
   Runs can contain point, so things get _much_ simpler).
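
To illustrate why the search gets simpler, here is a sketch that assumes
every Run has a length of at least 1 and that Runs are stored sorted by
offset. The `Run` struct and `runContaining` are hypothetical, not the
existing AbiWord code. With no two Runs sharing an offset, an ordinary
binary search finds the single Run that contains the point.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical Run: covers the offsets [offset, offset + length).
// Assumption: length >= 1 for every Run, including FMT and break Runs.
struct Run {
    std::size_t offset;
    std::size_t length;
};

// Returns the index of the Run containing `point`, or runs.size() if the
// point lies outside the block. `runs` must be sorted by offset.
std::size_t runContaining(const std::vector<Run>& runs, std::size_t point) {
    // First Run that starts strictly after the point...
    auto it = std::upper_bound(
        runs.begin(), runs.end(), point,
        [](std::size_t p, const Run& r) { return p < r.offset; });
    if (it == runs.begin()) return runs.size();
    --it;  // ...so the Run before it is the only candidate.
    if (point >= it->offset + it->length) return runs.size();
    return static_cast<std::size_t>(it - runs.begin());
}
```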


 o How does/should this interact with the ptbl hierarchy? Is it
   possible to keep this just in the fmt hierarchy? That is my
   understanding (and it is the weakest link in my argument).

   The FMT Runs need to be inserted at the proper places. This can be
   handled by the block insert functions, if the info only lives in fmt
   for the sake of (optional) displaying.

 o As I sketch things, "<bold><font40>" will be atomic. The user will
   not be able to delete "<bold>". Instead, the cursor should be
   placed after the code (or the code should be selected) and the
   appropriate (GUI) controls should be adjusted. Would that be
   frustrating, I wonder?

   Making it non-atomic, you have to handle insertion of text between
   the formatting elements (i.e., create a new FMT, so you get
   "<bold>hello world<font 40>"). Power users would probably expect
   this behavior, but I don't think making the FMT atomic is

 o If non-atomic: Should stray FMT codes be left alone? Or always
   cleaned up? I believe the latter is the right approach, since
   they'll die when saving the document anyway.

   [by this, I mean something like this: <bold><bold><font20><font40>X
    which results in <bold><font40>X]
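
The cleanup in that example could look roughly like this. It is only a
sketch: `FmtCode` and `collapseFmtCodes` are hypothetical names, and
modelling a code as a property/value pair where the last value wins is an
assumption about what "cleaned up" should mean.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Hypothetical formatting code, e.g. {"bold", "on"} or {"font", "40"}.
struct FmtCode {
    std::string property;
    std::string value;
};

// Collapses a sequence of adjacent FMT codes so that each property
// appears once, keeping the order of first appearance and the value of
// the last code that set it.
std::vector<FmtCode> collapseFmtCodes(const std::vector<FmtCode>& codes) {
    std::vector<FmtCode> out;
    for (const FmtCode& c : codes) {
        auto it = std::find_if(
            out.begin(), out.end(),
            [&c](const FmtCode& o) { return o.property == c.property; });
        if (it != out.end()) {
            it->value = c.value;  // a later code overrides an earlier one
        } else {
            out.push_back(c);
        }
    }
    return out;
}
```

Fed the codes for <bold><bold><font20><font40>, this leaves one bold code
and one font code with value 40.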

 Added Bonus! (I think :)
 Instead of content Runs containing formatting information, they refer
 to the relevant FMT Run.

 Internal representation becomes a little leaner (from a quick naive
 look, I think these get replaced with a single pointer: m_fPosition,
 m_pFont, m_pFontLayout, m_iAscent, m_iDescent, m_iHeight,
 m_iAscentLayoutUnits, m_iDescentLayoutUnits, m_iHeightLayoutUnits)

 And juggling with the Runs becomes a bit cheaper CPU wise +
 lookupProperties only needs doing once for an "ideal" Run, regardless
 of whether it has been properly coalesced or not.
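
 A sketch of what that could look like, with made-up types: `FmtRun`,
 `ContentRun` and `splitRun` are hypothetical, the field names only echo
 a few of the members listed above, and computing height as ascent plus
 descent is an assumption. The point is that splitting (or coalescing)
 content Runs only copies a pointer and never repeats the property lookup.

```cpp
#include <algorithm>
#include <cstddef>
#include <memory>
#include <string>
#include <utility>

// Hypothetical FMT Run: resolved once, then shared by content Runs.
struct FmtRun {
    std::string font;
    int ascent;
    int descent;
    int height() const { return ascent + descent; }  // assumed relation
};

// Hypothetical content Run: text plus one pointer to its formatting.
struct ContentRun {
    std::string text;
    std::shared_ptr<const FmtRun> fmt;
};

// Splits a content Run at character index `at` (clamped to the text
// length). Both halves refer to the same FmtRun; nothing is recomputed.
std::pair<ContentRun, ContentRun> splitRun(const ContentRun& run,
                                           std::size_t at) {
    const std::size_t cut = std::min(at, run.text.size());
    return {ContentRun{run.text.substr(0, cut), run.fmt},
            ContentRun{run.text.substr(cut), run.fmt}};
}
```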

 o I don't know if this holds water. I need people with a better
   understanding of AbiWord (and WP in general) to think it through.
   And other people shouldn't get their hopes up - I suspect this baby
   leaks water like a waterfall.

 o It'll take time to implement. Possibly more than is available
   before 1.0.

 o Show _codes_ would work. Not just show _paragraphs_.

 o There would be no magic zero-length Runs, nor any Runs which cannot
   contain the point. Thus the currently fragile cursor position
   computation code would become simpler and more robust.

 o Show paragraphs POW would be completed.

 o Makes it possible to have an "atomic" combination of FMT and
   content Runs such as used by 'Insert Symbol' without messing up the
   text typed in by the user.

Hope I got at least a few brain cells firing maniacally :)

