Re: Spell checking (was Re: various questions)

Justin Bradford (justin@ukans.edu)
Wed, 22 Sep 1999 16:09:28 -0500 (CDT)

Next message: Shaw Terwilliger: "Re: Spell checking (was Re: various questions)"
Previous message: Decklin Foster: "Re: Suggestion."
In reply to: Paul Rohr: "Spell checking (was Re: various questions)"
Next in thread: Shaw Terwilliger: "Re: Spell checking (was Re: various questions)"
Reply: Shaw Terwilliger: "Re: Spell checking (was Re: various questions)"

> We'd love to have a cooler engine than ispell, and aspell's results sure
> look cool, but given all the platforms people want to run AbiWord on, the
> portability problem is a biggie. I suspect that the problems of generating
> and distributing aspell-format dictionaries for various languages pale in
> comparison to this.

I was just considering adding the algorithm (not the actual code) to the
ispell base. It's language-dependent (metaphone mapping), but that's
something we could make a configuration option.
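
To make the idea concrete, here is a minimal sketch of suggestion lookup
by sound-alike key, with the mapping passed in so it can be swapped per
language. Everything here is an assumption for illustration:
PhoneticIndex and crudePhoneticKey are hypothetical names, and the key
function is a crude English-biased stand-in, not real Metaphone and not
aspell's actual scoring.

```cpp
#include <cctype>
#include <cstddef>
#include <functional>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a Metaphone mapping: uppercase the word, keep
// the first letter, drop later vowels, and skip a letter that repeats the
// previously kept one. A real mapping would be language-specific.
std::string crudePhoneticKey(const std::string& word) {
    std::string key;
    for (std::size_t i = 0; i < word.size(); ++i) {
        unsigned char raw = static_cast<unsigned char>(word[i]);
        if (!std::isalpha(raw)) continue;
        char c = static_cast<char>(std::toupper(raw));
        bool vowel = (c == 'A' || c == 'E' || c == 'I' || c == 'O' || c == 'U');
        if (vowel && !key.empty()) continue;            // drop non-initial vowels
        if (!key.empty() && key.back() == c) continue;  // skip repeat of last kept letter
        key += c;
    }
    return key;
}

// Buckets dictionary words by phonetic key. The key function is injected,
// which is where a per-language configuration option would plug in.
class PhoneticIndex {
public:
    typedef std::function<std::string(const std::string&)> KeyFn;

    explicit PhoneticIndex(KeyFn keyFn) : keyFn_(std::move(keyFn)) {}

    void add(const std::string& word) {
        buckets_[keyFn_(word)].push_back(word);
    }

    // Returns every known word whose key matches the misspelled word's key.
    std::vector<std::string> suggest(const std::string& misspelled) const {
        std::map<std::string, std::vector<std::string> >::const_iterator it =
            buckets_.find(keyFn_(misspelled));
        if (it == buckets_.end()) return std::vector<std::string>();
        return it->second;
    }

private:
    KeyFn keyFn_;
    std::map<std::string, std::vector<std::string> > buckets_;
};
```

With this stand-in key, "recieve" and "receive" both map to the same
bucket, so one would be offered as a suggestion for the other. A real
engine would also rank the candidates, which this sketch does not attempt.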

Also, it would be easy to add glue for aspell (and other similar
libraries), which could be an optional build item; e.g., if the aspell
library is present, compile with that rather than our modified ispell.
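
A rough sketch of how that build-time choice might look, assuming a
HAVE_ASPELL macro set by the build system when the library is found. The
macro, the SpellBackend interface, and both class names are hypothetical.
The aspell class is only a placeholder: the actual library calls are left
out rather than guessed at.

```cpp
#include <memory>
#include <set>
#include <string>

// Hypothetical interface that every spelling engine would implement.
class SpellBackend {
public:
    virtual ~SpellBackend() {}
    virtual bool check(const std::string& word) const = 0;
    virtual std::string name() const = 0;
};

// Stand-in for the modified ispell engine: a plain set lookup.
class IspellBackend : public SpellBackend {
public:
    explicit IspellBackend(const std::set<std::string>& words) : words_(words) {}
    bool check(const std::string& word) const override {
        return words_.count(word) != 0;
    }
    std::string name() const override { return "ispell"; }
private:
    std::set<std::string> words_;
};

#ifdef HAVE_ASPELL
// Placeholder only: real glue would forward check() to the aspell library.
// Here it simply delegates to the set lookup so the sketch still compiles.
class AspellBackend : public SpellBackend {
public:
    explicit AspellBackend(const std::set<std::string>& words) : fallback_(words) {}
    bool check(const std::string& word) const override {
        return fallback_.check(word);
    }
    std::string name() const override { return "aspell"; }
private:
    IspellBackend fallback_;
};
#endif

// The only place that knows which engine was compiled in.
std::unique_ptr<SpellBackend> createSpellBackend(const std::set<std::string>& words) {
#ifdef HAVE_ASPELL
    return std::unique_ptr<SpellBackend>(new AspellBackend(words));
#else
    return std::unique_ptr<SpellBackend>(new IspellBackend(words));
#endif
}
```

The rest of the application would only ever see SpellBackend, so adding
glue for another library would mean one more class and one more branch in
the factory.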

Anyway, I've been stalled by some other work for a bit, but the dialog
should be done soon.

> Since ignore lists tend to be small, actually using ispell to manage them
> seems like rampant overkill. A trivial in-memory representation would be to
> just store the words in a per-document UT_AlphaHashTable. Then if we need
> to persist that information in the file format -- does Word do this, BTW? --
> we could serialize that word list in a header section of the document.

Yeah, that's where I was planning to put them someday. As for storing
them in the file, I am not aware of Word doing that. It was just an idea; I
assumed if someone said to ignore all instances of a word, they'd like it
to not be squiggled when they reload the doc. Perhaps it would be better
(UI-wise, anyway) to have them click "add to document's dictionary"
rather than "ignore all" before it's preserved across sessions.
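
One way to picture that split, as a sketch only: two per-document word
sets, where both suppress the squiggle but only one is handed to the file
writer. DocumentWordLists and its method names are hypothetical, and
std::set stands in for UT_AlphaHashTable, whose interface is not shown
here.

```cpp
#include <set>
#include <string>

// Hypothetical per-document holder for the two kinds of accepted words.
class DocumentWordLists {
public:
    // "Ignore all": accepted for this session only.
    void ignoreAll(const std::string& word) { sessionIgnores_.insert(word); }

    // "Add to document's dictionary": accepted and saved with the document.
    void addToDocumentDictionary(const std::string& word) {
        documentWords_.insert(word);
    }

    // True if the word should not be squiggled, whichever list it is in.
    bool isAccepted(const std::string& word) const {
        return sessionIgnores_.count(word) != 0 ||
               documentWords_.count(word) != 0;
    }

    // Only these words would be written into the document's header section.
    const std::set<std::string>& persistentWords() const {
        return documentWords_;
    }

private:
    std::set<std::string> sessionIgnores_;
    std::set<std::string> documentWords_;
};
```

On reload, a document would come back with only persistentWords()
populated, so anything the user merely ignored would be flagged again.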

> Likewise, personal dictionaries also tend to be far, far smaller than ispell
> dictionaries -- my idiolect is a *lot* smaller than the rest of the English
> language :-) -- so a similar approach should work there too. In this case,
> the UT_AlphaHashTable would be app-wide, and could easily persist to a
> simple text file with one word per line. In fact, iterative calls to
> UT_AlphaHashTable::getNthEntryAlpha() would even ensure that the resulting
> file is alpha-sorted, which is pretty nice.

Ok, that should work. However, I think some spell checking systems handle
the user-dictionary side of things, too. It might be nice to handle that,
but I say we revisit the problem if it ever comes up.

Justin
