From: Andrew Dunbar (hippietrail@yahoo.com)
Date: Thu May 02 2002 - 00:01:25 EDT
--- Jordi Mas <jmas@softcatala.org> wrote:
> Hi guys,
>
> Inside the file abi\src\af\util\xp\ut_misc.cpp there
> is a struct called
> "s_word_delim".
>
> In order to make the Catalan spell checker work
> properly with AbiWord
> we need to hack this structure to include the "·"
> character (not recognized
> right now). Also, we need to hack
> UT_isWordDelimiter() because
> the '-' character can be part of a word in Catalan
> (e.g. copiar-lo).
The dash can be part of a word in English and French
too. I wonder why it's not already there.
> In my opinion, all these settings should be locale
> sensitive and we should
> move them from the code into an external file where
> we define all these
> settings so they can be easily modified for every
> language.
>
> What do you think, guys? If we agree on how to do
> this, I can make the
> changes myself.
Actually it's a harder problem than this. In a
multilingual context you need a function to find word
boundaries since some languages (Thai, Khmer,
Japanese)
make spaces between words either optional or illegal.
So we need to call a function, and this function
needs to be able to call a language-specific function.
Most languages (or the default) can then use a single
function which, in turn, can use the "word delimiter"
method which, as you point out, needs to be
extensible.
A perfect little case for OO (:
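To make the idea concrete, here is a rough C++ sketch of that layering. Every name in it (WordBreaker, DelimiterWordBreaker, makeWordBreaker, the "joiner" set) is made up for illustration and is not existing AbiWord code, and the delimiter list is only a small sample. The default breaker splits on delimiters but keeps a joiner such as '-' or "·" inside a word when it has a letter on both sides. A language like Thai would need a different subclass, which is not attempted here.

```cpp
#include <cstddef>
#include <memory>
#include <set>
#include <string>
#include <utility>
#include <vector>

// Half-open [start, end) offsets of one word, counted in code points.
using WordSpan = std::pair<std::size_t, std::size_t>;

// Hypothetical interface: each language can supply its own implementation.
class WordBreaker {
public:
    virtual ~WordBreaker() = default;
    virtual std::vector<WordSpan> findWords(const std::u32string& text) const = 0;
};

// Default strategy for languages that separate words with delimiters.
class DelimiterWordBreaker : public WordBreaker {
public:
    DelimiterWordBreaker(std::set<char32_t> delims, std::set<char32_t> joiners)
        : delims_(std::move(delims)), joiners_(std::move(joiners)) {}

    std::vector<WordSpan> findWords(const std::u32string& text) const override {
        std::vector<WordSpan> words;
        const std::size_t n = text.size();
        std::size_t i = 0;
        while (i < n) {
            while (i < n && isBoundary(text, i)) ++i;   // skip delimiters
            const std::size_t start = i;
            while (i < n && !isBoundary(text, i)) ++i;  // consume one word
            if (i > start) words.emplace_back(start, i);
        }
        return words;
    }

private:
    // Anything that is neither a delimiter nor a joiner counts as a letter.
    bool isLetter(char32_t c) const {
        return delims_.count(c) == 0 && joiners_.count(c) == 0;
    }

    bool isBoundary(const std::u32string& text, std::size_t i) const {
        const char32_t c = text[i];
        if (joiners_.count(c) != 0) {
            // A joiner stays inside a word only between two letters.
            const bool left = i > 0 && isLetter(text[i - 1]);
            const bool right = i + 1 < text.size() && isLetter(text[i + 1]);
            return !(left && right);
        }
        return delims_.count(c) != 0;
    }

    std::set<char32_t> delims_;
    std::set<char32_t> joiners_;
};

// Hypothetical factory: picks a breaker from a language tag. The per-language
// tweaks are hard-coded here but could be loaded from an external file.
std::unique_ptr<WordBreaker> makeWordBreaker(const std::string& lang) {
    std::set<char32_t> delims{U' ', U'\t', U'\n', U'.', U',', U';',
                              U':', U'!', U'?', U'\u00B7'};
    std::set<char32_t> joiners{U'-'};
    if (lang == "ca") {  // Catalan: the middle dot belongs to the word
        delims.erase(U'\u00B7');
        joiners.insert(U'\u00B7');
    }
    return std::make_unique<DelimiterWordBreaker>(std::move(delims),
                                                  std::move(joiners));
}
```

With this sketch, makeWordBreaker("ca") treats "copiar-lo" and "col·lecció" as single words, while a lone " - " between spaces is still a boundary. The spell checker would only ever talk to WordBreaker, so a dictionary-based breaker for languages without spaces could be slotted in later without touching the callers.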
Andrew Dunbar.
> Thanks,
>
> Jordi,
>
=====
http://linguaphile.sourceforge.net http://www.abisource.com
This archive was generated by hypermail 2.1.4 : Thu May 02 2002 - 00:03:32 EDT