Re: Language Codes

From: Andrew Dunbar (hippietrail@yahoo.com)
Date: Wed Oct 23 2002 - 23:30:21 EDT


     --- "E . A . Zen" <ericzen@ez-net.com> wrote:
    > On 2002.10.21 13:04 David Chart wrote:
    > >
    > > The problem with historical languages is that you
    > > need to specify a time as well. For example,
    > > en-GB-1600 is rather different from en-GB-2002 (have a
    > > look at Shakespeare). A dictionary based on a
    > > renaissance mathematician is one historical slice
    > > of Latin, and different from la-GB-1400, as well
    > > as from la-IT-1100.
    > >
    > > Until ISO get this sorted out, which I suppose
    > > might happen, I suggest that we avoid using
    > > kludges to handle dead languages and historical
    > > versions of living languages.
    > >
    > > (Although the ability to set my locale to
    > > en-GB-1600 would be rather cool -- 'Thou hast
    > > changed thy document. Dost thou wish to retain thy
    > > changes on disk?')
    > >
    > > --
    > > David Chart
    > > http://www.dchart.demon.co.uk/
    >
    >
    > The mass complexity of languages, being as organic as
    > they are, still results in many problems. Language,
    > Locale, Dialect, Subvariant and Age are all
    > necessary. If SIL International has a set of public
    > standards for breakdown (other than the anthropology
    > section), it would probably be more beneficial to
    > move to SIL, despite risk of non-compliance to ISO.

    Actually it's not even that simple. SIL doesn't cover
    everything. Their codes don't distinguish between the
    scripts of a language written in more than one, such
    as Serbian, and they ignore most artificial languages
    other than Esperanto, Interlingua, and Interlingue.

    That's why I've been putting forward the idea of a
    language object. The object will have methods to
    retrieve info such as ISO 2-letter, ISO 3-letter, SIL
    code, other registered language codes such as i-klingon
    or art-lojban, script, and orthography. We could also
    include info such as Windows codepage, Windows LID,
    etc.
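
To make the idea concrete, here is a rough sketch of what such a language object might look like. It is written in Python purely for illustration; the class name, field names, and the preferred_code() method are all assumptions, not an existing API, and the SIL code is left unset rather than guessed.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class Language:
    """Hypothetical language object; every name here is an assumption."""
    name: str
    iso639_1: Optional[str] = None          # ISO 2-letter code
    iso639_2: Optional[str] = None          # ISO 3-letter code
    sil_code: Optional[str] = None          # SIL code, if one exists
    registered_tags: Tuple[str, ...] = ()   # e.g. "i-klingon", "art-lojban"
    script: Optional[str] = None
    orthography: Optional[str] = None
    windows_codepage: Optional[int] = None
    windows_lid: Optional[int] = None

    def preferred_code(self) -> Optional[str]:
        """Return the first available code, trying ISO codes first."""
        candidates = (self.iso639_1, self.iso639_2,
                      *self.registered_tags, self.sil_code)
        for code in candidates:
            if code:
                return code
        return None


# One language, two scripts: the objects share a code but stay distinct.
serbian_cyrillic = Language("Serbian", iso639_1="sr", script="Cyrillic",
                            windows_codepage=1251)
serbian_latin = Language("Serbian", iso639_1="sr", script="Latin",
                         windows_codepage=1250)

# An artificial language that only has a registered tag.
klingon = Language("Klingon", registered_tags=("i-klingon",))
```

Keeping script and orthography as separate fields is what lets the two Serbian objects differ even though their language codes are identical.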

    The hard part will be deciding which subset of this
    info to encode in our native document format. Mapping
    to other document formats should be much easier since
    we'll know what their limitations are.
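
As a sketch of the mapping step, assuming we keep a table of which fields each target format can store, exporting is just a matter of filtering. The format names, field names, and export_language() below are hypothetical and only illustrate the shape of the problem.

```python
# Hypothetical record for one language; field names are assumptions.
serbian = {
    "iso639_1": "sr",
    "script": "Cyrillic",
    "orthography": None,
    "windows_codepage": 1251,
}

# Hypothetical capability table: the fields each target format can hold.
FORMAT_FIELDS = {
    "iso-only": ("iso639_1",),
    "iso-and-script": ("iso639_1", "script"),
}


def export_language(lang, target_format):
    """Keep only the fields the target format can represent."""
    allowed = FORMAT_FIELDS[target_format]
    return {key: lang[key] for key in allowed if lang.get(key) is not None}
```

A format that stores only an ISO code loses the script on export, which is exactly the kind of known limitation the mapping has to accept.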

    All of this stuff is known in the locale community but
    no solutions have yet been forthcoming.

    Andrew Dunbar.

    > Either that, or we send Andrew to some of these
    > barely-existent meetings and see if we get anything
    > out of it.
    >
    > -Zen

    =====
    http://linguaphile.sourceforge.net/cgi-bin/translator.pl http://www.abisource.com



