RFP: externalize *all* locale-specific information

From: Paul Rohr (paul@abisource.com)
Date: Mon Apr 29 2002 - 17:36:37 EDT

    Yay! It's spring, when young men's fancies turn lightly to thoughts of ...


    I'd like to point out three related goals here.

    1. let's make life easier for translators
    There are a lot of willing translators in the Unix world, and switching to
    gettext would make them a lot happier. We currently have tools to convert
    back and forth between their PO files and our strings files, but I can
    understand why even that hiccup is annoying.

    From what little I know of the PO format, it makes the work of maintaining
    translations easier for at least three reasons:

      - Unix translators are familiar with those tools
      - you can see the base string you're translating right there
      - you can tell when that base string changed underneath you

    It might not be hard for someone to write tools to provide equivalent
    functionality for comparing and versioning our XML-based strings files, but
    to date nobody has.
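
    To illustrate the second and third points, here is a minimal sketch of
    what a translator sees in a PO entry. The source path, the strings, and the
    French translation are made up for illustration, and is_fuzzy is a
    hypothetical helper, not part of any existing tool. The "#, fuzzy" flag is
    the marker the gettext tools place on an entry whose base string has changed
    since it was translated.

```python
# A hypothetical PO entry: the base string (msgid) sits right next to the
# translation (msgstr), and the "#, fuzzy" flag signals that the base string
# changed after the translation was written.
PO_ENTRY = '''\
#: src/dialog.cpp:42
#, fuzzy
msgid "Save the document as"
msgstr "Enregistrer sous"
'''


def is_fuzzy(entry: str) -> bool:
    """Return True if a PO entry carries the 'fuzzy' flag on a '#,' line."""
    for line in entry.splitlines():
        if line.startswith("#,"):
            # Flags are comma-separated, e.g. "#, fuzzy, c-format".
            flags = [flag.strip() for flag in line[2:].split(",")]
            if "fuzzy" in flags:
                return True
    return False


print(is_fuzzy(PO_ENTRY))
```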

    Thus, it's pretty cool that Dom has volunteered to do the *non-Unix* work
    required to switch out the bottom layer of our lightweight strings mechanism
    in favor of gettext. (I assume that the resulting resource files will be
    portable between platforms without having to worry about line-ending
    conventions, etc.)


    In return for that free gift, I personally am willing to forgo any of the
    possible technical quibbles that could be lodged against gettext:

      - locale bloat ... redundant storage of all those English strings

      - app bloat ... given that we already link an XML parser, the rest of
             the strings mechanism is almost certainly lighter weight than
             the gettext library

      - speed ... ID lookups should be faster than atomized string keys

    Given the performance of modern machines, I'm confident that none of these
    should be bad enough to offset the potential benefits to translators.
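
    The difference between the two lookup styles can be sketched roughly as
    follows. This is only an illustration, not our actual strings code: the
    StringId names, the tables, and the French strings are all assumptions. It
    also shows where the "locale bloat" comes from, since the string-keyed
    catalog has to carry every English base string alongside its translation.

```python
from enum import IntEnum


class StringId(IntEnum):
    """Hypothetical integer IDs, one per user-visible string."""
    MENU_FILE = 0
    MENU_OPEN = 1


# ID-based scheme: the locale resource is a table indexed by integer ID.
FR_BY_ID = ["Fichier", "Ouvrir"]

# gettext-style scheme: the locale resource is keyed by the English string,
# so each catalog stores the English text as well as the translation.
FR_BY_MSGID = {"File": "Fichier", "Open": "Ouvrir"}


def lookup_by_id(string_id: StringId) -> str:
    """Direct index into the table; no hashing or string comparison."""
    return FR_BY_ID[string_id]


def lookup_by_msgid(msgid: str) -> str:
    """Hash the English string; fall back to it when no translation exists."""
    return FR_BY_MSGID.get(msgid, msgid)


print(lookup_by_id(StringId.MENU_FILE))
print(lookup_by_msgid("File"))
```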

    2. let's make life easier for translators (non-Unix)
    Could someone more familiar with gettext explain what a Windows user would
    need to do to create, say, a Swahili translation that our gettext-enabled
    builds could use on all platforms?

    I know there are more evolved PO-handling tools on Unix, but could a Windows
    translator do the job -- and test the results -- without installing cygwin
    *or* a compiler? In an ideal world, that's how it'd work.

    3. let's do the whole job
    More importantly, I'd love to see us do *all* the work required to
    completely remove locale-specific information from our static binaries.
    Ideally, we'd ship a hardwired en-US (yes, I'm a chauvinist, sorry) binary
    which could be easily packaged up with a collection of zero or more
    locale-specific resources.

    This has two advantages:

      - it reduces our core footprint
      - it allows after-the-fact translations to be "dropped in"
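
    As a rough sketch of the "drop-in" packaging idea only: the program asks
    for a locale resource at startup and quietly falls back to its built-in
    strings when none has been installed. Python's standard gettext module is
    used here purely because it is readily available. The "abiword" domain name
    and the load_catalog helper are assumptions for illustration, and nothing
    here addresses locale-specific information that isn't externalized yet.

```python
import gettext
import tempfile


def load_catalog(locale_dir: str, language: str) -> gettext.NullTranslations:
    """Load a catalog if one was dropped into locale_dir, else fall back.

    With fallback=True, a missing catalog yields a NullTranslations object
    whose gettext() returns the built-in (base) string unchanged.
    """
    return gettext.translation(
        "abiword",              # hypothetical domain name
        localedir=locale_dir,
        languages=[language],
        fallback=True,
    )


# An empty directory stands in for an install with zero locale resources.
with tempfile.TemporaryDirectory() as empty_dir:
    catalog = load_catalog(empty_dir, "sw")
    print(catalog.gettext("File"))
```

    Dropping a compiled catalog into the expected subdirectory later would
    change the result of the same call without touching the binary.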

    To be clear: AFAICT, switching to gettext does nothing to advance this goal.
    It merely switches the mechanism we use for the strings we've managed to
    externalize so far. Instead of shipping the current XML-based resources:

    we'd ship the same information in a resource format that's prevalent on Unix
    systems. (See #1 above for a partial reason why that'd be useful.)

    As I've mentioned repeatedly in the past, the more serious problem is that
    we currently don't externalize enough of the locale-specific information.


    Is anyone interested in coming up with solutions for the rest of the problem?

    bottom line
    It's a good thing to make translators' lives easier. Now we just need enough
    volunteers (for #2 and especially #3) so we can finish the job once and for
    all.

