Enchant

by Dom Lachowicz - <domlachowicz@gmail.com>

News

April 1, 2010: Enchant 1.6.0 released. See below for download information.

  • Fix bug 12567: the ispell sources aren't licensed under the LGPL
  • Add a function to get enchant's version (enchant_get_version)
  • Disable zemberek plugin by default, as it's known to cause issues/crashes with WebKit
  • Fix bug 12472: Win32 DLL dependency not found popup occurs when module has unmet dependencies
  • Possibly fix Ubuntu bug 474062
  • Fix bug 12409: Registry handle not closed in enchant_get_registry_value_ex
  • Fix bug 12406: Leak in _enchant_get_user_home_dirs() on Windows
  • Fix bug 12007: Update FSF address
  • Fix bug 12305: Zemberek module lists a Turkish dictionary even without Zemberek installed
  • Don't assert if passed a null string list
  • Fix bug 12350: enchant_pwl_init_with_file truncates pwl file
  • Fix a double-free memory corruption bug
  • Fix bug 12173: fix some small memory leaks
  • Fix bug 12174: mis-acceptance of dictionaries which start with a partial match of the lang id
  • Fix bug 12160: enchant 1.5.0 always looks in "lib" dir for plugins
  • Fix the build with the MSVC compiler
  • Add a --with-system-myspell option
  • Package missing compile-resource file
  • Compare paths ignoring case sensitivity on Windows

May 23, 2009: Enchant 1.5.0 released. See below for download information.

  • TODO: someone should update this :)

May 5, 2008: Enchant 1.4.2 released. See below for download information.

  • Voikko (Finnish) language support
  • Zemberek (Turkish) language support
  • Better support for Unicode in the personal dictionaries
  • Personal dictionaries offer better suggestions
  • Use OpenOffice's dictionaries on Windows
  • Aspell works on Windows
  • Can use a system-wide Hunspell/Myspell installation on Unix-like platforms
  • Require Hunspell 1.2.1
  • .NET bindings
  • More lax language matching rules (e.g. if you request a "pl" dictionary, but only have a "pl_PL" myspell dictionary installed, it will do the right thing)
  • Use XDG's data-dirs spec for locating dictionaries (e.g. ~/.config/enchant/myspell/)
  • Lots of unit tests
  • Lots of bug fixes

October, 2003: Enchant is listed in the Freedesktop.org software repository!

GnomeSpell and GtkSpell use Enchant!

LibSexy uses Enchant

KDE/KOffice folks considering using Enchant

  • "Current plans are either to rewrite the current API after KOffice 1.3 and KDE 3.2 releases or to reuse the newly born Enchant backend neutral spell checking engine." More information can be found here.

What is Enchant?

On the surface, Enchant appears to be a generic spell checking library. You can request dictionaries from it, ask if a word is correctly spelled, get corrections for a misspelled word, etc...

Beneath the surface, Enchant is a whole lot more - and less - than that. You'll see that Enchant isn't really a spell checking library at all.

"What's that?" you ask. Well, Enchant doesn't try to do any of the work itself. It's lazy, and requires backends to do most of its dirty work. Looking closer, you'll see that Enchant is more-or-less a fancy wrapper around the dlopen() library call. Enchant steps in to provide uniformity and conformity on top of these libraries, and to implement certain features that may be lacking in any individual provider library. Everything should "just work" for any and every definition of "just working."

Enchant, Pipes, API, and ABI

Enchant does not use pipes to talk with command line programs. Enchant is a full-fledged library, with a C and C++ API. Its communication with its various backends happens via "virtual function calls", or at least the C equivalent thereof. The backends are small C/C++ plugins. The backends themselves do not use the popen() call, but instead link to various libraries (eg: aspell) and interface with said library's API to "get the job done."
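To make the "C equivalent of virtual function calls" idea concrete, here is a minimal sketch of dispatch through a struct of function pointers. The names SpellProvider, toy_a, toy_b and provider_check are hypothetical and invented for illustration; they are not Enchant's real types or functions, and the "0 means accepted" return convention is this sketch's own.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical provider interface: a struct of function pointers plays
 * the role that a virtual table plays in C++. Not Enchant's real types. */
typedef struct SpellProvider {
    const char *name;
    /* Returns 0 if the word is accepted, nonzero otherwise. */
    int (*check)(const struct SpellProvider *self, const char *word);
} SpellProvider;

/* Toy backend A: accepts only "hello". */
static int toy_a_check(const SpellProvider *self, const char *word)
{
    (void)self;
    return strcmp(word, "hello") == 0 ? 0 : 1;
}

/* Toy backend B: accepts only "bonjour". */
static int toy_b_check(const SpellProvider *self, const char *word)
{
    (void)self;
    return strcmp(word, "bonjour") == 0 ? 0 : 1;
}

const SpellProvider toy_a = { "toy_a", toy_a_check };
const SpellProvider toy_b = { "toy_b", toy_b_check };

/* The caller dispatches through the pointer and never needs to know
 * which backend sits behind it. */
int provider_check(const SpellProvider *p, const char *word)
{
    assert(p != NULL && p->check != NULL && word != NULL);
    return p->check(p, word);
}
```

Calling provider_check(&toy_a, "hello") and provider_check(&toy_b, "bonjour") goes through the same entry point but ends up in different backend code. No pipe or child process is involved.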

Further, Enchant has a standard C API/ABI that will in due course be frozen for all of eternity, barring any gross oversights or shortcomings introduced by the author. I believe the current API to be sufficient for my various projects' needs. It may not, however, be sufficient for yours. If not, please read on to the "Enchant and AbiWord [i.e. Collaboration]" section below.

No pipes? But I thought you support Ispell!

Enchant does support most Ispell dictionaries. Back in the day, I and some other folks from the AbiWord team forked Ispell in order to turn it into a C library. Unfortunately, it had a lot of Ispell "global state" cruft in it. Since then, we have turned this library into a stateful C++ library, complete with 1st-class classes and objects. In the process, we also believe that we've made Ispell more flexible, and that there is room for further flexibility still. Enchant's Ispell backend is based on this work done by myself and others inside of AbiWord.

Command Lines

Enchant is not another "ispell compatible" command line program. However, it does currently ship with one, as a "technology demo." The program currently isn't particularly good, especially when compared to 'aspell'. Contributions improving the command line program are gladly welcomed.

Enchant and Wheels

As far as I know, Enchant re-invents no wheels here. This wheel is entirely new. Enchant doesn't implement its own correction code. It doesn't "duplicate" functionality already found in other projects, as frankly, it doesn't have much functionality of its own. Enchant wraps a common set of functionality present in a variety of existing products/libraries, and exposes a stable API/ABI for doing so. Where a library doesn't implement some specific functionality, Enchant will emulate it.
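As a rough illustration of "emulate what the library lacks", the sketch below wraps a backend whose optional add_to_session hook may be NULL. When the hook is missing, the wrapper keeps its own small session list. The Backend and Wrapper types, the session-list feature chosen as the example, and the fixed buffer sizes are all assumptions made for this sketch, not a description of Enchant's internals.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define SESSION_MAX 16
#define WORD_MAX 64

/* Hypothetical backend: add_to_session may be NULL when the underlying
 * library has no notion of a per-session word list. */
typedef struct Backend {
    int (*check)(const char *word);            /* 0 = accepted */
    void (*add_to_session)(const char *word);  /* optional, may be NULL */
} Backend;

/* Wrapper state: a small fixed-size list used only for emulation. */
typedef struct Wrapper {
    const Backend *backend;
    char session[SESSION_MAX][WORD_MAX];
    size_t session_count;
} Wrapper;

void wrapper_init(Wrapper *w, const Backend *b)
{
    assert(w != NULL && b != NULL && b->check != NULL);
    w->backend = b;
    w->session_count = 0;
}

void wrapper_add_to_session(Wrapper *w, const char *word)
{
    assert(w != NULL && word != NULL);
    if (w->backend->add_to_session != NULL) {
        /* The backend supports this natively: just forward the call. */
        w->backend->add_to_session(word);
        return;
    }
    /* Emulate: remember the word ourselves, if it fits. */
    if (w->session_count < SESSION_MAX && strlen(word) < WORD_MAX) {
        strcpy(w->session[w->session_count++], word);
    }
}

int wrapper_check(const Wrapper *w, const char *word)
{
    assert(w != NULL && word != NULL);
    /* Words remembered by the wrapper are accepted before asking the backend. */
    for (size_t i = 0; i < w->session_count; i++) {
        if (strcmp(w->session[i], word) == 0) {
            return 0;
        }
    }
    return w->backend->check(word);
}

/* Toy backend that accepts only "spell" and has no session support. */
static int minimal_check(const char *word)
{
    return strcmp(word, "spell") == 0 ? 0 : 1;
}

const Backend minimal_backend = { minimal_check, NULL };
```

With minimal_backend, a word such as "abiword" is rejected at first and accepted after wrapper_add_to_session(), even though the backend itself never learned it.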

Backends

Enchant is capable of having multiple backends loaded at once. Currently, Enchant has 8 backends:

  • Aspell/Pspell (intends to replace Ispell)
  • Ispell (old as sin, could be interpreted as a de facto standard)
  • MySpell/Hunspell (an OOo project, also used by Mozilla)
  • Uspell (primarily Yiddish, Hebrew, and Eastern European languages - hosted in AbiWord's CVS under the module "uspell")
  • Hspell (Hebrew)
  • Zemberek (Turkish)
  • Voikko (Finnish)
  • AppleSpell (Mac OS X)

More/other backends are welcomed.

Enchant and multiple backends

Why would Enchant want to support multiple backends? There are several reasons for this.

  • Not all spell checking libraries are created equal. Their language coverage and algorithms are not all of equal quality, and spell-checking some languages is fundamentally different from spell-checking others. For instance, one might want to write a word processing document with both Yiddish and English words in it. If you just have Aspell or Ispell installed, you can't currently spell-check both languages. Enchant presents a consistent interface so that you can have Uspell spell-checking the Yiddish bits and Aspell correcting the English ones. Your word processor needn't know that 2 different spell-checkers are needed for this task. It all "just works" (tm). Also consider that (perhaps) Aspell does a better job at handling English than Ispell, but Ispell does a better job at French. You as a user don't want to use a sub-standard spell checker. Enchant provides a few different solutions to this problem: 1) define an ordering of providers based on language tags, or 2) install an English Aspell dictionary and a French Ispell one, and don't install a French Aspell one or an English Ispell one. Either way, the intended result will happen like magic, because Enchant allows you to have 0 or more backends active at the same time.
  • We want to provide a consistent API to all of these backends. Fundamentally, they all do a similar job and expose similar functionality to the outside world. But they don't present a consistent API, and many don't or won't have a stable API and/or ABI. Some of them didn't even have an API/ABI until Enchant came along <cough>Ispell</cough>. This is a problem for consumer programs, such as Office applications. Enchant will shield you from these problems via its plugin architecture, and stable API/ABI.
  • We want to integrate well with native solutions. This means that we want to have a "MacOS X" backend that uses the builtin spell-checker. We want a "MS Office" backend that uses their plugins. We want to be viral, invasive, seamless, and all-encompassing. And we want to do it well.
  • We want to integrate with whatever solution the user already has installed, be it Ispell, Pspell, MySpell, ... and not force some alternative down their throats, or force them to install a different dictionary to work with OpenOffice than they need to work inside of GEdit.
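The "ordering of providers based on language tags" idea from the first point above can be sketched roughly as follows. Everything here is an assumption made for illustration: the Ordering and Installed tables, the "*" catch-all tag, the fall-back to any installed provider, and the function pick_provider. None of it is Enchant's actual data model or API.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical record: which provider has a dictionary for which tag. */
typedef struct Installed {
    const char *provider;
    const char *tag;
} Installed;

/* Hypothetical per-language ordering, most preferred provider first.
 * The tag "*" is this sketch's catch-all entry. */
typedef struct Ordering {
    const char *tag;
    const char *providers[4];  /* NULL-terminated */
} Ordering;

static int has_dict(const Installed *inst, size_t n,
                    const char *provider, const char *tag)
{
    for (size_t i = 0; i < n; i++) {
        if (strcmp(inst[i].provider, provider) == 0 &&
            strcmp(inst[i].tag, tag) == 0) {
            return 1;
        }
    }
    return 0;
}

/* Returns the provider to use for `tag`, or NULL if nobody can handle it. */
const char *pick_provider(const Ordering *ord, size_t n_ord,
                          const Installed *inst, size_t n_inst,
                          const char *tag)
{
    const Ordering *chosen = NULL;
    assert(ord != NULL && inst != NULL && tag != NULL);

    /* An exact tag match wins; otherwise fall back to the "*" entry. */
    for (size_t i = 0; i < n_ord; i++) {
        if (strcmp(ord[i].tag, tag) == 0) {
            chosen = &ord[i];
            break;
        }
        if (chosen == NULL && strcmp(ord[i].tag, "*") == 0) {
            chosen = &ord[i];
        }
    }

    /* Walk the preferred providers and take the first with a dictionary. */
    if (chosen != NULL) {
        for (size_t j = 0; j < 4 && chosen->providers[j] != NULL; j++) {
            if (has_dict(inst, n_inst, chosen->providers[j], tag)) {
                return chosen->providers[j];
            }
        }
    }

    /* Nothing in the ordering fits: accept any provider that has the tag. */
    for (size_t i = 0; i < n_inst; i++) {
        if (strcmp(inst[i].tag, tag) == 0) {
            return inst[i].provider;
        }
    }
    return NULL;
}

/* Sample data mirroring the English/French/Yiddish example in the text. */
const Ordering sample_ordering[] = {
    { "*",  { "aspell", "ispell", NULL } },
    { "fr", { "ispell", "aspell", NULL } },
};

const Installed sample_installed[] = {
    { "aspell", "en" }, { "ispell", "en" },
    { "aspell", "fr" }, { "ispell", "fr" },
    { "uspell", "yi" },
};
```

With this sample data, English goes to aspell, French to ispell, and Yiddish to uspell, without the calling application having to know that three different libraries are involved.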

Enchant and API Inspiration

Enchant's API was inspired by the Aspell/Pspell API, and aims to be both a superset and an "ease of code" simplification of said API.

All inputs and outputs are in UTF-8 encoding. All language tags are based on ISO standards, and take the form of "xx_YY" (language_LOCALE), where the locale ("_YY") portion is optional, but encouraged. We may consider extending the language tag mechanism to accommodate future needs, such as medical dictionaries.
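Here is a small sketch of how tags of the form "xx_YY" might be compared when the locale portion is optional. The matching rule is an assumption made for illustration (a tag without a locale matches any locale of the same language, in either direction, and the language part must be equal in full), and tag_matches is a hypothetical helper, not an Enchant function.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Length of the language part of a tag: everything before the first '_'. */
static size_t lang_len(const char *tag)
{
    const char *sep = strchr(tag, '_');
    return sep != NULL ? (size_t)(sep - tag) : strlen(tag);
}

/* 1 if the tag carries a locale part ("_YY"), 0 otherwise. */
static int has_locale(const char *tag)
{
    return strchr(tag, '_') != NULL;
}

/* Hypothetical matching rule:
 *  - identical tags always match;
 *  - two tags that both carry a locale must be identical to match;
 *  - otherwise only the language parts are compared, and they must be
 *    equal in full, so "en" does not match "eng_US". */
int tag_matches(const char *requested, const char *available)
{
    assert(requested != NULL && available != NULL);

    if (strcmp(requested, available) == 0) {
        return 1;
    }
    if (has_locale(requested) && has_locale(available)) {
        return 0;
    }

    size_t rl = lang_len(requested);
    size_t al = lang_len(available);
    return rl == al && strncmp(requested, available, rl) == 0;
}
```

Under this rule a request for "pl" is satisfied by an installed "pl_PL" dictionary, while "en_GB" is not satisfied by "en_US". Comparing the full language part, rather than a prefix, avoids accepting a dictionary whose tag merely starts with the requested language id.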

Enchant and Multi-User systems

Enchant strives to support multi-user systems as well as is humanly possible. This in part means that users can have locally defined plugin backends for Enchant (for example, in ~/.enchant/) that work with the spell checker of their choice. This is in addition to any globally installed plugin providers. The user-defined plugins are guaranteed to always take precedence.

Similarly, several of Enchant's backends support user + global dictionaries. This includes the Ispell, Uspell, and Myspell backends. Users may have locally installed dictionaries (for example, in ~/.enchant/ispell or ~/.enchant/uspell) in addition to any globally installed ones. As with plugins, the user-defined dictionaries will always take precedence over the globally installed ones.

Also, both user-local and global Provider Ordering files can coexist, with the values in the user-local one guaranteed precedence. Further, an API exists so that one can modify said orderings at runtime, if necessary.
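The "user first, then global" precedence can be sketched as an ordered list of directories to search. The helper dict_search_dirs and the global root /usr/share/enchant are assumptions made for this sketch. The real global location depends on how Enchant was configured and installed, and only the ~/.enchant/<backend> form comes from the text above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

#define PATH_BUF 256

/* Hypothetical global location, chosen only for illustration. */
#define GLOBAL_DICT_ROOT "/usr/share/enchant"

/* Fills `out` with the directories to search for a backend's dictionaries,
 * highest precedence first: out[0] is the user's directory, out[1] the
 * global one. Returns the number of entries, or -1 if a path would not
 * fit in PATH_BUF bytes. */
int dict_search_dirs(const char *home, const char *backend,
                     char out[2][PATH_BUF])
{
    int n;
    assert(home != NULL && backend != NULL && out != NULL);

    /* User-local directory: always consulted first. */
    n = snprintf(out[0], PATH_BUF, "%s/.enchant/%s", home, backend);
    if (n < 0 || n >= PATH_BUF) {
        return -1;
    }

    /* Global directory: only consulted if the user has no match. */
    n = snprintf(out[1], PATH_BUF, "%s/%s", GLOBAL_DICT_ROOT, backend);
    if (n < 0 || n >= PATH_BUF) {
        return -1;
    }
    return 2;
}
```

A caller would walk the returned directories in order and stop at the first one that contains the requested dictionary, which is what gives the user's copy precedence over the administrator's.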

Why is this done?

  • User might not have a multi-user license to use a particular spell checker
  • User might not have a multi-user license for a particular dictionary
  • User might've modified a plugin or dictionary and not want/have administrator privileges to update a global dictionary or plugin, or add a new plugin/dictionary
  • User might have a different idea of which spell checker provides the best result for various languages than the administrator does.

Building Enchant

Enchant strives to be buildable with any toolchain, not just the GNU ones. Currently, the only ones that have been tested are the GNU one (or any toolchain capable of being driven by auto* and libtool) and MSVC. This, however, does not mean that we want to stop here. We want Enchant to build with the compiler and linker of your choice. For example, the fact that I don't own a copy of MSVC shouldn't stop you from sending me a DSW project file that is capable of building Enchant. I will 'svn add' it gratefully.

Enchant is vanilla C and C++ code, and as such should be buildable with just about any C/C++ compiler known to man. Some of it is based on AbiWord code, which compiles on [Unix, MacOS, BeOS, QNX, Win32]. Some of it uses Glib2 code, which compiles on [Unix, MacOS, Win32, BeOS, QNX, ...]. In short, Enchant's code should be highly portable. If something doesn't work for your $COMPILER on $PLATFORM, please send problem reports or patches that fix the problem. Other than that, the only "obstacle" should be getting a build system set up for Enchant for your particular $COMPILER and $PLATFORM, which shouldn't be too daunting a task.

Enchant exposes 0 outside dependencies, save LibC. Internally, Enchant only requires Glib2 version 2.6 or later. This was done for pragmatic reasons - it's simply easier to use a portability library that handles modules, lists, etc... than to write one's own. That said, patches to remove the Glib2 dependency are welcomed. As I said, the decision to include it in the first place was extremely pragmatic. I didn't want to put it in, but I didn't want to rewrite a portability library, so I chose one that I'm familiar with, that is well tested, and that works well. It saved me time getting the project off the ground. Now that it works and there is some outside interest, I wouldn't be saddened to see it go, provided that a viable alternative was implemented.

Getting Enchant

You can get Enchant from AbiWord's SVN. Instructions for how to use AbiWord's anonsvn can be found here: http://www.abisource.com/developers. The module's name is "enchant".

Alternately, you can download Enchant 1.6.0 (April 1, 2010). Older versions are still available here.

You can browse the Enchant source online from: http://www.abisource.com/viewvc/enchant/trunk/. Enchant's public API can be found here. Enchant's provider plugin API can be found here. Enchant's C++ API can be found here.

Enchant and AbiWord [i.e. Collaboration]

Enchant was written by AbiWord's maintainer and lead developer. He also develops and maintains a lot of other software too. Enchant was inspired by a host of bugs in the AbiWord bugzilla. Please don't let the "AbiWord" part throw you for a loop.

The developer's intentions are for Enchant to be as widely adopted and successful as possible. He wants this to be used by (KOffice, OpenOffice, KDE, Gnome, ...). Patches and contributions from non-AbiWord developers are not only gladly welcomed, but strongly encouraged. Don't settle for mediocrity - stand up and be counted.

As with many proposed standards, the initial work on Enchant was done to address a real need in a real product by a real developer. It was not designed by committee. It addresses the author's needs remarkably well. Also like many standards, the author proposes his current body of work for discussion, collaboration, improvement, and ultimately standardization. It is often useful to have a working reference base in order to stimulate and direct discussion. The Enchant codebase is meant to do exactly that.

For the record, Enchant is already in use by AbiWord, while Lyx and Conglomerate are presently considering using it. Gnome-Spell and GtkSpell are currently in the process of transitioning from Pspell to Enchant. We hope to achieve broader use at an exponential level.

The current list for Enchant discussion is the AbiWord-devel list. Please see http://www.abisource.com/developers/ for more information on how to join, and for archives. If there is sufficient interest, I can look into moving this onto a separate, more "neutral" list, such as a Freedesktop.org one.

Issues with Enchant

Obviously, there are going to be issues, bugs, patches, RFEs, etc... for Enchant. These are gladly welcomed. You may do one of a few things (in my order of preference):

  • File your issue in http://bugzilla.abisource.com/ under the "Enchant" product
  • Send a message to the abiword-devel mailing list
  • Send the author an email directly (please, only do this as a last resort)

Enchant's Future

Some stuff that we're thinking about or are already in the process of adding:

  • Standardization!

Enchant's License

Enchant is currently licensed under the LGPL license, with an Exception so that non-free (as in speech and/or cost) plugin backends could be loaded and used by Enchant. This is mainly so that we can use the native spell checkers on various platforms (MacOS X, MS Office, ...), or that our users could "plug in" their favorite commercial product to do the job.

The author would consider re-licensing it as necessary to help foster broader adoption. This may include MPL or JCA/SISSL license, but the author would also consider alternate licenses as well.

What's in a Name?

"Enchant"'s English definition is roughly "to cast a spell" - whose meaning seemed to fit in well with a spelling-oriented framework. Add to that the fact that I wrote most of this on a lonely Sunday morning while my gf was engrossed with "Harry Potter and the Order of the Phoenix", and you can see where I got its name ;-)