Port AbiWord for Windows to Unicode
From AbiWiki
By Kathiravelu Pradeeban
Synopsis
AbiWord for Windows is currently an ANSI application. The goal of this project is to port it completely to the Unicode API so that it can use the Multilingual User Interface (MUI) features of Microsoft Windows NT and its descendants. The result will be a Unicode-only build of AbiWord for Windows.
Benefits to the AbiWord (and/or other) project(s)
The advantages of Unicode apply to all applications, but for AbiWord they are especially significant. A word processor needs multilingual support and internationalization, and Unicode is the most effective way to achieve both. Because the lower layers of Windows NT and its descendants are implemented with the Unicode versions of the API functions, calling those versions directly avoids the Unicode-to-ANSI conversion and makes the application more efficient. Non-Unicode applications can support Latin scripts and the user's locale, but multilingual support requires Unicode.
As AbiWord is implemented to support One Laptop per Child (OLPC), it also has to support languages that can only be represented in Unicode. Making AbiWord for Windows a complete Unicode application makes it a better word processor on the Windows platform and more suitable for OLPC.
Deliverables
At the end of the project, I will deliver a fully Unicode-only application that uses wide characters (WCHAR) and is free of the existing TCHAR versions. The final Unicode application will also fix the existing bugs in the current Win32 application. It will require Windows NT or one of its descendants to run. Since this Unicode build will be the basis for future versions of AbiWord for Windows, I will deliver all the necessary documentation with the project.
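The difference between TCHAR-based and WCHAR-based code can be sketched portably. In the Windows SDK headers, TCHAR resolves to either a narrow or a wide character type depending on whether the UNICODE macro is defined, so the same source can be built both ways. The sketch below only imitates that mechanism: DemoTChar, DemoWChar, DEMO_TEXT and DEMO_UNICODE are hypothetical stand-ins chosen for illustration, not the real header definitions and not AbiWord code.

```cpp
// Simplified imitation of the Windows SDK generic-text mechanism.
// All Demo* / DEMO_* names are hypothetical stand-ins, not the real headers.
#include <cassert>
#include <cstddef>

typedef wchar_t DemoWChar;            // plays the role of WCHAR

#ifdef DEMO_UNICODE                   // plays the role of the UNICODE macro
typedef DemoWChar DemoTChar;          // generic character becomes wide
#define DEMO_TEXT(s) L##s             // literal gets an L prefix
#else
typedef char DemoTChar;               // generic character becomes narrow
#define DEMO_TEXT(s) s                // literal stays narrow
#endif

// Generic-text code compiles both ways; its meaning depends on a build flag.
inline std::size_t demo_tlen(const DemoTChar* s) {
    std::size_t n = 0;
    while (s[n] != 0) ++n;
    return n;
}

// Unicode-only code names the wide type directly; no build flag changes it.
inline std::size_t demo_wlen(const DemoWChar* s) {
    std::size_t n = 0;
    while (s[n] != 0) ++n;
    return n;
}

// Small usage example: both helpers count characters up to the terminator.
inline void run_tchar_examples() {
    assert(demo_tlen(DEMO_TEXT("AbiWord")) == 7);
    assert(demo_wlen(L"AbiWord") == 7);
}
```

A Unicode-only build corresponds to keeping only the second style: the wide type is written explicitly, so there is a single code path to test instead of two.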
Project Details
Unicode makes computing in local languages, and more specifically multilingual applications, possible. Windows NT and its descendants use Unicode as their internal character encoding, but AbiWord for Windows is an ANSI application. AbiWord 2.4.6 is the latest version that works on the legacy Windows 9x operating systems. Because those operating systems are rarely used nowadays, it has been decided to drop the ANSI build and port AbiWord for Windows completely to Unicode.
Porting AbiWord for Windows to Unicode is not a new project. The initial effort started in early 2007 [1]. At that time it was discussed whether to continue supporting Windows 9x, and the code was organized so that the program could be built as either Unicode or ANSI. I will use this initial effort as a reference for my project.
Since later versions depend on GLib, continuing to support the earlier Windows versions has become impossible. Hence neither maintaining an ANSI build for the legacy Windows operating systems, nor porting completely to Unicode while still supporting both Unicode and ANSI builds through Unicows, is a sensible option. We have therefore decided to support only Windows NT and its descendants (Windows 2000/XP/Vista) by providing a Unicode-only build. This project will focus on using the Multilingual UI (MUI) features of these operating systems efficiently.
AbiWord handles Unicode text correctly internally, but not at the user interface level on Windows. The user interface uses Microsoft's ANSI controls to edit and display text. These will be ported completely to the wide-character controls.
There are known issues with international file names: existing bug reports describe problems with them, and drag and drop of documents with such names is buggy. The Find dialog has also displayed garbage when given Unicode text, an issue confirmed by Russian and Hebrew users. Text in the editing area is displayed using Microsoft's Uniscribe technology [3]; this code also contains some bugs and should be improved to support more scripts. These bugs [4] should be fixed in the Unicode-only build.
Testing will use non-Latin languages such as Indic, East Asian, and Indo-Iranian languages. The project will be developed on Windows XP.
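Handing internal text to wide-character controls generally means producing UTF-16, the encoding the wide Windows API expects. The sketch below assumes, purely for illustration, that the internal text is available as UTF-8; the proposal itself only says that AbiWord handles Unicode internally. On Windows this step would normally be done with the system call MultiByteToWideChar using CP_UTF8; the hand-written decoder here (utf8_to_utf16 is a hypothetical name) is a portable stand-in that shows what such a conversion has to do.

```cpp
// Portable sketch of a UTF-8 to UTF-16 conversion (hypothetical helper,
// not AbiWord code). Malformed input is replaced with U+FFFD.
#include <cassert>
#include <cstddef>
#include <string>

inline std::u16string utf8_to_utf16(const std::string& in) {
    // Smallest code point that legitimately needs a sequence of each length.
    static const char32_t min_cp[5] = {0, 0, 0x80, 0x800, 0x10000};
    std::u16string out;
    std::size_t i = 0;
    while (i < in.size()) {
        unsigned char b0 = static_cast<unsigned char>(in[i]);
        char32_t cp = 0xFFFD;                 // default: replacement character
        std::size_t len = 1;
        if (b0 < 0x80)                { cp = b0; }
        else if ((b0 & 0xE0) == 0xC0) { cp = b0 & 0x1F; len = 2; }
        else if ((b0 & 0xF0) == 0xE0) { cp = b0 & 0x0F; len = 3; }
        else if ((b0 & 0xF8) == 0xF0) { cp = b0 & 0x07; len = 4; }
        bool ok = (i + len <= in.size());
        for (std::size_t k = 1; ok && k < len; ++k) {
            unsigned char bk = static_cast<unsigned char>(in[i + k]);
            if ((bk & 0xC0) != 0x80) ok = false;      // not a continuation byte
            else cp = (cp << 6) | (bk & 0x3F);
        }
        // Reject overlong forms, surrogate code points and out-of-range values.
        if (ok && len > 1 && cp < min_cp[len]) ok = false;
        if (ok && ((cp >= 0xD800 && cp <= 0xDFFF) || cp > 0x10FFFF)) ok = false;
        if (!ok) { cp = 0xFFFD; len = 1; }            // skip one byte and resync
        if (cp < 0x10000) {
            out.push_back(static_cast<char16_t>(cp));
        } else {                                      // encode a surrogate pair
            cp -= 0x10000;
            out.push_back(static_cast<char16_t>(0xD800 + (cp >> 10)));
            out.push_back(static_cast<char16_t>(0xDC00 + (cp & 0x3FF)));
        }
        i += len;
    }
    return out;
}

// Usage example: ASCII, a Tamil letter, and a character outside the BMP.
inline void run_utf8_examples() {
    assert(utf8_to_utf16("Abi") == u"Abi");
    assert(utf8_to_utf16("\xE0\xAE\xA4") == u"\u0BA4");
    assert(utf8_to_utf16("\xF0\x9F\x98\x80") == u"\xD83D\xDE00");
}
```

The resulting UTF-16 string is the kind of buffer a wide-character control or a W-suffixed API function would receive, with no intermediate pass through an ANSI code page.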
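One plausible source of such problems is that text passing through an ANSI code page loses every character the code page cannot represent. The sketch below illustrates that effect under stated assumptions: it uses Latin-1 as a stand-in for an ANSI code page (real code pages differ by locale), and it substitutes '?' for unrepresentable characters, which is typical behaviour for a narrowing conversion but is not taken from the bug reports. The function names are hypothetical.

```cpp
// Illustration of lossy narrowing from UTF-16 to a single-byte code page.
// Latin-1 stands in for an ANSI code page; names are hypothetical.
#include <cassert>
#include <string>

// Convert UTF-16 code units to Latin-1, replacing anything outside it by '?'.
inline std::string narrow_to_latin1_lossy(const std::u16string& wide) {
    std::string narrow;
    narrow.reserve(wide.size());
    for (char16_t unit : wide) {
        narrow.push_back(unit <= 0xFF ? static_cast<char>(unit) : '?');
    }
    return narrow;
}

// True only if every code unit fits in the single-byte stand-in code page.
inline bool survives_narrowing(const std::u16string& wide) {
    for (char16_t unit : wide) {
        if (unit > 0xFF) return false;
    }
    return true;
}

// Usage example: an ASCII file name survives, a Tamil one does not.
inline void run_narrowing_examples() {
    assert(narrow_to_latin1_lossy(u"report.abw") == "report.abw");
    assert(narrow_to_latin1_lossy(u"\u0BA4\u0BAE\u0BBF\u0BB4\u0BCD.abw")
           == std::string(5, '?') + ".abw");
    assert(!survives_narrowing(u"\u05E9\u05DC\u05D5\u05DD"));   // Hebrew text
}
```

Once a file name has been reduced to question marks it no longer identifies the file on disk, which is why keeping the text in wide characters end to end removes this class of failure.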
Project Schedule
April 20 to May 22
This is the community bonding period. I will use it to interact with the AbiWord developers and the other GSoC students via the mailing lists and IRC, to learn about the overall AbiWord development process and the tasks currently in progress, and to gather ideas about my project from my mentor and the other developers. I will also study the methodologies for implementing Unicode applications effectively. Once I have a more solid picture of my task, I will submit a design specification containing a lower-level analysis of the upcoming project schedule, so that I can get the views of the developers who were previously involved in porting AbiWord for Windows to Unicode.
May 23 to July 5
This is the longest time frame. During it I will implement the project as described in the design specification, once it has been approved by the AbiWord Windows maintainers and the other developers. I will take the existing bugs in the Windows application into account and make sure my build fixes them. I will use Visual Studio 2008 as my IDE, as it should save me a lot of debugging time.
July 6 to July 13
By the mid-term evaluation I expect to have completed the Unicode port of the Windows application and will submit it to the team. During this period I will also write a test plan, making sure it covers as many test cases as possible in both unit testing and integration testing.
July 14 to Aug 1
During this period I will fine-tune the code based on the suggestions and issues raised in the mid-term evaluation. In practice, porting an ANSI application completely to Unicode requires thorough testing, so I will also use this time to test the Unicode build for any outstanding issues. I will make sure that the issues reported on the mailing lists and in Bugzilla regarding the current Windows build are resolved in this period, as that is the major goal of my project. I will also fix any bugs I find in my own code. For testing I will use Unicode languages including (but not limited to) Indic languages such as Tamil and Sinhala.
Aug 1 to Aug 10
I will use this time for release-level testing. I will get the code ready to be committed and prepare the patches for submission. I will also make sure that, by the end of this period, the AbiWord Unicode build for Windows has proper documentation.
Aug 10 to Aug 17
I will use this week to polish the documentation and the code. By the end of this week the project will be ready for release.
A non-technical term known as the June term starts in mid-June at the University of Moratuwa; until then we are on vacation. The June term offers subjects such as photography and meditation and, unlike the academic semesters, has sessions on only two half-days per week. Since I have no other commitments during this time, I will be almost completely free for the whole Summer of Code period. Apart from GSoC, I plan to spend my spare time completing my localization work on AbiWord, translating it into ta-LK.
Bio
I am a level 3 undergraduate at the Department of Computer Science and Engineering, University of Moratuwa. I have sound knowledge of C, C++, C# and Java, and good experience with the Visual Studio IDE. I have just completed my internship at WSO2, a pioneer in SOA that practices free and open source software development.
During the six months of my internship I practiced the Apache process, which is similar to the Agile process. I was involved in the LEAD System project of the Extreme Lab at Indiana University, in collaboration with WSO2. There we implemented an Event Notifier Service for the LEAD System and migrated the existing WS-Messagebox program, which had been developed using XSUL (a SOAP toolkit), to Axis2. These projects use Java, Axis2, web services, XML and XMLBeans.
I am familiar with quality assurance and was involved in the testing of WSO2 Carbon. I have used the build tools Maven and Ant extensively, and JMeter for performance analysis. I have a strong interest in localization: I was involved in localizing SquirrelMail to ta-LK (Tamil, Sri Lanka) and recently started translating AbiWord. I have spent some time learning Unicode concepts and use Unicode frequently.
I prefer AbiWord for its effectiveness and its compatibility with the One Laptop per Child concept. As an active and interested member of the AbiWord community, I will continue to contribute to AbiWord in the future as well.
References
[1] http://abiword.com/mailinglists/abiword-dev/2007/Apr/0031.html
[3] http://www.microsoft.com/typography/developers/uniscribe/
[4] http://bugzilla.abisource.com/
[5] Department of Computer Science and Engineering, University of Moratuwa