Resources: Fonts & Word Processing

Reading and producing Hindi or Urdu text on a computer raises a number of common questions and problems. The suggestions below will help you set up Hindi Urdu capabilities on your computer and ensure that your work is standards compliant and legible. For recommendations on specific Unicode Hindi and Urdu fonts, please refer to the University of Chicago's South Asia Language Resource Center (SALRC) list of recommended Indic fonts.

  1. An Introduction to Unicode Computing
  2. Operating System set-up
  3. Web Browsing
  4. Email
  5. Word Processing

An Introduction to Unicode Computing

Unicode is a computing industry standard that allows computers to display text from all major world languages in a consistent and correct manner. Unlike older text systems, which limit fonts to 256 characters, Unicode contains a 'universal character set' that assigns a unique number to every character of every major language in the world. Unicode is rapidly becoming the international standard for handling text. It is not a kind of font; it is an industry standard that will eventually allow any application (web browser, email, word processor) on any operating system (Windows, Mac OS) to display any major script in any font (provided the font is standards compliant).
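
As a small illustration of the "unique number for every character" idea, the sketch below uses Python's standard unicodedata module to list the code point and official name of each character in the words "Hindi" (in Devanagari) and "Urdu" (in Arabic script). The two sample words and the helper name describe are choices made for this example only.

```python
import unicodedata

# "Hindi" in Devanagari and "Urdu" in Arabic script, written as escapes so
# the example does not depend on the fonts available in your editor.
HINDI = "\u0939\u093f\u0902\u0926\u0940"
URDU = "\u0627\u0631\u062f\u0648"

def describe(text):
    """Return (code point, Unicode character name) for each character."""
    return [(f"U+{ord(ch):04X}", unicodedata.name(ch)) for ch in text]

for word in (HINDI, URDU):
    for code_point, name in describe(word):
        print(code_point, name)
```

Every character has the same number on every operating system and in every Unicode font, which is what makes the text portable.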

Why is it important?

It is the only way to ensure the legibility and longevity of any non-Western text we type. If you don't use a Unicode compliant font, the only way someone else can read what you've typed is if they have the same font, same operating system, and possibly even the same application. For example, if you type something in InPage for Urdu, the only way someone else can read the document is if they are using Windows and have the same version of InPage installed. Or, as another example, if you type an email in the Jaisalmer Devanagari font using Apple Mail, any recipient checking their email with a Windows-based web browser such as Internet Explorer (the browser of choice for the vast majority of the world) will see a nonsense string of characters such as ???#@*#($? even if they have the Jaisalmer font installed. Unicode is the closest we can get to ensuring cross-font, cross-platform, cross-OS legibility for Indic scripts.
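
The difference can be sketched in a few lines of Python. A non-Unicode font typically draws Devanagari shapes in the slots of ordinary Latin characters, so what is actually stored is Latin text. The string legacy_stored below is a made-up stand-in for such text and is not the real mapping of Jaisalmer or any other font; the helper scripts_used is likewise hypothetical.

```python
import unicodedata

def scripts_used(text):
    """Return the first word of each character's Unicode name, as a set."""
    return {unicodedata.name(ch).split()[0] for ch in text if not ch.isspace()}

# Hypothetical: the Latin characters a legacy font might store for a Hindi word.
legacy_stored = "fganh"
# The same kind of word stored as real Unicode Devanagari characters.
unicode_stored = "\u0939\u093f\u0902\u0926\u0940"

print(scripts_used(legacy_stored))   # {'LATIN'}
print(scripts_used(unicode_stored))  # {'DEVANAGARI'}
```

Without the exact legacy font, a reader's computer has no way of knowing that the first string was ever meant to be Hindi.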

Further Reading on the Unicode Standard:


Operating System Set-up

  1. Viewing Indic Scripts

    Mac OS X, Windows Vista and Windows 7 users are able to view South Asian scripts without any special installation or set-up.

    Windows XP users: Indic language support must be enabled manually.

    If you have your original Windows installation disk:
    • Go to Start > Control Panel.
    • If you are in "Category View" select the icon that says "Date, Time, Language and Regional Options" and then select "Regional and Language Options".
    • If you are in Classic View select the icon that says "Regional and Language Options".
    • Select the "Languages" tab and make sure you select the option saying "Install files for complex script and right-to-left languages (including Thai)". A confirmation message should now appear - press "OK" on this confirmation message.
    • Allow the OS to install necessary files from the Windows XP CD and then reboot if prompted.
    If you do not have your original disk:
    • Download and run the Devanagari Toolkit. It will enable Windows built-in Hindi language support (note: Urdu is not enabled).
  2. Enabling Input of Indic Scripts

    Mac OS X (10.4 onwards):

    Input capabilities for Hindi are built into Mac OS X. Unless you disabled non-English languages during your installation of OS X, you should already have the Devanagari MT font and Devanagari keyboards installed on your system.

    To activate the Devanagari keyboard, go to System Preferences > International > Input Menu and check the 'Devanagari QWERTY' option. Also check the 'Show input menu in menu bar' box at the bottom left of the window. Check 'Keyboard Viewer' at the top of the list so that the visual keyboard option shows up in the input menu.

    The situation is more complicated for Urdu. You can either use the built-in Arabic or Persian fonts and keyboards or download this set of Urdu fonts and phonetic Urdu keyboards. All of these options have usability issues.

    Windows:

    Input capabilities for Hindi and Urdu are built into Vista and Windows 7. XP users must enable complex language support (see above). For Hindi, you should already have the Mangal and Arial Unicode fonts installed. For Urdu, you can use Tahoma.

    There are built-in keyboards for both Hindi and Urdu but they are not phonetic.

    To activate the keyboards:
    • Navigate to: Start > Control Panel > Regional and Language Options
    • In the dialog window, select the "Languages" tab.
    • Click on "Details" under the "Text Services and Input Languages" header.
    • Under the "Installed Services" menu, select "Add." This will take you to a dialog window which will allow you to add language input services, including assorted keyboard layouts for South Asian languages. Select the language under "Input language." Press "OK."
    • Repeat the previous step to add additional languages. Click "OK" and then "OK" again.
    • At the bottom of your screen you will see a Language bar, which probably says "EN" for English. Click it to change the input language. You can also move between languages by pressing "Alt-Left Shift".
    To display the on-screen keyboard:
    • Select Start > All Programs > Accessories > Accessibility Options > On-Screen Keyboard
    Since the built-in keyboards are not phonetic, we recommend using external keyboard mapping software such as Tavultesoft's Keyman.

Web Browsing

Recommended cross-platform solution: Firefox 3 is ideal for Hindi Urdu web browsing. It comes with the best 'out of the box' Indic language support regardless of your operating system and the fonts you have installed.

For Windows users, Internet Explorer 7 should work fine with most sites (for Windows XP users, Indic language support must be enabled), but it is not fully Unicode or web standards compliant. IE 8 will be standards compliant. For Mac users, Safari 3 will also render Indic texts correctly.

Note: If you are using Firefox 3 and a website's text renders incorrectly, the page may not have identified its encoding properly. To correct this:

  • Go to View > Character Encoding > Unicode (UTF-8)
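
To see why a wrongly identified encoding produces garbage, consider this minimal Python sketch. It encodes a Devanagari word as UTF-8 and then decodes the same bytes twice: once as Latin-1 (used here only as an example of a wrong guess) and once as UTF-8. The variable names are illustrative.

```python
# The word "Hindi" in Devanagari, written as escapes.
text = "\u0939\u093f\u0902\u0926\u0940"

# What a UTF-8 web page actually sends over the network.
raw = text.encode("utf-8")

# A wrong guess: every byte is treated as one Latin-1 character.
wrong = raw.decode("latin-1")

# The right setting: the bytes are read back as UTF-8.
right = raw.decode("utf-8")

print(len(raw), "bytes on the wire")
print(repr(wrong))   # unreadable accented letters and control characters
print(right == text)
```

Switching the browser's character encoding to Unicode (UTF-8) corresponds to the second decode: the bytes stay the same and only their interpretation changes.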

If you would like to customize the particular fonts browsers use for particular languages, Chicago's South Asia Language Resource Center (SALRC) offers detailed instructions.


Email

Mac OS X:

Apple Mail, in OS X 10.4 and onwards, is Unicode-savvy. You will be able to read emails sent in any Indic script (provided the sender used a Unicode compliant font and application). Your sent messages may not display correctly in applications that use Microsoft's rendering engine (Outlook, Internet Explorer).

You can change the encoding for received messages from the Message/Text Encodings menu. This can also be selected for outgoing messages. The range of encodings you have to choose from in Mail depends on the languages you have on the list in System Preferences>International>Languages (which you can change using the Edit button). One shortcoming is that Mail cannot set the default encoding for incoming messages, which is tedious if you get a lot of mail with the wrong charset specified. The default encoding for outgoing messages in Mail is sensitive to the order of languages in System Preferences>International>Languages.

Other Mac email clients such as Microsoft Entourage and Mozilla Thunderbird do not support Apple's rendering engine for complex scripts and are therefore not recommended. Web-based mail is dependent on your browser and email service. UT Webmail and Gmail in Firefox 3 will ensure Unicode compliant emails.

Windows:

MS Outlook and Windows Mail (Vista) are Unicode compliant but do not encode sent messages in Unicode by default. Change the message encoding settings to UTF-8 to ensure Unicode encoding. UT Webmail and Gmail in Firefox 3 will also work.
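
For readers curious about what the UTF-8 setting changes, the sketch below builds a message with Python's standard email library and declares UTF-8 as its character set. It only illustrates the idea of labelling a message's encoding and is not how Outlook, Windows Mail, or Apple Mail work internally. The addresses and the greeting are placeholders invented for the example.

```python
from email.message import EmailMessage

# "Namaste" in Devanagari, written as escapes.
SUBJECT = "\u0928\u092e\u0938\u094d\u0924\u0947"
BODY = "\u0928\u092e\u0938\u094d\u0924\u0947"

msg = EmailMessage()
msg["From"] = "sender@example.com"      # placeholder address
msg["To"] = "recipient@example.com"     # placeholder address
msg["Subject"] = SUBJECT

# Declare the body's character set explicitly as UTF-8.
msg.set_content(BODY, charset="utf-8")

# The Content-Type header now tells the recipient's mail program
# how to interpret the bytes of the message body.
print(msg["Content-Type"])
```

A recipient's mail program reads that header to decide how to decode the body, which is why a missing or wrong charset label leads to unreadable text.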


Word Processing

Mac OS X:

Microsoft and Adobe applications for Mac do not support the Mac OS rendering engine (ATSUI/Core Text), hence AAT Unicode fonts (such as Devanagari MT) do not even appear in MS Office for Mac v11 and later. Conversely, Mac OS X does not support Microsoft's rendering engine (Uniscribe), so documents containing OpenType Unicode fonts (such as Jaipur Unicode) do not display properly in any application on the Mac platform (including MS Word). In other words, it is impossible to read or produce Hindi Urdu documents in Microsoft Office for Mac applications.
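
A rendering engine matters for Indic scripts because what is stored is not what is drawn. The short Python sketch below, offered only as an illustration, lists the stored characters of one Devanagari conjunct. The choice of conjunct and the variable names are this example's own.

```python
import unicodedata

# The conjunct "ksha", normally drawn as a single combined shape.
CONJUNCT = "\u0915\u094d\u0937"

# The stored text is three separate characters: KA, VIRAMA, SSA.
names = [unicodedata.name(ch) for ch in CONJUNCT]

print(len(CONJUNCT), "stored characters")
for name in names:
    print(name)
```

Turning those three stored characters into one joined shape is the job of the rendering engine working together with the font, so a font built for one engine may show broken or unjoined letters under another.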

There are working alternatives to MS Office. OpenOffice is a free, cross-platform, open-source office suite. Apple's iWork is also a popular alternative. If you receive a Word document with OpenType Unicode fonts, open it with one of these two applications and convert the text's font to Devanagari MT. The text will be correctly rendered. If you are simply typing a document for print or PDF distribution, both of these applications will work fine.

If all else fails, OS X's built-in TextEdit is a small, Swiss-army-knife text editor that can handle almost anything you throw at it. It does not, however, have many of the convenient features of a full-fledged word processor.

Windows:

MS Office is the most common application suite used for Hindi Urdu word processing in Windows. Use MS Word and the input methods outlined above and you'll be fine. To save documents as PDFs you'll need either the Microsoft 'Save as PDF' plugin (which often garbles Nagari and Nastaliq formatting) or the more accurate, free PDF Creator.

OpenOffice is a free, open-source alternative. It has a built-in PDF creation function.

Cross-Platform:

Google Docs and Adobe Buzzword offer an intriguing possibility for creating cross-platform, editable, Unicode Hindi Urdu documents. Google Docs can import and export most document formats (including Word docs) and display most Microsoft OpenType fonts correctly even on a Mac (provided you are using a Unicode browser such as Firefox 3).

One more online tool is of interest: the Google Indic Transliteration tool has great potential. If you do not want to go through any of the keyboard and input set-up with your operating system, you can visit this site and type in roman characters, and the tool will convert each word into Unicode Devanagari text.