The (unofficial) Unicode Indic FAQ Errata

By Andy White

The information contained in this document was first published on 17th November 2002.

 

Introduction

This document was created simply because the errors in the Unicode Indic FAQ have remained uncorrected for long enough to suggest that it is necessary.


Errors

For ease of reference, the 'errors' are numbered 1 to 5.

1. Under 'How does Unicode differ from ISCII?':

"Unicode and ISCII differ slightly but are easily converted back and forth without loss of information"

Unicode & ISCII cannot be interchanged without loss of information. This is because ISCII contains letters that are not in Unicode (e.g. Bengali letter Va) and Unicode contains letters that are not in ISCII (e.g. Oriya letter Wa).
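The point can be illustrated with a minimal sketch. The two repertoires below are toy sets of symbolic letter names chosen for this example (they are assumptions, not real ISCII byte values or complete character lists); only the Bengali Va and Oriya Wa entries reflect the claim made above.

```python
# Toy repertoires: symbolic letter names only (hypothetical, for illustration).
# They are NOT real ISCII byte values or complete character inventories.
ISCII_LETTERS = {"BENGALI KA", "BENGALI VA", "ORIYA KA"}
UNICODE_LETTERS = {"BENGALI KA", "ORIYA KA", "ORIYA WA"}


def unconvertible(source, target):
    """Return the letters in `source` that have no counterpart in `target`."""
    return sorted(source - target)


# Letters that would be lost going ISCII -> Unicode, and Unicode -> ISCII.
lost_to_unicode = unconvertible(ISCII_LETTERS, UNICODE_LETTERS)
lost_to_iscii = unconvertible(UNICODE_LETTERS, ISCII_LETTERS)

print("ISCII -> Unicode loses:", lost_to_unicode)
print("Unicode -> ISCII loses:", lost_to_iscii)
```

As long as either list is non-empty, a round trip in that direction cannot preserve every letter.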


2. Under 'How do the Indic scripts work in Unicode?':

"[The Unicode Indic] model is the same as the ISCII model"

This is only partially true because Unicode does not contain controlling ISCII letters such as the invisible letter (INV).


3. Under 'How does Unicode differ from ISCII?':

It is stated that ISCII 'consonant+Virama+Virama' is encoded as 'consonant Virama ZWJ' in Unicode.

This is wrong! ISCII 'Consonant+Virama+Virama' is encoded as 'Consonant Virama ZWNJ' in Unicode.
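As a minimal sketch, the corrected sequence can be built as follows. Devanagari KA is used as the consonant purely as an assumption for this example, and the helper name explicit_virama is hypothetical; the same pattern applies with the Virama of any other Indic script block.

```python
import unicodedata

KA = "\u0915"      # DEVANAGARI LETTER KA (example consonant, an assumption)
VIRAMA = "\u094D"  # DEVANAGARI SIGN VIRAMA
ZWNJ = "\u200C"    # ZERO WIDTH NON-JOINER


def explicit_virama(consonant, virama=VIRAMA):
    """Build 'Consonant Virama ZWNJ', the sequence given in the correction above."""
    return consonant + virama + ZWNJ


seq = explicit_virama(KA)
names = [unicodedata.name(ch) for ch in seq]

# List each code point with its Unicode character name.
for ch, name in zip(seq, names):
    print(f"U+{ord(ch):04X}  {name}")
```

How the sequence is displayed depends on the font and rendering engine in use; the sketch only shows the encoded characters.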


4. The section in the Unicode Indic FAQ that deals with the invisible letter is wrong:

The ISCII standard does not state that the INV letter is required to form vocalic L, LL & Ri etc. However, in some ISCII applications, INV may be required to form Isolated Vowel Sign Ri.

5. Also in the above section (where the Reph is mentioned):

The FAQ states that the ISCII sequence 'INV Virama Ra' is to be encoded as 'Space Virama Ra' in Unicode. This raises the question: how does one encode 'Space Virama Ra'? This statement must be wrong.

The isolated Ra-kar (AKA Raphalaa, Vattu etc.) indicated by 'INV Virama Ra' is better encoded in Unicode with the aid of control characters. The suggested sequence for the isolated Ra-kar is 'ZWNJ ZWJ Virama Ra'.
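The suggested sequence can be sketched as below. The Devanagari Virama and Ra are used only as an assumption for this example, and the helper names isolated_rakar and code_points are hypothetical. Whether a given font or rendering engine displays the sequence as an isolated Ra-kar is not something this sketch can show.

```python
ZWNJ = "\u200C"    # ZERO WIDTH NON-JOINER
ZWJ = "\u200D"     # ZERO WIDTH JOINER
VIRAMA = "\u094D"  # DEVANAGARI SIGN VIRAMA (example script, an assumption)
RA = "\u0930"      # DEVANAGARI LETTER RA


def isolated_rakar(virama=VIRAMA, ra=RA):
    """Build the suggested 'ZWNJ ZWJ Virama Ra' sequence."""
    return ZWNJ + ZWJ + virama + ra


def code_points(text):
    """Format a string as space-separated U+XXXX code points."""
    return " ".join(f"U+{ord(ch):04X}" for ch in text)


rakar = isolated_rakar()
print(code_points(rakar))
```

Passing the Virama and Ra of another script block as arguments gives the corresponding sequence for that script.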


This document last updated March 11, 2003