Gujarati script

From Wikipedia, the free encyclopedia

Gujarati
Type	Abugida
Spoken languages	Gujarati, Kutchi
Time period	c. 1600–present
Parent systems	Proto-Canaanite alphabet ^[a] → Phoenician alphabet ^[a] → Aramaic alphabet ^[a] → Brāhmī → Gupta → Nāgarī → Gujarati
Sister systems	Ranjana, Moḍī
Unicode range	U+0A80–U+0AFF
ISO 15924	`Gujr`
[a] The Semitic origin of the Brahmic scripts is not universally agreed upon.
Note: This page may contain IPA phonetic symbols in Unicode.

The Brahmic script and its descendants

Brāhmī

Northern Brahmic
- Kusan
- Gupta
  - Śāradā
    - Landa
      - Old Kashmiri
      - Gurmukhī
    - Takri
      - Dogri
      - Chameali
  - Siddhaṃ
    - Nāgarī
      - Devanāgarī
      - Nandināgarī
      - Gujarati
    - Proto-Bengali
  - Nepal
    - Bhujimol
    - Ranjana
  - Tibetan
    - ’Phagspa
Southern Brahmic
- Tamil Brahmi
  - Vatteluttu
    - Tamil
- Pallava Grantha
  - Malayalam
  - Tulu
  - Khmer
    - Thai
      - Lao
  - Old Kawi
    - Balinese
    - Javanese
  - Mon
    - Burmese
- Kalinga
- Kadamba
  - Old Kannada
    - Kannada
    - Telugu
- Sinhala

The Gujarati script (ગુજરાતી લિપિ, Gujǎrātī Lipi), which, like all Nāgarī writing systems, is strictly speaking an abugida rather than an alphabet, is used to write the Gujarati and Kutchi languages. It is a variant of the Devanāgarī script, differentiated by the loss of the characteristic horizontal line running above the letters and by a small number of modifications in the remaining characters.

With a few additional characters, added for this purpose, the Gujarati script is also often used to write Sanskrit.

Gujarati numerical digits are also different from their Devanagari counterparts.

1 Origin
2 Overview
3 Gujarati characters, diacritics, and numerals
4 Conjuncts
5 Romanization
6 Gujarati in Unicode
7 Gujarati keyboard layouts
- 7.1 Inscript keyboard layout
- 7.2 Keyboard and script resources
8 References
9 Bibliography
10 See also
11 External links

Origin

Gujarati script is descended from Brahmi and is part of the Brahmic family.

The Gujarātī script was adapted from the Devanāgarī script to write the Gujarātī language. The earliest known document in the Gujarātī script is a handwritten manuscript dating from 1592, and the script first appeared in print in a 1797 advertisement. Until the 19th century it was used mainly for writing letters and keeping accounts, while the Devanāgarī script was used for literature and academic writings. It is also known as the śarāphī (banker's), vāṇiāśāī (merchant's) or mahājanī (trader's) script.^[1]

Overview

Excerpt from "My experiments with truth" - the autobiography of Mahatma Gandhi in its original Gujarati.

The Gujarati writing system is an abugida, in which each base consonantal character possesses an inherent vowel, namely a [ə]. For postconsonantal vowels other than a, diacritics are applied to the consonant, while for non-postconsonantal vowels (in initial and post-vocalic positions) there are full-formed characters. With a being the most frequent vowel^[2], this is a convenient system in the sense that it cuts down on the width of writing.
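
As a rough illustration of the inherent-vowel principle, the following Python sketch builds written syllables from a base consonant and a vowel. The code points are those of the Gujarati Unicode block; the names KA, MATRAS, INDEPENDENT and syllable are hypothetical, chosen only for this example, and only a subset of the vowels is covered.

```python
# Illustrative sketch only: names below are assumptions, not a library API.
KA = "\u0A95"  # ક, a base consonant carrying the inherent vowel a

# Dependent vowel signs (diacritics) used after a consonant.
MATRAS = {
    "ā": "\u0ABE",
    "i": "\u0ABF",
    "ī": "\u0AC0",
    "u": "\u0AC1",
    "ū": "\u0AC2",
    "e": "\u0AC7",
    "o": "\u0ACB",
}

# Full-formed vowels used in initial and post-vocalic positions.
INDEPENDENT = {
    "a": "\u0A85",
    "ā": "\u0A86",
    "i": "\u0A87",
    "ī": "\u0A88",
    "u": "\u0A89",
    "ū": "\u0A8A",
    "e": "\u0A8F",
    "o": "\u0A93",
}


def syllable(consonant: str, vowel: str) -> str:
    """Return the written form of consonant + vowel.

    The inherent vowel a needs no diacritic, so the bare consonant is returned.
    """
    if vowel == "a":
        return consonant
    return consonant + MATRAS[vowel]


if __name__ == "__main__":
    for v in ["a", "ā", "i", "ī", "u", "ū", "e", "o"]:
        print(v, INDEPENDENT[v], syllable(KA, v))
```

Note that the syllable with a is one code point shorter than the others, which mirrors the saving in width described above.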

As a consequence of this property, consonants lacking a following vowel may condense into the following consonant, forming compound or conjunct letters. The formation of these conjuncts follows a system of rules depending on the consonants involved.

Like all the other Indic scripts, Gujarati is written from left to right and has no distinction of letter case.

The Gujarati script is basically phonemic, with a few exceptions.^[3] The first of these is the written representation of unpronounced a's, which are of three types.

Word-final a's. Thus ઘર "house" is pronounced ghar and not ghara. These a's remain unpronounced before postpositions and before other words in compounds: ઘરપર "on the house" is gharpar and not gharapar; ઘરકામ "housework" is gharkām and not gharakām. This non-pronunciation does not always hold with conjunct characters: મિત્ર "friend" is indeed mitra.
a's naturally elided through the combination of morphemes. The root પકડ pakaṛ "hold", when inflected as પકડે "holds", remains written as pakaṛe even though it is pronounced pakṛe. See Gujarati phonology, ə-deletion.
a's whose non-pronunciation follows the above rule, but which occur in single words that do not result from any actual combination. Thus વરસાદ "rain" is written as varasād but pronounced varsād.
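
The first of these three types can be sketched in code. The Python example below romanizes a word letter by letter and drops a word-final inherent a. It is a deliberately small sketch under several assumptions: the letter tables cover only the characters needed for the examples, the function name romanize is hypothetical, and the rule that a final a is kept after a conjunct is a simplification (the text only says that deletion is not always the case there). Medial a-deletion, as in gharkām, is not handled.

```python
# Illustrative sketch only; tables and names are assumptions for this example.
VIRAMA = "\u0ACD"

CONSONANTS = {
    "\u0A95": "k",   # ક
    "\u0A98": "gh",  # ઘ
    "\u0AA4": "t",   # ત
    "\u0AAE": "m",   # મ
    "\u0AB0": "r",   # ર
}

MATRAS = {
    "\u0ABE": "ā",
    "\u0ABF": "i",
}


def romanize(word: str) -> str:
    """Romanize a word, dropping a word-final inherent a (heuristic)."""
    out = []
    i = 0
    while i < len(word):
        ch = word[i]
        if ch not in CONSONANTS:
            raise ValueError(f"unsupported character: {ch!r}")
        out.append(CONSONANTS[ch])
        nxt = word[i + 1] if i + 1 < len(word) else ""
        if nxt == VIRAMA:
            # Virama: no vowel follows this consonant.
            i += 2
        elif nxt in MATRAS:
            # Explicit vowel sign replaces the inherent a.
            out.append(MATRAS[nxt])
            i += 2
        else:
            is_final = i == len(word) - 1
            after_virama = i > 0 and word[i - 1] == VIRAMA
            # Keep the inherent a unless it is word-final and the consonant
            # is not the last member of a conjunct (simplifying assumption).
            if not is_final or after_virama:
                out.append("a")
            i += 1
    return "".join(out)


if __name__ == "__main__":
    print(romanize("\u0A98\u0AB0"))                    # ઘર
    print(romanize("\u0AAE\u0ABF\u0AA4\u0ACD\u0AB0"))  # મિત્ર
```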

Secondly and most importantly, being derived from the Sanskrit-based Devanagari, the Gujarati script retains notations for obsolete distinctions (short i, u vs. long ī, ū; r̥, ru; ś, ṣ) and lacks notations for innovations (/e/ vs. /ɛ/; /o/ vs. /ɔ/; clear vs. murmured vowels).^[4]

Contemporary Gujarati uses European punctuation, such as the question mark, exclamation mark, comma, and full stop. Apostrophes are used for the rare(ly written) clitic. Quotation marks are less often used for direct quotes. The full stop has replaced the traditional vertical bar, and the colon, mostly obsolete in its Sanskritic capacity (see below), follows European usage.

Gujarati characters, diacritics, and numerals

Vowels

Vowels (svara), in their conventional order, are historically grouped into "short" (hrasva) and "long" (dīrgha) classes, based on the "light" (laghu) and "heavy" (guru) syllables they create in traditional verse. The historical long vowels ī and ū are no longer distinctively long in pronunciation. Only in verse do syllables containing them assume the values required by meter.^[5]

A practice of using inverted mātras to represent English [æ] and [ɔ] has also gained ground.^[3]

Independent	Diacritic	Diacritic of ક	Rom.	IPA	Name of diacritic^[6]
અ		ક	a	ə
આ	ા	કા	ā	ɑ̈	kāno
ઇ	િ	કિ	i	i	hrasva-ajju
ઈ	ી	કી	ī	i	dīrgha-ajju
ઉ	ુ	કુ	u	u	hrasva-varaṛũ
ઊ	ૂ	કૂ	ū	u	dīrgha-varaṛũ
ઋ	ૃ	કૃ	r̥	ɾu
એ	ે	કે	e, ɛ		ek mātra
ઐ	ૈ	કૈ	ai	əj	be mātra
ઓ	ો	કો	o, ɔ		kāno ek mātra
ઔ	ૌ	કૌ	au	əʋ	kāno be mātra
ઍ	ૅ	કૅ	â	æ
ઑ	ૉ	કૉ	ô	ɔ

ર r, જ j and હ h take irregular forms in રૂ rū, જી jī and હૃ hṛ.

Consonants

Consonants (vyañjana) are grouped in accordance with the traditional, linguistically based Sanskrit scheme of arrangement, which considers the usage and position of the tongue during their pronunciation. In sequence, these categories are: velar, palatal, retroflex, dental, labial, sonorant and fricative. Among the first five groups, which contain the stops, the ordering starts with the unaspirated voiceless, then goes on through the aspirated voiceless, unaspirated voiced, and aspirated voiced, ending with the nasal.

	Plosive												Nasal			Sonorant			Sibilant
	Voiceless						Voiced
	Unaspirated			Aspirated			Unaspirated			Aspirated
Velar	ક	ka	kə	ખ	kha	k^hə	ગ	ga	gə	ઘ	gha	g^ɦə	ઙ	ṅa	ŋə
Palatal	ચ	ca	tʃə	છ	cha	tʃ^hə	જ	ja	dʒə	ઝ	jha	dʒ^ɦə	ઞ	ña	ɲə	ય	ya	jə	શ	śa	ʃə
Retroflex	ટ	ṭa	ʈə	ઠ	ṭha	ʈ^hə	ડ	ḍa	ɖə	ઢ	ḍha	ɖ^ɦə	ણ	ṇa	ɳə	ર	ra	ɾə	ષ	ṣa	ʃə
Dental	ત	ta	t̪ə	થ	tha	t̪^hə	દ	da	d̪ə	ધ	dha	d̪^ɦə	ન	na	nə	લ	la	lə	સ	sa	sə
Labial	પ	pa	pə	ફ	pha	p^hə	બ	ba	bə	ભ	bha	b^ɦə	મ	ma	mə	વ	va	ʋə

Guttural	હ	ha	ɦə
Retroflex	ળ	ḷa	ɭə
	ક્ષ	kṣa	kʃə
	જ્ઞ	jña	gnə

Letters can take names, by suffixing કાર kār. ર ra is an exception; it's called રેફ reph.^[7]
Starting with ક ka and ending with જ્ઞ jña, the order goes^[8]:

Plosives & Nasals (left to right, top to bottom) → Sonorants & Sibilants (top to bottom, left to right) → Bottom box (top to bottom)

The final two are compound characters that are traditionally included in the set. Their original constituents are no longer discernible in the written form, and they are the same size as a single consonant character.
Written (V)hV sets in speech result in murmured V̤(C) sets (see Gujarati phonology#Murmur). Thus (with ǐ = i or ī, and ǔ = u or ū): ha → [ə̤] from /ɦə/; hā → [a̤] from /ɦa/; ahe → [ɛ̤] from /əɦe/; aho → [ɔ̤] from /əɦo/; ahā → [a̤] from /əɦa/; ahǐ → [ə̤j] from /əɦi/; ahǔ → [ə̤ʋ] from /əɦu/; āhǐ → [a̤j] from /ɑɦi/; āhǔ → [a̤ʋ] from /ɑɦu/; etc.

Non-vowel diacritics

Diacritic	Name	Function
ં	anusvāra	Represents vowel nasality or the nasal stop homorganic with the following stop.^[8]
ઃ	visarga	A silent, rarely used Sanskrit holdover originally representing [h]. Romanized as ḥ.
્	virāma	Strikes out a consonant's inherent a.^[9]

Digits

0	૦	mīṇḍuṃ
1	૧	ekaṛo
2	૨	bagaṛo
3	૩	tragaṛo
4	૪	cogaṛo
5	૫	pāṃcaṛo
6	૬	chagaṛo
7	૭	sātaṛo
8	૮	āthaṛo
9	૯	navaṛo
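
Because the ten digits occupy consecutive code points starting at U+0AE6, conversion to and from ordinary integers is a matter of simple offsets. The following Python sketch shows one way to do this; the function names are hypothetical and chosen for this example only.

```python
# Illustrative sketch only; function names are assumptions for this example.
GUJARATI_ZERO = 0x0AE6  # code point of ૦


def to_gujarati_digits(n: int) -> str:
    """Render a non-negative integer with Gujarati digits."""
    if n < 0:
        raise ValueError("only non-negative integers are supported")
    return "".join(chr(GUJARATI_ZERO + int(d)) for d in str(n))


def from_gujarati_digits(text: str) -> int:
    """Parse a string consisting solely of Gujarati digits."""
    if not text:
        raise ValueError("empty string")
    value = 0
    for ch in text:
        digit = ord(ch) - GUJARATI_ZERO
        if not 0 <= digit <= 9:
            raise ValueError(f"not a Gujarati digit: {ch!r}")
        value = value * 10 + digit
    return value


if __name__ == "__main__":
    written = to_gujarati_digits(1592)
    print(written, from_gujarati_digits(written))
```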

Conjuncts

As mentioned, successive consonants lacking a vowel in between them may physically join together as a 'conjunct'. The rules governing these clusters range from widely to narrowly applicable, with special exceptions within them. While conjuncts are standardized for the most part, there are certain variations in clustering, of which the rendering used on this page is just one scheme. The rules^[3]:

23 out of the 36 consonants contain a vertical right stroke (ખ, ધ, ળ etc.). As first or middle fragments/members of a cluster, they lose that stroke. e.g. ત + વ = ત્વ, ણ + ઢ = ણ્ઢ, થ + થ = થ્થ.
- શ ś(a) appears as a different, simple ribbon-shaped fragment preceding વ va, ન na, ચ ca and ર ra. Thus શ્વ śva, શ્ન śna, શ્ચ śca and શ્ર śra. In the first three cases the second member appears to be squished down to accommodate શ's ribbon fragment. In શ્ચ śca we see ચ's Devanagari equivalent of च as the squished-down second member. See the note on ર to understand the formation of શ્ર śra.
ર r(a)
- as a first member it takes the form of a curved upward dash above the final character or its kāno. e.g. ર્વ rva, ર્વા rvā, ર્સ્પ rspa, ર્સ્પા rspā.
- as a final member
  - with ટ, ઠ, ડ, ઢ and દ, it is two lines below the character, pointed downwards and apart. Thus ટ્ર, ઠ્ર, ડ્ર, ઢ્ર and દ્ર.
  - elsewhere it is a diagonal stroke jutting leftwards and down. e.g. ક્ર, ગ્ર, ભ્ર. ત ta is shifted up to make ત્ર tra.
Vertical combination of geminates ṭṭa, ṭhṭha, ḍḍa and ḍhḍha: ટ્ટ, ઠ્ઠ, ડ્ડ, ઢ્ઢ. Also, ટ્ઠ ṭṭha and ડ્ઢ ḍḍha.
As first shown with શ્ચ śca, while Gujarati is a separate script with its own novel characters, for compounds it will often use the Devanagari versions.
- દ d(a) as द preceding ગ ga, ઘ gha, ધ dha, બ ba (as ब), ભ bha, વ va, મ ma and ર ra. The first six second members are shrunken and hang at an angle off the bottom left corner of the preceding દ/द. Thus દ્ગ dga, દ્ઘ dgha, દ્ધ ddha, દ્બ dba, દ્ભ dbha, દ્વ dva, દ્મ dma and દ્ર dra.
- હ h(a) as ह preceding ન na, મ ma, ય ya, ર ra, વ va and ઋ ṛ. Thus હ્ન hna, હ્મ hma, હ્ય hya, હ્ર hra, હ્વ hva and હૃ hṛ.
- when ઙ ṅa and ઞ ña are first members we get second members of ક ka as क, ચ ca as च and જ ja as ज. ઙ forms compounds through vertical combination. ઞ's strokeless fragment connects to the stroke of the second member, jutting upwards while pushing the second member down. Thus ઙ્ક ṅka, ઙ્ગ ṅga, ઙ્ઘ ṅgha, ઙ્ક્ષ ṅkṣa, ઞ્ચ ñca and ઞ્જ ñja.
The remaining vertical stroke-less characters join by squeezing close together. e.g. ક્ય kya, જ્જ jja.
Outstanding special forms: ન્ન nna, ત્ત tta, દ્દ dda and દ્ય dya.
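
In Unicode text, none of these joined shapes is stored as a separate character. A conjunct is encoded as its member consonants with a virāma (U+0ACD) between them, and the font and text-shaping engine choose the joined form, which is why clustering can vary between systems. The Python sketch below illustrates the encoding; the helper name conjunct is hypothetical, and how each result is drawn depends on the font in use.

```python
# Illustrative sketch only; the helper name is an assumption for this example.
VIRAMA = "\u0ACD"  # ્


def conjunct(*consonants: str) -> str:
    """Join consonants with virama so a renderer may draw them as a conjunct."""
    if len(consonants) < 2:
        raise ValueError("a conjunct needs at least two consonants")
    return VIRAMA.join(consonants)


if __name__ == "__main__":
    ta, va = "\u0AA4", "\u0AB5"     # ત, વ
    ka, ssa = "\u0A95", "\u0AB7"    # ક, ષ
    ja, nya = "\u0A9C", "\u0A9E"    # જ, ઞ
    ra = "\u0AB0"                   # ર
    for members in [(ta, va), (ka, ssa), (ja, nya), (ra, va)]:
        text = conjunct(*members)
        # Show the cluster and the code points it is made of.
        print(text, [f"U+{ord(c):04X}" for c in text])
```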

The role and nature of Sanskrit must be taken into consideration to understand the occurrence of consonant clusters. The orthography of written Sanskrit was completely phonetic, and had a tradition of not separating words by spaces. Morphologically it was highly synthetic, and it had a great capacity to form large compound words. Thus clustering was highly frequent, and it is Sanskrit loanwords in the Gujarati language that account for most clusters. Gujarati, on the other hand, is more analytic, has phonetically smaller, simpler words, and has a script whose orthography is slightly imperfect (a-elision) and separates words by spaces. Thus evolved Gujarati words are less often a cause for clusters. The same can be said of Gujarati's other longstanding source of words, Persian, which also provides phonetically smaller and simpler words.

An example attesting to this general theme is that of the series of d- clusters. These are essentially Sanskrit clusters, using the original Devanagari forms. There are no cluster forms for formations such as dta, dka, etc. because such formations weren't permitted in Sanskrit phonology anyway. They are permitted under Gujarati phonology, but are written unclustered (પાદતું pādtuṃ "farting", કૂદકો kūdko "leap"), with patterns such as a-elision at work instead.

Romanization

Gujarati is romanized throughout Wikipedia in "standard orientalist" transcription as outlined in Masica (1991:xv). Being "primarily a system of transliteration from the Indian scripts, [and] based in turn upon Sanskrit" (cf. IAST), these are its salient features: subscript dots for retroflex consonants; macrons for etymologically, contrastively long vowels; h denoting aspirated stops. Tildes denote nasalized vowels and underlining denotes murmured vowels.

Vowels and consonants are outlined in the tables below. Finally, there are three Wikipedia-specific additions: f is used interchangeably with ph, representing the widespread realization of /p^h/ as [f]; â and ô for novel characters ઍ [æ] and ઑ [ɔ]; ǎ for [ə]'s where elision is uncertain. See Gujarati phonology for further clarification.

**Vowels**
	Front	Central	Back
Close	i/ī		u/ū
Mid	e		o
Mid	ɛ	a	ɔ
Open		ā

**Consonants**
	Bilabial		Labio- dental	Dental		Alveolar	Retroflex		Post-alv./ Palatal		Velar
Stop	p ph	b bh		t th	d dh		ṭ ṭh	ḍ ḍh			k kh	g gh
Affricate									c ch	j jh
Nasal	m					n	ṇ		ñ		ṅ
Fricative						s	ṣ		ś				h
Tap or Flap						r	ṛ ṛh
Approximant			v						y
Lateral approximant						l	ḷ

Gujarati in Unicode

The Unicode range for Gujarati script is from U+0A80 to U+0AFF. The ISCII Code-page identifier for Gujarati script is 57010.

The table below shows the glyphs that are implemented in Unicode standard 4.0.0. Empty cells indicate the code-points that are undefined or unused.

Gujarati Unicode.org chart (PDF)
	0	1	2	3	4	5	6	7	8	9	A	B	C	D	E	F
U+0A8x		ઁ	ં	ઃ		અ	આ	ઇ	ઈ	ઉ	ઊ	ઋ	ઌ	ઍ		એ
U+0A9x	ઐ	ઑ		ઓ	ઔ	ક	ખ	ગ	ઘ	ઙ	ચ	છ	જ	ઝ	ઞ	ટ
U+0AAx	ઠ	ડ	ઢ	ણ	ત	થ	દ	ધ	ન		પ	ફ	બ	ભ	મ	ય
U+0ABx	ર		લ	ળ		વ	શ	ષ	સ	હ			઼	ઽ	ા	િ
U+0ACx	ી	ુ	ૂ	ૃ	ૄ	ૅ		ે	ૈ	ૉ		ો	ૌ	્
U+0ADx	ૐ
U+0AEx	ૠ	ૡ	ૢ	ૣ			૦	૧	૨	૩	૪	૫	૬	૭	૮	૯
U+0AFx		૱

For further details regarding Unicode code-points and standards, you may refer to Unicode Code-chart, Standard 4.1. Further details on how to use Unicode for creating Gujarati script can be found on Wikibooks: b:How to use Unicode in creating Gujarati script.
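
The range U+0A80 to U+0AFF can also be inspected programmatically. The Python sketch below tests whether a character falls in the block and rebuilds a chart of the same shape as the table above using the standard unicodedata module. The function names are hypothetical. Which code points count as assigned depends on the Unicode version bundled with the Python interpreter, so the output may show more characters than the 4.0.0 table.

```python
# Illustrative sketch only; function names are assumptions for this example.
import unicodedata

GUJARATI_START, GUJARATI_END = 0x0A80, 0x0AFF


def is_gujarati(ch: str) -> bool:
    """Return True if the single character lies in the Gujarati block."""
    return GUJARATI_START <= ord(ch) <= GUJARATI_END


def chart_rows():
    """Return (row label, 16 cells) pairs; unassigned cells are empty strings."""
    rows = []
    for base in range(GUJARATI_START, GUJARATI_END + 1, 16):
        cells = []
        for cp in range(base, base + 16):
            ch = chr(cp)
            # unicodedata.name returns the default "" for unassigned code points.
            cells.append(ch if unicodedata.name(ch, "") else "")
        rows.append((f"U+{base >> 4:03X}x", cells))
    return rows


if __name__ == "__main__":
    for label, cells in chart_rows():
        print(label, "\t".join(cells), sep="\t")
```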

Gujarati keyboard layouts

Inscript keyboard layout

Keyboard and script resources

The India Linux Project - Gujarati
MS Windows keyboard layout reference for major world languages
Sun Microsystem reference: Indic keyboard layouts
Linux: Indic language support
Microsoft — Indic language website: Use of Gujarati Input Method Editor (IME) (free download)
How To: Set your existing keyboard as Gujarati (Unicode) keyboard in Windows XP
Indic Multilingual Project by Centre for Development of Advanced Computing — C-DAC India

References

[1] Mistry 1996, p. 391
[2] Tisdall 1892, p. 19
[3] Mistry 1996, p. 393
[4] Mistry 2001, p. 274
[5] Mistry 1996, pp. 391-392
[6] Tisdall 1892, p. 20
[7] Dwyer 1995, p. 18
[8] Cardona & Suthar 2003, p. 668
[9] Mistry 1996, p. 392

Bibliography

Cardona, George & Babu Suthar (2003), "Gujarati", in Cardona, George & Dhanesh Jain, The Indo-Aryan Languages, Routledge, ISBN 9780415772945, <http://books.google.com/books?id=jPR2OlbTbdkC&pg=PA659&dq=indo-aryan+languages&sig=69z4DJxBuD4SPTTINIbzK_YW6ac>.
Dwyer, Rachel (1995), Teach Yourself Gujarati, London: Hodder and Stoughton, <http://www.racheldwyer.com/publications.html>.
Masica, Colin (1991), The Indo-Aryan Languages, Cambridge: Cambridge University Press, ISBN 9780521299442, <http://books.google.com/books?id=J3RSHWePhXwC&printsec=frontcover&dq=indo-aryan+languages>.
Mistry, P.J. (2001), "Gujarati", in Garry, Jane & Carl Rubino, An encyclopedia of the world's major languages, past and present, New England Publishing Associates.
Mistry, P.J. (1996), "Gujarati Writing", in Daniels & Bright, The World's Writing Systems, Oxford University Press.
Tisdall, W.S. (1892), A Simplified Grammar of the Gujarati Language, <http://www.archive.org/details/simplifiedgramma00tisdiala>.

See also

Wikibooks: How to use Unicode in creating Gujarati script
Unicode and HTML
Yudit - open source tool for editing in Gujarati and other Unicode scripts.
Gujarati course in Wikibooks

External links

TDIL: Ministry of Communication & Information Technology, India
Gujarati Wiktionary
Gujarati Editor
Send email in Gujarati script (No fonts required)
