Nepali Phonetics and Character Mapping for Building Speech Technology
Script to Phoneme Mapping
| Script | Phoneme (IPA) | Script | Phoneme (IPA) |
|---|---|---|---|
| अ | /ʌ/ | आ | /aː/ |
| इ | /i/ | ई | /iː/ |
| उ | /u/ | ऊ | /uː/ |
| ए | /e/ | ऐ | /ʌi/ |
| ओ | /o/ | औ | /au/ |
| अं | /ʌ̃/ | अँ | /ã/ |
| क | /k/ | ख | /kʰ/ |
| ग | /ɡ/ | घ | /ɡʱ/ |
| ङ | /ŋ/ | च | /t͡s/ |
| छ | /t͡sʰ/ | ज | /d͡z/ |
| झ | /d͡zʱ/ | ञ | /ɲ/ |
| ट | /ʈ/ | ठ | /ʈʰ/ |
| ड | /ɖ/ | ढ | /ɖʱ/ |
| ण | /ɳ/ | त | /t̪/ |
| थ | /tʰ/ | द | /d̪/ |
| ध | /dʱ/ | न | /n/ |
| प | /p/ | फ | /pʰ/ |
| ब | /b/ | भ | /bʱ/ |
| म | /m/ | य | /j/ |
| र | /r/ | ल | /l/ |
| व | /w/ | स | /s/ |
| ष | /ʂ/ | ह | /ɦ/ |
| श्र | /ʃr/ | ज्ञ | /d͡zɲ/ |
| क्ष | /kʃ/ | | |
JSON Representation
{
  "vowels": {
    "अ": "/ʌ/",
    "आ": "/aː/",
    "इ": "/i/",
    "ई": "/iː/",
    "उ": "/u/",
    "ऊ": "/uː/",
    "ऋ": "/r̩/",
    "ॠ": "/r̩ː/",
    "ए": "/e/",
    "ऐ": "/ʌi/",
    "ओ": "/o/",
    "औ": "/au/",
    "अं": "/ʌ̃/",
    "अँ": "/ã/",
    "अः": "/ʌh/"
  },
  "diacritics": {
    "ा": "/aː/",
    "ि": "/i/",
    "ी": "/iː/",
    "ु": "/u/",
    "ू": "/uː/",
    "ृ": "/r̩/",
    "ॆ": "/e/",
    "े": "/e/",
    "ै": "/ʌi/",
    "ॉ": "/o/",
    "ो": "/o/",
    "ौ": "/au/",
    "्": "virama",
    "ः": "/h/"
  },
  "consonants": {
    "क": "/k/",
    "ख": "/kʰ/",
    "ग": "/ɡ/",
    "घ": "/ɡʱ/",
    "ङ": "/ŋ/",
    "च": "/t͡s/",
    "छ": "/t͡sʰ/",
    "ज": "/d͡z/",
    "झ": "/d͡zʱ/",
    "ञ": "/ɲ/",
    "ट": "/ʈ/",
    "ठ": "/ʈʰ/",
    "ड": "/ɖ/",
    "ढ": "/ɖʱ/",
    "ड़": "/ɽ/",
    "ढ़": "/ɽʱ/",
    "ण": "/ɳ/",
    "त": "/t̪/",
    "थ": "/tʰ/",
    "द": "/d̪/",
    "ध": "/dʱ/",
    "न": "/n/",
    "प": "/p/",
    "फ": "/pʰ/",
    "ब": "/b/",
    "भ": "/bʱ/",
    "म": "/m/",
    "य": "/j/",
    "र": "/r/",
    "ल": "/l/",
    "व": "/w/",
    "श": "/ʃ/",
    "ष": "/ʂ/",
    "स": "/s/",
    "ह": "/ɦ/",
    "श्र": "/ʃr/",
    "ज्ञ": "/d͡zɲ/",
    "क्ष": "/kʃ/"
  },
  "digits": {
    "०": "0",
    "१": "1",
    "२": "2",
    "३": "3",
    "४": "4",
    "५": "5",
    "६": "6",
    "७": "7",
    "८": "8",
    "९": "9"
  },
  "punctuation": {
    "!": "exclamation",
    "\"": "quotation",
    "'": "apostrophe",
    ",": "comma",
    ".": "period",
    ":": "colon",
    "?": "question_mark",
    "।": "danda",
    "_": "underscore"
  }
}
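As a rough illustration of how this JSON might be consumed, the sketch below parses a small inlined excerpt of the mapping and flattens it into a single symbol lookup. The names `MAPPING_JSON`, `build_lookup`, and `devanagari_digits_to_ascii` are hypothetical helpers chosen for this example, not part of any library.

```python
import json

# A small excerpt of the mapping above, inlined so the example runs on its own.
MAPPING_JSON = """
{
  "vowels": {"अ": "/ʌ/", "आ": "/aː/"},
  "diacritics": {"ा": "/aː/", "ि": "/i/", "्": "virama"},
  "consonants": {"क": "/k/", "म": "/m/", "क्ष": "/kʃ/"},
  "digits": {"०": "0", "१": "1", "२": "2"},
  "punctuation": {"।": "danda"}
}
"""


def build_lookup(mapping):
    """Flatten category -> {symbol: value} into symbol -> (category, value)."""
    lookup = {}
    for category, table in mapping.items():
        for symbol, value in table.items():
            lookup[symbol] = (category, value)
    return lookup


def devanagari_digits_to_ascii(text, mapping):
    """Replace Devanagari digits with ASCII digits; leave other characters as-is."""
    digits = mapping["digits"]
    return "".join(digits.get(ch, ch) for ch in text)


mapping = json.loads(MAPPING_JSON)
lookup = build_lookup(mapping)

print(lookup["क"])                                  # ('consonants', '/k/')
print(devanagari_digits_to_ascii("२०१", mapping))   # 201
```

Keeping the category alongside each value lets later stages treat a matra or the virama differently from a full consonant or vowel.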
What were the rules?
The rules are based on the Wikipedia article on Nepali phonology: https://en.wikipedia.org/wiki/Nepali_phonology
Consonants
Spoken Nepali has 30 consonants in its native system, though some have tried to limit the number to 27.
| Manner | Bilabial | Dental | Alveolar | Retroflex | Dorsal | Glottal |
|---|---|---|---|---|---|---|
| Nasal | m (म) | | n (न/ञ) | (ɳ (ण)) | ŋ (ङ) | |
| Plosive/Affricate, voiceless unaspirated | p (प) | t̪ (त) | t͡s (च) | ʈ (ट) | k (क) | |
| Plosive/Affricate, voiceless aspirated | pʰ (फ) | tʰ (थ) | t͡sʰ (छ) | ʈʰ (ठ) | kʰ (ख) | |
| Plosive/Affricate, voiced unaspirated | b (ब) | d̪ (द) | d͡z (ज) | ɖ (ड) | ɡ (ग) | |
| Plosive/Affricate, voiced aspirated | bʱ (भ) | dʱ (ध) | d͡zʱ (झ) | ɖʱ (ढ) | ɡʱ (घ) | |
| Fricative | | | s (स/श/ष) | | | ɦ (ह) |
| Trill | | | r (र) | | | |
| Approximant | (w (व)) | | l (ल) | | (j (य)) | |
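The table groups several letters under one spoken phoneme (न/ञ under n, स/श/ष under s), while the letter-level mapping keeps them distinct as /ɲ/, /ʃ/, and /ʂ/. A minimal sketch of folding the letter-level symbols down to the spoken inventory is shown here; `MERGERS` and `fold_to_spoken` are hypothetical names, and whether to fold at all is a design decision that depends on the speech system being built.

```python
# Letter-level IPA symbols (keys) folded into the spoken phoneme
# that the consonant table lists for them (values).
MERGERS = {
    "ɲ": "n",  # ञ is grouped with न
    "ʃ": "s",  # श is grouped with स
    "ʂ": "s",  # ष is grouped with स
}


def fold_to_spoken(phonemes):
    """Map each phoneme through MERGERS, leaving unlisted phonemes unchanged."""
    return [MERGERS.get(p, p) for p in phonemes]


print(fold_to_spoken(["ʃ", "aː", "ɲ"]))  # ['s', 'aː', 'n']
```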
Vowels
Nepali has 11 phonologically distinctive vowels, including 6 oral vowels and 5 nasal vowels. In some contexts, intervocalic "h" leads to breathy-voiced vowels.
| | Front (oral) | Front (nasal) | Central (oral) | Central (nasal) | Back (oral) | Back (nasal) |
|---|---|---|---|---|---|---|
| Close | i (इ) | ĩ (इँ) | | | u (उ) | ũ (उँ) |
| Close-mid | e (ए) | ẽ (एँ) | | | o (ओ) | |
| Open-mid | | | ʌ (अ) | ʌ̃ (अँ) | | |
| Open | | | ä (आ) | ã (आँ) | | |
Normalization Rules
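- Handling Halanta (्): A halanta after a consonant suppresses that consonant's inherent vowel "अ", so क् is /k/ rather than /kʌ/.
- Combining Matras: A matra after a consonant replaces the inherent vowel "अ" with the matra's vowel, so कि is /ki/.
- Clusters and Special Cases: Map conjuncts such as क्ष, ज्ञ, and श्र as single units using their entries in the mapping (/kʃ/, /d͡zɲ/, /ʃr/).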
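The three rules above can be sketched as a small grapheme-to-phoneme pass. This is an illustrative sketch under stated assumptions, not a complete implementation: the tables are a small subset of the mapping with the slashes stripped, and `to_phonemes` is a hypothetical function name.

```python
HALANTA = "\u094d"  # ् (virama)
INHERENT = "ʌ"      # inherent vowel अ

# Assumption: small subsets of the full mapping, enough for the examples below.
CLUSTERS = {"क्ष": "kʃ", "ज्ञ": "d͡zɲ", "श्र": "ʃr"}
CONSONANTS = {"क": "k", "न": "n", "प": "p", "म": "m", "र": "r", "ल": "l", "स": "s"}
MATRAS = {"ा": "aː", "ि": "i", "ी": "iː", "ु": "u", "ू": "uː", "े": "e", "ो": "o"}
VOWELS = {"अ": "ʌ", "आ": "aː", "इ": "i", "ई": "iː", "उ": "u", "ऊ": "uː", "ए": "e", "ओ": "o"}


def to_phonemes(word):
    """Convert a Devanagari word into a list of phoneme strings."""
    phonemes = []
    i = 0
    while i < len(word):
        if word[i:i + 3] in CLUSTERS:
            # Clusters: consonant + halanta + consonant mapped as one unit.
            onset = CLUSTERS[word[i:i + 3]]
            i += 3
        elif word[i] in CONSONANTS:
            onset = CONSONANTS[word[i]]
            i += 1
        else:
            # Independent vowel, or an unknown symbol passed through unchanged.
            phonemes.append(VOWELS.get(word[i], word[i]))
            i += 1
            continue

        nxt = word[i] if i < len(word) else ""
        if nxt == HALANTA:
            # Halanta: drop the inherent vowel.
            phonemes.append(onset)
            i += 1
        elif nxt in MATRAS:
            # Matra: replace the inherent vowel with the matra's vowel.
            phonemes.extend([onset, MATRAS[nxt]])
            i += 1
        else:
            # No sign follows: keep the inherent vowel.
            phonemes.extend([onset, INHERENT])
    return phonemes


print(to_phonemes("नेपाल"))  # ['n', 'e', 'p', 'aː', 'l', 'ʌ']
print(to_phonemes("क्षमा"))   # ['kʃ', 'ʌ', 'm', 'aː']
```

Note that this sketch always keeps the inherent vowel when no sign follows, so नेपाल ends in ʌ. Dropping the inherent vowel in positions where it is not pronounced would need an additional rule that the list above does not cover.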
