Machine Transcription Conversion Between Perso-Arabic and Romanized Writing Systems
Independent thesis Advanced level (degree of Master (Two Years)), 30 credits / 45 HE credits
Student thesis
Perso-Arabic script is the official writing system in Iran. Romanized transcriptions, based on the phonology of Persian, have been used extensively in electronic communication, especially on the Internet. Converting between these two types of writing systems has been an interesting topic in Natural Language Processing. As in Machine Translation, these conversions can be applied at different grammatical levels, such as the sentence, phrase, or word level. In this thesis, choosing Dabire as a standard Romanized transcription, we introduce two approaches to such conversions at the word level. The lexicon-based approach uses Finite State Technology for bi-directional conversion between Perso-Arabic and Dabire. The second approach uses association analysis for statistical conversion from Perso-Arabic to Dabire.
Place, publisher, year, edition, pages
2010. 50 p.
Engineering and Technology
Identifiers
URN: urn:nbn:se:liu:diva-61029
ISRN: LIU-IDA/LITH-EX-A--09/042--SE
OAI: oai:DiVA.org:liu-61029
DiVA: diva2:360341