About

Most applications in natural language understanding (NLU) of text start with parsing. Grammar checking in word processors, machine translation, text-to-speech, question generation, search, and similar applications all require a sophisticated parser to analyze sentences into their constituent parts. It is important for the parser to get the constituents right; otherwise, parser errors are compounded as they pass through downstream applications (as in the broken telephone game).

Over the past several years, a unique rule-based parser for Classical Sanskrit has been under construction; it can parse complex non-prose texts (click here for the tasks involved in Sanskrit parsing). The need for such a parser arose because no other software tool at present can split a complex Sanskrit verse into clauses and then identify the components of each clause. While a single verse of around 10-15 terms may expand combinatorially into hundreds of thousands of 'candidate' grammatical analyses, the parser must output the one analysis that 'best' satisfies the rules of grammar. The parser must therefore be made aware of the rules of Sanskrit grammar.
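To give a feel for the scale of this combinatorial expansion, here is a minimal sketch in Python. It assumes, purely for illustration, that each term in a verse has a small number of independent candidate analyses and that a whole-verse candidate picks one analysis per term; the function name and the per-term counts are hypothetical and do not come from the parser itself.

```python
from math import prod
from typing import Sequence


def count_candidates(analyses_per_term: Sequence[int]) -> int:
    """Number of whole-verse candidates when one analysis is chosen per term.

    Assumption: the choices for different terms are independent, so the
    total is simply the product of the per-term counts.
    """
    return prod(analyses_per_term)


# Hypothetical verse of 12 terms, each with 3 possible analyses.
verse = [3] * 12
print(count_candidates(verse))  # 531441
```

Even with only three analyses per term, a 12-term verse already yields more than half a million candidates, which is why the candidates have to be pruned by grammatical rules rather than inspected one by one.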

The construction of this Sanskrit parser required a serious study of the key elements of the Ashtadhyayi ('the eight lessons'), a work of extraordinary genius written by Panini (~500 BCE) as a descriptive grammar of Sanskrit. The Ashtadhyayi incorporates elements that were the precursors of several concepts of modern linguistics (such as 'production systems', 'phonological features', 'morphemes', 'phonotactics', 'elision', 'trace', 'government', and 'agreement'). Among other things, this system describes a systematic sequence of computational operations that applies abstract morphemes to transform an underlying abstract representation of a verb or a noun into its concrete phonological form, i.e. a verb conjugation or a nominal declension (click here for the computational operations involved in a simple nominal declension; refer to [JRB1995] [1] for an English translation of an abridged form of the Ashtadhyayi).
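The idea of an ordered sequence of operations that turns an abstract stem-plus-affix representation into a surface form can be sketched as a small rewrite pipeline. The Python below is a toy illustration only: it is loosely modeled on the traditional derivation of the nominative singular rāmaḥ from the stem rāma and the affix su, the intermediate steps are heavily simplified, and all function and rule names are hypothetical. It is not the parser's implementation and not a faithful encoding of the Ashtadhyayi's rules.

```python
from typing import Callable, List, Tuple

Form = List[str]  # derivation state: a list of morpheme strings


def drop_marker(form: Form) -> Form:
    # The affix is taught as "su"; its "u" is treated as a marker and dropped.
    if len(form) == 2 and form[1] == "su":
        return [form[0], "s"]
    return form


def join(form: Form) -> Form:
    # Concatenate stem and affix into a single word.
    return ["".join(form)]


def final_s_to_r(form: Form) -> Form:
    # Simplified: a word-final "s" is replaced by "r".
    word = form[0]
    return [word[:-1] + "r"] if word.endswith("s") else form


def final_r_to_visarga(form: Form) -> Form:
    # Simplified: a word-final "r" before a pause becomes visarga ("ḥ").
    word = form[0]
    return [word[:-1] + "ḥ"] if word.endswith("r") else form


# The rules are applied strictly in this order.
RULES: List[Tuple[str, Callable[[Form], Form]]] = [
    ("drop marker", drop_marker),
    ("join", join),
    ("final s -> r", final_s_to_r),
    ("final r -> visarga", final_r_to_visarga),
]


def derive(stem: str, affix: str) -> List[Tuple[str, str]]:
    """Return the trace of a derivation as (rule name, form) pairs."""
    form: Form = [stem, affix]
    trace = [("input", " + ".join(form))]
    for name, rule in RULES:
        form = rule(form)
        trace.append((name, " + ".join(form)))
    return trace


for step, result in derive("rāma", "su"):
    print(f"{step:20} {result}")
```

The trace starts at `rāma + su` and ends at `rāmaḥ`. Keeping the rules in an ordered list makes the sequence of operations explicit, which is the property the paragraph above attributes to Panini's system.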


References

  1. [JRB1995] Ballantyne J. Laghu Kaumudi of Varadaraja: A Sanskrit grammar. Motilal Banarsidass Publishers: New Delhi; 1995.