neeraj-gangwar
A presentation on why Sanskrit can be used as a computer language.
Sanskrit as a Computer Language
Contents
About the topic
Sanskrit
Computational Linguistics
Language Processing
Sanskrit as a candidate for Computer Language
About the topic
Transaction Processing involves Receipt, Storage, Manipulation/Processing, Transfer and Retrieval of Information.
The idea of using a natural language for computer programming is to let people talk to computers in their native tongue and spare them the pain of learning a computer-friendly language such as assembly, C or Java.
In natural language processing and related fields, study of complex problems is required.
Sanskrit is considered to be one of the best-structured languages.
Its advantages are its richness, strength, accuracy, structure and flexibility, and the extant work available in it.
So Sanskrit is a candidate for computer programming, in the fields of natural language processing and Artificial Intelligence.
Sanskrit
A historical Indo-Aryan language, the primary liturgical language of Hinduism and a literary and scholarly language in Buddhism and Jainism.
Today, it is listed as one of the 22 scheduled languages of India and is an official language of the state of Uttarakhand.
Member of the Indo-Iranian sub-family of the Indo-European family of languages.
Closest ancient relatives are the Iranian languages Old Persian and Avestan.
The earliest known linguistic activities date to Iron Age India (~8th century BC) with the analysis of Sanskrit.
Computational Linguistics
Computational linguistics is an interdisciplinary field dealing with the statistical or rule-based modeling of natural language from a computational perspective.
When machine translation failed to yield accurate translations right away, automated processing of human languages was recognized as far more complex than had originally been assumed.
Computational linguistics was born as a new field of study devoted to algorithms for intelligently processing language data.
In order to translate one language into another:
1. One had to understand the grammar of both languages, including both morphology and syntax.
2. In order to understand syntax, one had to also understand the semantics and the vocabulary, and even to understand something of the pragmatics of language use.
3. Thus, what started as an effort to translate between languages evolved into an entire discipline devoted to understanding how to represent and process natural languages using computers.
Steps followed are:
1. Morphological and Lexical Analysis
2. Syntactic Analysis
3. Semantic Analysis
4. Discourse Integration
5. Pragmatic Analysis
Language Processing
Sanskrit as a candidate for Computer Language
Sanskrit is a strong candidate for Computer Language, in the fields of Natural Language Processing and Artificial Intelligence, because of its:
1. Phonetics
2. Analysis of Parts and forms of speech
3. Flexibility of Word-Formation
4. Structure of Grammar
5. Disambiguation Rules
6. Variety and richness of technical literature
Phonetics
One of the significant advantages of Sanskrit is that the grammar ensures total precision and guards against ambiguity, misspelling and mispronunciation, since the meanings are bound to be altered otherwise.
The real advantage is that the correlation between written and spoken forms is one-to-one, so the two forms of input can be used interchangeably.
The analysis of alphabets (characters) is based on sound production from well-defined places of utterance.
All valid words have a proper derivation from a finite set of well-grouped verb roots and noun bases, so that what is meant is uniquely determined and accuracy is ensured.
Speech synthesis can possibly benefit immensely from this feature, since the discrepancies of accent, frequency, emphasis and timing associated with speech input in other natural languages are absent here.
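The grouping of characters by place of utterance can be sketched as a small lookup table. The example below is only an illustration: it covers the 25 stop and nasal consonants in IAST transliteration, and the names PLACES, MANNERS, GRID and classify are hypothetical, not taken from the presentation.

```python
# Grid of the 25 Sanskrit stop and nasal consonants (IAST transliteration).
# Rows are places of articulation, columns are manners of articulation.
# All names here are illustrative assumptions.
PLACES = ["velar", "palatal", "retroflex", "dental", "labial"]
MANNERS = [
    "unvoiced unaspirated",
    "unvoiced aspirated",
    "voiced unaspirated",
    "voiced aspirated",
    "nasal",
]
GRID = [
    ["k", "kh", "g", "gh", "ṅ"],
    ["c", "ch", "j", "jh", "ñ"],
    ["ṭ", "ṭh", "ḍ", "ḍh", "ṇ"],
    ["t", "th", "d", "dh", "n"],
    ["p", "ph", "b", "bh", "m"],
]


def classify(consonant):
    """Return (place, manner) for a consonant in the grid, or None if absent."""
    for place, row in zip(PLACES, GRID):
        for manner, sound in zip(MANNERS, row):
            if sound == consonant:
                return place, manner
    return None


print(classify("bh"))  # ('labial', 'voiced aspirated')
```

Vowels, semivowels and sibilants are left out to keep the sketch short; they would need further rows.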
Analysis of parts & forms of speech
From various categories of words in Sanskrit, a matrix of all possible valid word forms can be generated by formalization of the grammar rules.
The grammar has simple, effective rules for conversion between the various word forms and sentences.
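As an illustration of such a matrix, the sketch below generates the 8 cases x 3 numbers of a masculine noun stem ending in -a, using the standard paradigm of deva ("god") in IAST transliteration. It is a minimal sketch under stated assumptions: the function name decline and the table layout are hypothetical, only this one stem class is handled, and internal sound changes (for example n becoming ṇ in rāmeṇa) are ignored.

```python
# Cases and numbers that index the matrix of noun forms.
CASES = ["nominative", "accusative", "instrumental", "dative",
         "ablative", "genitive", "locative", "vocative"]
NUMBERS = ["singular", "dual", "plural"]

# Endings for masculine stems in -a, attached after dropping the final "a".
# One row per case, one column per number.
A_STEM_ENDINGS = [
    ["aḥ", "au", "āḥ"],
    ["am", "au", "ān"],
    ["ena", "ābhyām", "aiḥ"],
    ["āya", "ābhyām", "ebhyaḥ"],
    ["āt", "ābhyām", "ebhyaḥ"],
    ["asya", "ayoḥ", "ānām"],
    ["e", "ayoḥ", "eṣu"],
    ["a", "au", "āḥ"],
]


def decline(stem):
    """Return {(case, number): form} for a masculine stem ending in short a."""
    if not stem.endswith("a"):
        raise ValueError("this sketch only handles stems ending in short a")
    base = stem[:-1]
    return {
        (case, number): base + ending
        for case, row in zip(CASES, A_STEM_ENDINGS)
        for number, ending in zip(NUMBERS, row)
    }


forms = decline("deva")
print(forms[("genitive", "plural")])  # devānām
```

A full treatment would need one such table per stem class and gender, plus the sound-change rules applied on top.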
Flexibility in word-formation
All valid word-forms have 2 significant parts: the stem or substrate and the affixes.
The admissible combinations of substrates and affixes are governed by elaborate but clear rules, which also specify the attendant changes in meaning, so the meaning of a form is directly derivable.
Distortions cannot pass unnoticed in Sanskrit, whether written or spoken; any violation of the rules will be transparent to linguists.
Structure of grammar
All technical literature in Sanskrit has a fundamental set of ‘Aphorisms’. These are termed “Sutras”.
Grammar rules are in this ‘sutra’ style, which greatly condenses the amount of instruction or information needed to convey a particular aspect precisely.
They are contained in eight chapters with four quarters per chapter.
Verb-roots are grouped into 10 categories, each having a given group suffix, besides the verb-forming suffixes added to the roots to form verbs.
The classification of verbs consists of six tenses and four moods in which a verb can be expressed.
There are 3 numbers, as in the case of nouns, and verb-roots can take one of 3 persons, i.e. first, second and third (I, you and it).
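The 3 persons x 3 numbers grid can be sketched in the same way. The example below conjugates the present indicative of a stem ending in -a, using bhava (from the root bhū, "to be") in IAST transliteration. The function name conjugate is hypothetical, and the sketch covers only this one tense and stem type.

```python
# Persons and numbers that index the grid of present-tense verb forms.
PERSONS = ["first", "second", "third"]
NUMBERS = ["singular", "dual", "plural"]

# Present indicative active endings for stems in -a,
# attached after dropping the final "a" of the stem.
PRESENT_ENDINGS = [
    ["āmi", "āvaḥ", "āmaḥ"],   # first person  (I, we two, we)
    ["asi", "athaḥ", "atha"],  # second person (you)
    ["ati", "ataḥ", "anti"],   # third person  (he/she/it, they)
]


def conjugate(stem):
    """Return {(person, number): form} for a present stem ending in short a."""
    if not stem.endswith("a"):
        raise ValueError("this sketch only handles stems ending in short a")
    base = stem[:-1]
    return {
        (person, number): base + ending
        for person, row in zip(PERSONS, PRESENT_ENDINGS)
        for number, ending in zip(NUMBERS, row)
    }


forms = conjugate("bhava")
print(forms[("third", "singular")])  # bhavati
```

The remaining tenses and moods would each need their own ending table, and roots from other categories form their stems differently.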
Disambiguation Rules
The absence of a rigid word-order syntax in Sanskrit is a definite plus point in its favor. The semantics can also be extracted by well laid out procedures.
Here the rules of syllogism and mimamsa are utilized. The mechanism of associating meanings with words is dealt with in detail, and guidelines for establishing meanings at word, sentence and discourse level are given.
Rules to guide priority, conflict handling, exceptions and special cases are well defined to ensure precision and accuracy.
Numerous illustrations are provided in works, including commentaries, treatises, notes and expositions, that explain the fundamentals contained in the sutras.
Variety and Richness of Technical Literature
Technical literature comprises 14 branches of learning.
4 Vedas: Rig Veda, Yajur Veda, Sama Veda and Atharva Veda.
6 Vedic auxiliaries: phonology, grammar, prosody, etymology, astronomy and ritual.
4 further branches: study of Vedic texts, syllogism, epics and codes of moral rectitude.
There is a great treasure of knowledge contained in these in an efficient and streamlined manner.
Difference in approach to Language Analysis
The analysis of a sentence was not based on a noun-phrase model.
Sentence description was phrased in terms of a generative model: From a number of primitive syntactic categories (verbal action, agents, object, etc.) the structure of the sentence was derived so that every word of a sentence could be referred back to the syntactic input categories.
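A toy sketch of referring each word back to a syntactic category is given below. It assumes a three-word active sentence, rāmaḥ grāmam gacchati ("Rama goes to the village"), written in IAST without sandhi between words, and it identifies roles purely from word endings. The rule table ROLE_BY_ENDING and the function assign_roles are hypothetical and far simpler than a real analysis.

```python
from itertools import permutations

# Toy rules mapping a word ending to a syntactic category.
# These are illustrative assumptions that only fit the example sentence.
ROLE_BY_ENDING = [
    ("ti", "verbal action"),  # third person singular present verb
    ("aḥ", "agent"),          # nominative singular, masculine a-stem
    ("am", "object"),         # accusative singular, masculine a-stem
]


def assign_roles(sentence):
    """Map each word of the sentence to a category using its ending alone."""
    roles = {}
    for word in sentence.split():
        for ending, role in ROLE_BY_ENDING:
            if word.endswith(ending):
                roles[role] = word
                break
        else:
            raise ValueError(f"no rule matches {word!r}")
    return roles


words = ["rāmaḥ", "grāmam", "gacchati"]
expected = assign_roles(" ".join(words))
# Every ordering of the three words yields the same role assignment.
print(all(assign_roles(" ".join(p)) == expected for p in permutations(words)))
```

Because the roles are carried by the endings rather than by position, reordering the words does not change the result in this sketch.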
Conclusion
Be it knowledge representation or speech synthesis, natural language processing or machine translation, intelligent tutoring systems or unambiguous semantic extraction, study of complex mathematical problems or linguistics, in virtually any field, one can think of utilizing the richness, strength, accuracy, efficiency, structure, flexibility and the extant works available in the Sanskrit language.
Though a variety of literature is available, we need to study it dispassionately and take what is of value.