
Website: http://tdil.mit.gov.in

Patron
Rajeeva Ratna Shah, Secretary
Department of Information Technology
Ministry of Communications & Information Technology
(Government of India)
6, CGO Complex, New [email protected]

Contents Jan. 2001, माघ

1. Message of Hon’ble Minister Sh. Pramod Mahajan 1-L

2. TDIL Programme 1-R

3. Overcoming the Language Barrier 2-L

4. Achievements 2-L

5. Resource Centres for Language Technology Solutions 3-R

6. Potential Products & Services 4-L

7. TDIL Website 4-L

8. Implementation Strategy 4-L

9. International Programmes in Multilingual Computing 4-R

10. MNC Products Supporting Indian Languages 4-R

11. Tentative List of Indian Language Products 5-L

12. Portals Supporting Indian Languages 7-L

13. Other Efforts 7-L

14. Major Events of Year 2000 7-R

15. Indian Language Technology Vision 2010 9-L

Contents May 2001, ज्येष्ठ

1. Calendar of Events 1-L
2. TDIL Website 2-L
3. TDIL Meet 2001 4-L
4. UNESCO Expert Group on Multilingualism in Cyberspace 9-L
5. Lexical Resources for Natural Language Processing 10-L
6. Symposium on Translation Support System 10-L
7. Universal Networking Language 10-R
8. Indo-UK workshop on Language Engg. for South Asian Languages 11-R
9. New Software Testing Facility 12-R
10. Now Domain Name in Regional Languages 12-R
11. Book Shelf 13-L
12. Resource Center(s) for Indian Language Technology Solutions 1st Year (2000 - 2001) Progress 13-L
13. Impediments in IT Localization & Penetration 15-L
14. Feedback on UNICODE Standard 3.0 15-R
15. Indian Language IT Market 21-L
16. MS Office XP with Indian Language Support 21-L
17. Call for Technologies 21-R

Contents Sept. 2001, आश्विन

Special Issue on Language Technology Business Meet
Patron's Message 1-1
Programme Schedule 1-1
I Machine Aided Translation (MAT) 2-3
II Operating System (OS) 3-3
III Human Machine Interface System (HUMIS) 4-16
IV Tools 16-29
V e-Content 30-34
VI Other Milestones 37-38
Quick Reference Guide 39-39

Contents January 2002, माघ

1. Calendar of Events-Year 2002 1L-1R

2. Reports
2.1 TDIL Vision 2010 2L-3R
2.2 LT-Business Meet & TOT 2001 4L-16L
2.3 Intellectual Property Rights (IPR) 16R-16R
2.4 UNESCO-Symposium on Language in Cyber Space 17L-19R
2.5 SCALLA 2001-Sharing Capability in Localisation & Human Language Technologies 20L-20R
2.6 UNESCO - Workshop on Medium Terms Strategy for Communications & Information 21L-22L
2.7 The Asia Pacific Development Information Programme (APDIP) 22R-23L
2.8 Workshop on Corpus-based Natural Language Processing 23R-23R
2.9 1st Workshop on Indian Language OCR 24L-24R
2.10 1st International Conference on Global WordNet 25L-25R

3. Standardization
3.1 Revision of Unicode Standard-3.0 for Devanagari Script 26L-37R
3.2 Design Guides (Sanskrit, Hindi, Marathi, Konkani, Sindhi, Nepali) 38L-75R
3.3 Indian Standard Font Code (INSFOC) 76L-77R
3.4 Indian Standard Lexware Format 78L-86R

4.
4.1 Reader’s Feedback 87L-87R
4.2 Frequently Asked Questions 88L-91R

Overcoming Language Barrier…

[email protected]

“Unity in Diversity” is the cultural characteristic of India. This is facilitated by Information & Communication Technologies (ICT), through evolving and adhering to Standards for information interchange across Indian languages. Participation of industry is essential in rapid dissemination of technology for the people at large.

This issue deals with standardisation issues of internal encoding of Devanagari script, which is used for Hindi, Marathi, Sanskrit, Konkani as well as Sindhi. Revisions to UNICODE are presented, following feedback from the language experts and the language directorates of the State Governments on the draft published in VishwaBharat@tdil, No.2, May 2001. Language design guides will be useful in designing language technology products. Industry consensus on standardisation of fonts emerged for the first time this year. This will prove a revolutionary milestone in ICT for the masses.

This issue also covers the outcome of the Language Technology Business Meet, held for the first time with prospective IT industries on November 7-8, 2001, at which 40 technology handshakes were signed.

The joint efforts of Government, Academia and Industry (GAI) will lay the foundation for an emerging knowledge-based society. Consolidation of decade-long efforts in the field of language technology raises optimism that a collaborative GAI venture will accelerate the benefits of ICT for the masses and improve knowledge generation capacities.

Editorial Team

Om Vikas [email protected]. Chaturvedi [email protected]. Sharma [email protected] Jain [email protected]


1. Calendar of Events-Year 2002

■ 1st International Wordnet Conference, Central Institute of Indian Languages, Mysore, India, Jan 21-25, 2002. Web: http://www.ciil.org

● 20th International Unicode Conference, Washington DC, Jan 28 – Jan 31, 2002. Web: http://www.unicode.org/iuc/iuc20/

■ Workshop on Indian Language OCR, University of Hyderabad, 1-3 February 2002. Web: http://www.iit.net

● Unicode Technical Committee Meeting #90, hosted by Microsoft, Mountain View, CA, Feb 11-14, 2002. Web: http://www.unicode.org

■ A Workshop on Computational Linguistics, Anna University, Chennai, February 15-16, 2002. Web: http://annauni.edu/rctamil

■ A short-term course on Windows, MS Office, coding and keyboard overlays for Indian scripts, and familiarisation with Indian language processing tools, ER&DCI Thiruvananthapuram, Feb. 15-Mar. 27, April 15-May 25, June 17-July 29, Aug. 19-Sept. 27, Oct. 21-Nov. 29. Web: http://www.malyalamresourcentre.org

● Third International Conference on Intelligent Text Processing and Computational Linguistics, Mexico City, Mexico, February 17 to 23, 2002. Web: http://www.cicling.org/2002

■ School on Recent Trends in Intelligent Techniques, Computer Centre, Bijni Complex, North-Eastern Hill University, Shillong, March 11-15, 2002. Web: www.isical.ac.in/~cvpr/events

■ A Training Course on Indian Language Technology, IIT Kanpur, March 13-14, 2002, with a half-day presentation on Indian language coding issues on 13 March and a tutorial on lexical database design on 15 March 2002. Email: [email protected]

■ Symposium on Translation Support Systems (STRANS 2002), IIT Kanpur, 15-17 March, 2002. Web: http://www.cse.iitk.ac.in

■ Workshop on Technology Development for North Eastern Languages (2-3 days), IIT Guwahati, 2nd week of March 2002. E-mail: [email protected]

● The 9th Conference on Theoretical and Methodological Issues in Machine Translation, Keihanna, Japan, March 13 – 17, 2002. Web: http://www.kecl.ntt.co.jp/events/tmi/

■ Seminar on Language Technologies with specific focus on Telugu, Osmania University, Hyderabad, 18-19 March 2002 (tentative). Web: http://www.iiit.net

● Human Language Technology Conference, San Diego, California, March 24-27, 2002. Web: http://hlt2002.org

● Language Technology for Business Information Systems (BIS 2002), Poznan, Poland, April 24-25, 2002. Web: http://bis.kie.ae.poznan.pl

● Unicode Technical Committee Meeting #91, hosted by PeopleSoft, Pleasanton, CA, May 7 – May 10, 2002. Web: http://www.unicode.org

● Fifth Symposium on Natural Language Processing 2002, Oriental COCOSDA Workshop 2002, 9-11 May 2002. Web: http://www.siit.net/snlp-o-cocosda2002/

● 21st International Unicode Conference, Dublin, Ireland, May 14 – May 17, 2002. Web: http://www.unicode.org

■ A Training Programme on Multimedia Techniques for Content Development, Anna University, Chennai, May 20-30, 2002. Web: http://annauni.edu/rctamil

● Machine Translation Evaluation: Human Evaluators Meet Automated Metrics, 27 May 2002, and a hands-on evaluation workshop at LREC 2002 (29 May-2 June 2002), Las Palmas, Canary Islands – Spain. Web: http://www.elda.fr

■ National Wordnet Workshop, IIT Mumbai, June 2002. Web: http://www.cse.iitb.ernet.in/~pb

● The Second International Natural Language Generation Conference (INLG 2002), Arden Conference Center, Ramapo Mountains, New York City, USA, July 1 to 3, 2002. Web: http://www.research.att.com/~rambow/inlg.html

● Student Research Workshop, July 6-11, 2002; Workshop on Word Sense Disambiguation: Recent Successes and Future Directions, July 11, 2002; Workshop on Effective Tools and Methodologies for Teaching NLP and CL, July 7, 2002; Workshop on Speech-to-Speech Translation: Algorithms & Systems, July 11, 2002; Workshop on Unsupervised Lexical Acquisition, July 12, 2002; all at ACL-02, Philadelphia, Pennsylvania, USA. Web: http://www.acl02.org

● 40th Annual Meeting of the Association for Computational Linguistics, Philadelphia, PA, USA, 7 – 12 July, 2002. Web: http://www.acl02.org

● Summer Workshop on Language Engineering, to be held in Baltimore, MD, USA, from July 8 to August 16, 2002. Email: [email protected]

● Natural Language Processing in the Biomedical Domain, University of Pennsylvania, Philadelphia, PA, 11-12 July, 2002.

● Lexicom@ITRI Workshop, Information Technology Research Institute, University of Brighton, England, 14-19 July 2002. Web: http://www.itri.bton.ac.uk/lexicom/

● 19th International Conference on Computational Linguistics, Howard International House, Taipei, Taiwan, August 24 – September 1, 2002. Web: http://www.coling2002.sinica.edu.tw/

■ Marathi Word Processing for Govt. Officials, IIT Mumbai, June 2002. Web: http://www.cse.iitb.ernet.in/~pb

● Association for Machine Translation in the Americas-2002, Tiburon, California, October 8-12, 2002. Web: www.amtaweb.org

■ UNL Symposium along with a Machine Translation Workshop, IIT Mumbai, June 2002. Web: http://www.cse.iitb.ernet.in/~pb

■ ICON (Indian Conference on NLP)-2002, NCST Mumbai, Dec. 2002. Web: http://www.ncst.org

■ A Workshop on NLP and Applications, Anna University, Chennai, Dec. 10-21, 2002. Web: http://annauni.edu

Note: ■ indicates conferences in India.


2. Reports


2.1 TDIL Vision 2010

Vision statement
Digital unite and knowledge for all.

Mission statement
Communicating & moving up the knowledge chain overcoming language barrier.

Objectives
◆ To develop information processing tools to facilitate human machine interaction in Indian languages and to create and access multilingual knowledge resources/content.
◆ To promote the use of information processing tools for language studies and research.
◆ To consolidate technologies thus developed for Indian languages and integrate these to develop innovative user products and services.

Major Initiatives
◆ Knowledge Resources (Parallel Corpora, Multilingual Libraries/Dictionaries)
◆ Knowledge Tools (Portals, Language Processing Tools, Translation Memory Tools)
◆ Translation Support Systems (Machine Translation, Multilingual Information Access, Cross Language Information Retrieval)
◆ Human Machine Interface System (Optical Character Recognition Systems, Voice Recognition Systems, Text-to-Speech System)
◆ Localization (Adapting IT Tools and solutions in Indian Languages)
◆ Language Technology Human Resource Development (Manpower Development in Natural Language Processing)
◆ Standardization (ISCII, Unicode, XML, TMX, ISFOC etc.)

TDIL Programme Goals

Short Term Goals
◆ Standardization of code, font, keyboard etc.
◆ Fonts and basic software utilities in public domain.
◆ Corpora creation and analysis.
◆ Smart content creation.
◆ Language technology to be integrated into IT curricula.
◆ Collaborative development of Indian language lexical resources.
◆ Writing aids (spell checks, grammar checks and text summarization utilities).
◆ Sharing of standardized lexware & development of lexware tools.
◆ Training programs on ILT awareness, lexware development, and computational linguistics.

Medium Term Goals
◆ Indian language speech database.
◆ Multilingual, multimedia content development with semantic indexing, classical, multi-font and decorative fonts, offline/online OCR.
◆ Cross lingual information retrieval (CLIR) tools.
◆ Human speech encoding.
◆ Speech Engine: speech recognition, specific speech I/O.
◆ Indian language support on Internet appliances.
◆ Understanding and acquisition of languages, knowledge representation, gisting and interfacing.
◆ Distinguished achievement awards at M.Tech/MCA/Ph.D. level in Indian Language Technologies.
◆ Machine aided translation: English to Indian languages, among Indian languages, Indian languages to English and other foreign languages.
◆ On line rapid translation, gisting and summarization.

Long Term Goals
◆ Speech to speech translation.
◆ Human Inspiring Systems.


Resource Centres for Language Technology Solutions

The MCIT has established thirteen Resource Centres for Indian Language Technology Solutions covering all the constitutional languages.

The core objectives of these Resource Centres are:

◆ To act as a repository of all knowledge tools and products concerned with computer processing of Indian Languages and bring out yearly resource documents.

◆ To develop the methodologies and tools for seamless integration of language processing tools with existing and evolving software development environments.

◆ To network with centres concerned with computer processing of Indian Languages and potential user agencies.

◆ To create content and databases on the resource information available in Indian languages and to put some respected books (related to Indian Heritage) in Indian languages on the web. Also to work with local newspapers and to make them available on-line.

◆ To create awareness and organize training programmes for agencies and personnel concerned with the deployment of Indian language processing systems.

◆ To facilitate language technology research in Machine Aided Translation, Optical Character Recognition, Text-to-Speech and Speech Recognition for Hindi.

◆ To organize IT localization clinics for small business to provide consultancy on use of Indian language tools in developing IT solutions and to take up development of requisite niche technologies.

Organizations and associated Languages

◆ Indian Institute of Technology, Kanpur. (Hindi, Nepali)
Prof. R.M.K. Sinha
E-mail: [email protected]

◆ Indian Institute of Technology, Mumbai. (Marathi, Konkani)
Prof. Pushpak Bhattacharya
E-mail: [email protected]

◆ Indian Institute of Technology, Guwahati. (Assamese, Manipuri)
Prof. Gautam Barua
E-mail: [email protected]

◆ Indian Institute of Science, Bangalore. (Kannada, Sanskrit Cognitive Models)
Prof. N.J. Rao
E-mail: [email protected]

◆ Indian Statistical Institute, Kolkata. (Bengali)
Prof. B.B. Chaudhary
E-mail: [email protected]

◆ Jawaharlal Nehru University, New Delhi. (Foreign Languages Japanese, Chinese & Sanskrit Language Learning Systems)
Prof. G.V. Singh
E-mail: [email protected]

◆ University of Hyderabad, Hyderabad. (Telugu)
Prof. K. Narayan Murthy
E-mail: [email protected]

◆ Anna University, Chennai. (Tamil)
Dr. T.V. Geetha
E-mail: [email protected]

◆ MS University, Baroda. (Gujarati)
Shri Sitanshu Y. Mehta
E-mail: [email protected]

◆ Utkal University and Orissa Computer Application Centre (OCAC). (Oriya)
Prof. Ms. Sanghmitra Mohanty
E-mail: [email protected]
A.K. Pujari
E-mail: [email protected]

◆ Thapar Institute of Engg. & Tech., Patiala. (Punjabi)
Prof. G.S. Lehal
E-mail: [email protected]

◆ ER&DC, Trivandrum. (Malayalam)
Prof. Ravinder Kumar
E-mail: [email protected]

◆ C-DAC, Pune. (Urdu, Sindhi, Kashmiri)
Sh. M.D. Kulkarni
E-mail: [email protected]


2.2 Proceedings of the Language Technology Business Meet (LTBM) and Technology Handshakes, on Nov. 7-8, 2001

at Department of Information Technology, Ministry of Communications & Information Technology, Electronics Niketan, 6 C.G.O Complex, New Delhi

The participants consisted of 80 members from the Industry, 39 from Academia and 33 from Government.

I. Inaugural Session

Shri Rajeeva Ratna Shah, Secretary, Ministry of IT, declared the LTBM open on Nov. 7, 2001 by clicking the mouse displaying the message of bringing together the Government, the Industry and the Academia for “Digital Unite & Knowledge for all”. Dr. Om Vikas presented the genesis of LTBM and mentioned the chronological development of Language Technology in India, which could be categorized into three phases:

● 1976-1990: A-Technology Phase
Focus was on Adaptation of technologies; abstraction of requisite technological designs and competence building in R & D institutions.

● 1991-2000: B-Technology Phase
Focus was on developing Basic technologies: generic information processing tools, interface technologies and cross-compatibility conversion utilities. TDIL (Technology Development for Indian Languages) programme was initiated during this phase.

● 2001-2010: C-Technology Phase
Focus is on developing Creative Technologies in the context of convergence of computing, communication and content technologies. Collaborative technology development is being encouraged to realize.

Shri Rajeeva Ratna Shah released the 3rd issue of VishwaBharat@TDIL, which details over 50 technologies for possible transfer/collaborative development.

Release of Newsletter VishwaBharat@tdil

During the inaugural address, Sh. Rajeeva Ratna Shah, Secretary, made the following remarks:

● The formation of the Indian Language Technology Industry Consortium is a welcome sign. It will help in commercialization of the technologies developed under TDIL programme. The consortium could also provide a platform to facilitate a dialogue between technology providers and technology takers in the industry.

● Indian language technology development should soon emerge to be on par with similar technology available for English. All literary content in different Indian languages may be made available in electronic form. The works of Thiruvalluvar and Tulsidas should be accessible to all without any language barrier with the help of Machine Aided Translation (MAT) tools. Indian multi-lingual culture is much superior, unlike American multi-lingual culture, which has merged into Americanism. We retain our basic culture, which has two portions: literature and communication. IT will facilitate the communication part, and in this context text-to-speech and speech-to-text are important.

● MAT (Machine Aided Translation), especially with multimedia, has scope for translating Indian literature to make it available in other Indian languages. This will accelerate the process of communication. Britannica has a site which contains originals of the “Great Books of the World” on 50 subjects, and they claim to have covered the entire knowledge of the world with cross references. But we note that it does not contain works of any of the Indian thinkers or other thinkers like Confucius, and hence it has missed the precious knowledge and literature of Oriental nations. Our vision for TDIL programme should include creation of a series of “Great Books of India” in Hindi with conversion facility from one Indian language to another as well as English. All this great work should be available on the internet.

Opening of Exhibition - संयोजन: Combining Academia, Industry & Government efforts.

● Sh. R.R. Shah, Secretary MIT, opened the exhibition by joining the Red, Green and Blue ribbons representing Government, Industry and Academia respectively, and symbolizing a joint endeavor. Theme of the exhibition was Digital Unite and Knowledge for all. Secretary interacted closely with the technology exhibitors. He appreciated the efforts of 21 organizations who demonstrated over 50 different language technologies.

II. Technology Presentations

Date: Nov. 7, 2001
Session II: Machine Aided Translation (MAT)
It was chaired by Dr. K.P.A Menon, Chancellor, LBSS Vidyapeeth (formerly Secretary of Ministry of Defence), who is considered to be the fastest translator of Sanskrit text in the world. He moderated the presentations and discussions for about three technologies presented during the session on Machine Aided Translation systems. These presentations include those of ER&DCI-Noida, IIT-Kanpur, NCST-Mumbai and C-DAC-Pune.

Date: Nov. 7, 2001
Session III: Optical Character Recognition (OCR)
It was chaired by Prof. K.K. Aggarwal, Vice-Chancellor, IP University. In this session five institutions discussed their OCR technologies for Hindi, Bangla, Punjabi, Oriya and Telugu. The Industry raised the question about the TOT fees and time for completion to reach a prototype.

Date: Nov. 7, 2001
Session IV: Text to Speech
It was chaired by Shri R. Ravindra Gupta, Secretary, Ministry of Heavy Industry. In this session, five institutions presented their technologies, specially for Hindi, Bangla and Oriya. Voice recognition software was also demonstrated.

Date: Nov. 8, 2001
Session V: Linux & Tools
It was chaired by Shri M. Shankar, Secretary, Dept. of Official Language. In this session Linux for Indian Languages was presented by NCST & IIT Kanpur. Four institutions presented various language technology tools for Bangla, Tamil, Telugu and Devanagari.

Date: Nov. 8, 2001
Session VI: TOT Dialogue & Resolution for Collaborative Development: Technology Handshakes
It was chaired by Prof. P.V.S. Rao of Tata Infotech, who is a pioneer in the area of Computer Systems & speech technology in the country. He reiterated that Government, academia and industry have joined together to make concerted efforts towards taking the language technology to the masses at the initiative of Ministry of Communications & Information Technology. The success of this meet will address much more complicated technology development process to bring Indian languages at par with English in terms of the availability and affordability of IT to the common man in India. Intensive technology presentations, dialogue and technology transfer negotiations over these two days have facilitated tie-ups of mutual interest for technology transfer, identification of gap technologies and collaborative development in the areas of Machine Aided Translation (MAT), OS, HUMIS, Tools and e-Content. The impact of these technologies will be far reaching:

● The translation tools will help translators reduce their translation effort, in terms of man-hours, by orders of magnitude. Machine aided translation for an alien language pair, i.e. from English to Hindi, has today become a reality for simple sentences.

● The Hindi enabled Linux will be in public domain for proliferation of its usage.

● (i) Optical Character Recognition (OCR) of Devanagari is going to revolutionize the content creation effort in Hindi by reducing it from 2000 hrs. of effort to 2 hrs. of effort. This will help in making archival data and heritage data easily available on the Internet.

(ii) The Devanagari Text-to-Speech systems will help many visually impaired children in education.

● For various operating system platforms, language tools ranging from office suite, Authoring Systems, Search Engines, Multi-lingual communication Tools, and Natural Language Processing Tools will become available.

● Multilingual knowledge resources are also available in public domain, facilitating their use by educationists, researchers, publishers, etc.

About 40 Technology Handshake agreements were signed for Transfer of Technology (TOT) or collaborative technology development. MIT will take necessary action to expedite the negotiations with regard to these technology handshakes for TOT, aiming at early commercialization. On technology transfer, certain issues were raised by the Industry, which were clarified during the discussion. These issues mainly included the following points:

● TOT fee should be reasonable in view of the promotional role of MIT, and MIT should mediate in finalising the details of Technology Transfer with time targets.

● The methodology of supporting the technology development for future upgrades with the concerned R&D organizations should be worked out.

● The negotiations with regard to the Technology Handshakes will be finalized with the help of Indian Language Industry Consortia.

MAIT Language Technology Consortium

Background

The MAIT – IIIT Bangalore Study (December 1999) on potential for local language applications and software revealed that there existed a significant demand for such applications (to the tune of Rs. 500-600 crores). However, the market has been severely constrained due to lack of standards for characters, fonts and keyboard layout.

Technology Handshake

With increased focus on e-Governance by the Centre and the various State Governments, it has become critical to develop standards for Indian languages, as without standards it would not be possible to share the various databases. For e-Governance to succeed, it is necessary to promote and develop the market for local language applications, solutions and products.

With this background, an increasing need was felt by the IT Industry in India to form a Consortium that would focus attention on development of Local Language Technology standards and other issues of market development.

The MAIT Consortium on Local Language Technology, since its inception in September 2001, has been actively co-ordinating various activities with the Industry and the TDIL Programme of the Department of Information Technology. Some of the highlights of the activities of the Consortium are stated below:

(i) Recommendations for Modification of UNICODE

A 2-day meeting of the Consortium was held under the aegis of Ministry of Communications & Information Technology on 10th and 11th of September, 2001. UNICODE is a 16 bit standard and is by default emerging as the global standard for local language computing. However, it needs to be modified in the context of Indian language computing. The Consortium recommended modifications of font layouts of the following languages:

Devanagari (Hindi, Konkani, Marathi, Nepali, Sanskrit, Sindhi), Gurmukhi (Punjabi), Gujarati, Bengali (Bengali, Assamese, Manipuri), Oriya, Malayalam, Kannada, Telugu, Tamil, Arabic (Urdu, Sindhi, Kashmiri).

The Ministry of Communications & Information Technology is a voting member of the UNICODE Consortium and will take up these recommendations with the UNICODE Consortium.
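As a small illustration of what the 16 bit encoding means for Devanagari text, the following Python sketch lists the code points of one Hindi word and checks that each falls in the Devanagari block (U+0900 to U+097F). It is only a sketch for orientation: the helper names `is_devanagari` and `describe` are hypothetical, the sample word is chosen for illustration, and nothing here reflects the Consortium's actual recommendations.

```python
import unicodedata

# Devanagari block of the Unicode Standard: U+0900 to U+097F.
DEVANAGARI_START, DEVANAGARI_END = 0x0900, 0x097F


def is_devanagari(ch: str) -> bool:
    """Return True if the single character lies in the Devanagari block."""
    return DEVANAGARI_START <= ord(ch) <= DEVANAGARI_END


def describe(text: str) -> list:
    """Return (code point, official character name) pairs for each character."""
    return [(f"U+{ord(ch):04X}", unicodedata.name(ch, "<unnamed>")) for ch in text]


# The word "Hindi" in Devanagari, written with escapes so the
# code points are explicit: HA, vowel sign I, NA, virama, DA, vowel sign II.
word = "\u0939\u093f\u0928\u094d\u0926\u0940"

for code_point, name in describe(word):
    print(code_point, name)

# Each character is one 16 bit code unit, so six characters take 12 bytes.
print(len(word.encode("utf-16-le")), "bytes in UTF-16")
```

Note that the conjunct in the word is stored as a consonant, a virama and a second consonant; the joined shape seen on screen is produced by the font and rendering software, which is why code and font standards are treated as separate issues.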

(ii) Font Standards for Indian Languages

The Consortium has been able to successfully develop font standards for most of the Indian languages, an issue that has been pending for the past 10 years. Bi-lingual and mono-lingual font layouts have been developed for Devanagari, Gujarati, Malayalam and Punjabi. The font layouts for these languages have been developed by taking into account the support required for Linux. Font layouts for Bengali, Assamese and Oriya are under development, while layout standards for Tamil and Kannada have already been developed by the State Governments of Tamil Nadu and Karnataka respectively.

The Consortium, along with the Ministry of Communications & Information Technology, will take up these font standards with the respective State Governments for evolving a national consensus and early adoption of the same.

(iii) Transfer of Technology

At the Language Technology Business Meet 2001, organized by the Ministry of Information Technology on 7-8 November, 2001, 40 MoUs were signed by various companies for transfer of technology. The MAIT Local Language Technology Consortium was given the responsibility to facilitate and interface between the organizations for delivery of the technologies. Follow-up with the companies revealed that most of these technologies are in an advanced stage of development and would be ready for transfer soon.

Contact:
Mr. Vinnie Mehta
PHD House, 4th Floor
Opp. Asian Games Village
New Delhi – 110 016.
Ph: 011-6855487/6854284/6866976
Fax: 011-6851321
E-mail: [email protected]
Website: http://www.mait.com


1. Technology Developed
Optical Character Recognition System for Devanagari

➢ Technology Developer
Indian Institute of Technology
Dept. of Industrial and Management Engineering
Kanpur - 208016
Tel: 0512-597743 Fax: 0512-597553
Dr. Veena [email protected]

➢ Technology Description
The system works with the help of a scanner. The user puts a paper document printed in Devanagari (Hindi) script under the scanner, runs the OCR software and gets all the text from that document available inside the computer just as if it was typed in. The data is stored in ISCII code. The system is developed using the C programming language. The technology can be used on the LINUX platform. It can be easily ported to the Windows platform. The OCR software can be integrated with a Hindi Speech Synthesis System to make a Text to Speech system in Hindi. It can be used as a front end for a Machine Aided Translation System. Potential beneficiaries are newspaper houses (printing in Devanagari script), libraries, offices looking for office automation, the linguistic community (for creating corpora), blind people, etc.

➢ Technology Recipients
● Shri Ashish Pandey,
Maitrey Infonet Private Limited
34/1, Lower Ground Floor, Yusuf Sarai
New Delhi 110016
Tel: 6535915, [email protected] : Optical Character Recognition System for Devanagari

● Shri Manoj R. Annadurai
Chennai Kabigal,
2, Reddy Colony, Ramalingapuram
Chennai 600012
Tel: 6449139/[email protected] : Optical Character Recognition System for Devanagari

Transfer of Technology Handshakes

2. Technology Developed
OCR system for Devanagari

➢ Technology Developer
Indian Statistical Institute
Computer Vision and Pattern Recognition Unit
203, Barrackpore Trunk Road, Kolkata 700035
Tel: 033-5778085, 5777694
Fax: 5776680, 5773035
Prof. B.B. Chaudhary, [email protected]
& Technology Promoter
Centre for Development of Advanced Computing
Pune University Campus, Ganesh Khind Road
Pune-411007
Tel: 020-5694060(D) Fax: 5694059
Shri Aditya Gokhale

➢ Technology Description
The system works with the help of a scanner. The user puts a paper document printed in Devanagari (Hindi) script under the scanner, runs the OCR software and gets all the text from that document available inside the computer just as if it was typed in. The data is stored in ISCII code. The system is developed using the C programming language. The technology can be used on the LINUX platform. It can be easily ported to the Windows platform. The OCR software can be integrated with a Hindi Speech Synthesis System to make a Text to Speech system in Hindi. It can be used as a front end for a Machine Aided Translation System. Potential beneficiaries are newspaper houses (printing in Devanagari script), libraries, offices looking for office automation, the linguistic community (for creating corpora), blind people, etc.

➢ Technology Recipients
● Shri N. Srikumar
Pyramid Cyberway Pvt Ltd.
909, Ansal Bhawan
16, Kasturba Gandhi Marg
New Delhi 110001
Tel: 3314156, 3351804, 3355552
Fax: [email protected] : OCR system for Devanagari


3. Technology Developed
MAT (English to Hindi): Machine Aided Translation System

➢ Technology Developer
Indian Institute of Technology
Department of Computer Science & Engineering
Kanpur-208 016
Tel: 0512-598254(R), 597174(O)
Fax: 0512-597553
Dr. Ajay [email protected]

➢ Technology Description
It is a Machine Aided Translation System based on the AnglaBharti approach of IIT Kanpur. AnglaBharti is a rule based system which can currently handle simple sentence translation from English to Hindi. A lexicon of 20000 words relating to the Health and IT domains has been integrated into this system. Besides this, it can also handle general purpose translation. User friendly pre-processor and post-editing modules are being integrated. The system has been developed using ISCII code. It is being expanded to handle complex sentences. Efficiency of the system can be drastically improved by integrating customised domain specific dictionaries. It can be integrated into many commercial products. The Installation Guide and User’s Guide are ready. IIT Kanpur is applying for a patent. Feedback from beta site testing at the Central Translation Bureau (CTB), New Delhi, is being continuously used to improve the system. Potential beneficiaries are translators, linguists, Govt. offices, CTB and other translation units.

➢ Technology Recipients
● Shri N. Srikumar
Pyramid Cyberway Pvt Ltd.
909, Ansal Bhawan
16, Kasturba Gandhi Marg, New Delhi 110001
Tel: 3314156, 3351804, 3355552
Fax: [email protected] : MAT (English to Hindi)

● Shri Kaushal Pandey
Crystal Hues Ltd
546-A, Chiragh Delhi, New Delhi 110017
Tel: 6226255, 6401506, 6425710
Fax: [email protected] : MAT (English to Hindi)

4. Technology Developed
MAT (English to Hindi): Machine Aided Translation System

ä Technology DeveloperCentre for Development of Advanced ComputingPune University Campus, Ganesh Khind RoadPune-411007Tel: 020-5694060(D)Fax: 5694059Shri Mahendra Kumar [email protected]

ä Technology Description
MANTRA is a machine-assisted translation tool. It translates English text into Hindi in the specified domain of Personnel Administration, specifically Gazette Notifications, Office Orders, Office Memorandums and Circulars. The strategy adopted in MANTRA is not word to word, not rule to rule, but lexical tree to lexical tree. MANTRA uses the Lexicalized Tree Adjoining Grammar (LTAG) formalism to represent the English as well as the Hindi grammar. The storage code is ISCII. The MANTRA technology is being expanded to translate English texts into other Indian languages such as Gujarati, Bengali and Telugu. Potential beneficiaries are translators, linguists, Govt. offices, the Central Translation Bureau and other translation units.

ä Technology Recipientsl Shri N. Anbarasan

Chief Executive OfficerApplesoft (Software developer for IndianLanguages)No. 39, 1st Cross,1st Main Shivanagar,W CRoad,Banglore – 560 010Phone 080-3386167 Fax [email protected] : MAT (English to Hindi)

l Shri N. SrikumarPyramid Cyberway Pvt Ltd.909, Ansal Bhawan16, Kasturba Gandhi MargNew Delhi 110001Tel: 3314156, 3351804, 3355552Fax: [email protected] : MAT (English to Hindi)


5. Technology Developed
MaTra: Human Aided Machine Translation Tool from English to Hindi

ä Technology DeveloperNational Centre for Software TechnologyGulmohar Cross Road No. 9, Juhu,Bombay – 400 049Tel. 022-7579935/7812120(D)Fax: 022-6232195/6210139Shri Durgesh [email protected]

ä Technology Description
A software system for translating English to Hindi. There are two versions of MaTra, differing in the amount of interaction they expect from the user: MaTra Lite, a fully automatic on-line translator with a simple web-based interface, and MaTra Pro, a professional translator's tool with Auto, Semi-Auto and Manual modes, a GUI and a customizable lexicon. At present it supports NCST's format and Unicode; it can be made to support ISCII as well. The system GUI is designed in Java for portability. In addition to the internal system lexicon, there is a user-defined lexicon, which the user can enhance and modify. Potential beneficiaries include media news agencies, translation bureaus, and educational institutions involved in distance and on-line education.
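The relationship between a built-in system lexicon and a user-defined lexicon can be pictured with a small sketch. This is only an illustration in Python, not MaTra's actual design: the class name `LayeredLexicon`, the sample entries, and the rule that user entries take priority over system entries are all assumptions made for the example.

```python
from typing import Dict, Optional

# Hypothetical toy entries; the real MaTra lexicon format is not described here.
SYSTEM_LEXICON: Dict[str, str] = {
    "computer": "कंप्यूटर",
    "water": "पानी",
}


class LayeredLexicon:
    """A read-only system lexicon with an editable user lexicon on top."""

    def __init__(self, system: Dict[str, str]) -> None:
        self._system = dict(system)      # copy, so the system entries stay fixed
        self._user: Dict[str, str] = {}  # entries the user adds or overrides

    def add_user_entry(self, english: str, hindi: str) -> None:
        """Add or modify an entry in the user-defined lexicon."""
        self._user[english.lower()] = hindi

    def lookup(self, english: str) -> Optional[str]:
        """Look a word up, preferring the user lexicon (an assumed policy)."""
        key = english.lower()
        if key in self._user:
            return self._user[key]
        return self._system.get(key)


lexicon = LayeredLexicon(SYSTEM_LEXICON)
before = lexicon.lookup("water")      # served by the system lexicon
lexicon.add_user_entry("water", "जल")  # user prefers a different rendering
after = lexicon.lookup("water")       # now served by the user lexicon
```

Keeping the two layers separate means a user's changes never alter the shipped lexicon, and removing the user layer restores the default behaviour.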

ä Technology Recipientsl Shri N. Anbarasan

Chief Executive OfficerApplesoft (Software Developer for IndianLanguages)No. 39, 1st Cross,1st Main Shivanagar,W CRoad,Banglore – 560 010Phone 080-3386167 Fax [email protected] : MAT (English-Hindi) MaTra:Human Aided Machine Translation Tool fromEnglish-Hindi

6. Technology Developed
Anusaaraka – Text to Text converter from one Indian Language to another & Telugu Spell Checker

ä Technology DeveloperUniversity of HyderabadDepartment of Computer and Information SciencesUniversity of Hyderabad, P.O. Central UniversityHyderabad-500 046Tel: 040-3010500(O),3010846(R)Fax: 040-3010145/[email protected]

ä Technology Description
Anusaaraka is computer software which renders text from one Indian language into another. It produces output which is comprehensible to the reader, although at times it might not be grammatical. For example, a Telugu to Hindi anusaaraka can take a Telugu text and produce output in Hindi which can be understood by a Hindi reader, but which is not fully grammatical. The reader will therefore require some amount of training to read the output. Anusaarakas have been built from Telugu, Kannada, Bengali, Marathi and Punjabi to Hindi. Beta versions of all of these have been released for use over the internet as e-mail servers. The storage code is ISCII. The source code is open under the GPL, so users can easily adapt it for their use. Anusaaraka can be used in various scenarios. For example, a reader browsing web sites containing Indian language texts comes across a site of interest and wants to read material on it, but does not know the language. He can run anusaaraka and read the text. Normally the reader's motivation is high and he is willing to put in some effort.

ä Technology Recipientsl Shri Manoj R. Annadurai

Chennai Kabigal, 2, Reddy Colony, Ramalingapuram, Chennai 600012
Tel: 6449139/[email protected]
Technology : Anusaaraka, Spell Checker Components (Telugu & Kannada)

l Dr. M.N.CooperThe Managing DirectorModular Infotech Pvt. Ltd.26A, Electronic Cooperative Estate,Pune - Satara Road Pune 411 009Tel. 020- 4223342/4226614/[email protected] : Telugu Spell Checker


8. Technology Developed
Speech Processing Technology

ä Technology DeveloperTata Infotech Ltd.,Technology Cognitive Systems Research Lab,Sanpada,Navi Mumbai - 400705.Tel. 022-7903251/56, 022-7682379 Ext-127Fax: [email protected]

ä Technology Description
An isolated word recognition system specially tuned for Indian speakers. It is easily scalable to a larger number of words. It is useful for fixed-vocabulary speech recognition applications such as speech-enabled Interactive Voice Response (IVR) systems, telebanking, tourism information kiosks, airline reservation, voice portals, voice command recognition for automobile control, etc.

ä Technology Recipientsl Prof. (Ms) Sanghmitra Mohanty

Department of computer science & applicationVani Vihar,Utkal UniversityBhubaneshwar – 751 004Tel. 0674-580216 (O), 540865(R)Fax: [email protected] : Speech Processing Technology

7. Technology Developed
Text to speech processing system for Oriya
Oriya Spell Checker

ä Technology DeveloperUtkal University, BhubaneshwarDepartment of Computer Science &ApplicationsVani Vihar, Utkal UniversityBhubaneshwar – 751 004Tel. 0674-580216(O), 540865(R)Fax: 0674-581850Prof. (Mrs) Sanghmitra [email protected]

ä Technology Description
This is an Oriya text-to-speech system: it converts Oriya text into speech. Pulse Code Modulation is applied to the speech signals, and a least mean square technique is then applied to the signals to find the desired parameters; the speech database is prepared in this way. For normalization, the average of a set of samples is taken to standardize the database. For database creation, the purely phonetic nature of the Oriya language is taken into account in the words and sentences that are uttered. The storage code is ISCII. Potential beneficiaries include blind people and illiterate people.

ä Technology Recipientsl Shri Ambika Prasad Das

Avon Technologies,6-3-563/26/A, Somavarapu Heights,Hilltop Colony, ErramanzilHyderabad 500082 A.P.Tel: 040-6580836/37Fax: [email protected] : Text to speech processing systemfor Oriya

l Dr. M.N.CooperThe Managing DirectorModular Infotech Pvt. Ltd.26A, Electronic Cooperative Estate,Pune - Satara RoadPune 411 009Tel. 020- 4223342/4226614/[email protected] : Oriya Spell Checker & TTS


10. Technology Developed
Hindi Speech Recognition System & Adaptation for Bengali

ä Technology DeveloperMegasoft India921, Sector-14, SonipatHaryana-131001Tel : 01264-47807 Fax : 01264-42583Shri Anil [email protected]

ä Technology Description
Hindi Dictation and PC Control is a dynamic speech recognition application by Megasoft India for two languages, Hindi (for the first time) and Indianized English. It is capable of taking continuous dictation and controlling PC functions in both languages through simple phrases and commands, and it can be extended to applications such as telephony-PC communication, voice query systems and more using VB and VC++. It uses the L&H Dragon NaturallySpeaking Professional platform and supports Unicode. Many speech applications with business potential can be built using the Hindi/English speech recognizer. The present research work on the Hindi language (context and vocabulary) can be implemented on other recognizer platforms very quickly with very effective results.

ä Technology Recipientsl Shri Vivek Siegell

HCL Info Systems Ltd.E 4, 5 &6, Sector 11Noida 201301 UPTel: 4550862 (D), 4520977 Extn. 3208Fax: 4533877, [email protected] : Hindi Speech Recognition System

l Shri N. AnbarasanChief Executive OfficerApplesoft (Software developer for Indian Languages)No. 39, 1st Cross,1st Main Shivanagar,W C Road,Banglore – 560 010Phone 080-3386167Fax [email protected] : Hindi Speech Recognition System

l Biswajit SahaER&DCI,Plot E-2/1, Block GP, Sector-VBidhannagar, Kolkata 700091Tel: 3579846, 3575989, 3573581Fax: [email protected]@giascl01.vsnl.net.inTechnology : Adaptation for Bengali

9. Technology Developed
Hindi Speech Recognition System

ä Technology DeveloperIBM India Research LabBlock – I, IIT Campus,Hauz Khas, New Delhi-16Tel. 6861100 Fax : 6861555Shri Ashish Verma &Dr. P.V. Kamesan, Senior [email protected]

ä Technology Description
This is a Hindi speech recognition system for a large-vocabulary, speaker-independent dictation task. For any given language, the computer first needs to learn the sounds of spellings in various contexts; this is the training phase. After this, the system is ready for speech recognition. Complex signal processing and statistical techniques are used to make the recognition robust to variations in speakers' speech and to make it work on continuous speech with a large vocabulary. The system is compliant with IBM ViaVoice standards and will soon use the Unicode standard. It can be customized to different tasks to further improve accuracy. It is expandable, in that new words can be added to the system. Potential beneficiaries are the Hindi-speaking population.

ä Technology Recipientsl Shri N. Anbarasan

Chief Executive OfficerApplesoft (Software developer for IndianLanguages)No. 39, 1st Cross, 1st Main Shivanagar, W CRoad,Banglore – 560 010Phone 080-3386167Fax [email protected] : Hindi Speech Recognition System

l Shri Achinto RakshitCyberspace Multimedia,101, Mahalakshmi Mansion941, 21st Main Road, 22nd ‘A’ CrossOpp: BDA Complex Banashankari 2nd stageBangalore 560070Tel: 080-6710925, [email protected] : Hindi Speech Recognition System


11. Technology Developed
Morphological Analyser and Generators
TTS for Tamil
Spell Checker for Tamil

ä Technology DeveloperAnna UniversitySchool of Computer Science & EngineeringChennai - 600 025Tel. 044-2351723 Fax: 044-2350397Dr. T.V.Geetha/Ms. Ranjani [email protected]

ä Technology Description
This software is a complete morphological generator and analyzer for Tamil nouns, verbs, adjectives and postpositions. The morphological generator and the morphological analyzer are two basic tools required for language processing. Given any word or group of words, the analyzer determines the root and all the add-ons (affixes) that it has taken. The code has been developed in Java, so the system is fully portable and the tool can be used on any of the available platforms. It uses a specially designed internal code instead of storing strings, and the data storage is Unicode compatible. The tool is developed using object-oriented concepts, so it can be expanded as required. It is a collection of modules, such as a verb generator and a noun generator; modules can be added to this tool, and the tool itself can be added as a module to any other language tool which needs it.
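The two directions described above, analysis (word to root plus affixes) and generation (root plus affixes to word), can be illustrated with a toy sketch. It is written in Python for brevity rather than Java, and it is not the Anna University implementation: the roman transliterations, the two-suffix table and the plain string concatenation are assumptions made for the example, and real Tamil morphology involves sandhi changes that this sketch ignores.

```python
from typing import List, Optional, Tuple

# Toy, transliterated data (assumed for illustration only).
ROOTS = {"viitu": "noun"}                            # 'house'
SUFFIXES = {"kaL": "plural", "ai": "accusative"}     # suffix -> feature label
LABEL_TO_SUFFIX = {label: suffix for suffix, label in SUFFIXES.items()}


def analyze(word: str) -> Optional[Tuple[str, List[str]]]:
    """Strip known suffixes from the right until a known root remains."""

    def strip(rest: str, labels: List[str]) -> Optional[Tuple[str, List[str]]]:
        if rest in ROOTS:
            return rest, labels
        for suffix, label in SUFFIXES.items():
            if rest.endswith(suffix) and len(rest) > len(suffix):
                # Suffixes are found right to left, so prepend each label.
                result = strip(rest[: -len(suffix)], [label] + labels)
                if result is not None:
                    return result
        return None  # no root found: the word is outside the toy lexicon

    return strip(word, [])


def generate(root: str, labels: List[str]) -> str:
    """Inverse of analyze: attach suffixes in order (no sandhi handling)."""
    return root + "".join(LABEL_TO_SUFFIX[label] for label in labels)


print(analyze("viitukaLai"))                         # ('viitu', ['plural', 'accusative'])
print(generate("viitu", ["plural", "accusative"]))   # viitukaLai
```

Because the analyzer and the generator share one suffix table, adding a new suffix extends both directions at once, which mirrors the modular design the description mentions.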

ä Technology Recipientsl Shri N. Anbarasan

Chief Executive OfficerApplesoft (Software developer for Indian Languages)No. 39, 1st Cross,1st Main Shivanagar,W C Road,Banglore – 560 010Phone 080-3386167Fax [email protected] : Morphological Analyser and Generators

l Shri Manoj R. AnnaduraiChennai Kabigal,2, Reddy Colony, RamalingapuramChennai 600012Tel: 6449139/[email protected] : Spell Checker for Tamil

l Dr. M.N.CooperThe Managing DirectorModular Infotech Pvt. Ltd.26A, Electronic Cooperative Estate,Pune - Satara RoadPune 411 009Tel. 020-4223342/4226614/[email protected] : Spellchecker & TTS for Tamil


12. Technology Developed
Text to Speech Synthesis System for Hindi, Speech Recognition & Voice Synthesizer

ä Technology DeveloperCentral Electronics Engineering Research InstituteCSIR Complex,NPL Campus, Hillside Road,New Delhi-110 001Tel. 011-5781467/5783172 Fax: 5788347Dr. S. S. Aggarwal, Scientist G and [email protected]

ä Technology Description
HindiVani is Windows-based software for converting Hindi text files into speech. The text document is generated using a Hindi editor which supports the ISCII standard. The input words are split into syllables using a parser. An acoustic-phonetic database of all these syllables is available, and it is subsequently used to create words. The concatenation of syllables into words and the superimposition of quality features are done by rules. A cascade-parallel formant synthesizer developed at CEERI is used to synthesize the speech. The storage code is ISCII. The system can be used on any Pentium machine with the Windows operating system and multimedia facility. The system can be expanded to other spoken languages of India, and it can be integrated with an OCR system. It is a very useful product for Hindi-speaking visually handicapped people, for information retrieval in spoken form, and for text reading machines.
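The first two stages described above, splitting a word into units and concatenating stored units from a database, can be sketched roughly as follows. This is a simplified Python illustration, not the CEERI parser: it splits Unicode Devanagari text (not ISCII) into orthographic units (aksharas) rather than true phonetic syllables, the unit database holds placeholder numbers instead of acoustic parameters, and the rule-based quality features and the synthesizer itself are left out. All names are hypothetical.

```python
from typing import Dict, List

VIRAMA = "\u094D"
# Nukta, chandrabindu, anusvara, visarga and virama attach to the current unit.
COMBINING = {"\u093C", "\u0901", "\u0902", "\u0903", VIRAMA}


def is_consonant(ch: str) -> bool:
    return "\u0915" <= ch <= "\u0939"   # Devanagari KA .. HA


def is_vowel_sign(ch: str) -> bool:
    return "\u093E" <= ch <= "\u094C"   # dependent vowel signs AA .. AU


def split_syllables(word: str) -> List[str]:
    """Split a Devanagari word into orthographic units (a simplification)."""
    units: List[str] = []
    current = ""
    for ch in word:
        if is_consonant(ch):
            # A consonant right after a virama continues a conjunct cluster.
            starts_new = not current.endswith(VIRAMA)
        elif is_vowel_sign(ch) or ch in COMBINING:
            starts_new = False
        else:
            starts_new = True           # independent vowel or other character
        if starts_new and current:
            units.append(current)
            current = ""
        current += ch
    if current:
        units.append(current)
    return units


def synthesize(word: str, unit_db: Dict[str, List[int]]) -> List[int]:
    """Concatenate the stored entries for each unit of the word."""
    samples: List[int] = []
    for unit in split_syllables(word):
        samples.extend(unit_db[unit])
    return samples


# The word "Hindi" in Devanagari, written as escapes: HA, I, NA, VIRAMA, DA, II.
HINDI = "\u0939\u093f\u0928\u094d\u0926\u0940"
# Placeholder database: real entries would be acoustic-phonetic parameters.
UNIT_DB = {"\u0939\u093f": [1, 2], "\u0928\u094d\u0926\u0940": [3, 4, 5]}

print(len(split_syllables(HINDI)))      # 2
print(synthesize(HINDI, UNIT_DB))       # [1, 2, 3, 4, 5]
```

A unit missing from the database raises `KeyError` here; a fuller system would need a fallback, for instance to smaller units.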

ä Technology Recipientsl Prof. (Ms) Sanghmitra Mohanty

Department of computer science & applicationVani Vihar, Utkal UniversityBhubaneshwar – 751 004Tel. 0674-580216 (O), 540865(R)Fax: [email protected] : TTS for Hindi

l Shri Kanta VermaAarkay Computer Research Foundation,8/53, Birbal Road, Jangpura Extension,New Delhi – 110014Phone 6440940/1/2/3 Fax: [email protected] : TTS for Hindi

l Dr. Mukul Kumar Sinha, President, Expert Software Consultants Pvt Ltd, Almora Bhawan, C-17 South Extension Part I, New Delhi 110 066
Tel. 011- 4642675/[email protected]
Technology : TTS for Hindi

l Shri Achinto RakshitCyberspace Multimedia,101, Mahalakshmi Mansion941, 21st Main Road,22nd ‘A’ CrossOpp: BDA Complex Banashankari 2nd stageBangalore 560070Tel: 080-6710925, [email protected] : TTS for Hindi

l Shri Keyur ShroffNational Centre for Software Technology,Gulmohar Cross Road No. 9, Juhu,Bombay – 400 049Tel. 022-6201606/6201574/6249817022-7579935/7812120(D)Fax: 022-6232195/[email protected] : TTS for Hindi

l Dr. M.N.CooperThe Managing DirectorModular Infotech Pvt. Ltd.26A, Electronic Cooperative Estate,Pune - Satara RoadPune 411 009 Tel. 020- 4223342/4226614/[email protected] : TTS for Hindi

l Shri Ambika Prasad DasAvon Technologies,6-3-563/26/A, Somavarapu Heights,Hilltop Colony, ErramanzilHyderabad 500082 A.P.Tel: 040-6580836/37Fax: [email protected] : Hindi TTS & Speech Recognition

l Shri N. Anbarasan, Chief Executive Officer, Applesoft (Software developer for Indian Languages), No. 39, 1st Cross, 1st Main Shivanagar, W C Road, Bangalore – 560 010
Phone 080-3386167 Fax [email protected]
Technology : Voice Synthesizer

13. Technology Developed
Linux with Indian Language Support
ISPELL

ä Technology DeveloperIndian Institute of TechnologyDepartment of Computer Science &EngineeringKanpur-208 016Tel: 0512-597652Fax: 590725Prof. Rajat [email protected]

ä Technology Description
Using this software, it is possible to run any text-based utility in a Unix environment with Indian language support. It is possible to name files and see directory listings in Hindi. Hindi files can be edited using standard terminal-based Unix utilities such as vi, sed, etc. The configuration files supplied with ITERM are written to support ISCII files, the Inscript keyboard layout and Devanagari TeX fonts. Characters can be coded using ISCII or any other standard, and ISFOC fonts or any other fonts can be supported. Inscript and phonetic keyboard layouts are supported. With a few additional configuration files, it can be used with a wide variety of scripts other than Devanagari. Potential beneficiaries are users of the Unix platform.

ä Technology Recipientsl Shri V.K. Gupta

VXL INSTS LTD.IInd floor, No. 1 Mohammadpur,Behind Bhikaji Cama PlaceNew Delhi 110066Tel: 6190061/62/63 Fax: [email protected] : Linux with Indian Language Support

l Shri Manoj R. AnnaduraiChennai Kabigal,2, Reddy Colony, RamalingapuramChennai 600012Tel: 6449139/[email protected] : ISPELL


14. Technology Developed
INDIX (Localized Linux)

ä Technology DeveloperNational Centre for Software TechnologyGulmohar Cross Road No. 9, Juhu,Bombay – 400 049Tel. 022-7579935/7812120(D)Fax: 022-6232195/6210139Shri Keyur [email protected]

ä Technology Description
A graphical user interface in Indian languages for the Linux operating system. An Indic script shaping engine and OpenType font support have been built into the system. The system supports Unicode encoding, and backward compatibility is provided through UTF-8 encoding. Because it has Unicode support at the core level, the system is highly portable. It can be localized for other Indian languages and scripts apart from Devanagari. Installation guidelines and a user's guide for using and configuring the system are ready. It is being developed in the public domain (GPL model). Alpha testing is in progress. Potential beneficiaries include end users interested in Linux with localized applications.
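The remark about Unicode and UTF-8 can be made concrete with a short, generic illustration in Python. It is not INDIX code; the helper name `describe_encoding` is made up for the example. It shows that a Devanagari word is a sequence of Unicode code points, that each of them takes three bytes in UTF-8, and that plain ASCII text is byte-for-byte unchanged in UTF-8, which is one sense in which UTF-8 stays compatible with older software.

```python
from typing import Dict, List, Union


def describe_encoding(text: str) -> Dict[str, Union[List[str], int]]:
    """Return the Unicode code points of `text` and its size in UTF-8 bytes."""
    return {
        "code_points": [f"U+{ord(ch):04X}" for ch in text],
        "utf8_bytes": len(text.encode("utf-8")),
    }


# The word "Hindi" in Devanagari, written as escapes: HA, I, NA, VIRAMA, DA, II.
HINDI = "\u0939\u093f\u0928\u094d\u0926\u0940"

info = describe_encoding(HINDI)
print(info["code_points"])  # ['U+0939', 'U+093F', 'U+0928', 'U+094D', 'U+0926', 'U+0940']
print(info["utf8_bytes"])   # 18 (six code points, three bytes each)

# ASCII text keeps the same bytes under UTF-8.
print("Linux".encode("utf-8") == b"Linux")  # True
```

Turning the six code points into the conjunct-bearing word a reader sees on screen is the job of a shaping engine together with an OpenType font, which is why the description lists both alongside the encoding.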

ä Technology Recipientsl Prof. K. Narayan Murthy

Dept. of Computer Science & Information ScienceUniversity of HyderabadP.O. Central UniversityHyderabad 500 046 (AP)Tel. 040-3010500/518 Extn. 4017 (O),3010064(D), 3010374(R)Fax: 040-3010120, [email protected] : INDIX (Localized Linux)

l Shri N. AnbarasanChief Executive OfficerApplesoft (Software developer for IndianLanguages)No. 39, 1st Cross,1st Main Shivanagar, W C Road,Banglore – 560 010Phone 080-3386167 Fax [email protected] : INDIX (Localized Linux)

l Shri V.K. GuptaVXL INSTS LTD.IInd floor, No. 1 Mohammadpur,Behind Bhikaji Cama PlaceNew Delhi 110066Technology : INDIX (Localized Linux)


15. Technology Developed
Punjabi Spell Checker
Official Dictionary

ä Technology DeveloperThapar Institute of Engineering & TechnologyDepartment of Computer Science &Engineering(Deemed University), Patiala 147 001Tel: 0175-214868, 393137Fax: 0175-214498, 216391, 212012, 212002Dr. G.S. LehalTel:393374(D),283502(R)[email protected]

ä Technology Description
The Punjabi spell checker can be integrated with a word processor. It supports both phonetic and typewriter keyboard layouts for Punjabi. The spell checker can operate on all documents typed in both Punjabi and English. The system encodes the output in ASCII encoding format. The technology can be used on any Windows platform. Potential beneficiaries include the Government sector, typists and educationists.

ä Technology Recipientsl Dr. M.N.Cooper

The Managing DirectorModular Infotech Pvt. ltd.26A, Electronic Cooperative Estate,Pune - Satara RoadPune 411 009Tel. 020- 4223342/4226614/[email protected] : Punjabi Spell Checker & OfficialDictionary

l Shri Manoj R. AnnaduraiChennai Kabigal,2, Reddy Colony, RamalingapuramChennai 600012Tel: 6449139/[email protected] : Punjabi Spell Checker


2.3 Intellectual Property Rights (IPR)
Email : [email protected] Tele: 4363648, 4363123

न हि ज्ञानेन सदृशं पवित्रमिह विद्यते।
तत्स्वयं योगसंसिद्धः कालेनात्मनि विन्दति॥ गीता 4.38॥

“On earth, there is no purifier as great as knowledge,he who has attained purity of heart through a prolongedpractice of Karmayoga automatically sees the light oftruth in the self in course of time (Geeta 4.38)”

The Ministry of Communications and Information Technology (MC&IT) intends to increase awareness of IPR matters among all classes of concerned professionals, so that India can participate on an equal footing with the Member States of the World Trade Organisation (WTO), since it has joined the Agreement on Trade-Related Aspects of Intellectual Property Rights (TRIPS) and also the Patent Cooperation Treaty (PCT), Geneva. The IPR Cell in the Ministry is pursuing the following objectives:

(i) Promotion of IPR activities in the field of electronics and Information Technology

(ii) Create awareness and provide promotional & facilitator support

(iii) Provide value-added patent information support (search and analysis) to assist technology assessment, development, acquisition and investment decisions.

(iv) Respond to IPR needs of digital era.

It is a Single Window Facility in MC&IT which assists S/W developers, scientists, researchers, R&D managers, chief executives of IT and electronics units, policy makers and funding organisations in protecting all Intellectual Property Rights in India and abroad for Electronics and Information Technology, by obtaining registration of: S/W copyrights, patents, trademarks and designs.

It is evolving plans and developing tools and infrastructure for international and national IPR protection and facilitation. It provides services for IPR awareness and for Technology Assessment and Alert based on Patent Search Services in Electronics and IT (TAPs), and it is responding to the needs of the digital era.

IPR Cell, MCIT has already filed 43 patents, 32 copyrights, 9 trade marks and one design to help its PSUs, scientific societies and grantee institutions secure IPR protection for meaningful transfer of technology in current products & services. It has ventured to evolve suitable forms and procedures for research organisations to disclose their patents, license the patents and copyrights, and enter into contracts with industry and Govt. collaborations, along with the related services required by the inventors/creators of IPR on the one hand and industry/service organisations/society on the other.

16. Technology Developed
Hindi Encyclopedia (Vishwakosh)

ä Technology DeveloperElectronics Research & Development CentreC-56/1, Anusandhan Bhawan, Sector-62,Noida – 201301Tel. 91-4587717-25 Fax: 91-4587726Shri V.N.Shukla (ERDCI/N)&Kendriya Hindi Sansthan (KHS)Agra-282005Tel. 91-562-530683/684 Fax:91-562-530159Prof. Thakur Das (KHS)&Nagari Pracharini SabhaVaranasiTel. 91-542-331277 Fax:91-542-331488Shri Sudhakar Pandey (NPS)

ä Technology Description
The Hindi Encyclopedia is the only Hindi encyclopedia, published back in the 1960s by Nagari Pracharini Sabha, Varanasi. It consists of 12 volumes covering almost all details of 1500 topics from various fields of life. The task of digitizing the information and putting it on CD and the internet was assigned to ERDCI/N and KHS Agra by MIT and MHRD as a joint project. The information has been made available in such a way that one can find it alphabetically or by category. The storage code is ISCII. It is expandable to other domains and lexicons. The system can be integrated with various Windows-based word processors. It is useful to translators, linguists, office assistants, educational institutes and organizations working in Hindi.

ä Technology Recipientsl Shri N. Srikumar

Pyramid Cyberway Pvt Ltd.909, Ansal Bhawan16, Kasturba Gandhi MargNew Delhi 110001Tel: 3314156, 3351804, 3355552Fax: [email protected] : Hindi Encyclopedia


2.4 International Symposium on “Language in Cyberspace”, 26-27 September 2001, Seoul, Korea

Knowledge and information sharing through the global information networks by every country and every community in the world is of vital importance for the economic participation, social cohesion and enrichment of linguistic and cultural diversity of mankind;

Multilingualism is of strategic importance to ensure the right to freedom of opinion and expression, the right to participate in the cultural life of the community and to have access to information;

The ICT (Information and Communication Technologies) can play an important role in extending and preserving multilingualism for sustainable economic growth, human advancement and cultural diversity.

The UNESCO General Conference, at its 31st session (October 2001), received inputs from regional workshops on “the promotion and use of multilingualism and universal access to cyberspace” and on the implementation of an intersectoral programme, the Initiative B@bel. The international symposium on “Language in Cyberspace” was organized by the Korean National Commission for UNESCO on 26-27 September 2001 at Seoul. Dr. Om Vikas, MCIT, participated and presented the country paper on Language Technology.

The Recommendations of the Symposium :

I. Policy challenges of multilingualism: bridging the language divide

• Access to the enormous legacy of knowledge available in cyberspace is too often limited by the use of languages not known to the user. This causes a disparity in the access to information between those who speak that language and those who do not, a “language divide”. Although the trend is not a new one, it is accelerated by the use of new technologies leading to the “globalization” of the most important languages of the world to the detriment of a large number of less-used languages. Expertise and resources must be mobilized in the Asia-Pacific region to formulate and implement a coherent strategy to address this issue, such as the use of vernacular languages and the development of translation technologies, not forgetting other means of communication.

• Another phenomenon, not new by itself but increasing exponentially with the introduction of multimedia technologies of communication, is the general decrease in linguistic diversity in the world. Languages are often used as instruments of division between people. Their survival depends on their capacity to resist the dominance and standardized use of other languages. Any loss of language is an impoverishment of global cultural heritage and research capacity. Strategies should be designed to put ICT to the service of the preservation and “resurrection” of languages in danger of disappearance.

• The absence or insufficiency of national policies on the use of languages on the global networks also needs to be looked into, in order to increase the number of vernacular languages and to reinforce the development of national standards for internal representations, character and glyph coding, and transcription schemes. In this connection, the respect and use of all languages in cyberspace should be reaffirmed in the different formally adopted national and international texts, and compatible international norms and principles should be adopted to increase their accessibility on-line;

• Issues of ownership of translated and published works on-line must also be studied, and principles meeting general consensus must be derived in order to facilitate access to these works while preserving the authors' rights;

• The policies for the advancement of multilingualism in cyberspace must be underpinned by the development of language education and ICT user training strategies and tutorial materials freely accessible on-line and responding to the demands of the information society. Policies, support and incentives are needed to provide all citizens with the opportunity to nurture linguistic and ICT literacy through formal, informal and life-long education;

l The challenge of bridging the language divide inthe access to and use of ICT cannot be effectivewithout the full participation of the developingcountries of the region and consideration of theirparticular constraints and needs: Member statesshould plan action to bridge language divide and to

Contents Page 17

Page 20: tdiljan2002

Contents Page 18

harmonize these individual efforts into aconsolidated worldwide approach and to ensure theprovision of freely accessible translation portals;

l Machine translations require the creation ofnumerous aid tools such as specific vocabulariesand multilingual dictionaries, virtual terminologynetworks and terminology data banks. On-linespecialized glossaries, lexicons multilingualindexing of contents, MT software dedicated tothe less common languages, etc. Countries of theAsia-Pacific region should support efforts bypublic and private sectors, internationalorganizations, as well as the civil society, to poolintellectual and financial resources in coremultilingual tools development projects;

III.Content challenges of multilingualism:Digitizing and providing e-contents

l Language diversity should be broadened byincreasing the number of languages and scriptson-line and by the creation of multilingual e-contents, websites and means of maintaining,accessing, retrieving and preserving them throughthe use ICT. All the public domain information(laws, regulations, statistics etc) locally relevantand informative to the citizens for their health,security and participation in their public lifeshould be accessible in the national and/or locallanguage. Countries of the Asia-Pacific regionshould adopt programmes to develop these freelyaccessible Web sites;

l Moreover, a great part of the national physical,written and oral heritage which is already freelyaccessible by all should be produced in digitisedform and made available through the applicationof ICT in the various local languages. Partnershipapproach involving governments, internationalorganizations, the private sector and NGOs shouldconsidered to build-up these massive content;

l In this regard, countries should mobilize resourcesto assist their major cultural institutions such aslibraries, archives and museums in preserving andmaking their collections accessible withappropriate measures of security in several languageson the global information networks, thoughmultilingual portal conception and digitisation.

l The creation of reliable, re-usable andinteroperable content being highly labourintensive and costly, most cost-efficient tools and

ensure promotion of knowledge creation capacities.

l In Particular, regional cooperation among nationsthat share cultural and linguistic similarities mustbe strongly encouraged by governments, regionaland international organizations such asUNESCO.

II. Technological challenges of multilingualism:ensuring language interoperability

l Language cannot be promoted on the globalnetworks without accessibility to the ICT andservices they can render. It is therefore importantto give due attention to the economic constrainsto the Internet connectivity particularly indeveloping countries; the internationalcommunity should support universal access totelematics networks and services as acontemporary of the human right.

l It is equally important that people at large haveaccess to the information available on thenetworks in their mother tongue. Countries ofthe Asia-Pacific region must be encouraged tosupport the development of multimediacommunity centres that would reach out allsegments of the society in their local language;

l Ensuring the “intercommunication” of languageson the networks is an issue for whichtechnological solutions are being actively sought.Numerous research and development projects arebeing carried out to ensure the automatic qualitytranslation of languages and machine use andrecognition of different scripts such as thedevelopment of smart fonts, cross-lingualconversion utilities, multilingual search engines,multilingual voice searching, speech recognitionetc. These efforts should be based on solid theoreticalfoundations and supplemented by objectivetranslation evaluation and accreditation systems.

l In particular, the lexicon, morphology, syntax andsemantic of different languages should bedescribed systematically in a manner suitable forautomatic word form recognition, parsing, andgeneration. Furthermore, these componentsshould be functionally integrated to modelmappings from language-specific surfaces touniversal content representations (hearer mode)and vice versa (speaker mode). International,regional and national governmental and non-governmental organizations should seek to

Page 21: tdiljan2002

• What is B@bel Initiative?

Public domain information is a global public good. With this in mind, UNESCO’s main goal consists in redefining universal access to information in all languages in cyberspace by encouraging (1) the development of tools (translation mechanisms; terminology; protocols; etc.) that will facilitate multilingual communication in cyberspace; (2) the promotion of fair allocation of public resources to public information providers; and (3) the promotion of access to multilingual public domain information and knowledge.

The programme “Initiative B@bel” proposes to do this by implementing concrete activities at national and international levels, with the objective to develop multilingualism on the information networks and to encourage full partnership between governments, industry and civil society. The programme could be oriented in several directions:

Creation of the infrastructure: establishment of UNESCO Chairs, associating universities with industry, for strengthening research in and development of multilingual search engines, multilingual gateways, virtual libraries and archives, etc.;

Development of multilingual tools: adapting multilingual indexing of websites, thesauri, standards, lexicons and terminology existing in the European Union, UNESCO, ISO, UNU, Union Latine, Infoterm, etc., to other languages including local ones;

Strengthen interoperability: supporting the development of automatic translation tools, including the production of free translation software, the application of translation schools' work to web pages, the on-line development of multilingual encyclopedia, upgrading of routers, etc.;

Formulation of national and international policies and regulations: encouraging the use of many languages on the information networks, the on-line teaching of foreign languages in the education systems, the development of multilingual websites (with a web prize), etc.

methods should be used on the basis of international standards for the preparation of these contents. It should include open system development (i.e. CDS/ISIS software) with respect to data modeling and system design for net-based distributed collective work, their evaluation as well as content validation, user empowerment and promotion of international standards.

• Special attention should be given in the Asia-Pacific region to the issues arising from the protection of the intellectual property rights and the preservation of legal exemptions to the copyright, especially when it concerns reuse of information in general and particularly the translated public domain information;

IV. International challenges : Promoting and disseminating policy experiences.

• A reliable international comparative survey on the use of language on the internet, and more particularly on the related policies, norms and standards adopted in different countries, is much needed. International organizations, in particular UNESCO, should work out a scheme for the collection and sharing of such information.

• UNESCO should support mechanisms preventing exploitation of knowledge from indigenous language e-contents.

• The UNESCO Statistical Office and the UNESCO Observatory on the Information Society should be mobilized to collect, maintain and diffuse information on the multilingual resources and services produced in the countries and on existing policies related to these resources, and to disseminate information on the best practices;

• The principles and measures envisaged in the Recommendation on the promotion and use of multilingualism and universal access to cyberspace, prepared within the UNESCO Programme Information for All and reflecting the above-mentioned proposals, should be given support from the countries of the region; these principles should be presented during the World Summit on the Information Society to take place in 2003.

• The Asian-Pacific countries should participate actively in the implementation of this recommendation and take part in the development of Initiative B@bel through intellectual and financial input.


2.5 SCALLA 2001 “Sharing Capability in Localisation and Human Language Technologies”

on Nov. 21-23, 2001 at NCST, Bangalore

The SCALLA 2001 working conference was held at NCST, Bangalore on November 21-23, 2001. SCALLA is a joint project of NCST Mumbai, Open University UK and ISI Kolkata, with funding from the European Union under the Asia IT&C program. Its aim is to organize a series of 3 workshops in consecutive years from 2001 to 2003, to bring together experts in the field of localisation and human language technology from Europe and South Asia, with a view to exchanging ideas in the field of localisation and human language technology, which have an important role to play in bridging the digital divide.

SCALLA 2001 featured about 20 invited experts from India and the UK with focus on the languages of the South Asian region.

Day I

Session 1 : “Introduction”
Chair : Dr S.P. Mudur
Prof. Pat Hall gave an overview of the conference.

Session 2 : “Localisation Needs”
Chair : Mr S Ramakrishnan
A prototype of the Simputer, a low-cost, portable and sharable device that can take IT to the common man, was demonstrated. The Honey Bee Networks project disseminates information about grassroots innovations in technology in several Indian states and languages.

Session 3 : “Localisation Practices”
Chair : Prof. B B Chaudhuri
Dr Reinhard Schaler spoke about the current scenario of localisation in Europe. Dr Mudur spoke about the issues and status of localisation of Indian languages. Prof. Pat Hall spoke about Software components and APIs, which are useful technologies for localisation.

Session 4 : “Writing Systems, Input and Output”
Chair : Prof. Pat Hall
Prof. R K Joshi spoke about the uniqueness of the Writing Systems of India, and Prof. B B Chaudhuri presented OCR for Bangla.

Session 5 : “Cultural aspects of localisation”
Chair : Dr Ostler
The main issue raised was the need to be aware of cultural differences that merit investigation of alternative paradigms of user interfaces, but it was felt that this question has not yet been adequately studied.

Day II

Session 1 : “Language Models”
Chair : Prof. B N Patnaik
Prof. Boyd Michailovsky spoke about developing a lexicon and syntax for the Hoya Language of Nepal, and Prof. Harold Somers spoke about Developing linguistic resources from corpus material.

Session 2 : “Language Generation”
Chair : Prof. Rajeev Sangal
In this session, Dr Donia Scott gave a presentation on Multilingual Natural Language Generation in specific domains.

Session 3 : “Lexicography and Translation”
Chair : Prof. U N Singh
Mr Durgesh Rao spoke about the issues in translating between English and Hindi. This was followed by a presentation by Prof. Rajeev Sangal on building Lexical Resources for Indian Languages, such as a free electronic English-Hindi dictionary.

Session 4 : “Lexicography”
Chair : Prof. Harold Somers
Dr Niladri Sekhar Das spoke on the contribution of language corpora to the development of dictionaries. Prof. U N Singh and Dr B Mallikarjun spoke on making a traditional dictionary into an electronic lexicon.

Day III

Session 1 : “Speech and Literacy”
Chair : Dr Reinhard Schaler
Dr Asoke Kumar Dutta spoke on Disbursing spoken language technology in regional dialects, and Dr Gautam Sengupta talked on Voice-enabled Machine Readable dictionaries. Dr Roger Tucker presented a pure speech personal digital assistant.

Session 2 : “Concluding Session”
Chair : Dr Om Vikas
Dr Om Vikas highlighted the role of the Indian Government in promoting growth of Indian language technologies, through the Technology Development for Indian Languages (TDIL) initiative of the Department of IT. There is a possibility of collaboration between universities in Europe and institutes in India to offer courses in Computational Linguistics. Finally, Prof. Pat Hall summed up the conference.


2.6 Asia-Pacific Regional Consultation on UNESCO’s Medium Term Strategy for “The Major Programme on Communication and Information” from 2002-2007,

on 18-19 December 2001 at New Delhi

Dr. Om Vikas, DIT and Dr. N. Vijayaditya, NIC participated in the discussion workshop.

Four areas of activities and priority deliverables were identified as follows:

• Content
• Access
• Capacity Building
• Policy

Content

Challenges
• Lack of content development templates/guidelines
• Lack of content development experiences
• Lack of tools to digitize local content.

Actions
• Develop content templates/guidelines for community needs
• Share content development experiences
• Identify and promote technology to provide tools to digitize local content.
• Form partnerships for content development and sharing.
• Partners: Extension services (Education, Agriculture, Health), Govt. Agencies, NGOs, Community organizations and leaders, Media organizations, Professional organizations, Private entrepreneurs.

Flagship Programmes
• Developing Local language processing capability.

Access

Challenges
• Lack of connectivity (telecom, internet). High cost of connectivity and access.
• Lack of appropriate low-cost devices (e.g. computers, FM radios, energy sources)
• Lack of policy support and political will for providing and ensuring universal access.

Actions
• Promote low cost devices (e.g. handhelds, FM radio sets)
• Promote multi-purpose community centres and access points, especially for marginalized groups.
• Advocate universal access policies.

Flagship Programmes
• Develop a business/operation model for sustainable multipurpose community centres and test this model by creating five community centres in different countries/cultures/languages/settings.

Capacity Building

Challenges
• Lack of trained human resources in the field of ICT,
• Lack of localized tools, technologies and methodologies in the field of ICT,
• Lack of adequate financial resources for ICT enabled development
• Lack of awareness and management of changes (economic, political and social)
• Lack of awareness among professionals and decision-makers about the role of ICT in development.

Actions
• Training of trainers
• Training and retraining human resources in the field of ICT at various levels.
• Promoting ICT for high quality e-skills development
• Organizing dialogues on ethics and social responsibility in ICT.


2.7 The Asia Pacific Development Information Programme (APDIP) - UNDP

APDIP seeks to promote and establish information technology (IT) for social and economic development throughout Asia-Pacific. The Programme serves 42 countries.

APDIP is funded by the United Nations Development Programme (UNDP) and implemented by the UN Office for Project Services (UNOPS), Asia office.

Information Technology for Developing Countries

IT supports social and economic development by:

(i) Reducing geographic isolation;

(ii) Providing access to information and knowledge resources and a means for exchanging information;

(iii) Creating opportunities for expanded trade and economic growth; and,

(iv) Enabling greater participation of Civil society and transparency in governance.

APDIP Strategies

• Building capacity at all levels: APDIP sensitises decision-makers, assists in developing an enabling environment for IT and provides technical training.

• Identifying IT champions who provide a vision for what IT can do, and have the influence to make it happen.

• Identifying new opportunities for development is critical as countries implement IT/Internet-related services.

• Co-operating with the private sector helps spur investment in the developing world.

• Encouraging South-South partnerships as one way of building capacity.

• Researching appropriate technologies to minimise access costs and extend the reach of the Internet to rural areas of developing countries.

Capacity Building

Seminars on IT Policy and Infrastructure Development

The APDIP-Cisco Networking Academies

An innovative partnership with Cisco Systems

• Sharing of basic tools, technology and methodologies for ICT enabled development
• Promoting national and international partnerships to respond to various challenges in the field of ICT.
• Establishing UNESCO collaborative virtual institute for ICT enabled development.

Flagship Programmes
• Training the trainers for ICT for development: equitable, sustainable and peaceful
• Integrating ICT into curricula in schools and universities.
• Promoting ICT enabled entrepreneurship
• Collaborative Virtual Institute for ICT enabled Development.

Policies

Challenges
• Lack of specific policies and strategies and institutional infrastructure for implementation.

Actions
• Formulation of holistic national & regional policies & strategies (HRD, Access and Application) through participative process & enactment. Preparation of legal and institutional framework for executing them.

Flagship Programmes
• Assisting in formulation of national/sub regional policy and promoting sub regional and regional cooperation.
• Development of guidelines for policy and strategy formulation.


established to counter the severe shortage of network specialists in the Asia-Pacific.

The Mobile Internet Unit (MIU)

Equipped with Internet-ready computers, this electronic classroom on wheels travels to rural and marginalised urban secondary schools, training teachers and students alike on how to fully benefit from the global information infrastructure.

Workshops for Technical Personnel

APDIP frequently conducts Webmaster Development Workshops.

Technical Assistance

Wiring the World: Building Connectivity

APDIP assists countries lacking an Internet connection to design and establish low-cost links via satellite so that applications of IT (telemedicine, distance education, or e-commerce) can take off and prosper.

Networking People

APDIP assists regional organisations and countries to improve their co-operation by creating and establishing systems that are designed to meet specific needs.

Hosting Information

One of APDIP’s major objectives is assisting other development agencies to implement information services.

Research & Development

Expanding Connectivity to Rural Areas.

APDIP is working with a number of partners to devise ways of making satellite connections to the Internet affordable in rural areas.

Application Development

APDIP promotes customisation of software for developing countries.

Global Issues

The Internet Governance Information Service

APDIP hosts the on-line Asia-Pacific Internet Governance Information Service.

Contact :
Asia Pacific Development Information Programme (APDIP) of UNDP
P.O Box 12544, 50782 Kuala Lumpur, Malaysia
Tel: 603-255-9122 Fax: 603-2539740
E-mail: [email protected] URL : www.apidip.net

2.8 The Workshop on Corpus-based Natural Language Processing on 17-31 Dec 2001 at Anna University

The Workshop on Corpus-based Natural Language Processing was held from 17th Dec 2001 to 31st Dec 2001 at AU-KBC Research Centre, MIT campus, Anna University, jointly with The Language Technology Research Centre, IIT Hyderabad, The National Centre for Software Technology, Mumbai, The Resource Center for Indian Language Technology Solutions-Tamil, Anna University & The Tamil University, Thanjavur.

The objectives of the workshop were to provide an understanding of NLP in the context of Machine Translation, Multilingual Information Retrieval and other applications related to Indian Languages and English; to introduce the usage of Statistical Techniques on Lexical Resources to refine rule-based methods, which would be used to develop the NLP applications; and to provide training in the use of Tools and Resources in the domain of Statistical processing.

The topics covered in detail in lectures, tutorials and labs were Finite State Automata and Finite State Transducers, Linguistic Formalisms and Computational Grammar, Hidden Markov Models, Parsing, Machine Learning, Word Sense Disambiguation, Statistical Machine Translation and Information Retrieval.

In addition to the hands-on lab classes, the teams of participants were also involved in executing real-life Projects as a part of the workshop, broadly towards realizing an English-Indian language MT solution and/or a cross-lingual Information Retrieval system. The necessary multilingual corpora and other resources, created in Tamil, Hindi and Telugu in addition to English, were made available at the workshop.

There were 35 participants who belonged to different streams : Language and Linguistic streams and Science and Engineering streams such as Computers, Communications, Maths and Statistics.

The Resource persons of the Workshop were :

• Prof. Aravind K. Joshi, University of Pennsylvania, USA
• Dr. B. Srinivas, AT&T Research, New Jersey, USA
• Dr. Anoop Sarkar, University of Pennsylvania, USA
• Dr. Rajeev Sangal, IIT, Hyderabad
• Durgesh D. Rao, NCST, Mumbai
• Sushma Bendre, IIT, Hyderabad

Five Resource centers from Punjab, Anna University, Bangalore, Assam and Kerala presented their work to the participants at a specially arranged Session on 29.12.2001.


2.9 1st Workshop on Indian Language OCR on February 1-3, 2002 at University of Hyderabad

The Resource Centre for Indian Language Technology Solutions (Telugu), established by the Ministry of Information Technology, Govt. of India, conducted a three-day workshop on Indian Language OCR Systems.

Researchers from various centres including ISI (Kolkata), IISc (Bangalore), C-DAC (Pune), Thapar Institute (Patiala), IIIT (Hyderabad), Vicisoft Technologies, Secunderabad, DRDL Hyderabad, University of Mysore, and University of Hyderabad participated in the workshop. Each of the centres explained in depth all the technical details of their own systems.

The status report and plan of action resulting from the workshop are summarized below:

Status Report and Plan of Action

Various centres working on OCR systems for Indian scripts have been doing very well and full-fledged OCR systems can be developed in six to twelve months from now. These centres have been taking their own approaches which hold promise. There is still a lot of scope for further experimentation and fine tuning.

The OCR systems being developed for different scripts at various centres are as follows:

1. Optical Character Recognition System for Devanagari

Dr. Veena Bansal, IIT, Kanpur
Tel: 0512-597743 E-mail: [email protected]

Prof. B.B. Chaudhuri, ISI, Kolkata
Tel: 033-5778085 E-mail: [email protected]

Shri M.D. Kulkarni, C-DAC, Pune
Tel: 020-5694000 E-mail: [email protected]

2. Optical Character Recognition System for Bangla

Prof. B.B. Chaudhuri, ISI, Kolkata
Tel: 033-5778085 E-mail: [email protected]

3. Optical Character Recognition System for Oriya

Prof. B.B. Chaudhuri, ISI, Kolkata
Tel: 033-5778085 E-mail: [email protected]


Prof (Mrs.) Sanghmitra Mohanty, Utkal University
Tel: 0674-580216 E-mail: [email protected]

Sh. A.K. Pujari, OCAC Bhubaneshwar
Tel: 0674-543113 E-mail: [email protected]

4. Optical Character Recognition System for Gurumukhi

Prof. G.S. Lehal, TIET, Patiala
Tel: 0175-214868 E-mail: [email protected]

5. Optical Character Recognition System for Telugu

Prof. K. Narayan Murthy, Univ. of Hyderabad, Hyderabad
Tel: 040-3010500 E-mail: [email protected]

6. Optical Character Recognition System for Tamil, Kannada

Prof. N.J. Rao, IISc. Bangalore
Tel: 080-3092222 E-mail: [email protected]

7. Optical Character Recognition System for Malayalam

Prof. Ravinder Kumar, ER&DCI Trivandrum
Tel: 0471-320116 E-mail: [email protected]

Various centres have also been planning post-processing techniques suitable for their own languages and scripts. However, it was stressed that many of the pre-processing modules are general purpose tools that can be exchanged for mutual benefit. This would save a lot of time and avoid duplication of effort.

It was agreed that another technical workshop cum open competition may be planned sometime in August 2002 wherein various OCR systems can be tested thoroughly on standardized test data including laser printed documents as well as printed books. Newspapers can also be tried out as a challenge. The output of the OCR systems will be in ISCII/UNICODE.

The IIIT Hyderabad volunteered to host a mailing list named ILOCR so that the community could all be in close touch.


2.10 1st International Conference on Global WordNet

on January 21-25, 2002 at CIIL Mysore

The Central Institute of Indian Languages, Mysore organized the 1st International Conference on Global WordNet from January 21-25, 2002 in collaboration with Global WordNet Association, Netherlands, IIT Bombay and IIIT Hyderabad. The total number of registered delegates was 81 from 19 countries including India.

There were 12 academic sessions apart from 2 sessions of tutorials, Introductory Session and Business Meeting.

In the introductory session, Prof. Udaya Narayana Singh (Director, CIIL) welcomed delegates, informed them about the Institute’s activities and the possible contribution of the Institute in the area of Building WordNets in Indian Languages. Dr. Piek Vossen and Dr. Christiane Fellbaum in their introductory remarks gave an account of the development of WordNets with emphasis on Princeton and Euro WordNet.

The twelve academic sessions discussed the following themes:

• Building WordNets
• Aligning WordNets/Cross Linguistic Work
• Semantic Relations and Lexical Semantics
• Assigning Domain Labels
• Ontologies, Concepts, Top Levels
• Lexical Semantics
• Disambiguation and Semantic Annotation
• Sublanguages
• Applications
• Interfaces

During academic sessions, a total of 50 papers were presented. The accepted papers went through a two-way blind review.

There were two tutorial sessions. One was given jointly by Dr. Christiane Fellbaum, Dr. Piek Vossen and Dr. Palmira Marrafa and the other was given by Eneko Agirre and German Rigau together. Both tutorials discussed various issues in Building WordNets based on the experiences while building Princeton, Euro and Basque WordNets.

The business meeting was open to all the participants and was conducted by three board members of Global WordNet Association, namely, Piek Vossen, Christiane Fellbaum and Palmira Marrafa. The issues discussed during the business meeting were the activities and membership of GWA, Communication among members, Standardization of WordNets and Future meetings.

The following are the recommendations:

1. Membership fee for the GWA for the period up to and including the next GWA Meeting will be waived for all the participants of the First International Conference and membership is already available through the GWN website.

2. Prof. Udaya Narayana Singh offered to establish a website exclusively for the WordNet related activities from the CIIL, Mysore site.

3. Standards must be developed with respect to the following points:

(a) lexical and semantic relations (both content and labelling);

(b) representation (such as XML, as developed by IRST and Brno);

(c) the database

(d) shared tools

4. The available Industrial tools and interfaces should be shared as much as possible.

5. A tentative projected date for the next GWN conference is January, 2004. Interest in organizing the next meeting had been expressed informally during the meeting by Sofia Stamou (Patras, Greece) and Pavel Smrz (Brno, Czech Republic).

The Proceedings of the conference papers are published with an introduction by Dr. Piek Vossen & Dr. Christiane Fellbaum and Foreword by Prof. Udaya Narayana Singh. It contains all the 50 papers accepted for the conference.

The publication is priced at Rs. 160 ($18.00). The Proceedings are also available in a CD form, which costs the same amount. These can be ordered from CIIL, Mysore (Contact: [email protected]).


Devanagari Code Chart

3.1 Revision of Unicode Standard-3.0 for Devanagari Script

Unicode Standards are widely being used by the Industry for the development of multilingual software. Indian scripts are also included in the Unicode Standards. The Unicode Consortium had taken the basic inputs for standardisation of Indian scripts from the ISCII-1988 document. There are some deficiencies in the present Unicode Standards for Indian Scripts, and these need to be removed for proper representation of the Indian Scripts.

MIT is a voting member of the Unicode Consortium. The ministry has collected inputs for each of the Indian scripts in order to make a single unified presentation of Indian scripts to the Unicode Consortium. A number of meetings starting from November 2000 have been organized with the concerned State Governments/Organizations/Experts and Industry on this subject. Based on the discussions/feedback, draft Code charts for each of the scripts have been prepared along with code details. The first draft proposal of the proposed changes in the existing Unicode Standards for Indian scripts was published in the May 2001 issue of the TDIL Newsletter VishwaBharat@tdil to get feedback from experts/industry/users working in the area of Indian Language Software Development. The newsletter was also sent to the members of the Unicode Consortium for their comments and initial response. The draft proposal was discussed in the Unicode Technical Committee (UTC) meeting held in USA in November 2001. The minutes of the UTC Meeting are available in document No. L2/01-430R and L2/01-431R of Unicode Consortium. The URLs for viewing these documents are given below:
http://tdil.mit.gov.in/newsletter1.htm
http://www.unicode.org/L2/L2001/01304-feedback.pdf

After getting the feedback from state governments/linguists/experts/industry, the second draft proposal is also being prepared. The final draft of the Devanagari script is published here for your reference. The whole write-up is divided into three parts:

1. Code chart - For quick reference of the characters and their code value.

2. Code Details - For reference of the character names and annotations.

3. Devanagari: A brief review of the script - This write-up covers various aspects of the script.

3. Standardization


Devanagari Code Chart Details

Code Point  Character  Description

0901  ◌ँ  DEVANAGARI SIGN CANDRABINDU
      = anunasika
      → 0310 combining candrabindu
0902  ◌ं  DEVANAGARI SIGN ANUSVARA
      = bindu
0903  ◌ः  DEVANAGARI SIGN VISARGA

Independent vowels
0905  अ  DEVANAGARI LETTER A
0906  आ  DEVANAGARI LETTER AA
0907  इ  DEVANAGARI LETTER I
0908  ई  DEVANAGARI LETTER II
0909  उ  DEVANAGARI LETTER U
090A  ऊ  DEVANAGARI LETTER UU
090B  ऋ  DEVANAGARI LETTER VOCALIC R
090C  ऌ  DEVANAGARI LETTER VOCALIC L
090D  ऍ  DEVANAGARI LETTER CANDRA E
090E  ऎ  DEVANAGARI LETTER SHORT E
      • for transcribing Dravidian short e
090F  ए  DEVANAGARI LETTER E
0910  ऐ  DEVANAGARI LETTER AI
0911  ऑ  DEVANAGARI LETTER CANDRA O
0912  ऒ  DEVANAGARI LETTER SHORT O
      • for transcribing Dravidian short o
0913  ओ  DEVANAGARI LETTER O
0914  औ  DEVANAGARI LETTER AU

Consonants
0915  क  DEVANAGARI LETTER KA
0916  ख  DEVANAGARI LETTER KHA
0917  ग  DEVANAGARI LETTER GA
0918  घ  DEVANAGARI LETTER GHA
0919  ङ  DEVANAGARI LETTER NGA
091A  च  DEVANAGARI LETTER CA
091B  छ  DEVANAGARI LETTER CHA
091C  ज  DEVANAGARI LETTER JA
091D  झ  DEVANAGARI LETTER JHA
091E  ञ  DEVANAGARI LETTER NYA
091F  ट  DEVANAGARI LETTER TTA
0920  ठ  DEVANAGARI LETTER TTHA
0921  ड  DEVANAGARI LETTER DDA
0922  ढ  DEVANAGARI LETTER DDHA
0923  ण  DEVANAGARI LETTER NNA
0924  त  DEVANAGARI LETTER TA
0925  थ  DEVANAGARI LETTER THA
0926  द  DEVANAGARI LETTER DA
0927  ध  DEVANAGARI LETTER DHA
0928  न  DEVANAGARI LETTER NA
0929  ऩ  DEVANAGARI LETTER NNNA
      • for transcribing Dravidian alveolar n
      ≡ 0928 न 093C ◌़
092A  प  DEVANAGARI LETTER PA
092B  फ  DEVANAGARI LETTER PHA
092C  ब  DEVANAGARI LETTER BA
092D  भ  DEVANAGARI LETTER BHA
092E  म  DEVANAGARI LETTER MA
092F  य  DEVANAGARI LETTER YA
0930  र  DEVANAGARI LETTER RA
0931  ऱ  DEVANAGARI LETTER RRA
      • for transcribing Dravidian alveolar r
0932  ल  DEVANAGARI LETTER LA
0933  ळ  DEVANAGARI LETTER LLA
0934  ऴ  DEVANAGARI LETTER LLLA
      • for transcribing Dravidian l
      ≡ 0933 ळ 093C ◌़
0935  व  DEVANAGARI LETTER VA
0936  श  DEVANAGARI LETTER SHA


0937  ष  DEVANAGARI LETTER SSA
0938  स  DEVANAGARI LETTER SA
0939  ह  DEVANAGARI LETTER HA
093A  ◌  DEVANAGARI INVISIBLE LETTER

Various signs
093C  ◌़  DEVANAGARI SIGN NUKTA
      • for extending the alphabet to new letters
093D  ऽ  DEVANAGARI SIGN AVAGRAHA
093E  ◌ा  DEVANAGARI VOWEL SIGN AA

Dependent vowel signs
093F  ◌ि  DEVANAGARI VOWEL SIGN I
      • stands to the left of the consonant
0940  ◌ी  DEVANAGARI VOWEL SIGN II
0941  ◌ु  DEVANAGARI VOWEL SIGN U
0942  ◌ू  DEVANAGARI VOWEL SIGN UU
0943  ◌ृ  DEVANAGARI VOWEL SIGN VOCALIC R
0944  ◌ॄ  DEVANAGARI VOWEL SIGN VOCALIC RR
0945  ◌ॅ  DEVANAGARI VOWEL SIGN CANDRA E
      = candra
0946  ◌ॆ  DEVANAGARI VOWEL SIGN SHORT E
      • for transcribing Dravidian vowels
0947  ◌े  DEVANAGARI VOWEL SIGN E
0948  ◌ै  DEVANAGARI VOWEL SIGN AI
0949  ◌ॉ  DEVANAGARI VOWEL SIGN CANDRA O
094A  ◌ॊ  DEVANAGARI SHORT O
      • for transcribing Dravidian vowels
094B  ◌ो  DEVANAGARI VOWEL SIGN O
094C  ◌ौ  DEVANAGARI VOWEL SIGN AU

Various signs
094D  ◌्  DEVANAGARI SIGN HAL
      • suppresses inherent vowel
094E  <reserved>
094F  <reserved>
0950  ॐ  DEVANAGARI OM
0951  ◌॑  DEVANAGARI STRESS SIGN UDATTA
0952  ◌॒  DEVANAGARI STRESS SIGN ANUDATTA
0953  ◌॓  DEVANAGARI GRAVE ACCENT
0954  ◌॔  DEVANAGARI ACUTE ACCENT
0955      DEVANAGARI ANUSWARA
      • Used in Sanskrit Yajurveda
0956      DEVANAGARI SIGN YAJURVEDIC ANUSWARA
      • Used in Sanskrit
0957      DEVANAGARI JIVHAMULIYA
      • Used in Sanskrit

Additional consonants (Their use should be avoided)
0958  क़  DEVANAGARI LETTER QA
      ≡ 0915 क 093C ◌़
0959  ख़  DEVANAGARI LETTER KHHA
      ≡ 0916 ख 093C ◌़
095A  ग़  DEVANAGARI LETTER GHHA
      ≡ 0917 ग 093C ◌़
095B  ज़  DEVANAGARI LETTER ZA
      ≡ 091C ज 093C ◌़
095C  ड़  DEVANAGARI LETTER DDDHA
      ≡ 0921 ड 093C ◌़
095D  ढ़  DEVANAGARI LETTER RHA
      ≡ 0922 ढ 093C ◌़
095E  फ़  DEVANAGARI LETTER FA
      ≡ 092B फ 093C ◌़
095F  य़  DEVANAGARI LETTER YYA
      ≡ 092F य 093C ◌़

Generic additions
0960  ॠ  DEVANAGARI LETTER VOCALIC RR
0961  ॡ  DEVANAGARI LETTER VOCALIC LL
0962  ◌ॢ  DEVANAGARI VOWEL SIGN VOCALIC L
0963  ◌ॣ  DEVANAGARI VOWEL SIGN VOCALIC LL
0964  ।  DEVANAGARI PURNA VIRAMA
      = phrase separator
0965  ॥  DEVANAGARI DEERGH VIRAM

Digits
0966  ०  DEVANAGARI DIGIT ZERO
0967  १  DEVANAGARI DIGIT ONE
      • Shape 1 is also used in Hindi
0968  २  DEVANAGARI DIGIT TWO
0969  ३  DEVANAGARI DIGIT THREE
096A  ४  DEVANAGARI DIGIT FOUR
096B  ५  DEVANAGARI DIGIT FIVE
      • Shape 5 is also used in Hindi
096C  ६  DEVANAGARI DIGIT SIX
096D  ७  DEVANAGARI DIGIT SEVEN
096E  ८  DEVANAGARI DIGIT EIGHT
      • Shape 8 is also used in Hindi
096F  ९  DEVANAGARI DIGIT NINE
      • Shape 9 is also used in Hindi

Devanagari-specific additions
0970  ॰  DEVANAGARI ABBREVIATION SIGN
0971  रु॰  DEVANAGARI CURRENCY SIGN
0972  क्ष  DEVANAGARI LETTER KSHA
0973  ज्ञ  DEVANAGARI LETTER GNYA
0974  श्र  DEVANAGARI SIGN SHRA
0975      DEVANAGARI SIGN REPH
0976      LETTER SHA Used in Marathi
      • श (0936) ≡ 0976
0977      DEVANAGARI LETTER LA Used in Marathi
      • ल (0932) ≡ 0977
0978      DEVANAGARI SIGN /SOFT RA/
      • Used in MARATHI as consonant modifier
0979      DEVANAGARI CONSONANT
      • Used for SINDHI implosive placed just below the consonant
097A      DEVANAGARI CONSONANT
      • Used for SINDHI implosive placed just below the consonant
097B      DEVANAGARI CONSONANT
      • Used for SINDHI implosive placed just below the consonant
097C      DEVANAGARI CONSONANT
      • Used for SINDHI implosive placed just below the consonant
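The ≡ annotations in the chart state that a nukta consonant is equivalent to a base consonant followed by the nukta sign (093C). As a hedged illustration, the sketch below checks those equivalences with Python's standard unicodedata module. It relies on the published Unicode character database that ships with Python, not on the draft chart printed here, so it covers only code points that exist in both; the names EQUIVALENCES, decompose and check_equivalences are ours.

```python
import unicodedata

NUKTA = "\u093C"  # DEVANAGARI SIGN NUKTA

# Pairs copied from the chart's "≡" annotations:
# precomposed letter -> base consonant that takes the nukta.
EQUIVALENCES = {
    "\u0929": "\u0928",  # NNNA  = NA   + nukta
    "\u0934": "\u0933",  # LLLA  = LLA  + nukta
    "\u0958": "\u0915",  # QA    = KA   + nukta
    "\u0959": "\u0916",  # KHHA  = KHA  + nukta
    "\u095A": "\u0917",  # GHHA  = GA   + nukta
    "\u095B": "\u091C",  # ZA    = JA   + nukta
    "\u095C": "\u0921",  # DDDHA = DDA  + nukta
    "\u095D": "\u0922",  # RHA   = DDHA + nukta
    "\u095E": "\u092B",  # FA    = PHA  + nukta
    "\u095F": "\u092F",  # YYA   = YA   + nukta
}


def decompose(text):
    """Return the canonical decomposition (NFD) of text."""
    return unicodedata.normalize("NFD", text)


def check_equivalences():
    """Map each precomposed code point to True if NFD yields base + nukta."""
    return {
        f"{ord(pre):04X}": decompose(pre) == base + NUKTA
        for pre, base in EQUIVALENCES.items()
    }


if __name__ == "__main__":
    for code_point, ok in check_equivalences().items():
        print(code_point, "matches chart" if ok else "differs from chart")
```

Storing text in decomposed form means software only has to handle the base consonants and the nukta, which is consistent with the chart's advice that the additional consonants be avoided.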


Explanations for Revised Devanagari Code Chart

Devanagari : U +0900-U +097F

The Devanagari script is used for writing classicalSanskrit and its modern historical derivative, Hindi.Extensions to Devanagari are used to write otherrelated languages of India (such as Marathi, Konkani,Sindhi and Sanskrit) and of Nepal (Nepali). In addition,the Devanagari script is used to write the dialects ofHindi and various other regional & tribal languages.

All other Indic scripts including Nandi Nagari, aswell as the Sinhala script of Sri Lanka, the Tibetanscript, and the Southeast Asian scripts (Thai, Lao,Khmer, and Myanmar), are historically connectedwith the Devanagari script as descendants of theancient Brahmi script. The entire family of scriptsshares a large number of structural features.

The principles of the Indic scripts are covered insome detail in this introduction to the Devanagariscript. The remaining introductions to the Indicscripts are abbreviated but highlight any differencesfrom Devanagari where appropriate.

Standards : The Devanagari block of the UnicodeStandard is based on ISCII-1988 (Indian StandardCode for Information Interchange). The ISCIIstandard of 1988 differs from and is an update ofearlier ISCII standards issued in 1983 and 1986.

The Unicode Standard encodes Devanagari characters in the same relative position as those coded in positions A0-F4 (hexadecimal) in the ISCII-1988 standard.

The same character code layout is followed for eight other Indic scripts in the Unicode Standard: Bengali, Gurmukhi, Gujarati, Oriya, Tamil, Telugu, Kannada, and Malayalam. This parallel code layout emphasizes the structural similarities of the Brahmi scripts and follows the stated intention of the Indian coding standards to enable one-to-one mappings between analogous coding positions in different scripts in the family. Sinhala, Thai, Lao, Khmer, and Myanmar depart to a greater extent from the Devanagari structural pattern, so the Unicode Standard does not attempt to provide any direct mappings for these scripts to the Devanagari order.
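The parallel layout can be illustrated with a small Python sketch. It assumes the standard Unicode block starts for the nine scripts; the function name analogous_position is made up for this illustration and is not part of any standard. The result is only the analogous coding position, which may be unassigned in the target script.

```python
# Block starts of the nine Indic scripts that share the parallel layout.
BLOCK_START = {
    "Devanagari": 0x0900, "Bengali": 0x0980, "Gurmukhi": 0x0A00,
    "Gujarati": 0x0A80, "Oriya": 0x0B00, "Tamil": 0x0B80,
    "Telugu": 0x0C00, "Kannada": 0x0C80, "Malayalam": 0x0D00,
}
BLOCK_SIZE = 0x80  # 128 encoding points per script


def analogous_position(ch, source, target):
    """Return the character at the same relative position in another block.

    Hypothetical helper: it only moves the code point by the block offset
    and does not check that the target position is actually assigned.
    """
    offset = ord(ch) - BLOCK_START[source]
    if not 0 <= offset < BLOCK_SIZE:
        raise ValueError("character is not in the source block")
    return chr(BLOCK_START[target] + offset)
```

For instance, U+0915 DEVANAGARI LETTER KA maps to U+0995 in the Bengali block and to U+0B95 in the Tamil block.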

In November 1991, at the time The Unicode Standard, Version 1.0, was published, the Bureau of Indian Standards published a new version of ISCII in Indian Standard (IS) 13194:1991. This new version partially modified the layout and repertoire of the ISCII-1988 standard. Because of these events, the Unicode Standard does not precisely follow the layout of the current version of ISCII. Nevertheless, the Unicode Standard remains a superset of the ISCII-1991 repertoire except for a number of new Vedic extension characters defined in IS 13194:1991 Annex G - Extended Character Set for Vedic. Modern, non-Vedic texts encoded with ISCII-1991 may be automatically converted to Unicode code values and back to their original encoding without loss of information.

Encoding Principles : The writing systems that employ Devanagari and other Indic scripts constitute a cross between syllabic writing systems and phonemic writing systems (alphabets). The effective unit of these writing systems is the orthographic syllable, consisting of a consonant and vowel (CV) core and, optionally, one or more preceding consonants, with a canonical structure of ((C)C)CV. The orthographic syllable need not correspond exactly with a phonological syllable, especially when a consonant cluster is involved, but the writing system is built on phonological principles and tends to correspond quite closely to pronunciation.

The orthographic syllable is built up of alphabetic pieces, the actual letters of the Devanagari script. These pieces consist of four distinct character types: consonant letters with inherent vowel /a/, pure consonants, independent vowels, and dependent vowel signs. In a text sequence, these characters are stored in logical (phonetic) order.
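As a rough illustration of the orthographic syllable, the sketch below splits Devanagari text with a regular expression. The character ranges and the name split_syllables are assumptions made for this example; it ignores ZWJ, ZWNJ, svaras and other marks, so it is a simplification rather than a complete segmentation algorithm.

```python
import re

CONSONANT = "[\u0915-\u0939]\u093C?"   # consonant letter with optional nukta
HAL = "\u094D"                         # vowel omission sign
VOWEL_SIGN = "[\u093E-\u094C]"         # dependent vowel signs
INDEPENDENT_VOWEL = "[\u0905-\u0914]"  # independent vowel letters
MODIFIER = "[\u0901-\u0903]?"          # candrabindu, anusvara or visarga

# Zero or more dead consonants, then a consonant that is live (optionally
# with a vowel sign) or dead; or an independent vowel standing on its own.
SYLLABLE = re.compile(
    f"(?:{CONSONANT}{HAL})*{CONSONANT}(?:{HAL}|{VOWEL_SIGN})?{MODIFIER}"
    f"|{INDEPENDENT_VOWEL}{MODIFIER}"
)


def split_syllables(text):
    """Return the orthographic syllables found in text, in logical order."""
    return SYLLABLE.findall(text)
```

Applied to the word संस्कृत, this yields three pieces: सं, स्कृ and त.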

Principles of the Script

Rendering Devanagari Characters : Devanagari characters, like characters from many other scripts, can combine or change shape depending on their context. A character's appearance is affected by its ordering with respect to other characters, the font used to render the character, and the application or system environment. These variables can cause the appearance of Devanagari characters to differ from their nominal glyphs (used in the code charts).

Additionally, a few Devanagari characters cause a change in the order of the displayed characters. This reordering is not commonly seen in non-Indic scripts and occurs independently of any bidirectional character reordering that might be required.

Consonant Letters : Each consonant letter represents a single consonantal sound but also has the peculiarity of having an inherent vowel, generally the short vowel /a/ in Devanagari and the other Indic scripts. Thus U+0915 DEVANAGARI LETTER KA represents not just /k/ but also /ka/. In the presence of a dependent vowel, however, the inherent vowel associated with a consonant letter is overridden by the dependent vowel.

Consonant letters may also be rendered as half-forms, which are presentation forms used to depict the initial consonant in consonant clusters. These half-forms do not have an inherent vowel. Their rendered forms in Devanagari often resemble the full consonant but are missing the vertical stem, which marks a syllabic core. (The stem glyph is graphically and historically related to the sign denoting the inherent /a/ vowel.)

Some Devanagari consonant letters have alternative presentation forms whose choice depends upon neighboring consonants. This variability is especially notable for U+0930 DEVANAGARI LETTER RA, which has numerous different forms, both as the initial element and as the final element of a consonant cluster. Only the nominal forms, rather than the contextual alternatives, are depicted in the code chart.

The traditional Sanskrit / Devanagari alphabetic encoding order for consonants follows articulatory phonetic principles, starting with pre-velar consonants and moving forward to bilabial consonants, followed by liquids, semi-vowels and then fricatives and sibilants (Ushma). ISCII and the Unicode Standard both observe this traditional order.

Independent Vowel Letters : The independent vowels in Devanagari are letters that stand on their own. The writing system treats independent vowels as orthographic CV syllables in which the consonant is null. The independent vowel letters are used to write syllables that start with a vowel.

Dependent Vowel Signs (Matras) : The dependent vowels serve as the common manner of writing noninherent vowels and are generally referred to as vowel signs, or as Matras in Sanskrit. The dependent vowels do not stand alone; rather, they are visibly depicted in combination with a base letterform. A single consonant, or a consonant cluster, may have a dependent vowel applied to it to indicate the vowel quality of the syllable, when it is different from the inherent vowel. Explicit appearance of a dependent vowel in a syllable overrides the inherent vowel of a single consonant letter.

The greatest variation among different Indic scripts is found in the way that the dependent vowels are applied to base letterforms. Devanagari has a collection of nonspacing dependent vowel signs that may appear above or below a consonant letter, as well as spacing dependent vowel signs that may occur to the right or to the left of a consonant letter or consonant cluster. Other Indic scripts generally have one or more of these forms, but what is a nonspacing mark in one script may be a spacing mark in another. Also, some of the Indic scripts have single dependent vowels that are indicated by two or more glyph components, and those glyph components may surround a consonant letter both to the left and right or may occur both above and below it.

The Devanagari script has only one character denoting a left-side dependent vowel sign: U+093F DEVANAGARI VOWEL SIGN I. Other Indic scripts either have no such vowel signs (Telugu and Kannada) or include as many as three of these signs (Bengali, Tamil, Malayalam).

A one-to-one correspondence exists between the independent vowels and the dependent vowel signs. Independent vowels are sometimes represented by a sequence consisting of the independent form of the vowel /a/ followed by a dependent vowel sign. Figure 9.1 illustrates this relationship (see the notation formally described in the "Rules for Rendering" later in this section).

Figure 9.1 : Dependent Versus Independent Vowels

/a/ + Dependent Vowel        Independent Vowel

An + Ivs                     In
अ + ◌ि                       इ

An + Uvs                     Un
अ + ◌ु                       उ

The combination of the independent form of the default vowel /a/ (in the Devanagari script, U+0905 DEVANAGARI LETTER A) with a dependent vowel sign may be viewed as an alternative spelling of the phonetic information normally represented by an isolated independent vowel form. However, these two representations should not be considered equivalent for the purposes of rendering. Higher-level text processes may choose to consider these alternative spellings equivalent in terms of information content, but such an equivalence is not stipulated by this standard.
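The correspondence between dependent vowel signs and independent vowels can be sketched in Python using the character names in the standard unicodedata module. The helper name independent_vowel is an assumption for this example; it relies on the Unicode naming pattern "VOWEL SIGN X" / "LETTER X" and fails (with KeyError) for a sign that has no independent letter of the same name.

```python
import unicodedata

SIGN_PREFIX = "DEVANAGARI VOWEL SIGN "
LETTER_PREFIX = "DEVANAGARI LETTER "


def independent_vowel(sign):
    """Return the independent vowel letter matching a dependent vowel sign."""
    name = unicodedata.name(sign)
    if not name.startswith(SIGN_PREFIX):
        raise ValueError("not a Devanagari dependent vowel sign")
    # Swap the name prefix, e.g. "VOWEL SIGN I" becomes "LETTER I".
    return unicodedata.lookup(LETTER_PREFIX + name[len(SIGN_PREFIX):])


# The alternative spelling (letter A + vowel sign I) is a different
# character sequence from LETTER I, and normalization does not merge them.
alternative = "\u0905\u093F"
same_after_nfc = unicodedata.normalize("NFC", alternative) == "\u0907"
```

Here same_after_nfc is False, which matches the statement that no equivalence between the two spellings is stipulated; a higher-level process that wants to treat them alike has to do so explicitly.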

Hal Sign : Devanagari and other Indic scripts employ a sign known as the hal sign (representing consonant), or vowel omission sign. A hal sign (for example, U+094D DEVANAGARI SIGN HAL) normally serves to cancel (or kill) the inherent vowel of the consonant to which it is applied. The hal functions as a combining character, with its shape varying from script to script. When a consonant has lost its inherent vowel by the application of hal, it is known as a dead consonant; in contrast, a live consonant is one that retains its inherent vowel or is written with an explicit dependent vowel sign. In the Unicode Standard, a dead consonant is defined as a sequence consisting of a consonant letter followed by a hal sign. The default rendering for a dead consonant is to position the hal as a combining mark bound to the consonant letterform.

For example, if Cn denotes the nominal form of consonant C and Cd denotes the dead consonant form, then a dead consonant is encoded as shown in Figure 9.2.

Figure 9.2 : Dead Consonants

TAn + HALn → TAd
त + ◌् → त्

Consonant Conjuncts : The Indic scripts are noted for a large number of consonant conjunct forms that serve as orthographic abbreviations (ligatures) of two or more adjacent letterforms. This abbreviation takes place only in the context of a consonant cluster. An orthographic consonant cluster is defined as a sequence of characters that represents one or more dead consonants (denoted Cd) followed by a normal, live consonant letter (denoted Cl) or an independent vowel, having an inherent vowel or a vowel sign representing another vowel.

Under normal circumstances, a consonant cluster is depicted with a conjunct glyph if such a glyph is available in the current font(s). In the absence of a conjunct glyph, the one or more dead consonants that form part of the cluster are depicted using half-form glyphs. In the absence of half-form glyphs, the dead consonants are depicted using the nominal consonant forms combined with visible hal signs (see Figure 9.3).

Figure 9.3 : Conjunct Formations

(1) GAd + DHAl → GAh + DHAn
    ग् + ध → ग्ध

(2) KAd + KAl → K.KAn
    क् + क → क्क

A number of types of conjunct formations appear in these examples: (1) a half-form of GA in its combination with the full form of DHA; (2) a vertical conjunct K.KA.

A well-designed Indic script font may contain hundreds of conjunct glyphs, but they are not encoded as Unicode characters because they are the result of ligation of distinct letters. Indic script rendering software must be able to map appropriate combinations of characters in context to the appropriate conjunct glyphs in fonts.
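The fallback order for depicting a consonant cluster (conjunct glyph, then half-forms, then nominal forms with visible hal) can be sketched as follows. The font is modelled as a plain dictionary of available glyph names; this structure, the glyph naming (full consonant names joined with dots, which differs slightly from the K.KA style used in the figures) and the function name choose_depiction are all assumptions for illustration, not a real font API.

```python
def choose_depiction(dead_consonants, live_consonant, font):
    """Pick glyph names for a cluster of dead consonants plus a live one.

    font is a hypothetical dict with two sets of names:
    font["conjuncts"] holds tuples of consonant names that have a
    conjunct glyph, font["half_forms"] holds consonants with a half-form.
    """
    cluster = tuple(dead_consonants) + (live_consonant,)
    # 1. Prefer a conjunct glyph for the whole cluster.
    if cluster in font["conjuncts"]:
        return [".".join(cluster) + "_n"]
    glyphs = []
    for name in dead_consonants:
        if name in font["half_forms"]:
            # 2. Otherwise use the half-form of each dead consonant.
            glyphs.append(name + "_h")
        else:
            # 3. Last resort: nominal form with a visible hal sign.
            glyphs.extend([name + "_n", "HAL_n"])
    glyphs.append(live_consonant + "_n")
    return glyphs


sample_font = {"conjuncts": {("KA", "KA")}, "half_forms": {"GA"}}
```

With this sample font, KA + KA resolves to a single conjunct glyph, GA + DHA falls back to a half-form followed by the nominal DHA, and TTA + TTHA falls back to nominal forms with a visible hal.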

Explicit Hal : Normally a hal sign serves to create dead consonants that are, in turn, combined with subsequent consonants to form conjuncts. This behavior usually results in a hal sign not being depicted visually. Occasionally, however, this default behavior is not desired when a dead consonant should be excluded from conjunct formation, in which case the hal sign is visibly rendered. To accomplish this goal, the Unicode Standard adopts the convention of placing the character U+200C ZERO WIDTH NON-JOINER immediately after the encoded dead consonant that is to be excluded from conjunct formation. In this case, the hal sign is always depicted as appropriate for the consonant to which it is attached.

KAd + ZWNJ + SSHAl → KAd + SSHAn
क् + ZWNJ + ष → क्‌ष

Explicit Half-Consonants : When a dead consonant participates in forming a conjunct, the dead consonant form is often absorbed into the conjunct form, such that it is no longer distinctly visible. In other contexts, however, the dead consonant may remain visible as a half-consonant form. In general, a half-consonant form is distinguished from the nominal consonant form by the loss of its inherent vowel stem, a vertical stem appearing to the right side of the consonant form. In other cases, the vertical stem remains but some part of its right-side geometry is missing.

In certain cases, it is desirable to prevent a dead consonant from assuming full conjunct formation yet still not appear with an explicit hal. In these cases, the half-form of the consonant is used. To explicitly encode a half-consonant form, the Unicode Standard adopts the convention of placing the character U+200D ZERO WIDTH JOINER immediately after the encoded dead consonant. The ZERO WIDTH JOINER denotes a nonvisible letter that presents linking or cursive joining behavior on either side (that is, to the previous or following letter). Therefore, in the present context, the ZERO WIDTH JOINER may be considered to present a context to which a preceding dead consonant may join so as to create the half-form of the consonant.

For example, if Ch denotes the half-form glyph of consonant C, then a half-consonant form is encoded as shown in Figure 9.5.

Figure 9.5 : Half-Consonants

KAd + ZWJ + SSHAl → KAh + SSHAn
क् + ZWJ + ष → क्‍ष

This encoding of half-consonant forms also applies in the absence of a base letterform. That is, this technique may also be used to encode independent half-forms, as shown in Figure 9.6.

Figure 9.6 : Independent Half-Forms

GAd + ZWJ → GAh
ग् + ZWJ → ग्‍

Consonant Forms : In summary, each consonant may be encoded such that it denotes a live consonant, a dead consonant that may be absorbed into a conjunct, or the half-form of a dead consonant (see Figure 9.7).

Figure 9.7 : Consonant Forms

क → क (KAl)
क + ◌् → क् (KAd)
क + ◌् + ZWJ → क्‍ (KAh)
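The encodings summarized above can be written out as character sequences. The following Python sketch uses only the code points named in the text (U+094D, U+200C, U+200D); the helper names dead, half_form and explicit_hal are invented for this illustration.

```python
HAL = "\u094D"   # DEVANAGARI SIGN HAL (vowel omission sign)
ZWNJ = "\u200C"  # ZERO WIDTH NON-JOINER
ZWJ = "\u200D"   # ZERO WIDTH JOINER


def dead(consonant):
    """Dead consonant: may be absorbed into a conjunct by the renderer."""
    return consonant + HAL


def half_form(consonant):
    """Dead consonant followed by ZWJ: requests the half-form."""
    return consonant + HAL + ZWJ


def explicit_hal(consonant):
    """Dead consonant followed by ZWNJ: requests a visible hal sign."""
    return consonant + HAL + ZWNJ


KA, SSHA = "\u0915", "\u0937"
conjunct_sequence = dead(KA) + SSHA           # default: conjunct if available
visible_hal_sequence = explicit_hal(KA) + SSHA
half_form_sequence = half_form(KA) + SSHA
```

All three sequences spell the same consonants; they differ only in the invisible joiner character, so the choice between conjunct, visible hal and half-form is left to the rendering software and font.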

Rendering

Rules for Rendering : The following provides more formal and detailed rules for minimal rendering of Devanagari as part of a plain text sequence. It describes the mapping between Unicode characters and the glyphs in a Devanagari font. It also describes the combining and ordering of those glyphs.

These rules provide minimal requirements for legibly rendering interchanged Devanagari text. As with any script, a more complex procedure can add rendering characteristics, depending on the font and application.

It is important to emphasize that in a font that is capable of rendering Devanagari, the set of glyphs is greater than the number of Devanagari Unicode characters.

Notation : In the next set of rules, the following notation applies:

Cn      Nominal glyph form of consonant C as it appears in the code charts.

Cl      A live consonant, depicted identically to Cn.

Cd      Glyph depicting the dead consonant form of consonant C.

Ch      Glyph depicting the half-consonant form of consonant C.

Ln      Nominal glyph form of a conjunct ligature consisting of two or more component consonants. A conjunct ligature composed of two consonants X and Y is also denoted X.Yn.

RAsub   A nonspacing combining mark glyph form of the U+0930 DEVANAGARI LETTER RA positioned below or attached to the lower part of a base glyph form.

Vvs     Glyph depicting the dependent vowel sign form of a vowel V.

HALn    The nominal glyph form nonspacing combining mark depicting U+094D DEVANAGARI SIGN HAL.

• A HAL character is not always depicted; when it is depicted, it adopts this nonspacing mark form.

Dead Consonant Rule : The following rule logically precedes the application of any other rule to form a dead consonant. Once formed, a dead consonant may be subject to other rules described next.

Rl When a consonant Cn precedes a HALn, it is considered to be a dead consonant Cd. A consonant Cn that does not precede HALn is considered to be a live consonant Cl.

TAn + HALn → TAd
त + ◌् → त्

Consonant RA Rules : The character U+0930 DEVANAGARI LETTER RA takes one of a number of visual forms depending on its context in a consonant cluster. By default, this letter is depicted with its nominal glyph form (as shown in the code charts). In two contexts, it is depicted using a nonspacing glyph form that combines with a base letterform.

R1 Except for the dead consonant RAd, when a dead consonant Cd precedes the live consonant RAl, then Cd is replaced with its nominal form Cn and RA is replaced by the subscript nonspacing mark RAsub, which is positioned so that it applies to Cn.

TTHAd + RAl → TTHAn + RAsub → Displayed Output
ठ् + र → ठ + ◌्र → ठ्र

R2 For certain consonants, the mark RAsub may graphically combine with the consonant to form a conjunct ligature form. These combinations, such as the one shown here, are further addressed by the ligature rules described shortly.

PHAd + RAl → PHAn + RAsub → Displayed Output
फ् + र → फ + ◌्र → फ्र

R3 If a dead consonant (other than RAd) precedes RAd, then substitution of RAsub for RA is performed as described above; however, the Hal that formed RAd remains so as to form a dead consonant conjunct form.

TAd + RAd → TAn + RAsub + HALn → T.RAd
त् + र् → त + ◌्र + ◌् → त्र्

A dead consonant conjunct form that contains an absorbed RAd may subsequently combine to form a multipart conjunct form.

T.RAd + YAl → T.R.YAn
त्र् + य → त्र्य

Modifier Mark Rules : In addition to vowel signs, three other types of combining marks may be applied to a component of an orthographic (visual) syllable or to the syllable as a whole: nukta, bindus, and svaras.

R4 The nukta sign, which modifies a consonant form, is placed immediately after the consonant in the memory representation and is attached to that consonant in rendering. If the consonant represents a dead consonant, then NUKTA should precede Hal in the memory representation.

KAn + NUKTAn + HALn → QAd
क + ◌़ + ◌् → क़्

R5 The other modifying marks, bindus and svaras, apply to the orthographic syllable as a whole and should follow (in the memory representation) all other characters that constitute the syllable. In particular, the bindus should follow any vowel signs, and the svaras should come last. The relative placement of these marks is horizontal rather than vertical; the horizontal rendering order may vary according to typographic concerns.

KAn + AAvs + CANDRABINDUn
क + ◌ा + ◌ँ → काँ
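The storage order required by rules R4 and R5 (nukta first, then hal or a vowel sign, then bindus, then svaras) can be checked with a short Python sketch. The ranking function and its name are assumptions for this example, and only U+0951 and U+0952 are treated as svaras here; a fuller check would need the complete list of marks.

```python
NUKTA = "\u093C"
HAL = "\u094D"
BINDUS = {"\u0901", "\u0902"}   # candrabindu, anusvara
SVARAS = {"\u0951", "\u0952"}   # udatta, anudatta (assumed subset)


def mark_rank(mark):
    """Rank a combining mark by where it belongs after the consonant."""
    if mark == NUKTA:
        return 0
    if mark == HAL or "\u093E" <= mark <= "\u094C":
        return 1                 # hal or a dependent vowel sign
    if mark in BINDUS:
        return 2
    if mark in SVARAS:
        return 3
    raise ValueError("unsupported mark: U+%04X" % ord(mark))


def marks_in_storage_order(marks):
    """True if the marks following a consonant obey rules R4 and R5."""
    ranks = [mark_rank(m) for m in marks]
    return ranks == sorted(ranks)
```

For the example above, the marks after KA are AA sign followed by candrabindu, which passes the check; swapping them, or putting hal before nukta, fails it.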

Ligature Rules : Subsequent to the application of the rules just described, a set of rules governing ligature formation applies. The precise application of these rules depends on the availability of glyphs in the current font(s) being used to display the text.


R6 If a dead consonant immediately precedes another dead consonant or a live consonant, then the first dead consonant may join the subsequent element to form a two-part conjunct ligature form.

JAd + NYAl → J.NYAn
ज् + ञ → ज्ञ

TTAd + TTHAl → TT.TTHAn
ट् + ठ → ट्ठ

R7 A conjunct ligature form can itself behave as a dead consonant and enter into further, more complex ligatures.

SAd + TAd + RAn → SAd + T.RAn → S.T.RAn
स् + त् + र → स् + त्र → स्त्र

A conjunct ligature form can also produce a half-form.

T.RAd + YAl → T.RAh + YAn
त्र् + य → त्र्य

R8 If a nominal consonant or conjunct ligature form precedes RAsub, then the consonant or ligature form may join with RAsub to form a multipart conjunct ligature.

KAn + RAsub → K.RAn
क + ◌्र → क्र

PHAn + RAsub → PH.RAn
फ + ◌्र → फ्र

R9 In some cases, other combining marks will also combine with a base consonant, either attaching at a nonstandard location or changing shape. In minimal rendering there are only two cases, RAl with Uvs or UUvs.

RAl + Uvs → RUn
र + ◌ु → रु

RAl + UUvs → RUUn
र + ◌ू → रू

Memory Representation and Rendering Order

The order for storage of plain text in Devanagari and all other Indic scripts generally follows phonetic order; that is, a CV syllable with a dependent vowel is always encoded as a consonant letter C followed by a vowel sign V in the memory representation. This order is employed by the ISCII standard and corresponds with both the phonetic and keying order of textual data.

Rendering Order

Character Order      Glyph Order
KAn + Ivs        →   Ivs + KAn
क + ◌ि           →   कि

Because Devanagari and other Indic scripts have some dependent vowels that must be depicted to the left side of their consonant letter, the software that renders the Indic scripts must be able to reorder elements in mapping from the logical (character) store to the presentational (glyph) rendering. For example, if Cn denotes the nominal form of consonant C, and Vvs denotes a left-side dependent vowel sign form of vowel V, then a reordering of glyphs with respect to encoded characters occurs as just shown.

R10 When the dependent vowel Ivs is used to override the inherent vowel of a syllable, it is always written to the extreme left of the orthographic syllable. If the orthographic syllable contains a consonant cluster, then this vowel is always depicted to the left of that cluster. For example:

TAd + RAl + Ivs → T.RAn + Ivs → Ivs + T.RAn
त् + र + ◌ि → त्र + ◌ि → त्रि
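The reordering of the left-side vowel sign can be sketched as a mapping from logical (character) order to visual order for a single orthographic syllable. The function name glyph_order is made up for this illustration, and the sketch works on characters rather than real font glyphs, so conjunct formation is left out.

```python
I_SIGN = "\u093F"  # DEVANAGARI VOWEL SIGN I, the only left-side vowel sign


def glyph_order(syllable):
    """Return the characters of one orthographic syllable in visual order.

    The vowel sign I is stored after the consonant (or cluster) but is
    drawn to its extreme left, so it is moved to the front. Any other
    syllable is returned unchanged.
    """
    if I_SIGN not in syllable:
        return syllable
    rest = syllable.replace(I_SIGN, "", 1)
    return I_SIGN + rest


logical = "\u0924\u094D\u0930\u093F"   # TA + HAL + RA + I sign
visual = glyph_order(logical)
```

In visual, the I sign comes first and the whole cluster TA + HAL + RA follows it, mirroring the last step of the example above.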

Sample Half-Forms : Dev.1 shows examples of half-consonant forms that are commonly used with the Devanagari script. These forms are glyphs, not characters. They may be encoded explicitly using ZERO WIDTH JOINER as shown; in normal conjunct formation, they may be used spontaneously to depict a dead consonant in combination with subsequent consonant forms.

Dev.1 : Sample Half Forms

क + ◌् + ZWJ → क्‍
ख + ◌् + ZWJ → ख्‍
ग + ◌् + ZWJ → ग्‍
च + ◌् + ZWJ → च्‍
ज + ◌् + ZWJ → ज्‍
झ + ◌् + ZWJ → झ्‍
ञ + ◌् + ZWJ → ञ्‍
ण + ◌् + ZWJ → ण्‍
त + ◌् + ZWJ → त्‍
थ + ◌् + ZWJ → थ्‍
ध + ◌् + ZWJ → ध्‍
न + ◌् + ZWJ → न्‍
प + ◌् + ZWJ → प्‍
फ + ◌् + ZWJ → फ्‍
ब + ◌् + ZWJ → ब्‍
भ + ◌् + ZWJ → भ्‍
म + ◌् + ZWJ → म्‍
य + ◌् + ZWJ → य्‍
ल + ◌् + ZWJ → ल्‍
व + ◌् + ZWJ → व्‍
श + ◌् + ZWJ → श्‍
ष + ◌् + ZWJ → ष्‍
स + ◌् + ZWJ → स्‍
क्ष + ◌् + ZWJ → क्ष्‍
ज्ञ + ◌् + ZWJ → ज्ञ्‍
श्र + ◌् + ZWJ → श्र्‍

Sample Ligatures : Dev.2 shows examples of conjunct ligature forms that are commonly used with the Devanagari script. These forms are glyphs, not characters. Not every writing system that employs this script uses all of these forms; in particular, many of these forms are used only in writing Sanskrit texts. Furthermore, individual fonts may provide fewer or more ligature forms than are depicted here.

Dev.2 : Sample Ligatures

क् + क → क्क
क् + त → क्त
क् + र → क्र
ङ् + क → ङ्क
ङ् + ख → ङ्ख
ङ् + ग → ङ्ग
ङ् + घ → ङ्घ
ञ् + ज → ञ्ज
ज् + ञ → ज्ञ
द् + घ → द्घ
द् + द → द्द
द् + ध → द्ध
द् + ब → द्ब
द् + भ → द्भ
द् + म → द्म
द् + य → द्य
द् + व → द्व
ट् + ट → ट्ट
ट् + ठ → ट्ठ
ठ् + ठ → ठ्ठ
ड् + ग → ड्ग
ड् + ड → ड्ड
ड् + ढ → ड्ढ
त् + त → त्त
त् + र → त्र
न् + न → न्न
फ् + र → फ्र
ह् + म → ह्म
ह् + य → ह्य
ह् + ल → ह्ल
ह् + व → ह्व
ह + ◌ृ → हृ
र + ◌ु → रु
र + ◌ू → रू
स् + त्र → स्त्र

Sample Half-Ligature Forms : In addition to half-form glyphs of individual consonants, half-forms are also used to depict conjunct ligature forms. A sample of such forms is shown in Dev.3. These forms are glyphs, not characters. They may be encoded explicitly using ZERO WIDTH JOINER as shown; in normal conjunct formation, they may be used spontaneously to depict a conjunct ligature in combination with subsequent consonant forms.

Dev.3 : Sample Half-Ligature Forms

त + ◌् + त + ◌् → त्त्
त + ◌् + र + ◌् → त्र्

Combining Marks : Devanagari and other Indic scripts have a number of combining marks that could be considered diacritic. One class of these marks, known as bindus, is represented by U+0901 DEVANAGARI SIGN CANDRABINDU and U+0902 DEVANAGARI SIGN ANUSVARA. The first mark indicates nasalization of a vowel and the second mark represents a nasal consonant occurring after a vowel or final nasal closure of a syllable.

U+093C DEVANAGARI SIGN NUKTA is a true diacritic. It is used to extend the basic set of consonant letters by modifying them (with a subscript dot in Devanagari) to create new letters.

U+0951..U+0957 are a set of combining marks used in transcription of Sanskrit texts.

Digits : Each Indic script has a distinct set of digits appropriate to that script. These digits may or may not be used in ordinary text in that script. The international form of Indian digits (Hindsa) has displaced the Indic script forms in modern usage in many of the scripts. Some Indic scripts, notably Tamil, lack a distinct digit for zero.

Punctuation and Symbols : U+0964 DEVANAGARI PURNA VIRAM is similar to a full stop. Corresponding forms occur in many other Indic scripts. U+0965 DEVANAGARI DEERGH VIRAM marks the end of a verse in traditional texts.

Many modern languages written in the Devanagari script intersperse punctuation derived from the Latin script. Thus U+002C COMMA and U+002E FULL STOP are freely used in writing Hindi, and the PURNA VIRAM (danda) is usually restricted to more traditional texts.

Encoding Structure : The Unicode Standard organizes the nine principal Indic scripts in blocks of 128 encoding points each. The first six columns in each script are isomorphic with the ISCII-1988 encoding, except that the last 11 positions (U+0955 .. U+095F in Devanagari, for example), which are unassigned or undefined in ISCII-1988, are used in the Unicode encoding.

The seventh column in each of these scripts, along with the last 11 positions in the sixth column, represents additional character assignments in the Unicode Standard that are matched across all nine scripts. For example, positions U+xx66 ... U+xx6F and U+xxE6 ... U+xxEF code the Indic script digits for each script.
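Because the digits sit at the same relative positions in every block, a digit character can be computed from the block start alone. The Python sketch below assumes the 128-point block starts of the nine scripts; the function name script_digit is invented for this example, and the standard unicodedata module is used only to confirm the numeric value.

```python
import unicodedata

DIGIT_ZERO_OFFSET = 0x66  # digit zero sits at offset 66 (hex) in each block


def script_digit(block_start, value):
    """Return the script-specific digit character for a value 0-9.

    block_start is the first code point of a 128-point Indic block,
    for example 0x0900 (Devanagari) or 0x0980 (Bengali).
    """
    if block_start % 0x80 != 0 or not 0 <= value <= 9:
        raise ValueError("expected a block start and a value from 0 to 9")
    return chr(block_start + DIGIT_ZERO_OFFSET + value)


devanagari_five = script_digit(0x0900, 5)   # U+096B
bengali_seven = script_digit(0x0980, 7)     # U+09ED
```

This produces U+0966..U+096F for Devanagari and U+09E6..U+09EF for Bengali, which are the U+xx66 and U+xxE6 ranges mentioned above.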

The eighth column for each script is reserved for script-specific additions that do not correspond from one Indic script to the next.

(The above revision is based on detailed discussions with National & State level Institutions/Directorates dealing with Devanagari based languages - Sanskrit, Hindi, Marathi, Nepali, Konkani & Sindhi)

Contact : [email protected]@mit.gov.in


3.2 Design Guides (Sanskrit, Hindi, Marathi, Konkani, Sindhi, Nepali)

Language Design Guides

The Department of Information Technology, Ministry of Communications & Information Technology, Government of India, is working on the Encoding Standards for Indian Languages. During discussions, a strong need was felt to prepare a Language Guide giving correct technical information about each language, to be used by the IT industry and other similar applications for localisation.

IBM India suggested an outline of the language information required for software development. The RCILTS refined these guidelines. They cover Character sets (Consonant, Vowel, Dependent Vowel Signs), Consonant Conjuncts, Sorting Order, Digits, Punctuation Symbols, cultural information, etc. This issue brings out draft Language Design Guides for the Devanagari based languages Hindi, Sanskrit, Marathi, Konkani, Sindhi & Nepali.

Inputs are solicited from the technical and linguistic community for further refinement of this information.

3.2.1 Sanskrit Design Guide

Introduction

Sanskrit is one of the most ancient languages of the world, which has molded the culture and thought-system not only of India but of many other countries in Asia such as Nepal, Sri Lanka, Myanmar, China, Japan, Korea, Thailand, Indonesia etc. Sanskrit is not a dead language like Greek and Latin. It is still spoken in some Indian families, though their number is not very large. Even now new literature is being created in Sanskrit, and radio and television programmes are regularly broadcast. The ideas contained in Sanskrit continue to influence the Indian mind. Its vocabulary has permeated all Indian languages, and thus it provides a continuity with the past of our country. With the current world-wide interest in things Indian like Yoga, meditation and Ayurveda, there is a renewed interest in learning Sanskrit in many countries outside India. What is more, it can be said without any fear of exaggeration or contradiction that the ideas contained in Sanskrit are going to mold the future thinking of the whole of mankind in such areas as linguistics, philosophy, psychology, religion, sociology, in short in everything that is related to the essence of the life of man. It is so because the deepest ideas expressed in Sanskrit are not available anywhere else in the world.

The literature of Sanskrit is very vast. It includes not only literary, philosophical and religious works, but also works on mathematics, medicine, astronomy, weaponry, animal husbandry, political and economic science, poetics, linguistics, and so on. In fact, Sanskrit literature covers all aspects of human life.

In this short article we shall pay attention only toSanskrit phonology and orthography.

No language has been so perfectly described as Sanskrit. The main credit for this achievement goes to Panini, who lived around 300 BC. Panini's book of Sanskrit grammar, called Ashtadhyayi, has been described by the eminent American linguist Bloomfield as "one of the greatest monuments of human intelligence". It was because of Panini's grammar that the Sanskrit language was so standardized that, without the modern means of transport and communication, the language could be understood and used by scholars throughout the length and breadth of India. Panini was preceded by a long chain of grammarians, and his tradition continued even afterwards, producing such great grammarians as Katyayana, Patanjali and Bhartrihari.

With his 4000 sutras, each of which is usually no more than two or three words, Panini was able to explain how almost all the words used in the Sanskrit of his time were formed. Panini's grammar can easily be said to be a precursor of today's generative grammar. Given his corpus of stems of verbs and nouns and the rules operating on them, even a computer programmed with Panini's sutras can generate practically all the words of classical Sanskrit. This is perhaps why it is said that Sanskrit is the most suitable language for computers.

Sanskrit Phonology and Orthography

1. Sanskrit vowels. There are ten main vowels in Sanskrit. Of these, the following three are short vowels:

a as u in cup
i as i in sit
u as u in put


The other seven main vowels are long. They take twice as much time in their pronunciation as the short vowels:

ā as a in father
ī as ee in sheep
ū as oo in pool
e as a in gate
ai as igh in high (with a short a)
o as o in hope
au as ou in out (with a short a)

Note : ā, ī and ū are long forms of a, i, and u respectively. The difference between the short and long vowels is important. The vowels ai and au are diphthongs, i.e., combinations of two vowel sounds. In pronouncing ai, the sound of a is immediately followed by that of i. Similarly, au is pronounced by making the sounds of a and u in quick succession.

2. Besides the above ten, there are three more vowels in Sanskrit whose original pronunciation is lost. They are now mostly pronounced as combinations of a consonant and a vowel as follows:

ṛ = r + i, ṝ = r + ī, ḷ = l + r + i

Note: i) Among these three, only ṛ is frequently used. An example of this vowel having become a combination of a consonant and a vowel is found in the word Sanskrit itself. The sound denoted by ri in this word must have been a vowel sound which is now lost to us. In some parts of India this vowel is pronounced as a combination of r + u. ii) Just as ṛ has a long counterpart in ṝ, the vowel ḷ also has a longer counterpart in theory, but it is not used in any actual word of Sanskrit. iii) Even though these sounds are not pronounced like vowels, they have to be treated as vowels in all grammatical contexts, just as they were when the Sanskrit grammar was codified more than 2500 years ago.

3. Sanskrit vowels are written in the Devanagari script as follows:

अ a, आ ā, इ i, ई ī, उ u, ऊ ū, ऋ ṛ, ॠ ṝ, ए e, ऐ ai, ओ o, औ au

Note: The vowel ḷ is written as ऌ but is not used independently in natural Sanskrit words.

4. Vowel marks. All vowels, except अ, have two written symbols to represent them. The vowels are written in the above form only when they are used independently, i.e., when they occur in the beginning of a word or follow another vowel. When any vowel, except a, follows a consonant, it is represented by its particular mark (mātrā) attached to that consonant. When no mark is attached to its letter, a consonant is pronounced with the sound of a after it. Below are shown the marks (mātrās) of different vowels as attached to the Devanagari consonant न (na).

Vowel   Mark   With न   Transliteration
अ       Nil    न        na
आ       ◌ा     ना       nā
इ       ◌ि     नि       ni
ई       ◌ी     नी       nī
उ       ◌ु     नु       nu
ऊ       ◌ू     नू       nū
ऋ       ◌ृ     नृ       nṛ
ॠ       ◌ॄ     नॄ       nṝ
ए       ◌े     ने       ne
ऐ       ◌ै     नै       nai
ओ       ◌ो     नो       no
औ       ◌ौ     नौ       nau

Note: i) The mark of इ is written before the consonant while that of the vowel ई is written after it. ii) The vowel ऌ has no special mark for it. When following a consonant, this vowel itself is written below or after that consonant to symbolize its sound.
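The way vowel marks attach to a consonant can be sketched in Python. The romanized keys and the function name syllable are assumptions made for this example; the mark for each vowel is the corresponding Unicode dependent vowel sign, and the inherent a needs no mark at all.

```python
# Dependent vowel marks keyed by a romanized vowel name (assumed spelling).
VOWEL_MARKS = {
    "a": "",          # inherent vowel: no mark is written
    "ā": "\u093E", "i": "\u093F", "ī": "\u0940",
    "u": "\u0941", "ū": "\u0942", "ṛ": "\u0943", "ṝ": "\u0944",
    "e": "\u0947", "ai": "\u0948", "o": "\u094B", "au": "\u094C",
}


def syllable(consonant, vowel):
    """Attach the mark of the given vowel to a consonant letter.

    The mark is always stored after the consonant, even for the vowel i,
    whose mark is displayed before the consonant.
    """
    return consonant + VOWEL_MARKS[vowel]


NA = "\u0928"
na_series = [syllable(NA, v) for v in VOWEL_MARKS]
```

The list na_series then holds the twelve forms न, ना, नि, नी and so on, in the order of the dictionary.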

5. Sanskrit consonants. Sanskrit consonants are divided very systematically into several groups according to their place and manner of articulation. They are shown below according to their group-wise division:

Consonants

A. Stops

            Simple   Aspirated   Voiced   Asp.+Vc.   Nasals
Gutturals   क ka     ख kha       ग ga     घ gha      ङ ṅa
Palatals    च ca     छ cha       ज ja     झ jha      ञ ña
Cerebrals   ट ṭa     ठ ṭha       ड ḍa     ढ ḍha      ण ṇa
Dentals     त ta     थ tha       द da     ध dha      न na
Labials     प pa     फ pha       ब ba     भ bha      म ma

B. Semi-vowels     य ya   र ra   ल la   व va

C. Sibilants       श śa   ष ṣa   स sa

D. Aspirates       ह ha   ः ḥ

E. Special nasal   ं ṃ

6. Stop consonants. A look at the table above shows us the systematic order of the arrangement of the consonants. The first twenty-five consonants are all ‘stop’ consonants, as in their pronunciation the flow of air is momentarily stopped at different places of articulation. The consonants of the first row (gutturals) are pronounced in the throat, the breath being stopped by raising the back part of the tongue. Those of the next row (palatals) are produced at the back part of the palate while the breath is stopped by the middle part of the tongue. The next row (of cerebrals) is articulated at the centre of the roof of the mouth, the breath being stopped by the front upper part of the tongue. The consonants of the fourth row (dentals) are pronounced with the tip of the tongue touching the upper teeth. The consonants of the fifth row are produced by closing both the lips. Thus there is, in general, a progression of the consonants from the backmost part of the speech apparatus to the frontmost part. (The palatals are now pronounced between the cerebrals and dentals.)

These twenty-five consonants are again divided according to whether i) they are aspirated, that is, an extra puff of air is used in articulating them, ii) they are voiced, that is, the vocal cords vibrate in their articulation, and iii) part of the air is released through the nose while producing them.
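The row-and-column arrangement described above can be expressed as a simple lookup. The sketch below assumes Unicode Devanagari input; `STOPS`, `PLACES`, `MANNERS` and `classify` are illustrative names, and the labels follow the column headings of the table above.

```python
# The 25 stops in traditional order: 5 rows (place) x 5 columns (manner).
STOPS = "कखगघङचछजझञटठडढणतथदधनपफबभम"
PLACES = ["guttural", "palatal", "cerebral", "dental", "labial"]
MANNERS = ["simple", "aspirated", "voiced", "aspirated voiced", "nasal"]


def classify(consonant: str) -> tuple[str, str]:
    """Return (place, manner) for one of the 25 stop consonants."""
    index = STOPS.find(consonant)
    if len(consonant) != 1 or index < 0:
        raise ValueError(f"not a stop consonant: {consonant!r}")
    row, column = divmod(index, 5)
    return PLACES[row], MANNERS[column]


if __name__ == "__main__":
    for ch in "कधम":
        print(ch, classify(ch))
```

An explicit string is used instead of code point arithmetic because the Unicode block does not keep the 25 stops strictly contiguous.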

7. Simple consonants. The five ‘simple stops’ in the first column are pronounced at their respective places of articulation by momentarily stopping, with the tongue or the lips, the outgoing air and then releasing it in a natural way without any special effort. Starting with क (ka) in the throat they gradually come forward to end in प (pa) articulated with the lips. These consonants are:

Cons.   as in         Notes
क       k in skull    with no aspiration
च       ch in chair   the front part of the tongue pressing tight against the palate
ट       t in fit      with no aspiration
त       (none)        like French t, the tip of the tongue pressing the upper teeth, no aspiration
प       p in sip      with no aspiration

Note: For the sake of convenience, we refer to a consonant as followed by the vowel a. But when we discuss a consonant as such, it should be taken to be alone by itself, without any vowel attached to it.

8. Aspirated stops. The consonants in the second column of the ‘stops’ are pronounced by positioning the tongue or lips to pronounce the corresponding simple consonants of the first column. Then some pressure of air is built up and released suddenly. It is because of this puff of air that the consonants of this column are called aspirates. While pronouncing them, one should be able to feel a clear puff of air by placing the back of one’s palm in front of one’s mouth. The difference between the aspirate and the unaspirate consonants is crucial in Sanskrit. The five aspirate stops are the following:

ख kha   छ cha   ठ ṭha   थ tha   फ pha

9. Voiced stops. The consonants of the third column of the ‘stops’ are voiced counterparts of the consonants of the first column. In their pronunciation the vocal cords vibrate to produce resonance. These consonants are:

Cons.   as in        Notes
ग       g in gum
ज       j in jug     front of tongue pressing against palate
ड       d in duck
द       th in thus   tip of tongue pressing against upper teeth
ब       b in but

10. Aspirated voiced consonants. The consonants in the fourth column of the ‘stops’ are pronounced like those of column 3, but a strong puff of air is added in their articulation. These consonants are both aspirate and voiced. They are:

घ gha   झ jha   ढ ḍha   ध dha   भ bha

11. The nasal consonants. The five consonants in the last column are all nasal consonants. They are articulated from the same place as the respective consonants of the first column, but in their pronunciation part of the air is let out through the nose. These consonants are:

ङ ṅa   ञ ña   ण ṇa   न na   म ma

12. Semi-vowels and sibilants. Semi-vowels are consonants that have qualities of both vowels and consonants. In the articulation of sibilants some friction is created by the position of the tongue and the air comes out of the mouth with a hissing sound. The pronunciation of the Sanskrit semi-vowels and sibilants is generally like that in English, but the sound of र् (r) has a strong trill in it. In the production of this sound the tip of the tongue continuously vibrates against the front part of the palate. ष् (ṣ) was formerly pronounced as a cerebral consonant, but now it is pronounced almost like the palatal श् (ś).

Semi-vowels   as in          Sibilants   as in
य             y in yes       श           sh in shut
र             r in run       ष           as above
ल             l in love      स           s in sun
व             v in vulture

Note: व is usually pronounced with the lower lip touching the upper teeth. But when व occurs as the second consonant in a consonant cluster, it is pronounced like w in water. Both the lips are then rounded and the tongue remains neutral.

13. The aspirate consonant ह (ha) and visarga (ः). The letter ह (ha) denotes the aspirate and voiced consonant heard at the beginning of the English words hut and happy. The symbol ः (called visarga) denotes an unvoiced sound resembling that of h and is frequently used at the end of words in Sanskrit.

14. The marks of the vowels उ and ऊ combine with the letter र् in the following manner:

र् + उ = रु, र् + ऊ = रू

15. The mark of the vowel ऋ combines with the consonant ह् in the following manner:

हृत hṛta carried away   सहृदय sahṛdaya kind

16. When no vowel mark is attached to a consonant, it is pronounced with the vowel a following it. When the sound of the consonant itself is to be shown, without any vowel following it, the mark ् (called halanta) is placed below it. In the words below, the last consonant, marked with halanta, is pronounced by itself without any vowel.

अहम् ahaṃ I   तत् tat that (n. sg.)

Consonant Clusters

17. When a consonant is followed by another consonant without any intervening vowel, we get a consonant cluster. Different methods are used to represent such clusters.

If the first letter in the cluster has a vertical line on the right side, that line is removed and the two consonants are joined together. Thus,

ज् + व = ज्व, स् + त = स्त, न् + य = न्य

ज्वाला jvālā a flame   शून्यम् śūnyam void, zero
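In Unicode text a cluster is not stored as a special joined letter. As a hedged illustration (the function name `cluster` is hypothetical), each consonant except the last is followed by the halanta code point U+094D, and the joined shape is chosen by the font when the text is displayed:

```python
VIRAMA = "\u094D"  # halanta: suppresses the inherent vowel a


def cluster(*consonants: str) -> str:
    """Join bare consonant letters into one cluster in Unicode order."""
    return VIRAMA.join(consonants)


if __name__ == "__main__":
    print(cluster("ज", "व"))   # j + v
    print(cluster("स", "त"))   # s + t
    print(cluster("न", "य"))   # n + y
```

The same stored sequence covers all the written variants described in this section; whether a half-form, a stacked form or a visible halanta appears depends on the font.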

18. If the consonant र् (r) is followed by another consonant, then र् is represented by a hook-shaped mark placed above the following consonant. Thus,
र् + व = र्व, र् + य = र्य
सर्वम् sarvaṃ all (n. sg.)   आर्यः āryaḥ a nobleman

19. If र् (r) follows a consonant in a cluster and is itself followed by a vowel, it is represented in different ways depending upon the shape of the preceding consonant. If the preceding consonant has a vertical line in it, then र् is represented by a slanting stroke placed at the lower part of the vertical line. Thus,
ग् + र = ग्र, प् + री = प्री
ग्रामः grāmaḥ a village   घ्राणः ghrāṇaḥ sense of smell

20. The same stroke is also attached to the lower part of the letters द and ह to show their cluster with र:
द्रवः dravaḥ a liquid   ह्रासः hrāsaḥ decline

21. The cluster of ट् and र is written as ट्र:
उष्ट्रः uṣṭraḥ a camel   राष्ट्रम् rāṣṭram kingdom, nation

22. The cluster of श् and र is written as श्र:
श्रमः śramaḥ labour   श्रोत्रम् śrotram sense of hearing, ear

23. The letters क and फ drop their right-side ‘hook’ if they are the first letter in a cluster. Thus,
क् + य = क्य, क् + व = क्व, फ् + ल = फ्ल
वाक्यम् vākyam a sentence   पक्व pakva cooked, ripe

24. The cluster of क् and त (क्त) is also written in a special combined form:
भक्तः bhaktaḥ a devotee   शक्तिः śaktiḥ energy

25. The cluster of त् with त is usually written as त्त:
सत्ता sattā existence   महत्त्वम् mahattvam greatness, importance


26. The cluster of द् and ह् with certain other consonants has traditionally been written in a special combined form of the two letters, but now there is a tendency to write such clusters with a halanta mark under द् and ह्. Some examples are given below (each may be written in either form):

द् + य = द्य, द् + व = द्व, ह् + य = ह्य, ह् + म = ह्म
विद्या vidyā knowledge, learning
बुद्धः buddhaḥ the Buddha
असह्य asahya (adj.) unendurable

27. The following clusters are written with special letters:
क् + ष = क्ष (kṣa), त् + र = त्र (tra), ज् + ञ = ज्ञ (jña).

अत्र atra here   रक्षा rakṣā defence
चित्रम् citram a picture   ज्ञानम् jñānam knowledge

Note: 1. क्ष, त्र and ज्ञ are often written at the end of the Devanagari alphabet as independent letters.

2. The letter ज्ञ is now mostly pronounced as gya. In some areas it is pronounced as dna.

28. We saw above that each group of ‘stop’ consonants has a nasal consonant in it. To represent a nasal sound before a non-nasal stop consonant within the same word, the nasal consonant of that particular group is written.

अङ्कः aṅkaḥ a mark, a lap   चञ्चल cañcala restless
अन्तः antaḥ end   कण्ठः kaṇṭhaḥ throat
कम्पः kampaḥ tremor   आरम्भः ārambhaḥ beginning

29. The nasal sounds before semi-vowels (य, र, ल, व), sibilants (श, ष, स) and the aspirate (ह) are of an indistinct nature. They are all represented by a dot placed above the letter preceding them:

संयमः saṃyamaḥ control   संशयः saṃśayaḥ a doubt
अंशः aṃśaḥ a part   संसारः saṃsāraḥ the world
संवादः saṃvādaḥ dialogue   संहारः saṃhāraḥ destruction

30. Because it is easier to print, there is now a tendency to represent the nasal sounds within a word by the dot. Thus the words given above in 28 may occasionally be seen printed as below:

अंकः, अंतः, कंपः, चंचल, कंठः, आरंभः

Note: Even though the purists insist that the nasal sounds always be written in their original form, the use of the dot is increasing and is not incorrect. As the nature of the nasal sound is governed by the following consonant, this representation of different nasal consonants by the dot has absolutely no effect on their pronunciation. In fact, the Government of India has adopted a policy that such Sanskrit words, when used in Hindi, always be written with the dot.
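Because the nasal is fully determined by the stop that follows it, the substitution of the dot can be done mechanically. The sketch below is only an illustration under the assumption of Unicode Devanagari input; `to_anusvara` and the helper table `POSITION` are hypothetical names, not part of any standard library.

```python
STOPS = "कखगघङचछजझञटठडढणतथदधनपफबभम"  # 5 groups x 5 columns
VIRAMA = "\u094D"     # halanta
ANUSVARA = "\u0902"   # the dot written above the preceding letter

# Map each stop to (group, column); column 4 is the nasal of the group.
POSITION = {ch: divmod(i, 5) for i, ch in enumerate(STOPS)}


def to_anusvara(word: str) -> str:
    """Replace 'group nasal + halanta' by the dot before a non-nasal
    stop of the same group; everything else is copied unchanged."""
    out = []
    i = 0
    while i < len(word):
        nasal = POSITION.get(word[i])
        following = POSITION.get(word[i + 2]) if i + 2 < len(word) else None
        if (nasal is not None and nasal[1] == 4
                and following is not None and following[1] != 4
                and word[i + 1] == VIRAMA
                and nasal[0] == following[0]):
            out.append(ANUSVARA)
            i += 2          # skip the nasal and its halanta
        else:
            out.append(word[i])
            i += 1
    return "".join(out)


if __name__ == "__main__":
    for w in ["अङ्कः", "अन्तः", "कम्पः", "चञ्चल", "कण्ठः", "आरम्भः"]:
        print(w, "->", to_anusvara(w))
```

The reverse conversion is equally mechanical for stops, since the group of the following consonant identifies the nasal to restore.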

31. Formerly, in a consonant cluster the two consonants were often written one upon the other. Many Sanskrit books are full of instances of such writing. But now there is a definite tendency towards using simplified forms of clusters with the halanta mark. When consonants are written one above the other, the consonant written above is pronounced first:

अङ्गः aṅgaḥ part of the body   पट्टः paṭṭaḥ a slab

32. For the placement of the mark of the vowel इ, the consonant cluster is regarded as one letter and the mark is placed before the cluster even though the vowel इ is pronounced after the cluster:

शान्तिः śāntiḥ peace   मुक्तिः muktiḥ liberation

Note: When a consonant cluster is formed by using the halanta sign, the mark of the vowel इ is placed before the second letter, as in बुद्धि and पट्टिका.

33. The sign for a preceding र् is written after the vowel marks above the horizontal line even though it is pronounced before the consonant and the accompanying vowel.

सर्वे (all) = स् + अ + र् + व् + ए
ऊर्मिः (a wave) = ऊ + र् + म् + इः

34. The symbol ऽ (avagraha) is used to show the elision of the vowel अ when two words are joined in sound-blending (sandhi):

ते + अपि = तेऽपि (they also)
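The elision can be sketched in a few lines of code. This is a simplified illustration only: the condition that the first word ends in e or o is the usual sandhi rule and is an assumption added here, not something stated in the text above, and `join_with_avagraha` is a hypothetical name.

```python
AVAGRAHA = "\u093D"                 # elision mark
INITIAL_A = "\u0905"                # independent vowel a
E_O_SIGNS = ("\u0947", "\u094B")    # vowel marks of e and o


def join_with_avagraha(first: str, second: str) -> str:
    """Join two words, replacing an initial a of the second word by the
    avagraha when the first word ends in e or o (assumed rule)."""
    if first.endswith(E_O_SIGNS) and second.startswith(INITIAL_A):
        return first + AVAGRAHA + second[1:]
    return first + " " + second     # otherwise leave the words apart


if __name__ == "__main__":
    print(join_with_avagraha("ते", "अपि"))
```

Real sandhi involves many more rules; the sketch covers only the single case that the avagraha marks.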

35. The sacred symbol ओम् is often written as ॐ.

36. For the mark of the full stop at the end of a sentence a vertical line (।) is used. This is the only traditional punctuation mark in Sanskrit. Now international symbols such as the question mark, quotation marks, hyphen etc. are also commonly used.

37. The Devanagari numerals are written as follows:


Arabic       1, 2, 3, 4, 5, 6, 7, 8, 9, 0
Devanagari   १, २, ३, ४, ५, ६, ७, ८, ९, ०
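Since the two sets of numerals correspond digit for digit, converting between them is a plain character substitution. A minimal sketch, assuming Unicode text (the table names below are illustrative):

```python
DEVANAGARI_DIGITS = "\u0966\u0967\u0968\u0969\u096A\u096B\u096C\u096D\u096E\u096F"
ARABIC_DIGITS = "0123456789"

TO_DEVANAGARI = str.maketrans(ARABIC_DIGITS, DEVANAGARI_DIGITS)
TO_ARABIC = str.maketrans(DEVANAGARI_DIGITS, ARABIC_DIGITS)


def to_devanagari(text: str) -> str:
    """Replace every Arabic digit in text by its Devanagari counterpart."""
    return text.translate(TO_DEVANAGARI)


def to_arabic(text: str) -> str:
    """Replace every Devanagari digit in text by its Arabic counterpart."""
    return text.translate(TO_ARABIC)


if __name__ == "__main__":
    print(to_devanagari("2002"))
    print(to_arabic(to_devanagari("2002")))
```

Other characters pass through unchanged, so the functions can be applied to whole sentences as well as to bare numbers.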

To give some idea of written Sanskrit, some words and sentences are given below:

a. Some words related to space.
अत्र here   यत्र where (relative)
तत्र there   यत्र-तत्र here and there
कुत्र where?   निकटे nearby
सर्वत्र everywhere   दूरे far off
अधः down   उपरि above, on

b. Some words related to time.
अधुना now   कदापि न never
तदा then   सदा always
इदानीम् now   सर्वदा always
कदा when?   अद्य today
यदा when (relative)   प्रातः in the morning
यदा-कदा now and then   सायम् in the evening

c. Days of the week.
रविवारः Sunday   गुरुवारः (बृहस्पतिवारः) Thursday
सोमवारः Monday   शुक्रवारः Friday
मंगलवारः Tuesday   शनिवारः Saturday
बुधवारः Wednesday   सप्ताहः a week

d. Divisions of time.
कालः/समयः time   वादनकाले at o’clock
सायंकालः evening   मासः month
प्रातःकालः morning   पक्षः a fortnight
होरा (f.) an hour   वर्षः, वर्षम् year
ह्यः yesterday   रात्रि (f.) night
श्वः tomorrow   वर्षाकालः rainy season
परश्वः day after tomorrow   ग्रीष्मः summer
मध्याह्नः noon   वसन्तः spring
अपराह्णः afternoon   हेमन्तः, शिशिरः winter
मध्यरात्रिः midnight   शरद् (f.) autumn

e. Directions.
पूर्व east   पर other
पश्चिम west   अवर hither
उत्तर north   उपरि above
दक्षिण south   अधर downward

f. Cardinal and ordinal numbers.
एक / प्रथम   one / first
द्वि / द्वितीय   two / second
त्रि / तृतीय   three / third
चतुर् / चतुर्थ   four / fourth
पञ्चन् / पञ्चम   five / fifth
षष् / षष्ठ   six / sixth
सप्तन् / सप्तम   seven / seventh
अष्टन् / अष्टम   eight / eighth
नवन् / नवम   nine / ninth
दशन् / दशम   ten / tenth

Below are given some sayings (lwfDr) that havebecome part of the Indian thought and are oftenused even by people who do not know muchSanskrit.lR;eso t;rsA The Truth always prevails.

ekrk Hkwfe% iq=kks·ga i`fFkO;k%A The earth is my mother, Iam her son.

tuuh tUeHkwfe'p LoxkZnfi Mother and motherland are

xjh;flA even greater than heaven.mnkjpfjrkuka rq olq/kSo The whole earth is a familydqVqEcde~A for the large-hearted.

Lons'ks iwT;rs jktk fo}ku~ The king is worshipped inloZ=k iwT;rsA his own country, a scholar

is worshipped everywhere.deZ.;sokf/kdkjLrs ek Qys"kq Your right is only the actiondnkpuA not in the fruit thereof.

;ksx% deZlq dkS'kye~A Yoga is skill in actions.vfgalk ijeks /keZ%A Ahimsa is the highest

religion..ekua fg egrka /kue~A Honour is the wealth of the

great.laxPN/;a laon/oe~A Walk together, speak in

one voice.'krgLr lekfdj Collect with hundredlglzgLr lafdjA hands, distribute with

thousand hands._rs Kkuku~ u eqfDr%A There is no liberation

without knowledge.


rr~ Roe~ vflA Thou art that.v;ekRek czg~eA This soul of man is

Brahman.lo± [kyq bna czg~eA All this in the universe is

only Brahman.,da ln~ foizk% cgq/kk There is only one truth,

onfUrA scholars describe in manyways.

losZ HkoUrq lqf[ku%A Let all people be happy.

;Fkk jktk rFkk iztkA As is the king so is hispublic.

;Fkk fi.Ms rFkk As in the body so in the

czg~ek.MsA universe.eq.Ms eq.Ms efrfHkZUukA Opinions differ from

person to person.cqn~f/k;ZL; cya rL;A Intelligence is strength.lUrks"k% ijea lq[ke~A Contentment is the

highest happiness.vfr loZ=k otZ;sr~A One should give up

excess everywhere.;kn`'kh Hkkouk ;L; As is the inner feeling sofln~f/kHkZofr rkn`'khA is the result.Jn~/kkoku~ yHkrs Kkue~A One with faith gains

knowledge.vFkZL; iq#"kks nkl%A Man is slave to money.vkjksX;a ijeks ykHk%A Health is the highest gain.

'kqHkkLrs lUrq iUFkku%A May your path beauspicious.

opus dk nfjnzrkA Why be miserly in (kind)words!

eu ,o euq";k.kka dkj.ka| Mind is the cause of man’scU/keks{k;ks%A bondage and liberation.

yksHk% ikiL; dkj.ke~A Greed is the cause of sin.fouk'kdkys foijhrcqn~f/k%A At the time of one’s

downfall, his intelligencegets perverted.

This article is based on the initial part of Sandhaan’sCorrespondence Course in Sanskrit.

(Courtesy : Prof. Anil Vidyalankar,Former Professor at NCERT,

Presently Director Sandhaan, New DelhiPh. 91-11-6863126,

e-mail :[email protected])

Typical Colloquial Sentences in Sanskrit

GREETING

• Hello
अयि, भोः, नमो नमः, हला (स्त्रीलिंग)
ayi, bhoḥ, namo namaḥ, halā (strīliṅga)

• Good Morning
सुप्रभातम्।
suprabhātam

• Good Afternoon
सुमध्याह्नम्।
sumadhyāhnam

• Good Night
शुभरात्रिः।
śubharātriḥ

• Good Bye
ज्योक्, पुनर्मिलामः।
jyok, punarmilāmaḥ

• Thanks
धन्यवादाः।
dhanyavādāḥ

• How are you?
भवान् (पुंल्लिंग) / भवति (स्त्रीलिंग) कथम् अस्ति?
bhavān (puṃlliṅga) / bhavati (strīliṅga) katham asti?

• I am fine, thank you
अहम् कुशलः अस्मि, धन्यवादाः (पुंल्लिंग) / अहम् कुशला अस्मि, धन्यवादाः (स्त्रीलिंग)।
aham kuśalaḥ asmi, dhanyavādāḥ (puṃlliṅga) / aham kuśalā asmi, dhanyavādāḥ (strīliṅga)

• Sorry
क्षम्यताम्।
kṣamyatām

WEATHER

• It is cold
शैत्यं वर्तते।
śaityaṃ vartate

• It is cool outside
बहिः शीतं अस्ति।
bahiḥ śītaṃ asti

• It is hot
उष्णं अस्ति।
uṣṇaṃ asti

• It is raining
वर्षति।
varṣati

GENERAL

w What is your name?¦É´ÉiÉ:({ÉÖÆϱ±ÉMÉ) / ¦É´ÉiªÉÉ:(ºjÉÒ˱ÉMÉ) xÉÉ¨É ÊEò¨É ?¦É´ÉiÉ:({ÉÖÆϱ±ÉMÉ) /¦É´ÉiªÉÉ:(ºjÉÒ˱ÉMÉ)+ʦÉvÉÉxɨÉ ÊEò¨ÉÂ?bhavata:(puÆlli´ga)/ bhavaty¡:(str¢li´ga) n¡ma kimbhavata::(puÆlli´ga) /bhavaty¡:(str¢li´ga)abhidh¡nam kim?

w My name is Ranjan¨É¨É xÉÉ¨É ®ú\VÉxÉ: +κiÉ / ¨É¨É +ʦÉvÉÉxɨÉ ®ú\VÉxÉ: +κiÉ*mama n¡ma raµjana: asti / mama abhidh¡namraµjana: asti

w Where do you live?¦É´ÉÉxÉÂ({ÉÖÆϱ±ÉMÉ)/¦É´ÉÊiÉ (ºjÉÒ˱ÉMÉ) EÖòjÉ ´ÉºÉʺÉ?bhav¡n(puÆlli´ga)/bhavati (str¢li´ga) kutra vasasi?

w I live near Ghantaghar+½þ¨É PÉh]õÉPÉ®Æú ÊxÉEò¹ÉÉ/ºÉ¨ÉªÉÉ ´ÉºÉÉʨÉ*aham gha¸¶¡gharaÆ nikaÀ¡/samay¡ vas¡mi

w How old are you?¦É´ÉiÉ:({ÉÖÆϱ±ÉMÉ) / ¦É´ÉiªÉÉ:(ºjÉÒ˱ÉMÉ) EòÊiÉ+ɪÉÖ: ´ÉiÉÇiÉä?bhavata:(puÆlli´ga) / bhavaty¡:(str¢li´ga) kati ¡yu:vartat®?

w That building is talliÉnÂù ¦É´ÉxɨÉ =zÉiÉÆ +κiÉ *tad bhavanam unnataÆ asti

w She is beautifulºÉÉ ºÉÖxnù®úÉ +κiÉ *s¡ sundar¡ asti

w I like Bengali sweets¨ÉÁ¨É ´ÉRÂóMÉʨɹ]õÉzÉÆ ®úÉäSÉiÉä*mahyam va´gamiÀ¶¡nnaÆ r°cat®

w I love Birds¨ÉÁ¨É JÉMÉÉ: ®úÉäSÉxiÉä*mahyam khag¡: r°cant®

w Where is Railway station?vÉÚ É¶ÉEòÊ]õEòÉÊxɱɪÉÆ / ±ÉÉè½þ{ÉnùÊxɱɪÉÆ EÖòjÉ ´ÉiÉÇiÉä?dh£ma¿aka¶ik¡nilayaÆ / lauhapadanilayaÆ kutravartat®?

w How far is the Bus terminal from here?EòÊiÉ nÚù®Æú ¤ÉºÉªÉÉxÉ-+ÎxiɨɺlɱÉÆ <iÉ:?kati d£raÆ basay¡na-antimasthalaÆ ita:?

w How long will it take to reach the Airport?EòÊiÉ ´Éä±ÉÉ ¦ÉʴɹªÉÊiÉ ´ÉɪÉÖªÉÉxÉ-=bÂ÷b÷ªÉxÉEäòxpÆù MÉxiÉÖÆ <iÉ:?kati v®l¡ bhaviÀyati v¡yuy¡na-u··ayanak®ndraÆgantuÆ ita:?

w Is Mr. Raghunath there?ÊEò¨É ¸ÉÒ¨ÉÉxÉ ®úPÉÖxÉÉlÉ: iÉjÉ ºÉÎxiÉ?kim ¿r¢m¡n raghun¡tha: tatra santi?

w Please tell him to call back as soon as he is free.EÞò{ɪÉÉ ªÉnùÉ ºÉ: ÊxÉ´ªÉǺiÉÉä ¦ÉʴɹªÉÊiÉ iÉnùÉ iÉÆ ¨ÉªÉÉ ºÉ½þºÉƦÉɹÉʪÉiÉÖ É EòlɪÉk¤pay¡ yad¡ sa: nirvyast° bhaviÀyati tad¡ taÆmay¡ saha sambh¡Àayitum kathaya

w How much will it cost?+ºªÉ EòÊiÉ ¨ÉÚ±ªÉÆ ´ÉiÉÇiÉä?asya kati m£lyaÆ vartat®?

w Excuse meIɨÉÉÆ Eò®úÉäiÉÖ*kÀam¡Æ kar°tu

w From which Platform can I get the train forChandigarhEòκ¨ÉxÉ MɨÉxÉÉMɨÉxɺlɱÉä +½þ¨É SÉhÉb÷ÒMÉfÆø MÉxiÉÖÆ ®äú±ÉªÉÉxÉÆ |ÉÉ{iÉÖƶÉCxÉÉäʨÉ?kasmin gaman¡gamanasthal® ahamca¸a·¢ga·haÆ gantuÆ r®lay¡naÆ pr¡ptuÆ¿akn°mi?

w Does this train stop at AligarhÊEò¨ÉäiÉiÉ ®äú±ÉªÉÉxɨÉ +±ÉÒMÉfäø +´ÉºlÉÉ{ɪÉÊiÉ?kim®tat r®lay¡nam al¢ga·h® avasth¡payati?

w How many kids do you have?EòÊiÉ {ÉÖjÉÉ:/ ºÉÖiÉÉ: ¸ÉÒ¨ÉiÉɨÉ ({ÉÖÆϱ±ÉMÉ)/¸ÉÒ¨ÉiÉÒxÉɨÉÂ(ºjÉÒ˱ÉMÉ)?kati putr¡:/ sut¡: ¿r¢mat¡m (puÆlli´ga)/¿r¢mat¢n¡m(str¢li´ga)?


w This gift is wonderfulBiÉiÉ ={ɽþÉ®Æú ¶ÉÉä¦ÉxɨÉ +κiÉ*®tat upah¡raÆ ¿°bhanam asti

w It is really prettyBiÉkÉÖ +iÉÒ´É ºÉÖxnù®Æú +κiÉ*®tattu at¢va sundaraÆ asti

w Food is delicious¦ÉÉäVÉxɨÉ ¤É½Öþ¯ûÊSÉEò®Æú/º´ÉÉÊnù¹]Æõ +κiÉ*bh°janam bahurucikaraÆ/sv¡diÀ¶aÆ asti

w Congratulations+ʦÉxÉxnùxÉÆ / Ênù¹]õ¬É ´ÉvÉÇxÉÆ *abhinandanaÆ / diÀ¶y¡ vardhanaÆ

w You look lovelyi´É¨É ºÉÖxnù®Æú/ SÉɯû:|ÉiÉÒªÉiÉä*tvam sundaraÆ/ c¡ru:prat¢yat®

w Wish you happy new yearxÉÚiÉxÉ ´É¹ÉÉÇʦÉxÉxnùxÉÆ / xɴɴɹÉǺªÉ ¶ÉÖ¦ÉEòɨÉxÉÉ*n£tana varÀ¡bhinandanaÆ / navavarÀasya¿ubhak¡man¡

w I wish you all the happinessiÉ´É ºÉ´ÉÇÊ´ÉvɺÉÉèJªÉÉlÉÈ EòɨɪÉÉ欃 /EòÉRÂóIÉÉʨÉ*tava sarvavidhasaukhy¡rthaÆ k¡may¡mi /k¡´kÀ¡mi

w Congratulations on your marriageÊ´É´ÉɽþÉlÉÈ ½þÉÌnùEòÉ: ¶ÉÖ¦ÉEòɨÉxÉÉ:/ Ê´É´ÉɽÆþ ={ɱÉIªÉ +ʦÉxÉxnùxÉÆ*viv¡h¡rthaÆ h¡rdik¡: ¿ubhak¡man¡:/ viv¡haÆupalakÀya abhinandanaÆ

w Keep your eyes wide open before marriage andhalf-shut afterwardsÊ´É´ÉɽþÉiÉ {ÉÚ ÉÈ xÉäjÉä +É´ÉÞiÉä EÖò¯û iÉi{ɶSÉÉiÉ /iÉnùxÉxiÉ®Æú +vÉÉÇ ÉÞiÉÆEÖò¯û *viv¡h¡t p£rvaÆ n®tr® ¡v¤t® kuru tatpa¿c¡t /tadanantaraÆ ardh¡v¤taÆ kuru

(Courtesy : Dr. D.K. Lobiyal,School of Computer and Systems Sciences, JNU

Ph. 91-11-610 7676-2774e-mail : [email protected])

3.2.1 Hindi Design Guide

Introduction

India is a vast country lying at the feet of the great Himalayas in the north and bounded by oceans on three sides. The southwest side of India is bordered by the Arabian Sea, the southeast is lulled by the Bay of Bengal, and the southernmost part of India, i.e. Cape Comorin (Kanyakumari), is washed by the Indian Ocean. It extends 3214 km from north to south and 2933 km from east to west. It has a land frontier of 15,200 km and a coastline of 7516.5 km. It lies to the north of the equator between 8.4 and 37.6 degrees north latitude and 68.7 and 97.25 degrees east longitude. It has six seasons: summer, winter, spring, rainy, autumn and mild cold.

India shares its political borders with Pakistan onthe west, Bangladesh and Myanmar on the east, andthe countries Nepal, China, Tibet and Bhutan inthe north.

India is a multilingual, multiethnic and pluricultural country. It has 28 states and 6 union territories. Its capital is New Delhi.

Languages in India

As per the census of India (1961), 1652 mother tongues are spoken in India, of which 33 are major languages, each with more than ten lakh speakers according to the 1971 census. These languages are written using 10 major script systems and a host of minor ones.

18 languages (Assamese, Oriya, Bengali, Urdu, Kannada, Kashmiri, Gujarati, Tamil, Telugu, Punjabi, Marathi, Malayalam, Konkani, Nepali, Manipuri, Sanskrit, Sindhi and Hindi) are enshrined in the 8th Schedule of the Constitution of India.

Hindi written in the Devanagari script is the official language of the Indian Union according to Article 343 of the Constitution of India.

Description of Hindi language

In present India, Hindi functions as a language having multifaceted domains of use. It performs various roles and functions in the network of speech communications. It is not only the official language of the Union Government of India, of nine states (Uttar Pradesh, Uttaranchal, Bihar, Jharkhand, Madhya Pradesh, Chhattisgarh, Rajasthan, Haryana and Himachal Pradesh) and of the union territories of Delhi and Andaman & Nicobar, but also a powerful medium of trade and commerce, mass communication and the day-to-day practical needs of inter-group interaction; it serves throughout India as a language of wider communication. It is the only language of India which, like Sanskrit and English, functions beyond its region. This shows that Hindi may have restricted domains of operation but, unlike English, its uses are unlimited in number.

Hindi belongs to the Indo-Aryan languages, a sub-group of the Indo-European family. The Indo-Aryan languages show an uninterrupted chain of development from 300 BC to the present day, which is broadly classified into three major periods: Old Indo-Aryan (OIA), Middle Indo-Aryan (MIA) and New Indo-Aryan (NIA), commonly known as the periods of Sanskrit, Prakrit/Apabhramsa and Bhakha respectively. Hindi was the dialect of the midland, which from the OIA period through the MIA stage culminated in the group of dialects belonging to western Hindi, including Braja, Bundeli, Khariboli and Bangaru. The linguistic matrix of Hindi is an outgrowth of Kaurawi, a subgroup of western Hindi comprising Khariboli and Bangaru.

History of Hindi language

Hindi was initially used for the inhabitants of India, mostly of Northern India or Madhyadesha, and Hindawi or Hindooi was the word commonly used for the major languages of this territory. When the dialects of the Bhakha period (NIA) were evolving their norms of usage and standard variants for literary pursuits, Northern India was constantly facing pressure from foreign invasions, particularly the Mughal invasions and their settlement in India. This created a complex situation. The invaders had Turkish as their mother tongue, Arabic as the language of religion and Persian as the language of official transactions and literary activities, whereas in Madhyadesha literary creativity was pursued in the regional languages, primarily in Brajbhasha and secondarily in Awadhi, Dingal and Maithili. Khariboli, which was used neither in official transactions nor for standardized usage in creative pursuits, was serving as a language of wider communication. Official status was assigned to it for the first time in Golkunda (present Hyderabad and Bijapur in Andhra Pradesh and Karnataka respectively) by the Muslim rulers, and the Sufi saints of southern India used it for literary pursuits. Therefore, the zaban-e-Hind (language of India) was named Dakkhini.

Some scholars have opined that Hindi as a vernacular is the outcome of the intermixture of two cultures or languages, the native and the Muslim. This mixed language was used, not in a stabilized form, in the royal camps and was called zaban-i-urdu-e-mualla (language of the camp), which was later called Urdu or Rekhta. It is to be noted that in the royal courts the broad-based element of bhakha from Hindawi was consciously replaced by foreign loan words and expressions of Persian origin. Urdu as a language or style was used to designate this variant of Hindi, which was non-Indian in spirit, alien in literary norms and style, and restricted to official transactions by the elite class. It is worth mentioning that the word Urdu was not heard till the 18th century, though Muslim invaders had settled in India at least 500 years before the word Urdu was used.

By the time the British settled as rulers of India, there was a deep gap between the two forms of verbal expression, Hindustani and Urdu. That is why Gilchrist described three distinct varieties: (1) the high court or Persian style, (2) the genuine (or middle) style and (3) the Hindawi (or vulgar) style.

Mahatma Gandhi realized that the English rulers were politicising the language problem and giving a communal touch to the Hindi-Urdu dichotomy. He put forward the composite concept of a lingua franca and, after accepting Hindustani as a common invariant of colloquial usage, characterized Hindi as Hindustani written in the Devanagari script and Urdu as Hindustani written in the Perso-Arabic alphabet. There have been different kinds of forces at different historical points which helped to enlarge the gulf. The intellectuals and the creative writers began to reshape the language with a chaste and pure lexicon and expressive devices from the Sanskrit language. Thus, Hindi and Urdu have been identified as two distinct languages, and Hindi has different roles in present India, e.g. pan-Indian language (contact language), official language, literary language etc.

Population using the Hindi language

As per the data of the census report of 1991, Hindi-Urdu together claim 44.98 percent of the entire population as native speakers (Year Book 2001). In other states and union territories of India, 50% of the population used Hindi-Urdu as mother tongue. For example, Maharashtra (15.09%), Karnataka (11.93%), Andhra Pradesh (11.13%), West Bengal (8.72%), Punjab (7.29%), Goa (7.13%) and Andaman and Nicobar (17.63%) are worth mentioning.

A large Hindi-speaking population is found in Fiji, Surinam, Guinea, Trinidad & Tobago as well as in Mauritius, where people of Indian origin have been settled for the last two hundred years. Other European and African countries, as well as countries like America, also have sizeable populations of Hindi-Urdu speakers. According to their census reports, USA (26,253), Germany (24,500), New Zealand (11,200), South Africa (890,292), Yemen (232,760), Uganda (147,000), Singapore (5,000) and Bangladesh (346,000) are worth mentioning.

Technical Characteristics

Hindi Alphabet Characteristics

The Hindi language, in common with Marathi, Konkani, Nepali, Sindhi and many other north Indian dialects, is written in the Devanagari script, which is also the accepted all-India script for Sanskrit.

The alphabet consists of 11+1 = 12 vowels and 35 consonants, as given below:

(a) Vowels ¼Loj½+ +É < <Ç = >ð @ñ B Bä +Éä +Éè +Éìa ¡ i ¢ u £ ¥ e ® o au ¡£

(i) +Éì (¡£) indicates English O in words like office(+ÉìÊ¡òºÉ), Coffee (EòÉì¡òÒ), Sauce (ºÉÉìºÉ) etc.

(ii) The vowel @ñ occurs only in Sanskrit wordsborrowed into Hindi. In Hindi @ñ is pronouncedas ‘r + i = ri’ whereas it is pronounced as ‘r + u =

ru’ in some south Indian languages.

(iii) #Æ (anusw¡ra) and #& (visarg) are often includedin the list of vowel letter and are usually writtenas +Æ and +&, but so as Hindi is concerned, theyare mostly used with consonant.

(iv) For all practical purposes, + - +É, <-<Ç and =->ðmay be regarded as pairs of short and longvowels. B-Bä and +Éä-+Éè are all long vowels.

Dependent Vowel Signs (M¡tr¡s)

To indicate a vowel sound other than the implicitone, a vowel sign (m¡tr¡) is attached to the consonant.Thus, there are equivalent m¡tr¡s for all the vowels.Explicit appearance of a m¡tr¡ in a syllable overridesthe inherent vowel. These m¡tr¡s can exist alonebelow, to the right or to the left of the consonant towhich it is applied to. The ‘m¡tr¡s’ mostly come afterthe consonant letters as below:-+É ⇒ É, < ⇒ Ê , <Ç ⇒ Ò, = ⇒ Ö, >ð ⇒ Ú@ñ ⇒ Þ, B ⇒ ä, Bä ⇒ è, +Éä ⇒ Éä, +Éè ⇒ Éè

+ (a) has no m¡tr¡. The m¡tr¡s É (+É ), Ò (<),Éä (+Éä), Éè(+Éè )are written after the consonant whereas Ê (<) iswritten before, Ö (=), Ú (>ð) and Þ (@ñ )are writtenbelow and ä (B) and è (Bä ) are written above. Thus

EÂò + +É = EòÉ EÂò + @ñ = EÞòEÂò + < = ÊEò EÂò + B = EäòEÂò + <Ç = EòÒ EÂò + Bä = EèòEÂò + = = EÖò EÂò + +Éä = EòÉäEÂò + >ð = EÚò EÂò + +Éè = EòÉè

With ®Âú (r) = and >ð m¡tr¡s are written in anexceptional form i.e. ®Âú + = = ¯û and ®Âú + >ð = °ü

It may be noted that the m¡tr¡ is tagged on to theconsonant letter and is never written in full. Thus,EÂò + < (k + i) will not be written as Eò< but as ÊEò, EÂò+ = = EÖò is the correct form of m¡tr¡ and not Eò=. InEò< (kai) and Eò= (kau) forms, < and = are vowel infull form, not the m¡tr¡s.

(b) Consonant Letters ¼O;atu½Eò (ka) JÉ (kha) MÉ (ga) PÉ (gha) Ró (´a)SÉ (ca) Uô (cha) VÉ (ja) ZÉ (jha) \É (µa)]õ (¶a) B (¶ha) b÷ (·a) fø (·ha) hÉ (¸a)iÉ (ta) lÉ (tha) nù (da) vÉ (dha) xÉ (na){É (pa) ¡ò (pha) ¤É (ba) ¦É (bha) ¨É (ma)ªÉ (ya) ®ú (ra) ±É (la) ´É (va) ¶É (sha)


¹É (Àa) ºÉ (sa) ½þ (ha) c÷ (¤a) gø (¤ha)

The following points are to be noted :(i) + (a) is a inherent in each consonant letter.

(ii) ¹É (Àa) occurs only in Sanskrit words borrowedinto Hindi.

(iii) Ró (´a), \É (µa), c÷ (¤a) and gø (¤ha) never occurin the beginning of the word; Ró and \É and neveroccur independently themselves. They arealways combined with a following consonant.

The first twenty-five consonants i.e. (ka) to (ma)are divided into five categories (Varga):ka category (Eò ´ÉMÉÇ) - Eò JÉ MÉ PÉ Róca category (SÉ ´ÉMÉÇ) - SÉ Uô VÉ ZÉ \ɶa category (]õ ´ÉMÉÇ) - ] B b÷ fø hÉta category (iÉ ´ÉMÉÇ) - iÉ lÉ nù vÉ xÉpa category ({É ´ÉMÉÇ) - {É ¡ò ¤É ¦É ¨É

The fifth letters of each category, Ró, \É, hÉ, xÉ, and ¨Éare nasals.

Rest of the consonants can be placed in an unmarkedcategory.

Consonant ConjunctsThe device of conjoining consonant letters was usedin writing Sanskrit to indicate the pronunciation ofconsonants without an intervening inherent a.Traditional conjunct consonant letters are IÉ (ksha),YÉ (jna), jÉ (tra), ¸É (shra) and t (dya). It is to benoted that in Hindi YÉ is pronounced as MªÉ (gya).Traditional conjunct consonant letters are verycommon in Sanskrit loanwords.

The common conjuncts are listed as given below:CEò (kka), CJÉ (kkha), CiÉ (kta), CªÉ (kya), C±É (kla),C´É (kva), JªÉ (khya), Mnù (gda), SSÉ (cca), hBö (¸¶ha),hªÉ (¸ya), kÉ (tta), k´É (ttva), ilÉ (ttha), xxÉ (nna), x¨É(nma), xªÉ (nya), {iÉ (pta), {ªÉ (pya), |É (pra), ¤VÉ (bja),¤nù (bda), ¤¤É (bba), ¨¨É (mma), ¨ªÉ (mya), ªªÉ (yya), ±½þ(lha), ´ªÉ (vya), ¶Eò (shka), ¹Eò (Àka), ¹]õ (Àta), ¹hÉ(Àna), ]Âõ]õ (¶¶a), bÂ÷b÷ (··a), qù or nÂùnù (dda), rù or nÂùvÉ(ddha), uù or nÂù´É (dva), ¼xÉ or ½ÂþxÉ (hna), À or ½Âþà É(hma), ¿ (hra), ¼±É (hla), ¼´É (hva).

Conjuncts with an initial r are written with a special superscript form of r: र् + क = र्क (rka), र् + म = र्म (rma), र् + ष = र्ष (rṣa), etc. The superscript is written at the end of its syllable, as in र् + ष + ा = र्षा (rṣā), र् + थ + ी = र्थी (rthī), र् + य + ो = र्यो (ryo).

But when र (r) follows a consonant letter that has a vertical stroke, it is written as a short stroke below and to the left of that vertical stroke: क् + र = क्र (kra), म् + र = म्र (mra), द् + र = द्र (dra).

When preceded by ट, ठ, ड, ढ and छ it is written as follows: ट् + र = ट्र (ṭra), ड् + र = ड्र (ḍra).

Halant (्)
The halant is the sign that omits the vowel अ (a). It cancels the inherent vowel of the consonant to which it is applied. When a bare consonant without the inherent अ is to be expressed, a right-slanting stroke called hal or halant is put below the letter. Note, however, that the halant is not used below consonant letters that have a vertical stroke. In these letters only the vertical stroke is removed, giving the half forms of ख, ग, च, त, प, ल and similar letters. In letters such as क and फ, where the vertical stroke stands in the middle, the stroke is not removed; only a part of the right portion of the letter is removed.

The halant is used on consonant letters that have no vertical stroke, e.g. छ् (ch), ट् (ṭ), ठ् (ṭh), ढ् (ḍh), द् (d), र् (r), ह् (h). When the inherent अ at the end of certain words is silent, as in अर्थात् (arthāt) and परिषद् (pariṣad), the halant can be written below the final consonant.

Nuktā (़)
A subscript dot (nuktā) is sometimes used with certain Devanagari letters to denote sounds of non-Indian origin in loanwords. Five of the Devanagari consonant letters, written with a nuktā (diacritic mark), represent sounds from Persian, Arabic and English. This usage is common but not obligatory, the more so since the great majority of Hindi speakers tend to replace these sounds with sounds of Indian origin. E.g. क़ (qa), ख़ (xa), ग़ (Ga), ज़ (za), फ़ (fa).

In the letters क़ (qa), ख़ (xa) and ग़ (Ga), the nuktā is mostly omitted, since the majority of Hindi speakers replace them with क (ka), ख (kha) and ग (ga).
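The fallback described above can be sketched in code for software that must match text written with and without the nuktā. This is only an illustration, assuming Unicode-encoded input; the function name `fold_nukta` and the choice to fold only क़, ख़ and ग़ are assumptions made here, not part of the guide.

```python
import unicodedata

NUKTA = "\u093c"  # DEVANAGARI SIGN NUKTA
# Hypothetical policy: fold only the three letters the text says are
# commonly replaced (qa -> ka, xa -> kha, Ga -> ga).
FOLDABLE_BASES = {"\u0915", "\u0916", "\u0917"}


def fold_nukta(text):
    """Drop the nukta from qa, xa and Ga; leave every other letter as is."""
    # NFD splits precomposed nukta letters into base letter + nukta.
    decomposed = unicodedata.normalize("NFD", text)
    out = []
    for ch in decomposed:
        if ch == NUKTA and out and out[-1] in FOLDABLE_BASES:
            continue  # skip the dot, keep the base consonant
        out.append(ch)
    return unicodedata.normalize("NFC", "".join(out))
```

Letters such as ज़ and फ़, and the native ड़ and ढ़, pass through unchanged, which is why the fold is restricted to an explicit set rather than stripping every nuktā.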

Anuswāra (ं)
Anuswāra indicates a nasal sound. It is a 'homorganic' nasal representing ङ्, ञ्, ण्, न् and म्, the nasals of the five categories ka, ca, ṭa, ta and pa. When a nasal consonant precedes a consonant of the same category, it is written as anuswāra. For example, अङ्क = अंक (aNk), मञ्च = मंच (maNc), मण्डन = मंडन (maNḍan), हिन्दी = हिंदी (hiNdī), लम्बा = लंबा (laNbā). Here 'N' stands for the anuswāra. Note that in Sanskrit the anuswāra (ं) is not usually used as a homorganic nasal in place of the nasal consonants ङ्, ञ्, ण्, न्, म्; the nasal consonants themselves are written.

Anuswāra (ं) is also used before य (ya), व (va), र (ra), ल (la), श (sha) and स (sa), as in संयम (samyam), संवाद (samvād), संरक्षण (samrakshaṇ), संलाप (samlāp), संसार (samsār). Here सं represents the prefix सम्. But in reduplicated nasal letters, as in अन्न (anna) and सम्मान (sammān), and in तुम्हारा (tumhārā), anuswāra is not used. Similarly, where a nasal letter is conjoined with य (ya), व (va) or ह (ha), anuswāra is not used and the nasal letter appears in its original form, e.g. अन्य (anya), साम्य (sāmya), समन्वय (samanvay) and कान्हा (kānhā).
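The homorganic-nasal rule can be sketched as a small rewrite over Unicode text. This is an illustrative sketch, not a complete orthography engine: the name `to_anusvara` is hypothetical, and the function handles only the five varga nasals followed by a non-nasal consonant of the same varga, so reduplicated nasals such as अन्न are left alone.

```python
HALANT = "\u094d"    # DEVANAGARI SIGN VIRAMA
ANUSVARA = "\u0902"  # DEVANAGARI SIGN ANUSVARA

# The five vargas; the fifth letter of each string is the varga's nasal.
VARGAS = ["कखगघङ", "चछजझञ", "टठडढण", "तथदधन", "पफबभम"]


def to_anusvara(word):
    """Rewrite nasal + halant + same-varga consonant using the anusvara."""
    out = []
    i = 0
    while i < len(word):
        ch = word[i]
        nxt = word[i + 2] if i + 2 < len(word) else ""
        if i + 1 < len(word) and word[i + 1] == HALANT and nxt:
            for varga in VARGAS:
                # Nasal of this varga before one of its four non-nasals.
                if ch == varga[4] and nxt in varga[:4]:
                    out.append(ANUSVARA)
                    i += 2  # consume the nasal and its halant
                    break
            else:
                out.append(ch)
                i += 1
            continue
        out.append(ch)
        i += 1
    return "".join(out)
```

For instance, हिन्दी becomes हिंदी and लम्बा becomes लंबा, while अन्न is returned unchanged.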

Anunāsik (ँ)
The superscript sign chandrabindu (ँ) represents the anunāsik sound. It is placed above a vowel to denote nasalization of that vowel. E.g. हाँ, आँख, हूँ, पूँछ.

When a consonant carries a vowel sign (mātrā) above its headline, the chandrabindu (ँ) is written as an anuswāra dot, although this dot still represents the anunāsik sound. E.g. सिंचाई, खींचना, भेंट, भैंस, पोंछना, रौंदना।

Visarga (ः)
The sign (ः) called visarga is written in line with the text. It has the sound of a voiced ह (ha) in Hindi. It occurs almost exclusively in Sanskrit words borrowed into Hindi. E.g. अतः (atah), प्रायः (prāyah), सामान्यतः (sāmānyatah).

Numerals
The Devanagari numerals are:
० १ २ ३ ४ ५ ६ ७ ८ ९

At the all-India level, however, the international form of the numerals is used:

0 1 2 3 4 5 6 7 8 9
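Because both digit sets are in use, software usually needs to convert between them. The following is a minimal sketch using Python's built-in `str.translate`; the function names are illustrative only.

```python
DEVANAGARI_DIGITS = "\u0966\u0967\u0968\u0969\u096a\u096b\u096c\u096d\u096e\u096f"
INTERNATIONAL_DIGITS = "0123456789"

_TO_DEVANAGARI = str.maketrans(INTERNATIONAL_DIGITS, DEVANAGARI_DIGITS)
_TO_INTERNATIONAL = str.maketrans(DEVANAGARI_DIGITS, INTERNATIONAL_DIGITS)


def to_devanagari_digits(text):
    """Replace 0-9 with the Devanagari digits; other characters are kept."""
    return text.translate(_TO_DEVANAGARI)


def to_international_digits(text):
    """Replace Devanagari digits with 0-9; other characters are kept."""
    return text.translate(_TO_INTERNATIONAL)
```

A round trip such as `to_international_digits(to_devanagari_digits("2002"))` returns the original string.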

Hindi also has names for common fractions: चौथा chauthā (1/4), आधा ādhā (1/2), पौना paunā (3/4), सवा savā (1¼), डेढ़ ḍeṛh (1½) and ढाई ḍhāī (2½).

Punctuation Marks (Virāma)
In Hindi, sentences are concluded with the vertical mark (।) called pūrṇ virām or danḍā.

The vertical stroke is also used to mark the end of the first hemistich, i.e. half verse. To mark the end of the verse itself, two vertical strokes (॥), called dīrgh virām, may be used. E.g.

पोथी पढ़ पढ़ जग मुआ पंडित भया न कोय ।
ढाई आखर प्रेम का पढ़े सो पंडित होय ॥

Some modern Hindi writers prefer the English full stop to the vertical stroke.

The rest of the punctuation marks, viz. comma [,] (alpvirām), semicolon [;] (ardh virām), colon [:] (virām cinha), hyphen [-] (yojak), dash [-] (nirdeshak), single and double inverted commas [‘ ’ and “ ”] (uddharan cinha), question mark [?] (prashna sūcak), brackets [( )] (koṣṭak), etc., have been borrowed from English. However, the colon [:] is usually avoided, lest it be confused with the visarga sign.

Ancient Signs
ॐ (Om), the Swasti sign and श्री (shrī) are used in Hindi much as in Sanskrit.

Avagraha (ऽ)
It is used primarily in Sanskrit text. It puts extra stress on the preceding vowel.

Character Set Consideration

• Hindi Characteristics
The Devanagari script is syllabic and is written from left to right. The characters of the script are given below in their traditional order, called 'Varnmala', accompanied by Roman characters.

Vowels

Syllabic form:
अ आ इ ई उ ऊ ऋ ए ऐ ओ औ ऑ
a ā i ī u ū ṛ e ai o au (open o)

Intra-Syllabic form:


्, ा, ि, ी, ु, ू, ृ, े, ै, ो, ौ, ॉ

Consonants
The first five rows are arranged in the columns: voiceless unaspirated, voiceless aspirated, voiced unaspirated, voiced aspirated, nasal.

क (ka) ख (kha) ग (ga) घ (gha) ङ (ṅa)
च (ca) छ (cha) ज (ja) झ (jha) ञ (ña)
ट (ṭa) ठ (ṭha) ड (ḍa) ढ (ḍha) ण (ṇa)
त (ta) थ (tha) द (da) ध (dha) न (na)
प (pa) फ (pha) ब (ba) भ (bha) म (ma)

य (ya) र (ra) ल (la) व (va) श (śa)
ष (ṣa) स (sa) ह (ha) ड़ (ṛa) ढ़ (ṛha)

• Formats of Units

Calendar
The era prevalent in the Hindi-speaking area is the 'Vikram Samvat', started by King Vikram. It differs from the Christian era by +57 years. The new year of the Indian calendar begins on the 16th day of the month of Chaitra (चैत्र), mostly in March of the Christian year.

The twelve months of the Vikram Samvat are named in Hindi as follows (the Sanskrit names are given in brackets):
चैत cait (चैत्र)
बैशाख baishākh (वैशाख)
जेठ jeṭh (ज्येष्ठ)
अषाढ़ aṣāṛh (आषाढ़)
सावन sāwan (श्रावण)
भादों bhādo (भाद्रपद)
क्वार kwār (आश्विन)
कातिक kātik (कार्तिक)
अगहन agahan (अग्रहायन or मार्गशीर्ष)
पूस pūs (पौष)
माघ māgh (माघ)
फागुन phāgun (फाल्गुन)
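The +57 offset can be expressed as a one-line conversion. The sketch below is illustrative: the function name is hypothetical, and the treatment of dates that fall before the Chaitra new year (offset 56) is an assumption added here, since the guide states only the +57 figure.

```python
def vikram_samvat_year(ce_year, new_year_passed=True):
    """Convert a Christian-era year to a Vikram Samvat year.

    The guide gives the offset as +57. Assumption (not from the guide):
    before the Chaitra new year has begun in a given Christian year,
    the Vikram year is still the previous one, i.e. the offset is +56.
    """
    offset = 57 if new_year_passed else 56
    return ce_year + offset
```

Under these assumptions, a date late in 2002 falls in Vikram Samvat 2059.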

Week Days
The seven days of the week are named as follows:
रविवार (इतवार) Ravivār/Itvār (Sunday)
सोमवार Somvār (Monday)
मंगलवार Mangalvār (Tuesday)
बुधवार Budhvār (Wednesday)
गुरूवार (बृहस्पतिवार) Gurūvār/Bṛihaspativār (Thursday)
शुक्रवार Shukravār (Friday)
शनिवार (शनीचर) Shanivār/Shanīcar (Saturday)
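In a localized product the weekday names above typically become a lookup table. The following is a small sketch using Python's standard `datetime` module; the table and function names are illustrative, and the spellings follow the list above.

```python
from datetime import date

# Indexed to match date.weekday(): Monday is 0, Sunday is 6.
HINDI_WEEKDAYS = [
    "सोमवार",    # Monday
    "मंगलवार",   # Tuesday
    "बुधवार",    # Wednesday
    "गुरूवार",   # Thursday
    "शुक्रवार",  # Friday
    "शनिवार",    # Saturday
    "रविवार",    # Sunday
]


def hindi_weekday(d):
    """Return the Hindi weekday name for a datetime.date."""
    return HINDI_WEEKDAYS[d.weekday()]
```

For example, `hindi_weekday(date(2002, 1, 1))` looks up the entry for Tuesday.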

Dates
The dates of a month in the Indian calendar are divided into two paksh (fortnights), Krishna paksh and Shukla paksh. Krishna paksh is completed on its 15th day, amāvasyā; the same dates then start again in Shukla paksh, which ends on pūrnimā. The dates are given below:

प्रतिपदा (प्रथमा) pratipadā/prathmā (1)
द्वितिया dvitiyā (2)
तृतीया tṛitīyā (3)
चतुर्थी caturthī (4)
पंचमी pancmī (5)
षष्ठी ṣaṣṭhī (6)
सप्तमी saptamī (7)
अष्टमी aṣṭmī (8)
नौमी (नवमी) naumī/navmī (9)
दशमी dashmī (10)
एकादशी ekādashī (11)
द्वादशी dvādashī (12)
त्रयोदशी trayodashī (13)
चौदश caudash (14)
अमावस्या amāvasyā (15)

Thus amāvasyā and pūrnimā are each preceded by the same fourteen dates, with pratipadā as the starting point; amāvasyā belongs to Krishna paksh and pūrnimā to Shukla paksh.
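The two-fortnight scheme maps naturally onto a lookup by day number. The sketch below is only an illustration: it assumes the days of a month are numbered 1 to 30 with Krishna paksh first, as in the description above, and uses plain romanized names; the function name `tithi` is hypothetical.

```python
# Names of dates 1-14, shared by both fortnights (romanized without marks).
SHARED_DATES = [
    "pratipada", "dvitiya", "tritiya", "chaturthi", "panchami",
    "shashthi", "saptami", "ashtami", "navami", "dashami",
    "ekadashi", "dvadashi", "trayodashi", "chaudash",
]


def tithi(day_of_month):
    """Map a day number 1-30 to (paksh, date name)."""
    if not 1 <= day_of_month <= 30:
        raise ValueError("day_of_month must be between 1 and 30")
    paksh = "Krishna" if day_of_month <= 15 else "Shukla"
    index = (day_of_month - 1) % 15
    if index == 14:
        # The 15th date differs: amavasya closes Krishna, purnima closes Shukla.
        name = "amavasya" if paksh == "Krishna" else "purnima"
    else:
        name = SHARED_DATES[index]
    return paksh, name
```

Days 1 and 16 both map to pratipada, in Krishna and Shukla paksh respectively.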

Time
Time in the Indian context is measured in पहर (pahar), घड़ी (ghaṛī), पल (pal) and अक्षर (akshar).

A पहर (pahar; प्रहर in Sanskrit) is 1/8 of a day and night, i.e. 3 hours.

A घड़ी (ghaṛī) is 1/60 of a day and night, i.e. 24 minutes.

A घड़ी (ghaṛī) is divided into 60 parts called पल (pal).

A पल (pal) is divided into 60 parts called अक्षर (akshar).

These units relate as follows:
1 अक्षर = 24/60 seconds (= 2/5 second)
1 पल = 60 अक्षर (= 24 seconds)
1 घड़ी = 60 पल (= 24 minutes)
1 पहर = 7.5 घड़ी (= 3 hours)
day + night = 60 घड़ी = 8 पहर (= 24 hours)
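The relations above can be checked with a short conversion from seconds. This is a sketch under the stated ratios only (1 akshar = 2/5 second, 60 akshar per pal, 60 pal per ghaṛī); the function name is illustrative.

```python
def seconds_to_traditional(seconds):
    """Convert whole seconds to (ghari, pal, akshar).

    Uses the ratios from the table: 1 akshar = 2/5 s,
    60 akshar = 1 pal, 60 pal = 1 ghari. Any fraction of an
    akshar left over is discarded.
    """
    akshar_total = (seconds * 5) // 2
    pal_total, akshar = divmod(akshar_total, 60)
    ghari, pal = divmod(pal_total, 60)
    return ghari, pal, akshar
```

Three hours (10,800 seconds) comes out as 7 ghaṛī and 30 pal, i.e. the 7.5 ghaṛī of one pahar, and a full day of 86,400 seconds as 60 ghaṛī.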

Currency
The principal unit of currency is the rupee (रुपया). A rupee has a hundred paisā. Metallic coins are issued for 1 paisā, 2 paisā, 3 paisā, 5 paisā, 10 paisā, 20 paisā (now non-current), 25 paisā, 50 paisā, one rupee, two rupees and five rupees. Currency notes are issued for one rupee, two rupees, five rupees (now non-current), ten rupees, twenty rupees, fifty rupees, one hundred rupees, five hundred rupees and one thousand rupees.

Previously, a rupee had 16 ānā, an athani had eight ānā, a chavanni had four ānā, a takka was half an ānā, a paisā was one-fourth of an ānā and a dhela was half a paisā. This type of currency is no longer prevalent.

Weight and Measures
(a) The unit of weight is the सेर (ser), which is divided into sixteen parts called छटांक (chaṭānk), as follows:
4 छटांक = 1 पाव (quarter)
8 छटांक = आधा सेर (half ser)
16 छटांक = 1 सेर (ser)
40 सेर = 1 मन (maund)

(b) For weighing gold, silver etc. as well as medicines, the following weights are used:
8 खसखस = 1 चावल (cāval)
8 चावल = 1 रत्ती (rattī)
8 रत्ती = 1 माशा (māshā)
12 माशा = 1 तोला (tolā)
5 तोला = 1 छटांक (chaṭānk)

(c) The unit for linear measurement is the गज (yard), as shown below:
12 इंच = 1 फुट (foot)
3 फुट = 1 गज (yard)
220 गज = 1 फर्लांग (furlong)
8 फर्लांग = 1 मील (mile)
1 बालिश्त = आधा हाथ (half hand)
1 हाथ = आधा गज (half yard)
1 गज = 16 गिरह
(Note: one bālisht is 9 inches long.)

(d) Land areas are measured in the following way:

(a) 144 वर्ग इंच (square inches) = 1 वर्ग फुट (square foot)
9 वर्ग फुट (square feet) = 1 वर्ग गज (square yard)
4840 वर्ग गज = 1 एकड़ (acre)

(b) 20 बिसवांसी (biswānsī) = 1 बिसवा (biswā)
20 बिसवा = 1 बीघा (bīghā)
3 1/4 बीघा = 1 एकड़ (acre)
14,400 वर्ग फुट = 1 बीघा
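The traditional weight units form a single chain, so any conversion factor is a product of the steps between two units. The following sketch encodes the chain from lists (a) and (b) above; the romanized unit keys and the function name `units_per` are illustrative choices, not part of the guide.

```python
# (smaller unit, how many make one of the next unit, next unit)
WEIGHT_STEPS = [
    ("khaskhas", 8, "chaval"),
    ("chaval", 8, "ratti"),
    ("ratti", 8, "masha"),
    ("masha", 12, "tola"),
    ("tola", 5, "chatank"),
    ("chatank", 16, "ser"),
    ("ser", 40, "maund"),
]


def units_per(small, large):
    """Return how many `small` units make up one `large` unit."""
    names = [step[0] for step in WEIGHT_STEPS] + [WEIGHT_STEPS[-1][2]]
    i, j = names.index(small), names.index(large)
    if i > j:
        raise ValueError("small must not be a larger unit than large")
    factor = 1
    for _, count, _ in WEIGHT_STEPS[i:j]:
        factor *= count
    return factor
```

For example, one tolā is 8 × 12 = 96 rattī, and one ser is 5 × 16 = 80 tolā.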

At present, India uses the metric system of weights and measures, described below.

(a) For measuring solids:
1 ग्राम = about 15.43 ग्रेन (grain) or 1 माशा (māshā)
1000 ग्राम = 1 किलोग्राम (kg)
100 किलोग्राम = 1 क्विंटल (quintal)

(b) For measuring liquids:
1 गेलन = about 4.5 लिटर (litres)
1 लिटर = about 1.75 पिंट (pints)
1000 लिटर = 1 किलोलिटर (kilolitre)

(c) For measuring length and area:
(i) 100 सेंटीमीटर = 1 मीटर (metre)
1000 मीटर = 1 किलोमीटर (kilometre)
8 किलोमीटर = about 5 मील (miles)
1 मीटर = about 39.4 इंच (inches)

(ii) 1 हेक्टेयर (hectare) = more than 2 एकड़ (acres)
100 हेक्टेयर = 1 वर्ग किलोमीटर (sq. km)

Abbreviation names
भा. ज. पा - भारतीय जनता पार्टी
इंका - इन्दिरा कांग्रेस
उ. प्र. - उत्तरप्रदेश
आं. प्र. - आंध्रप्रदेश
म. प्र. - मध्यप्रदेश
रा. कु. शर्मा - रामकुमार शर्मा
यू. के. - युनाइटेड किंगडम
यू. एस. ए. - युनाइटेड स्टेट्स ऑफ अमेरिका
एम. ए. - मास्टर्स ऑफ आर्टस
बी. ए. - बैचलर ऑफ आर्टस
डॉ. - डाक्टर

Eras
In Indian mythology, there are four eras (युग):

सत युग (satyug)
त्रेता युग (tretāyug)
द्वापर युग (dwāparyug)
कलियुग (Kaliyug)

AM/PM markers
प्रभात (पू.) प्रातः/सुबह - dawn / morning
पूर्वाह्न - forenoon
अपराह्न (अप.) - afternoon
दोपहर - noon
संध्या/शाम/सांझ - evening
रात्रि/रात - night

Time Zone
आज - Today
कल - Tomorrow and yesterday
दिन - Day
परसो - Day after tomorrow and day before yesterday
नरसो - Two days after tomorrow and two days before yesterday
सप्ताह/हफ्ता - Week
पखवारा/पक्ष - Fortnight
मास/महीना - Month
तिमाही - Quarterly/three-monthly
छमाही - Half-yearly
वर्ष/साल - Year
दशाब्दी - Decade
शताब्दी - Century
सहस्त्राब्दी - Millennium

(Courtesy: Prof. K.K. Goswami, Kendriya Hindi Sansthan, Delhi; Tel: 011-6961323)

Typical Colloquial Sentences in Hindi

GREETING
• Hello: हैलो
• Good Morning: सुप्रभात (Suprabhāt)
• Good Afternoon: नमस्ते (Namaste)
• Good Night: शुभ रात्रि (Shubh rātri)
• Good Bye: अलविदा (Alvidā)
• Thanks: धन्यवाद (Dhanyavād)
• How are you: आप कैसे हैं (Āp kaise hain)
• I am fine thank you: मैं अच्छा हूँ धन्यवाद (Main acchā hūn dhanyavād)
• Sorry: माफ कीजिये (Māf kījiye)

WEATHER
• It is cold: यह ठंडा है (Yah ṭhanḍā hai)
• It is cool outside: बाहर ठंडा है (Bāhar ṭhanḍā hai)
• It is hot: यह गर्म है (Yah garm hai)
• It is raining: बारिश हो रही है (Bārish ho rahī hai)

GENERAL
• What is your name?: आपका नाम क्या है? (Āpkā nām kyā hai)
• My name is Ranjan: मेरा नाम रंजन है (Merā nām Ranjan hai)
• Where do you live?: आप कहाँ रहते हैं? (Āp kahān rahte hain)
• I live near Ghantaghar: मैं घंटाघर के पास रहता हूँ (Main ghanṭāghar ke pās rahtā hūn)
• How old are you?: आपकी आयु कितनी है? (Āpkī āyu kitnī hai?)
• That building is tall: वह भवन ऊँचा है (Vah bhavan ūnchā hai)
• She is beautiful: वह सुन्दर है (Vah sundar hai)
• I like Bengali sweets: मैं बंगाली मिठाई पसन्द करता हूँ (Main bengālī miṭhāī pasand kartā hūn)
• I love birds: मैं चिड़ियों को पसन्द करता हूँ (Main chiṛiyon ko pasand kartā hūn)
• Where is the Railway Station?: रेलवे स्टेशन कहाँ है (Relve steshan kahān hai)
• How far is the Bus Terminal from here?: यहाँ से बस टर्मिनल कितनी दूर है (Yahān se bas ṭarminal kitnī dūr hai)
• How long will it take to reach the airport?: हवाई अड्डे तक पहुँचने में कितना समय लगेगा (Hawāī aḍḍe tak pahunchne men kitnā samay lagegā)
• Is Mr. Raghunath there?: क्या श्रीमान् रघुनाथ वहाँ हैं (Kyā shrīmān Raghunāth wahān hain)
• Please tell him to call back as soon as he is free: जैसे ही उसे समय मिले कृपया फोन करने के लिये कहिये (Jaise hī use samay mile kṛipyā phone karne ke liye kahiye)
• How much will it cost?: इसकी कीमत कितनी होगी (Iskī kīmat kitnī hogī)
• Excuse me: मुझे क्षमा करें (Mujhe kshamā karen)
• From which platform can I get the train for Chandigarh?: मुझे किस प्लेटफार्म से चण्डीगढ़ के लिये ट्रेन मिल सकती है (Mujhe kis platform se Chanḍīgaṛh ke liye train mil saktī hai)
• Does this train stop at Aligarh?: क्या यह ट्रेन अलीगढ़ पर रुकती है? (Kyā yah train Alīgaṛh par ruktī hai)
• How many kids do you have?: आपके कितने बच्चे हैं (Āpke kitne bacce hain)
• The gift is wonderful: यह उपहार अद्भुत है (Yah uphār adbhut hai)
• It is really pretty: यह वास्तव में सुन्दर है (Yah vāstav men sundar hai)
• Food is delicious: भोजन स्वादिष्ट है (Bhojan swādiṣṭ hai)
• Congratulations: बधाई हो (Badhāī ho)
• You look lovely: आप अच्छे दिखते हैं (Āp acche dikhte hain)
• Wish you happy new year: आपको नये वर्ष की शुभकामनायें (Āpko naye varṣh kī shubhkāmnāyen)
• I wish you all the happiness: मैं आपके सुख की कामना करता हूँ (Main āpke sukh kī kāmnā kartā hūn)
• Congratulations on your marriage: आपके विवाह पर बधाई (Āpke vivāh par badhāī)
• Keep your eyes wide open before marriage and half shut afterwards: विवाह से पूर्व अपनी आँखें पूरी खुली और इसके बाद आधी बन्द रखें (Vivāh se pūrv apnī ānkhen pūrī khulī aur iske bād ādhī band rakhen)

(Courtesy: Prof. R.M.K. Sinha, IIT Kanpur; E-mail: [email protected]; Tel: 0512-597174)

3.2.2 Marathi Design Guide

Introduction

The information presented in this document is intended to assist in understanding the nature and problems of Marathi implementation in current and future products. It contains a generic description of Marathi.

Description of the Marathi Language

Marathi is the official language of the state of Maharashtra in India. It has derived its phonetic character set and its behaviour mainly from Sanskrit. Languages such as Telugu, Tamil, Kannada, Arabic, Persian, Hindi, Portuguese and English have influenced and enriched Marathi. It is written in the Devanagari script, from left to right and top to bottom, in the same manner as English. The major dialects of Marathi are Deshi, Vadadi and Nagpuri, as well as Konkani. The two major styles are Granthik and Bolbhasha.

History of Marathi Language

Marathi is a direct descendant of Sanskrit through Maharashtri Prakrit. It has been influenced and enriched by Telugu, Tamil, Kannada, Arabic, Persian, Hindi, Portuguese and English. The history of the language can be divided into Old Marathi (Islamic rule), Middle Marathi (Shivaji to the British advent) and Modern Marathi (British advent onwards).

Population using the Marathi Language

62,481,681 in India (1991 UBS), i.e. 7.5% of the Indian population; 65,000,000 in the world (1991 UBS).

Technical Characteristics

Marathi Alphabet Characteristics

The Marathi language, in common with Hindi, Konkani, Nepali, Sindhi and many other north Indian dialects, is written in the Devanagari script, which is also the accepted all-India script for Sanskrit.

The alphabet consists of 11 vowels and 36 consonants, as given below:

(a) Vowels

अ आ इ ई उ ऊ ऋ ए ऐ ओ औ
a ā i ī u ū ṛ e ai o au

(i) The vowel ऋ occurs only in Sanskrit and has been borrowed into Marathi. In Marathi ऋ is pronounced as 'r + u = ru'.

(ii) ं (anuswara) and ः (visarg) are often included in the list of vowel letters and are usually written as अं and अः.

Dependent Vowel Signs

Dependent vowel signs in Marathi are the same as in Hindi.

Consonant Letters

Marathi has the same consonant letters as Hindi. Two consonants are written slightly differently from Hindi: ळ (La) and श (sha). The other rules relating to consonants in Marathi are the same as in Hindi.

क (ka) ख (kha) ग (ga) घ (gha) ङ (ṅa)
च (ca) छ (cha) ज (ja) झ (jha) ञ (ña)
ट (ṭa) ठ (ṭha) ड (ḍa) ढ (ḍha) ण (ṇa)
त (ta) थ (tha) द (da) ध (dha) न (na)
प (pa) फ (pha) ब (ba) भ (bha) म (ma)
य (ya) र (ra) ल (la) व (va) श (śha)
ष (ṣa) स (sa) ह (ha) ळ (La)

Consonant Conjuncts

The conjunct consonant letters क्ष (ksha), ज्ञ (dnya), त्र (tra), श्र (shra) and द्य (dya) appear in Marathi also. Note that in Marathi ज्ञ is pronounced as dnya.

The common conjuncts in Marathi are the same as in Hindi.

Conjuncts involving initial r are written with a special superscript form of r, called rafaar in Marathi.

Halant, Anuswara, Avagraha & Visarga

The usage of these signs in Marathi is the same as in Hindi.

Numerals

In Devanagari, the numerals are written as
० १ २ ३ ४ ५ ६ ७ ८ ९
Other notations in use are the fraction names: पाव pāv (1/4), अर्धा ardhā (1/2), पाउण pāuṇ (3/4), सव्वा savvā (1¼), दीड dīḍ (1½) and अडीच aḍīc (2½).

Punctuation Marks (Virama)

In Marathi, sentences are concluded with a full stop, as in English. In old Marathi poetry the danda symbol is seen, e.g.

बोलाची कढी बोलाचा भात ।
जेवोनिया तृप्त कोण जाहला ॥

The rest of the punctuation marks in Marathi are the same as in Hindi.

Ancient Signs

ॐ (Om).

Formats of Units

Calendar

The new year of the calendar, as in the rest of India, begins on the 16th day of the month of Chaitra (चैत्र), mostly in March of the Christian year. The twelve months are as follows (pronunciations in parentheses):

(caitra) चैत्र, (vaiśākh) वैशाख, (jyeṣṭha) ज्येष्ठ, (āṣāḍh) आषाढ, (shrāvaṇ) श्रावण, (bhādrapad) भाद्रपद, (āśhvin) आश्विन, (kārtik) कार्तिक, (mārgaśīrṣa) मार्गशीर्ष, (pauṣa) पौष, (māgh) माघ, (phālgun) फाल्गुन

Week Days

The seven days of the week are named as follows:

रविवार Ravivār (Sunday)
सोमवार Somvār (Monday)
मंगळवार MangaLvār (Tuesday)
बुधवार Budhvār (Wednesday)
गुरूवार Gurūvār (Thursday)
शुक्रवार Śhukravār (Friday)
शनिवार Śhanivār (Saturday)

Dates

Dates in Marathi are the same as in Hindi.

Time

Time is divided into प्रहर (prahar), घडी (ghaḍī) and क्षण (kshaṇ).

A प्रहर (prahar) is 1/8 of a day and night, i.e. 3 hours.

A घडी (ghaḍī) is 1/60 of a day and night, i.e. 24 minutes.

However, the predominant practice is to use theinternational system of hour, minute and second.

Currency

The currency sign in Marathi is the same as in Hindi.

Weights and Measures

(a) The unit of weight is the शेर (śer), which is divided into sixteen parts called छटाक (chaṭāk), as follows:

4 छटाक = 1 पाव (quarter)
8 छटाक = अर्धा शेर (half śer)
16 छटाक = 1 शेर (śer)
40 शेर = 1 मण (maund)

(b) For weighing gold, silver etc. as well as medicines, the following weights are used:

8 रत्ती = 1 माशा
12 माशा = 1 तोळा
5 तोळा = 1 छटाक

(c) The unit for linear measurement is the गज (yard), as shown below:

12 इंच = 1 फूट (foot)
3 फूट = 1 गज
220 गज = 1 फर्लांग
8 फर्लांग = 1 मैल (mile)
1 हात = अर्धा गज (half yard)

(d) Land areas are measured in the following way:

144 वर्ग इंच = 1 वर्ग फूट (square foot)
9 वर्ग फूट = 1 वर्ग गज (square yard)
4840 वर्ग गज = 1 एकर (acre)

There is another unit of measure called गुंठा (Gunṭhā).

At present, India uses the metric system of weights and measures, described below.

(a) For measuring solids:

1 ग्रॅम = about 15.43 ग्रेन (grain) or 1 माशा
1000 ग्रॅम = 1 किलोग्रॅम (kg)
100 किलोग्रॅम = 1 क्विंटल (quintal)

(b) For measuring liquids:

1 गॅलन = about 4.5 लीटर (litres)
1000 लीटर = 1 किलोलीटर (kilolitre)

(c) For measuring length and area:

(i) 100 सेंटीमीटर = 1 मीटर (metre)
1000 मीटर = 1 किलोमीटर (kilometre)
8 किलोमीटर = about 5 मैल
1 मीटर = about 39.4 इंच (inches)

(ii) 1 हेक्टर = more than 2 एकर
100 हेक्टर = 1 वर्ग किलोमीटर (sq. km)

AM/PM markers

पहाट/सकाळ - dawn / morning
दुपार - noon
संध्याकाळ - evening
रात्र - night

Time Zone

आज - Today
उद्या - Tomorrow
काल - Yesterday
दिवस - Day
परवा - Day after tomorrow and day before yesterday
तेरवा - Two days after tomorrow and two days before yesterday
आठवडा - Week
पंधरवडा - Fortnight
महिना - Month
तिमाही - Quarterly/three-monthly
सहामाही - Half-yearly
वर्ष - Year
दशाब्दी - Decade
शताब्दी - Century
सहस्त्राब्दी - Millennium


Typical Colloquial Sentences in Marathi

GREETING
• Hello: नमस्कार (Namaskār)
• Good Morning: सुप्रभात (Suprabhāt)
• Good Afternoon: शुभ दुपार (Śubha dupār)
• Good Night: शुभरात्री (Śubharātrī)
• Good Bye: पुन्हा भेटू (Punhā bheṭū)
• Thanks: धन्यवाद (Dhanyavād)
• How are you: कसे आहात?/कसा आहेस?/कशी आहेस? (Kase āhāt?/ kasā āhes?/ kaśī āhes?)
• I am fine thank you: मी ठीक आहे धन्यवाद (Mī ṭhīk āhe dhanyavād)
• Sorry: मला खेद आहे (Malā khed āhe)

WEATHER
• It is cold: गारठा आहे (Gāraṭhā āhe)
• It is cool outside: बाहेर गारवा आहे (Bāher gāravā āhe)
• It is hot: उष्मा आहे (Uṣmā āhe)
• It is raining: पाऊस पडत आहे (Pāūs paḍat āhe)

GENERAL
• What is your name?: तुझे नाव काय आहे? (Tuze nāv kāy āhe?)
• My name is Ranjan: माझे नाव रंजन आहे (Māze nāv Ranjan āhe)
• Where do you live?: तू कोठे राहतोस? (Tū koṭhe rāhatos?)
• I live near Ghantaghar: मी घंटाघराजवळ राहतो (Mī ghanṭāgharājavaL rāhato)
• How old are you?: तुझे वय किती/ काय आहे? (Tuze vay kitī/ kāy āhe?)
• That building is tall: ती इमारत उंच आहे (Tī imārat uMca āhe)
• She is beautiful: ती देखणी/ सुंदर आहे (Tī dekhaṇī/ suMdar āhe)
• I like Bengali sweets: मला बंगाली मिठाई आवडते (Malā bangālī miṭhāī āvaḍate)
• I love birds: मला पक्षी आवडतात (Malā pakshī āvaḍatāt)
• Where is the Railway station?: रेल्वे स्टेशन कोठे आहे? (Railway station koṭhe āhe?)
• How far is the Bus Terminal from here?: बसस्टँड येथून किती दूर आहे? (Bus terminal yethūna kitī dūra āhe?)
• How long will it take to reach the Airport?: विमानतळापर्यंत पोहोचण्यासाठी किती वेळ लागेल? (VimānataLāparyaMta pohocaṇyāsāṭhī kitī veL lāgel?)
• Is Mr. Raghunath there?: श्री रघुनाथ आहेत का? (Shrī Raghunāth āhet kā?)
• Please tell him to call back as soon as he is free: जेव्हा तो मोकळा होईल तेव्हा कृपया त्याला पुन्हा फोन करायला सांगा (Jevhā to mokaLā hoīla tevhā kṛipayā tyālā punhā phon karāyalā sāMgā)
• How much will it cost?: याची किंमत काय असेल?/ हे कितीला पडेल? (Yācī kiMmat kāy asel?/ he kitīlā paḍel?)
• Excuse me: मला माफ कर/करा (Malā māph kar/karā)
• From which platform can I get the train for Chandigarh?: चंदीगढला जाणारी ट्रेन मला कोणत्या फलाटावर गाठता येईल? (CaMdīgaḍhalā jāṇārī tren malā koṇatyā phalāṭāvar gāṭhatā yeīl?)
• Does this train stop at Aligarh?: ही ट्रेन अलिगढला थांबते का? (Hī tren aligaḍhalā thāMbate kā?)
• How many kids do you have?: तुम्हाला मुले किती? (Tumhālā mule kitī?)
• This gift is wonderful: ही भेटवस्तू अनोखी आहे (Hī bheṭavastū anokhī āhe)
• It is really pretty: फारच छान (Phāraca chān)
• Food is delicious: खाद्यपदार्थ रुचकर आहे (Khādyapadārtha rucakar āhe)
• Congratulations: अभिनंदन (AbhinaMdan)
• You look lovely: तू सुरेख दिसतेस (Tū surekh disates)
• Wish you happy new year: तुला/ तुम्हाला नवीन वर्ष सुखाचे जावो ही सदिच्छा (Tulā/tumhālā navīn varṣh sukhāce jāvo hī sadicchā)
• I wish you all the happiness: मी तुझ्यासाठी सकल सुखांची कामना करतो (Mī tuzyāsāṭhī sakala sukhāMcī kāmanā karato)
• Congratulations on your marriage: विवाहाप्रीत्यर्थ तुझे अभिनंदन (Vivāhāprītyartha tuze abhinaMdan)
• Keep your eyes wide open before marriage and half shut afterwards: विवाहापूर्वी आपले डोळे पूर्णपणे उघडे ठेवावे आणि त्यानंतर अर्धे (Vivāhāpūrvī āpale ḍoLe pūrṇapaṇe ughaḍe ṭhevāve āṇi tyānaMtar ardhe)

(Courtesy: Prof. Pushpak Bhattacharya, IIT Mumbai; E-mail: [email protected]; Tel: 022-5767718)


3.2.3 Konkani Design Guide

History of Konkani

Konkani, like Hindi, is a direct descendant of Sanskrit through Prakrit and Apabhramsha. It has been influenced and enriched by Marathi, Kannada, Portuguese, Malayalam and English. These days there is strong interaction between Hindi and Konkani on account of the three-language formula in our education system, Hindi cinema, AIR and Doordarshan.

Geographical spread

Originally Konkani was the language of Goa and the Konkan (the western littoral). In early days, for economic and political reasons, some speakers of Konkani moved out of Goa towards the north and south in pursuit of employment, trade and commerce. Muslim conquest and rule was also responsible for their emigration to some extent. But emigration became intense on account of the terror of the Inquisition after the conquest of Goa by the Portuguese. As a result, the bulk of Konkani speakers are now in Goa, Maharashtra, Karnataka, Kerala and some parts of Tamil Nadu. Though it is spoken and used by heterogeneous social and religious groups, it has retained its structural unity. Approximately three million people speak Konkani.

Status of Konkani

The Sahitya Academy recognised it as a literary language in 1975. It became the official language of the state of Goa in 1987. It was included in the 8th Schedule of the Constitution of India in 1992 and got the status of a national language. Its corpus of three million entries was completed and handed over to the Department of Electronics, Government of India, in 1999. It will be put to use soon in collaboration with IIT, Bombay. After the liberation of Goa in 1961, Konkani literature has been enriched considerably. In Goa, Konkani can now be studied at primary, secondary, higher secondary, undergraduate and postgraduate levels. As a subject, Konkani can be offered at the Union Public Service examination as well as at the UGC examination. Unfortunately, the foundation courses of IGNOU, which are available in most of the national languages, are not available in Konkani, Manipuri and Nepali so far. Though the Devanagari script is recognised as the official script of Konkani, in some parts the Roman and Kannada scripts are still in vogue.

Technical characteristics

The Konkani alphabet scheme is almost the same as that of Hindi. It has consonant letters, vowel letters, diacritic marks, punctuation, numerals etc. Their pronunciation in Konkani is indicated below:

Vowels

अ, आ, इ, ई, उ, ऊ, ऋ, ए, ऐ, ओ, औ, अं, अः

Consonants (व्यंजन)

I. क ख ग घ ङ (ka kha ga gha ṅa)
II. च छ ज झ ञ (cha chha ja jha ña)
III. ट ठ ड ढ ण (ṭa ṭha ḍa ḍha ṇa)
IV. त थ द ध न (ta tha da dha na)
V. प फ ब भ म (pa pha ba bha ma)
VI. य र ल व (ya ra la va)
VII. श ष स ह (śa ṣa sa ha)

Konkani Alphabet

कोंकणी अक्षर माला (Konkani Akshar Mālā)

Vowels (स्वर - Svar)

अ (Short) A - 'a' in at, another
आ (Long) Ā - a in all, father, last
इ (Short) I - i in India, in, it, tip
ई (Long) Ī - i in machine, and ee in sweet, feet
उ (Short) U - u in pull, bull, full
ऊ (Long) Ū - oo in pool, fool, cool
ऋ (Short) Ṛ - r in rhythm
ए (Long) E - a in May, pay
ऐ (Diphthong) AI - i in time, fine
ओ (Long) O - o in go, tomb, note
औ (Diphthong) AU or AO - ou in house, mouse, out
अं AM (the nasal sign), Anuswara
अः AH Visarga

Vowel, Sign, Usage

अ (A): no sign
आ (Ā): ा, as in राम (RĀM)
इ (I): ि, as in शिव (ŚIV)
ई (Ī): ी, as in गीत (GĪT)
उ (U): ु, as in चुप (CHUP)
ऊ (Ū): ू, as in दूध (DŪDH)
ऋ (Ṛ): ृ, as in नृप (NṚP)
ए (E): े, as in केस (KES)
ऐ (AI): ै, as in कैद (KAID)
ओ (O): ो, as in लोक (LOK)
औ (AU): ौ, as in कौल (KAUL)
अं (AM): ं, as in कंस (KAMS)
अः (AH): ः, as in पुनः (PUNAH)

Consonants

Each row gives a consonant followed by its forms with the vowels A, Ā, I, Ī, U, Ū, Ṛ, E, AI, O, AU, AM and AH, in that order. A hyphen marks a form that the table omits.

KA: क का कि की कु कू कृ के कै को कौ कं कः
KHA: ख खा खि खी खु खू खृ खे खै खो खौ खं खः
GA: ग गा गि गी गु गू गृ गे गै गो गौ गं गः
GHA: घ घा घि घी घु घू घृ घे घै घो घौ घं घः
ṄA: ङ ङा ङि ङी ङु ङू ङृ ङे ङै ङो ङौ ङं ङः

CHA: च चा चि ची चु चू चृ चे चै चो चौ चं चः
CHHA: छ छा छि छी छु छू छृ छे छै छो छौ छं छः
JA: ज जा जि जी जु जू जृ जे जै जो जौ जं जः
JHA: झ झा झि झी झु झू झृ झे झै झो झौ झं झः
ÑA: ञ ञा ञि ञी ञु ञू ञृ ञे ञै ञो ञौ ञं ञः

ṬA: ट टा टि टी टु टू टृ टे टै टो टौ टं टः
ṬHA: ठ ठा ठि ठी ठु ठू ठृ ठे ठै ठो ठौ ठं ठः
ḌA: ड डा डि डी डु डू डृ डे डै डो डौ डं डः
ḌHA: ढ ढा ढि ढी ढु ढू ढृ ढे ढै ढो ढौ ढं ढः
ṆA: ण णा णि णी णु णू णृ णे णै णो णौ णं णः

TA: त ता ति ती तु तू तृ ते तै तो तौ तं तः
THA: थ था थि थी थु थू थृ थे थै थो थौ थं थः
DA: द दा दि दी दु दू दृ दे दै दो दौ दं दः
DHA: ध धा धि धी धु धू धृ धे धै धो धौ धं धः
NA: न ना नि नी नु नू नृ ने नै नो नौ नं नः

PA: प पा पि पी पु पू पृ पे पै पो पौ पं पः
PHA: फ फा फि फी फु फू फृ फे फै फो फौ फं फः
BA: ब बा बि बी बु बू बृ बे बै बो बौ बं बः
BHA: भ भा भि भी भु भू भृ भे भै भो भौ भं भः
MA: म मा मि मी मु मू मृ मे मै मो मौ मं मः

YA: य या यि यी यु यू यृ ये यै यो यौ यं यः
RA: र रा रि री रु रू - रे रै रो रौ रं रः
LA: ल ला लि ली लु लू - ले लै लो लौ लं लः
VA: व वा वि वी वु वू वृ वे वै वो वौ वं वः
ŚA: श शा शि शी शु शू शृ शे शै शो शौ शं शः

SHA: ष षा षि षी षु षू षृ षे षै षो षौ षं षः
SA: स सा सि सी सु सू सृ से सै सो सौ सं सः
HA: ह हा हि ही हु हू हृ हे है हो हौ हं हः
ḶA: ळ ळा ळि ळी ळु ळू - ळे ळै ळो ळौ ळं ळः
KSHA: क्ष क्षा क्षि क्षी क्षु क्षू - क्षे क्षै क्षो क्षौ क्षं क्षः

Consonant Letters

In Konkani, too, each consonant letter represents a single consonant sound with an inherent vowel, the short vowel /a/. There are 5 “Vargs” (groups) of consonants. Each Varg contains 5 consonants, the last of which is a nasal. In addition to these, there are 9 non-Varg consonants, as in Hindi. Consonant letters may also be rendered as half forms. A half form represents only the consonant sound and does not include the inherent vowel. Some Devanagari consonant letters have alternate presentation forms; the choice among these forms depends on the neighbouring consonants.

Independent Vowel Letters & Dependent Vowel Signs (Matras or Vowel-modifiers)

In the Indian scripts there are separate symbols for all the vowels that are pronounced independently. To indicate a vowel sound other than the implicit one, a vowel sign (matra) is attached to the consonant. Thus there is an equivalent matra for each vowel. The explicit appearance of a matra in a syllable overrides the inherent vowel. A matra can appear above, below, to the right or to the left of the consonant to which it is applied.

Consonant Conjuncts, Nukta & Anuswar

The consonant conjuncts, Nukta and Anuswar in Konkani are the same as in Hindi. Nukta is a diacritic mark used for deriving 5 other consonants in Devanagari. Anuswar indicates a nasal consonant sound. When an Anuswar comes before a consonant belonging to any of the 5 Vargs, it represents the nasal consonant belonging to that Varg. It represents a different nasal sound when placed with a non-Varg consonant.
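The Anuswar rule above can be sketched in a few lines of Python. This is only an illustration: the Varg table is the standard Devanagari grouping, and the function names `anusvara_nasal` and `explain` are hypothetical, not part of any existing library.

```python
# Assumed Varg table: the five standard Devanagari groups, each ending in its nasal.
VARGAS = ["कखगघङ", "चछजझञ", "टठडढण", "तथदधन", "पफबभम"]
ANUSVARA = "\u0902"  # DEVANAGARI SIGN ANUSVARA


def anusvara_nasal(next_consonant):
    """Return the nasal an Anuswar stands for before `next_consonant`.

    Returns None for non-Varg consonants, where the text only says that a
    "different nasal sound" is used.
    """
    for varga in VARGAS:
        if next_consonant in varga:
            return varga[-1]  # the last letter of each Varg is its nasal
    return None


def explain(word):
    """List (index, following consonant, implied nasal) for each Anuswar in `word`."""
    found = []
    for i, ch in enumerate(word):
        if ch == ANUSVARA and i + 1 < len(word):
            found.append((i, word[i + 1], anusvara_nasal(word[i + 1])))
    return found
```

For instance, `explain("थंडाय")` reports that the Anuswar stands before ड and therefore represents ण, the nasal of the same Varg.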

Chandrabindu, Avagrah & Visarg

These signs are generally not used in Konkani, although they are occasionally seen.

Numerals

The Konkani script uses the same numerals as Hindi, but international numerals are more widely used.

Punctuation Marks

The punctuation marks in Konkani are the same as in Hindi, except for the Viram Chinha: Konkani marks the end of a sentence with the English full stop.


Typical Colloquial Sentences in Konkani

GREETING

• Hello
हॅलो
hĕlō

• Good Morning
देव बरी सकाळ दींव
deva barī sakāḷa dīṁva

• Good Afternoon
देव बरी सांज दींव
deva barī sāñja dīṁva

• Good Night
देव बरी रात दींव
deva barī rāta dīṁva

• Good Bye
बरें, मेळूंया
barēṁ, mēḷūṁyā

• Thanks
देव बरें करूं
deva barēṁ karūṁ

• How Are You?
तूं कसो आसा, तूं कशें / कशी आसा ? तुमी कशें / कशी आसात
tūṁ kasō āsā, tūṁ kaśeṁ / kaśī āsā ? tumī kaśeṁ / kaśī āsāta

• I Am Fine. Thank You
हांव बरो आसां, देव बरें करूं
hāṁva barō āsāṁ, dēva barēṁ karūṁ

• Sorry
माफ कर / करा
māpha kara / karā

WEATHER

• It Is Cool
थंडाय आसा
thaṇḍāya āsā

• It Is Cool Outside
भायर थंडाय आसा
bhāyara thaṇḍāya āsā

• It Is Hot
गरमी आसा (जाता)
garamī āsā (jātā)

• It Is Raining
पावस पडटा
pāvasa paḍaṭā

GENERAL

• What Is Your Name?
तुजें / तुमचें नांव किदें ?
tujēṁ / tumacēṁ nāṁva kidēṁ ?

• My Name Is Ranjan
म्हजें नांव रंजन
mhajēṁ nāṁva rañjana

• Where Do You Live ?
तूं खंय रावता
tūṁ khaṁya rāvatā

• I Live Near Ghantaghar
हांव घंटाघरा लागसर रावतां
hāṁva ghaṇṭāgharā lāgasara rāvatāṁ

• How Old Are You ?
तुजी पिराय किदें (कितली)
tujī pirāya kidēṁ (kitalī)

• That Building Is Tall
ती इमारत ऊंच आसा
tī imārata ūṁca āsā

• She Is Beautiful
तें / ती सुंदर आसा
tēṁ / tī sundara āsā

• I Like Bengali Sweets
म्हाका बंगली गोडशीं आवडतात / बरी लागतात
mhākā baṅgalī gōḍaśīṁ āvaḍatāta / barī lāgatāta

• I Love Birds
म्हाका सवणीं आवडतात
mhākā savaṇīṁ āvaḍatāta

• Where Is The Railway Station ?
रेल्वे स्टेशन खंय आसा
rēlvē sṭēśana khaṁya āsā


• How Far Is The Bus Terminal From Here ?
हांगासून बसस्टँड कितलो पैस आसा
hāṅgāsūna basasṭam̐ḍa kitalō paisa āsā

• How Long Does It Take To Reach The Airport ?
विमानतळार पावपाक कितलो वेळ लागता ?
vimānataḷāra pāvapāka kitalō vēḷa lāgatā ?

• Is Mr. Raghunath Here ?
श्री रघुनाथ हांगा आसा ?
śrī raghunātha hāṅgā āsā ?

• Please Tell Him To Call Back As Soon As He Is Free
(उपकार करून) तो मेक़ळो जाल्याबराबर ताका म्हजेकडेन उलोंवक सांग
(upakāra karūna) tō mēqaḷō jālyābarābara tākā mhajēkaḍēna uloṁvaka sāṅga

• How Much Will It Cost ?
ताका कितले पडटले ? ताका कितलो खर्च येतलो
tākā kitalē paḍaṭalē ? tākā kitalo kharca yētalo

• Excuse Me
मखलाशी कर
makhalāśī kara

• From Which Platform Can I Get The Train To Chandigarh?
चंदीगडाक वचपाची गाडी खंयच्या प्लॅटफार्मार (फलाटार) मेळटली
candīgaḍāka vacapācī gāḍī khaṁyacyā plĕṭaphārmāra (phalāṭāra) mēḷaṭalī

• Does This Train Stop At Aligarh ?
ही आगगाडी अलीगडाक थांबता ?
hī āgagāḍī alīgaḍāka thāmbatā ?

• How Many Kids Do You Have ?
तुका कितलीं भुरगीं ? (आसात)
tukā kitalīṁ bhuragīṁ ? (āsāta)

• This Gift Is Wonderful
ही भेंटवस्त सोबीत आसा
hī bhēṇṭavasta sobīta āsā

• It Is Really Pretty
ती / तें खरीच / खरेंच सुंदर आसा
tī / teṁ kharīca / kharēṁca sundara āsā

• Food Is Delicious
जेवण सुवादीक आसा
jēvaṇa suvādīka āsā

• Congratulations
परबीं
parabīṁ

• You Look Lovely !
तूं बरो दिसता आं !
tūṁ barō disatā āṁ !

• Wish You Happy New Year
तुका नवें वर्स सुखदिणें जांव
tukā naveṁ varsa sukhadiṇēṁ jāṁva

• I Wish You All The Happiness
हांव तुका / तुमकां सगळी सुखां आंवडेंतां
hāṁva tukā / tumakāṁ sagaḷī sukhāṁ āṁvaḍēntāṁ

• Congratulations On Your Marriage
तुका / तुमकां लग्नाची परबीं
tukā / tumakāṁ lagnācī parabīṁ

• Keep Your Eyes Wide Open Before Marriage And Half Shut Afterwards
लग्नाआदीं तुजे दोळे पुराय उक्ते दवर आनी उपरांत अर्दे धांपील्ले दवर
lagnāādīṁ tuje doḷe purāya ukte davara ānī uparānta arde dhāmpīlle davara

(Courtesy : Prof. K.J. Mahale, Former Vice-Chancellor of Manipur University

E-mail : [email protected]

Tel : 0832-239250)


3.2.4 Sindhi Design Guide

History of the Sindhi Language

Sindhi belongs to the North-Western group of Indo-Aryan languages. It is one of the major literary languages of the Indo-Pakistan subcontinent. It has its origin in an old Indo-Aryan dialect, or Primary Prakrit, spoken in the region of Sindh at the time of the compilation of the Vedas or perhaps some centuries before that. Glimpses of that dialect can be seen to some extent in the literary language of the hymns of the Rig-Veda. Sindhi, like other languages of this family, has passed through the Old Indo-Aryan (i.e. Sanskrit) and Middle Indo-Aryan (i.e. Pali, Secondary Prakrits and Apabhramsha) stages of growth, and entered the New Indo-Aryan stage around the tenth century A.D. As the Sindh region is situated near the north-western borders of undivided India, it suffered frequent invasions. It remained under Muslim rule for more than eleven hundred years. Hence, Sindhi borrowed comparatively more Arabic and Persian words. In spite of this, the basic vocabulary and grammatical structure of Sindhi have remained mostly unchanged.

Population using the Sindhi language

The Sindhi language is predominantly spoken in Sindh, the region which became a part of Pakistan after the partition of India in 1947. As a result of the partition, about 1.2 million Sindhi-speaking Hindus, compelled by the socio-political crisis of that period, migrated to India. Sindhis in India have no particular linguistic state, but considering their justified demand, the Sindhi language was recognised in the VIII Schedule of the Constitution of India on April 10, 1967. The Sindhi-speaking people are spread throughout India, with the main concentrations in the cities and towns of Gujarat (Ahmedabad and Vadodara), Maharashtra (Mumbai, Ulhasnagar and Pune), Rajasthan (Ajmer, Jaipur, Jodhpur and Udaipur), Uttar Pradesh (Agra, Kanpur, Lucknow and Varanasi), Madhya Pradesh (Bhopal, Indore, Gwalior) and in Delhi. The Sindhi language has shown satisfactory growth in India as well as in Pakistan in the educational, literary and cultural fields. It is also used as an official language in Sindh (Pakistan). According to the 1981 Census Report of Pakistan, about fifteen million people declared Sindhi as their mother tongue. According to the 1991 Census Report of India, on the other hand, about 2.2 million Sindhi-speaking people reside in different provinces of the country. Most of the Sindhi Hindus belong to the business community. Hence, about two million Sindhis are permanently settled in other countries of the world.

Scripts used for writing Sindhi

The Sindhi language is written mainly in two scripts, viz. Devanagari-Sindhi and Arabic-Sindhi. The Devanagari-Sindhi script is based upon the Devanagari writing system used for Sanskrit and Hindi, with four additional characters, ॻ, ॼ, ॾ and ॿ, representing the implosive sounds of Sindhi. The Arabic-Sindhi script, on the other hand, is based upon the Arabic writing system. It consists of 52 characters, standardized by the British government in 1853 through additions to and modifications of the basic 28 characters used for the Arabic language.

Apart from these, the Sindhi language has its own indigenous old script called “Sindhi”, which has its origin in the Proto-Nagari, Brahmi and Indus Valley scripts. Its use, however, is now restricted to commercial correspondence by some traders and to the religious scriptures of the Ismaili Khoja Muslims of Sindh. Considering the present socio-cultural situation of Sindhis in India, the Devanagari-Sindhi script is being increasingly used to preserve and promote their literary and cultural heritage.

Devanagari-Sindhi Alphabet

(1) Vowels :

अ आ इ ई उ ऊ ऋ ए ऐ ओ औ
a ā i ī u ū ṛ e ai o au

Note: ऋ is used for writing Sanskrit words inherited by Sindhi in tatsam form.

(2) Consonants :

क क़ ख ख़ ग ॻ ग़ घ ङ
ka qa kha ḵẖa ga g̱a ġa gha ṅa

च छ ज ॼ ज़ झ ञ
ca cha ja j̱a za jha ña

ट ठ ड ॾ ड़ ढ ढ़ ण
ṭa ṭha ḍa ḏa ṛa ḍha ṛha ṇa

त थ द ध न
ta tha da dha na

प फ फ़ ब ॿ भ म
pa pha fa ba ḇa bha ma

य र ल व
ya ra la va

श ष स ह
śa ṣa sa ha

Special Conjunct consonants

क्ष त्र ज्ञ श्र

Ancient Sign (om)

ॐ

Notes :-

(1) Implosive consonants are written by putting a line below (अधोरेखा) the corresponding explosive consonant.

ॻ ॼ ॾ ॿ
g̱a j̱a ḏa ḇa

(2) क़ (qa), ख़ (ḵẖa), ग़ (ġa), ज़ (za) and फ़ (fa) represent consonants borrowed by Sindhi from Arabic, Persian or English.

(3) ड़ (ṛa) represents a retroflex consonant, while ढ़ (ṛha) represents an aspirated retroflex consonant.

(4) ष (ṣa) is used only in Sanskrit words inherited by Sindhi.

(5) क्ष (kṣa), त्र (tra), ज्ञ (gya/jña) and श्र (śra) are special characters representing certain conjunct consonants.

(6) ॐ is an ancient special character inherited by Sindhi from Sanskrit.

(7) The Devanagari-Sindhi writing system follows the Hindi method of writing in all other respects, such as vowel signs attached to consonants, diacritic marks, punctuation, numerals, conjunct consonants, the Halant marker, anuswara, etc.

(Courtesy : Dr. M.K. Jetley, Linguist & SindhiExpert, New Delhi Tel : 011-2146121)
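For software that stores Devanagari-Sindhi text, note (1) can be expressed as a lookup from each explosive consonant to its implosive counterpart. The sketch below assumes the text is encoded in Unicode, where the four implosive letters have their own code points; the names `IMPLOSIVE_OF` and `to_implosive` are illustrative only.

```python
import unicodedata

# Explosive consonant -> implosive letter (Unicode Devanagari block).
IMPLOSIVE_OF = {
    "ग": "\u097B",  # ॻ DEVANAGARI LETTER GGA
    "ज": "\u097C",  # ॼ DEVANAGARI LETTER JJA
    "ड": "\u097E",  # ॾ DEVANAGARI LETTER DDDA
    "ब": "\u097F",  # ॿ DEVANAGARI LETTER BBA
}


def to_implosive(consonant):
    """Return the implosive form of `consonant`, or the consonant unchanged
    when Sindhi has no implosive counterpart for it."""
    return IMPLOSIVE_OF.get(consonant, consonant)


def is_implosive(ch):
    """True if `ch` is one of the four Sindhi implosive letters."""
    return ch in IMPLOSIVE_OF.values()


if __name__ == "__main__":
    for plain, implosive in IMPLOSIVE_OF.items():
        print(plain, "->", implosive, unicodedata.name(implosive))
```

A legacy font that draws the underline as a separate glyph would need its own mapping table; that case is not covered here.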

Typical Colloquial Sentences in Sindhi

GREETING

• Hello
हरे राम
Hare Rām

• Good Morning
चवो झुलेलाल
Cavo Jhulelāl

• Good Afternoon
चवो झुलेलाल
Cavo Jhulelāl

• Good Night
चवो झुलेलाल
Cavo Jhulelāl

• Good Bye
चंगो हल्ला थी
Cango hallān thī

• Thanks
मेहरबानी
Meharbānī

• How are you?
तव्हाँ कींअ आहियो ? / तव्हांजो कहिड़ो हाल आहे ।
Tanwhā kīn āhiyo ? / Tanwhājo kahiṛo hāl āhe

• I am fine, thank you
मा ठीक आहियाँ शुक्र आहे
Mā ṭhīk āhiyān śukra āhe

• Sorry
माफ कजो
Māph kajo

WEATHER

• It is cold
थधि आहे ।
Thadhi āhe

• It is cool outside
बाहिरि थधि आहे ।
Bāhiri thadhi āhe

• It is hot
गरमी आहे ।
Garmī āhe

• It is raining
बरसात थी पवे । / बरसात अची रही आहे ।
Barsāt thī pawe / Barsāt acī rahī āhe

GENERAL

• What is your name?
तव्हाँजो नालो छा आहे ?
Tanwhājo nālo chā āhe

• My name is Ranjan
मुंहिजो नालो रंजन आहे ।
Muhinjo nālo Ranjan āhe

• Where do you live?
तव्हाँ किथे रहंदा आहियो ।
Tanwhā kithe rahandā āhiyo

• I live near Ghantaghar
मां घंटाघर जे पासे रहंदो आहियाँ ।
Mā Ghanṭāghar je pāse rahando āhiyān

• How old are you?
तव्हाँजी उमिरि कतिरि आहे ?
Tanwhājī umiri katiri āhe

• That building is tall
हू इमारत डिघी आहे ।
Hū imārat ḍighī āhe

• She is beautiful
हूअ छोकिरि सुहिणि आहे ।
Hū chokiri suhiṇi āhe

• I like Bengali sweets
मुखे बंगाली मिठाई वणंदी आहे ।
Mukhe Bengālī miṭhāī vaṇandī āhe

• I love birds
मुखे पखियुनि सां प्यार आहे ।
Mukhe pakhiyuni sān pyār āhe

• Where is the Railway Station?
रेलवे स्टेशन किथे आहे ।
Railway Station kithe āhe

• How far is the Bus Terminal from here?
बस टर्मिनल हितां केतिरो परे आहे ।
Bus Terminal hitān ketiro pare āhe

• How long will it take to reach the airport?
हवाई अड्डे पहुंचण में केतिरो वक्तु लगंदो ?
Hawāī aḍḍe pahuncaṇ men ketiro waqtu lagando

• Is Mr. Raghunath there?
छा मि. रघुनाथ हिते आहे ?
Chā Mr. Raghunāth hite āhe

• Please tell him to call back as soon as he is free
जींअ हू वांदो थिर हुनखे वापस फ़ोन करण लाइ चइजो ।
Jīn hū vāndo thir hunkhe vāpas phone karaṇ lāi caijo

• How much will it cost?
हिन में केतिरो पैसा लगुंदा / ही घड़े आहे
Hin men ketiro paisā lagundā / hī ghaṛe āhe

• Excuse me
मुखे माफ कजो ।
Mukhe māph kajo

• From which platform can I get the train for Chandigarh?
चंडीगढ लाइ गाॾी मुंखे कहिड़े प्लेटफ़ार्म तां मिलदी
Chanḍīgaṛh lāi gāḏī mukhe kahiṛe platform tān miladī

• Does this train stop at Aligarh?
छा हीअ गाॾी अलीगढ ते बीहंदी ?
Chā hī gāḏī Alīgaṛh te bīhandī

• How many kids do you have?
तव्हांखे केतिरा ॿार आहिनि ?
Tanwhākhe ketirā ḇār āhini

• The gift is wonderful
हीअ सूखिड़ी तमाम सुठी आहे ।
Hī sūkhiṛī tamām suṭhī āhe

• It is really pretty
हीअ शइ तमाम सुहिड़ी आहे ।
Hī śai tamām suhiṛī āhe

• Food is delicious
खाधो तमामु स्वादिष्ट आहे ।
Khādho tamāmu swādiṣṭ āhe

• Congratulations
वाधांयू
Vādhānyū

• You look lovely
तव्हां तमामु सुहिणा लगी रहिया आहिय । / तू सुहिणी थी लगी ।
Tanwhā tamāmu suhiṇā lagī rahiyā āhiya / Tū suhiṇī thī lagī

• Wish you happy new year
नए सल जूं वधायूं ।
Nae sal jūn vadhāyūn

• I wish you all the happiness
ईश्वर शल तव्हांखे खुश रखे ।
Īśwar śal tanwhākhe khuśa rakhe

• Congratulations on your marriage
शादीअ जूं वधायूं
Śādī jūn vadhāyūn

• Keep your eyes wide open before marriage and half shut afterwards
शादीअ खां पहिरि पूरी एं शादीअ खां पोइ अधु अखियूं खुलियल रखो ।
Śādī khān pahiri pūrī, en śādī khān poi adhu akhiyūn khuliyal rakho

(Courtesy : Ms Jyoti Arora, MC&IT, New Delhi

E-mail : [email protected] .in Tel : 4301878)

3.2.5 Nepali Design Guide

Introduction (PARICHAYA)

Nepal, cut by the 28-degree latitude (अक्षांश) south of the main Himalayan ridge, is an ancient country; the earliest reference to Nepal is found in the Arthashastra of Kautilya. It is a land-locked country, the nearest seacoast being about 700 miles from its border. It borders India in the east, south and west, and China in the north. The area of Nepal is 147,181 sq. kilometres. Nepalese time is 5 hours 45 minutes ahead of GMT and 15 minutes ahead of IST. It has a population of nearly 25 million people. Nepal is divided into 14 Anchals (zones) and 75 districts.
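For localisation work, the time offsets quoted above can be written down with Python's standard `datetime` module. This is a minimal sketch using fixed offsets; the labels "NPT" and "IST" and the function name `to_nepal_time` are assumptions made for the example.

```python
from datetime import datetime, timedelta, timezone

# Fixed offsets taken from the text: Nepal is GMT+5:45; IST is GMT+5:30.
NPT = timezone(timedelta(hours=5, minutes=45), "NPT")
IST = timezone(timedelta(hours=5, minutes=30), "IST")


def to_nepal_time(moment):
    """Convert a timezone-aware datetime to Nepalese time."""
    return moment.astimezone(NPT)


if __name__ == "__main__":
    noon_gmt = datetime(2002, 1, 1, 12, 0, tzinfo=timezone.utc)
    print(to_nepal_time(noon_gmt))      # the same instant on a Nepalese clock
    print(noon_gmt.astimezone(IST))     # the same instant on an Indian clock
```

The 15-minute difference from IST falls out of the two offsets rather than being stored separately.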

As befits a Himalayan kingdom, the altitude varies from 70 metres to 8,848 metres. The Himalayan region in the north lies at an altitude between 16,000 and 29,000 feet. It is in this region that the world-famous peaks of Mount Everest (SAGARMATHAA, as the Nepalese call it), Kanchanjunga, Makalu, Muktinath, Dhaulagiri, Annapurna and Ganesh Himal are situated. The average length of the country, from the Mechi River in the east to the Mahakali River in the west, is 550 miles. The width varies from 150 miles in the eastern sector to about 90 miles in the western region.

The foundation of the modern state of Nepal was laid by the king of Gorkha, PRITHVINARAYAN SHAH, in A.D. 1769. The present Shah dynasty of Nepal ascended the throne in 1768. Presently, Kathmandu is the capital of Nepal.

It is the only independent Hindu kingdom in the world and lies at the foot of the mighty Himalayas. This ancient kingdom, which finds mention in the Kautilya Arthashastra (supposed to have been written in the 3rd century B.C.) and in the Skanda Purana, is inhabited by people of different castes and creeds, speaking different languages such as Nepali, Newari, Bhutia, Tamang, Lepcha, Magar, Maithili and Hindi. The Bhutias, Tamangs, Limbus, Rais and Sherpas, who have earned international fame for their mountaineering adventures, live in the northern and eastern parts of the land, whereas the Newars are settled in the centre and the Magars, the Kiratis and the Gurungs are settled along the Mahabharat ranges.

In the terai region of the land live the Tharus, Maithils, Dhunals, Jaisis, the Kshatriyas and the Brahmins. All these people have their own traditions, customs, colourful costumes, dialects and ways of living.

The main religions are Hinduism and Buddhism. 90% of the population are Hindus and 8.5% are Buddhists. The rest belong to other religions such as Islam and Sikhism. All religions in this land have flourished side by side, and one can see the minarets of mosques, gurdwaras and other places of worship. The temple of Pashupatinath in Kathmandu occupies an exceptional place in the cultural history of Indo-Nepal relations. Similarly, the stupa of Swayambhunath in Kathmandu is a famous place of pilgrimage for Buddhists from all over the world.

Nepal is now a multi-party democracy with a constitutional monarchy; the new democratic constitution of the kingdom of Nepal was promulgated on November 9, 1990.

Language (BHĀSHĀ)

Nepali has been the official language of Nepal since A.D. 1768 and is written in the DEVANAGARI script. Nepali was declared a constitutional language of India in 1992.

The Nepali language was formerly called ‘KHĀSKURĀ’, ‘PARBATIĀ’ and ‘GORKHĀLI’. It is now the national language and ‘lingua franca’ of Nepal. It belongs to the Indo-Aryan group of languages. As Nepal is the permanent home of many castes and tribes, there are also people living in Nepal who speak Tibeto-Burman dialects. The Nepali language is enriched by Arabic, Chinese, Japanese, French, Persian, Portuguese, Turkish and English terms, in addition to the words used by many tribes. Being a direct descendant of Sanskrit, it has many tatsam words and shows the influence of Sanskrit.

Population using Nepali Language

The Nepali language has the privilege of being the ‘lingua franca’ in the northern districts of West Bengal and in Sikkim in India. In addition, it is widely spoken in many pockets of Arunachal Pradesh, Assam, Meghalaya, Himachal Pradesh, Madhya Pradesh, Uttar Pradesh, Uttaranchal and Punjab. It has been an official language in the district of Darjeeling (W.B.) since 1961 and in Sikkim since 1974, i.e. ever since its amalgamation with India. It now holds a prestigious place in the 8th Schedule of the Indian Constitution. In addition, the language is extensively spoken in the southern part of Bhutan. Many people living in Myanmar, Australia, the UK, the USA, Singapore and Hong Kong also speak Nepali.

Census data of 1991 (Census of India) show that it is one of the three main languages in some of the states, namely Arunachal Pradesh (number of speakers 81,176; percentage 9.4) and Sikkim (number of speakers 2,56,418; percentage 63.1). The number of Nepali language speakers per 10,000 persons stands at 25 as per the Census of India, 1991.

Technical characteristics (PRABIDHIK SWAROOP)

Nepali Alphabet Characteristics

The Nepali language is written in the Devanagari script, which is also the script for Sanskrit. The alphabet consists of 13 vowels and 36 consonants, as given below:

(a) Vowels (स्वर)

अ आ इ ई उ ऊ ऋ ए ऐ ओ औ अं अः
a ā i ī u ū ṛ e ai o au aṁ aḥ

(i) The vowel ऋ occurs only in Sanskrit words borrowed into Nepali, as in Hindi.

(ii) ◌ं (anuswara) and ◌ः (visarg) are often included in the list of vowel letters and are usually written as अं and अः, but as far as Nepali is concerned, they are mostly used with consonants.

(iii) For all practical purposes, अ-आ, इ-ई and उ-ऊ may be regarded as pairs of short and long vowels. ए-ऐ and ओ-औ are all long vowels.

Dependent Vowel Signs (मात्रा Mātrās)

To indicate a vowel sound other than the implicit one, a vowel sign is attached to the consonant. Thus, there is an equivalent vowel sign for each vowel. The explicit appearance of a vowel sign in a syllable overrides the inherent vowel. A vowel sign can appear above, below, to the right or to the left of the consonant to which it is applied. The vowel signs mostly come after the consonant letters, as below:


आ = ◌ा, इ = ◌ि, ई = ◌ी, उ = ◌ु, ऊ = ◌ू, ऋ = ◌ृ, ए = ◌े, ऐ = ◌ै, ओ = ◌ो, औ = ◌ौ

अ (a) has no matra. The matras ◌ा (आ), ◌ी (ई), ◌ो (ओ) and ◌ौ (औ) are written after the consonant, whereas ◌ि (इ) is written before it; ◌ु (उ), ◌ू (ऊ) and ◌ृ (ऋ) are written below, and ◌े (ए) and ◌ै (ऐ) are written above. Thus:

क् + आ = का      क् + ऋ = कृ
क् + इ = कि      क् + ए = के
क् + ई = की      क् + ऐ = कै
क् + उ = कु      क् + ओ = को
क् + ऊ = कू      क् + औ = कौ

With र् (r), the उ and ऊ matras are written in an exceptional form, i.e. र् + उ = रु and र् + ऊ = रू.

It may be noted that the matra is tagged on to the consonant letter and is never written in full. Thus, क् + इ (k + i) will not be written as कइ but as कि; likewise क् + उ = कु is the correct form with the matra, and not कउ. In the forms कइ (kai) and कउ (kau), इ and उ are vowels in full form, not matras.
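The consonant-plus-matra rule can be illustrated with a short Python sketch. It assumes Unicode Devanagari, where a matra is always stored after its consonant in memory (even ◌ि, which is drawn to the left) and where the special shapes of रु and रू are produced by the font. The names `MATRA` and `attach` are hypothetical.

```python
VIRAMA = "\u094D"  # halant: marks a consonant without its inherent vowel

# Independent vowel -> dependent vowel sign (matra); अ has no matra.
MATRA = {
    "अ": "", "आ": "\u093E", "इ": "\u093F", "ई": "\u0940",
    "उ": "\u0941", "ऊ": "\u0942", "ऋ": "\u0943", "ए": "\u0947",
    "ऐ": "\u0948", "ओ": "\u094B", "औ": "\u094C",
}


def attach(consonant, vowel):
    """Combine a consonant (with or without halant) and a vowel into one syllable."""
    base = consonant.rstrip(VIRAMA)   # drop the halant if present
    return base + MATRA[vowel]        # the matra replaces the inherent /a/


def barakhadi(consonant):
    """Return the row of syllables for one consonant, in vowel order."""
    return [attach(consonant, vowel) for vowel in MATRA]


if __name__ == "__main__":
    print(" ".join(barakhadi("क्")))
    print(attach("र्", "उ"), attach("र्", "ऊ"))
```

Writing the full vowel letter after the consonant (कइ) is a different string from the one `attach` returns (कि), which mirrors the distinction drawn in the text.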

(b) Consonant Letters (व्यंजन वर्ण VYANJAN VARṆA)

क (ka) ख (kha) ग (ga) घ (gha) ङ (ṅa)
च (ca) छ (cha) ज (ja) झ (jha) ञ (ña)
ट (ṭa) ठ (ṭha) ड (ḍa) ढ (ḍha) ण (ṇa)
त (ta) थ (tha) द (da) ध (dha) न (na)
प (pa) फ (pha) ब (ba) भ (bha) म (ma)
य (ya) र (ra) ल (la) व (va) श (śa)
ष (ṣa) स (sa) ह (ha)

The following points are to be noted:

(i) अ (a) is inherent in each consonant letter.

(ii) ष (ṣa) occurs only in Sanskrit words borrowed into Nepali.

(iii) ङ (ṅa), ञ (ña) and ण (ṇa) never occur at the beginning of a word; ङ and ञ never occur independently. They are always combined with a following consonant.

The first twenty-five consonants, i.e. (ka) to (ma), are divided into five categories (Vargas), as in Hindi. The rest of the consonants can be placed in an unmarked category.

Consonant Conjuncts (VYANJAN SANDHI)

The device of conjoining consonant letters was used in writing Sanskrit to indicate the pronunciation of consonants without an intervening inherent /a/. The traditional conjunct consonant letters are क्ष (kṣa), ज्ञ (jña), त्र (tra), श्र (śra) and द्य (dya). It is to be noted that in Hindi ज्ञ is pronounced as ग्य (gya). Traditional conjunct consonant letters are very common in Sanskrit loanwords.

The common conjuncts and the rules for conjunct formation in Nepali are the same as in Hindi.

Halant, Nukta, Visarga, Avagraha & Anuswara

The use of Halant and Visarga is the same in Nepali as in Hindi. Nuktā (तल लेखिने थोप्लो) is not used in standard Nepali in Nepal; however, it is found in some Nepali books published in India.

Anuswāra (◌ं) indicates a nasal sound. It is a ‘homorganic’ nasal representing ङ्, ञ्, ण्, न् and म्, each belonging to one of the five consonant categories. When a nasal consonant precedes a consonant of the same category, it is written as an anuswāra. For example, संसार (saNsār), वंश (vaNsh) etc.

Here ‘N’ is the sign of the anuswāra. It is to be noted that in Nepali the anuswāra (◌ं) is not usually used as a homorganic nasal for the nasal consonants ङ्, ञ्, ण्, न्, म्. These nasal consonants (अनुनासिक व्यंजन) are written out themselves. The other properties of the anuswāra in Nepali are the same as in Hindi.

Anunasik

The superscript sign (अक्षरमाथि हालिने चिह्न) ‘chandrabindu’ (◌ँ) represents the anunāsik (अनुनासिक) sound. It is placed above a vowel to denote nasalisation of that vowel. E.g.

हाँ - आँखा

When a consonant has a vowel sign (mātrā) above its headline, the chandrabindu (◌ँ) is written as an anuswāra, although this anuswāra still represents the anunāsik sound, i.e. the chandrabindu.
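The choice between chandrabindu and anuswāra for a nasalised vowel can be sketched as follows. The set of matras treated as rising "above the headline" is an assumption made for this example (◌ि ◌ी ◌े ◌ै ◌ो ◌ौ); the text itself does not enumerate them, and the function names are illustrative.

```python
CHANDRABINDU = "\u0901"
ANUSVARA = "\u0902"

# Assumption: matras that have a stroke above the headline.
ABOVE_HEADLINE = set("\u093F\u0940\u0947\u0948\u094B\u094C")


def nasal_sign(syllable):
    """Pick the sign used to mark vowel nasality on `syllable`.

    `syllable` is given without any nasal sign, e.g. "हा" or "मै".
    """
    crowded = any(ch in ABOVE_HEADLINE for ch in syllable)
    return ANUSVARA if crowded else CHANDRABINDU


def nasalise(syllable):
    """Return the syllable with the appropriate nasal sign appended."""
    return syllable + nasal_sign(syllable)


if __name__ == "__main__":
    print(nasalise("हा"))   # no matra above the headline: chandrabindu
    print(nasalise("मै"))   # matra above the headline: written with anuswāra
```

In both cases the sound is the same anunāsik; only the written sign changes.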

Numerals (SANKHYA BODHAK)

Nepali uses the same numerals as Hindi. In some places the use of Roman numerals is also found. Nepali also uses the fraction words चौथो chautho (1/4), आधा ādhā (1/2), पौने paune (3/4), सवा savā (1¼), डेढ ḍeḍh (1½) and अढाई aḍhāī (2½).

Punctuation Marks (Virama)

In Nepali, sentences are concluded with the vertical mark (।) called purn viram. The vertical stroke is also used for marking the end of the first hemistich, i.e. half verse. For marking the end of the verse itself, two vertical strokes (॥) called deergh viram may be used.

सगर भरी नौलाखे तारा म गन्न सक्तिन ।
पेटको कुरा सामुन्ने आउँछ म भन्न सक्तिन ॥

The rest of the punctuation marks in Nepali are the same as in Hindi / Roman.

Ancient Signs (PRACHIN CHINHA)

ॐ (Om), the Swasti sign and श्री (Shri) are mostly used in Nepali as in Hindi and Sanskrit.

Formats of Units

Calendar

The era prevalent in the Nepali-speaking area is the ‘Vikram Samvat’, started by King Vikram. It differs from the Christian era by +57 years. The new year of the Nepalese calendar begins on the 1st day of the month BAISAKH (बैशाख), i.e. on the 14th day of April of the Christian year. The twelve months of the Vikram Samvat are named as follows in Nepali (the Sanskrit month names are given in brackets):

चैत cait (चैत्र), बैशाख baiśākh (वैशाख), जेठ jeṭh (ज्येष्ठ), अषाढ aṣāḍh (आषाढ़), साउन sāun (श्रावण), भदौ bhadau (भाद्रपद), असोज asoj (आश्विन), कात्तिक kāttik (कार्तिक), मङ्सिर maṅsir/agahan (मार्गशीर्ष), पूस pūs (पौष), माघ māgh (माघ), फागुन phāgun (फाल्गुन)
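A rough conversion from a Gregorian date to the Vikram Samvat year follows from the two facts above. The sketch assumes the new year always falls on 14 April, which is an approximation, and it infers that dates before the new year differ by 56 rather than 57; the function name `vikram_samvat_year` is hypothetical.

```python
from datetime import date


def vikram_samvat_year(day, new_year=(4, 14)):
    """Approximate Vikram Samvat year for a Gregorian date.

    Assumption: the Nepali new year (1 Baisakh) is taken to be 14 April;
    the actual day can shift slightly from year to year.
    """
    after_new_year = (day.month, day.day) >= new_year
    return day.year + (57 if after_new_year else 56)


if __name__ == "__main__":
    print(vikram_samvat_year(date(2002, 1, 15)))   # before 1 Baisakh
    print(vikram_samvat_year(date(2002, 4, 14)))   # new year's day
```

Month and day conversion needs a table of month lengths for each year and is outside the scope of this sketch.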

Week Days

The seven days of the week are named as follows:

रविवार (आइतवार) Ravivār/Āitvār (Sunday)

सोमवार Somvār (Monday)

मंगलवार Maṅgalvār (Tuesday)

बुधवार Budhvār (Wednesday)

गुरूवार (बृहस्पतिवार) Gurūvār/Bṛhaspativār (Thursday)

शुक्रवार Śukravār (Friday)

शनिवार (शनिश्चरवार) Śanivār/Śaniścarvār (Saturday)

Dates

The dates of the month in the Nepali calendar are the same as in Hindi. There is a slight difference in the pronunciation and writing of some of the dates, as mentioned below:

पञ्चमी pañcamī (5), त्रयोदशी trayodaśī (13), अमावस्या/औंसी amāvasyā/auṁsī (15)

Time

Time in the Indian context is reckoned in पला/पहर (palā/pahar), घडी (ghaḍī), पला (palā) and अक्षर (akṣar). A पला/पहर (palā/pahar) (in Sanskrit it is called ‘प्रहर’) is 1/8 of a day and night, i.e. 3 hours.

A ‘घडी’ (ghaḍī) is 1/60 of a day and night, i.e. 24 minutes.

A ‘घडी’ is divided into 60 parts called पला (palā), and each पला into 60 parts called अक्षर (akṣar).

These units can be summarised as under:

अक्षर = 24/60 second (= 2/5 second)

पला = 60 अक्षर (= 24 seconds)

घडी = 60 पला (= 24 minutes)

पहर = 7.5 घडी (= 3 hours)

day + night = 60 घडी = 8 पहर = 24 hours
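The table above is internally consistent, which a few lines of Python can confirm. The sketch uses exact fractions so that 2/5 of a second is not rounded; the romanised unit names used as dictionary keys are chosen for this example only.

```python
from fractions import Fraction

# Length of each traditional unit in seconds, built up from the table.
SECONDS = {"akshar": Fraction(24, 60)}
SECONDS["pala"] = 60 * SECONDS["akshar"]
SECONDS["ghadi"] = 60 * SECONDS["pala"]
SECONDS["pahar"] = Fraction(15, 2) * SECONDS["ghadi"]


def to_seconds(amount, unit):
    """Convert an amount of a traditional unit to seconds."""
    return amount * SECONDS[unit]


def from_seconds(seconds, unit):
    """Convert seconds to an amount of a traditional unit."""
    return Fraction(seconds) / SECONDS[unit]


if __name__ == "__main__":
    for unit, length in SECONDS.items():
        print(unit, float(length), "seconds")
    print(from_seconds(24 * 60 * 60, "ghadi"), "ghadi in a day and night")
```

Sixty घडी and eight पहर both come to 86,400 seconds, i.e. one full day and night.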

Currency

The principal unit of currency is the rupee (रूपियाँ). A rupee has a hundred paisā. Metallic coins are issued for 1 paisā, 2 paisā, 3 paisā, 5 paisā, 10 paisā, 20 paisā, 25 paisā, 50 paisā, one rupee, two rupees and five rupees. Currency notes are issued for one rupee, two rupees, five rupees, ten rupees, twenty rupees, fifty rupees, one hundred rupees, five hundred rupees and one thousand rupees.

Previously, a rupee had 16 ānās, a mohar had eight ānās, a sukā had four ānās, a ṭakā was half an ānā and a paisā was one-fourth of an ānā. This type of currency is no longer prevalent.
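The older denominations can be compared by expressing each one in old paisā (quarter-ānās). The sketch below takes the ratios exactly as stated in the paragraph above; the romanised names and the function `break_down` are hypothetical.

```python
# Value of each old denomination in old paisā (1 ānā = 4 paisā), largest first.
OLD_UNITS = [
    ("rupee", 64),   # 16 ānā
    ("mohar", 32),   # 8 ānā
    ("suka", 16),    # 4 ānā
    ("ana", 4),
    ("taka", 2),     # half an ānā
    ("paisa", 1),    # one-fourth of an ānā
]


def break_down(total_paisa):
    """Express an amount of old paisā using the largest denominations first."""
    result = {}
    remaining = total_paisa
    for name, value in OLD_UNITS:
        count, remaining = divmod(remaining, value)
        if count:
            result[name] = count
    return result


if __name__ == "__main__":
    print(break_down(100))
```

Because every denomination is a whole multiple of the next smaller one, this largest-first split always uses the fewest pieces.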

Weights and Measures

(a) The unit of weight is the ‘सेर’ (ser), which is divided into sixteen parts called ‘छटाक’ (chaṭāk), as follows:

4 छटाक = 1 पाउ (quarter)
8 छटाक = आधा सेर (half ser)
16 छटाक = 1 सेर (ser)
40 सेर = 1 मन (maund)

(b) For weighing gold, silver etc., as well as medicines, the following weights are used:

8 खसखस = 1 चामल
8 चावल = 1 रत्ती
8 रत्ती = 1 माशा
12 माशा = 1 तोला
5 तोला = 1 छटाक (chaṭāk)

(c) The unit for linear measurement is the गज (yard), as shown below:

12 इंच = 1 फुट (foot)
3 फुट = 1 गज
220 गज = 1 furlong
8 furlong = 1 माइल (mile)
1 बालिश्त = आधा हात (half hand)
1 हाथ = आधा गज (half yard)
1 गज = 16 गिरह

(d) Land areas are measured in the following ways:

(a)
144 वर्ग इंच = 1 वर्ग फुट (square foot)
9 वर्ग फुट = 1 वर्ग गज (square yard)
4840 वर्ग गज = 1 एकड (acre)

(b)
20 बिसवांसी = 1 बिश्वा (biśwā)
20 बिश्वा = 1 बीघा
3 1/4 बिघा = 1 एकड
14,400 वर्ग फुट = 1 बिघा
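The weight tables in (a) and (b) chain together through the छटाक, so any weight can be reduced to तोला. A minimal sketch, with constant and function names invented for this example:

```python
# Ratios copied from tables (a) and (b).
CHHATAK_PER_SER = 16
SER_PER_MAUND = 40
TOLA_PER_CHHATAK = 5
MASHA_PER_TOLA = 12
RATTI_PER_MASHA = 8


def to_tola(maund=0, ser=0, chhatak=0):
    """Convert a weight given in maund, ser and chhatak to tola."""
    total_chhatak = (maund * SER_PER_MAUND + ser) * CHHATAK_PER_SER + chhatak
    return total_chhatak * TOLA_PER_CHHATAK


def split_chhatak(total_chhatak):
    """Split a number of chhatak into (maund, ser, chhatak)."""
    ser, chhatak = divmod(total_chhatak, CHHATAK_PER_SER)
    maund, ser = divmod(ser, SER_PER_MAUND)
    return maund, ser, chhatak


if __name__ == "__main__":
    print(to_tola(ser=1), "tola in one ser")
    print(to_tola(maund=1), "tola in one maund")
    print(MASHA_PER_TOLA * RATTI_PER_MASHA, "ratti in one tola")
```

The land-area table is left out of the sketch because its two bigha figures do not agree exactly with each other.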

At present, the new metric system of weights and measures is used, as in India. The units are described here.

(a) For measuring solid things:

1 ग्राम = about 15.48 ग्रेन (grain) or 1 माशा
1000 ग्राम = 1 किलोग्राम (kg)
100 किलोग्राम = 1 क्विंटल (quintal)

(b) For measuring liquids:

1 ग्यालन = about 4.5 लिटर (litres)
1 लिटर = 1.75 पिंट (pint)
1000 लिटर = 1 किलोलिटर (kilolitre)

(c) For measuring lengths, areas etc.:

(i)
100 सेन्टीमीटर = 1 मीटर (metre)
1000 मीटर = 1 किलोमीटर (kilometre)
8 किलोमीटर = 5 मील
1 मीटर = about 39.4 इंच (inches)

(ii)
1 हेक्टेयर = more than 2 एकड
100 हेक्टेयर = 1 वर्ग किलोमीटर (sq. km)
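The rule-of-thumb equivalences in (c) translate directly into conversion helpers. The sketch below uses only the rounded ratios given in the table (8 km = 5 miles, 1 metre = about 39.4 inches), not precise modern values; all names are illustrative.

```python
from fractions import Fraction

MILES_PER_KM = Fraction(5, 8)           # from "8 km = 5 miles"
INCHES_PER_METRE = Fraction(394, 10)    # from "1 metre = about 39.4 inches"
HECTARES_PER_SQ_KM = 100


def km_to_miles(km):
    """Approximate miles for a distance in kilometres."""
    return km * MILES_PER_KM


def metres_to_inches(metres):
    """Approximate inches for a length in metres."""
    return metres * INCHES_PER_METRE


def hectares_to_sq_km(hectares):
    """Square kilometres for an area in hectares."""
    return Fraction(hectares) / HECTARES_PER_SQ_KM


if __name__ == "__main__":
    print(float(km_to_miles(100)), "miles in 100 km (approx.)")
    print(float(metres_to_inches(1)), "inches in 1 metre (approx.)")
```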

AM/PM markers

प्रभात (पू.) प्रातः/सुबह      dawn / morning

पूर्वाह्न      forenoon

अपराह्न (अप.)      afternoon

दिउँसो      noon

संध्या/सांझ/बेलुकी      evening

रात्रि/रात      night

Time Zone

आज      Today

भोलि/हिजो      Tomorrow & Yesterday

दिन      Day

पर्सि      Day after tomorrow and day before yesterday

अस्ति      Two days after tomorrow and two days before yesterday

सप्ताह/हफ़्ता      Week

पक्ष      Fortnight

मास/महिना      Month

तिमाही      Quarterly / three-monthly

छमाही      Half-yearly

वर्ष/साल      Year

दशाब्दी      Decade

शताब्दी      Century

सहस्त्राब्दी      Millennium

(Courtesy : Sh. Prakash Prasad, All India Radio,New Delhi Tel : 011-3715411)


Typical Colloquial Sentences in Nepali

GREETING

• Hello
हेलौ; ए कस्तो ? ए क्या हो ?
HELAU; YE KASTO ? YE KYĀ HO ?

• Good Morning
नमस्ते / नमस्कार
NAMASTE / NAMASKĀR

• Good Afternoon
नमस्ते / नमस्कार
NAMASTE / NAMASKĀR

• Good Night
शुभ - रात्रि
ŚUBH RĀTRI

• Good Bye
हस त / ल जाउँ है ।
HAS TA / LA JĀUN HAI

• Thanks
धन्यवाद
DHANYAWĀD

• How are you?
कस्तो छस / कस्तो छौ / कस्तो हुनुहुन्छ / हजुरलाई कस्तो छ ?
KASTO CHHASA / KASTO CHHAU / KASTO HUNUHUNCHHA / HAZURLĀĪ KASTO CHHA ?

• I am fine, thank you
मलाई सन्चै छ धन्यवाद
MALĀĪ SANCHAI CHHA DHANYAWĀD

• Sorry
मलाई अफसोस छ
MALĀĪ APHSOS CHHA

WEATHER

• It is cold
जाडो छ
JĀḌO CHHA

• It is cool outside
बाहिर चिसो छ
BĀHIRA CISO CHHA

• It is hot
गरम छ / तातो छ
GARAM CHHA / TĀTO CHHA

• It is raining
पानी पर्दैछ
PĀNĪ PARDAI CHHA

GENERAL

• What is your name?
तिम्रो नाउँ के हो ? तपाइँको नाउँ के हो ?
TIMRO NĀUN KE HO ? TAPĀĪNKO NĀUN KE HO ?

• My name is Ranjan
मेरो नाउँ रञ्जन हो
MERO NĀUN RAÑJAN HO

• Where do you live?
तिमी कता बस्छौ / बस्छेउ ? तपाईं कुन ठाउँमा बस्नुहुन्छ ?
TIMĪ KATĀ BASCHHAU / BASCHHEU ? TAPĀĪN KUN ṬHĀUNMĀ BASNUHUNCHHA ?

• I live near Ghantaghar
म घन्टाघरमा बस्छु
MA GHANṬĀGHARMĀ BASCHHU

• How old are you?
तिमी कति वर्षका भयौ ? तपाईंको उमेर कति भो ?
TIMĪ KATI VARṢAKĀ BHAYAU ? TAPĀĪNKO UMER KATI BHO ?

• That building is tall
त्यो घर अग्लो / अल्गो छ
TYO GHAR AGLO / ALGO CHHA

• She is beautiful
उनी राम्री छिन्
UNĪ RĀMRĪ CHHIN

• I like Bengali sweets
मलाई बङ्गाली मिठाई मन पर्छ
MALĀĪ BAṄGĀLĪ MIṬHĀĪ MANA PARCHHA

• I love birds
मलाई चराहरू मन पर्छन्
MALĀĪ CHARĀHARŪ MANA PARCHHAN

• Where is the Railway Station?
रेलवे स्टेशन कता छ ?
RAILWAY STATION KATĀ CHHA ?

w How far is the Bus Terminal fr om here?

ªÉºÉ Bkm¡nsf[k ¤ÉºÉ ]ǫ̃ÉxÉ±É EòÊiÉ ]õÉføÉ ½þÉä±ÉÉ ?

Contents Page 74

Page 77: tdiljan2002

w Food is delicious

JÉÉxÉÉ º´ÉÉÊnù±ÉÉä Uô *KHËNË SWËDILO CHHA

w Congratulations¤ÉvÉÉ<Ç UôBADHËI CHHA

w You look lovelyiÉ{ÉÉ<È ®úÉ©ÉÉä * ®úÉ©ÉÒ näùËJÉnèù ½ÖþxÉÖ½ÖþxUôTAPËIN RËMRO/RËMRI DEKHINDÓHUNUHUNCHHA

w Wish you happy new yearxɪÉÉÄ ´É¹ÉÇ ÊiÉ©ÉÉä * iÉ{ÉÉ<È/½þVÉÖ®úEòÉ ±ÉÉÊMÉ ºÉÖJÉ¨ÉªÉ ½þÉäºÉ *NAYËN VARâA TIMRO/TAPËIN/HAZURKË LËGISUKHAMAYA HOS

w I wish you all the happiness¨ÉiÉ{ÉÉ<È/½þVÉÖ®úEòÉä JÉÖºÉÒ ®ú Eò±ªÉÉhÉEòÉä ¨ÉRÂóMɱÉEòɨÉxÉÉMÉnÇùUÖôMATAPËIN/ HAZURKO KHUSI RA KALYËÛ KOMA×GAL KËMANË GARDACHHU

w Congratulations on your marriageÊiÉ©ÉÉä/iÉ{ÉÉ<È/½þVÉÖ®úEòÉä ¶ÉÖ¦ÉÊ´É´ÉɽþEòÉ ±ÉÉÊMÉ ¤ÉvÉÉ<Ç UôTIMRO/TAPËIN HAZURKO áUBH VIVAHKË LËGIBADHËÌ CHHA

w Keep your eyes wide open before marriage andhalf shut afterwardsʤɽäþ¦ÉxnùÉ {Éʽþ±Éä SÉxÉÉJÉÉä ¦É<Ç ¤ÉºxÉÉäºÉ ®ú ʤɽäþ{ÉÊUô +ÉÄJÉÉ+ÉvÉÉ JÉÖ±ÉÉ ®úÉJxÉÉäºÉBIHEBHANDË PAHILE CANËKHO BHAÌ BASNOSRA BIHEPACHHI ËNKHË ËDHË KHULËRËRKHNOS

(Courtesy : Sh. Prakash Prasad, All India Radio,New Delhi Tel : 011-3715411)

YAS ÙHËUNDEKHI BUS TERMINAL ÙËÚHËHOLË?

w How long will it take to reach the airport?½þ´ÉÉ<Ç+bÂ÷b÷É {ÉÖMxÉ EòÊiÉ ºÉ¨ÉªÉ ±ÉÉM±ÉÉ ?HAWËÌAÚÚË PUGNA KATI SAMAY LËGLË?

w Is Mr. Raghunath There?

Eäò ®úPÉÖxÉÉlÉVªÉÚ iªÉ½þÉÄ ½ÖþxÉÖ½ÖþxUô ?KE RAGHUNËTHJYÍ TYANHËN HUNUHUNCHHA

w Please tell him to call back as soon as he is freeEÞò{ɪÉÉ =½þÉıÉÉ<Ç ¡ÖòºÉÇnù ½ÖþxÉɺÉÉlÉ ºÉEäòºÉ¨¨É SÉÉÄb÷Éä ¡òÉäxÉMÉxÉÇ ¦ÉzÉÖ½þÉä±ÉÉKÎIPYË UHËNLËÌ PHURSAD HUNË SËTHSAKESAMMA CHËNÚO PHONE GARNABHANNUHOLË.

w How much will it cost?ªÉºÉ±ÉÉ<Ç EòÊiÉ {ɱÉÉÇ? ªÉºÉEòÉä nùÉ¨É EòÊiÉ {ɱÉÉÇ ?YASLËÌ KATI PARLË ? YASKO DËM KATIPARLË?

w Excuse me¨ÉÉ¡ò MÉxÉÖǽþÉä±ÉÉMËPH GARNUHOLË

w From Which Platform can I get the train forChandigarh?¨Éè±Éä EÖòxÉ {±Éä]õ¡òɨÉǤÉÉ]õ SÉhb÷ÒMÉføEòÉ ±ÉÉÊMÉ ®äú±É ºÉ¨¨ÉÉxɺÉCUÖô ?MÓLE KUN PLATFORM BËÙA CHANÚÌGAÚHKËLAGI RAIL SAMËTNA SAKCHHU?

w Does this train stop at Aligarh?

Eäò ªÉÉä ®äú±ÉMÉÉb÷Ò +ʱÉMÉfø¨ÉÉ +bÂ÷Uô ?KE YO RELGËÚÌ ALIGAÚHMË AÚACHHA ?

w How many kids do you have?ÊiÉ©ÉÉ EòÊiÉ VÉxÉÉ UôÉä®úÉUôÉä®úÒ - ½þ°ü UôxÉ ? iÉ{ÉÉ<ÈEòɤÉɱɤÉSSÉÉ EòÊiÉ UôxÉÂ?TIMRË KATI JANË CHHORË CHHORÌ HARÍCHHAN? / TAPËINKA BËLBACCË KATI CHHAN?

w The gift is wonderful

ªÉÉä ={ɽþÉ®ú ¤Éb÷Éä ®úÉ©ÉÉä Uô *YO UPHËR BAÚO RËMRO CHHA

w It is really prettyªÉÉä ºÉÉÄÊSÉEèò ºÉÖxnù®ú Uô *YO SËNCHCHIKÓ SUNDAR CHHA


3.3 INSFOC (Indian Standard Font Code)

The proposed font standard is targeted towards the following classes of users:
(1) Data Processing
(2) Office Users / Word Processing
(3) Textbook Publishers
(4) Web Content Creators
(5) Desktop Applications

It is certainly not targeted towards professional desktop publishers, advertising agencies and creators of highly Sanskritized text content.

The font is laid out such that the character locations 0x80 to 0xFF remain unchanged between the Monolingual and Bilingual Font layouts. The monolingual font contains more compound characters and conjuncts.

Rules for Composing Devanagari Text

(1) The Devanagari characters lying in the codes 0x80 to 0xFF are designed to be kept in the same location for the Devanagari bilingual font. Here a majority of the consonants are kept in their half form. The full consonant is formed by adding a ‘kana’ (vertical stroke, 0xDE) to the half form. It is recommended that the kana located at 0xDC be used for that purpose. For example:

M (0xAA) + É (0xDE) = MÉ

(2) There are two matras (vowel signs) of vowel I (<Ç) with different overhanging spans. These matras are located at 0x4C and 0xE1. The matra at 0x4C is used for the wider letters like ka (Eò), fa (¡ò) as shown below:

Eò (0xA7) + Ò (0x4C) = EòÒ

The matra at 0xE1 is used for other letters, which are not wider, like Ma (¨É), Ra (®), Ya (ªÉ) etc. For example:

¨É + Ò (0xE1) = ¨ÉÒ
®ú + Ò (0xE1) = ®Ò

The matras at code points 0x4A and 0x4D combine the vowel sign with the rakar (®ú, a Ra that occurs in a syllable and is pronounced before the consonant to which it is applied) and have different overhanging spans. For example: |ÉÉlÉÔ, iÉÖEòÔ

The matras at code points 0x4B and 0x4E combine the vowel sign with the rakar and anuswar and have different overhanging spans.

(3) Similarly, there are three types of the matra of vowel sign I (<) with different overhanging spans. These matras are located at 0xDF, 0xE0 and 0x4F. The matra at 0xDF is used for normal size letters such as ® (Ra), Eò (Ka), ¡ò (Fa), b÷ (dha) etc. For example:

Ê (0xDF) + ® (0xC7) = Ê®
Ê (0xDF) + Eò (0xA7) = ÊEò

The other form of the matra of vowel I (<) is used for wider letters such as ºÉ (Sa), ¨É (Ma), ªÉ (Ya) etc. For example:

Ê (0xE0) + ¨É (0xC5) = ΨÉ
Ê (0xE0) + ºÉ (0xCD) = κÉ

The third form of the matra of vowel I (<) is used when there is a half form of a consonant in a word. In this case the matra is attached to the ‘kana’ (vertical stroke) of the preceding consonant, for example in the words κlÉiÉ, ¶ÉÎCiÉ etc.
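The glyph selection described in rules (1) and (3) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names are hypothetical, the set of "wide" letters contains only the two codes the text itself gives (Ma at 0xC5 and Sa at 0xCD), and every other consonant is assumed to take the normal matra at 0xDF.

```python
# Glyph codes quoted in the rules above.
KANA = 0xDE                     # vertical stroke that completes a half form
SHORT_I_NORMAL = 0xDF           # short-i matra for normal-width letters
SHORT_I_WIDE = 0xE0             # short-i matra for wider letters
SHORT_I_AFTER_HALF_FORM = 0x4F  # short-i matra when a half form precedes

# Assumption: only the wide letters named with codes in the text are listed.
WIDE_LETTERS = {0xC5, 0xCD}     # Ma, Sa


def full_consonant(half_form_code: int) -> bytes:
    """Rule (1): a full consonant is the half form followed by the kana."""
    return bytes([half_form_code, KANA])


def short_i_matra(consonant_code: int, follows_half_form: bool = False) -> int:
    """Rule (3): pick one of the three short-i matra glyphs."""
    if follows_half_form:
        return SHORT_I_AFTER_HALF_FORM
    if consonant_code in WIDE_LETTERS:
        return SHORT_I_WIDE
    return SHORT_I_NORMAL  # assumed default for letters not listed as wide


if __name__ == "__main__":
    print(full_consonant(0xAA).hex())   # half form 0xAA plus the kana
    print(hex(short_i_matra(0xC7)))     # Ra takes the normal matra
    print(hex(short_i_matra(0xC5)))     # Ma takes the wide matra
```

A complete implementation would need the full list of wide letters from the font specification, which the text above does not enumerate.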

(4) The shifted ukar and ookar (vowel signs for u and uu) located at 0x42 and 0x43 are to be used with characters which do not have a full kana. In these characters the matra is attached to the center lower part of the character, as in bÖ÷, ]Úõ

(5) The rakar located at 0xEB is provided for the characters Eò (Ka), ¡ò (Pha), ¨É (Ma), ¦É (Bha), ´É (Va), xÉ (Na), ¤É (Ba), etc. This rakar is attached to the character at a slightly upward shifted position (almost at the middle of the kana), for example in ´ÉGò, xÉ©É

(6) The rakar located at 0x05 is provided for the characters MÉ (Ga), SÉ (Ca), VÉ (Ja), lÉ (Tha), PÉ (Gha), vÉ (Dha), ªÉ (Ya) etc. This rakar is attached to the character at a slightly downward shifted position, for example in ´ªÉOÉ, ´ÉXÉ, ´ªÉÉQÉ

(7) The widths of the letters Eò (Ka), ¡ò (Pha), ¯ (RRu), ° (Roo) etc. are reduced by the width of the fixed kern space located at 0xFD to ensure proper anchoring of matras.

(8) The widths of the letters kÉ (TTa), lÉ (Tha), n (Da), iÉ + iÉ (Ta+Ta), lÉ + lÉ (Tha+Tha), etc. are reduced by the width of the fixed kern space located at 0xFE to ensure proper anchoring of matras.

(Courtesy : Sh. M.D. Kulkarni, C-DAC, Pune. E-mail : [email protected] Tel : 020-5694092)


3.4 Indian Standard Lexware Format

Preamble

It is today an accepted fact that the lexicon is the source of power in any language processing system. The better structured and more knowledge-rich the lexicon, the better the quality of the analysis and generation of linguistic data. Covering the whole gamut of phenomena in any natural language requires constructing accurate and vast lexicons of fine quality. This is often a very large task needing quality manpower in large numbers.

In a country like India with multiple languages, the need for standardization of the Lexware format was long felt, so that this resource can be developed in a collaborative manner. Enabling of Indian languages on the internet and the localization of Information Technology have been taken up as a priority task by the Ministry of Communications & Information Technology (MCIT) in India. The idea is to be able to pool the resources of the lexicon building activity going on at various places and in various languages. If all these efforts follow a simple and standard format, the collaborative development of this very important linguistic resource will proceed efficiently. Future linkages between the lexicons of various languages will also be easier.

Basic Concepts

For any language processing system, the three most important lexical resources are:

1. Multilingual Lexicon
2. Language WordNet
3. Ontology

The multilingual lexicon defines the mapping of words from one language to another. The WordNet is a massive network which connects words with various semantic relations. The Ontology is a categorization of concepts, which is language independent, and guides the various decisions and choices in lexicon and WordNet building.

In this document we primarily discuss the standardization of the lexicon: the entities in it, the format of the entries and the linkage of the entries with the WordNet and the Ontology. The diagram in Lex.1 illustrates the various entities that must be present in a lexicon.

Lex.1 : Essential Entries in the Lexicon

Here the language-specific strings map to ‘concepts’, which are language independent. The concepts can be represented by the words of a language after attaching disambiguation constructs to them. Thus dog(a-kind-of animal along with other word-net attributes) represents a language-independent concept which is linked to kuttaa in Hindi, kukur in Bengali and so on. Here dog(a-kind-of animal along with other word-net attributes) belongs to the space of concepts, while kuttaa or kukur belongs to the space of language-specific strings.

The space of attributes is of great importance since the flags in this space guide the analysis and generation processes. The attributes are of two kinds: syntactic and semantic. The part of speech and the morphological behavior belong to the set of syntactic attributes, while information like animate/inanimate, perishable, eatable etc. belongs to the set of semantic attributes. While the former is a well-defined set, the latter is quite open-ended and depends on the domain and the application envisaged. However, some semantic attributes are used frequently across domains and can be looked upon as a standard set. Lex.2 depicts the inter-relationships.

It is re-emphasized in this document that our standardization effort recognizes the supreme importance of the semantic attributes. Though they are open-ended, the lexicon building activity can be accelerated by setting up the ontology. Thus semantic attributes connect the lexicon space with the ontology space.

The disambiguation constructs like a-kind-of, part-of, has-part etc. are semantic relations with other words. Thus the disambiguation set connects the lexicon space with the WordNet space.


Lex.2 : Interaction between Lexicon Space, WordNet Space and Ontology Space

Having described these basic ideas we now proceedto the recommendation for standardization.

Standardization

The nomenclature used in the lexicon is the following:

◆ Root Word : Word from which the various forms are generated.

◆ Morphology Paradigm Number : An integer specifying the table in which the morphological transformation rules of the root are specified. The table specifies the strings that must be attached depending on the subject, object and/or other case-related information. For example, one such piece of paradigmatic information could be: ladakA (boy) in Hindi takes the string oM ne after deleting A, in plural number and past tense for a transitive verb.

◆ Domain : Used to indicate the domain to which a particular meaning of the word belongs. For example, ~G is used for general, ~IT for information technology, ~H for health and so on.

◆ English Meaning [Example Sentence] : English meaning of the word followed by an example sentence that illustrates the meaning.

◆ Syntactic Information : This is the part of speech, morphology paradigm and other such information. Disambiguation rules for verbs are also included. See below (verb pattern scheme).

◆ Semantic Tag : Tags or disambiguation rules for each English sense of the word. These tags are of two kinds:

  ◆ Ontology Tags : These are obtained from the directed acyclic graph (DAG) that represents the categories of the concepts. A very top-level part of this is shown in Lex.3.

  ◆ WordNet Tags : For a particular sense of the word, the words from the WordNet which have semantic relations like hypernym, meronym etc. with the given word. For example, for dog we would keep animal (hypernym), leg (meronym) etc. These tags assist in uniquely identifying the meaning. We would also like to keep the sense number of the WordNet in this field.

Lex.3 : A Basic Ontology

Verb Pattern Scheme

Tag | Description | Example Sentence
La | Linking verb + adjective | The soup was delicious.
Ln | Linking verb + noun | Ram became a teacher.
I | Intransitive verb | Ram is sleeping.
Ipr | Intransitive verb + prepositional phrase | People complain about the traffic.
Ip | Intransitive verb + particle | The monkeys chattered away.
In/pr | Intransitive verb + noun or prepositional phrase | The meeting lasted three hours/for three hours.
It | Intransitive verb + to-infinitive | Jane hesitated to phone the office.
Tn | Transitive verb + noun | A small boy opened the door.
Tn.pr | Transitive verb + noun + prepositional phrase | The accused convinced the court of his innocence.
Tn.p | Transitive verb + noun + particle | The nurse shook the medicine up.
Tf | Transitive verb + finite ‘that’ clause | Officials believe that a settlement is possible.
Tw | Transitive verb + wh-clause | We had not decided what we ought to do next/what to do next.
Tt | Transitive verb + to-infinitive | Mary hates to drive in the rush-hour.
Tnt | Transitive verb + noun + to-infinitive | expect the parcel to arrive tomorrow.
Tg | Transitive verb + -ing form of a verb | Peter enjoys playing football.
Tsg | Transitive verb + noun (+ ’s) + -ing form of a verb | We dread Mary/Mary’s taking over the business.
Tng | Transitive verb + noun + -ing form of a verb | She spotted a man waving in the crowd.
Tni | Transitive verb + noun + infinitive | We watched the men unpack the china.
Cn.a | Complex-transitive verb + noun + adjective | The fridge keeps the beer cool.
Cn.n | Complex-transitive verb + noun + noun | The court considered Smith a trustworthy witness.
Cn.n/a | Complex-transitive verb + noun + as + noun or adjective | The police didn’t accept the story as true (or as the fact).
Cn.t | Complex-transitive verb + noun + to-infinitive | The thief forced Sita to hand over the money.
Cn.g | Complex-transitive verb + noun + -ing form of a verb | The policeman got the traffic moving.
Cn.i | Complex-transitive verb + noun + infinitive | Mother won’t let the children play in the road.
Dn.n | Double-transitive verb + noun + noun | Henri taught the children French.
Dn.pr | Double-transitive verb + noun + prepositional phrase | Henri taught French to the children.
Dn.f | Double-transitive verb + noun + finite ‘that’ clause | Colleagues told Paul that the job would not be easy.
Dpr.f | Double-transitive verb + prepositional phrase + finite ‘that’ clause | Employers announced to journalists that the dispute had been settled.
Dn.w | Double-transitive verb + noun + wh-clause | The porter reminded guests where they should leave their luggage/where to leave their luggage.
Dpr.w | Double-transitive verb + prepositional phrase + wh-clause | You should indicate to the team where they are to assemble/where to assemble.
Dn.t | Double-transitive verb + noun + to-infinitive | The director warned the actors not to be late.
Dpr.t | Double-transitive verb + prepositional phrase + to-infinitive | Fred signaled to the waiter to bring a chair.

The above syntactic information is stored along with the verb entries in the lexicon to facilitate disambiguation.

Semantic Information used for the Verbs (used only when there are multiple meanings)

sth(to sb) ............. something (to somebody)
sb(with sth) ......... somebody (with something)
sth(from sb) ......... something (from somebody)
sth(for sth) .......... something (for something)
sb/sth(with sth) ... somebody/something (with something)

Examples

For Noun

Line 1: Root Word
Line 2: Morphology-Paradigm No noun
Line 3: ~Domain1 English Meaning1 [Example Sentence]; ~Domain2 English Meaning2 [Example Sentence]; ....
Line 4: hypernymy1[, meronymy1]; hypernymy2[, meronymy2]; ....
Line 5: ~HN Hindi Meaning1: m/f/n paradigm no, ontology tags1; Hindi Meaning2: m/f/n paradigm no, ontology tags2; ....
Line 6: ~AS (Assamese meaning with other information)
Line 7: ~(Similarly, meanings for other Indian languages can be entered on separate lines, one after another)

Example for noun

Line 1: head /English root word/
Line 2: 65 noun /Table number with morphological and categorical information/
Line 3: ~G part of the body containing the eyes, nose, mouth and brain [He fell and hit his head];
        ~ADM chief person of a group or organisation [Report to the Head immediately];
        ~FIN accounts head [Into which head should I put the given expenditure];
        ~G head of the coin [We tossed a coin and it came down heads]
        /Different English senses of the root word/
Line 4: body_part; post_holder; topic; thing /Different English senses of the word/
Line 5: ~HN sira:m 6, inanimate, concrete;
        aXyakRa:m 6, animate, concrete;
        SIrRaka:m 6, inanimate, abstract;
        ciwwa:m 8, inanimate, concrete
        /Hindi meanings with gender information, paradigm type and ontology/

Note : For the machine translation task, we avoid putting too many meanings in the lexical database unless these can be disambiguated. In this sense, it may be advisable to keep only two meanings, ‘sira’ and ‘heda’ (since the English word ‘head’ in all other contexts is frequently used as it is in Hindi).
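The noun entry format above lends itself to simple machine parsing. The sketch below, in Python, pairs each sense on Line 3 with the tag at the same position on Line 4. It is an illustration only: the `Sense` class, the `parse_senses` function and the regular expression are assumptions made for this example and are not part of the proposed standard.

```python
import re
from dataclasses import dataclass


@dataclass
class Sense:
    domain: str       # domain marker without the tilde, e.g. "G", "ADM"
    meaning: str      # English meaning
    example: str      # example sentence from the square brackets
    wordnet_tag: str  # tag at the same position on Line 4


# "~Domain meaning [example]" with senses separated by semicolons.
SENSE_RE = re.compile(r"~(\w+)\s+([^\[;]+?)\s*\[([^\]]*)\]")


def parse_senses(line3: str, line4: str) -> list[Sense]:
    """Pair every sense on Line 3 with the corresponding tag on Line 4."""
    tags = [t.strip() for t in line4.split(";") if t.strip()]
    found = SENSE_RE.findall(line3)
    return [Sense(d, m.strip(), e.strip(), tag)
            for (d, m, e), tag in zip(found, tags)]


LINE3 = ("~G part of the body containing the eyes, nose, mouth and brain "
         "[He fell and hit his head]; "
         "~ADM chief person of a group or organisation "
         "[Report to the Head immediately]; "
         "~FIN accounts head "
         "[Into which head should I put the given expenditure]; "
         "~G head of the coin [We tossed a coin and it came down heads]")
LINE4 = "body_part; post_holder; topic; thing"

if __name__ == "__main__":
    for sense in parse_senses(LINE3, LINE4):
        print(sense.domain, "|", sense.meaning, "|", sense.wordnet_tag)
```

The slash-delimited remarks in the printed example (such as /English root word/) are treated here as annotations and are left out of the input strings.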

For Verb

Line 1: Root Word
Line 2: Morphology-Paradigm No verb
Line 3: ~Domain1 English Meaning1 [Example Sentence]; ~Domain2 English Meaning2 [Example Sentence]; .....
Line 4: hypernymy1 [, Syntactic Info1] [, Semantic Tag1 for Subject] [, Semantic Tag1 for Object1] [, Semantic Tag for Object2]; [ ] [, ] [, ] [, ] [ ]; .....
Line 5: ~HN Hindi Meaning1: Paradigm-No [, Vibhakti Parasarg Info for Subject] [, Vibhakti Parasarg Info for Object] [, ontology tags]; (Similar information for other meanings); ...
Line 6: ~(Similarly, meanings along with the language-specific information can be entered for other Indian languages on separate lines below)

Example for verb

Line 1: murder /English root word/
Line 2: 16 verb /Table of morphological and categorical information/
Line 3: ~G kill somebody unlawfully and intentionally [He murdered her with a knife]; ….. /Different English senses of the root word/
Line 4: kill, Tn, Tn.pr, I, sb, sb-with-sth, [ ], [ ]; /Disambiguation rules for disambiguating different senses of the verb/
Line 5: ~HN hawtyaA kara:257 ne, ko, VoA; /Hindi meaning with subject and object Vibhakti parasarga and ontology tags/

For Adjective

Line 1: Root Word
Line 2: Paradigm-No adjective
Line 3: ~Domain1 English Meaning1 [Example Sentence]; ~Domain2 English Meaning2 [Example Sentence]; .....
Line 4: hypernymy1[, antonymy1]; hypernymy2[, antonymy2]; .....
Line 5: ~HN Hindi Meaning1, ontology tags1; Hindi Meaning2, ontology tags2; ...
Line 6: ~(Similarly, meanings in other Indian languages can be entered on the lines below)

Example for adjective

Line 1: mysterious /English root word/
Line 2: 12 adjective /Table of morphological and categorical information/
Line 3: ~G difficult to understand or explain [She gave me a mysterious look] /Different English senses of the root word/
Line 4: state, clear /WordNet Tag/
Line 5: ~HN rahasyamaya /Hindi Meaning/

For Adverb

Line 1: Root Word
Line 2: Morphology-Paradigm No adverb
Line 3: ~Domain English Meaning1 [Example Sentence]; ~Domain English Meaning2 [Example Sentence]; ......
Line 4: hypernymy1[, synonymy1]; hypernymy2[, synonymy2]; .....
Line 5: ~HN Hindi Meaning1[, ontology tags1]; Hindi Meaning2[, ontology tags2]; .....
Line 6: ~(Similarly, meanings in other Indian languages can be entered on the lines below)

Example for adverb

Line 1: mysteriously /English root word/
Line 2: 13 adverb /Table of morphological and categorical information/
Line 3: ~G mysteriously [The main witness had mysteriously disappeared]; /Different English senses of the root word/
Line 4: manner /WordNet Hypernymy/
Line 5: ~HN rahasyamaya DaMga se /Hindi Meaning/

Semantic Attributes For Lexware Standard

Noun

● Proper Noun (PROP, eg, rAma)
● Common Noun (If a noun is neither Proper nor Collective, it is taken for granted that it is a Common Noun, and hence no symbol is kept for it.)
● Collective Noun (COLCT, eg, BidZa)
● Animate (ANIMT)
  ■ Flora (FLORA)
    => Shrubs (FLORA-SHRB, eg, tulasI)*
    => Aquatic plants (FLORA-AQTC, eg, kamala)
    => Climbers (FLORA-CLMB, eg, aMgura kI bela)
    => Trees (FLORA-TREE, eg, Ama)
  ■ Fauna (FAUNA)
    => Mammals
       • Person (ANIMT-FAUNA-MML-PRSN, eg, ladZakI)
       • Ape (ANIMT-FAUNA-MML-APE, eg, laMgUra)
       • Lesser Mammals (ANIMT-FAUNA-MML-LSMML, eg, dAzlPZina)
    => Reptiles (ANIMT-FAUNA-RPTL, eg, sAMzpa)
    => Amphibians (ANIMT-FAUNA-AMPHB, eg, kaCuA)
    => Aquatic Animals (ANIMT-FAUNA-AQAN, eg, kekadZA)
    => Birds (ANIMT-FAUNA-BIRD, eg, wowA)
    => Fish (ANIMT-FAUNA-FISH, eg, SArka)
    => Insects (ANIMT-FAUNA-INSCT, eg, wiwalI)
    => Micro organism (ANIMT-FAUNA-MCORG, eg, bEktIriyA)
    => Imaginary Animals (ANIMT-FAUNA-IMGYAN, eg, drEgana)

● Inanimate (INANI)
  ■ Object
    => Artifact (INANI-ARTFCT, eg, cammaca)
    => Natural Object (INANI-NAT-OBJCT, eg, pahAdZa)
    => Edible (INANI-EDBL-OBJCT, eg, miTAI)
    => Anatomical (INANI-ANTM-OBJCT, eg, uMgalI, bAla)
    => Chemical (INANI-CHML-OBJCT, eg, amla)
    => Physical (INANI-PHSCL-OBJCT, eg, kalama, maMca)
    => Imaginary (INANI-IMGN-OBJCT, eg, amQwa)

● Place
    => Imaginary Place (INANI-IMGY-PLC, eg, svarga)
    => Physical Place (INANI-PHSCL-PLC, eg, pATaSAlA)

● Event
    => Natural Event (INANI-NAT-EVENT, eg, BUkaMpa)
    => Historical Event (INANI-HIST-EVENT, eg, praWama viSvayuxXa)
    => Planned Event (INANI-PLND-EVENT, eg, bama-visPota)
    => Social Event (INANI-SCL-EVENT, eg, janma)
    => Fateful Event (INANI-FTFL-EVENT, eg, lAztarI nikalanA)
    => Fatal Event (INANI-FTL-EVENT, eg, xurGatanA)

● Abstract (ABS)
    => Quality (INANI-ABS-QUAL, eg, acCAI)
    => Perception (INANI-ABS-PRCP, eg, sUcanA)
    => Cognition (INANI-ABS-COGN, eg, kalpanA)
    => Colour (INANI-ABS-COLR, eg, lAla)
    => Title (INANI-ABS-TITL, eg, proPZesara)
    => Measurement (INANI-ABS-MSRMNT, eg, lambAI)
    => Time (TIME)
       1. Period (INANI-ABS-TIME-PRD, eg, GaMtA)
       2. Season (INANI-ABS-TIME-SSN, eg, garmI)
       3. Historical ages (INANI-ABS-TIME-HIST, eg, pARANa yuga)
    => Action
       1. Social (INANI-ABS-ACT-SCL, eg, vivAha)
       2. Anti-social (INANI-ABS-ACT-ANTISCL, eg, corI)
       3. Occupation (INANI-ABS-ACT-OCP, eg, axyApana)
       4. Communication (INANI-ABS-ACT-COMM, eg, parAmarSa)
       5. Physical Action (INANI-ABS-ACT-PHSCLACT, eg, dubakI)
    => Object (INANI-ABS-OBJCT, eg, SabXa)
    => Logos
       1. Religion (RLGN)
       2. Philosophy (PHIL) (Metaphysics, Epistemology, Logic, Ethics)
       3. Social Sciences (SCLSC) : political science, economics, commerce, law, public administration, social services, education, anthropology, psychology, sociology, folklore
       4. Language (LNG) (Grammar)
       5. The Arts (ARTS) : Fine and Performing Arts (Music, Literature, Painting, Sculpture, Film, Drama), Aesthetics and Rhetoric, Useful Arts & Crafts
       6. Geography (GEOG)
       7. History (HIST)
       8. Natural Sciences (NATSC) (Physics, Chemistry, Bio-sciences, Astronomy, Geology)
       9. Mathematics (MATHS)
       10. Applied Sciences (APPSC) : Engineering, Agriculture, Medicine & Health, Manufacturing, Building & Construction, Ecology
       11. Sports (SPRT) (Indoor games, Outdoor games)
       12. Transport (TRNSPT)
       13. Home Science (HSC) (Food & Nutrition)
       14. Mass media (MSMDA) (Journalism, Advertising)
       15. Fashion Designing (FSHD) (Dresses, Textile)
       16. Miscellaneous (MSL)

● State (STE)
    => Physical State (INANI-STE-PHSCL)
       1. Solid (INANI-STE-PHSCL-SLD, eg, pawWara)
       2. Liquid (INANI-STE-PHSCL-LQD, eg, xUXa)
       3. Gas (INANI-STE-PHSCL-GAS, eg, AzksIjana)
    => Disease (INANI-STE-DIS, eg, buKAra)
    => Biological State (INANI-STE-BIO, eg, bacapana)
    => Mental State (INANI-STE-MNTL, eg, avasAxa)
    => Social State (INANI-STE-SCL, eg, haMgAmA)

● Process (PRCS)
    => Physical Process (INANI-PHYS-PRCS, eg, KAnA banAne kI viXi)
    => Mental Process (INANI-MNTL-PRCS, eg, yojanA)

Miscellaneous Noun Attributes :
● Abbreviation (ABRV, eg, UNL)
● Acronym (ACRNM, eg, UNESCO)
● Heading (HEAD, eg, THE INTERNATIONAL TELECOMMUNICATION UNION)
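Many of the noun tags above are written as hyphen-separated paths (for example ANIMT-FAUNA-MML-PRSN), so a category test can be reduced to a prefix check. The following Python sketch shows this under the assumption that a tag's ancestors are exactly its hyphen-delimited prefixes; the function names are hypothetical, and some tags in the list (for example FLORA-SHRB, which omits ANIMT) do not spell out their full path, so real data would need a lookup table as well.

```python
def ancestors(tag: str) -> list[str]:
    """Return every hyphen-delimited prefix of a tag, shortest first."""
    parts = tag.split("-")
    return ["-".join(parts[:i]) for i in range(1, len(parts) + 1)]


def is_a(tag: str, category: str) -> bool:
    """True if `tag` equals `category` or lies below it in the tag path."""
    return tag == category or tag.startswith(category + "-")


if __name__ == "__main__":
    print(ancestors("ANIMT-FAUNA-MML-PRSN"))
    print(is_a("ANIMT-FAUNA-BIRD", "ANIMT-FAUNA"))
    print(is_a("INANI-ARTFCT", "ANIMT"))
```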

Verb

● State (STE)
    => Physical State (INANI-STE-PHSCL)

● Verb of Action (VOA)
  ■ Change (VOA-CHNG, eg, baxalanA)
  ■ Cognition (VOA-COGN, eg, vicAra karanA, nirNaya lenA)
  ■ Commencement (VOA-CMNCT, eg, barasane laganA)
  ■ Communication (VOA-COMM, eg, liKanA, bolanA)
  ■ Competition (VOA-COMPT, eg, ladZanA)
  ■ Completion (VOA-CMPLT, eg, KA cukanA)
  ■ Consumption (VOA-CNSMP, eg, KanA, pInA)
  ■ Contact (VOA-CNTCT, eg, CUnA)
  ■ Creation (VOA-CRTE, eg, banAnA)
  ■ Destruction (VOA-DSTN, eg, cUra-cUra karanA)
  ■ Emotion (VOA-EMOT, eg, haMzsanA)
  ■ Event (VOA-EVNT, eg, barasanA, himapAwa honA)
  ■ Grooming (VOA-GROOM, eg, saMzvaranA)
  ■ Maintenance (VOA-MNTC, eg, yaWA-sWiwi banAe raKanA)
  ■ Motion (VOA-MOTN, eg, calanA)
  ■ Perception (VOA-PRCP, eg, xeKanA, sunanA)
  ■ Performance (VOA-PRFM, eg, nQwya karanA)
  ■ Possession (VOA-POSS, eg, kabjZA karanA)
  ■ Social (VOA-SCL, eg, cunAva ladZanA, SAxI karanA)

● Verb of State (VOS)
  ■ Physical State (VOS-PHY-ST, eg, KadZA rahanA)
  ■ Mental State (VOS-MNT-ST, eg, ciDZanA)

● Temporal Verbs (TMP, eg, calanA)
● Verbs of Continuity (CONT, eg, bahanA)
● Verbs of Volition (VLTN, eg, KAna)
● Verbs of Non-volition (NVLTN, eg, vivaSa honA)

Special Verb Attributes :
● Idiom (V-IDM, eg, cUhe billI kA Kela karanA)

Adjective

● Descriptive
  ■ Weight (ADJ-DES-WT, eg, BArI puswaka)
  ■ Shape (ADJ-DES-SHP, eg, laMbA rAswA)
  ■ Colour (ADJ-DES-CLR, eg, lAla kapadZA)
  ■ Strength (ADJ-DES-STRNGTH, eg, kamajZora kadZI)
  ■ Qualitative (ADJ-DES-QUAL, eg, acCA ladZakA)
  ■ Appearance (ADJ-DES-APPR, eg, suMxara ceharA)
  ■ Speed (ADJ-DES-SPD, eg, XimI cAla)
  ■ Depth (ADJ-DES-DPTH, eg, gaharA jFAna)
  ■ Existence (ADJ-DES-EXST, eg, upasWiwa loga)
  ■ Numeral (ADJ-DES-NUM, eg, pAMzca uMzgaliyAMz)
  ■ Temperature (ADJ-DES-TEMP, eg, garama xUXa)
  ■ Quantitative (ADJ-DES-QUAN, eg, WodZA pAnI)
  ■ Respective (ADJ-DES-RESP, eg, SrImawI iMxirA gAMXI)
  ■ Emotion (ADJ-DES-EMOT, eg, kroXIwa vyakwi)
  ■ Demonstrative (ADJ-DMON, eg, yaha ladZakA)
  ■ Interrogative (ADJ-INTRO, eg, kisakA makAna hE?)
  ■ Relational (ADJ-REL, eg, mOserA BAI)

Special Adjective Attribute :
● Nouns used as Adjective (N-ADJ, eg, kaMpyUtara Kela, majaxUra saMGa)

Adverb

● Time (ADV-TIME, eg, ke pahale)
● Frequency (ADV-FREQ, eg, bAra-bAra, xo bAra)
● Place (ADV-PLC, eg, hara jagaha)
● Manner (ADV-MAN, eg, wejZI se)
● Quantity (ADV-QUAN, eg, bahuwa)
● Reason (ADV-RSN, eg, isalie)
● Interrogative (ADV-INTRO, eg, kaba)
● Affirmative (ADV-AFRM, eg, niSciwa hI)
● Negative (ADV-NGTV, eg, nahIM, SAyaxa)

Conclusions

The above discussion forms the basis of the standardization work and is called the Lexware Standard Document: (Foundations). It has evolved from the experience of the various lexical resource development activities going on at IIT Kanpur, IIT Bombay and other such places. This document will be followed by:

I. Lexware Standard Document: (Morphology Paradigm Tables).

II. Lexware Standard Document: (Domain Categorization).

III. Lexware Standard Document: (Lexware Developers’ Manual).

The main aim of all this effort is to ensure that a massive multilingual lexical resource is built for the languages of India. This will ultimately lead to (semi-)automatic MT systems with a capability for NL understanding. As mentioned before, this is a manpower-intensive activity requiring well-trained personnel in large numbers. The task has to be executed in a distributed manner. It is envisaged that each resource center will establish linkages to the language of the center as shown in the examples above.

The transliteration scheme followed in the above description is given below :

अ a, आ A, इ i, ई I, उ u, ऊ U, ऋ q, ए e, ऐ E, ओ o, औ O, ं M, ः H, ँ z, क k, ख K, ग g, घ G, ङ f, च c, छ C, ज j, झ J, ञ F, ट t, ठ T, ड d, ढ D, ण N, त w, थ W, द x, ध X, न n, प p, फ P, ब b, भ B, म m, य y, र r, ल l, व v, श S, ष R, स s, ह h, क्ष kR.

Example : hiMxI akRaramAlA = हिन्दी अक्षरमाला

Note :

1. A character with nukta is represented by the corresponding Devanagari symbol followed by ‘Z’, e.g. ज़ -> jZ. ‘Z’ is the “nukta operator”.

2. For other non-Devanagari symbols we use ‘V’ & ‘Y’ as “previous” & “following” operators respectively, e.g. ऎ -> eV, ळ -> lY.
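As an illustration of how the scheme above can be applied mechanically, here is a minimal Python sketch that converts Unicode Devanagari to this roman notation. It rests on several assumptions: the input is Unicode text (the scheme itself predates that choice), य and व map to y and v as in the examples used throughout the document, the ‘V’ and ‘Y’ operators are not handled, and the function name `devanagari_to_roman` is hypothetical.

```python
import unicodedata

VIRAMA, NUKTA = "\u094d", "\u093c"

# Independent vowels: a A i I u U q e E o O
VOWELS = dict(zip(
    "\u0905\u0906\u0907\u0908\u0909\u090a\u090b\u090f\u0910\u0913\u0914",
    "aAiIuUqeEoO"))
# Dependent vowel signs (matras): A i I u U q e E o O
MATRAS = dict(zip(
    "\u093e\u093f\u0940\u0941\u0942\u0943\u0947\u0948\u094b\u094c",
    "AiIuUqeEoO"))
# Anusvara, visarga, candrabindu
SIGNS = {"\u0902": "M", "\u0903": "H", "\u0901": "z"}
# Consonants from ka to ha, in the order of the table above
CONSONANTS = dict(zip(
    "\u0915\u0916\u0917\u0918\u0919\u091a\u091b\u091c\u091d\u091e"
    "\u091f\u0920\u0921\u0922\u0923\u0924\u0925\u0926\u0927\u0928"
    "\u092a\u092b\u092c\u092d\u092e\u092f\u0930\u0932\u0935"
    "\u0936\u0937\u0938\u0939",
    "kKgGfcCjJFtTdDNwWxXnpPbBmyrlvSRsh"))


def devanagari_to_roman(text: str) -> str:
    out = []
    pending = False  # True while a consonant still carries its inherent 'a'
    # NFD splits precomposed nukta letters into base letter + nukta sign.
    for ch in unicodedata.normalize("NFD", text):
        if ch in CONSONANTS:
            if pending:
                out.append("a")
            out.append(CONSONANTS[ch])
            pending = True
        elif ch == NUKTA:
            out.append("Z")          # nukta operator follows the consonant
        elif ch in MATRAS:
            out.append(MATRAS[ch])   # matra replaces the inherent 'a'
            pending = False
        elif ch == VIRAMA:
            pending = False          # halant suppresses the inherent 'a'
        else:
            if pending:
                out.append("a")
                pending = False
            out.append(VOWELS.get(ch) or SIGNS.get(ch) or ch)
    if pending:
        out.append("a")
    return "".join(out)


if __name__ == "__main__":
    print(devanagari_to_roman("\u0905\u0915\u094d\u0937\u0930\u092e\u093e\u0932\u093e"))
```

Characters outside the tables (spaces, digits, punctuation) are passed through unchanged.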

It is expected that this document will be carefully read, discussed and reflected upon. Some languages may also have their special requirements. Any feedback or comment pertaining to this may be sent to one of the following:

(Prof. R.M.K. Sinha, [email protected], Tel: 0512-597174

Prof. Pushpak Bhattacharya, [email protected], Tel: 022-5767718

Prof. Uma Maheswar Rao, [email protected])

Proposed Standard for Indian Script to Roman Transliteration Table (INSROT)

Indian scripts are phonetic and have similarity in alphabetic correspondence. Transliteration between Indian languages is simple, unambiguous and phonetically similar. However, there is a need for a scheme for transliteration from Indian scripts to Roman script. There have been such schemes, but they lack readability of the transliterated text.

INSROT (Indian Script Roman Transliteration Table) is proposed as a standard transliteration scheme, considering the need for readability and ease of decoding the romanized text unambiguously. It is an orthographic representation. The transliteration table, the rules of syllable formation and disambiguation/decoding, and examples are given below:

Vowels

Syllabic Form | Intra Syllabic Form | INSROT
अ | - | a
आ | #ा | A
इ | #ि | i
ई | #ी | I
उ | #ु | u
ऊ | #ू | U
ए | #े | e
ऐ | #ै | E
ओ | #ो | o
औ | #ौ | O
अॅ | #ॅ | ah
ऑ | #ॉ | Ah
ऋ | #ृ | Ri
ॠ | #ॄ | RI
ऌ | | lRi
ॡ | | lRI
#ं | | #M
#ँ | | #Mh
#ः | | #:

=> The character ऋ (Ri) / ऌ (lRi) does not represent a single vocalic sound in Hindi, but is vocalic in terms of the script, having separate syllabic and intra syllabic forms.


Consonants

Class | Voiceless unaspirated (अघोष अल्पप्राण) | Voiceless aspirated (अघोष महाप्राण) | Voiced unaspirated (घोष अल्पप्राण) | Voiced aspirated (घोष महाप्राण) | Nasals (नासिक्य)
Velars (कंठ्य) | क (ka) | ख (kha) | ग (ga) | घ (gha) | ङ (Nha)
Palatals (तालव्य) | च (cha) | छ (chha) | ज (ja) | झ (jha) | ञ (nha)
Retroflexes (मूर्धन्य) | ट (Ta) | ठ (Tha) | ड (Da) | ढ (Dha) | ण (Na)
Dentals (दन्त्य) | त (ta) | थ (tha) | द (da) | ध (dha) | न (na)
Labials (औष्ठ्य) | प (pa) | फ (pha) | ब (ba) | भ (bha) | म (ma)

Semi vowels (अर्ध स्वर) : य (ya), व (va)
Liquid (तरल) : र (ra), ल (la)
Sibilants (संघर्षी) : श (sha), ष (Sa), स (sa)
Glottal (काकल्य ध्वनि) : ह (Ha)

क्ष (kSa), त्र (tra), ज्ञ (jnha)

Nukta Consonants

क़ (k.a), ख़ (kh.a), ग़ (g.a), ज़ (j.a), ड़ (D.a), ढ़ (Dh.a), फ़ (ph.a), ऱ (r.a), व़ (v.a)

Explicit Halant

#् = X (not-join): पक्व ~ pakva, पक्‌व ~ pakXva, पंक ~ paMka, पङ्क ~ paNhka, पङ्‌क ~ paNhXka

Intra Syllabic Vowel Forms combined with a consonant character

क (ka), का (kA), कि (ki), की (kI), कु (ku), कू (kU), कृ (kRi), के (ke), कै (kE), को (ko), कौ (kO), कॉ (kAh), कं (kaM), कँ (kaMh)

Rules for decoding from Romanized text

‘a’ is an integral part of a consonant. Omission of ‘a’ results in a ‘pure’ consonant, e.g. ka (क), k (क्).

A vowel combined with a ‘pure’ consonant transforms into a mAtrA, e.g. khi (खि), ghU (घू), kAh (कॉ).

A full stop “.” should be followed by a space.

Capital letters denote a dIrgh (prolonged) vowel/mAtrA, e.g. [a (अ), A (आ)], [i (इ), I (ई)], [ku (कु), kU (कू)], [ke (के), kE (कै)], ...

h is added to consonants/vowels to denote aspirated or closer sounds, e.g. kha (ख), chha (छ), jha (झ), ... Nha (ङ), nha (ञ), ah (अॅ), Ah (ऑ), aMh (अँ), AMh (आँ), kaMh (कँ)

M denotes anusvAr/bindu and combines with the previous vowel, as in aM (अं), kaM (कं), kiM (किं), sviMga (स्विंग)

अँ = aMh ; #ँ = #Mh

[Note: Graphemic code joining as अॅ + #ं = अँ (ahM) is not permitted.]

e.g. kh (ख), Nh (ङ), Ah (ऑ), #ँ (#Mh), हँ (HaMh)

Examples of word/phrase level transliteration

कट (kaTa), खाट (khATa), घृत (ghRita), चिप (chipa), झील (jhIla), बाण (bANa), तुक (tuka), थूक (thUka), फोड़ (phoD.a), मैल (mEla), षट (SaTa), साथ (sAtha), आशा (AshA), ईख (Ikha), ऋषि (RiSi), और (Ora), बूढ़ा (bUDh.A), मक्खन (makkhana), ड्राइवर (DrAivara), अण्डा (aNDA), गद्दी (gaddI), सत्ताईस (sattAIsa), पृथ्वी (pRithvI), प्रथम (prathama), कुँवर (kuMhvara), ख़्बाब (kh.bAba), श्वान (shvAna), प्राण (prANa), पुण्य (puNya), गङ्गा (gaNhgA), ज्ञानज्योति (jnhAnajyoti), नूपुर (nUpura), ध्वनि (dhvani), सुनि (suni), कॉलेज (kAhleja), हंस (HaMsa), हँस (HaMhsa), आँखें (AMhkheM), विपत्ति (vipatti), परिस्थिति (paristhiti), ikWu (pahna), बॉल (bAhla), बाँके बिहारी (bAMhke biHArI), गङ्‌गा (gaNhXgA), क्‌वाथ (kXvAtha), क्वा (kvA), पद्य (padya), पद्‌य (padXya), पूर्वी (pUrvI), ट्रेन (Trena), प्राण (prANa).
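The decoding rules above can be sketched as a small longest-match decoder in Python. This is a simplified illustration, not a reference implementation: it targets Unicode Devanagari, it omits the vowels ah, RI, lRi and lRI, it renders the explicit halant X as a virama followed by a zero-width non-joiner (an assumption about how "not-join" should be encoded), and the function name `insrot_to_devanagari` is hypothetical.

```python
VIRAMA, NUKTA, ZWNJ = "\u094d", "\u093c", "\u200c"

CONSONANTS = {
    "k": "\u0915", "kh": "\u0916", "g": "\u0917", "gh": "\u0918", "Nh": "\u0919",
    "ch": "\u091a", "chh": "\u091b", "j": "\u091c", "jh": "\u091d", "nh": "\u091e",
    "T": "\u091f", "Th": "\u0920", "D": "\u0921", "Dh": "\u0922", "N": "\u0923",
    "t": "\u0924", "th": "\u0925", "d": "\u0926", "dh": "\u0927", "n": "\u0928",
    "p": "\u092a", "ph": "\u092b", "b": "\u092c", "bh": "\u092d", "m": "\u092e",
    "y": "\u092f", "v": "\u0935", "r": "\u0930", "l": "\u0932",
    "sh": "\u0936", "S": "\u0937", "s": "\u0938", "H": "\u0939",
}
# INSROT vowel -> (syllabic form, intra-syllabic form)
VOWELS = {
    "a": ("\u0905", ""), "A": ("\u0906", "\u093e"),
    "i": ("\u0907", "\u093f"), "I": ("\u0908", "\u0940"),
    "u": ("\u0909", "\u0941"), "U": ("\u090a", "\u0942"),
    "Ri": ("\u090b", "\u0943"),
    "e": ("\u090f", "\u0947"), "E": ("\u0910", "\u0948"),
    "o": ("\u0913", "\u094b"), "O": ("\u0914", "\u094c"),
    "Ah": ("\u0911", "\u0949"),
}
SIGNS = {"Mh": "\u0901", "M": "\u0902", ":": "\u0903"}


def _match(text, pos, table):
    """Return the longest key of `table` that starts at `pos`, or None."""
    for key in sorted(table, key=len, reverse=True):
        if text.startswith(key, pos):
            return key
    return None


def insrot_to_devanagari(text: str) -> str:
    out, i, bare = [], 0, False  # bare: last consonant has no vowel yet
    while i < len(text):
        ch = text[i]
        if bare and ch == "X":                       # explicit halant
            out.append(VIRAMA + ZWNJ); bare = False; i += 1; continue
        if bare and ch == "." and i + 1 < len(text) and not text[i + 1].isspace():
            out.append(NUKTA); i += 1; continue      # nukta, not a full stop
        key = _match(text, i, CONSONANTS)
        if key:
            if bare:
                out.append(VIRAMA)                   # pure consonant joins next
            out.append(CONSONANTS[key]); bare = True; i += len(key); continue
        key = _match(text, i, VOWELS)
        if key:
            syllabic, matra = VOWELS[key]
            out.append(matra if bare else syllabic)
            bare = False; i += len(key); continue
        if bare:
            out.append(VIRAMA); bare = False
        key = _match(text, i, SIGNS)
        if key:
            out.append(SIGNS[key]); i += len(key); continue
        out.append(ch); i += 1                       # pass anything else through
    if bare:
        out.append(VIRAMA)
    return "".join(out)


if __name__ == "__main__":
    for word in ("makkhana", "pRithvI", "kAhleja", "AMhkheM"):
        print(word, "->", insrot_to_devanagari(word))
```

The longest-match step is what makes digraphs such as kh, chh and Mh decode as single units, which is the disambiguation the rules rely on.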

[Suggestions on the above draft INSROT are requested. Contact : [email protected]]

4.1 Reader’s Feedback

• “... Which gives at one place information regarding the work being done for information technology for Indian languages. This is a very useful publication and I am sure, it will help in expediting the progress of work in this critical area.”

-Shri N. Vittal, CVC, Central Vigilance Commission

• “... It is very informative and may be useful for developing our programme Initiative B@bel. This programme is foreseen to start in 2002 and we hope it will contribute to the implementation of the Recommendation on the promotion and use of multilingualism in cyberspace, which will be submitted for approval to the forthcoming session of our General Conference.”

-Victor Montviloff, UNESCO, Paris

• “... Congratulate you for organizing successfully the Meet 2001. I am thankful to you for updating us on the issue by sending the literature related to the meet 2001 on Indian Language Technology Vision. I also found your newsletter very useful and informative.”

-Shri Rakesh Sharma, Secretary (IT), Government of Uttaranchal, Department of Information Technology, Dehradun

• “... Congratulations on bringing out yet another excellent issue. I would request that in the next version we could possibly include details about WIN XP, the latest OS from Microsoft that supports 9 Indian languages.”

-Shri Raveesh Gupta, Manager Localisation, Microsoft

• “... The information given in the Newsletter is highly useful and relevant for people using Indian languages for information technology. This is a very valuable document, which enable us to contact a number of agencies involved in this area. It is laudable effort put in by your group to bring out this kind of useful publication. Congratulations for excellent quality of print as well as information.”

-Dr. S. Ahmad, Director, CEERI, Pilani

• “... The publication has been found extremely useful as it contains wealth of information relating to technology development of languages.”

-Shri R.K.Singh, Consultant, Electronics and Computer Software Export Promotion Council

• “... I am writing to thank you for the recent copies of the VishwaBharat@tdil newsletter which I have received from you. These are of great interest to me. I am a Sanskritist with experience in the computer representation of Indian languages, and the newsletter is an invaluable source of information on what work is being done where. I hope to remain on your mailing list!”

-Dr. J.D. Smith, Faculty of Oriental Studies, Sidgwick Avenue, Cambridge CB3 9DA

• “... I find that very useful work has been taken up/completed in Machine Aided Translation, Human-Machine Interface System, Development of Indian Language Tools and Resources.”

-Prof. Ashoka Chandra, Special Secretary, MHRD

• “... We read with interest about your initiatives in India, especially the TDIL (Technology Development for Indian Language) and ‘Digital Unite and knowledge for All’ projects and the possibility of setting up a Technological Interchange mechanism. We in ILCA (Instituto de Lengua y Cultura Aymara) are very interested in the advances in language technology that you are making, and wish to know more about them, to see how we can apply similar measures here.”

-Juan de Dios Yapita, Instituto de Lengua y Cultura Aymara, Bolivia, Sudamerica

• “... You brought so much useful and important information it was a great pleasure to me to find out that so much had been achieved over the two years since my previous visit to India. I would particularly appreciate something that described and catalogued the technologies now available, as described in the excellent booklet that was distributed.”

-Prof. Pat Hall, Open University, Milton Keynes

Appreciation for TDIL information dissemination was received from a number of organisations. Some of them are:

• Electronics Test & Development Centre, Thiruvanmiyur, Chennai
• Embassy of India, Stockholm
• Singapore High Commission, New Delhi
• Government of Assam Industries and Commerce Department, Guwahati
• Secretariat for Information Technology, Haryana
• Electronic Test & Development Centre, Mohali, Punjab
• Mauritius High Commission, New Delhi
• Embajada de Panama, New Delhi
• Indian Investment Centre, New Delhi
• Academy of Sanskrit Research, Melkot, Karnataka
• Birla Institute of Technology, Mesra, Ranchi
• University of Hyderabad, Hyderabad

4.2 Frequently Asked Questions

1. What is Language Technology?

Language technology is the study of computer systems that understand and/or synthesize spoken and written human languages. Included in this area are speech processing (recognition, understanding, and synthesis), information extraction, handwriting recognition, machine translation, text summarization, and language generation.

2. What is Computational Linguistics?

Computational linguistics (CL) is a discipline between linguistics and computer science which is concerned with the computational aspects of the human language faculty. It belongs to the cognitive sciences and overlaps with the field of artificial intelligence (AI), a branch of computer science that aims at computational models of human cognition. There are two components of CL: applied and theoretical. The applied component of CL is more interested in the practical outcome of modelling human language use. The goal is to create software products that have some knowledge of human language.

3. What do you mean by bilingual software?

Bilingual software supports two languages: one is English and the other is any regional language.

4. What is a script?

A script is the set of symbols required to represent a single writing system, which may in turn be used to represent several languages. Latin, Arabic and Thai are examples of scripts. English, French, German and Latin are all languages written using the Latin script.

5. What is speech synthesis?

Speech synthesis programs convert written input to spoken output by automatically generating synthetic speech. Speech synthesis is often referred to as “Text-to-Speech” (TTS) conversion.

6. What is ISCII?

The Bureau of Indian Standards formed a standard known as ISCII (Indian Script Code for Information Interchange) for use in all computer and communication media, which allows usage of 7- or 8-bit characters. In an 8-bit environment, the lower 128 characters are the same as defined in IS 10315:1982 (ISO 646 IRV), the 7-bit coded character set for information interchange, also known as the ASCII character set. The top 128 characters cater to all the Indian scripts based on the ancient Brahmi script. In a 7-bit environment, the control code SO can be used for invocation of the ISCII code set and the control code SI can be used for reselection of the ASCII code set.

There are 15 officially recognized languages in India. Apart from Perso-Arabic scripts, all the other 10 scripts used for Indian languages have evolved from the ancient Brahmi script and have a common phonetic structure, making a common character set possible. An attribute mechanism has been provided for selection of different Indian script fonts and display attributes. An extension mechanism allows use of more characters along with the ISCII code. The ISCII code table is a superset of all the characters required in the Brahmi-based Indian scripts. For convenience, the alphabet of the official script Devanagari has been used in the standard. The standard IS 13194:1991, issued by the Bureau of Indian Standards, is the latest Indian Standard for Information Interchange, and is being widely used for development of IT products in Indian languages.
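The 8-bit layout described above (ASCII in the lower 128 positions, Indian script characters in the top 128) means a byte's top bit alone tells which half it belongs to. A minimal Python sketch of that split follows; the function name `split_runs` is hypothetical, and the upper-half byte values in the usage line are arbitrary placeholders rather than a real ISCII word.

```python
from itertools import groupby


def split_runs(data: bytes):
    """Group an 8-bit byte string into (half, bytes) runs by the top bit."""
    runs = []
    for is_upper, group in groupby(data, key=lambda b: b >= 0x80):
        # bytes below 0x80 are the ASCII half, the rest the ISCII half
        runs.append(("ISCII" if is_upper else "ASCII", bytes(group)))
    return runs


# Usage: two ASCII letters, two upper-half bytes, then ASCII again.
runs = split_runs(b"ab\xa4\xb3 c")
```

The sketch does not decode the upper-half bytes to any script; that step depends on the selected script attribute and is outside what the text above specifies.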

7. What is ACII Script Code?

ACII stands for Alphabetic Code for Information Interchange (pronounced Ae-Kee). It is a new name given to the ISCII code, which now also encompasses the national scripts of SAARC countries. This is an 8-bit code, containing the ASCII character set in the bottom half. The top half contains the ACII characters. PC-ACII Script code is the version of ACII script code where the characters are split in the upper half for compatibility with the IBM PC.

8. How is text represented through ACII?

ACII (Alphabetic Code for Information Interchange) code contains all the basic characters available on the ACII keyboard. For example, the ACII Indian code and keyboard accommodate the requirements for the 10 Indian scripts: Assamese, Bengali, Devanagari, Gujarati, Kannada, Malayalam, Oriya, Punjabi, Tamil and Telugu. The basic characters are ordered such that direct sorting gives results which are almost the same as that for any of the scripts. The ACII codes have to be converted to ISFOC for display purposes. This is done through an ISFA algorithm for the selected script. An ACII text can be displayed in any of the scripts. Transliteration to another script can be achieved by merely selecting that script. ACII code is used in communication media, like telex, for optimal transfer of text. The ALP word processor uses the ACII code internally to allow proper editing at the alphabetic level and unique representation of spellings.

The existing Windows applications are unable to handle ACII directly, as it requires an intelligent algorithm for handling the display. They can, however, handle the ISFOC codes, which were made for this purpose. Thus, conversion is necessary between ACII and ISFOC whenever text has to be transferred from ALP to a Windows application. It is possible to type ISFOC text directly within a Windows application using the ACII keyboard. This is done through a custom keyboard driver that performs the ACII to ISFOC conversion internally.
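The idea that one stored text can be shown in another script "by merely selecting that script" can be illustrated with Unicode rather than ACII itself, since the Unicode blocks for the major Indic scripts share a common, ISCII-derived layout. The following Python sketch is an analogy under that assumption, not an ACII or ISFA implementation; the function name `reselect_script` is hypothetical.

```python
# Start of each 128-position Unicode block for the Brahmi-derived scripts.
BLOCK_START = {
    "Devanagari": 0x0900, "Bengali": 0x0980, "Gurmukhi": 0x0A00,
    "Gujarati": 0x0A80, "Oriya": 0x0B00, "Tamil": 0x0B80,
    "Telugu": 0x0C00, "Kannada": 0x0C80, "Malayalam": 0x0D00,
}


def reselect_script(text: str, source: str, target: str) -> str:
    """Shift every character of the source block to the same slot in the target block."""
    lo = BLOCK_START[source]
    shift = BLOCK_START[target] - lo
    return "".join(
        chr(ord(c) + shift) if lo <= ord(c) < lo + 0x80 else c
        for c in text
    )


# Usage: Devanagari 'kamala' re-expressed in the Bengali block.
bengali = reselect_script("\u0915\u092e\u0932", "Devanagari", "Bengali")
```

Not every position is assigned in every script (Tamil, for instance, has fewer consonants), so a practical converter would need per-script exception tables on top of this offset shift.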

9. Are there any new entities required for ensuringproper representation of complex scripts?

The following entities are required for ensuring proper representation of complex scripts:

ACII - Alphabetic Code for Information Interchange
This is a computer code by which the basic alphabet of a script is represented. The basic letters and signs needed in most scripts (leaving aside ideographic scripts like Chinese) are fewer than 96. All the possible shapes in a script can be expressed through combinations of these basic letters. The ACII code can be typed through an ACII keyboard overlay. The ACII keyboard overlay fits on a standard English keyboard. Each ASCII character has a unique position on the keyboard overlay.

ISFOC - Intelligence Based Script Font Code
ISFOC is a coded character set containing all the basic shapes required for rendering a script. These shapes can be overlapped linearly to compose any word in the script. Each of the ISFOC characters is like a piece of a jigsaw puzzle; it may not be a complete letter by itself. Each ISFOC set can contain a maximum of 188 characters. This is adequate for most of the scripts. However, some require more.

ISFA - Intelligence Based Script to Font Algorithm
A word is always typed in terms of its basic ACII characters. It, however, has to be displayed using the basic ISFOC shapes. An algorithm is required for converting the ACII codes to the appropriate ISFOC codes. This is the ISFA algorithm.

10. What is UNICODE?

Unicode is increasingly being accepted as a standard for information interchange worldwide. Unicode for Indian languages is based on ISCII-88 and not on ISCII-91, which is the latest official standard.

The Unicode standard is the 16-bit (2-byte) universal character encoding standard, used for representation of text for computer processing. The Unicode standard provides the capacity to encode all of the characters used for the written languages of the world, and provides information about the characters and their use. It is very useful for computer users who deal with multilingual text: business people, linguists, researchers, scientists, mathematicians and technicians. Unicode uses a 16-bit encoding that provides code points for more than 65,000 characters (65,536). The Unicode standard assigns each character a unique numeric value and name. The Unicode standard and the ISO 10646 standard provide an extension mechanism called UTF-16 that allows for encoding as many as a million characters. Presently the Unicode standard provides codes for 49,194 characters.
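The UTF-16 extension mechanism mentioned above works by pairing two 16-bit units (a "surrogate pair") for every character beyond the first 65,536. A short Python sketch of that arithmetic follows; the function name `to_surrogates` is hypothetical, while the constants are those defined by UTF-16.

```python
def to_surrogates(code_point: int):
    """Split a code point above U+FFFF into its UTF-16 high and low surrogates."""
    if not 0x10000 <= code_point <= 0x10FFFF:
        raise ValueError("only code points U+10000..U+10FFFF use surrogate pairs")
    v = code_point - 0x10000      # 20-bit value
    high = 0xD800 + (v >> 10)     # top 10 bits
    low = 0xDC00 + (v & 0x3FF)    # bottom 10 bits
    return high, low


# 1024 high surrogates x 1024 low surrogates: the "million" extra characters.
EXTRA_CHARACTERS = 1024 * 1024
```

The result can be cross-checked against Python's own encoder, for example by comparing `to_surrogates(0x1D11E)` with the bytes of `chr(0x1D11E).encode("utf-16-be")`.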

11. What is a font?

A font, as far as a computer is concerned, is the file or files necessary to display and print a particular typeface. Dv-TTYogesh, for example, is a typeface. Typefaces are also referred to as fonts. Each font comprises one or more files, depending on the font technology used.

12. What is a font family?

Font families are collections of fonts which look similar but have slightly different attributes. Dv-TTYogesh Regular and Dv-TTYogesh Bold are two different fonts, but in the same family.

13. What is a bitmapped font? What is a screen font?

A bitmapped font is also referred to as a screen font. Bitmapped fonts are files which contain pixel information your computer uses to display the font on the screen. Bitmapped font files are for a particular point size. If you have bitmapped fonts for Helvetica at 12 point and 14 point, Helvetica at 13 point will look slightly pixelated on the screen. If all you have installed is bitmapped fonts, your printer will print fonts which don’t look very smooth. Font sizes which are physically installed will look better, but there are problems with having too many fonts open simultaneously. There is font technology to avoid that problem these days, and you are likely using some of that technology.

14. What is a printer font? What is a PostScript font?

When you talk about printer fonts, you are usually talking about PostScript. PostScript fonts come in pairs (there may be more than two files involved): one or more screen (bitmapped) fonts, and one printer font. The printer font is scalable, meaning that whatever font size you are using will be scaled properly by a PostScript-capable printer, and will look smooth on the paper. The printer font is used for printers; the screen font is normally only used for on-screen display. You may have multiple bitmapped fonts which are all linked to the same PostScript font. For example, you may have Helvetica Bold 12 pt, Helvetica Bold 14 pt, and Helvetica Bold 24 pt, but they will all use the same printer font, HelveBol.

15. What is an OpenType font?

OpenType is a cross-platform font file format developed jointly by Adobe and Microsoft. The two main benefits of the OpenType format are its cross-platform compatibility (the same font file works on Macintosh and Windows computers), and its ability to support widely expanded character sets and layout features, which provide richer linguistic support and advanced typographic control.

The OpenType format is an extension of the TrueType SFNT format that also can support Adobe PostScript font data and new typographic features. OpenType fonts containing PostScript data, such as those in the Adobe Type Library, have an .otf suffix in the font file name, while TrueType-based OpenType fonts have a .ttf file name suffix.

OpenType fonts can be installed and used alongside PostScript Type 1 and TrueType fonts.

16. What is a dfont?

A dfont is a special version of a Macintosh TrueType font. All the information that is normally stored in a TrueType font’s resource fork has been moved to the data fork. Typically, the only dfonts you will run into will come with Mac OS X.

17. What is a TrueType font?

TrueType is a font technology from Apple which allows you to have smooth font screen displays and printing without needing extra screen font sizes or PostScript. TrueType fonts will print smoothly to non-PostScript printers. TrueType fonts consist of one scalable TrueType file, and possibly one or more bitmapped screen fonts. Although TrueType technology is very efficient, and removes the need for Adobe Type Manager to smooth your fonts, some PostScript printers have problems with TrueType fonts.

18. What are dynamic fonts?

Dynamic fonts are the technology used for delivering Windows TrueType fonts to the client side in a transparent way. If a site needs to provide a facility for viewing its pages in Indian languages, then the fonts can be delivered to the client in EOT and PFR formats.

19. What are EOT (Embedded Open Type) & PFR (Portable Font Resource) formats?

The EOT (Embedded Open Type) format is Microsoft’s way of sending encoded fonts to clients. Only Internet Explorer (version 4.0 onwards) can use EOTs. EOTs are tied to a specific URL: if the web designer provides a link to an EOT, the browser uses that EOT to display the page. This means that only the particular websites with links to the specific URL can use the EOTs made for them. PFR (Portable Font Resource) is another way to send fonts dynamically to the user. It can be used both in Netscape (4.03 and above) and IE (4.0 and above). In IE, however, there is a one-time download of a control on the client’s machine. PFRs also have URL security and can be locked to particular URLs. PFRs are more stable than EOTs but sometimes need encoding changes in IE 5.0.

Usually a JavaScript routine is used to query the browser, and PFRs or EOTs are served to the client accordingly, so that a particular font can be displayed without user intervention.
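The browser query described above amounts to a small decision rule. The sketch below expresses that rule in Python rather than JavaScript, purely for illustration; the function name `pick_font_format`, the user-agent patterns it matches, and the choice to prefer EOT whenever Internet Explorer is detected are all assumptions, not details given in the text.

```python
import re


def pick_font_format(user_agent: str):
    """Return 'eot', 'pfr' or None for a browser identification string."""
    ie = re.search(r"MSIE (\d+\.\d+)", user_agent)
    if ie:
        # Internet Explorer 4.0 onwards can use EOT (assumed preference over PFR)
        return "eot" if float(ie.group(1)) >= 4.0 else None
    ns = re.match(r"Mozilla/(\d+\.\d+)", user_agent)
    if ns and "compatible" not in user_agent and float(ns.group(1)) >= 4.03:
        # Netscape 4.03 and above can use PFR
        return "pfr"
    return None


# Usage with two period-style identification strings (illustrative values).
choice_ie = pick_font_format("Mozilla/4.0 (compatible; MSIE 5.0; Windows 98)")
choice_ns = pick_font_format("Mozilla/4.7 [en] (Win98; I)")
```

Returning None leaves it to the page to fall back to whatever fonts the client already has installed.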

20. How do fonts get activated?

Normally your fonts reside in the Fonts folder in the Windows directory. The computer boots up, looks in the folder, and turns the fonts on. They are then available to all applications. If you put lots of fonts into your Windows Fonts directory, your system will slow down drastically, and you could run into stability problems as well.

21. What is a font manager?

A font manager helps you to have more fonts without causing system problems. When you use a font manager you keep your fonts elsewhere on your system than in the Fonts folder or directory. The font manager keeps a list of all your fonts, and you can turn on the ones you need and turn off the ones you are done with. You will have to restart most applications before the fonts will be available.

22. What are the different keyboard layouts for typing in Indian languages?

There are 4 different keyboard layouts.

1. Romanised Layout : In the Romanised layout, phonetic English mappings are used to compose Hindi text. For example, the key sequence raamaa (or rAmA) can be used to type ‘Rama’.

2. Typewriter Layout : This layout is similar to the Hindi typewriter layout and is useful for Hindi typists and other people familiar with it. See the Typewriter Layout & Key Sequence Charts.

3. Phonetic Layout : This layout was standardized by the erstwhile Department of Electronics (DOE), Govt. of India. The advantage of this layout is that it remains identical for all Indian languages. For example, the key ‘k’ is used to represent the letter ‘ka’ in all Indian languages. The Keyboard Layout and the Key Sequence Charts can be used to find the correct key combinations.

4. Consonant Keyboard : The phonetic division of Indian alphabets into vowels and consonants serves as a common base for all Indian scripts. Vowels are called the soul and consonants are called the body. The combination of the two becomes an animated body. Without the addition of a vowel (soul), the consonant (body) is like a ‘dead letter’. Dead consonants can also be termed ‘pure consonants’.

The keyboard which accommodates these phonetic peculiarities efficiently, taking into account the inherent logic built into Indian scripts, is named the DESHA (consonant) keyboard (DESHA meaning country). DESHA is based on the ‘Barahkhandi’ concept. Through the DESHA keyboard, all possible glyph combinations used in a linguistic/information environment are produced using only 36 consonant keys and 12 vowel keys. In the DESHA keyboard there are no separate keys for vowel signs and vowel matra signs. The keys showing the vowel signs produce vowel signs as well as vowel matra signs as per the inbuilt logic.
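The "inbuilt logic" of a shared vowel key can be sketched as follows: the same key yields a matra when it follows a consonant and a full vowel otherwise. This Python sketch is an illustration only. The twelve vowels listed are the traditional twelve-form series for Devanagari and are an assumption, since the text does not enumerate the DESHA vowel keys; the names `VOWEL_KEYS`, `barakhadi` and `press_vowel` are hypothetical.

```python
# (independent vowel, matra) for twelve assumed vowel keys.
VOWEL_KEYS = [
    ("\u0905", ""), ("\u0906", "\u093e"), ("\u0907", "\u093f"),
    ("\u0908", "\u0940"), ("\u0909", "\u0941"), ("\u090a", "\u0942"),
    ("\u090f", "\u0947"), ("\u0910", "\u0948"), ("\u0913", "\u094b"),
    ("\u0914", "\u094c"), ("\u0905\u0902", "\u0902"), ("\u0905\u0903", "\u0903"),
]


def barakhadi(consonant: str):
    """All twelve consonant + vowel combinations for one consonant key."""
    return [consonant + matra for _, matra in VOWEL_KEYS]


def press_vowel(buffer: str, key_index: int) -> str:
    """Append a matra after a consonant, otherwise the independent vowel."""
    independent, matra = VOWEL_KEYS[key_index]
    # Devanagari consonants occupy U+0915 (ka) to U+0939 (ha)
    after_consonant = bool(buffer) and "\u0915" <= buffer[-1] <= "\u0939"
    return buffer + (matra if after_consonant else independent)


# 36 consonant keys x 12 vowel keys give the basic syllable grid.
SYLLABLES = 36 * 12
```

For example, `press_vowel("", 1)` gives the independent vowel आ, while `press_vowel("क", 1)` gives का.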


TDIL PROGRAMME
Ministry of Communications & Information Technology
Department of Information Technology
Electronics Niketan, 6, CGO Complex, New Delhi-110003
Telefax : 011-436 3076  E-mail : [email protected]  http://tdil.mit.gov.in

Addresses for Reference

Resource Centers

• Prof. R.M.K. Sinha
  Department of Computer Sc. & Engg., Indian Institute of Technology, Kanpur 208 016
  Tel: 0512-597578/597170/597302 (Off), 500034 (Res)
  E-mail : [email protected]

• Prof. B.B Chaudhary
  Head, Comp. Vision and Pattern Recognition, Indian Statistical Institute, 203, B.T. Road, Calcutta-700035
  Tel: 033-5778085/5777694/5775502/5775402/5778049/5773241/5771927 (R)
  Fax: 5776680/5773035
  E-mail : [email protected]

• Prof. Pushpak Bhattacharya
  Department of Computer Sc. & Engg., Indian Institute of Technology, Powai, Mumbai 400 076
  Tel: 022-5767718 (O), 5772195 (R)
  Fax: 022-5720290/5723480
  E-mail : [email protected]

• Prof. K. Narayan Murthy
  Centre for Applied Linguistics, Uni. of Hyderabad, Hyderabad 500 134
  Tel: 040-3010500/3010158 Extn. 4017, 4056 (O), 040-3010064 (D)
  Fax: 040-3010120/3010145
  E-mail : [email protected]

• Prof. Gautam Barua
  Department of Computer Sc. & Engg., Indian Institute of Technology, Panbazar, Guwahati 781 001 (Assam)
  Tel: 0361-690325/690787/452088 Extn. 2029, 690321-28 (O), 452088 (R)
  Mobile: 098640-25475
  E-mail : [email protected]

• Shri M.D. Kulakarni
  Centre for Development of Advanced Computing, GIST Group, Pune University, Ganesh Khind Road, Pune - 411 007 (Maharashtra)
  Tel: 020-5694000, 4002-09 (O)
  E-mail : [email protected]

• Prof. G.V. Singh
  School of Computer and Systems Sciences, Jawaharlal Nehru University, New Mehrauli Road, New Delhi - 110 067
  Tel: 011-6101895, 6107676
  E-mail : [email protected], [email protected]

• Prof. G.S. Lehal
  Department of Computer Science & Engineering, Thapar Institute of Engg. & Technology, Deemed Univ., Patiala 147 001
  Tel: 0175-393137/393374 (O), 283502
  E-mail : [email protected]

• Prof. Ravinder Kumar
  ER & DCI, Vellayambalam, Tiruvanantapuram 695 033
  Tel: 0471-320116, 326718 (R)
  Fax: 0471-331654
  E-mail : [email protected]

• Dr. T.V. Geetha
  Co-ordinator (RCILTS-Tamil), School of Computer Sc. & Engineering, Anna University, Chennai - 600 025
  Tel: 044-2351723 Extn. 3342 (Geetha), 3347 (Ranjani), 2355997 (Ranjani), 4422620 (Geetha)
  Fax: 044-2350397
  E-mail : [email protected]

• Prof. Sitanshu Y Mehta
  Department of Gujarati, Faculty of Arts, M.S. University of Baroda, Baroda - 390 002
  Tel: 0265-792959
  E-mail : [email protected]

• Shri A.K. Pujari
  CEO, IPICOL, Annexe Building, Janpath, Bhubaneshwar - 751 007
  Tel: 0674-543113/540830/543485/541154
  Fax: 0674-542669
  E-mail : [email protected]

• Ms Sanghmitra Mohanty
  Department of Computer Science & Application, Vani Vihar, Utkal University, Bhubaneshwar - 751 004
  Tel: 0674-580216 (O), 540865 (R)
  Fax: 0674-581850
  E-mail : [email protected]

• Prof. N.J. Rao
  Indian Institute of Science, Department of Electrical Engineering, Bangalore-560012
  Tel: 080-3092222
  E-mail : [email protected]

Unicode (Devanagari)

• Dr. Pushplata Taneja
  Director, Central Hindi Directorate, Ministry of HRD, Department of Education, West Block 7, R.K. Puram, New Delhi 110066
  Ph. 6100758
  Language: Hindi

• Shri Umakant Khubalkar
  Research Officer, Central Hindi Directorate, Ministry of HRD, Department of Education, West Block 7, R.K. Puram, New Delhi 110066
  Phone: 011-6105211/6103219 Extn. 236
  Language: Hindi

• Dr. R.M.K. Sinha
  Indian Institute of Technology, Kanpur 208016
  Tel: 598570, 597578, 597170
  E-mail : [email protected]
  Language: Hindi, Nepali

• Shri Jalaj Shrivastava
  Secretary (Information Technology), Govt. of Goa, Goa Secretariat, Panaji
  Tel: 0832-221334 (O); Goa House, New Delhi Tel: 4629967/68
  Language: Konkani

• Dr. Chandra Lekha
  Konkani Department, Goa University, Taleigaon Plateau, Goa 403206
  Phone: 0832-514992 (R), VC Office: 0832-451346, Konkani Deptt.: 0832-454341
  Fax (VC Office): 0832-451184
  E-mail : [email protected]
  Language: Konkani

• Prof. K.J. Mahale
  Chorao, Panji, Goa-403 102
  Phone: 0832-222789, 239249/50
  Language: Konkani

• Prof. Pushpak Bhattacharya (CI)
  Department of Computer Sc. & Engg., Indian Institute of Technology, Powai, Mumbai - 400 076
  Tel: 022-5767718 (O), 022-5768718/5721955
  Fax: 022-5720290/5723480
  E-mail : [email protected]
  Language: Marathi, Konkani

• Shri Sunil Soni
  Secretary (IT), General Administration Dept., Government of Maharashtra, Maharashtra Secretariat, Mumbai 400 032
  Ph. 022-2026534, 3363773
  Language: Marathi

• Shri R.S. Basnet
  Secretary (IT & Personnel), Government of Sikkim, Deoroli, Gangtok, Sikkim 737101
  Fax: 03592-22658
  Language: Nepali

• Prof. V. Kutumba Shastri
  Director, Rashtriya Sanskrit Sansthan, 56-57, Institutional Area, Janakpuri, New Delhi 110058
  Phone: 011-5541949 (O), 5540993
  Language: Sanskrit

• Dr. (Mrs) Shukla Mukharjee
  Rashtriya Sanskrit Sansthan, 56-57, Institutional Area, Janakpuri, New Delhi 110058
  Phone: 011-5540993/5 (O), 5541949 (O)
  Language: Sanskrit

• Prof. G.V. Singh
  School of Computer and Systems Sciences, Jawaharlal Nehru University, New Mehrauli Road, New Delhi - 110 067
  Tel: 011-6101895, 6107676
  E-mail : [email protected], [email protected]
  Language: Sanskrit

• Dr. Kishore Vaswani
  Director, National Council for Promotion of Sindhi Language, Darpan Building, 6th Floor, R.C. Datta Road, Alkapuri, Vadodara 390007
  Phone: 0265-342246
  Fax: 0265-357331
  Language: Sindhi

• Dr. M.K. Jetly
  D-127, Vivek Vihar, Delhi 110095
  Phone: 011 2146121 (R)
  Language: Sindhi