Overview
• Introduction
• Computational linguistics in everyday life
• Web demos
• How linguistics fits in
• Who’s hiring
• So you want to be a computational linguist...
What is computational linguistics?
• Processing of human language by computers
• ... for linguistic research
• ... for practical applications
• Also known as natural language processing (NLP) or speech processing
• Point of contact of Linguistics, Computer Science (AI), and Electrical Engineering (signal processing)
How does a spell checker work?
• Compares input text to a dictionary (+ morphological analyzer) to detect non-words
• Runs error types in reverse (insertion, deletion, transposition, substitution) to come up with candidate corrections
• Compares candidate corrections to dictionary to find viable alternatives
• Ranks candidate corrections according to probability (frequency of that word in context)
• What about irregular morphology?
• What about spelling mistakes which result in other actual words (e.g., three/there, stationery/stationary)?
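The steps above can be sketched in a few lines of Python. This is a minimal illustration, not a production spell checker: the tiny `WORD_COUNTS` table is made-up data standing in for a real dictionary and corpus, the names `edits1` and `suggest` are hypothetical, and ranking uses plain word frequency rather than frequency in context. There is no morphological analyzer, so irregular forms are only handled if they are listed explicitly.

```python
from collections import Counter

# Toy dictionary with made-up frequencies (assumed data, not from a real corpus).
WORD_COUNTS = Counter(
    {"there": 50, "three": 20, "the": 100, "tree": 5, "spell": 8, "spelling": 6}
)
ALPHABET = "abcdefghijklmnopqrstuvwxyz"


def edits1(word):
    """All strings one insertion, deletion, transposition, or substitution away."""
    splits = [(word[:i], word[i:]) for i in range(len(word) + 1)]
    deletes = [left + right[1:] for left, right in splits if right]
    transposes = [
        left + right[1] + right[0] + right[2:]
        for left, right in splits
        if len(right) > 1
    ]
    substitutes = [
        left + c + right[1:] for left, right in splits if right for c in ALPHABET
    ]
    inserts = [left + c + right for left, right in splits for c in ALPHABET]
    return set(deletes + transposes + substitutes + inserts)


def suggest(word):
    """Return in-dictionary candidates, most frequent first; [] if the word is known."""
    if word in WORD_COUNTS:
        return []  # not a non-word, so nothing is flagged
    candidates = [w for w in edits1(word) if w in WORD_COUNTS]
    return sorted(candidates, key=lambda w: (-WORD_COUNTS[w], w))


print(suggest("thre"))   # non-word: candidates ranked by frequency
print(suggest("there"))  # real word: never flagged, even if "three" was intended
```

The second call shows the real-word problem raised in the last bullet: a dictionary lookup alone cannot catch there/three confusions, which is why context matters for ranking.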
Noisy channel model
               | Spell check              | ASR             | MT
Input I        | Word seq                 | Word seq        | English
Output O       | Word seq (with mistakes) | Acoustic signal | French
Noise          | Mistakes                 | Noise           | Babelfish
Target         | Eng. words               | Eng. words      | English
Est. Input Î   | Corrected words          | Text            | English
$\hat{i} = \arg\max_i p(i \mid o) \;\Leftrightarrow\; \hat{i} = \arg\max_i p(o \mid i)\, p(i)$
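The two forms of the argmax are equivalent by Bayes' rule: p(i|o) = p(o|i)p(i)/p(o), and p(o) is the same for every candidate i, so it can be dropped. A minimal sketch of the spell-check case follows. All probabilities are invented for illustration, and `PRIOR`, `CHANNEL`, and `decode` are hypothetical names, not part of any library.

```python
# Hypothetical toy probabilities; a real system would estimate these from data.
PRIOR = {"there": 0.6, "three": 0.3, "tree": 0.1}  # p(i): how likely each word is
CHANNEL = {                                        # p(o | i): how likely the typo is
    ("thre", "there"): 0.02,
    ("thre", "three"): 0.05,
    ("thre", "tree"): 0.01,
}


def decode(observed):
    """Return the intended word i that maximizes p(o | i) * p(i)."""
    scores = {
        i: CHANNEL.get((observed, i), 0.0) * p_i for i, p_i in PRIOR.items()
    }
    return max(scores, key=scores.get)


print(decode("thre"))
```

With these numbers the error model outweighs the prior: "there" is the more frequent word, but "three" is the more plausible source of the typo, so it scores higher.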
Linguistic Knowledge and Machine Learning
• Machine learning
• Design a probabilistic model (the ‘p’s on the previous slide)
• Estimate the probabilities for the model from some data
• Use model to predict labels/values/etc for new data
• Supervised: learner is given training data with target labels included
• Semi-supervised: learner is given a little bit of labeled data and a lot of unlabeled data
• Unsupervised: learner is only given unlabeled data (but lots!)
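As a small illustration of the supervised case, the sketch below follows the three steps listed above: the model is p(tag | word), the probabilities are estimated by relative frequency from labeled data, and the model then predicts a tag for new words. The training pairs, the tag names, and the functions `train` and `predict` are all assumptions made for the example.

```python
from collections import Counter, defaultdict

# Hypothetical labeled training data: (word, part-of-speech tag) pairs.
TRAINING = [
    ("the", "DET"), ("book", "NOUN"), ("book", "NOUN"),
    ("book", "VERB"), ("flight", "NOUN"), ("a", "DET"),
]


def train(pairs):
    """Estimate p(tag | word) by relative frequency in the labeled data."""
    counts = defaultdict(Counter)
    for word, tag in pairs:
        counts[word][tag] += 1
    return {
        word: {tag: n / sum(tags.values()) for tag, n in tags.items()}
        for word, tags in counts.items()
    }


def predict(model, word, default="NOUN"):
    """Pick the most probable tag; fall back to a default for unseen words."""
    dist = model.get(word)
    return max(dist, key=dist.get) if dist else default


model = train(TRAINING)
print(model["book"])           # estimated distribution over tags for "book"
print(predict(model, "book"))  # most probable tag under the model
```

An unsupervised learner would see only the words, without the tags, and would have to induce the categories itself.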
Linguistic Knowledge and Machine Learning
• What part of a spell-checker needs to be hand engineered?
• What part is learned by the machine?
• What data is used in the machine learning?
• Is the learning supervised, unsupervised, or semi-supervised?
Linguistic Knowledge and Machine Learning: Some history
• 1950s-1980s: Rule-based approaches
• 1990s: Machine learning/statistical revolution
• Now:
• Ceiling effects for machine learning
• Recognition that the best solutions will combine linguistic knowledge and machine learning
• Search for the best hybridizations
Computational linguistics in everyday life
• What NLP technology do you use?
• Spell checkers
• Grammar checkers
• Menu-based phone systems (with and without ASR)
• Voice-activated cell phones
• Search engines
• In-car navigation systems
• Google Calendar
• Context-sensitive ads
Web demos
• English Resource Grammar (DELPH-IN): Broad-coverage precision grammar http://lingo.stanford.edu:8000/erg
• Grammar Matrix (UW/DELPH-IN): Multilingual grammar engineering http://www.delph-in.net/matrix/customize/matrix.cgi
• Text-to-speech system (Oddcast) http://www.oddcast.com/home/demos/tts/frameset.php?frame1=talk
• Machine translation (Babelfish) http://babelfish.altavista.com
Web demos continued
• KnowItAll (UW Turing Center): Autonomous, scalable information extraction from the web http://knowitall-1.cs.washington.edu/dbinterface/knowitall2/default.asp
• Cross-lingual image search (UW Turing Center): http://knowitall-3.cs.washington.edu/panimages/
• Jabberwacky, a chatbot: http://www.jabberwacky.com
All levels of linguistic analysis have a role to play
• Phonetics, phonology:
• Text-to-speech, speech recognition
• Morphology:
• Spell checkers, plus support for all tasks at “higher” levels
• Syntax:
• Natural language understanding (NLU), generation
• Semantics:
• NLU, generation, reasoning, inference
• Pragmatics:
• Dialogue management, anaphora resolution, menu-based systems
Computational linguistics: Common subtasks
• Language identification
• Sentence tokenization
• Word-level tokenization
• Lemmatization
• Morphological analysis
• Syntactic parsing
• Word-sense disambiguation
• Named-entity recognition
• Sentence- and word-level alignment of parallel bitexts
• Reference resolution
• Dialogue management
• Generation (strategic, tactical)
• Disambiguation
Lots of local and national employers need computational linguists!
• Microsoft
• Amazon.com
• AOL/Tegic
• Adapx
• InXight Software
• PARC
• VoiceBox
• Cataphora
• LCC
• SYSTRAN
• Nuance
http://depts.washington.edu/uwcl/twiki/bin/view.cgi/Main/JobList
There are many ways to be a computational linguist
• Most flexible option: Training in both Computer Science and Linguistics
• To prepare for UW’s Professional Master’s in Computational Linguistics
• Ling 200
• CSE 142, 143, 373
• Stat 391
• To learn more:
• Ling/CSE 472 (prereq Ling 461 or CSE 326 + Ling 200)
• http://compling.washington.edu
• Linguistics colloquia, especially MS/UW Symposium