1(17)
[Diagram. Labels shown:]
• GATE Format Handlers: HTML docs, RTF docs, XML docs, …
• ANNIE, Custom application 1, …
• POS tagger, Named entity, Coreference, Event extraction, …
• Document content, Document metadata, Document format data, Linguistic data
• File storage, Relational Database (Oracle/PostgreSQL), …
A Language Analysis Example
2(17)
Visual Resources
3(17)
Displaying Coreference Information
4(17)
Displaying Syntactic Information
5(17)
Lexicon Support – WordNet example
6(17)
Performance Evaluation
• At document level – annotation diff
• At corpus level – corpus benchmark tool – tracking system’s performance over time
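A document-level annotation diff can be illustrated with a minimal sketch that compares a human-annotated key set against a system response set and reports precision, recall and F-measure. The `Annotation` type, the `annotation_diff` function and the sample offsets are hypothetical and are not GATE's API; the sketch also assumes strict matching only (same span and same label).

```python
from typing import Dict, NamedTuple, Set


class Annotation(NamedTuple):
    """Hypothetical annotation record: character offsets plus a label."""
    start: int
    end: int
    label: str


def annotation_diff(key: Set[Annotation], response: Set[Annotation]) -> Dict[str, float]:
    """Compare system output (response) with a gold standard (key).

    Assumption: strict matching, so an annotation counts as correct only
    when start, end and label are all identical.
    """
    correct = len(key & response)
    missing = len(key - response)      # in the key but not found by the system
    spurious = len(response - key)     # produced by the system but not in the key
    precision = correct / len(response) if response else 0.0
    recall = correct / len(key) if key else 0.0
    total = precision + recall
    f_measure = 2 * precision * recall / total if total else 0.0
    return {
        "correct": correct,
        "missing": missing,
        "spurious": spurious,
        "precision": precision,
        "recall": recall,
        "f_measure": f_measure,
    }


# Illustrative data only.
key = {
    Annotation(0, 5, "Person"),
    Annotation(10, 16, "Location"),
    Annotation(20, 24, "Date"),
    Annotation(30, 38, "Organization"),
}
response = {
    Annotation(0, 5, "Person"),
    Annotation(10, 16, "Organization"),  # right span, wrong label
}
result = annotation_diff(key, response)
```

Running the same comparison over every document in a corpus, and storing the scores for each system version, is one simple way to track performance over time.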
7(17)
Regression Testing – Corpus Benchmark Tool
8(17)
Populating Ontologies with IE
9(17)
Protégé and Ontology Management
10(17)
Information Retrieval Support
Based on the Lucene IR engine
11(17)
Editing Multilingual Data
GATE Unicode Kit (GUK)
Java provides no special support for text input (this may change)
• Support for defining additional Input Methods (IMs)
• Currently 30 IMs for 17 languages
• Pluggable in other applications
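As a rough illustration of what an input method does, the sketch below turns typed Latin key sequences into Unicode characters using longest-match lookup. The `KEYMAP` table and the `apply_input_method` function are hypothetical; they are not taken from GUK and do not reflect its definition format.

```python
# Hypothetical key-sequence table (illustrative only, not a GUK definition).
KEYMAP = {
    "a": "\u0430",   # Cyrillic small a
    "b": "\u0431",   # Cyrillic small be
    "s": "\u0441",   # Cyrillic small es
    "h": "\u0445",   # Cyrillic small ha
    "sh": "\u0448",  # Cyrillic small sha (two keystrokes, one character)
}


def apply_input_method(keys: str, keymap: dict = KEYMAP) -> str:
    """Convert a string of keystrokes to text, preferring the longest match."""
    max_len = max(len(k) for k in keymap)
    out = []
    i = 0
    while i < len(keys):
        # Try the longest candidate sequence first, then shorter ones.
        for n in range(min(max_len, len(keys) - i), 0, -1):
            chunk = keys[i:i + n]
            if chunk in keymap:
                out.append(keymap[chunk])
                i += n
                break
        else:
            # No mapping starts here: pass the keystroke through unchanged.
            out.append(keys[i])
            i += 1
    return "".join(out)


typed = apply_input_method("shab")
```

Longest-match lookup is one simple design choice: it lets a multi-key sequence such as "sh" take priority over the single-key mappings for "s" and "h".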
12(17)
Processing Multilingual Data
All the visualisation and editing tools for ML LRs use enhanced Java facilities:
13(17)
Dialogue Systems
• GATE is being used in the Amities project for automating call centres
• Creation of dialogue processing server components to run in the Galaxy Communicator Software Infrastructure
• Easy adaptation of the portable IE components to work on noisy ASR output
• Robustness and speed of GATE components for real-time dialogue systems
14(17)
Semantic Indexing in the MUMIS project
• Multimedia Indexing and Searching Environment
• Composite index of a multimedia programme from multiple sources in different languages
• ASR, video processing, information extraction (Dutch, English, German), merging, user interface
• University of Twente/CTIT, University of Sheffield, University of Nijmegen, DFKI, MPI, ESTEAM AB, VDA
15(17)
The Whole Picture
[Diagram. Labels shown:]
• Sources: Formal Text (EN, DE, NL)
• Speech Signals, ASR, Transcriptions
• IE
• Merging, Final Annotations
• Ontology & Lexicon
• Annotations, Multimedia Data Base, Video & Audio Signal
• User Interface: Query, Results
16(17)
User Interface
17(17)