Ugo Scaiella, R&D Team Lead @ Smarter Engagement – Milano, 20.05.2016
Dandelion: semantic text analytics as a service
The bag-of-words paradigm
The Mona Lisa is a 16th century oil on canvas painted by Leonardo. It's held at the Louvre in Paris.

Term      Freq
the       2
mona      1
leonardo  1
century   1
oil       1
Paris     1
Lisa      1
By        1
painted   1
at        1
canvas    1
...       ...
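The term/frequency table above can be reproduced with a few lines of Python. This is a minimal sketch, assuming a simple lowercase regex tokenizer (the function name `bag_of_words` and the tokenization rule are illustrative choices, not part of the talk):

```python
import re
from collections import Counter


def bag_of_words(text: str) -> Counter:
    """Lowercase the text, split it into word tokens, and count each token."""
    # Assumption: a token is a run of letters, digits, or apostrophes.
    tokens = re.findall(r"[a-z0-9']+", text.lower())
    return Counter(tokens)


text = (
    "The Mona Lisa is a 16th century oil on canvas painted by Leonardo. "
    "It's held at the Louvre in Paris."
)
bow = bag_of_words(text)

# Print terms by descending frequency, as in the table on the slide.
for term, freq in bow.most_common():
    print(f"{term:10s} {freq}")
```

Because only counts are kept, word order is lost: "Leonardo painted the Mona Lisa" and "the Mona Lisa painted Leonardo" produce the same bag.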
Classic NLP pipeline
Segmentation → Tokenization → PoS Tagging → Chunking → Dependency Parsing

1   The       the       DT   O       3   det
2   Mona      Mona      NNP  O       3   compound
3   Lisa      Lisa      NNP  O       8   nsubj
4   is        be        VBZ  O       8   cop
5   a         a         DT   O       8   det
6   16th      16th      JJ   DATE    8   amod
7   century   century   NN   DATE    8   compound
8   oil       oil       NN   O       0   ROOT
9   on        on        IN   O       10  case
10  canvas    canvas    NN   O       8   nmod
11  painted   paint     VBN  O       10  acl
12  by        by        IN   O       13  case
13  Leonardo  Leonardo  NNP  PERSON  11  nmod
14  .         .         .    O       _   _

1   It        it        PRP  O       3   nsubjpass
2   's        be        VBZ  O       3   auxpass
3   held      hold      VBN  O       0   ROOT
...
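To make the columns of the pipeline output concrete, the sketch below parses the first sentence's rows and reads off the syntactic root and the named entities. It assumes the columns are index, word, lemma, PoS tag, NER tag, head index, and dependency relation; the `Token` type and helper names are hypothetical, and the punctuation row is left out for brevity:

```python
from typing import NamedTuple


class Token(NamedTuple):
    index: int
    word: str
    lemma: str
    pos: str
    ner: str
    head: int
    deprel: str


# Rows 1-13 of the first sentence, copied from the slide.
ROWS = """\
1 The the DT O 3 det
2 Mona Mona NNP O 3 compound
3 Lisa Lisa NNP O 8 nsubj
4 is be VBZ O 8 cop
5 a a DT O 8 det
6 16th 16th JJ DATE 8 amod
7 century century NN DATE 8 compound
8 oil oil NN O 0 ROOT
9 on on IN O 10 case
10 canvas canvas NN O 8 nmod
11 painted paint VBN O 10 acl
12 by by IN O 13 case
13 Leonardo Leonardo NNP PERSON 11 nmod
"""


def parse_rows(rows: str) -> list[Token]:
    """Turn whitespace-separated rows into Token records."""
    tokens = []
    for line in rows.strip().splitlines():
        idx, word, lemma, pos, ner, head, deprel = line.split()
        tokens.append(Token(int(idx), word, lemma, pos, ner, int(head), deprel))
    return tokens


def named_entities(tokens: list[Token]) -> list[tuple[str, str]]:
    """Merge adjacent tokens that share the same non-O NER tag."""
    spans = []  # each item: (label, text, index of last token)
    for tok in tokens:
        if tok.ner == "O":
            continue
        if spans and spans[-1][0] == tok.ner and spans[-1][2] == tok.index - 1:
            label, text, _ = spans[-1]
            spans[-1] = (label, f"{text} {tok.word}", tok.index)
        else:
            spans.append((tok.ner, tok.word, tok.index))
    return [(label, text) for label, text, _ in spans]


tokens = parse_rows(ROWS)
root = next(tok for tok in tokens if tok.head == 0)
entities = named_entities(tokens)
print(root.word, entities)
```

In these rows the tagger marks "16th century" and "Leonardo" as entities, while "Mona" and "Lisa" both carry the tag O.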
Limitations
“The book is on the table”
Training: expensive, hard
The graph of concepts
The Mona Lisa is a 16th century oil on canvas painted by Leonardo. It's held at the Louvre in Paris.
PERSON: birthDate, birthPlace, deathDate, authorOf, ...
CONCEPT: ...
WORK: ...
PLACE: coords, capitalOf, population, ...
BUILDING: coords, ...
Spots (aka mentions, surface forms):
• paris
• leonardo
• oil on canvas
• mona lisa
Concepts:
• Oil painting
• Paris (mythology)
• Mona Lisa (painting)
• Mona Lisa (movie)
• Paris (city)
• Leonardo da Vinci
• Leonardo do Nascimento
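The mapping from spots to concepts can be illustrated with a toy disambiguation step: each spot has several candidate concepts, and the candidate that is best connected to the candidates of the other spots wins. This is only a sketch under stated assumptions. The candidate lists come from the slide, but the `LINKS` set and the voting rule are hypothetical and are not claimed to be how Dandelion works:

```python
# Candidate concepts for each spot, as listed on the slide.
CANDIDATES = {
    "paris": ["Paris (city)", "Paris (mythology)"],
    "leonardo": ["Leonardo da Vinci", "Leonardo do Nascimento"],
    "oil on canvas": ["Oil painting"],
    "mona lisa": ["Mona Lisa (painting)", "Mona Lisa (movie)"],
}

# Hypothetical toy graph: undirected links between related concepts.
LINKS = {
    frozenset(pair)
    for pair in [
        ("Mona Lisa (painting)", "Leonardo da Vinci"),
        ("Mona Lisa (painting)", "Oil painting"),
        ("Mona Lisa (painting)", "Paris (city)"),
        ("Leonardo da Vinci", "Oil painting"),
    ]
}


def related(a: str, b: str) -> bool:
    """True if the toy graph links the two concepts."""
    return frozenset((a, b)) in LINKS


def disambiguate(spots: list[str]) -> dict[str, str]:
    """For each spot, pick the candidate linked to most candidates of the other spots."""
    result = {}
    for spot in spots:
        others = [c for s in spots if s != spot for c in CANDIDATES[s]]
        result[spot] = max(
            CANDIDATES[spot],
            key=lambda cand: sum(related(cand, other) for other in others),
        )
    return result


linked = disambiguate(["paris", "leonardo", "oil on canvas", "mona lisa"])
for spot, concept in linked.items():
    print(f"{spot!r} -> {concept}")
```

With this toy graph the context pulls every ambiguous spot toward the painting-related reading, which is the intuition behind using a graph of concepts instead of isolated words.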
Advantages
• Less training
• Speed
• Customization
• Robustness to syntax
• … but still (may) use classic NLP to improve results
Applications
• Entity Extraction
• Classification
• Similarity & clustering
… basically any IR task
Applications: an example
Cameron wins the Oscar
Cameron wins general elections
All nominees for the Academy Awards
See more on https://dandelion.eu
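One way to read the three example headlines is as a similarity test: word overlap pairs the two "Cameron wins" headlines, while concept overlap pairs the Oscar headline with the Academy Awards one. The sketch below compares Jaccard similarity over words and over concepts. The concept sets are hand-assigned assumptions made for illustration (including which Cameron each headline refers to), not output of any real entity extractor:

```python
def jaccard(a: set[str], b: set[str]) -> float:
    """Size of the intersection divided by size of the union."""
    return len(a & b) / len(a | b) if a | b else 0.0


def words(text: str) -> set[str]:
    """Bag-of-words view: the set of lowercase tokens."""
    return set(text.lower().split())


oscar = "Cameron wins the Oscar"
election = "Cameron wins general elections"
nominees = "All nominees for the Academy Awards"

# Hypothetical, hand-assigned concepts for each headline.
CONCEPTS = {
    oscar: {"James Cameron", "Academy Awards"},
    election: {"David Cameron", "General election"},
    nominees: {"Academy Awards"},
}

word_sim_election = jaccard(words(oscar), words(election))
word_sim_nominees = jaccard(words(oscar), words(nominees))
concept_sim_election = jaccard(CONCEPTS[oscar], CONCEPTS[election])
concept_sim_nominees = jaccard(CONCEPTS[oscar], CONCEPTS[nominees])

print(f"words:    election={word_sim_election:.2f}  nominees={word_sim_nominees:.2f}")
print(f"concepts: election={concept_sim_election:.2f}  nominees={concept_sim_nominees:.2f}")
```

Under these assumptions the word-based score ranks the election headline closest to the Oscar headline, while the concept-based score ranks the Academy Awards headline closest.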
Real World Use Cases
Use case #1: Lawful interception
Identify potential terrorism threats on social networks and message boards
Customized domain-specific taxonomy
Use case #2: Website tagging
Profile a company by looking at its website
• Entity extraction: products, locations
• People & Roles
Sales intelligence for lead generation
http://atoka.io
Use case #3: News stream monitoring
News stream of 70k articles per day
• BI vertical of semantic engine
• Entity extraction: companies, people
• Business signals extraction
Use case #4: Social media analysis
• Entity extraction, sentiment analysis
• Dashboard, tag-cloud
Use case #5: Travel recommendation
Crawl the web and understand people’s behavior
Display travel offers that match user preferences
Use case #6: E-Commerce Optimization
Collect and annotate customer reviews from e-commerce websites
Dashboard for product ratings analysis