Named Entity Recognition in Tweets: TwitterNLP
Ludymila Lobo
Resources
Reading material
Ritter, Alan; Clark, Sam; Mausam; Etzioni, Oren. Named Entity Recognition in Tweets. Retrieved from the Association for Computational Linguistics website: https://aclweb.org/anthology/D/D11/D11-1141.pdf
http://www.academia.edu/1128304/Shallow_parsing_as_part-of-speech_tagging
Twitter NLP Tool
https://github.com/aritter/twitter_nlp
Application with Twitter NLP
statuscalendar.com
Collecting Tweets
https://dev.twitter.com
http://www.webdevdoor.com/jquery/twitter-feed-authentication-search/
https://github.com/abraham/twitteroauth
http://sourceforge.net/projects/xampp/
Why Twitter?
Large volume of data (even more than the Library of Congress in Washington, D.C., with its 151 million items)*
Real-time information, sometimes more up to date than articles
*Hachman (2011)
http://pt.wikipedia.org/wiki/Library_of_Congress
Challenges
Noisy and informal nature
Wide diversity of entity types (companies, products, bands, teams, movies, etc.), most of which are relatively infrequent, so a sample of tweets contains only a few examples of each
Lack of context
http://twitter.com
Tool
• https://github.com/aritter/twitter_nlp
• Unzip the file, then in a Linux terminal type:
– sh build.sh
How it works
POS (Part of Speech) -> NLP, clustering
Chunking (shallow parsing)
@paulwalk  o
It         b-np
's         b-vp
the        b-np
view       i-np
from       b-pp
where      b-advp
I          b-np
'm         b-vp
living     i-vp
for        b-pp
two        b-np
weeks      i-np
best    ADJ ADV NP V
better  ADJ ADV V DET
close   ADV ADJ V N
cut     V N VN VD
even    ADV DET ADJ V
grant   NP N V
hit     V VD VN N DET
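The chunk tags in the example above follow a begin/inside/outside scheme: `b-` opens a chunk, `i-` continues it, and `o` marks a token outside any chunk. A minimal sketch of how such tags can be grouped back into phrases is shown here. The function name `group_chunks` is hypothetical and is not part of the twitter_nlp tool; the input is the tagged sentence from the slide.

```python
def group_chunks(tagged):
    """Group (token, tag) pairs carrying b-/i-/o chunk tags into (type, phrase) pairs."""
    chunks = []
    current_type, current_tokens = None, []
    for token, tag in tagged:
        tag = tag.lower()
        # An i- tag of the same type continues the chunk that is currently open.
        if tag.startswith("i-") and current_type == tag[2:]:
            current_tokens.append(token)
            continue
        # Any other tag closes the open chunk, if there is one.
        if current_tokens:
            chunks.append((current_type, " ".join(current_tokens)))
            current_type, current_tokens = None, []
        # A b- tag (or a stray i- tag) opens a new chunk; "o" opens nothing.
        if tag.startswith("b-") or tag.startswith("i-"):
            current_type, current_tokens = tag[2:], [token]
    if current_tokens:
        chunks.append((current_type, " ".join(current_tokens)))
    return chunks


# The tagged sentence from the slide.
example = [
    ("@paulwalk", "o"), ("It", "b-np"), ("'s", "b-vp"), ("the", "b-np"),
    ("view", "i-np"), ("from", "b-pp"), ("where", "b-advp"), ("I", "b-np"),
    ("'m", "b-vp"), ("living", "i-vp"), ("for", "b-pp"), ("two", "b-np"),
    ("weeks", "i-np"),
]

for chunk_type, phrase in group_chunks(example):
    print(chunk_type, "->", phrase)
```

On this input the noun phrases come out as "It", "the view", "I" and "two weeks", and "@paulwalk" is left out because it is tagged `o`.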
How it works
Capitalization classifier: predicts whether or not a tweet is informatively capitalized (using SVM learning)
NER (Named Entity Recognition)
POS (Part of Speech) -> NLP, clustering
Chunking (shallow parsing)
Tom Hanks [actor] was awesome in Forrest Gump [movie]
Tool
@cityofcalgary: Free swimming and golf tomorrow for @cbc Sports Day in Canada #yyc #sportsday http://ow.ly/2G4sf
@cityofcalgary/O :/O Free/O swimming/O and/O golf/O tomorrow/O for/O @cbc/O Sports/B-other Day/I-other in/O Canada/B-geo-loc #yyc/O #sportsday/O http://ow.ly/2G4sf/O
Adam Beyer: Swedish Techno Pioneer: When it comes to his own DJing and sound, he's slightly more diverse and likes...
Adam/B-person Beyer/I-person :/O Swedish/O Techno/O Pioneer/O :/O When/O it/O comes/O to/O his/O own/O DJing/O and/O sound/O ,/O he/O 's/O slightly/O more/O diverse/O and/O likes/O
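The tagged output above attaches a label to each token after a slash: `O` for non-entities, `B-type` for the first token of an entity, and `I-type` for its continuation. As a rough sketch of how that output could be post-processed, the snippet below collects the entities and their types. The function name `extract_entities` is an assumption for illustration and does not come from the tool. Splitting on the last slash matters because tokens such as URLs contain slashes themselves.

```python
def extract_entities(tagged_line):
    """Collect (entity text, entity type) pairs from a line of token/TAG items."""
    entities = []
    current_type, current_tokens = None, []
    for item in tagged_line.split():
        # Split on the LAST slash: "http://ow.ly/2G4sf/O" -> token, "O".
        token, _, tag = item.rpartition("/")
        if tag.startswith("I-") and current_type == tag[2:]:
            current_tokens.append(token)
            continue
        if current_tokens:
            entities.append((" ".join(current_tokens), current_type))
            current_type, current_tokens = None, []
        if tag.startswith("B-"):
            current_type, current_tokens = tag[2:], [token]
    if current_tokens:
        entities.append((" ".join(current_tokens), current_type))
    return entities


# Tagged output copied from the first example on the slide.
line = ("@cityofcalgary/O :/O Free/O swimming/O and/O golf/O tomorrow/O for/O "
        "@cbc/O Sports/B-other Day/I-other in/O Canada/B-geo-loc #yyc/O "
        "#sportsday/O http://ow.ly/2G4sf/O")

print(extract_entities(line))
```

For this line the sketch yields "Sports Day" with type `other` and "Canada" with type `geo-loc`.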
How to retrieve data from Twitter?
https://dev.twitter.com
<?php
session_start();
require_once("twitteroauth/twitteroauth/twitteroauth.php"); // Path to twitteroauth library

$search = "wpi OR #WPI"; // Search query
$notweets = 50;          // Number of tweets to request

// Placeholder credentials: replace with the keys of your own Twitter application
$consumerkey = "123456";
$consumersecret = "123456";
$accesstoken = "123456";
$accesstokensecret = "123456";

// Create an authenticated connection to the Twitter API
function getConnectionWithAccessToken($cons_key, $cons_secret, $oauth_token, $oauth_token_secret) {
    $connection = new TwitterOAuth($cons_key, $cons_secret, $oauth_token, $oauth_token_secret);
    return $connection;
}

$connection = getConnectionWithAccessToken($consumerkey, $consumersecret, $accesstoken, $accesstokensecret);

// Encode "#" so the hashtag is sent as part of the query string
$search = str_replace("#", "%23", $search);
$tweets = $connection->get("https://api.twitter.com/1.1/search/tweets.json?q=" . $search . "&count=" . $notweets);

// Print the search results as JSON
echo json_encode($tweets);
?>
http://www.webdevdoor.com/jquery/twitter-feed-authentication-search/
• Authentication library: https://github.com/abraham/twitteroauth
• Download it and place it in the same folder as the code
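The PHP code replaces only `#` with `%23` before building the search URL, so other characters in the query, such as spaces, are passed through unencoded. The following sketch shows, in Python, how the same query string could be fully percent-encoded with the standard library. The helper name `build_search_url` is hypothetical; the endpoint and the `q` and `count` parameters are taken from the PHP code, and the sketch only builds the URL (the request would still need OAuth signing, which the twitteroauth library handles in the PHP version).

```python
from urllib.parse import quote, urlencode

# Endpoint taken from the PHP example on the slide.
SEARCH_ENDPOINT = "https://api.twitter.com/1.1/search/tweets.json"


def build_search_url(query, count):
    """Return the search URL with every reserved character percent-encoded."""
    # quote_via=quote encodes spaces as %20 and "#" as %23.
    params = urlencode({"q": query, "count": count}, quote_via=quote)
    return SEARCH_ENDPOINT + "?" + params


print(build_search_url("wpi OR #WPI", 50))
```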
• Install XAMPP: http://sourceforge.net/projects/xampp/
• Copy the project folder to C:\xampp\htdocs
• Open http://localhost/TwitterStreams/tweet.php in a browser
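The page prints the search results as JSON. A possible next step, sketched below in Python, is to pull out just the tweet texts, one per line, so they can be passed to a tagger. This assumes the output has the shape of a Twitter API v1.1 search response, with a `statuses` list whose items carry a `text` field; the function name `tweet_texts` and the sample data are made up for illustration.

```python
import json


def tweet_texts(response_json):
    """Return the text of each tweet in a search response, one line per tweet."""
    data = json.loads(response_json)
    # Collapse internal whitespace so every tweet fits on a single line.
    return [" ".join(status["text"].split()) for status in data.get("statuses", [])]


# Made-up sample in the assumed response shape.
sample = json.dumps({"statuses": [{"text": "Go\nWPI!"}, {"text": "Studying at  #WPI"}]})

for text in tweet_texts(sample):
    print(text)
```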