Named Entity Recognition in Tweets: TwitterNLP

Page 1: Named Entity Recognition in Tweets: TwitterNLP

Named Entity Recognition in Tweets: TwitterNLP

Ludymila Lobo

Page 2: Named Entity Recognition in Tweets: TwitterNLP

Resources

Reading material

Ritter, Alan; Clark, Sam; Mausam; Etzioni, Oren. Named Entity Recognition in Tweets. Available from the Association for Computational Linguistics website: https://aclweb.org/anthology/D/D11/D11-1141.pdf

Shallow parsing as part-of-speech tagging: http://www.academia.edu/1128304/Shallow_parsing_as_part-of-speech_tagging

Twitter NLP tool

https://github.com/aritter/twitter_nlp

Application built with Twitter NLP

statuscalendar.com

Collecting tweets

https://dev.twitter.com

http://www.webdevdoor.com/jquery/twitter-feed-authentication-search/

https://github.com/abraham/twitteroauth

http://sourceforge.net/projects/xampp/

Page 3: Named Entity Recognition in Tweets: TwitterNLP

Why Twitter?

Large volume of data, even more than the Library of Congress (Washington, D.C.), which holds 151 million items*

Real-time information, sometimes more up to date than articles.

http://pt.wikipedia.org/wiki/Library_of_Congress

*Hachman (2011)

Page 4: Named Entity Recognition in Tweets: TwitterNLP

Challenges

Noisy and informal nature

Wide diversity of entity types (companies, products, bands, teams, movies, etc.), most of which are relatively infrequent, so a sample of tweets contains only a few examples of each

Lack of context

http://twitter.com

Page 5: Named Entity Recognition in Tweets: TwitterNLP

Tool

• https://github.com/aritter/twitter_nlp
• Unzip the file and, in a Linux terminal, type:
    – sh build.sh

Page 6: Named Entity Recognition in Tweets: TwitterNLP

Tool

• statuscalendar.com

Page 7: Named Entity Recognition in Tweets: TwitterNLP

How it works

POS (part-of-speech) tagging -> NLP, clustering

Chunking (shallow parsing)

Chunking example (token, chunk tag):

    @paulwalk   o
    It          b-np
    's          b-vp
    the         b-np
    view        i-np
    from        b-pp
    where       b-advp
    I           b-np
    'm          b-vp
    living      i-vp
    for         b-pp
    two         b-np
    weeks       i-np

Possible POS tags per word:

    best     ADJ ADV NP V
    better   ADJ ADV V DET
    close    ADV ADJ V N
    cut      V N VN VD
    even     ADV DET ADJ V
    grant    NP N V
    hit      V VD VN N DET

Page 8: Named Entity Recognition in Tweets: TwitterNLP

How it works

Capitalization classifier: predicts whether or not a tweet is informatively capitalized (using SVM learning)

NER (Named Entity Recognition)

POS (part-of-speech) tagging -> NLP, clustering

Chunking (shallow parsing)

Example: Tom Hanks [actor] was awesome in Forrest Gump [movie]

Page 9: Named Entity Recognition in Tweets: TwitterNLP

Tool

@cityofcalgary: Free swimming and golf tomorrow for @cbc Sports Day in Canada #yyc #sportsday http://ow.ly/2G4sf

@cityofcalgary/O :/O Free/O swimming/O and/O golf/O tomorrow/O for/O @cbc/O Sports/B-other Day/I-other in/O Canada/B-geo-loc #yyc/O #sportsday/O http://ow.ly/2G4sf/O

Adam Beyer: Swedish Techno Pioneer: When it comes to his own DJing and sound, he's slightly more diverse and likes...

Adam/B-person Beyer/I-person :/O Swedish/O Techno/O Pioneer/O :/O When/O it/O comes/O to/O his/O own/O DJing/O and/O sound/O ,/O he/O 's/O slightly/O more/O diverse/O and/O likes/O
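The tagged output above attaches a label to each token after a final slash: B- opens an entity, I- continues it, and O marks a token outside any entity. As a minimal sketch of how such a line could be turned into a list of entities, the Python function below groups consecutive B-/I- tokens. The function name extract_entities is hypothetical and is not part of twitter_nlp; the sketch assumes tokens are separated by whitespace, as in the examples on this slide.

```python
from typing import List, Tuple


def extract_entities(tagged_line: str) -> List[Tuple[str, str]]:
    """Group B-/I- tagged tokens into (entity text, entity type) pairs.

    Hypothetical helper for illustration; not part of twitter_nlp.
    """
    entities: List[Tuple[str, str]] = []
    ent_type = None
    words: List[str] = []

    for item in tagged_line.split():
        # Split on the LAST slash, because tokens such as URLs contain slashes.
        token, _, tag = item.rpartition("/")
        if tag.startswith("B-"):
            # A new entity starts; close the one that was open, if any.
            if words:
                entities.append((" ".join(words), ent_type))
            ent_type, words = tag[2:], [token]
        elif tag.startswith("I-") and words and tag[2:] == ent_type:
            # Continuation of the entity that is currently open.
            words.append(token)
        else:
            # "O" (or an inconsistent tag) closes any open entity.
            if words:
                entities.append((" ".join(words), ent_type))
            ent_type, words = None, []

    if words:
        entities.append((" ".join(words), ent_type))
    return entities


if __name__ == "__main__":
    line = "Adam/B-person Beyer/I-person :/O Swedish/O Techno/O Pioneer/O"
    print(extract_entities(line))
```

Splitting on the last slash matters for the first example, where the token http://ow.ly/2G4sf itself contains slashes before its /O label.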

Page 10: Named Entity Recognition in Tweets: TwitterNLP

How to retrieve data from Twitter?

https://dev.twitter.com

Page 11: Named Entity Recognition in Tweets: TwitterNLP

<?php
session_start();
require_once("twitteroauth/twitteroauth/twitteroauth.php"); // Path to the twitteroauth library

$search = "wpi OR #WPI";        // Search query
$notweets = 50;                 // Number of tweets to request
$consumerkey = "123456";        // Placeholder credentials: replace with your own
$consumersecret = "123456";
$accesstoken = "123456";
$accesstokensecret = "123456";

// Create an authenticated connection from the four OAuth credentials.
function getConnectionWithAccessToken($cons_key, $cons_secret, $oauth_token, $oauth_token_secret) {
    $connection = new TwitterOAuth($cons_key, $cons_secret, $oauth_token, $oauth_token_secret);
    return $connection;
}

$connection = getConnectionWithAccessToken($consumerkey, $consumersecret, $accesstoken, $accesstokensecret);

// "#" must be percent-encoded before it is placed in the query string.
$search = str_replace("#", "%23", $search);
$tweets = $connection->get("https://api.twitter.com/1.1/search/tweets.json?q=".$search."&count=".$notweets);

// Print the search results as JSON.
echo json_encode($tweets);
?>

http://www.webdevdoor.com/jquery/twitter-feed-authentication-search/
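The script above builds the search URL by hand and escapes only the "#" character. As a hedged illustration of the same step, the Python sketch below builds the query string with the standard library so that spaces and other reserved characters are escaped as well. The function name build_search_url is hypothetical; the endpoint is the one used in the script, and the request would still need OAuth signing, which is not shown here.

```python
from urllib.parse import urlencode

# Endpoint taken from the PHP script on this slide.
SEARCH_ENDPOINT = "https://api.twitter.com/1.1/search/tweets.json"


def build_search_url(query: str, count: int) -> str:
    """Return the search URL with a properly encoded query string.

    Hypothetical helper for illustration; it only builds the URL and
    does not authenticate or send the request.
    """
    # urlencode escapes "#" as %23 and spaces as "+".
    return SEARCH_ENDPOINT + "?" + urlencode({"q": query, "count": count})


if __name__ == "__main__":
    print(build_search_url("wpi OR #WPI", 50))
```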

Page 12: Named Entity Recognition in Tweets: TwitterNLP

How to retrieve data from Twitter?

• Authentication library: https://github.com/abraham/twitteroauth

Download it and place it in the same folder as the code

Page 13: Named Entity Recognition in Tweets: TwitterNLP

How to retrieve data from Twitter?

Install XAMPP to run the PHP script on a local web server: http://sourceforge.net/projects/xampp/

Page 14: Named Entity Recognition in Tweets: TwitterNLP

How to retrieve data from Twitter?

Copy the project folder to C:\xampp\htdocs

Page 15: Named Entity Recognition in Tweets: TwitterNLP

How to retrieve data from Twitter?

Open http://localhost/TwitterStreams/tweet.php in a browser
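The page served at that address prints the search results as JSON. As a rough sketch of the step between collecting tweets and tagging them, the Python function below pulls the tweet texts out of that JSON and puts each on a single line. It assumes the response follows the search API layout, with a "statuses" list whose items carry a "text" field, and that the tagger is fed one tweet per line. The function name tweets_to_lines is hypothetical.

```python
import json
from typing import List


def tweets_to_lines(search_json: str) -> List[str]:
    """Extract tweet texts from a search response, one tweet per line.

    Hypothetical helper; assumes a top-level "statuses" list whose
    items have a "text" field.
    """
    data = json.loads(search_json)
    lines: List[str] = []
    for status in data.get("statuses", []):
        text = status.get("text", "")
        # Collapse newlines and repeated spaces so each tweet stays on one line.
        lines.append(" ".join(text.split()))
    return lines


if __name__ == "__main__":
    sample = '{"statuses": [{"text": "Free swimming\\nand golf tomorrow"}]}'
    for line in tweets_to_lines(sample):
        print(line)
```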