AIDR Tutorial. Muhammad Imran, Research Scientist, Qatar Computing Research Institute, HBKU, Doha, Qatar. http://aidr.qcri.org/

AIDR Tutorial (Artificial Intelligence for Disaster Response)



Outline

•  Data collection in AIDR
•  Data classification in AIDR
•  Data view/download in AIDR

Data Collection in AIDR

•  Twitter data collection strategies that AIDR supports
  –  By keywords
  –  By geographical regions
    •  Strict: coordinates strictly inside the geo-boundaries
    •  Approximate: tweets from a place that overlaps with the geo-boundaries
  –  By following Twitter users
  –  By keywords + regions
    •  Tweets that match any of the keywords and fall within the geo-boundaries
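The difference between the strict and approximate region modes can be sketched as two geometric tests. This is only an illustration under stated assumptions, not AIDR's implementation: the `Box` type and both function names are hypothetical, boxes are assumed to be given as west, south, east, north in degrees, and boxes crossing the antimeridian are not handled.

```python
from typing import NamedTuple


class Box(NamedTuple):
    """Hypothetical bounding box in degrees (not an AIDR type)."""
    west: float
    south: float
    east: float
    north: float


def strict_match(lon: float, lat: float, box: Box) -> bool:
    """Strict mode: the tweet's own coordinates lie strictly inside the box."""
    return box.west < lon < box.east and box.south < lat < box.north


def approximate_match(place: Box, box: Box) -> bool:
    """Approximate mode: the tweet's place box overlaps the collection box."""
    return (place.west <= box.east and place.east >= box.west
            and place.south <= box.north and place.north >= box.south)


# Example collection box: the San Francisco coordinates used later in the deck.
san_francisco = Box(-122.75, 36.8, -121.75, 37.8)
print(strict_match(-122.42, 37.77, san_francisco))
print(approximate_match(Box(-123.0, 37.5, -122.5, 38.5), san_francisco))
```

A geotagged tweet carries exact coordinates, so it can be tested in strict mode; a tweet tagged only with a place can only be tested against that place's extent, which is why the approximate mode is the looser of the two.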

Data Collection Using Keywords

•  Keywords limit = 400
•  One keyword can be a single word like “Suffolk” or a phrase like “Suffolk accident”
•  One keyword/phrase cannot be more than 60 bytes (1 char = 1 byte)
•  Generic keywords collect irrelevant tweets
•  Specific keywords are more likely to collect relevant tweets
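The two limits above can be checked before a collection is started. The following is a minimal sketch, not part of AIDR: `validate_keywords` is a hypothetical helper, and counting bytes in UTF-8 is an assumption (it agrees with the slide's "1 char = 1 byte" for plain ASCII keywords).

```python
from typing import List

MAX_KEYWORDS = 400       # limit on the number of keywords, from the slide
MAX_KEYWORD_BYTES = 60   # limit per keyword or phrase, from the slide


def validate_keywords(keywords: List[str]) -> List[str]:
    """Return the cleaned keywords, or raise ValueError if a limit is exceeded."""
    cleaned = [k.strip() for k in keywords if k.strip()]
    if len(cleaned) > MAX_KEYWORDS:
        raise ValueError(f"{len(cleaned)} keywords given, limit is {MAX_KEYWORDS}")
    for kw in cleaned:
        size = len(kw.encode("utf-8"))  # assumption: bytes are counted in UTF-8
        if size > MAX_KEYWORD_BYTES:
            raise ValueError(f"{kw!r} is {size} bytes, limit is {MAX_KEYWORD_BYTES}")
    return cleaned


print(validate_keywords(["Suffolk", "Suffolk accident"]))
```

Non-ASCII keywords take more than one byte per character in UTF-8, so under this assumption they reach the 60-byte limit sooner than their character count suggests.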

Keywords Examples

Location-based Collection

•  Bounding boxes do not act as filters for other filter parameters. For example, keyword=twitter&locations=-122.75,36.8,-121.75,37.8 would match any tweets containing the term Twitter (even non-geo tweets) OR coming from the San Francisco area.
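The OR behaviour described above can be illustrated with a small predicate. This is a sketch under assumptions, not the actual matching code: `matches_filter` is a hypothetical name, keyword matching is simplified to a case-insensitive substring test, and the box is taken as west, south, east, north in the same order as the example.

```python
from typing import Optional, Sequence, Tuple

# Bounding box from the example above, read as (west, south, east, north).
SF_BOX = (-122.75, 36.8, -121.75, 37.8)


def matches_filter(text: str,
                   coords: Optional[Tuple[float, float]],
                   keywords: Sequence[str],
                   box: Tuple[float, float, float, float]) -> bool:
    """True if the tweet matches a keyword OR lies inside the box."""
    keyword_hit = any(k.lower() in text.lower() for k in keywords)
    location_hit = False
    if coords is not None:  # non-geo tweets have no coordinates
        lon, lat = coords
        west, south, east, north = box
        location_hit = west <= lon <= east and south <= lat <= north
    return keyword_hit or location_hit


# A non-geo tweet that mentions the term still matches.
print(matches_filter("I love Twitter", None, ["twitter"], SF_BOX))
# A tweet from the San Francisco area matches without the term.
print(matches_filter("Foggy morning", (-122.42, 37.77), ["twitter"], SF_BOX))
```

Replacing `or` with `and` in the return statement would give the narrower behaviour in which the bounding box filters the keyword matches.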

Following Twitter Users

For each user specified, the tool will collect:
•  Tweets created by the user.
•  Tweets which are retweeted by the user.
•  Replies to any Tweet created by the user.
•  Retweets of any Tweet created by the user.
•  Manual replies, created without pressing a reply button (e.g. “@twitterapi I agree”).

The tool will not collect:
•  Tweets mentioning the user (e.g. “Hello @twitterapi!”).
•  Manual Retweets created without pressing a Retweet button (e.g. “RT @twitterapi The API is great”).
•  Tweets by protected users.

Use a comma-separated list of Twitter user IDs (http://gettwitterid.com/)
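A comma-separated ID list of this kind is easy to check before submitting it. The helper below is a hypothetical sketch, not an AIDR function; it assumes user IDs are plain non-negative integers, and the IDs in the example are made up.

```python
from typing import List


def parse_user_ids(raw: str) -> List[int]:
    """Parse a comma-separated string of numeric user IDs."""
    ids = []
    for token in raw.split(","):
        token = token.strip()
        if not token:
            continue  # tolerate trailing commas and blank entries
        if not (token.isascii() and token.isdigit()):
            raise ValueError(f"not a numeric user ID: {token!r}")
        ids.append(int(token))
    return ids


# Made-up IDs, for illustration only.
print(parse_user_ids("12345, 67890,"))
```

Screen names such as @twitterapi are rejected by this check, which matches the instruction to supply user IDs rather than handles.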

Classifier UI

Detailed Information of Classifiers

Data Classification in AIDR

•  Define classifiers (name, description)
  –  Define labels (name, description)
  –  Having a “miscellaneous” category will be helpful
•  Wait around 15-20 minutes (for fast collections) or 30-40 minutes (for slow collections)
•  Start tagging

Classifier Generation

•  Check the classifier status (UI)
  –  The first classifier/model will be up after 50 labeled tweets, ideally equally distributed among labels
  –  If no model appears after 50 tags, keep tagging
•  Human-tagged items (the more the better)
•  40 more needed to re-train (next classifier target)
•  Machine-tagged items (keep an eye on misclassifications)
•  Quality (ideally AUC should be above 90 but not exactly 100)
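For readers unfamiliar with the quality figure, AUC can be read as the probability that a randomly chosen relevant item receives a higher classifier score than a randomly chosen irrelevant one. The sketch below computes it for a binary label and expresses it as a percentage, as in the UI. It is an illustration only: the function names are hypothetical, it is not AIDR's evaluation code, and the range check simply encodes the rule of thumb above.

```python
from typing import Sequence


def auc_percent(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as a percentage, by comparing every positive/negative score pair."""
    positives = [s for l, s in zip(labels, scores) if l == 1]
    negatives = [s for l, s in zip(labels, scores) if l == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative item")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0   # positive ranked above negative
            elif p == n:
                wins += 0.5   # ties count as half
    return 100.0 * wins / (len(positives) * len(negatives))


def quality_ok(auc: float) -> bool:
    """Rule of thumb from the slide: above 90 but not exactly 100."""
    return 90.0 < auc < 100.0


labels = [1, 1, 0, 0]
scores = [0.9, 0.4, 0.5, 0.1]
print(auc_percent(labels, scores), quality_ok(auc_percent(labels, scores)))
```

The pairwise loop is quadratic in the number of items, which is fine for a handful of tagged tweets; a rank-based formulation would be preferable for large evaluation sets.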