Five Steps to Search and Store Tweets by Keywords

• Created by The Curiosity Bits Blog (curiositybits.com)

• With the support from Dr. Gregory D. Saxton (http://social-metrics.org/)




The output you will get…

Let’s say I want to study Twitter discussions of the missing Malaysian airliner MH370. I plan to gather all tweets that include the keywords MH370 or Malaysian.

You will get an ample amount of metadata for each tweet. Here is a breakdown of each metadata type:

Name                       Definition
tweet_id                   The unique identifier for a tweet
inserted_date              When the tweet was downloaded into your database
language                   The language of the tweet
retweeted_status           Whether the tweet is a retweet
content                    The content of the tweet
from_user_screen_name      The screen name of the tweet sender
from_user_followers_count  The number of followers the sender has
from_user_friends_count    The number of users the sender is following
from_user_listed_count     How many times the sender is listed
from_user_statuses_count   The number of tweets sent by the sender
from_user_description      The profile bio of the sender
from_user_location         The location of the sender
from_user_created_at       When the Twitter account was created
retweet_count              How many times the tweet has been retweeted
entities_urls              The URLs included in the tweet
entities_urls_count        The number of URLs included in the tweet
entities_hashtags          The hashtags included in the tweet
entities_hashtags_count    The number of hashtags in the tweet
entities_mentions          The screen names mentioned in the tweet
in_reply_to_screen_name    The screen name of the user the sender is replying to
in_reply_to_status_id      The unique identifier of the tweet being replied to
entities_expanded_urls     Complete URLs extracted from short URLs
json_output                The ENTIRE metadata in JSON format, including metadata not parsed into columns
entities_media_count       NA
media_expanded_url         NA
media_url                  NA
media_type                 NA
video_link                 NA
photo_link                 NA
twitpic                    NA
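
To make the column definitions concrete, here is a minimal sketch of how one tweet, as returned in JSON by the Twitter API v1.1, could be flattened into some of these columns. The function name flatten_tweet and the sample tweet are hypothetical, and storing retweeted_status as True/False and joining lists with commas are assumptions; the downloadable script may do this differently.

```python
import json


def flatten_tweet(status):
    """Map one tweet (a dict parsed from API v1.1 JSON) to a subset of the columns."""
    user = status.get("user", {})
    entities = status.get("entities", {})
    urls = entities.get("urls", [])
    hashtags = entities.get("hashtags", [])
    mentions = entities.get("user_mentions", [])
    return {
        "tweet_id": status.get("id_str"),
        "language": status.get("lang"),
        # The API nests the original tweet under "retweeted_status" only for retweets.
        "retweeted_status": "retweeted_status" in status,
        "content": status.get("text"),
        "from_user_screen_name": user.get("screen_name"),
        "from_user_followers_count": user.get("followers_count"),
        "from_user_friends_count": user.get("friends_count"),
        "retweet_count": status.get("retweet_count"),
        "entities_urls": ", ".join(u["url"] for u in urls),
        "entities_urls_count": len(urls),
        "entities_expanded_urls": ", ".join(u["expanded_url"] for u in urls),
        "entities_hashtags": ", ".join(h["text"] for h in hashtags),
        "entities_hashtags_count": len(hashtags),
        "entities_mentions": ", ".join(m["screen_name"] for m in mentions),
        "in_reply_to_screen_name": status.get("in_reply_to_screen_name"),
        "in_reply_to_status_id": status.get("in_reply_to_status_id"),
        # Keep the full record so fields not parsed into columns are not lost.
        "json_output": json.dumps(status),
    }


# A made-up tweet, for illustration only.
sample = {
    "id_str": "1",
    "lang": "en",
    "text": "Any news on #MH370? http://t.co/x",
    "retweet_count": 2,
    "user": {"screen_name": "example_user", "followers_count": 10, "friends_count": 5},
    "entities": {
        "urls": [{"url": "http://t.co/x", "expanded_url": "http://example.com/news"}],
        "hashtags": [{"text": "MH370"}],
        "user_mentions": [],
    },
}

row = flatten_tweet(sample)
print(row["entities_hashtags"], row["entities_urls_count"])
```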


Step 1: Checklist

• Do you know how to install the necessary Python libraries? If not, please review pg. 8 in http://curiositybits.com/python-for-mining-the-social-web/python-tutorial-mining-twitter-user-profile/

• Do you know how to browse and edit a SQLite database with SQLite Database Browser? If not, please review pg. 10-14 in http://curiositybits.com/python-for-mining-the-social-web/python-tutorial-mining-twitter-user-profile/

Download the code: https://drive.google.com/file/d/0Bwwg6GLCW_IPdm1mcHNXeU85Nkk/edit?usp=sharing

Have you installed these necessary Python libraries?

Most importantly, we need to install a Twitter mining library called Twython (https://twython.readthedocs.org/en/latest/index.html).


Step 2: enter the search terms

You can enter multiple search terms, separated by commas. Note that the last search term is also followed by a comma.

You can enter non-English search terms, but make sure the Python script starts with the following block of code:
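
A minimal sketch of what such a script header and keyword list might look like, assuming the required block is a PEP 263 encoding declaration (the usual way to allow non-ASCII string literals in a Python 2 script). The names search_terms and build_query are hypothetical, and joining the terms with the search operator OR is only one way to match tweets containing any of the keywords; the downloadable script may handle the terms differently.

```python
# -*- coding: utf-8 -*-
# The line above is a PEP 263 encoding declaration. It must be the first or
# second line of the file so that non-English keywords are read correctly.

# Hypothetical keyword list: one term per line, each followed by a comma,
# including the last one.
search_terms = [
    "MH370",
    "Malaysian",
    "马航",
]


def build_query(terms):
    """Join keywords with OR so that a tweet matching any one of them is returned."""
    return " OR ".join(terms)


query = build_query(search_terms)
print(query)
```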


Step 3: enter your API keys

Enter each of the following keys inside the quotation marks:

• API key

• API secret

• Access token

• Access token secret
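
The four credentials above could be laid out as in the sketch below. The variable names and the helper missing_keys are hypothetical and may not match the downloadable script; the Twython constructor is called with the four values in the order application key, application secret, access token, access token secret.

```python
# Paste your own keys between the quotation marks.
APP_KEY = ""
APP_SECRET = ""
OAUTH_TOKEN = ""
OAUTH_TOKEN_SECRET = ""


def missing_keys(**keys):
    """Return the names of any credentials that are still empty."""
    return sorted(name for name, value in keys.items() if not value.strip())


def make_client():
    """Build a Twython client (imported here so the check below runs without Twython)."""
    from twython import Twython
    return Twython(APP_KEY, APP_SECRET, OAUTH_TOKEN, OAUTH_TOKEN_SECRET)


unset = missing_keys(
    APP_KEY=APP_KEY,
    APP_SECRET=APP_SECRET,
    OAUTH_TOKEN=OAUTH_TOKEN,
    OAUTH_TOKEN_SECRET=OAUTH_TOKEN_SECRET,
)
if unset:
    print("Still empty:", ", ".join(unset))
```

Checking for empty values before connecting gives a clearer message than the authentication error the API would otherwise return.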


• Set up your API keys - 1

First, go to https://dev.twitter.com/ and sign in with your Twitter account. Then go to the My applications page to create an application.


• Set up your API keys - 2

Enter any name that makes sense to you.

Enter any text that makes sense to you.

You can enter any legitimate URL. Here, I put in the URL of my institution.

Same as above: you can enter any legitimate URL. Here, I put in the URL of my institution.


Step 4: change the parameter

The result_type parameter is defined in the Twitter API documentation. Here it is set to recent; it can also be set to mixed or popular.


Here is a list of parameters you can tweak or add: https://dev.twitter.com/docs/api/1.1/get/search/tweets

For example, if you want to limit the search to tweets in Chinese, you can add lang = 'zh'.


As another example, if you want to limit the search to tweets sent before April 1, 2014, you can add until = '2014-04-01'.
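
The parameters discussed in this step can be collected in one place, as in the sketch below. The helper build_search_params is hypothetical and not part of Twython or of the downloadable script; it only assembles and checks the keyword arguments (q, result_type, lang, until, count) that the search endpoint accepts.

```python
from datetime import datetime

ALLOWED_RESULT_TYPES = ("recent", "mixed", "popular")


def build_search_params(query, result_type="recent", lang=None, until=None, count=100):
    """Assemble search parameters, rejecting values the API would not accept."""
    if result_type not in ALLOWED_RESULT_TYPES:
        raise ValueError("result_type must be one of %s" % (ALLOWED_RESULT_TYPES,))
    params = {"q": query, "result_type": result_type, "count": count}
    if lang:
        params["lang"] = lang
    if until:
        # The API expects YYYY-MM-DD; strptime raises ValueError otherwise.
        datetime.strptime(until, "%Y-%m-%d")
        params["until"] = until
    return params


params = build_search_params("MH370 OR Malaysian", lang="zh", until="2014-04-01")
print(params)

# With a Twython client named `twitter` (an assumed name), the search call
# would then be:
#     results = twitter.search(**params)
```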


Step 5: set up SQLite database

• When you type in just a file name, the database will be saved in the same folder as the Python script. You can also use a full file path such as sqlite:///C:/xxxx/xxx/MH370.sqlite.
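
As a rough illustration of the storage step, the sketch below uses Python's standard-library sqlite3 module rather than the sqlite:/// URL style shown above, so it takes a plain file name or path. The table layout (only three of the columns) and the function names are assumptions made for the example. Using tweet_id as the primary key means that a tweet returned by two separate runs is stored only once.

```python
import sqlite3


def open_database(path):
    """Open (or create) the database file and make sure the tweets table exists."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS tweets ("
        "tweet_id TEXT PRIMARY KEY, content TEXT, from_user_screen_name TEXT)"
    )
    return conn


def store_tweet(conn, tweet_id, content, screen_name):
    """Insert one tweet; return True if it was new, False if it was already stored."""
    cur = conn.execute(
        "INSERT OR IGNORE INTO tweets VALUES (?, ?, ?)",
        (tweet_id, content, screen_name),
    )
    conn.commit()
    return cur.rowcount == 1


# ":memory:" keeps the demo off disk; a file name such as "MH370.sqlite"
# would create the file in the current working folder instead.
conn = open_database(":memory:")
first = store_tweet(conn, "1", "Any news on MH370?", "example_user")
second = store_tweet(conn, "1", "Any news on MH370?", "example_user")
total = conn.execute("SELECT COUNT(*) FROM tweets").fetchone()[0]
print(first, second, total)
```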


Hit RUN!

Are we getting all the tweets?

If you run the script once or twice a day, that should be enough to capture all tweets generated on that day, plus tweets that are a few days old.

But historical tweets are EXPENSIVE! Tweets older than a week can be purchased through http://gnip.com/