Boolean search query guide

SEARCH QUERY GUIDE

BOOLEAN SEARCH QUERIES

We are all used to searching on Google. We type in keywords that interest us, and Google matches documents that contain those keywords and returns them as search results. However, life isn't always that simple. In research, where decisions depend on the precision and reliability of the search results, we need a more powerful way to express our searches so that the computer understands exactly what we need and what we don't need.

For example, you could be writing a report on all Apple products. If you simply entered the keyword "apple" as you would in Google, a large proportion of the search results might refer to apple the fruit instead of Apple products, to people who have Apple as a surname, or to the American bank that has Apple in its name. You may also be interested only in Apple products from the Middle East region, and you should be able to convey that to the computer easily.

To solve this problem, your search keywords should be written as “Boolean search queries”. Boolean search queries allow you to add special meaning to your keywords by adding “Boolean operators”. For example, a simple Boolean query for the above problem could be:

Apple AND (iPhone OR iPad OR Mac) NOT "Apple Bank"

In this example, the words AND, OR and NOT, and the characters ( ) and “ ” are examples of Boolean operators.

In this document, we will show you a comprehensive list of operators that you can use to create powerful searches using the Locus Elite tool.

BASIC OPERATORS

QUOTES “ ”
Explanation: Find messages that contain the exact term in the quotes.
Example: “Abu Dhabi” returns messages that contain the words Abu and Dhabi in exactly that order.

AND
Explanation: Find messages that contain each and every term.
Example: “Abu Dhabi” AND hotels AND flights returns messages that contain all of the terms. AND is also the default operator, so you don’t have to write it: “Abu Dhabi” hotels flights has the same meaning.

OR
Explanation: Find messages that contain either of the terms.
Example: “Abu Dhabi” OR “Dubai” returns messages that contain either Abu Dhabi or Dubai.

NOT
Explanation: Find messages that contain one term and not another.
Example: “Abu Dhabi” NOT “Dubai” returns messages that contain Abu Dhabi and do not contain Dubai.

WILDCARD *
Explanation: Find messages using a root word.
Example: Safe* returns messages that contain any word that starts with safe, e.g. safety, safest, etc.

BRACKETS ( )
Explanation: Terms enclosed within brackets are dealt with first, before anything outside them is applied.
Example: ( UAE AND flights ) OR ( UAE AND hotels ) returns messages about either UAE flights or UAE hotels.
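
To picture what the basic operators mean, here is a small Python sketch that checks a single message against a few of the example queries. It is an illustration only and says nothing about how Locus Elite works internally. The helper names words, has_phrase and has_prefix are made up for this example, and the sample message is invented.

```python
import re


def words(message: str) -> list[str]:
    """Lower-case a message and split it into words, ignoring punctuation."""
    return re.findall(r"[a-z0-9]+", message.lower())


def has_phrase(message: str, phrase: str) -> bool:
    """QUOTES: True if the words of `phrase` appear consecutively, in order."""
    msg, target = words(message), words(phrase)
    if not target:
        return False
    return any(msg[i:i + len(target)] == target
               for i in range(len(msg) - len(target) + 1))


def has_prefix(message: str, root: str) -> bool:
    """WILDCARD: True if any word starts with `root` (as in safe*)."""
    return any(w.startswith(root.lower()) for w in words(message))


# Invented sample message, used only for this illustration.
message = "Cheap flights and safest hotels in Abu Dhabi"

# "Abu Dhabi" AND hotels AND flights
all_terms = (has_phrase(message, "Abu Dhabi")
             and has_phrase(message, "hotels")
             and has_phrase(message, "flights"))

# "Abu Dhabi" NOT "Dubai"
abu_dhabi_not_dubai = (has_phrase(message, "Abu Dhabi")
                       and not has_phrase(message, "Dubai"))

# safe*
starts_with_safe = has_prefix(message, "safe")

print(all_terms, abu_dhabi_not_dubai, starts_with_safe)
# True True True
```

Python's own and, or and not keywords and its round brackets play the same role here as AND, OR, NOT and ( ) do in a query.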

ADVANCED OPERATORS

posttitle:
Explanation: Find messages that contain the exact term in the title. Title refers to the headline in news articles and the post title in social media posts. For tweets, title refers to the full tweet text.
Example: posttitle:Dubai retrieves posts with the word Dubai in the title.

postedat:
Explanation: Filter messages according to the date on which they were posted.
Example: postedat:[2015-06-25T00:00:00Z TO 2015-06-30T00:00:00Z] retrieves posts that were posted between midnight (12am) at the start of 25/6/15 and midnight (12am) at the start of 30/6/15.
Example: postedat:[2015-06-25T10:30:00Z TO 2015-06-30T13:30:00Z] retrieves posts that were posted between 10.30am on 25/6/15 and 1.30pm on 30/6/15. The time zone is UTC, which is 4 hours behind Gulf Standard Time.

postlanguage:
Explanation: Filter messages according to their language.
Example: Dubai AND postlanguage:en retrieves posts that are in English and contain Dubai.

mediatype:
Explanation: Filter messages according to their media type (Facebook, Twitter, Instagram, etc.). *Refer to the appendix for the list of media type codes.
Example: mediatype:(twitter OR facebook OR news) retrieves tweets, Facebook posts or news articles.
Example: mediatype:(twitter) retrieves tweets only.
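
Because postedat: expects UTC times, a time that you have in mind in Gulf Standard Time has to be moved back by 4 hours before it goes into the query. The Python sketch below does that conversion and prints the filter text. The function name postedat_range is hypothetical and only used for this illustration; the output format simply copies the examples in this guide.

```python
from datetime import datetime, timedelta, timezone

# Gulf Standard Time is 4 hours ahead of UTC, as stated in this guide.
GST = timezone(timedelta(hours=4), "GST")


def postedat_range(start_local: datetime, end_local: datetime) -> str:
    """Build a postedat: filter from two timezone-aware datetimes."""
    fmt = "%Y-%m-%dT%H:%M:%SZ"
    start_utc = start_local.astimezone(timezone.utc).strftime(fmt)
    end_utc = end_local.astimezone(timezone.utc).strftime(fmt)
    return f"postedat:[{start_utc} TO {end_utc}]"


query = postedat_range(
    datetime(2015, 6, 25, 14, 30, tzinfo=GST),  # 2.30pm in the Gulf
    datetime(2015, 6, 30, 17, 30, tzinfo=GST),  # 5.30pm in the Gulf
)
print(query)
# postedat:[2015-06-25T10:30:00Z TO 2015-06-30T13:30:00Z]
```

In other words, the second postedat: example in this guide covers 2.30pm on 25/6/15 to 5.30pm on 30/6/15 in Gulf Standard Time.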

Notes:

1. Queries are not case sensitive: all letters are treated as if they were lower case letters. Punctuation and symbols in your search query are also ignored.

2. All Boolean operators must be written in capital letters, e.g. AND, OR and NOT instead of “and”, “or” and “not”.

3. Brackets ( ) are used to group search terms and to make your intentions clear to the computer. You can also put brackets inside brackets to create complex queries. For example: (cafe OR restaurant) AND ((cheap OR economic) AND (hygienic OR clean)) matches, for instance, clean and economic cafes or restaurants, as well as hygienic and cheap cafes or restaurants.

4. The mediatype: operator can be used to restrict results to specific media types, while the usercountry: and postlanguage: operators can be used to filter for results from certain countries and languages respectively. These operators are only useful when combined with other terms using the AND / NOT operators. An example showing the above operators in use would be: summer AND ( postlanguage:en ) AND ( mediatype:twitter ) AND ( usercountry:( us OR gb )). The complete list of media type, language and country codes currently supported can be found in the Appendix.
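
If you assemble such combined queries often, a small script can put the brackets in the right places for you. The Python sketch below rebuilds the example from note 4. The function names field_filter and build_query are hypothetical, chosen only for this illustration; they are not part of Locus Elite.

```python
def field_filter(field: str, values: list[str]) -> str:
    """Build a bracketed filter such as (usercountry:(us OR gb))."""
    if not values:
        raise ValueError("at least one value is required")
    codes = [v.lower() for v in values]  # codes in the appendix are lower case
    body = codes[0] if len(codes) == 1 else "(" + " OR ".join(codes) + ")"
    return f"({field}:{body})"


def build_query(keyword: str, language: str,
                media_types: list[str], countries: list[str]) -> str:
    """Combine a keyword with language, media type and country filters."""
    parts = [
        keyword,
        field_filter("postlanguage", [language]),
        field_filter("mediatype", media_types),
        field_filter("usercountry", countries),
    ]
    return " AND ".join(parts)  # operators must be in capital letters


print(build_query("summer", "en", ["twitter"], ["us", "gb"]))
# summer AND (postlanguage:en) AND (mediatype:twitter) AND (usercountry:(us OR gb))
```

Each filter is wrapped in its own pair of brackets, so it stays grouped no matter what other terms are joined to it.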

ADVANCED OPERATORS (continued)

userfollowers:
Explanation: Filter messages based on the number of followers their author has.
Example: userfollowers:[2000 TO *] represents 2000 or more followers.
Example: userfollowers:[2000 TO 5000] represents between 2000 and 5000 followers.

popularity_i:
Explanation: Filter messages based on the popularity tier of the user who posted them or the publication that published them (1 = most popular tier; 7 = least popular tier). Popularity for news publications is based on (a) web traffic to the website and (b) local reputation. Popularity for social media users is based only on the number of followers the user has.
Example: popularity_i:[1 TO 3] retrieves only messages from the three most popular tiers (1 to 3).

usercountry:
Explanation: Filter messages based on the country of origin of the user. Country is derived from the profile information of the user. Note that a significant proportion of social media users do not state their whereabouts on their profile, so filtering on a combination of country and language is generally recommended. For news publications, this is the country that the news organisation primarily targets. ISO two-letter country codes are required here. *Refer to the appendix for the list of country codes.
Example: usercountry:(ae OR sa) retrieves only messages posted by users who are said to be from the United Arab Emirates or Saudi Arabia.
Example: usercountry:in retrieves only messages posted by users in India or news publications from India.

gender:
Explanation: Filter messages based on the gender (male or female) of the user who posted them. Gender is derived from the name of the person. Note that this information is not available for all users, as they don't have to give their real name on their profiles. In some cases names are also judged based on the location of the user, as a male name in one country could be a female name in another. Names that are unisex in a country aren't tagged at all.
Example: gender:male retrieves only messages from men.
Example: gender:female retrieves only messages from women.
Example: gender:(male OR female) retrieves only messages that have been tagged for gender.

sentiment:
Explanation: Filter messages based on the sentiment of the message. Sentiment is classified as "positive", "negative" or "neutral". When no sentiment is expressed in a post, it may be tagged as "neutral".
Example: sentiment:(positive OR negative) retrieves only messages that are either positive or negative.
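
The userfollowers: and popularity_i: operators both take a range written as [low TO high], with * standing for "no upper limit". The Python sketch below writes such ranges out. The function name range_filter is hypothetical and only used for this illustration; the output format copies the examples in this guide.

```python
from typing import Optional


def range_filter(field: str, low: int, high: Optional[int] = None) -> str:
    """Build a range filter; leaving out `high` means no upper limit (*)."""
    if high is not None and high < low:
        raise ValueError("high must not be smaller than low")
    upper = "*" if high is None else str(high)
    return f"{field}:[{low} TO {upper}]"


print(range_filter("userfollowers", 2000))        # userfollowers:[2000 TO *]
print(range_filter("userfollowers", 2000, 5000))  # userfollowers:[2000 TO 5000]
print(range_filter("popularity_i", 1, 3))         # popularity_i:[1 TO 3]
```

A range built this way can be joined to a keyword with AND in the same way as any other filter, for example Dubai AND userfollowers:[2000 TO *].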

APPENDIX – MEDIA TYPE CODES

CODE            MEDIA TYPE
instagram       Instagram
twitter         Twitter
facebook        Facebook
news            News
pinterest       Pinterest
youtube         YouTube
googleplus      Google+
linkedin        LinkedIn
forum_posts     Forum posts
forum_threads   Forum threads
blog            Blogs

ADVANCED OPERATORS (continued)

username:
Explanation: Filter messages based on the username or screen name of the user who posted them. Remember to always write the username in lower case letters.
Example: username:filmabudhabi retrieves only messages posted by the user with the username or screen name "filmabudhabi".
Example: username:(mbznews OR mbzphotos) retrieves only messages posted by the users with the usernames mbznews or mbzphotos. You can add any number of usernames here.

userverified:
Explanation: Filter messages based on whether the user who posted them has been verified by the social network. This applies only to social media. Social networks typically verify some accounts as being authentic and not from an impersonator. This is shown with a blue tick mark beside the user's name on the social network.
Example: userverified:true retrieves only messages posted from accounts that have been verified.

APPENDIX – LANGUAGE CODES

CODE LANGUAGE   CODE LANGUAGE   CODE LANGUAGE
ar Arabic   fa Persian   la Latin
szh Chinese Simplified   pl Polish   vi Vietnamese
tzh Chinese Traditional   ro Romanian   ms Malay
nl Dutch   sk Slovak   tl Tagalog
en English   sr Serbian   ta Tamil
fr French   sl Slovenian   am Amharic
de German   es Spanish   mt Maltese
el Greek   sv Swedish   ga Irish
it Italian   th Thai   mk Macedonian
ja Japanese   tr Turkish   cy Welsh
ko Korean   hr Croatian   bn Bengali
pt Portuguese   bg Bulgarian   gu Gujarati
ru Russian   ca Catalan   kn Kannada
cs Czech   et Estonian   ml Malayalam
da Danish   fi Finnish   mr Marathi
he Hebrew   af Afrikaans   ne Nepali
hu Hungarian   sq Albanian   pa Punjabi
is Icelandic   eu Basque   so Somali
id Indonesian   be Belorussian   sw Swahili
zh Chinese   br Breton   te Telugu
lv Latvian   fo Faroese   uk Ukrainian
lt Lithuanian   gd Gaelic   ur Urdu
no Norwegian   hi Hindi

APPENDIX – COUNTRY CODES

CODE COUNTRY   CODE COUNTRY   CODE COUNTRY   CODE COUNTRY
af Afghanistan   bt Bhutan   cd Democratic Republic of Congo   gf French Guiana
al Albania   bo Bolivia   ck Cook Islands   pf French Polynesia
dz Algeria   ba Bosnia and Herzegovina   cr Costa Rica   ga Gabon
as American Samoa   bw Botswana   ci Côte d'Ivoire   gm Gambia
ad Andorra   br Brazil   hr Croatia   ge Georgia
ao Angola   vg British Virgin Islands   cu Cuba   de Germany
ai Anguilla   bn Brunei Darussalam   cy Cyprus   gh Ghana
aq Antarctica   bg Bulgaria   cz Czech Republic   gi Gibraltar
ag Antigua and Barbuda   bf Burkina Faso   dk Denmark   gr Greece
ar Argentina   bi Burundi   dj Djibouti   gl Greenland
am Armenia   kh Cambodia   dm Dominica   gd Grenada
aw Aruba   cm Cameroon   do Dominican Republic   gp Guadeloupe
au Australia   ca Canada   ec Ecuador   gu Guam
at Austria   cv Cape Verde   eg Egypt   gt Guatemala
az Azerbaijan   ky Cayman Islands   sv El Salvador   gg Guernsey
bs Bahamas   cf Central African Republic   gq Equatorial Guinea   gn Guinea
bh Bahrain   td Chad   er Eritrea   gw Guinea-Bissau
bd Bangladesh   cl Chile   ee Estonia   gy Guyana
bb Barbados   cn China   et Ethiopia   ht Haiti
by Belarus   cx Christmas Island   fk Falkland Islands   hn Honduras
be Belgium   cc Cocos Islands   fo Faroe Islands   hk Hong Kong
bz Belize   co Colombia   fj Fiji   hu Hungary
bj Benin   km Comoros   fi Finland   is Iceland
bm Bermuda   cg Congo   fr France   in India

APPENDIX – COUNTRY CODES (continued)

CODE COUNTRY   CODE COUNTRY   CODE COUNTRY   CODE COUNTRY
id Indonesia   lu Luxembourg   na Namibia   pn Pitcairn Islands
ir Iran   mo Macau   nr Nauru   pl Poland
iq Iraq   mk Macedonia   np Nepal   pt Portugal
ie Ireland   mg Madagascar   nl Netherlands   pr Puerto Rico
im Isle of Man   mw Malawi   an Netherlands Antilles   qa Qatar
il Israel   my Malaysia   nc New Caledonia   re Reunion
it Italy   mv Maldives   nz New Zealand   ro Romania
jm Jamaica   ml Mali   ni Nicaragua   ru Russia
jp Japan   mt Malta   ne Niger   rw Rwanda
je Jersey   mh Marshall Islands   ng Nigeria   bl Saint Barthelemy
jo Jordan   mq Martinique   nu Niue   sh Saint Helena
kz Kazakhstan   mr Mauritania   nf Norfolk Island   kn Saint Kitts and Nevis
ke Kenya   mu Mauritius   kp North Korea   lc Saint Lucia
ki Kiribati   yt Mayotte   mp Northern Mariana Islands   mf Saint Martin
kw Kuwait   mx Mexico   no Norway   pm Saint Pierre and Miquelon
kg Kyrgyzstan   fm Micronesia, Federated States of   om Oman   vc Saint Vincent and the Grenadines
la Laos   md Moldova   pk Pakistan   ws Samoa
lv Latvia   mc Monaco   pw Palau   sm San Marino
lb Lebanon   mn Mongolia   ps Palestinian Territory   st Sao Tome and Principe
ls Lesotho   me Montenegro   pa Panama   sa Saudi Arabia
lr Liberia   ms Montserrat   pg Papua New Guinea   sn Senegal
ly Libya   ma Morocco   py Paraguay   rs Serbia
li Liechtenstein   mz Mozambique   pe Peru   sc Seychelles
lt Lithuania   mm Myanmar   ph Philippines   sl Sierra Leone

APPENDIX – COUNTRY CODES (continued)

CODE COUNTRY   CODE COUNTRY
sg Singapore   to Tonga
sk Slovakia   tt Trinidad and Tobago
si Slovenia   tn Tunisia
sb Solomon Islands   tr Turkey
so Somalia   tm Turkmenistan
za South Africa   tc Turks and Caicos
kr South Korea   tv Tuvalu
es Spain   ug Uganda
lk Sri Lanka   ua Ukraine
sd Sudan   ae United Arab Emirates
sr Suriname   gb United Kingdom
sj Svalbard and Jan Mayen   us United States
sz Swaziland   uy Uruguay
se Sweden   vi US Virgin Islands
ch Switzerland   uz Uzbekistan
sy Syria   vu Vanuatu
tw Taiwan   va Vatican City
tj Tajikistan   ve Venezuela
tz Tanzania   vn Vietnam
th Thailand   ye Yemen
tl Timor-Leste   zm Zambia
tg Togo   zw Zimbabwe
tk Tokelau

ABOUT LOCUS ELITE

By marrying real-time search insights and social data, Locus offers a dedicated physical space for policy makers, strategists and leaders to spot trends, follow current affairs and track chatter around government services using state-of-the-art visualisations projected on large flat-screen displays.

For more details, kindly visit our website at http://locuselite.com or email us [email protected].