
Google Search Methods: Finding things you didn’t know existed

Daniel M. Russell, PhD Search Quality & User Experience Research

Tipsheet: bit.ly/Dan-ADV-Tipsheet2015 Slides: bit.ly/Dan-SKUP-preso

Search Tips & Strategies for Investigative Journalists

1. Search-by-image

•  Suppose you have an image…

… how can you figure out what it is?
… where did it come from?
… who has copyright?

Demo

You can ask impossible questions…

Where is this?

Search by Image

I found this in the basement… what is it?

What kind of a caterpillar is this?

Search-by-Image

1. Drag the image into Image Search

Modify the query to provide context

2. Add a couple of keywords…

Subimaging as a way to get SBI (Search-by-Image) to work

Crop to just the salient bit

Summary: Search-by-Image

1.  Use SBI on images from books (finds images from many resources)

2.  The highest probability of a match is for “common images / common views”

3.  Modify the query to focus the search with additional information

4.  Crop the image to a region with a single element (e.g., logo or distinctive feature)

2. Image Search: Filtering by color

•  Use the image filters to drill into results


Select “Search Tools” > Color > Green


3. Alerts aka “standing queries”

•  http://www.google.com/alerts
•  Scan news, groups, web, videos, comprehensive…
•  Generate emails automatically

–  Use in conjunction with advanced search techniques

4. Number of results

•  The # of results is an estimate. (Repeat that!)

Example result counts from the slide: 1.08M, 73K, 84K

5. Google Trends

•  Shows search volume patterns across specific regions, categories, time frames, and properties

[ Google Trends ]

Lower half: filter by region

6. Google Correlate

•  Allows searches for queries that correlate in volume over time
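To make "correlate in volume over time" concrete, here is a minimal sketch of one common measure, the Pearson correlation coefficient, applied to two weekly search-volume series. The function name `pearson` and the three series are made up for illustration, and this is only an assumption about how such a correlation might be computed, not a description of how Google Correlate works internally.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length series."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("series must have the same length (at least 2)")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Covariance and variances (unnormalized; the factors cancel in the ratio)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

# Hypothetical weekly search volumes (illustrative numbers only)
sunscreen = [10, 20, 40, 80, 60, 30]
ice_cream = [12, 25, 45, 85, 70, 35]
snow_tires = [90, 70, 40, 10, 20, 60]

print(pearson(sunscreen, ice_cream))   # close to +1: volumes rise and fall together
print(pearson(sunscreen, snow_tires))  # close to -1: volumes move in opposite directions
```

A value near +1 means the two queries rise and fall together over time; a value near -1 means one rises when the other falls.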


7. Other Googles Exist

•  The Google you know is the US version.

•  Many countries, and most major languages, have their own version of Google.

For example…

Methods to find information from other languages

1.  Go to the Google web search for that country.

2.  Use the built-in other-language tool in Advanced Search

Go to the country’s own Google

•  Example: Google.co.in

Selecting Hindi from the home page: [ eurozone ]

Different Googles to try…

•  Pay attention to the languages offered by each country’s localized versions

–  Google.co.za (S. Africa)
–  Google.co.ke (Kenya)
–  Google.co.id (Indonesia)
–  Google.com.vn (Việt Nam)
–  etc....

•  Fastest way to find country Google access?

–  [ Google <country> ]

–  [ Google Ireland ]
–  [ Google Singapore ]
–  [ Google Tasmania ]

Exceptions

•  Notes:
–  MOST countries use Google.co.?? as their domain
–  BUT some are Google.com.?? (e.g., Ghana: Google.com.gh)
–  SOME are Google.?? (e.g., Rwanda: Google.rw)

•  Not possible to use Google to search some domains:
–  Bhutan
–  Mayotte
–  etc…

–  But you CAN use site:yt to search Mayotte (YT) or site:bt to search Bhutan (BT)

Why might you care?

•  The news is very different depending on where you stand

What matters / what doesn’t matter in search

•  Capitalization doesn’t matter (except for OR)


•  Diacritical characters DO matter (á, é, í, ó, ú, ý, ö, ø, å, Ç, Ğ, I, İ...)
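Because diacritics change what matches, it can be worth trying a query both with and without them. The sketch below uses Python's standard `unicodedata` module to produce an accent-free variant; the helper names `strip_diacritics` and `query_variants` are hypothetical, and letters that are not a base letter plus a combining mark (such as ø) are left unchanged by this approach.

```python
import unicodedata

def strip_diacritics(text):
    """Remove combining marks (accents) after canonical decomposition."""
    decomposed = unicodedata.normalize("NFD", text)
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return unicodedata.normalize("NFC", stripped)

def query_variants(query):
    """Return the query as typed plus, if different, an accent-free variant."""
    variants = [query]
    plain = strip_diacritics(query)
    if plain != query:
        variants.append(plain)
    return variants

print(query_variants("Málaga café"))  # two variants: accented and unaccented
print(query_variants("Tromsø"))       # one variant: ø has no decomposition
```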


•  But special characters DO NOT matter


8. Language Translation

•  Constantly improving…
•  Romance-to-Romance translations are (currently) the best

[ Google translate ]

Pay attention to the suggestions…

Combine content + tools: Other Wikipedias

English (6K words) Russian (7K)

9. Finding Tools

•  Good searchers know that tools can help their searching

•  3 tools to start you off:
1. Reverse dictionary
2. Finding LISTSERVs
3. Control-F

9.A. Finding and using other tools

You’re writing a story about a recent air disaster, and there’s a question about the integrity of the outside of the jet engine.

Question: What is that part of a jet engine called?


Answer: I don’t know.

•  This is a really hard question. The best way to answer it is to first look for a tool that can help… a reverse dictionary. [ reverse dictionary ]

Reverse dictionary

•  Then, go to the reverse dictionary http://www.onelook.com/reverse-dictionary.shtml … and type in the words [ jet engine housing streamlined ], then look through the list of words it shows you.

•  Answer: “nacelle”

9.B. Finding LISTSERVs

•  Why LISTSERVs? Superb source for people complaining.

[ list of LISTSERVS ]

•  Pay attention to suggestions as they appear

9.C. Control-F to find a word on the page

•  Does the California Vehicle Code regulate the use of “pocket bikes” on roads?

[ California Vehicle Code ]


It’ll look like this…

•  It’s 65 pages long

•  Is the phrase “pocket bike” used here?

Control-F aka CMD-F aka Edit>Find

NOTE!

How knowing Control-F changes things

•  Question: How many times does the word “behold” appear in the King James Bible?

9.D. There is a regexp “Find” Chrome extension
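A regular-expression find differs from plain Control-F in that it can match whole words only. The sketch below shows the idea with Python's standard `re` module; the helper name `count_word` is hypothetical and the sample sentence is a short stand-in, not text from the King James Bible.

```python
import re

def count_word(text, word):
    """Count case-insensitive whole-word occurrences of `word` in `text`."""
    pattern = re.compile(r"\b" + re.escape(word) + r"\b", re.IGNORECASE)
    return len(pattern.findall(text))

sample = "Behold, I stand at the door. And behold, the beholder beheld it."

# Whole-word regexp match: "beholder" and "beheld" are not counted.
print(count_word(sample, "behold"))

# Plain substring search (what a simple find does): "beholder" is counted too.
print(sample.lower().count("behold"))
```

On the sample sentence the whole-word count is 2, while the plain substring count is 3, which is why the two kinds of find can report different totals for the same page.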


When does Control-F NOT work?

•  A: When the entire document isn’t loaded into the browser page

All 177,000 results don’t fit onto one page

Advance to page 2

Or… the entire page auto-loads as you scroll

9.D. Tools: Search web history

•  Works only if logged in AND you’ve opted in

www.google.com/history


Your web search history is searchable (if you have it turned on)


9.E. Define

•  Example:
[ define loxodrome ]
[ define Mollweide projection ]

10.A. Scholar

•  Collection of scholarly papers from the research literature

•  Legal content (growing in quantity and coverage)

•  Has its own Alerts and Notifications

Scholar contains legal opinions as well

Change type here

Current drawbacks

•  Lack of an easy way to look for important legal issues in a case (KeyCite from Westlaw)
–  Identifies the specific issues and authorities which serve as the basis for the ruling, and indexes them in a summary at the start of the case

•  No way to Shepardize cases
–  Shepard’s Citation Service is a Lexis tool which enables a researcher to quickly determine whether or not a case is still good law. (Instead, you must manually review each of the results.)

•  Federal case law goes back 80 years

•  State case law goes back 50 years

Legal content in Scholar: Advantages

•  Getting the "big picture" of the scholarly discourse around a topic.

•  Viewing books, articles, conference proceedings, and more all in one list.

•  Determining authors and publications in an area of interest.

•  Tracking down incomplete citations.

Author search in Scholar

For example: The search [friedman regression] returns papers on the subject of regression written by people named Friedman. If you want to search on an author's full name, or last name and initials, enter the name in quotes: ["jh friedman"]. There must be no space between "author:" and your search term.

For example:

[author:flowers] returns papers written by people with the name Flowers, whereas [flowers -author:flowers] returns papers about flowers, and ignores papers written by people with the name Flowers.

Try alternate spellings of name: To find papers by Donald E. Knuth try [author:"d knuth"] [author:"de knuth"] [author:"donald e knuth"].
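The alternate-spelling advice can be sketched as a small helper that generates the three author: forms shown for Knuth. The function name `author_queries` and its argument layout are hypothetical; it only covers the first-initial, two-initial, and first-name-plus-middle-initial patterns from the example.

```python
def author_queries(first, middle, last):
    """Build Scholar author: queries from initials up to the fuller name."""
    last = last.lower()
    first_initial = first[0].lower()
    middle_initial = middle[0].lower()
    names = [
        f"{first_initial} {last}",                    # e.g. d knuth
        f"{first_initial}{middle_initial} {last}",    # e.g. de knuth
        f"{first.lower()} {middle_initial} {last}",   # e.g. donald e knuth
    ]
    # No space between "author:" and the quoted name.
    return [f'author:"{name}"' for name in names]

for query in author_queries("Donald", "E.", "Knuth"):
    print(query)
```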


Publication restrict

(This option is only available on the Advanced Scholar Search page.) Example:

If you want to search the Journal of Finance for articles about mutual funds, you might start like this:

In general, publication-restricted searches are effective if you're certain of what you're looking for, but they’re often narrower than you might expect.

For instance:

You might find that a search across all publications for [mutual funds] gives more useful results than a more specific search for "funds" only in the Journal of Finance.

Remember to try alternate spellings: e.g., Journal of Biological Chemistry is often abbreviated as J Biol Chem, or J Bio Chem.

10.B. Books


•  Books.google.com –  scanned page images [ manta ray ]

•  “Find in a library”


10.C. Patents

•  Google.com/patents

•  Usually want to use advanced search here

•  Now includes EU patents

10.D. Data table search: Research.Google.com/tables

•  Can now search for data tables directly


11. Public Data Explorer: Search / Visualize Public Data

http://www.google.com/publicdata/


Search, Visualize, and Upload datasets


Caution: Read metadata on tables carefully!

searchresearch1.blogspot.com/2014/02/answer-how-many-students-how-many-years.html

Small difference: The NSF numbers were 1/3rd the OECD’s numbers… Why? Turns out they count differently.

12. Google Maps / Earth / Geo in general

Important skill: Search for tools

opposite / adjacent = tan(Θ)
1.45 / 8.80 = tan(Θ)
1.45 / 8.80 = 0.1651
[ arctan (0.1651) ]
[ 0.16362 radians in degrees ]
9.3749 / 0.27 = 34.7 minutes … or 7:39 AM
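The same arithmetic can be checked in a few lines of Python using only the standard `math` module. The inputs 1.45, 8.80, and the divisor 0.27 are taken from the slide; the variable names, and the reading of 0.27 as degrees per minute, are assumptions made for this sketch. Dividing the rounded inputs directly gives a ratio of about 0.1648 rather than the 0.1651 shown on the slide (presumably the slide used unrounded measurements), but the result still comes out at roughly 34.7 minutes.

```python
import math

opposite = 1.45             # from the slide
adjacent = 8.80             # from the slide
degrees_per_minute = 0.27   # divisor used on the slide (interpretation assumed)

ratio = opposite / adjacent              # tan(theta)
angle_rad = math.atan(ratio)             # theta in radians
angle_deg = math.degrees(angle_rad)      # theta in degrees
minutes = angle_deg / degrees_per_minute

print(f"tan(theta) = {ratio:.4f}")
print(f"theta      = {angle_rad:.5f} rad = {angle_deg:.4f} degrees")
print(f"elapsed    = {minutes:.1f} minutes")
```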


Or… you could just look for a tool to do it…

BUT…

[ right triangle angle calculator ]


Flying into JFK (from the east to west)

What’s causing those rectilinear features?

How big are those features?

•  Use Google Earth (or Maps) to zoom in with a measuring tool

•  Realize that these aren’t CANALS, they’re more like DITCHES!

Google Maps Terrain View

To turn on Terrain view (Traffic, etc.)

How to measure distances in Google Maps

•  Right click (control-click on Mac) to start:

For our purposes, what can YOU find?

“What’s around here?”

What’s the news story…

•  … now that you know what the company is, you can find associated news stories.

•  With the map, you can identify the source of the company’s pollution, where it’s going, and who is (or should be) worried about it!

12.B. StreetView Archives

Upper left: use slider

Slider back to Apr 2008

12.C. Maps Gallery – maps.google.com/gallery/

•  Maps Gallery: a collection of time-based / georeferenced maps

Locating things in the world

Where in the world is this?

Examine the EXIF data…

•  EXIF is metadata built into the image file.

Enter Lat/Long directly into Google Maps

•  Streetview of Beach Dr in Aptos
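EXIF GPS fields are typically stored as degrees, minutes, and seconds plus a hemisphere letter, while Google Maps accepts decimal latitude/longitude. The sketch below shows that conversion and builds a Maps search link. The helper names `dms_to_decimal` and `maps_url` are hypothetical, the coordinates are made-up illustrative values rather than data from the photo on the slide, and the URL pattern is an assumption about the Maps link format that should be checked before relying on it.

```python
from urllib.parse import urlencode

def dms_to_decimal(degrees, minutes, seconds, ref):
    """Convert degrees/minutes/seconds plus N/S/E/W to signed decimal degrees."""
    value = degrees + minutes / 60 + seconds / 3600
    # South latitudes and west longitudes are negative.
    return -value if ref in ("S", "W") else value

def maps_url(lat, lon):
    """Build a Google Maps search URL for a latitude/longitude pair (assumed format)."""
    params = {"api": 1, "query": f"{lat:.6f},{lon:.6f}"}
    return "https://www.google.com/maps/search/?" + urlencode(params)

# Hypothetical GPS values as they might appear in EXIF metadata
lat = dms_to_decimal(36, 58, 12.0, "N")
lon = dms_to_decimal(121, 54, 0.0, "W")

print(lat, lon)
print(maps_url(lat, lon))
```

Typing the decimal pair (latitude, then longitude, separated by a comma) straight into the Maps search box is the manual equivalent.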


13. Operators

•  There are many operators (filetype: site: inurl: intext: etc.)
•  See Tipsheet for more details

Finding a particular kind of document

•  Your brother is a teacher at the local high school, and needs to find a lesson plan for a unit on superconducting materials.

•  Question: Can you find a lesson plan for him?

•  Hint: Look for a particular KIND of document…


Answer

•  Use the operator FILETYPE: to focus in just on presentations
[ superconductor high school filetype:ppt ]

•  Note that filetype: can take on ANY file extension:
–  PDF, PPT, XLS, DOC, WMV, TXT, CSV, SKP, KMV, …

(In fact, arbitrary extensions… e.g., AQS)
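Operator queries are just text, so they are easy to assemble programmatically. The following sketch joins plain terms with operator:value restrictions and turns the result into a web search link; `build_query` and `search_url` are hypothetical helper names, and the only external fact assumed is that a Google search link carries the query in its `q` parameter.

```python
from urllib.parse import urlencode

def build_query(terms, **operators):
    """Join plain terms with operator:value restrictions (no space after the colon)."""
    parts = list(terms)
    parts += [f"{name}:{value}" for name, value in operators.items()]
    return " ".join(parts)

def search_url(query):
    """Turn a query string into a Google web search URL."""
    return "https://www.google.com/search?" + urlencode({"q": query})

query = build_query(["superconductor", "high", "school"], filetype="ppt")
print(query)
print(search_url(query))
```

Swapping `filetype="ppt"` for `filetype="pdf"`, or adding `site="nytimes.com"`, yields the other operator forms mentioned in this section.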


* Searching within a particular site

•  Someone told me that I’d been quoted in the New York Times. OMG! What did I say that was quotable?

•  Can you find a page in the New York Times where I (Dan Russell) was quoted?

Answer

•  Use the site: operator to search within a particular web site…
[ “Daniel M Russell” site:nytimes.com ]

… and see the number 1 hit. (Yes, I worked at IBM.)

Why search on the full name with the middle initial?

1. Because “Daniel Russell” is a very common name.

2. The NYTimes has the convention of always spelling a person’s name out completely, including middle initials.

Gotcha: [ site:.EDU query ] careful about EDU


Another use of site: -- to search within

•  Example: Want to find all mentions of the composer Alan Hovhaness in the U. Maryland Music Archives collection.

•  How?

digital.lib.umd.edu

On the other hand… Don’t overlimit your search!

•  My friend Sean Carlson posts to Facebook “I’ve written an article in a major NYC paper…”

•  I foolishly search for: [ Sean Carlson Ireland site:nytimes.com ]


What SHOULD I have done?

•  Tried the simpler case first: [ Sean Carlson Ireland New York ]


Limit search by time…


* Advanced search tool

•  How to get to the advanced search UI

Advanced Search UI

Or… the obvious

Useful Advanced Search Feature…

•  Can select language of the result AND the location from which the results should be drawn…

Results

Tactic: use OR for synonym control

•  When you want to control synonyms used
•  Note: MUST be capitalized!

intext: requires that text be ON the page

•  For instance, if you search for: [ chilean toothfish ]


•  However, if you quote it… [ “chilean toothfish” ]


•  Using intext: [ intext:“chilean toothfish” ]


14. Google Custom Search Engines (CSE)

•  Custom Search Engine lets you build a specialized search engine

[ Google Custom Search Engine ]

•  www.google.com/cse/

•  Example: How can you search over all of the content in all of the 10 UC campuses?


•  You could do a query like:

[ coral seminar June 2014 site:ucsd.edu OR site:ucdavis.edu OR ... etc... ]
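Writing that OR chain by hand gets tedious as the site list grows, which is part of the appeal of a Custom Search Engine. As a rough sketch of the manual alternative, the helper below (hypothetical name `across_sites`) appends an OR-joined chain of site: restrictions to a query; only the two campus domains named in the query above are listed, and the rest would need to be added.

```python
def across_sites(query, sites):
    """Append an OR-joined chain of site: restrictions to a query."""
    if not sites:
        return query
    # OR must be capitalized to be treated as an operator.
    restriction = " OR ".join(f"site:{site}" for site in sites)
    return f"{query} {restriction}"

uc_sites = ["ucsd.edu", "ucdavis.edu"]  # extend with the remaining campuses
print(across_sites("coral seminar June 2014", uc_sites))
```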


Now… to use this:

•  Use just like Google (but the results come ONLY from the UC sites you selected)

Sample results

15. Question-asking/answering

Summary

•  When in doubt, search it out!

•  Your search skills will become stale quickly… keep tracking the new features that we offer!

•  Practice deliberately. When you get the chance, try the same search a few different ways and note the differences. Ask why!

How to learn this stuff: MOOCs

•  PowerSearchingWithGoogle.com

•  “Behind me is a ruin at the western edge of the city by the Bay… Once, on this site stood an impressive structure, one that is now veiled in mystery and exists only as a ruin.”

•  Can you find out what was once here, and once you know that, can you determine how many cubic feet of cement it took to build this amazing structure?

Open now!

Google Cheat Sheet PDF file

bit.ly/Google-cheat-sheet

Try out AGoogleADay.com


Feedback from you! (email me: [email protected])

•  What resources do YOU use in your searches?
–  A la the On-Line Encyclopedia of Integer Sequences; BLS; etc.

•  What information maps do YOU use?
–  A la the “reverse dictionary”

•  What’s the hardest problem (or kind of problem) you’ve had to deal with?


Tipsheet:

bit.ly/Dan-IRE-Tipsheet2015
bit.ly/Dan-IRE-Tipsheet2014
bit.ly/Dan-IRE-Tipsheet2013

My home site:

bit.ly/DanHome

My blog:

SearchResearch1.blogspot.com
@dmrussell
[email protected]