Data Mining and Data-Driven Storytelling: Computer Assisted Research and Reporting. Peter Verweij, School of Journalism, Utrecht, The Netherlands

Computer assisted research and reporting

Page 1: Computer assisted research and reporting

Data Mining and Data-Driven Storytelling

Computer Assisted Research and Reporting

Peter Verweij

School of Journalism, Utrecht, The Netherlands

Page 2: Computer assisted research and reporting

Wikileaks

• Started as a citizen reporters' site
• A whistleblowers' site
• Investigative journalists always depend on help from outside
• 300,000 documents from US diplomatic mail, processed by journalists, give a picture of foreign policy
• Julian Assange is on Interpol's list
• What is new is the volume, made possible by digitisation
• Title: data mining, data-driven storytelling

Page 3: Computer assisted research and reporting

Nick Davies

Copy/paste or new possibilities?
Facts, accuracy and credibility: Nick Davies
Book, interview, sites:
• http://www.humedia.nl/profiles/blogs/nick-davies-over
• Review: http://extra.volkskrant.nl/select/boeken/artikel.php?id=843
• Book website: http://www.flatearthnews.net/
• Dutch research: http://www.volkskrant.nl/multimedia/article1135829.ece/Eenderde_nieuws_is_voorverpakt

Page 4: Computer assisted research and reporting

Nick Davies 2

What is the conclusion?
• Is research or investigative journalism no longer possible?
• But: news is everywhere and the media focus on their own production, e.g. nrcnext and parliamentary reporting
• But: digitisation offers more possibilities
• Example: Twitter networks

Page 5: Computer assisted research and reporting

Traffic accidents in Utrecht

Page 6: Computer assisted research and reporting

Some classical US examples

• School bus and drunken drivers: drunk-driving convictions linked to school bus drivers via the driver's licence number
• Hurricane Andrew: damage map related to wind strength; building construction fraud
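
The school bus example comes down to joining two record sets on a shared key. The sketch below shows that idea only; the records, names and licence numbers are invented for illustration and do not come from the original story.

```python
# Hypothetical records: every name and licence number here is invented.
convictions = [
    {"licence": "A1001", "offence": "drunk driving"},
    {"licence": "B2002", "offence": "drunk driving"},
    {"licence": "C3003", "offence": "drunk driving"},
]
bus_drivers = [
    {"licence": "B2002", "name": "Driver One"},
    {"licence": "D4004", "name": "Driver Two"},
]


def match_on_licence(convictions, drivers):
    """Return (driver, conviction) pairs that share a licence number."""
    # Index the convictions by licence number so each driver is a quick lookup.
    by_licence = {}
    for conviction in convictions:
        by_licence.setdefault(conviction["licence"], []).append(conviction)

    matches = []
    for driver in drivers:
        for conviction in by_licence.get(driver["licence"], []):
            matches.append((driver, conviction))
    return matches


matches = match_on_licence(convictions, bus_drivers)
for driver, conviction in matches:
    print(driver["name"], "-", conviction["offence"])
```

In practice a match on a key is a lead, not a finding: each hit still has to be verified by hand before publication.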

Page 7: Computer assisted research and reporting

What's in a name?

Phil Meyer (Precision Journalism):

Some practitioners of the "new journalism" took to making up their facts in order to keep up with the deadline pressures. Others stopped short of making things up, but combined facts from different cases to write composite portrayals of reality that they passed off as real cases. Despite the problems, the new nonfiction remains an interesting effort at coping with information complexity and finding a way to communicate essential truth. It pushes journalism toward art. Its problem is that journalism requires discipline, and the discipline of art may not be the most appropriate kind. A better solution is to push journalism toward science, incorporating both the powerful data-gathering and analysis tools of science and its disciplined search for verifiable truth.

After the introduction of the internet and spreadsheets: CARR (computer assisted research and reporting)

Now, because of the analysis of databases: data mining

Page 8: Computer assisted research and reporting

Some other examples

• NY Times: gap in life expectancy
• USA Today: delegate tracker
• Volkskrant: top salaries
• NRC: food prices
• NRC: WOZ value (assessed property value)

Page 9: Computer assisted research and reporting

World Food Prices

• Simple story: World Bank echoes food cost alarm
• Research and background: Food price crisis; Costs of food
• Continuum for reporting: from reactive reporting to proactive reporting
• From a one-column press release story to a full investigative scoop

Page 10: Computer assisted research and reporting

Examples

Page 11: Computer assisted research and reporting

How to follow the story about food prices on the web?

• Find leading media (FT, BBC, Economist) and subscribe to their RSS feeds
• Search newspaper archives: LexisNexis
• Search the web with Google: use more keywords and quotation marks; look for different source types (doc, xls, ppt)
• Use Google News and create RSS feeds
• Find institutions and their databases, or use Google Public Data
• Bloggers: use Technorati
• Use Twitter search: hashtags (#) related to food prices
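
The search tips above (an exact phrase in quotation marks, extra keywords, a specific file type) can be combined into one query string. A minimal sketch, assuming Google's `filetype:` operator and the standard `q` URL parameter; the helper names `build_query` and `search_url` are made up for this example.

```python
from urllib.parse import urlencode


def build_query(phrase, keywords=(), filetype=None):
    """Combine an exact phrase, extra keywords and an optional file type."""
    parts = [f'"{phrase}"']           # quotation marks force an exact phrase
    parts.extend(keywords)            # more keywords narrow the results
    if filetype:
        parts.append(f"filetype:{filetype}")  # e.g. xls to look for spreadsheets
    return " ".join(parts)


def search_url(query):
    """Turn the query into a URL that can be opened in a browser."""
    return "https://www.google.com/search?" + urlencode({"q": query})


query = build_query("food prices", keywords=["FAO", "index"], filetype="xls")
print(query)
print(search_url(query))
```

Asking for `xls` rather than ordinary web pages is a quick way to find the underlying data instead of stories about the data.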

Page 12: Computer assisted research and reporting

What has changed in reporting?

Internet:
• More sources, both in number and in full text
• Wider geographical range
• Direct access
• Multimedia: including audio, video and graphics

Google indexes about 10 billion pages, but that is only 20% of the information on the web

Databases: more data

How do you find databases?
• Institutional approach to searching: CBS, World Bank, IMF, FAO, UN, Eurostat

Page 13: Computer assisted research and reporting

What has changed in reporting? (2)

Tools for handling data from databases:
• Spreadsheet: Excel
• Database: Access
• GIS (geographic information systems), mapping: ArcGIS

How do you store your data?
• Create your own database or spreadsheets to store your data
• askSam
• Google notes
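
As an illustration of "create your own database", here is a small sketch using SQLite from the Python standard library instead of Access or askSam. The table layout and the sample notes are assumptions made for this example, not something the slides prescribe.

```python
import sqlite3


def create_store(path=":memory:"):
    """Open (or create) a small research database with one notes table."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS notes ("
        "id INTEGER PRIMARY KEY, source TEXT, topic TEXT, note TEXT)"
    )
    return conn


def add_note(conn, source, topic, note):
    """Store one research note together with its source and topic."""
    conn.execute(
        "INSERT INTO notes (source, topic, note) VALUES (?, ?, ?)",
        (source, topic, note),
    )
    conn.commit()


def notes_for_topic(conn, topic):
    """Return (source, note) pairs for a topic, oldest first."""
    rows = conn.execute(
        "SELECT source, note FROM notes WHERE topic = ? ORDER BY id", (topic,)
    )
    return rows.fetchall()


# Placeholder notes, invented for illustration.
conn = create_store()
add_note(conn, "FAO", "food prices", "Placeholder: check the food price index.")
add_note(conn, "CBS", "traffic", "Placeholder: request accident figures.")
print(notes_for_topic(conn, "food prices"))
```

Passing a file name instead of `:memory:` keeps the notes between sessions.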

Page 14: Computer assisted research and reporting

New tools

• Google Public Data: direct analysis of databases
• Google Forms: create an online survey
• Google Fusion Tables: link data to maps
• Google Maps mashups: adding data to Google Maps
• Links: memeburn and blog
• WordPress plugin for polls

Page 15: Computer assisted research and reporting

Maps mashups

• Web 2.0 and mashups: merging data on the web: http://projects.latimes.com/homicide/map/
• Using the Google API to create POIs: FCJ Utrecht
• Maps in a slideshow with audio: http://www.fao.org/hunger/en/
• http://www.gapminder.org/world/
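
A mashup starts with points of interest in a format that a mapping tool can read. The sketch below writes them as GeoJSON, a common open format, rather than calling the Google API mentioned on the slide. The POI names are placeholders and the coordinates are only rough positions in and around Utrecht.

```python
import json


def to_geojson(points):
    """Convert a list of {name, lat, lon} dicts into a GeoJSON FeatureCollection."""
    features = []
    for point in points:
        features.append({
            "type": "Feature",
            # GeoJSON stores coordinates as [longitude, latitude], in that order.
            "geometry": {
                "type": "Point",
                "coordinates": [point["lon"], point["lat"]],
            },
            "properties": {"name": point["name"]},
        })
    return {"type": "FeatureCollection", "features": features}


# Placeholder POIs: names are invented, coordinates are approximate.
points = [
    {"name": "Example POI 1", "lat": 52.09, "lon": 5.12},
    {"name": "Example POI 2", "lat": 52.08, "lon": 5.17},
]
collection = to_geojson(points)
print(json.dumps(collection, indent=2))
```

The longitude-first order is the most common source of markers ending up in the wrong place.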

Page 16: Computer assisted research and reporting

Asylum applications by country of nationality

Country                  Sep-97   Sep-98   Increase (%)   Share (%)
Afghanistan                 794      820           3.27       16.06
Bosnia-Herzegovina          182      526         189.01       10.30
Iraq                       1154      904         -21.66       17.70
Iran                        106      159          50.00        3.11
Serbia and Montenegro       192      537         179.69       10.51
Liberia                      11       12           9.09        0.23
Sudan                        89      248         178.65        4.86
Somalia                     109      294         169.72        5.76
Sri Lanka                   138      121         -12.32        2.37
Turkey                      127      158          24.41        3.09
Other                       856     1328          55.14       26.00
Total                      3758     5107          35.90

[Pie chart "Nationalities": share of applications per country, as in the Share column]

[Bar chart "Nationalities": share per country in %, axis from 0.00 to 30.00]
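
The two derived columns in the asylum table can be recomputed from the raw counts. The sketch below uses the Sep-97 and Sep-98 figures from the slide. It assumes that the increase is the percentage change between the two months and that the share is each country's part of the Sep-98 total, which is what the printed values are consistent with.

```python
# (Sep-97, Sep-98) application counts taken from the table on the slide.
applications = {
    "Afghanistan": (794, 820),
    "Bosnia-Herzegovina": (182, 526),
    "Iraq": (1154, 904),
    "Iran": (106, 159),
    "Serbia and Montenegro": (192, 537),
    "Liberia": (11, 12),
    "Sudan": (89, 248),
    "Somalia": (109, 294),
    "Sri Lanka": (138, 121),
    "Turkey": (127, 158),
    "Other": (856, 1328),
}


def percent_change(old, new):
    """Percentage change from old to new."""
    return (new - old) / old * 100


def summarize(apps):
    """Return both totals and, per country, (increase %, share of new total %)."""
    total_old = sum(old for old, _ in apps.values())
    total_new = sum(new for _, new in apps.values())
    rows = {}
    for country, (old, new) in apps.items():
        rows[country] = (
            round(percent_change(old, new), 2),
            round(new / total_new * 100, 2),
        )
    return total_old, total_new, rows


total_old, total_new, rows = summarize(applications)
print("Total", total_old, total_new, round(percent_change(total_old, total_new), 2))
for country, (increase, share) in rows.items():
    print(f"{country:<24}{increase:>8.2f}{share:>8.2f}")
```

Recomputing published percentages like this is a cheap check on whether a table is internally consistent before it is used in a story.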

Page 17: Computer assisted research and reporting
Page 18: Computer assisted research and reporting

Elections 1998/2003: largest party per municipality
www.nederlandkiest.nl

Page 19: Computer assisted research and reporting

Municipal data

• Municipality of Utrecht: http://www.utrecht.nl/smartsite.dws?id=86964
• Interactive database: http://utrecht.buurtmonitor.nl/

Page 20: Computer assisted research and reporting
Page 21: Computer assisted research and reporting

What can we do with these tools?

• Calculations: averages
• Graphs: bar, line, pie
• Maps
• Interactive graphs: UNDP data by Gapminder

Page 22: Computer assisted research and reporting

What is the objective?

In journalism:
• Graphs are analysis, not illustrations
• Cooperation between programmers, designers and journalists
• The aim is better journalism: better storytelling, informing the public

What do you need?
• Knowledge of statistics
• How to handle spreadsheets, graphs and maps

Page 23: Computer assisted research and reporting

Other techniques

Social network analysis (examples from IRE):

Terrorist Network: Valdis Krebs published "Uncloaking Terrorist Networks," an analysis of the Sept. 11, 2001, terrorist network in the April 2002 issue of First Monday, a peer-reviewed Internet journal. This article explains how Krebs was able to construct a visual representation of the network as well as what this visualization can tell us about the network that was previously unknown. Other papers Krebs has authored, including information on InFlow software, can be found at the researcher's Web site: www.orgnet.com

527 Committee Donors: In the 2004 presidential election "huge donations of a handful of wealthy liberals named Linda Pritzker, Stephen L. Bing, Peter B. Lewis and George Soros could determine the outcome. Together, they have given more than $26 million to help finance the most extensive get-out-the-vote operation in history, the goal of which is to make John F. Kerry president." These donations were made to 527 organizations. "Named after a section of the tax code, the 527 groups are doing much of the advertising and field work traditionally left to party organizations." Included with this story is a diagram displaying contributions to Democratic 527s and a list of the biggest donors to these groups.

They Rule: They Rule is a Web site that allows you to create maps of the interlocking directories of the top 100 companies in the United States in 2001. The data is static, so it is fast becoming out of date, as companies merge and disappear and directors shift boards. A new version of this site is being developed.
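
The They Rule example is a network of companies linked by shared board members. The sketch below shows how such links can be derived from board lists and counted per company. The companies and directors are invented placeholders, and this is not how They Rule or InFlow is implemented.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical board memberships: all names are placeholders.
boards = {
    "Company A": {"Director 1", "Director 2", "Director 3"},
    "Company B": {"Director 2", "Director 4"},
    "Company C": {"Director 3", "Director 4", "Director 5"},
    "Company D": {"Director 6"},
}


def interlocks(boards):
    """Return {(company, company): shared directors} for every linked pair."""
    links = {}
    for a, b in combinations(sorted(boards), 2):
        shared = boards[a] & boards[b]   # directors sitting on both boards
        if shared:
            links[(a, b)] = shared
    return links


def degree(links):
    """Count, per company, how many other companies it is linked to."""
    counts = defaultdict(int)
    for a, b in links:
        counts[a] += 1
        counts[b] += 1
    return dict(counts)


links = interlocks(boards)
for (a, b), shared in links.items():
    print(a, "<->", b, "via", ", ".join(sorted(shared)))
print(degree(links))
```

The same two steps (find shared members, count links) apply to other networks, such as donors and committees.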

Page 24: Computer assisted research and reporting

Overview from NRC

Page 25: Computer assisted research and reporting

Networks in journalism 2

Jury membership of literary prizes

Page 26: Computer assisted research and reporting

Twitter networks between politicians and journalists

Page 27: Computer assisted research and reporting

Other techniques 2

Collect your own data:
• From secondary to primary data
• Design your own survey and collect data online using Google Forms, a WordPress plugin or SurveyMonkey
• Or content analysis: for example NRC; talk show and party

Analysis:
• Online
• Importing into a spreadsheet
• Data matrix using SPSS
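
Survey results usually arrive as a data matrix: one row per respondent, one column per variable. The sketch below reads such a matrix from CSV text and counts the answers to one question, using the Python standard library instead of SPSS. The variable names and responses are invented for illustration.

```python
import csv
import io
from collections import Counter

# Hypothetical survey export: one row per respondent, one column per variable.
raw = """respondent,reads_newspaper,age_group
1,yes,18-29
2,no,30-49
3,yes,30-49
4,yes,50+
"""


def load_matrix(text):
    """Parse CSV text into a list of rows, each a dict keyed by variable name."""
    return list(csv.DictReader(io.StringIO(text)))


def frequencies(matrix, variable):
    """Count how often each answer occurs for one variable."""
    return Counter(row[variable] for row in matrix)


matrix = load_matrix(raw)
print(frequencies(matrix, "reads_newspaper"))
print(frequencies(matrix, "age_group"))
```

A frequency table per variable is the usual first step before cross-tabulating two variables against each other.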

Page 28: Computer assisted research and reporting
Page 29: Computer assisted research and reporting