2nd Kavli Symposium on Science Journalism: Data Mining
Dolce Hayes Mansion, San Jose, CA, USA, 16th-18th February 2015
Detailed Report
Published August 25, 2015
PREFACE
The 2nd Kavli Symposium on Science Journalism brought together an international group of leading
science journalists and specialists to explore data mining and innovative data tools.
Advisory committee
Damien Chalaud
Executive Director
World Federation of Science Journalists
Veronique Morin
Science Journalist; Symposium project leader
World Federation of Science Journalists
Mariette DiChristina
Editor-in-Chief
Scientific American
Ivan Oransky
Global Editorial Director
MedPage Today
Dan Fagin
Director, Science, Health and Environmental
Reporting Program, New York University
Ginger Pinholster
Director, Office of Public Programs
American Association for the Advancement of Science
Pallab Ghosh
Science Correspondent
BBC
Volker Stollorz
Science Journalist
Frankfurter Allgemeine Sonntagszeitung
James Hamilton
Director, School of Journalism
Stanford University
Richard Stone
International News Editor
Science Magazine
Thomas Hayden
Science journalist/author
Lecturer, Stanford University
James Cohen (Advisor)
Director of Communications & Public Outreach
The Kavli Foundation
Robert Lee Hotz
Science Editor
Wall Street Journal
Participants
Fifty-six participants from 10 countries, who work as either science journalists or data mining experts,
attended the symposium. Their names can be found at the end of this document.
Report Authors
David M. Secko1 and Veronique Morin2
1 Department of Journalism, Concordia University, Montreal; email: [email protected]
2 Science journalist, Kavli Symposium project leader, former president of WFSJ, CSWA
Acknowledgment
We would like to thank the organizing committee for their time and effort in making the symposium a
success. In addition to the principal funding provided to the symposium by the Kavli Foundation,
generous support was also received from the German Science Academy.
FOREWORD
Dear Readers,
The Kavli Symposium on Science Journalism is a unique forum that brings together different
stakeholders (science journalists, academics, policy makers, and researchers) to discuss in depth the
topics and issues important to the field.
The 2nd Kavli Symposium on Science Journalism focused on the technological revolution of recent
decades that has allowed us to collect, store and exploit increasingly vast quantities of data and to
communicate them globally, irrespective of location. Data is everywhere and is making headway into every
area of the global economy. The impact of data in science is felt across many domains: from medical
research to the environment, genomics to clinical trials. It has created major changes in societies, offering
great challenges and opportunities for science.
The purpose of the Symposium was to address how tools and applications of data mining in the context of
science coverage can benefit the international science journalism community. Data journalism is a
relatively new but rapidly evolving discipline, with enormous potential. As journalists around the world
try to tackle the vast quantity of data that is produced every day, what are the drawbacks and challenges
to its adoption throughout the media? Data journalism incorporates a growing set of tools and
techniques. As journalists become data-savvy and bring the technology into their workplace, is this
revolution helping to create new opportunities for consolidating data into compelling stories?
Several overarching themes emerged during the Symposium and this report aims to create a snapshot of
the processes, incentives, organization and resources behind science data gathering for journalists. The
meeting focused on a handful of case studies and some of the new innovative tools and work
methodologies that were discussed should help to empower our science journalism community around the
world. Some promising initiatives stemming from this Symposium are already underway. One group is
working to create a data journalism resource, such as a curated link library of data repositories; another
group is striving to deploy air quality sensors in a number of locations around the world; and some of
the lessons learned have been integrated into the ongoing projects of the World Federation of Science
Journalists (WFSJ). Last but not least, the WFSJ organized a couple of data journalism workshops at the
World Conference of Science Journalists in June 2015 in Seoul.
This report is not intended to be a static document. It is designed to be a “request for comments.” As you
read this report, we encourage you to ask questions, think of potential areas of development and critique.
We would appreciate it if you could send your notes to [email protected] so we can start a dialogue
between WFSJ and the wider science journalism community. We believe the challenges of data-driven
science journalism are too important and too difficult for a single organization to assume that it has all the
appropriate answers.
So let the conversation begin. Let us connect as a community, look forward to the future, and embrace
change together. Hopefully some of the projects outlined in the document will open a new world of
possibilities.
Please note: the report attempts to remain true to what was said during the invited presentations and
subsequent breakout sessions so as to reflect the group deliberations that occurred during the event.
We would like to thank the members of the Advisory Committee: Mariette DiChristina, Dan Fagin, Pallab
Ghosh, Robert Lee Hotz, Ivan Oransky, Ginger Pinholster, Richard Stone and Volker Stollorz. We also
thank David Secko and his team of rapporteurs: Reade Levinson, Alessandra Santiago, Shara Tonn, Diane
Wu.
Special thanks to the symposium project leader, Veronique Morin, for her hard work, advice and
dedication.
Yours truly,
Damien Chalaud
Executive Director
World Federation of Science Journalists
@WFSJ
James Cohen
Director, Communications & Public Outreach
The Kavli Foundation
@KavliFoundation
INTRODUCTION
What is the future of data mining for science journalists?
On February 16th-18th, 2015, 56 journalists and experts from 10 countries assembled in the heart
of Silicon Valley to openly discuss the future of data mining. The need for science journalists
worldwide to have powerful data mining tools emerged as a key issue during the 1st Kavli
Symposium on the Future of Science Journalism. The 2nd symposium sought to advance a clear
vision for how to build new tools to help international collaboration amongst science journalists.
This report summarises the recommendations that emerged from the free exchange of ideas
during the discussion of current and future practice in data mining. No sessions during the
workshop were tape recorded, so the report was constructed from notes taken during the day.
The organizers welcome any additional reader comments.
RATIONALE FOR THE 2ND SYMPOSIUM
The 2nd symposium discussed the tools of data mining in the context of science news and their
potential to increase international collaboration within the science journalism community.
Joining the international perspective of our participating journalists were experts who could help
us identify the promises, problems, and opportunities of data mining tools.
KEYNOTE SPEAKERS
The symposium opened with two keynote speakers:
Alexander de Sherbinin, Columbia University, Co-Chair of the CODATA Task Group on
Global Roads Data Development
Title: “Challenges of Accessing Big Data”
A debate has arisen as to the utility of investments in data and monitoring for the Sustainable
Development Goals (SDGs) versus
direct investment in development interventions. Drawing from experience working in different
data domains and in a number of countries, Dr. de Sherbinin argued that data and monitoring are
vital for gauging progress towards goals and also contribute to an informed civil society.
He also provided examples of data products from CIESIN and third party holdings that can
inform public debates about the future we want.
Vincent McCurley, Creative Technologist, National Film Board of Canada
Title: “Data Storytelling in Interactive Documentaries”
Creative Technologist Vincent McCurley presented data-driven interactive projects created and
produced by the National Film Board of Canada and discussed their approach to storytelling. He
demonstrated how data is mined and woven into stories to engage the audience.
WHAT WAS DISCUSSED?
The recommendations were generated from sessions in four topic areas. Each topic was
approached through the discussion of example tools and related case studies.
Session | Topic Area | Example Tools | Case Study
1 | New Tools | Metromaps | Health and Infectious Diseases
2 | Applications | Quakebot | Geojournalism and Ecohealth
3 | Platforms | IPython | IPython in Storytelling
4 | Investigative | Data tools and storytelling | The Use of Overview
HOW WAS IT DISCUSSED?
After presentations on each topic, the participants were broken into four working groups for
intensive discussion. The goal was to examine the opportunities, roadblocks, potential uses of the
tools presented, and how they can contribute to the enhancement of international collaboration
amongst science journalists. Working groups were asked to propose key messages at a final
plenary session, where three recommendations were chosen for future advancement.
Session 1 Report: New Tools / Health and Infectious Diseases
Session Leader: Volker Stollorz
Contributing Team: Alan Boyle, Kathryn Brown, Tien Dung Bui, Damien Chalaud, Christina
Elmer, Richard Harris, Chul Joong Kim, Kathryn O’Hara, Thuy Huong Pham, Dafna Shahaf,
Ron Winslow
Rapporteur: Shara Tonn
KEY CONCEPTUAL POINTS
o It is not enough to have data; you need to be able to understand it
o Information overload is a key driver for wanting better, usable data mining tools
o The desire is to give structure to a data set so that you can learn its pattern, major
components and storyline
PLENARY SESSION
Session 1 examined new data mining tools in the context of health and infectious diseases. The
session began with a presentation by Dafna Shahaf, Stanford University InfoLab, who spoke
about her work on Metromaps. This research tool is capable of organizing data (e.g. a set of
documents) into a set of major storylines that is outputted as a map. The map is designed to have
coherence (puts documents into an ordered chain), coverage (maps will cover a diverse set of
topics from the input documents) and connectivity (maps will show how storylines are
connected). These maps are not about small details. Metromaps provide the “big picture” and are
most useful to high level stories with one dominant storyline. One eventual goal of such maps is
to provide insight into large data sets in a way that leads to surprise (data analysis showing new
things) and plausibility (the data supports the new surprises well). Key questions raised by the
group included: How much annotation is required for Metromaps? Could it be used to query the
scientific literature? How plug and play is it? Can it pick out jargon? Can it be personalized?
How do you program surprise?
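The report names the properties of a good map but does not describe how Metromaps computes them. As a rough illustration only, the sketch below scores two of those properties on a toy scale: coherence is taken as the weakest word overlap between neighbouring documents in a chain, and coverage as the share of the corpus vocabulary that the chosen chains touch. The function names, the word-overlap measure and the sample headlines are all assumptions made for this example; they are not the Metromaps method, and connectivity is left out.

```python
def keywords(text):
    """Return the set of lower-cased words longer than three letters."""
    words = (w.strip(".,;:!?").lower() for w in text.split())
    return {w for w in words if len(w) > 3}


def coherence(chain):
    """Toy coherence: the smallest Jaccard overlap between neighbouring documents."""
    sets = [keywords(doc) for doc in chain]
    links = [len(a & b) / len(a | b) for a, b in zip(sets, sets[1:]) if a | b]
    return min(links) if links else 0.0


def coverage(chains, corpus):
    """Toy coverage: share of the corpus vocabulary found in at least one chain."""
    vocab = set().union(*(keywords(doc) for doc in corpus))
    covered = set().union(*(keywords(doc) for chain in chains for doc in chain))
    return len(covered & vocab) / len(vocab) if vocab else 0.0


# Made-up headlines, used only to exercise the functions.
docs = [
    "Ebola outbreak reported in Guinea",
    "Ebola outbreak spreads from Guinea to Liberia",
    "Liberia opens new Ebola treatment centres",
    "Vaccine trial funding approved",
]
storyline = docs[:3]
print(round(coherence(storyline), 3))         # weakest link in the storyline
print(round(coverage([storyline], docs), 3))  # how much of the corpus it touches
```

A chain that jumps between unrelated documents scores zero on this coherence measure, which is one simple way to see why an ordered storyline differs from a bag of search results.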
Christina Elmer, Spiegel Online, spoke next about tracing epidemics and whether journalists
could use data mining to detect new waves of infectious disease outbreaks. Detection, however,
requires data. Elmer suggested this data might come from official statistics, reliable information
from sources, or side effects related to an outbreak. Tools such as Outwit Hub, Import.io, and
Tabula could be used to structure found information and make it possible to visualize it, but this
does risk asking only the questions you can answer from the data you have. Elmer warned of
big data hubris: most big data is biased and prone to manipulation. Big
data can therefore only be used as a clue. Key questions raised by the group included: What data
sets are there? Has anyone gotten stories this way? Could a journalist detect an epidemic this
way? Is there a way to share real-time data? What are the challenges of getting data?
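To make the idea of detecting a new wave from structured counts concrete, here is a minimal sketch that flags any week whose case count exceeds the mean of the preceding weeks by more than two standard deviations. The rule, the window length, the threshold and the weekly numbers are all assumptions chosen for illustration; none of them comes from the presentation. In keeping with Elmer's caution, a flag from a rule like this is a clue to follow up with sources, not evidence of an outbreak.

```python
from statistics import mean, stdev


def flag_unusual_weeks(counts, baseline_weeks=8, threshold=2.0):
    """Return indices of weeks whose count exceeds baseline mean + threshold * sd."""
    flagged = []
    for i in range(baseline_weeks, len(counts)):
        window = counts[i - baseline_weeks:i]  # the weeks just before week i
        limit = mean(window) + threshold * stdev(window)
        if counts[i] > limit:
            flagged.append(i)
    return flagged


# Made-up weekly case counts; week 10 contains an artificial spike.
weekly_cases = [12, 15, 11, 14, 13, 12, 16, 14, 13, 15, 41, 17]
print(flag_unusual_weeks(weekly_cases))
```

One known weakness of this simple rule is that a spike inflates the baseline for the weeks that follow it, so a second wave shortly after the first could be missed.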
WORK GROUP 1 DISCUSSION
The work group for Session 1 had a vibrant discussion that began with two interconnected
questions: (1) Where are the data sets that could help with tracing epidemics? (2) How should
journalists convey the data sets in an interesting way for the reader? Each participant spoke of
what they saw as the challenges and potentials in data journalism.
Key questions from Group 1: Challenges and Potentials in Data Journalism
o Where do the data sets live? Which datasets help us with early detection?
o I’m interested in infographics. How do you make science stories relevant and easy to
read? What tools can you use?
o How can you translate data journalism into Vietnam’s journalism world?
o I like the idea of using Metromaps for tracing publications on certain topics. How do you
build a network like the geojournalism group (but for health) that could connect with
local journalists and break stories? What kind of data is there and what could we do with
it?
o It would be nice to have regular hands-on training with tools for data journalism. It would
also be nice to have a much better sense of the data universe. Where is everything?
o How do you build datasets in countries without resources? We can also ask communities
what they want to know and go from there.
o How do you train local journalists in third world countries? How can we make tools that
are relevant to the locals as well as journalists in other countries?
o We need to think of ways to upgrade the quality and quantity of science information
that’s reaching the public. How can we keep science writers informed and then get the
info out? Metromaps could also be helpful for early detection or for pre-discovery, that is,
going back into coverage and reports to elucidate what is happening now, such as fleshing
out the big picture of tracking diseases.
o How do I connect with journalists and other outside professionals earlier in the research
process? How can we share datasets?
The group explored these questions, firstly, by discussing various issues with datasets. These
issues revolved around legal restrictions and a lack of clarity on how to access data. It was clear
that some sort of clearinghouse for datasets and epidemiological case studies of how to use them
would be useful for science journalists. It was, however, noted that local data can be unreliable
and that rules of access can vary widely. It was further noted that the line between research,
science and journalism is fuzzy with data journalism.
The group then explored the potential of Metromaps as a tool for science journalists, keying on
the need for functional tools that many journalists could employ. This was discussed with
reference to working with Dafna Shahaf, the potential of Metromaps as a way to connect the dots
on a story, and the challenges of committing to and funding its development. The group felt this
goal could not be realized without pairing it with the development of a fixed group of journalists
that is able to provide regular, hands-on training with a focus on science topics. This type of
synergy might be initiated with primers or briefing portals for journalists on how to cover
controversial issues with data. The end goal was seen as a new tool (e.g. a version of Metromaps)
able to illustrate science stories with data as well as the resources to help guide journalists in how
to report with it. The group discussed the importance of documenting how science journalists use
the proposed new tool so that others have case studies to follow.
WORK GROUP 1: SUGGESTIONS FOR ADVANCEMENT
The breakout group suggested the following be advanced:
Key Idea | Description | Goal
Create a Clearinghouse | Broad clearinghouse that includes health and environment datasets; create an interface and a curated list of datasets | Bring people and data sets together to develop better stories
Develop Metromaps | Create a working group of journalists to partner with Dafna Shahaf | Develop a flexible prototype for science journalists to map emerging stories
Skills Training for Data Mining Tools | Intermingle data mining experts and journalists; skills training with the tools needed for stories | Build capacity for people to use the clearinghouse and new tools
Session 2 Report: Applications / Ecohealth and Pollution
Theme Leader: Thomas Hayden
Contributing Team: Irma, Curtis Brainard, Alex de Sherbinin, Dominique Forget, Robert Lee
Hotz, Veronique Morin, Subhransu Priyadarsini, Willie Shubert, Harry Surjadi, Mariko
Takahashi
Rapporteur: Reade Levinson
KEY CONCEPTUAL POINTS
o There are common problems in getting anything done (e.g. time, attention, etc.) and
computers should be used to alleviate these problems
o We have too little data that matters and lots of data that does not matter (there needs to be
active data collection from journalists)
o We need to lower the cost of finding stories with data
o We need to begin to discuss the best ways to personalize a story found with big data
o We need to push for sharing and openness in data mining
PLENARY SESSION
Session 2 examined applications in the context of ecohealth and pollution. The session began
with a presentation by Dan Nguyen, Stanford University, on teaching computers to solve
fundamental problems in reporting. Nguyen began by discussing the example of computers
playing chess, suggesting that the most powerful opponent is a human + computer combination
(e.g. average people with computational understanding that have computer power available to
them). Nguyen characterised these players as centaurs, human/computer hybrids that play better
together. Nguyen raised the question of how to become a centaur in journalism. He provided the
example of Quakebot, an algorithm written by Ken Schwencke that can gather information and
is known for alerting its creator to a story about an earthquake while he slept. Quakebot was
characterized as supplemental. It is designed to save time and make things more interesting, not
to replace journalists. Nguyen suggested the key point in data mining was to accept what
computers and algorithms were designed to do, and to find ways to apply these features to journalism.
Data should be used to begin the assessment of an issue and journalists should then step in.
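The division of labour described here, where the machine gathers and drafts and the journalist then steps in, can be sketched in a few lines. The example below is not Quakebot's code, which the report does not show. It is a hypothetical illustration that turns one earthquake record into a draft sentence marked for human review. The record layout (`magnitude`, `place`, `time` as seconds since 1970 in UTC), the magnitude threshold and the wording are all assumptions.

```python
from datetime import datetime, timezone


def draft_alert(event, min_magnitude=4.5):
    """Return a one-sentence draft for an editor, or None if the event is too small."""
    if event["magnitude"] < min_magnitude:
        return None  # not newsworthy under this (assumed) threshold
    when = datetime.fromtimestamp(event["time"], tz=timezone.utc)
    return (
        f"DRAFT, needs human review: a magnitude {event['magnitude']:.1f} earthquake "
        f"was recorded near {event['place']} at {when:%Y-%m-%d %H:%M} UTC."
    )


# A made-up record standing in for one entry from a seismic data feed.
event = {"magnitude": 5.1, "place": "a hypothetical town", "time": 1424100000}
print(draft_alert(event))
```

The draft is deliberately labelled as unreviewed: the program saves the time needed to notice and write up the event, while the decision to publish stays with a person.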
Willie Shubert, Internews Network, spoke next and provided a conceptualization of the future
of science journalism where data, and techniques to make data accessible, are key drivers for a
journalist’s work. Shubert suggested the desire is to use data to give a global context to a story,
followed by journalists building off the data to do local stories. Geojournalism is an example of
this desire. Shubert spoke of geojournalism as storytelling that combines data visualization,
environmental data and geo-tagged stories to create visualization maps of news stories that use
data as their evidence. These efforts are not just about data. They seek a combination of data and
storytelling. Infoamazonia is an example of how this works. It started with a picture from
NASA, included journalists in various countries who collected data, and resulted in interactive
maps about environmental issues in the Amazon basin. Shubert noted that Infoamazonia is an
open repository of data visualizations that continues to be built on (see: geojournalism.org). Key
questions raised by the group included: What type of training is needed to do geojournalism?
How do you fund it? What is the workflow? Who is in such a network? How do you decide
which stories to cover? Could geojournalism be a model applied to other areas?
WORK GROUP 2 DISCUSSION
The work group for Session 2 began by noting that journalism is no longer an exclusive pursuit
and is being done by people who are working out of their backpacks. Whatever the group
decided should be relevant to this broader community. Alex de Sherbinin and Willie Shubert
agreed to work on a list of resources for interested science journalists (see Appendix D).
The group discussed that one simple thing that could be done is to create a platform to which
people all over could contribute data in every form and it could be archived under broad topics.
This platform would help science journalists get over the challenge that they may not have good
data to access, could include a handbook on data science journalism, and a list of 10-12 stories
that used data in an interesting way to show people the possibilities of what sort of data might be
available. The group discussed that this platform should (i) include a spatial data search so that
datasets could be located by geographical region, (ii) be reusable and (iii) be open source. The
goal was to position the group and the WFSJ as active in open source journalism where tools for
data storytelling are created together and publicly available.
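As one possible reading of the proposed spatial data search, the sketch below stores each dataset with a bounding box (west, south, east, north in degrees) and returns the titles whose boxes overlap a query region. The catalogue entries, field names and coordinates are invented for illustration, and the overlap test ignores boxes that cross the 180° meridian.

```python
from dataclasses import dataclass


@dataclass
class Dataset:
    title: str
    topic: str
    west: float
    south: float
    east: float
    north: float


def overlaps(d, west, south, east, north):
    """True if the dataset's bounding box intersects the query box."""
    return d.west <= east and d.east >= west and d.south <= north and d.north >= south


def search(catalog, west, south, east, north, topic=None):
    """Return titles of datasets in the region, optionally filtered by topic."""
    return [
        d.title
        for d in catalog
        if overlaps(d, west, south, east, north) and (topic is None or d.topic == topic)
    ]


# Hypothetical catalogue entries; titles and coordinates are made up.
catalog = [
    Dataset("Example deforestation alerts", "environment", -80.0, -20.0, -45.0, 5.0),
    Dataset("Example air quality readings", "health", 100.0, 5.0, 110.0, 25.0),
]
print(search(catalog, west=-61.0, south=-4.0, east=-59.0, north=-2.0))
```

A real clearinghouse would need richer metadata (licence, source, update date), but even this minimal structure shows how datasets archived under broad topics could also be located by region.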
“In Canada, we don’t have access to that much data. Not like, in
the US, anyway, where you have this open policy. In Canada, a lot
of the data is very hard to access.”
The platform should be not-for-profit and non-proprietary. The group did not see the value in creating a “one
off” data journalism project as it was time to invest in reusable, open source technologies.
The second key idea to emerge from the group’s discussion was that journalists need to take a
role in organizing the data and that the divide between hard-core data journalists and those who
want to use data should be narrowed. There was debate over whether science journalists should be
collecting raw data and whether this was crossing a line into activism.
The group noted that data mining is just another way of getting information, and learning the
tools of data mining is sort of like learning how to conduct an interview. The group saw an
important need for education. The group noted that science journalism was facing a “whole new
set of skills” to learn and will require engagement with the technical community to learn them.
WORK GROUP 2: SUGGESTIONS FOR ADVANCEMENT
The breakout group suggested the following be advanced:
Key Idea | Description | Goal
Clearinghouse for Data | Collate and curate data sources and tools; add inspiring examples of how the data is used | Create a go-to place for science journalists to identify major, reliable data sources and, over time, curate them
Training | Build on the Geojournalism.org model and integrate into WFSJ training; have a list of journalists willing to do one-on-one training | Build capacity for people to use the clearinghouse and tools
New Data Generation | Pilot a project between EJN and WFSJ for collection of new data for storytelling | Create local data journalism and the documentation of how a global pilot can be accomplished
“Right now, we all just make information. We don’t really
participate in [data] collection. You’ve got to have journalists
collecting raw data.”
Session 3 Report: Platforms
Theme Leaders: Richard Stone and John Bohannon
Contributing Team: Tim De Chant, Jim Giles, Fred Guteri, Brandon Joo, Eun Sung Kim,
Thomas Lin, Jason Palmer, Fernando Perez, Aditi Risbud
Rapporteur: Alessandra Santiago
KEY CONCEPTUAL POINTS
o We have a desire to better understand how tools can be used by journalists (tools are
often built for understanding science and not journalism)
o Our emphasis should be on combining computer analysis with storytelling
o There is a need to move away from static stories to interactive stories, ones where the
underlying data can be manipulated by others after a story is done
o The reason to move to platforms for data mining is that once a problem is solved on
them, it is solved forever and for everyone
PLENARY SESSION
Session 3 examined platforms used for data mining. The session began with a presentation by
Fernando Perez, UC Berkeley, on the data mining platform IPython. The original purpose of
IPython was to provide insight into data (Perez was also concerned with reproducibility and
education). Perez commented that, in giving his talk, he was interested in how to engage with
journalists and how to create a link between storytelling and data processing. He discussed
IPython as a web application that works in any browser and allows exploratory data analysis;
with it, you can run code, complete data analysis, add explanatory text and provide data outputs
in one place. Perez noted that IPython allows interactive “notebooks” to be created – plain text
documents that record your results – that provide an interactive space for working with data and
are easily shared. The desire with IPython is to provide a platform able to lower the barrier to
sharing a dataset and how it has been analyzed. Key questions raised by the group included:
With notebooks, what are the copyright issues? Would notebooks help with openness?
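Because a notebook is a plain text document, it can be produced and inspected with ordinary tools. The sketch below builds a minimal two-cell notebook, one explanatory text cell and one code cell, as a JSON structure. The field names follow the version 4 notebook file format as commonly documented; treat the exact layout as an assumption to check against the format's own documentation. The cell contents and the `records.csv` file name are made up.

```python
import json


def make_notebook(explanation, code):
    """Build a minimal notebook: one text (markdown) cell followed by one code cell."""
    return {
        "nbformat": 4,
        "nbformat_minor": 4,
        "metadata": {},
        "cells": [
            {"cell_type": "markdown", "metadata": {}, "source": explanation},
            {
                "cell_type": "code",
                "metadata": {},
                "execution_count": None,  # not run yet, so no count and no outputs
                "outputs": [],
                "source": code,
            },
        ],
    }


notebook = make_notebook(
    "How many records does the (hypothetical) file contain?",
    "rows = open('records.csv').read().splitlines()\nlen(rows) - 1",
)
text = json.dumps(notebook, indent=1)
print(text[:60])  # the notebook is ordinary text that can be saved, shared or diffed
```

Saving `text` to a file with the `.ipynb` extension should give a document that notebook software can open. Sharing that one file passes on the explanation and the analysis code together, which is the lowered barrier described above.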
John Bohannon, Science Magazine, spoke next and discussed the use of IPython in storytelling.
Bohannon spoke about the reason he learned to code: He was always dependent on others to
open, analyze and work with the data. This made it hard for him to meet deadlines and added
stress to story production, so he took it upon himself to learn.
These skills have led to stories such as an examination of which science journals will publish
fake articles. Bohannon provided the following advice: If you want to code, there will be pain,
but IPython takes 98% of it away. In particular, once you solve a problem with IPython, it is
solved for every other story and can be done again easily. Bohannon ended by suggesting a
platform such as IPython might be the start of a movement in “reproducible journalism”, where a
data story gives all the components that went into it. Key questions raised by the group included:
Why use IPython over other platforms? Could it be adapted further for journalists? Would
reproducible journalism ever work?
WORK GROUP 3 DISCUSSION
The work group for Session 3 conducted a hack-a-thon with IPython where everyone did
something on a computer after coming up with one interesting question as the beginning of a
story. The group explored topics such as (a) how many drugs on the market are actually
cannabinoids, (b) self-reported health information inaccuracies in research studies, (c) how to
explain complex research coming out of math and physics, and (d) examining how scientists
engage with the public. This process led the group to discuss the need for IPython, if it was ever
to become an accepted platform, to be taught by developers and journalists working together.
The group felt it was important to reach out to healthcare journalists and political journalists and
design a 3-day workshop to learn how to code/hack. Various training resources
(datajournalism.stanford.edu, datajournalismhandbook.org, datadrivenjournalism.net) were noted
as important to this endeavor. The overall message was that people are truly interested in
platforms for data mining, but few know how to use them, making it important to empower
journalists and build support in newsrooms.
WORK GROUP 3: SUGGESTIONS FOR ADVANCEMENT
The breakout group suggested the following be advanced:
Key Idea | Description | Goal
IPython workshop | 3-day summer camp for data journalism | To show and tell
Framework for scraping | Work with developers to build a better model of IPython Notebook | Obtain a platform anyone, anywhere can use easily
Session 4 Report: Investigative
Theme Leader: Deborah Blum
Contributing Team: Pallab Ghosh, Manuel Lino, Vincent McCurley, Margaret Munro, Dan
Nguyen, Cheryl Phillips, Jonathan Stray, Dana Topousis
Rapporteur: Diane Wu
KEY CONCEPTUAL POINTS
o Real data is messy and investigative journalism does not use “clean” data sets like
university researchers
o Stories do not just emerge from data
o The problems of data journalism are poorly characterized
o We need new ways of finding patterns in data
o We need data literacy first, and then ways to study data
PLENARY SESSION
Session 4 examined data mining for investigative science journalism. The session began with a
presentation by Cheryl Phillips, Stanford University, who spoke about data tools and
storytelling. Phillips discussed the desire to know when to code during data mining investigations
and when to use non-coding tools. She suggested there is a history of using new tools in
investigative reporting, but it takes time to adapt to them. Journalists therefore need time to
develop a toolbox and the expertise to use it. Phillips suggested there are three key
elements of adapting to data mining: (i) getting the data; (ii) organizing the data; and (iii)
analyzing the data. She noted that these steps might use Tabula, Cometdocs, OpenRefine, Tableau,
Highcharts, Silk, JavaScript, Access, MySQL, Python, R, Dataminer, import.io, or DownThemAll. But
the point is knowing what data is out there and how reliable it is.
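The three elements (getting, organizing and analyzing the data) can be illustrated with a short script. The sketch below uses only Python's standard library. The facility names and release figures are invented, and the cleaning rules (trim spaces, upper-case the state code, drop rows without a usable number) are assumptions chosen to show how messy values are handled, not a recommended standard.

```python
import csv
import io
from collections import defaultdict

# (i) Getting the data: a made-up extract standing in for a downloaded file.
RAW = """facility,state,pounds_released
Plant A, ca ,"1,200"
Plant B,CA,300
Plant C,tx,n/a
Plant D,TX,"2,500"
"""


def get(raw_text):
    """Parse the CSV text into a list of row dictionaries."""
    return list(csv.DictReader(io.StringIO(raw_text)))


def organize(rows):
    """(ii) Organizing: normalise values; return clean rows and the names of skipped ones."""
    clean, skipped = [], []
    for row in rows:
        amount = row["pounds_released"].replace(",", "").strip()
        if not amount.isdigit():
            skipped.append(row["facility"].strip())  # keep a record of what was dropped
            continue
        clean.append({"state": row["state"].strip().upper(), "pounds": int(amount)})
    return clean, skipped


def analyze(rows):
    """(iii) Analyzing: total releases per state."""
    totals = defaultdict(int)
    for row in rows:
        totals[row["state"]] += row["pounds"]
    return dict(totals)


clean_rows, skipped = organize(get(RAW))
print(analyze(clean_rows), "skipped:", skipped)
```

Reporting the skipped rows alongside the totals speaks to the reliability point: a reader of the analysis can see how much of the source data did not make it into the result.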
Jonathan Stray, The Overview Project, spoke next about how the documents used in journalism
are getting longer and longer. He questioned how a journalist will be able to read them all. Stray
suggested we need algorithms for stories with the idea of doing journalism through science. The
example given was the tool Overview, which uses clustering to find story trends. The value of
Overview, said Stray, is that it orients a journalist (user) to the important material in a data set,
which otherwise would be too large or time consuming to examine. Overview is therefore an
investigative tool for story discovery within very large document sets. Stray went on to speak
about the lessons he learned from creating Overview, namely that (a) how people work with
documents (their workflow) is more important than the algorithm for a data mining tool, (b) it is
about combining humans and machines to get the story done on time, and (c) real data is messy.
Investigative journalists, in particular, want to analyze the inputs into journalism (e.g. source
material) not the outputs of journalism (e.g. NYT articles), but many tools are not designed to
cope with the input data investigative journalists need. As such, Stray suggested the problems of
data journalism are poorly characterized and developers have failed to realize that journalists
don’t explore data randomly; they actively search for things (i.e., the investigative issue in data
mining is how to improve journalists’ searches for useful information). In the end, Stray asked:
How would you know if a data mining tool was good? He suggested asking three questions:
o How many stories got done with the data mining tool? (Did it give journalistic outputs…)
o How long did it take to use the data mining tool? (Faster for journalists to use it…)
o What happened after it was used? (Was there an impact on the world...)
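To make the idea of clustering documents more concrete, here is a minimal Python sketch that groups short texts by shared vocabulary. It is not Overview's actual algorithm, which the report does not describe; the word-count vectors, the cosine similarity measure, the greedy grouping rule, the 0.3 threshold and the sample documents are all assumptions chosen for illustration.

```python
import math
import re
from collections import Counter

def vectorize(text):
    """Turn a document into a bag-of-words count vector."""
    return Counter(re.findall(r"[a-z]+", text.lower()))

def cosine(a, b):
    """Cosine similarity between two word-count vectors."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def cluster(docs, threshold=0.3):
    """Greedy single pass: each document joins the first cluster whose
    first member is similar enough, otherwise it starts a new cluster."""
    vectors = [vectorize(d) for d in docs]
    clusters = []  # each cluster is a list of document indices
    for i, vec in enumerate(vectors):
        for members in clusters:
            if cosine(vec, vectors[members[0]]) >= threshold:
                members.append(i)
                break
        else:
            clusters.append([i])
    return clusters

# Invented sample documents, for illustration only.
DOCS = [
    "pipeline inspection report leak detected",
    "pipeline leak repair report filed",
    "hospital billing audit overcharge found",
    "audit of hospital billing records",
]
groups = cluster(DOCS)
print(groups)
```

A sketch like this only orients the reader to groups of similar documents. As Stray's lessons suggest, the harder problems lie in messy input and in fitting the tool to how journalists actually search.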
WORK GROUP 4 DISCUSSION
The work group for Session 4 was a lively discussion of the rich field of data journalism. The
group framed their discussion with the following questions: How can science journalists do more
investigative reporting? Can data journalism help? Have any of you done it?
The group discussion of these questions noted that data users are largely investigative reporters.
It was suggested that the institution you work for, or the position you have, changes your access
to resources and your time to investigate. It was discussed that science journalists may not
initially want to move to more investigative journalism, but that having access to freely available
databases could be a game-changer for all science journalists. It was suggested that we therefore
need to make a list of 15 datasets, such as the EPA toxic release database, that could serve as a
Swiss-army knife for science journalists. This should be combined with example stories that
have emerged from these datasets and tutorials for how to use them.
Data Journalism Starter Kit for Science Journalists
1. Data sets
   o Relevant to science
   o Non-duplicative of IRE
   o International
2. Toolkit
   o Tools: Tabula, CometDocs, OpenRefine, Tableau, Highcharts, Silk, JavaScript,
     Access, MySQL, Python, R, Data Miner, import.io, DownThemAll, etc.
   o Clustering information with Overview
   o Visualization tools
3. Training modules
   o Case studies
   o Training videos; tool-based and skills-based
   o Webinars (live tutorials)
   o Translated to make it global
The group noted how there are available tools, but science journalists have not dived into their
use. The group wanted to help science journalists get into data mining with a curated library of
already publicly accessible data that is science specific (perhaps hosted by the Knight Science
Journalism program at MIT or AAAS’s Eurekalert). Resources are there; the group discussed
putting them up in a curated, topic-organized international format. It was noted that it is
important people use them, so how do we get the word out? How do we get science journalists
excited about this? The group discussed how this could be done in stages with a possible demo
at the NASW meeting in October in Boston or WFSJ in South Korea. The group felt it important to
network outside the science journalist bubble (e.g. with IRE and NICAR). The goal for the
group: to enable more science journalists to use data journalism, with the expectation that
more data journalism will mean better science journalism.
WORK GROUP 4: SUGGESTIONS FOR ADVANCEMENT
The breakout group suggested the following be advanced:
Key Idea: Curated library
Description: Topic-organized library of data with an international slant
Goal: Get science journalists excited about data mining

Key Idea: Demos and tutorials
Description: Create a demo for training; include video tutorials
Goal: Obtain a platform anyone, anywhere can use easily

Key Idea: Prototype training workshop
Description: Network science journalists with other organizations; examine how to do this internationally
Goal: Get organized
Summary and Outcomes
FINAL PLENARY DISCUSSION
The 2nd Kavli Symposium on Science Journalism: Data Mining ended with a group
discussion of the 11 suggestions generated from the four working groups (see the
suggestions at the end of each work group summary above). There was a clear overlap in
ideas and needs for the future of data mining in science journalism. Discussion of the
overlaps in suggestions from the four working groups allowed three collective
recommendations to emerge from the symposium.
FINAL RECOMMENDATIONS
1. Create a place on the web for data (clearinghouse)
The clearinghouse should be publicly accessible, have the ability to upload and download data,
and be curated. A volunteer working group was created to facilitate this recommendation.
2. Create opportunities for training in data mining
These training activities should be in-person and online. They should be specific to science
journalism but also provide networking opportunities with the data journalism community. It
was recommended that they be built around a specific project and include the creation of an
online hub where people can connect with other people and events.
3. Develop a demonstration project
Further discussion should be held to develop a project to showcase best practice in data mining
for science journalism. This might include work with EJN on environmental monitors or the
development of a Metromaps tool.
Appendices
A. WORKING GROUPS
LAST NAME FIRST NAME POSITION
GROUP 1: HEALTH AND INFECTIOUS DISEASES
Boyle Alan Science Editor, NBC News
Brown Kathryn Head of Communication, Howard Hughes Medical Institute
Bui Tien Dung Science Editor, Youth Daily, Vietnam
Chalaud Damien Executive Director, World Federation of Science Journalists
Elmer Christina Editor, Science Department, Der Spiegel
Harris Richard Science Correspondent, NPR News
Kim Chul Joong Health/Science Editor, Chosun Ilbo Newspaper, South Korea
O'Hara Kathryn Associate Professor, School of Journalism and Communication, Carleton University
Pham Thuy Huong Reporter, Vietnam News Agency, Vietnam
Shahaf Dafna Researcher, Stanford University InfoLab
Stollorz Volker Science Journalist, Frankfurter Allgemeine Sonntagszeitung, Cologne
Tonn Shara Journalism Student, Stanford University
Winslow Ron Deputy Bureau Chief Health/Science, The Wall Street Journal
GROUP 2: ECOHEALTH / POLLUTION
Irma Reporter, Kompas Media Nusantara, Indonesia
Cohen Jim Director of Communications, Kavli Foundation
DeSherbinin Alex Co-Chair of the CODATA Task Group
Forget Dominique Science Journalist, Quebec Science
Hayden Thomas Lecturer, Earth sciences, Stanford University
Hotz Lee Science Journalist, The Wall Street Journal
Levinson Reade Journalism Student, Stanford University
Morin Veronique Science Journalist/Project Director of the Kavli Symposium
Priyadarsini Subhransu Editor, Nature India
Shubert Willie EJN, Internews
Surjadi Harry Freelance Science Journalist, Indonesia
Takahashi Mariko Science journalist, Asahi Shimbun, Tokyo
GROUP 3: PLATFORMS
Bohannon John Contributing correspondent, Science
Brainard Curtis Contributing editor, Columbia Journalism Review
De Chant Tim Senior Digital Editor, PBS Nova
Guterl Fred Executive Editor, Scientific American
Joo Brandon Conference Secretariat, WCSJ 2015, South Korea
Kim Eun Sung Head of Public Relations Division, National Research Council of Science and Technology, South Korea
Lin Thomas Editor in Chief, Quanta Magazine, Simons Foundation
Palmer Jason Science and Technology Correspondent, The Economist
Perez Fernando Physicist, Applied Mathematician, UC Berkeley (via phone)
Risbud Aditi Science Communications Officer, Gordon and Betty Moore Foundation
Santiago Alessandra Journalism Student, Stanford University
GROUP 4: INVESTIGATIVE
Blum Deborah Director, Knight Science Journalism Fellowships, MIT
Wu Diane Journalism Student, Stanford University
Ghosh Pallab Science Correspondent, BBC
Lino Manuel Editor, El Economista, Mexico
McCurley Vincent Creative Technologist, National Film Board of Canada
Munro Margaret Senior Writer, Science, PostMedia News
Nguyen Dan Developer, ProPublica and Lecturer, Journalism, Stanford University
Phillips Cheryl Hearst Professional in Residence, Stanford University
Stray Jonathan Data Journalist, Computer Scientist and Head of The Overview Project
Topousis Dana Head of Office of Legislative and Public Affairs, NSF
ROAMING
Fagin Dan Director, Science, Health & Environment Program, New York University
Secko David Associate Professor of Journalism, Concordia University, Canada
Stone Richard International Editor, Science
B. SYMPOSIUM PRIMER
Symposium Primer
Introduction
This document contains four primers, one for each of the themed sessions at the upcoming 2nd Kavli
Symposium. Each primer was prepared by a journalism student from Stanford University under the
guidance of a session moderator and the project leader. These students have also kindly offered to serve
as rapporteurs during the symposium. They have written these brief documents to allow us to think in
advance about making use of data-mining tools for potential collaborative projects to empower science
journalism. Included are suggested readings to provide additional food for thought.
Goal of the Symposium:
Developing new tools to help international collaboration amongst science journalists.
Breakout sessions:
Group 1: Health and infectious diseases (Adapting the tools)
Group 2: Ecohealth/pollution (Identify the best tools, stories and storytelling)
Group 3: Platforms (training, increase collaboration, stories and storytelling)
Group 4: Investigative (Collaboration, discuss the best tools, complexity, security, sources)
This primer is intended to stimulate your thinking before the 2nd Kavli Symposium. As an
introduction, it is not intended to encompass all available information, viewpoints and
opinions. Instead, we expect you to bring your expertise, your views on missing but vital
topics, and your energy to the event to build new knowledge on the topic.
Session 1: DATA MINING / New Tools
Presenter and Moderator: Volker Stollorz, Science Journalist, Frankfurter Allgemeine
Sonntagszeitung, Cologne
Rapporteur: Shara Tonn
Journalists and other regular consumers of news are turning more and more to big data to make sense of
political, social, economic, scientific and technological trends. But not many tools can accurately and
efficiently organize datasets on the spot into easy-to-read and easy-to-search visualizations, which would
be a boon to both professionals and laypeople. The task of fitting small nuggets of relevant information
into coherent big pictures remains difficult. This session will talk about how data mining and
computational tools help to avoid information overload and search for insights.
Dafna Shahaf, researcher at Stanford University InfoLab, will first talk about her work with Metro Maps.
With her team, Shahaf has built a search tool with information retrieval algorithms that can funnel a few
tens of thousands of documents (news, legal documents, books, research domains) into color-coded
storylines that resemble a subway map. Different lines of inquiry within a single issue or subject come
together, branch off, dead end or intertwine to show unfolding news or science stories, and viewers can
zoom in or out depending on the desired level of detail. With her computational approaches, journalists
and the audience may be able to search new topics, organize articles they want to read or discover how
retrieved documents are related. Metro Maps could be a new tool to search for insights in huge data sets –
be it the news or unfamiliar research domains.
Christina Elmer, editor for Data Journalism and Science Reporting based in Hamburg, will follow up
with a case study that looks at tools available to journalists tracking breaking public health news.
Focusing on the spread of Ebola in West Africa, Elmer will discuss the techniques that can lead to success
as well as the caveats that face those relying on data in a real-time, crisis situation.
Considering these two presentations, this session will explore questions such as how can you develop or
refine tools to be useful for different types of science stories? How do you ensure that sources are
unbiased and diverse? What tools are good for breaking news stories versus longer story timelines? How
can Metro Maps clarify scientific controversies for different audiences? How can you use these tools to
connect science journalists around the world? For example, could you compare the worldwide reporting
about Ebola by creating regional news metro maps?
References
Information Cartography: Creating Zoomable, Large-Scale Maps of Information. Dafna Shahaf, Jaewon
Yang, Caroline Suen, Jeff Jacobs, Heidi Wang, and Jure Leskovec.
Connecting the Dots Between News Articles. Dafna Shahaf and Carlos Guestrin.
Session 2: DATA MINING / Applications
Moderator: Thomas Hayden, Lecturer, Earth Sciences, Stanford University
Rapporteur: Reade Levinson
What’s the problem?
The relationship between investigative journalism and big data should be extraordinarily fruitful. But,
says source TK, there is “too much material too difficult to obtain containing too little information”.
Common, frustrating problems include the challenges of extracting data from handwritten forms and
reports, dealing with search tools that “are not robust enough to find the patterns journalists might seek,”
and trying to integrate and compare information from widely disparate sources. Often, cost prohibits
indexing audio and video sources, meaning that large potential sources of data are effectively
inaccessible.
What’s possible?
Despite these challenges, today’s news feeds already hint at a new generation of investigative reporting.
For example, a recent series from Reuters, “The crisis of rising sea levels: Water’s Edge,” shows the
powerful role data manipulation promises to play in the future of journalism and how a simple, arithmetic
program applied to publicly accessible data can lead to a very compelling environmental story.
A 2011 article in the journal Communications of the ACM, “Computational Journalism,” argues that there
is enormous potential for public-interest journalism to thrive if computer scientists and journalists can
work together. “Being able to analyze and visualize interactions among entities within and even outside a
document collection…would give stories more depth, reduce the cost of reporting, and expand the
potential for new stories and new leads,” write the paper’s authors. Current projects in data mining could
“provide a breakthrough in investigative reporting” (Cohen et al. 2011).
What’s being done right now?
A number of news organizations, academic programs, and nonprofits are focusing on data journalism
and working to build more ties between computer science and journalism. For example, the Earth
Journalism Network (EJN) is a global Internews organization working to empower local media. EJN’s
flagship GeoJournalism project began in 2012 with a partnership with the Brazilian non-profit O Eco.
Together, they created InfoAmazonia, an interactive digital map describing rainforest degradation in the
Amazon. In 2013, EJN published a map-based interactive program – the US Climate Commons project –
with the goal of creating media-friendly, accessible data about Climate Change.
Further Reading:
Cohen, Sarah, James T. Hamilton, and Fred Turner. (2011). “Computational Journalism.”
DocumentCloud.org
Faleiros, Gustavo. (2013). “Geojournalism Handbook shows how to capture earth science knowledge for
reporting.”
Nguyen, Dan. (2010). “Scraping for Journalism: A Guide for Collecting Data.”
Thomson Reuters’ OpenCalais service: www.OpenCalais.org
Session 3 – DATA MINING / Platforms
Moderator: Richard Stone, International Editor, Science and John Bohannon, Contributing
Correspondent, Science
Rapporteur: Alessandra Santiago
Accessing shared data as well as data visualization and analysis is a key component of the reporting
process for science journalists. But how are these massive data sets best shared across the globe? The
speakers of this session seek to provide an answer to this question by introducing a new tool that aims to
enhance international data collaboration.
Fernando Perez is a physicist and applied mathematician at UC Berkeley. He is the creator of IPython,
an interactive data-mining platform. His talk will focus on the role the platform can play in facilitating
shared data analysis and visualization by integrating pre-existing online tools and journals. Perez will
speak about IPython’s multi-user capacity and interactive computing mechanism, as well as the story of
its creation.
John Bohannon, contributing correspondent for Science Magazine, will then provide a real-time
adaptation of one of his own stories using the IPython platform. His demonstration of the uses and
potential of IPython will provide further evidence for the universal applicability of the platform for
journalists across the globe.
This session seeks to provide insight into the usefulness of this particular platform in creating an
easy-to-use international data-sharing mechanism. The speakers will explore questions like: how can a
science journalist apply IPython? What are the specific uses of such a platform? How is this different from
pre-existing technologies and online data sets? How can data-sharing best be streamlined internationally,
and how can this tool better connect journalists around the world?
Further reading:
IPython founder details road map for interactive computing platform. Paul Krill.
Pick up Python. Jeffrey M. Perkel.
Interactive notebooks: Sharing the code. Helen Shen.
Session 4 – MINING THE HIDDEN STORY / Investigative
Moderator: Deborah Blum, Director, Knight Science Journalism Fellowships, MIT
Rapporteur: Diane Wu
As we have seen in many examples from previous sessions, projects that incorporate data mining and
analysis can yield powerful new insights. Effective visualization of these results can help you (and your
readers) understand an issue in a completely different way. But how do you actually go about doing it?
What are the essential skills and tools? Is it important to learn a scripting language, or to invest in
mapping software? The tools available to data journalists are numerous, often overlapping, and
potentially intimidating to navigate as a beginner.
Here, we break down the workflow into three steps: acquisition, analysis, and presentation.
Step: Acquisition
Specific tasks: Finding sources of existing data, soliciting large data sets, converting data to a useful format, data cleaning
Examples: PriceCheck by KQED to crowdsource data on health costs
Some tools to consider: DocumentCloud, Google Refine, EveryBlock

Step: Analysis
Specific tasks: Calculating statistics, arranging data
Examples: The Guardian’s Datablog gives good examples of how to answer questions using data analysis
Some tools to consider: Excel, SQL, CSVKit, R, Jigsaw

Step: Presentation
Specific tasks: Visualizing analysis results, making maps, creating interactions between the reader and the data
Examples: "Methadone and the Politics of Pain" by the Seattle Times; Snake Oil Supplements? at Information is Beautiful
Some tools to consider: Excel, Overview, R, Google Fusion Tables, Tableau Public, QGIS
References & further reading:
http://datajournalismhandbook.org/1.0/en/introduction_0.html Clear explanation of data journalism
https://knightcenter.utexas.edu/blog/00-13749-data-driven-journalism-explained Another concise and
clear summary
http://gijn.org/resources/data-journalism/ Comprehensive list of resources for data journalism; includes
blogs, books, and conferences
http://www.poynter.org/news/media-innovation/147736/10-tools-for-the-data-journalists-tool-belt/
Recommended tools for various aspects of data journalism
http://www.mulinblog.com/essential-data-journalism-skills-conversation-three-data-journalists/ Quick
interview with three data journalists (including session speaker Cheryl Phillips) on essential skills
C. EVENT PROGRAM
Monday Evening, 16th February 2015
17:00 Informal Cocktail
18:30 DINNER BUFFET – Westwood Patio
19:30 Evening Speaker
Alexander de Sherbinin, Associate Director for Science Applications, CIESIN, Columbia
University, and Co-Chair of the CODATA Task Group on Global Roads Data Development
Title: “Challenges of Accessing Big Data”
Summary: As the amount of data increases at an incredible rate, access to these data is
becoming increasingly difficult. Dr. de Sherbinin argues that they should be made readily
available to better monitor issues of global concern in public health and the state of the
environment.
Discussion moderated by Robert Lee Hotz, Science Journalist, The Wall Street Journal
Each participant introduces themselves
Discussion hosted by Pallab Ghosh, Science Correspondent, BBC
Tuesday, 17th February 2015
07:30 BREAKFAST - Silver Creek Dining Room
08:20 Welcoming remarks – San Jose Room
Richard Stone, Chair of World Conference of Science Journalists (WCSJ) 2015, Seoul
08:30 SESSION 1 – DATA MINING / New Tools
Presenter and Moderator: Volker Stollorz, Science journalist, Frankfurter Allgemeine
Sonntagszeitung, Cologne
Speaker: Dafna Shahaf, Researcher, Stanford University InfoLab
Title: “The Aha! Moment: From Data to Insight”
Summary: This session will examine how big data is being harnessed for journalists by focusing
on Metro Maps of Information –structured summaries that can help us understand the information
landscape, connect the dots between pieces of information, and uncover the big picture. We also
focus on a possible framework for automatically uncovering insightful connections in data.
09:15 Case Study #1: Public Health/Infectious Diseases/Ebola
Speaker: Christina Elmer, Editor for Data Journalism and Science Reporting, Spiegel Online,
Hamburg
Title: “Tracing The Epidemic – How Journalists Can Detect Waves of Infection”
Summary: When viruses spread, as Ebola did in West Africa, reliable numbers can at first be
rare. But for journalists, estimating the extent of an epidemic is crucial. This case study gives an
overview of techniques to detect, investigate and visualize infection waves. Which data mining
tools can be used in those cases? Which pitfalls can be hiding deep inside the datasets? And are
there further needs that could be met by innovative tools developed in the scope of science?
10:00 COFFEE BREAK
10:15 SESSION 2 – DATA MINING / Applications
Presenter and Moderator: Jay Hamilton, Hearst Professor of Communication and the Director
of the Journalism Program, Stanford University
Speaker: Dan Nguyen, leading developer in the field of data journalism, ProPublica and
Lecturer, Journalism, Stanford University
Title: “Teaching Computers To Solve Fundamental Problems In Reporting”
Summary: Because we are continually pressured to learn the latest in computers, or face
obsolescence, we often forget that we should be teaching computers and not just the other way
around. This session looks at how the challenges journalists face can be broken down into
computational tasks, leaving the most unique parts of investigation and discovery to the human
reporter.
11:00 Case Study #2: Climate Change/ Mapping rivers of the world
Presenter and moderator: Thomas Hayden, Science Journalist and Lecturer, Stanford
University
Speaker: Willie Shubert, Senior Program Coordinator, Internews Network
Title: “Geojournalism: Communicating Environmental Change With Earth Data”
Summary: The earth is undergoing a massive ecological change. Tracking the impact and
influences of these changes is becoming an increasingly important challenge for science
journalists. This case study focuses on the promises, limitations, and opportunities of using data
to provide audiences with the evidence and context needed to understand the Earth’s
environment.
12:00 LUNCH BUFFET – Westwood Patio
13:30 SESSION 3 – DATA MINING / Platforms
Presenter and moderator: Richard Stone, Science magazine
Speaker: Fernando Perez, Physicist, applied mathematician, UC Berkeley; IPython creator
Title: “Designing a Data Mining Platform”
Summary: Google teamed up with Mr. Perez to integrate Google Docs with IPython Notebook,
allowing journalists to not only share data but share their analyses/visualizations of the data.
IPython Notebook is fully integrated with all the online tools/platforms that journalists already
use. Fernando tells us the story behind the creation of the platform.
14:00 Case Study #3: Adapting the platform
Speaker: John Bohannon, Science Magazine
Title: “Who's Afraid of IPython Notebook?”
Summary: As a case study, science correspondent John Bohannon will adapt one of his latest
stories, “Who’s afraid of peer review,” using the IPython platform. He will argue that this new
data-driven platform is currently the best tool for reporting and will “take it for a spin,”
identifying some of its potential as well as its shortcomings.
14:45 COFFEE BREAK
15:00 SESSION 4 – MINING THE HIDDEN STORY / Investigative
Presenter and moderator: Deborah Blum, Director, Knight Science Journalism Fellowships,
MIT
Speaker: Cheryl Phillips, Hearst Professional in Residence, Stanford University
Title: “Data Tools And Storytelling, How They Go Together”
Summary: Tools such as Excel, data visualization and mapping can help reporters find their
stories. Cheryl Phillips provides examples from breaking news to medical investigations and she
will also suggest a few tools and programs to master.
15:45 Case Study #4: Finding data mining
Speaker: Jonathan Stray, Science Journalist, Computer Scientist and Head of The Overview
Project
Title: "Closing the Gap: From Computer Science Research to Published Stories"
Summary: Data mining algorithms hold huge promise for journalism but have not been widely
adopted. Obstacles include research work with unrealistically clean input, inattention to basic
usability issues, and poor understanding of the journalist's problems and workflow. Jonathan
Stray discusses facing these challenges in developing Overview, a visual document mining
platform that originated at the Associated Press in 2011 and is now widely used by investigative
reporters.
17:00 BREAK
18:00 DINNER BUFFET & Informal Group discussion – Westwood Patio
Moderated by Robert Lee Hotz, The Wall Street Journal
19:00 Evening Speaker
Vincent McCurley, Creative Technologist, National Film Board of Canada
TITLE: “Data Storytelling In Interactive Documentaries”
SUMMARY: Creative Technologist Vincent McCurley presents data-driven interactive projects
created and produced by the National Film Board of Canada and discusses their approach to
storytelling. He will demonstrate how data is mined and woven into stories to engage the
audience.
Wednesday, 18th February 2015
08:00 BREAKFAST - Silver Creek Dining Room
09:00 BREAKOUT SESSIONS (Four Groups)
Breakout sessions take a closer look at the four case examples. In small workgroups, participants
examine the opportunities, roadblocks and potential uses of the tools presented, and how they can
contribute to enhancing international collaboration amongst science journalists.
Group 1: Health and infectious diseases (Adapting the tools) – San Jose Room
Group 2: Ecohealth/pollution (Identify the best tools, stories and storytelling) – Almaden
Group 3: Platforms (training, increase collaboration, stories and storytelling) – Palm
Plaza 2
Group 4: Investigative (Collaboration, discuss the best tools, complexity, security,
sources) – Blossom Valley
12:00 LUNCH BUFFET – Westwood Patio
13:00 PLENARY DISCUSSION AND GOAL REVIEW – San Jose Room
A. Discussion
Each Work Group shares key points from breakout sessions
Discussion of main takeaways of the symposium
Moderator: Dan Fagin
B. Goal review
Solicit ideas, opportunities and even partnerships identified during the symposium;
Moderator: Dan Fagin
Solicit recommendations for continuing momentum at the 2015 World Conference of
Science Journalists; Moderator: John Bohannon
Solicit tool considerations/priorities for the World Federation of Science Journalists;
Moderator: Damien Chalaud
Solicit ideas for additional symposia
15:00 CLOSURE
D. DATA RESOURCES
List of Data Resources for Journalists on the
Environment and Sustainable Development
Produced for the World Federation of Science Journalists (WFSJ) by Alex de Sherbinin, CIESIN, The
Earth Institute at Columbia University ([email protected]), and Willie Shubert,
Internews ([email protected])
Aqueduct: http://www.wri.org/our-work/project/aqueduct
An online atlas of water resources and the risks to those resources
Biodiversity Media Network: http://biodiversitymedia.ning.com/
Aims to boost the quantity and quality of media coverage of biodiversity issues; hosted by IUCN, IIED
and Internews
Climate Central: http://www.climatecentral.org/
Find information on the impacts of climate change; Surging Seas map tool helps identify localities at risk
Climate Data: http://www.data.gov/climate/
Data related to climate change for America’s communities, businesses, and citizens
Earth Journalism: http://earthjournalism.net/
Tools to enable journalists from developing countries to cover the environment more effectively
Future Earth: http://www.futureearth.org/
An umbrella for all global change research; includes articles, graphs, and media contacts
Sustainability Competitiveness Index: http://www.weforum.org/content/pages/sustainable-competitiveness/
World Economic Forum (WEF) index focused on the environmental side of economic competitiveness
Environmental Performance Index (EPI): http://epi.yale.edu/
2014 EPI country rankings and associated data; data explorer; case studies on indicators in practice
GeoJournalism: http://geojournalism.org/
Tools to produce multimedia stories or simple maps and data visualizations to help create context for
complex environmental issues
Global Forest Watch: http://www.globalforestwatch.org/
Satellite-derived maps of deforestation data; create custom maps, analyze forest trends, subscribe to alerts,
or download data for your local area or the entire world
Human Development Report Data: http://hdr.undp.org/en/data
Human Development Index; Public Data Explorer; Multidimensional Poverty Index
NASA EarthData: https://earthdata.nasa.gov/
WorldView tool allows visualization of near-realtime imagery from NASA satellites related to fires, dust,
ash clouds, air quality, drought, floods, etc.; other mapping and visualization tools such as FIRMS for
fires; access to more than a dozen NASA data centers and associated satellite data products
Ocean Health Index: http://www.oceanhealthindex.org/
A quantifiable assessment of the capacity of our oceans to deliver benefits and resources sustainably
Population Estimation Service: http://sedac.ciesin.columbia.edu/data/collection/gpw-v3/population-estimation-service
Hosted by SEDAC; draw a polygon around an area (e.g., a cyclone track or toxic release) and get the
population in that polygon in real time; iOS app under development
Population Reference Bureau (PRB): http://www.prb.org/
2014 World Population Data Sheet including interactive map, population clock, etc.
Socioeconomic Data and Applications Center (SEDAC): http://sedac.ciesin.columbia.edu
Gridded population data, poverty maps, infrastructure data (dams, nuclear plants, roads), PM2.5 maps,
national estimates of population and land area by climate/biome/elevation (PLACEv3); map gallery
available under Creative Commons licenses; country treaty participation data
Stockholm Environment Institute (SEI) Tools: http://www.sei-international.org/tools
Range of interactive tools on several global environmental issues
UNEP Live: http://uneplive.unep.org/
Environmental indicators on multiple issues, at multiple scales, in multiple formats
UNEP-WCMC Protected Planet: http://www.protectedplanet.net/
Search for protected areas and learn about natural features and species found in them
UNData: http://data.un.org/
This tool allows you to search for country level statistics on a huge range of topics from across all UN
agencies (e.g., UN Statistics Division, FAO, UNESCO, etc.)
Vizzuality: http://www.vizzuality.com/
Cool maps and graphics on a range of topics
World Bank Data: http://data.worldbank.org/
Includes the World Development Indicators and free and open access to data about development in
countries around the globe; thematic portals for agriculture and rural development, climate change,
environment and urban development
World Resources Institute (WRI): http://www.wri.org
A variety of data and information on the environment
WorldView: https://earthdata.nasa.gov/labs/worldview/
Near-realtime satellite imagery on wildfires, snow and ice cover, aerosol plumes, among others
Regional resources:
Ekuatorial: http://ekuatorial.com/en/datasets
Indonesia-specific datasets on marine issues, forests, and natural disasters
InfoAmazonia: http://infoamazonia.org/datasets/
An aggregation of environmental datasets for the 9 countries of the Amazon basin including forests,
watersheds, industries and indigenous lands
Open Development Mekong: http://OpenDevelopmentMekong.net
Development tracker focused on the Mekong (to be launched Feb 24, 2015)
Third Pole Data Network: http://data.thethirdpole.net/
An open source geospatial database: a simple, searchable catalog of water-related datasets sourced from
leading organizations monitoring water in Asia
E. PARTICIPANT BIOS
Irma, Reporter, Kompas Media Nusantara, Indonesia
Irma has been working for the last 10 years at Indonesian KOMPAS Daily, which
specializes in environmental issues. She has participated in science fellowships such
as the Environmental Reporting Fellowship 2012 held by GIZ-Germany and SjCOOP
Asia held by WFSJ in 2013-2014. She won the Media Award Prize in 2013 for her
report on sustainable energy.
Deborah Blum, Director, Knight Science Journalism Fellowships, MIT
Deborah Blum is a science writer who won a Pulitzer Prize for her writing on primate research and is the
author of five books, including the best-seller The Poisoner's Handbook. She writes a
blog about environmental chemistry for The NYT and has written for a wide range of
other publications including Wired, Time, Scientific American, Discover, The WSJ
and The Guardian. She is a co-editor of A Field Guide for Science Writers and the
2014 guest editor of Best American Science and Nature Writing. Blum is a former
president of the National Association of Science Writers (USA) and a former board
member of the WFSJ. She now serves as vice president of the Council for the Advancement of Science
Writing (USA). She was recently chosen to become the director of the Knight Science Journalism
Program at MIT and will assume that position in July.
John Bohannon, Contributing Correspondent, Science
John Bohannon is a contributing correspondent for Science. One of his projects was to
use Python to generate fake, fatally flawed scientific papers and to submit them to
hundreds of journals. About half of them were accepted for publication.
John also runs the annual Dance Your Ph.D. contest.
Alan Boyle, Science Editor, NBC News
As NBC News Digital's science editor, Alan Boyle focuses on developments in space
exploration and the physical sciences, plus paleontology, archaeology and
anthropology. Boyle is also the author of The Case for Pluto: How a Little Planet
Made a Big Difference. He has won awards from the National Academies, the
American Association for the Advancement of Science, the National Association of
Science Writers, Sigma Delta Chi, the Society of Professional Journalists, the Space
Frontier Foundation, IEEE-USA, the Pirelli Relativity Challenge and the CMU
Cybersecurity Journalism Awards program. In the 1990s he was a co-organizer of the
New Media for a New World conference for ex-Soviet journalists. Before joining NBC, Boyle was an
editor at the Seattle Post-Intelligencer, The Spokesman-Review in Spokane and The Cincinnati Post.
Curtis Brainard, Contributing Editor, Columbia Journalism Review
Curtis Brainard is the editor of the Scientific American blog network and a
contributing editor at Columbia Journalism Review where as a staff writer (2006-
2013) he covered science, environment, and medical news. In January 2008 he launched
The Observatory, CJR's first full-time department dedicated to critically
analyzing science coverage in the media as well as the opportunities and challenges
facing science journalists. Brainard has written for The New York Times, The
Washington Post, The New Yorker, Popular Science and OnEarth magazine. He
has served on the executive board of the World Federation of Science Journalists since 2013.
Kathryn Brown, Head of Communication, Howard Hughes Medical Institute
Kathryn Brown joined Howard Hughes Medical Institute (HHMI) as Head of
Communications in 2013. She oversees communications and public relations
for the Institute, including overall strategy, media relations, web presence,
editorial services, outreach, and internal communications. Before joining
HHMI, Brown served for six years as vice president of marketing and
communications at The Conservation Fund, a top-ranked environmental non-
profit. Brown also has worked as a communications consultant to the National Academies and other
leading science institutions. An award-winning writer, she is a former contributing correspondent
for Science magazine and has written for Scientific American, Discover, Popular Science, New
Scientist, Technology Review, and other popular magazines.
Bui Tien Dung, Science Editor, Youth Daily, Vietnam
Bui Tien Dung is currently working as an educational and scientific editor for Tuoi
Tre (Youth Daily), one of Vietnam's leading dailies. The educational and scientific
sections are considered a top priority for Tuoi Tre. Before joining Youth Daily, Bui
was a lecturer at the Faculty of Journalism and Communications at the University of
Social Sciences and Humanities, Hanoi National University. He also worked as an
editor for Vietnam Economic News, the leading online newspaper VietnamNet and
some other publications on Vietnam. He participated as a mentor in SjCOOP Asia – the World Federation
of Science Journalists’ mentoring program. Bui Tien Dung currently works on setting up a network of
Vietnamese Science Journalists.
Damien Chalaud, Executive Director, World Federation of Science Journalists
Damien Chalaud is the Executive Director of the World Federation of Science
Journalists. He graduated from the University of London – Goldsmiths College with a
Master's degree in Communications and a Master's degree in Journalism. From 1993 to
1997 he was a journalist and producer at BBC Radio and the BBC World Service. In
1998 he joined the European Broadcasting Union in Geneva as Director of Eurosonic
satellite operations. In 2001 he was appointed Director of the cross-media platform at
RFO-France Télévisions. From 2004 to 2007 he was Director of content for the Radio France
CityRadio network in Paris. From 2008 to 2013 he was a project manager and consultant for different
international broadcasters and web/mobile entities: BBC, CBC, Danmarks Radio, Radio-Canada, ARD,
RTE, Vodafone, O2, etc.
James Cohen, Director of Communications, Kavli Foundation
James Cohen is the communications head for The Kavli Foundation, which is
dedicated to advancing science for the benefit of humanity, promoting public
understanding of scientific research, and supporting scientists and their work. As
director of communications and public outreach, Cohen provides strategic
direction and oversight for all of the Foundation's communications initiatives
and programs, ranging from support of science journalism to helping scientists become
better communicators, as well as a variety of direct public outreach activities. Prior to
joining the Foundation, Cohen was director of media relations at the University
of California as well as associate director of communications. He is a member of
the Authors Guild and the Writers Guild of America, West.
Tim De Chant, Senior Digital Editor, PBS Nova
Tim De Chant is the senior digital editor at NOVA where he is the founding
editor of NOVA Next, a digital publication featuring in-depth articles and
commentaries, and head of social media strategy. He has also written for
Wired Magazine, the Chicago Tribune, Scientific American, Ars Technica,
and others. Tim received his PhD in Environmental Science, Policy, and
Management from the University of California, Berkeley and BA in
Environmental Studies, English, and Biology from St. Olaf College.
Alex de Sherbinin, Associate Director for Science Applications, CIESIN, Columbia University, Co-
Chair of the CODATA Task Group
Alex de Sherbinin is the Associate Director for Science Applications at the
Center for International Earth Science Information Network (CIESIN), an
environmental data and analysis center within the Earth Institute at Columbia
University. He also serves as Deputy Manager of the NASA Socioeconomic Data
and Applications Center (SEDAC), Chair of the ICSU CODATA Global Roads
Data Development Task Group, and Coordinator of the Population-Environment
Research Network (PERN), a network of 2,200 researchers from around the
world. Alex is a geographer who has published widely on the human aspects of
environmental change at local, national and global scales. In the past he served as a Population-
Environment Fellow at the International Union for Conservation of Nature (IUCN) and a population
geographer at the Population Reference Bureau (PRB).
Diane Wu, PhD Candidate in Chemistry, Stanford University
Diane is a senior graduate student with a strong interest in pursuing science
journalism after graduation. She is managing producer of Green Grid Radio, a
student-run podcast and radio show on sustainability and the environment. Diane
has written for Nature, the Stanford Energy Journal, and the Cantor Arts Center
conservation blog. She recently produced a video about her research for the
American Chemical Society. Before graduate school she taught chemistry at Weill
Cornell Medical College in Doha, Qatar.
Christina Elmer, Editor, Science Department, Der Spiegel
Christina Elmer works as a data and science journalist at Spiegel Online in
Hamburg, Germany. Before that she was part of the investigative reporting team
at Stern magazine and worked as an infographics editor for DPA, the German
press agency. She also trains journalists in data reporting and online research.
Besides journalism, she studied Biology.
Dan Fagin, Director, Science, Health & Environment Program, New York University
A Pulitzer Prize-winning journalist who writes frequently about environmental
science, Dan Fagin is a science journalism professor at New York University,
where he directs the Science, Health and Environmental Reporting Program.
His book, Toms River: A Story of Science and Salvation, was awarded the 2014
Pulitzer for General Nonfiction, as well as the New York Public Library’s
Helen Bernstein Book Award for Excellence in Journalism, the National
Academies Science Book Award, and the Society of Environmental
Journalists’ Rachel Carson Environment Book Award. Dan’s recent
publications include articles for The NYT, Scientific American, Nature and Slate. His new book
project is about monarch butterflies in the Anthropocene.
Dominique Forget, Science Journalist, Quebec Science
Dominique Forget is a contributing editor for Québec Science, a popular French-
Canadian science magazine. She also writes the health column for L'actualité, a
major news magazine, and works as a health reporter for Radio-Canada's radio show
Les Éclaireurs. She freelances for La Recherche, a French science magazine, and
other magazines. Many of her articles have had widespread recognition, including
major awards from the Québec and Canadian magazine industries. She's the author
of two books: Perdre le Nord? (on climate change in the Canadian Arctic) and
Bébés illimités. La procréation assistée et ses petits (on the topic of assisted
reproduction). Both books have received widespread national recognition.
Pallab Ghosh, Science Correspondent, BBC
Pallab Ghosh is a science correspondent for BBC News. He reports for BBC Radio and
Television News, The Today Programme, Newsnight, The BBC News Website and
The BBC News Channel. He began in 1984 at the British Electronics and Computer
Press before joining New Scientist as the magazine's Science News Editor. He joined
BBC News in 1989, where he became a Senior Producer on BBC Radio 4's Today
Programme. He is a former Chair of the Association of British Science Writers, and
was President of the World Federation of Science Journalists. He was part of a BBC
news team that won the Arthur C. Clarke award in recognition of BBC News's
coverage of Space. He has also won the Media Natural Environment Award and has been named BT
Technology Journalist of the Year.
Jim Giles, Mosaic, UK
Jim Giles is an editor and media consultant. In 2012 he co-founded Matter, a long-
form start-up that launched via a record-breaking Kickstarter campaign and was
later acquired by Medium, a publishing platform created by Ev Williams, one of the
founders of Twitter. Before starting Matter he wrote for The NYT, The
Atlantic, New Scientist, Nature and many other publications.
Mark Glaser, Executive Editor, PBS MediaShift and Idea Lab
Mark Glaser is founder, executive editor and publisher of PBS MediaShift and Idea
Lab. He is a long-time freelance journalist whose career includes columns on hip-
hop, reviews of videogames, travel stories, and humor columns that poked fun at the
titans of technology. From 2001 to 2005, he wrote a weekly column for USC
Annenberg School of Communication's Online Journalism Review. Glaser has
written essays for Harvard's Nieman Reports and the website for the Yale Center for
Globalization. He has written columns on the Internet and technology for the Los Angeles Times, CNET
and HotWired, and features for many other publications. Mark Glaser won the Innovation Journalism
Award in 2010 from the Stanford Center for Innovation and Communication.
Fred Guterl, Executive Editor, Scientific American
Fred Guterl, executive editor of Scientific American, has been writing about science
for more than 25 years. He is author of The Fate of the Species: Why the
Human Race May Cause Its Own Extinction and How We Can Stop It.
Guterl led Scientific American to a General Excellence Award from the National
Society of Magazine editors in 2011 for the first time in its 169-year history. Guterl
was formerly deputy editor at Newsweek International, an editor of Discover and
IEEE Spectrum. He also worked as a foreign correspondent based in London,
England. Guterl has appeared on CNN, Charlie Rose, the Today Show, and “The Daily Show” with Jon
Stewart to discuss popular issues in science. Guterl holds a bachelor's degree in electrical engineering
from the University of Rochester, and has taught science writing at Princeton University. He lives in the
New York City area.
James Hamilton, Hearst Professor of Communication, Stanford University
James T. Hamilton is the Hearst Professor of Communication and the
Director of the Journalism Program. His books on media markets and
information provision include All the News That’s Fit to Sell: How the
Market Transforms Information into News, Regulation Through
Revelation: The Origin, Politics, and Impacts of the Toxics Release
Inventory Program and Channeling Violence: The Economic Market
for Violent Television Programming. He is currently working on a book about economic markets for
investigative reporting and a book about the information lives of low-income individuals. Through
research in the field of computational journalism, he is also exploring how the costs of story discovery
can be lowered through better use of data and algorithms.
Richard Harris, Science Correspondent, NPR News
Richard Harris has been covering science for NPR since 1986. Over the years he
has covered a wide variety of subjects, including physics, biology, astronomy,
the environment, earth science, climate policy and biomedicine (his current focus).
Among his many awards: three AAAS prizes, a Peabody, a share of the
NAS/Keck award, and an AGU Presidential Citation for Science and Society. He
is past president of the National Association of Science Writers and currently on
the board of the Council for the Advancement of Science Writing.
Thomas Hayden, Lecturer, Earth Sciences, Stanford University
Thomas Hayden teaches science and environmental journalism and
communication at Stanford University. He has been an oceanographer, a staffer
at Newsweek, and a senior editor at US News & World Report. His cover
stories have appeared in Wired, National Geographic, Smithsonian, and
many other publications. He is coauthor of On Call in Hell, a national
bestseller about battlefield medicine, and Sex and War, about the biological
and social evolution of warfare. He is co-editor of The Science Writer’s
Handbook: Everything You Need to Know to Pitch, Publish, and Prosper in the Digital Age.
Lee Hotz, Science Journalist, The Wall Street Journal
Robert Lee Hotz is a reporter at The Wall Street Journal where he covers
basic research issues. He is a Distinguished Writer in Residence at New York
University and also president of the Alicia Patterson Foundation. He is a
Fellow of the American Association for the Advancement of Science
(AAAS); an honorary life member of Sigma Xi, The Research Society; and is
a past president of the National Association of Science Writers. He is among
America’s most honored science journalists and shared in The Los Angeles
Times’ 1995 Pulitzer Prize for coverage of the Northridge Earthquake. He has
received awards from the National Academy of Sciences, the Society of
Professional Journalists, AAAS, the American Society of Civil Engineers, and
the American Geophysical Union, among others. He has traveled widely in
Antarctica, under the auspices of the National Science Foundation Office of Polar Programs.
Brandon Joo, Conference Secretariat, WCSJ 2015, South Korea
Brandon Joo is project manager at the conference secretariat for the 9th edition
of the World Conference of Science Journalists, which will be held 8-12 June in
Seoul, Korea. Brandon worked previously as a public relations consultant for
information technology companies such as IBM Korea, Facebook Korea, and
LG-Nortel. For more than 4 years he has been a journalist covering IT for the daily
newspaper The Digital Times. He has covered various IT issues, including
government policies, industry trends, products, and corporations such
as Intel, HP, Samsung, and LG.
Eun Sung Kim, Head of Public Relations Division, National Research Council of Science and
Technology, South Korea
Eun Sung Kim is Head of the Public Relations Division at the National Research
Council of Science and Technology, the umbrella organization of 25 public research
institutes in science and technology. She worked together with the Korean Science
Journalists Association to bring WCSJ 2015 to Korea. Eun Sung is currently the
Operation Advisor of the event's organizing committee. Her career includes 10
years in the convention and exhibition industry and 10 years in public relations.
Chul Joong Kim, Health/Science Editor, Chosun Ilbo Newspaper, South Korea
Born in 1963, Chul Joong Kim graduated from the College of Medicine at Korea University in 1990
and obtained his medical license. He completed his residency in the radiology
department at Korea University Hospital from 1991 to 1995 and obtained board
certification in radiology from the Korean Medical Association. He has
worked for the Chosun Ilbo daily newspaper as a full-time medical journalist and
columnist since 1999. The newspaper is the oldest and biggest
in Korea and publishes 2.0 million copies per day. He is the President
of the Executive Board of WFSJ for 2013-2015.
Reade Levinson, Journalism Student, Stanford University
Reade Levinson spends her summers in the Sierras teaching Leave No
Trace ethics to children. During normal school months, she is a junior in
the Earth Systems program at Stanford University, where she works as an
instructor at the climbing wall and competes on the university’s climbing
team. She studies climate change ethics and the cultural barriers to social
and political action in the United States. She is passionate about using
storytelling to cross cultural and ideological barriers and interested in the role that new forms of media
will play in shaping the future of environmental communication. The moment final exams conclude,
Reade sprints home to the mountains.
Thomas Lin, Editor in Chief, Quanta Magazine, Simons Foundation
Thomas Lin is the founding editor of Quanta Magazine, an editorially
independent science news site published by the Simons Foundation. He
previously managed the online science and national news sections at The NYT,
edited the Scientist at Work blog, created the Profiles in Science video series,
produced the Science Times podcast, and wrote about tennis, science and
technology. He has also been a home page editor for The Indianapolis Star, a
reporter and photographer covering Queens, N.Y., a teacher and a mechanical
engineer.
Manuel Lino, Editor, El Economista, Mexico
Manuel Lino is the editor, and sometimes reporter, of the cultural section of the
Mexican newspaper El Economista, where he includes science topics as a part of
culture and also, given the nature of the publication, from a business perspective
(which in Mexico is something of a novel subject). Lately he has also been covering economic
sciences. He studied Biology, Music and Creative Writing and has won a Latin American
prize for a fictional short story and a national one for a book of short stories.
Vincent McCurley, Creative Technologist, National Film Board of Canada
The son of a chocolatier and a cosmetic chemist, Vincent naturally gravitated
toward a degree in Mechanical Engineering on a journey to becoming a Creative
Technologist at the National Film Board of Canada’s Digital Studio in Vancouver,
Canada. It's taken him 15 years of building multi-platform interactive projects
(apps, installations, games and websites) to be useful at the NFB where he calls
upon this experience to help filmmakers and artists use technology to tell stories in
creative and innovative ways. When not tinkering with the latest storytelling technologies, Vincent enjoys
sharing his knowledge of interactive experiences at the University of British Columbia, British Columbia
Institute of Technology and Fairleigh Dickinson University; and he still enjoys the occasional chocolate.
Véronique Morin, Science Journalist/Project Director - Kavli Symposium
Véronique Morin is a science journalist and project leader of the 2nd Kavli/WFSJ
Symposium. She was president of the Canadian Science Writers’ Association
(CSWA) and the first president of the WFSJ. She is a Southam journalism fellow,
Massey College, University of Toronto. For the past seven years, she has worked
for the science magazine program “Le Code Chastenay” on the public network
Télé-Québec, writing freelance articles, as well as developing documentary
projects. Her documentary (idea and research) “Time Bombs”, about Canadian
soldiers who participated in atomic bomb tests, received awards for “Best Documentary” from the New
York International Independent Film Festival and “Best Documentary” from the Canadian Association of
Broadcasters, and was nominated for “Best Research” at the Gémeaux awards.
Margaret Munro, Senior Writer, Science, PostMedia News
Margaret Munro is a senior writer for Postmedia News which reaches millions of
Canadians through newspapers and websites including the Ottawa Citizen, Montreal
Gazette, Calgary Herald and Vancouver Sun. Her work covering science – and science
controversies – has taken her to the Arctic to see the effects of global warming, to
First Nations communities to report on the diabetes epidemic and into Ottawa’s paper
labyrinth to reveal how the Canadian government has been muzzling scientists.
Munro’s honors include many writing and journalism awards and citations, including
runner-up for Canada’s World Press Freedom Award in 2013. Munro has taught science journalism at the
University of B.C., and served for several years on the board of the Canadian Science Writers’
Association and the Editorial Advisory Committee of the Science Media Centre of Canada.
Dan Nguyen, Developer, ProPublica and Lecturer, Journalism, Stanford University
Daniel Nguyen is currently a lecturer in computational journalism at Stanford
University. Previously he worked as a news application developer at ProPublica, the
New York non-profit investigative newsroom, and as a beat reporter at the Sacramento
Bee.
Kathryn O’Hara, Associate Professor, School of Journalism and Communication, Carleton
University
Kathryn O'Hara is an Associate Professor in the School of Journalism and
Communication at Carleton University in Ottawa where she holds the School's
CTV Chair in Science Broadcast Journalism. Kathryn’s experience includes over
twenty-five years of work in radio and television as a presenter, reporter, producer
and researcher, mainly in public broadcasting. She has served on numerous boards
and panels, most recently the Council of Canadian Academies Expert Panel on
Research Integrity. Kathryn is the Treasurer of the WFSJ and the former president
of the Canadian Science Writers’ Association where she campaigned against the muzzling of Canadian
government scientists.
Jason Palmer, Science and Technology Correspondent, The Economist
Jason Palmer is a science and technology correspondent for The Economist, based
in London. He got into science journalism after years as an ultrafast laser scientist
in the US and the UK gave him a fear of dark laboratories. He started freelancing
during a postdoc in Italy, returning to London in 2007 for internships at The
Economist and New Scientist. He moved to the BBC in 2008, where he was a science
and technology reporter until taking up a 2013/2014 Knight Science Journalism
fellowship.
Fernando Pérez, Physicist, Applied Mathematician, UC Berkeley
Fernando Pérez (@fperez_org) is a staff scientist at Lawrence Berkeley National
Laboratory and a founding investigator of the Berkeley Institute for Data
Science, created in 2013. He received a PhD in particle physics, followed by
postdoctoral research in applied mathematics, developing numerical algorithms.
Today, his research focuses on creating tools for modern computational research
and data science across domain disciplines, with an emphasis on high-level
languages, literate computing and reproducible research. He created IPython while
a graduate student in 2001 and continues to lead it as it evolves into the Jupyter
Project, now as a collaborative effort with a talented team that does all the hard work. He regularly
lectures about scientific computing and data science, and is a member of the Python Software Foundation
as well as a founding member of the NumFOCUS Foundation. He is the recipient of the 2012 Award for the
Advancement of Free Software from the Free Software Foundation.
Thuy Huong Pham, Reporter, Vietnam News Agency, Vietnam
Pham Thuy Huong has been a reporter at Vietnam News Agency since 1999. Her
interests are environment, science and education, particularly the impact of the
environment on human health and the development of educational and science policy.
Currently she is an editor and chief of the Department on Specific Issues. She was
selected to participate in SjCOOP Asia, a science journalism mentoring program
run by WFSJ, in 2013-2014.
Cheryl Phillips, Hearst Professional in Residence, Stanford University
Cheryl Phillips is a Hearst Professional in Residence at the Department of
Communications at Stanford University. She teaches data journalism and watchdog
reporting and is helping to found the Stanford Computational Journalism Lab. She
previously worked at The Seattle Times from 2002 to 2014. Her most recent position
in Seattle was as Data Innovation Editor. She has worked at USA Today and at
newspapers in Michigan, Montana and Texas. Cheryl has taught data journalism and data visualization at
the University of Washington and Seattle University. She also served for 10 years on the board of
directors for Investigative Reporters and Editors, a grassroots training organization for journalists and she
is a former IRE board president. Twitter: @cephillips
Ginger Pinholster, Director, Office of Public Programs, AAAS
Ginger Pinholster is the Director of Public Programs at the American Association for
the Advancement of Science (AAAS), which publishes the Science family of
journals. Earlier, she had served as national media relations manager at the
University of Delaware; deputy media relations manager at the National Academy of
Sciences; and media specialist at the Georgia Tech Research Institute. After
receiving her undergraduate degree in English from Eckerd College, she freelanced
for the St. Petersburg Times, and later held staff positions at the Northeast Georgian, the Athens Daily
News, and EE News. Her freelance news articles have appeared over the years in Science, Popular
Science, Environmental Health Perspectives, Omni, and elsewhere. She received her M.F.A. degree from
Queens University of Charlotte, and she is a Fellow of AAAS.
Subhransu Priyadarsini, Editor, Nature India
Subhra Priyadarsini is an award-winning science journalist currently running Nature
Publishing Group’s (NPG) India operation, Nature India. Subhra has been covering
politics and sports, fashion and films, crime and natural disasters in the mainstream
Indian media for over a dozen years. She has been a correspondent with major
Indian dailies and briefly for the Observer, London. Subhra received the BBC World
Service Trust award. She is a regular contributor to BBC Radio’s Hindi science
programme ‘Vigyan aur Vikas’ (Science and Development). Subhra won acclaim in India for her
coverage of the Orissa super cyclone in 1999 and the Indian Ocean tsunami in 2004.
Aditi Risbud, Science Communications Officer, Gordon and Betty Moore Foundation
Aditi is the science communications officer at the Gordon and Betty Moore
Foundation, where she provides communications expertise and guidance to the Science
Program. She also supports foundation-wide communications activities. Prior to
joining the foundation, Aditi was director of the Communications, Leadership, Ethics
and Research (CLEAR) program and a lecturing assistant professor of electrical and
computer engineering at the University of Utah. She previously held science
communications roles at public relations agency Weber Shandwick and the Molecular
Foundry at Lawrence Berkeley National Laboratory. She received a B.S. in materials science and
engineering from University of California, Davis, a Ph.D. in materials from University of California,
Santa Barbara, and a certificate in science communications from University of California, Santa Cruz.
Alessandra Santiago, Journalism Student, Stanford University
Alessandra Santiago is a master's student in the Earth Systems Program at Stanford
University. Her academic work involves advanced climate system modeling using 3D
animation to convey complex scientific concepts. Currently, she is working as an
independent filmmaker, partnering with UN institutions and governmental groups to tell
the story of island nations affected by climate change.
David Secko, Associate Professor of Journalism, Concordia University, Canada
David Secko is an Associate Professor of Journalism at Concordia University
(Montréal). He previously worked as a journalist for The Scientist magazine and
Vancouver’s Tyee. Dave now studies science journalism as a scholar and is the leader
of the Concordia Science Journalism Project (www.csjp.ca). Examples of his recent
articles include the definition and testing of four models of science journalism
(Journalism Practice 7(1), 62-80; Journalism Practice 8(6), 789-808), a qualitative
metasynthesis of the experiences of science journalists (Science Communication 34(2), 241-282) and a
narrative analysis of online commentary after science stories (Journalism 12(7), 814-831). He won a
University Research Award for his research contributions in 2011, the Dean’s Award for excellence as a
new scholar in 2010 and was awarded the Hal Straight Gold Medal in Journalism from UBC’s School of
Journalism in 2006. Dave was originally trained as a microbiologist at the University of British Columbia.
Dafna Shahaf, Researcher, Stanford University InfoLab
Dafna Shahaf is currently spending a year at Microsoft Research before joining the
faculty of the Hebrew University in Jerusalem. Prior to that, she was a postdoctoral
fellow at Stanford University. She received her Ph.D. from Carnegie Mellon University,
an M.S. from the University of Illinois at Urbana-Champaign and a B.Sc. from Tel-Aviv
University. Dafna's research focuses on helping people make sense of massive amounts
of data. She has won a best research paper award at KDD 2010, a Microsoft Research
Fellowship, a Siebel Scholarship, and a Magic Grant for innovative ideas.
Willie Shubert, EJN, Internews
Willie Shubert is the Program Officer for Internews' Earth Journalism Network. As a
coordinator of a global network of environmental journalists, Willie helps make tools
that enable people to connect with each other, find material support, and amplify
their local stories to global audiences. In his previous position at National
Geographic Magazine, he coordinated translation for the magazine's 32 local
language partners. He holds a degree in Geography from Humboldt State University
with concentrations in cartography, environmental economics, and Chinese Studies.
Outside of work, he devotes his time to the development of a free school dedicated to community building
through education and to collaborative mapping and audio projects.
Andreas Sperling, Editor, NDR Info
Andreas Sperling was born in 1965 and lives in Hamburg, Germany. He is an editor and
anchor for NDR Info, German public radio and TV. He is an "allrounder" but is also
responsible for the medicine and health department of his program. Andreas is a
participant in the "Immersion in science" programme of the German National Academy
of Science (Leopoldina) and the Bosch-Foundation. His hobby is photography.
Volker Stollorz, Science Journalist, Frankfurter Allgemeine Sonntagszeitung, Cologne
Volker Stollorz is a biology graduate, book author and freelance science journalist. He
lives in Cologne and works for the most prestigious German media organizations
including newspapers (Frankfurter Allgemeine Sonntagszeitung), national
magazines (STERN) and the public television channel WDR. He is a member of the
German Science Journalists Association and has been awarded numerous
prizes for his work, including the renowned Georg von Holtzbrinck Prize for Scientific
Journalism.
Richard Stone, International Editor, Science
Richard Stone oversees Science Magazine's international news coverage. Rich has
been with the magazine since 1991. From 2000 to 2012, he served as a foreign
correspondent for Science, starting out in Cambridge, U.K., as Science’s European
Editor and as a Visiting Writer at the University of Cambridge. The last stop on his
overseas tour was Beijing, where he opened the magazine’s Asia bureau in 2007. He
returned to Science’s home office in Washington, D.C., in early 2013. Rich has
been a Fulbright Scholar twice, in Russia in 1995-96 and in Kazakhstan in 2004-05,
and he has visited North Korea six times and counting over the last decade for
reporting on science in the Hermit Kingdom. Rich has contributed to Discover,
Smithsonian, and National Geographic magazines, and is the author of the nonfiction book "Mammoth:
The Resurrection of an Ice Age Giant."
Jonathan Stray, Data Journalist, Computer Scientist and Head of The Overview Project
Jonathan Stray leads the Overview Project for the Associated Press, a Knight
News Challenge-funded visualization system to help investigative journalists
make sense of very large document sets, and teaches computational
journalism at Columbia University. Formerly he was an interactive editor at the
Associated Press, a freelance reporter in Hong Kong, and a senior computer
scientist at Adobe Systems. He has contributed stories to The New York Times,
Foreign Policy, Wired and China Daily. He has an MSc in computer science
from the University of Toronto and an MA in journalism from the University of
Hong Kong.
Harry Surjadi, Freelance Science Journalist, Indonesia
Harry Surjadi, founder and chairman of the Society of Indonesian Science
Journalists, has been working as a science journalist specializing in
environmental issues for 25 years. He was a Knight International Journalism
Fellow in 2007-2008 and 2011-2012. He was a group leader of SjCOOP
Asia. In 2013 he received the Communication for Social Change Award from
the University of Queensland, Australia.
Mariko Takahashi, Science journalist, Asahi Shimbun, Tokyo
Mariko Takahashi is a senior staff writer at The Asahi Shimbun, one of Japan's
leading newspapers, and vice president of the Japanese Association of Science and
Technology Journalists. She joined The Asahi Shimbun as a journalist in
1979 and has served as staff writer in the science news section in Tokyo, staff writer and
editor of Monthly Science KAGAKU ASAHI, deputy editor of the science news
section in Osaka, editorial writer and science editor. She now contributes stories
to WEBRONZA, an opinion website maintained by The Asahi Shimbun
(http://webronza.asahi.com/). She holds a B.Sc. degree in physics from the
University of Tokyo.
Shara Tonn, Journalism Student, Stanford University
Shara Tonn is a recent graduate from Stanford University with an M.S. in Earth
Systems, an interdisciplinary environmental science degree. Over the past year,
she has made the transition from science to communications, interning at KQED
Science Radio and the Stanford News Service. She is excited about doing
multimedia science journalism by day while practicing and performing improv
theater – her second passion – by night. Born and raised in Tennessee, Shara loves
to travel and has lived overseas in Japan, Italy and Australia.
Dana Topousis, Head of Office of Legislative and Public Affairs, NSF
Dana Topousis joined NSF in 2006 as the acting head of the Office of Legislative
and Public Affairs. In addition to leading a diverse office of about 60 employees,
Ms. Topousis oversees the media relations, web, multimedia, social media, and
speechwriting teams at NSF. She and her team are experts in executive
communications training, issues management, strategic communications, media
relations and messaging. Prior to joining NSF, Ms. Topousis was
communications director for NOAA’s National Marine Protected Areas Center.
She served as a senior account executive for Fenton Communications, managed
m[…]ns for Conservation International, and served as deputy press director for the a[…]
Caroline Wichmann, Head of Press and Public Relations, German Science Academy
Caroline Wichmann has been active in science management for 17 years. As
a graduate in political science, public administration and media affairs
management, she is keenly aware of the instrumentality of press and publicity
work and mindful of the expectations of various target groups. Before taking
over the position of spokesperson and Director of the Department of Press
and Public Relations at the German National Academy of Sciences
Leopoldina she was employed in cooperative French-German projects as well as in the fields of press and
public relations. Caroline Wichmann set up the Department of Press and Public Relations at the
Leopoldina and adapted it to the new responsibilities attendant to a National Academy of Sciences,
namely, science-based policy advice. Her primary field of interest lies at the intersection of science and
media and the challenges of communication therein.
Ron Winslow, Deputy Bureau Chief Health/Science, The Wall Street Journal
Ron Winslow is Deputy Bureau Chief Health/Science at The Wall Street Journal. He
has been a reporter and editor at the Wall Street Journal for 32 years, including more
than 25 covering health and medicine. Earlier, he taught journalism at his alma
mater, the University of New Hampshire. He is the author of Hard Aground, the
Story of the Argo Merchant Oil Spill; co-author of Open and Shut (a true crime
story) and was a co-writer of NOVA, the book published in commemoration of the
10th anniversary of the PBS science program. In 2011, he won the Victor Cohn Prize
for Excellence in Medical Science Reporting. He is a past president of the National
Association of Science Writers and was a founding board member of the Association of Health Care
Journalists.