Reinventing the Research Article - Seven Questions on Science Publishing Anita de Waard Researcher Disruptive Technologies, Elsevier Labs NWO - Casimir Grantee, Utrecht University ELPUB 2008

Unknown Unknowns


DESCRIPTION

Talk from Elpub2008 in Toronto on 'known unknowns' in science publishing


Page 1: Unknown Unknowns

Reinventing the Research Article - Seven Questions on Science Publishing

Anita de Waard

Researcher Disruptive Technologies, Elsevier Labs

NWO - Casimir Grantee, Utrecht University

ELPUB 2008

Page 9: Unknown Unknowns

Seven ’known knowns’ in online science publishing:

1. The internet has caused an information overload.

2. Science papers contain facts.

3. The narrative research article is outdated and needs to be replaced.

4. Since words contain meaning,

5. And words (and logic) contain scientific fact,

6. We just need to model them with XML + RDF;

7. And the publishers should stop making all these papers.

Page 20: Unknown Unknowns

1. The internet has caused an information overload

- My own experience (as a researcher):

- Easy: finding what I know exists

- OK: finding things I expect (or hope) exist

- Hard: making sure I haven’t missed anything

- However, none of these make me feel overwhelmed.

- Infuriating:

- Trying to respond to people who ask me something

- Managing three email accounts on four computers

- Following up on plans and projects

- However, we can improve the delivery of science content online.

Page 30: Unknown Unknowns

1. The internet has caused an information overload

- Pick (carve out) a first set of user needs, e.g.:

- Locate

- Understand

- Believe (Be convinced)

- Explore

- But this does not address WHAT you want to Locate, Understand, ...

- Semantic network in pharmacology: ‘Grey out what I already know’

1. How can we model a user’s interest?

Page 41: Unknown Unknowns

2. Science papers contain facts

- With the FEBS Letters Editorial Office in Heidelberg and the MINT Database in Rome

- Structured Digital Abstract [Gerstein et al.]: ‘machine-readable XML summary of pertinent facts’

- For FEBS: provide proteins, methods, and protein-protein interactions, as given in MINT:

- 2008: authors provide, editors check

- 2009: Word plug-in tool suggests, authors (and editors) check
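To make ‘machine-readable XML summary of pertinent facts’ concrete, here is a minimal sketch of what such a summary could look like when built in code. The element names (structuredAbstract, interaction, protein) and the placeholder values are assumptions for illustration only; they are not the FEBS Letters or MINT format.

```python
import xml.etree.ElementTree as ET


def build_structured_abstract(interactions):
    """Serialise (protein_a, protein_b, method) triples as a small XML summary.

    The element and attribute names are hypothetical, chosen only to
    illustrate the idea of a machine-readable abstract.
    """
    root = ET.Element("structuredAbstract")
    for protein_a, protein_b, method in interactions:
        # One <interaction> per reported protein-protein interaction.
        item = ET.SubElement(root, "interaction", method=method)
        ET.SubElement(item, "protein").text = protein_a
        ET.SubElement(item, "protein").text = protein_b
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    # Placeholder labels, not real proteins or methods.
    print(build_structured_abstract([("PROTEIN_A", "PROTEIN_B", "METHOD_X")]))
```

In the 2008 workflow on the slide, the author would supply the triples and an editor would check them; in the 2009 workflow a tool would suggest them first.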

Page 47: Unknown Unknowns

2. Science papers contain facts

- Issue: authors cannot be curators!

- A fact is not a claim; it is created by consensus, post hoc

- How do we model the process of consensus-building, of disagreement, of fact creation, of mistrust and doubt?

2. Can we create (tools for) an ontology of doubt?

Page 50: Unknown Unknowns

3. The narrative RA should be replaced

Parts of the classical oration (Aristotle, Quintilian) set against article sections (Cell, APA Style Guide):

- prooimion / Introduction (Aristotle), exordium (Quintilian): The introduction of a speech, where one announces the subject and purpose of the discourse, and where one usually employs the persuasive appeal of ethos in order to establish credibility with the audience. Cell: Introduction. APA Style Guide: Introduction.

- prothesis / Statement of Facts (Aristotle), narratio (Quintilian): The second part of a classical oration, following the introduction or exordium. The speaker here provides a narrative account of what has happened and generally explains the nature of the case. Quintilian adds that the narratio is followed by the propositio, a kind of summary of the issues or a statement of the charge. Cell: Introduction. APA Style Guide: Introduction.

- Summary, propositio (Quintilian): Coming between the narratio and the partitio of a classical oration, the propositio provides a brief summary of what one is about to speak on, or concisely puts forth the charges or accusation. Cell: Abstract. APA Style Guide: Abstract.

- Division/outline, partitio (Quintilian): Following the statement of facts, or narratio, comes the partitio or divisio. In this section of the oration, the speaker outlines what will follow, in accordance with what's been stated as the status, or point at issue in the case. Quintilian suggests the partitio is blended with the propositio and also assists memory. Cell: Table of Contents. APA Style Guide: Article Outline.

- pistis / Proof (Aristotle), confirmatio (Quintilian): Following the division / outline or partitio comes the main body of the speech where one offers logical arguments as proof. The appeal to logos is emphasized here. Cell: Results. APA Style Guide: Methods, Results.

- Refutation, refutatio (Quintilian): Following the confirmatio or section on proof in a classical oration, comes the refutation. As the name connotes, this section of a speech was devoted to answering the counterarguments of one's opponent. Cell: Discussion. APA Style Guide: Discussion.

- epilogos (Aristotle), peroratio (Quintilian): Following the refutatio and concluding the classical oration, the peroratio conventionally employed appeals through pathos, and often included a summing up (see the figures of summary, below). Cell: Discussion. APA Style Guide: Discussion.

Story grammar applied to a story and to a paper

Story: The Story of Goldilocks and the Three Bears

Paper: The AXH Domain of Ataxin-1 Mediates Neurodegeneration through Its Interaction with Gfi-1/Senseless Proteins

Each entry gives the story grammar element and story text, then the paper element and paper text.

- Time (Setting): “Once upon a time” | Background: “The mechanisms mediating SCA1 pathogenesis are still not fully understood, but some general principles have emerged.”

- Characters (Setting): “a little girl named Goldilocks” | Objects of study: “the Drosophila Atx-1 homolog (dAtx-1) which lacks a polyQ tract,”

- Location (Setting): “She went for a walk in the forest. Pretty soon, she came upon a house.” | Experimental setup: “studied and compared in vivo effects and interactions to those of the human protein”

- Goal (Theme): “She knocked and, when no one answered,” | Research goal: “Gain insight into how Atx-1's function contributes to SCA1 pathogenesis. How these interactions might contribute to the disease process and how they might cause toxicity in only a subset of neurons in SCA1 is not fully understood.”

- Attempt (Theme): “she walked right in.” | Hypothesis: “Atx-1 may play a role in the regulation of gene expression”

- Name (Episode 1): “At the table in the kitchen, there were three bowls of porridge.” | Name: “dAtx-1 and hAtx-1 Induce Similar Phenotypes When Overexpressed in Flies”

- Subgoal (Episode 1): “Goldilocks was hungry.” | Subgoal: “test the function of the AXH domain”

- Attempt (Episode 1): “She tasted the porridge from the first bowl.” | Method: “overexpressed dAtx-1 in flies using the GAL4/UAS system (Brand and Perrimon, 1993) and compared its effects to those of hAtx-1.”

- Outcome (Episode 1): “This porridge is too hot! she exclaimed.” | Results: “Overexpression of dAtx-1 by Rhodopsin1(Rh1)-GAL4, which drives expression in the differentiated R1-R6 photoreceptor cells (Mollereau et al., 2000 and O'Tousa et al., 1985), results in neurodegeneration in the eye, as does overexpression of hAtx-1[82Q]. Although at 2 days after eclosion, overexpression of either Atx-1 does not show obvious morphological changes in the photoreceptor cells”

- (Episode 1): “So, she tasted the porridge from the second bowl.” | Data: “(data not shown),”

- Outcome (Episode 1): “This porridge is too cold, she said” | Results: “both genotypes show many large holes and loss of cell integrity at 28 days”

- (Episode 1): “So, she tasted the last bowl of porridge.” | Data: “(Figures 1B-1D).”

- Outcome (Episode 1): “Ahhh, this porridge is just right, she said happily and” | Results: “Overexpression of dAtx-1 using the GMR-GAL4 driver also induces eye abnormalities. The external structures of the eyes that overexpress dAtx-1 show disorganized ommatidia and loss of interommatidial bristles”

- (Episode 1): “she ate it all up.” | Data: “(Figure 1F),”

Page 56: Unknown Unknowns

3. The narrative RA should be replaced

Discourse Segments:

- “A text is made up of Discourse Segments and the relations between them” - Grosz and Sidner, Mann and Thompson, Marcu, Swales

- Discourse Segment Purpose: element that has a consistent rhetorical/pragmatic goal.

- Define for Biological Research Article:

<EXPERIMENTS>
<Experiment>
<Header header="h1">p53-Independent Initiation of G1 Arrest Induced by IR</Header>
<Fact fact="fa1" factref="br26">Since the transcriptional response by p53 is a relatively slow process,</Fact>
<Problem problem="p1">we asked whether initiation of a G1 arrest following genotoxic stress requires p53.</Problem>
<Method method="m1">We generated an MCF-7 derivative </Method>
<Fact fact="fa2" factref="br24">that expresses the HPV16 E6 protein, which mediates degradation of p53(<Bibref bib="br24">[24]</Bibref>).</Fact>
<Result result="r1">In the presence of E6, p53 stabilization in response to IR was almost completely prevented in MCF-7 cells (<Figref figref="agami1.gif">Figure 1A).</Figref></Result>
<Result result="r2">Consistent with this, no induction of p21cip1 by IR was seen in the E6-expressing MCF-7 cells
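Once a text is tagged with discourse segments like this, the segments can be pulled apart by type. The sketch below is one way to do that, assuming a complete, well-formed fragment; the sample is abridged from the slide's markup (the slide's own excerpt is cut off), and the function name segments_by_type is a hypothetical helper.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

# Abridged, well-formed fragment modelled on the slide's markup.
# Tag and attribute names follow the slide; the excerpt is shortened.
SAMPLE = """
<Experiment>
  <Header header="h1">p53-Independent Initiation of G1 Arrest Induced by IR</Header>
  <Fact fact="fa1" factref="br26">Since the transcriptional response by p53 is a relatively slow process,</Fact>
  <Problem problem="p1">we asked whether initiation of a G1 arrest following genotoxic stress requires p53.</Problem>
  <Method method="m1">We generated an MCF-7 derivative</Method>
  <Result result="r1">In the presence of E6, p53 stabilization in response to IR was almost completely prevented in MCF-7 cells</Result>
</Experiment>
"""


def segments_by_type(xml_text):
    """Group the text of each top-level discourse segment under its tag name."""
    root = ET.fromstring(xml_text.strip())
    grouped = defaultdict(list)
    for element in root:
        # itertext() also collects text inside nested tags such as Bibref.
        grouped[element.tag].append("".join(element.itertext()).strip())
    return dict(grouped)


if __name__ == "__main__":
    for kind, texts in segments_by_type(SAMPLE).items():
        print(kind, len(texts))
```

A reader interested only in methods, or only in results, could then be shown just those segments.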

Page 66: Unknown Unknowns

3. The narrative RA should be replaced

- Narrative is how stories are told; ‘the truth can only be told in stories’....

- Scientific rhetoric is contained within the narrative

- Main goal of article is to persuade: ‘The author is a medium that enables the article to get itself published (à la selfish gene/meme)’

- Science happens in language - science is done by creating successful persuasive texts IN ENGLISH! (empowerment rests on mastery of this genre)

- How to disentangle good science from good writing?

3. How can we better represent online narrative? ...

Page 76: Unknown Unknowns

3. The narrative RA should be replaced

[Diagram, built up over pages 67-76: the statement ‘PHC undergo growth arrest’ is shown with two papers. Paper A and Paper B are each broken into fact, fact, goal, method, results, and implication segments; Paper A is shown with data 1, data 2, and data 3, and Paper B with data 4, data 5, and data 6. Later builds add links labelled ‘underpinning’ and ‘method link’.]

Page 81: Unknown Unknowns

3. The narrative RA should be replaced

- How to develop systems that ‘reconstruct the salami’?

- Claim-evidence networks: identify the number of experiments supporting a claim, vs. the number of papers containing two words in a sentence?

3. How can we better represent collections of online narratives?
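The contrast between the two counts can be sketched in a few lines. This is a toy model under stated assumptions: the Claim and Experiment classes, the function cooccurrence_count, and the example sentences are all hypothetical, and only the labels ‘Paper A’, ‘Paper B’, ‘data 1’ to ‘data 6’ and the statement ‘PHC undergo growth arrest’ come from the slides.

```python
from dataclasses import dataclass, field


@dataclass
class Experiment:
    paper: str   # label of the paper reporting the experiment
    data: list   # labels of the data items behind its result


@dataclass
class Claim:
    text: str
    evidence: list = field(default_factory=list)

    def add_evidence(self, experiment):
        self.evidence.append(experiment)

    def experiment_count(self):
        """Number of experiments recorded as supporting this claim."""
        return len(self.evidence)

    def paper_count(self):
        """Number of distinct papers those experiments come from."""
        return len({e.paper for e in self.evidence})


def cooccurrence_count(sentences, word_a, word_b):
    """Baseline: count sentences that merely contain both terms."""
    a, b = word_a.lower(), word_b.lower()
    return sum(1 for s in sentences if a in s.lower() and b in s.lower())


if __name__ == "__main__":
    claim = Claim("PHC undergo growth arrest")
    claim.add_evidence(Experiment("Paper A", ["data 1", "data 2", "data 3"]))
    claim.add_evidence(Experiment("Paper B", ["data 4", "data 5", "data 6"]))
    # Invented sentences: the second mentions both terms without supporting the claim.
    toy = ["PHC undergo growth arrest.", "Growth arrest was not studied in PHC here."]
    print(claim.experiment_count(), cooccurrence_count(toy, "PHC", "growth arrest"))
```

The co-occurrence baseline counts any sentence that mentions both terms, including ones that do not support the claim, which is the gap a claim-evidence network is meant to close.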

Page 87: Unknown Unknowns

4. Words contain meaning

Sicilian?

Page 91: Unknown Unknowns

4. Words contain meaning

- ‘A word is worth a thousand pictures’ (Don Loritz)

- The meaning of words occurs in context and is dependent on knowledge and experience

- This is even more so in science: PSA = Prostate-Specific Antigen or Pot Smokers Association of America?

Page 97: Unknown Unknowns

4. Words contain meaning

- Cognitive linguistics: language and cognition cannot be separated - language acts are cognitive acts

- Lakoff, metaphor: ‘anger is heat’

- Meaning is created in the mind: a word is not (only) a ‘particle’ but (also) a ‘wave’. Hearing/reading is not unpacking a package, but resonating at a specific frequency - context is its medium - context-free language does not exist!

4. How do we model cognitive context?

Page 103: Unknown Unknowns

5. Words (and logic) contain scientific fact

Figure 1. Initiation and Maintenance of G1 Arrest Induced by IR. (A) Stable MCF-7 clones containing either pCDNA3.1 (Neo) or pCDNA3.1-E6 were irradiated (20 Gy), and cellular protein extracts were made 2 hr later, separated on 10% SDS PAGE, and immunoblotted to detect p53 and cyclin D1 proteins.

“In the presence of E6, p53 stabilization in response to IR was almost completely prevented in MCF-7 cells (Figure 1A).”

“We generated an MCF-7 derivative that expresses the HPV16 E6 protein, which mediates degradation of p53 ([24]).”

24. M. Scheffner, B.A. Werness, J.M. Huibregtse, A.J. Levine and P.M. Howley, The E6 oncoprotein encoded by human papillomavirus types 16 and 18 promotes the degradation of p53. Cell 63 (1990), pp. 1129–1136.

• “[Y]ou can transform a fact into fiction or a fiction into fact just by adding or subtracting references [and data]” – Bruno Latour, ‘Science in Action’, 1987

Page 104: Unknown Unknowns

5. Words (and logic) contain scientific fact

Page 105: Unknown Unknowns

5. Words (and logic) contain scientific fact

- Essential persuasive elements are non-textual

Page 106: Unknown Unknowns

5. Words (and logic) contain scientific fact

- Essential persuasive elements are non-textual

- Open Data, how to incorporate into ‘text mining’?

Page 107: Unknown Unknowns

5. Words (and logic) contain scientific fact

- Essential persuasive elements are non-textual

- Open Data, how to incorporate into ‘text mining’?

- Bioimage consortium (Shotton, Oxford): access biology images across a variety of sources (PLoS, Nature, Elsevier...) and create common metadata format

Page 108: Unknown Unknowns

5. Words (and logic) contain scientific fact

- Essential persuasive elements are non-textual

- Open Data, how to incorporate into ‘text mining’?

- Bioimage consortium (Shotton, Oxford): access biology images across a variety of sources (PLoS, Nature, Elsevier...) and create common metadata format

- SPIDER: Allowing shared access to epidemiology data (meta-epidemiology)

Page 109: Unknown Unknowns

5. Words (and logic) contain scientific fact

- Essential persuasive elements are non-textual

- Open Data, how to incorporate into ‘text mining’?

- Bioimage consortium (Shotton, Oxford): access biology images across a variety of sources (PLoS, Nature, Elsevier...) and create common metadata format

- SPIDER: Allowing shared access to epidemiology data (meta-epidemiology)

- Tie in to Open Data initiative, generalise, get buy in, sustainability:

Page 110: Unknown Unknowns

5. Words (and logic) contain scientific fact

- Essential persuasive elements are non-textual

- Open Data, how to incorporate into ‘text mining’?

- Bioimage consortium (Shotton, Oxford): access biology images across a variety of sources (PLoS, Nature, Elsevier...) and create common metadata format

- SPIDER: Allowing shared access to epidemiology data (meta-epidemiology)

- Tie in to Open Data initiative, generalise, get buy in, sustainability:

5. How do we represent (and access) non-textual elements?

Page 111: Unknown Unknowns

5. Words (and logic) contain scientific fact

- Essential persuasive elements are non-textual

- Open Data, how to incorporate into ‘text mining’?

- Bioimage consortium (Shotton, Oxford): access biology images across a variety of sources (PLoS, Nature, Elsevier...) and create common metadata format

- SPIDER: Allowing shared access to epidemiology data (meta-epidemiology)

- Tie in to Open Data initiative, generalise, get buy in, sustainability:

5. How do we represent (and access) non-textual elements?

5. Words (and logic) contain scientific fact

Page 112: Unknown Unknowns

6. Just model the facts with xml + rdf

Page 113: Unknown Unknowns

6. Just model the facts with xml + rdf

- Content in XML - but what about overlapping tags?

Page 114: Unknown Unknowns

6. Just model the facts with xml + rdf

- Content in XML - but what about overlapping tags?

- Versioning in DTDs/Schemas? Principle of hierarchical trees - not always best model of a content set

Page 115: Unknown Unknowns

6. Just model the facts with xml + rdf

- Content in XML - but what about overlapping tags?

- Versioning in DTDs/Schemas? Principle of hierarchical trees - not always best model of a content set

- First pass at relations in RDF (Resource Description Framework):

Page 116: Unknown Unknowns

6. Just model the facts with xml + rdf

- Content in XML - but what about overlapping tags?

- Versioning in DTDs/Schemas? Principle of hierarchical trees - not always best model of a content set

- First pass at relations in RDF (Resource Description Framework):

- Cohere: Open University - (open!) system of creating and linking claims

Page 119: Unknown Unknowns

6. Just model the facts with xml + rdf

- Content in XML - but what about overlapping tags?

- Versioning in DTDs/Schemas? Principle of hierarchical trees - not always best model of a content set

- First pass at relations in RDF (Resource Description Framework):

- Cohere: Open University - (open!) system of creating and linking claims

- More experiments with RDF:

Page 120: Unknown Unknowns

6. Just model the facts with xml + rdf

- Content in XML - but what about overlapping tags?

- Versioning in DTDs/Schemas? Principle of hierarchical trees - not always best model of a content set

- First pass at relations in RDF (Resource Description Framework):

- Cohere: Open University - (open!) system of creating and linking claims

- More experiments with RDF:

- DOPE: Semantic access to heterogeneous data in pharmacology

Page 121: Unknown Unknowns

6. Just model the facts with xml + rdf

- Content in XML - but what about overlapping tags?

- Versioning in DTDs/Schemas? Principle of hierarchical trees - not always best model of a content set

- First pass at relations in RDF (Resource Description Framework):

- Cohere: Open University - (open!) system of creating and linking claims

- More experiments with RDF:

- DOPE: Semantic access to heterogeneous data in pharmacology

- OKKAM: Entity-centric web (EU-funded)

Page 122: Unknown Unknowns

6. Just model the facts with xml + rdf

- Content in XML - but what about overlapping tags?

- Versioning in DTDs/Schemas? Principle of hierarchical trees - not always best model of a content set

- First pass at relations in RDF (Resource Description Framework):

- Cohere: Open University - (open!) system of creating and linking claims

- More experiments with RDF:

- DOPE: Semantic access to heterogeneous data in pharmacology

- OKKAM: Entity-centric web (EU-funded)

1. DOPE (2003): deduplicate thesaurus term; select co-occurrence terms; see results set + link to full-text; visualise overlap results
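The relations this slide proposes to capture in RDF can be sketched as subject-predicate-object triples. The following is a minimal illustration in plain Python, not the data model of Cohere, DOPE, or OKKAM. The identifiers (`claim:...`, `ref:24`, `fig:1A`) and the predicates (`ex:supportedBy`, `ex:buildsOn`) are hypothetical, loosely based on the MCF-7 example quoted under point 5.

```python
from typing import List, NamedTuple


class Triple(NamedTuple):
    """One RDF-style statement: subject, predicate, object."""
    subject: str
    predicate: str
    obj: str


# Hypothetical identifiers and predicates, for illustration only.
TRIPLES = [
    Triple("claim:E6-degrades-p53", "ex:supportedBy", "ref:24"),
    Triple("claim:E6-blocks-p53-stabilization", "ex:supportedBy", "fig:1A"),
    Triple("claim:E6-blocks-p53-stabilization", "ex:buildsOn",
           "claim:E6-degrades-p53"),
]


def objects(store: List[Triple], subject: str, predicate: str) -> List[str]:
    """Return every object linked to `subject` through `predicate`."""
    return [t.obj for t in store
            if t.subject == subject and t.predicate == predicate]


def evidence_chain(store: List[Triple], claim: str) -> List[str]:
    """Collect the evidence for a claim and for the claims it builds on."""
    seen, stack, evidence = set(), [claim], []
    while stack:
        current = stack.pop()
        if current in seen:
            continue  # guard against cyclic 'buildsOn' links
        seen.add(current)
        evidence.extend(objects(store, current, "ex:supportedBy"))
        stack.extend(objects(store, current, "ex:buildsOn"))
    return evidence


if __name__ == "__main__":
    print(evidence_chain(TRIPLES, "claim:E6-blocks-p53-stabilization"))
```

Following the chain from the second claim reaches both the figure and the cited reference, which is the kind of non-textual and bibliographic support that the Latour quote points to.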

Page 123: Unknown Unknowns

6. Just model the facts with xml + rdf

Page 124: Unknown Unknowns

6. Just model the facts with xml + rdf

- Yes, but:

Page 125: Unknown Unknowns

6. Just model the facts with xml + rdf

- Yes, but:

- In practice: ScienceDirect does not use our XML... (shhh....)

Page 126: Unknown Unknowns

6. Just model the facts with xml + rdf

- Yes, but:

- In practice: ScienceDirect does not use our XML... (shhh....)

- At Elsevier: Project Harpoon: ‘stab’ the document with metadata, asynchronous, linked in (XPath/XQuery), distributed

Page 127: Unknown Unknowns

6. Just model the facts with xml + rdf

- Yes, but:

- In practice: ScienceDirect does not use our XML... (shhh....)

- At Elsevier: Project Harpoon: ‘stab’ the document with metadata, asynchronous, linked in (XPath/XQuery), distributed

- Not solved in XML - how to access a phrase inside an article:

Page 128: Unknown Unknowns

6. Just model the facts with xml + rdf

- Yes, but:

- In practice: ScienceDirect does not use our XML... (shhh....)

- At Elsevier: Project Harpoon: ‘stab’ the document with metadata, asynchronous, linked in (XPath/XQuery), distributed

- Not solved in XML - how to access a phrase inside an article:

- access inside a PDF by coordinates? Format, content changes

Page 129: Unknown Unknowns

6. Just model the facts with xml + rdf

- Yes, but:

- In practice: ScienceDirect does not use our XML... (shhh....)

- At Elsevier: Project Harpoon: ‘stab’ the document with metadata, asynchronous, linked in (XPath/XQuery), distributed

- Not solved in XML - how to access a phrase inside an article:

- access inside a PDF by coordinates? Format, content changes

- add IDs to every single element? Format, content, version changes?

Page 130: Unknown Unknowns

6. Just model the facts with xml + rdf

- Yes, but:

- In practice: ScienceDirect does not use our XML... (shhh....)

- At Elsevier: Project Harpoon: ‘stab’ the document with metadata, asynchronous, linked in (XPath/XQuery), distributed

- Not solved in XML - how to access a phrase inside an article:

- access inside a PDF by coordinates? Format, content changes

- add IDs to every single element? Format, content, version changes?

- How to represent relations, even if we know where they link?

Page 131: Unknown Unknowns

6. Just model the facts with xml + rdf

- Yes, but:

- In practice: ScienceDirect does not use our XML... (shhh....)

- At Elsevier: Project Harpoon: ‘stab’ the document with metadata, asynchronous, linked in (XPath/XQuery), distributed

- Not solved in XML - how to access a phrase inside an article:

- access inside a PDF by coordinates? Format, content changes

- add IDs to every single element? Format, content, version changes?

- How to represent relations, even if we know where they link?

6. How can we better model discourse elements (and relations)?

Page 132: Unknown Unknowns

6. Just model the facts with xml + rdf

- Yes, but:

- In practice: ScienceDirect does not use our XML... (shhh....)

- At Elsevier: Project Harpoon: ‘stab’ the document with metadata, asynchronous, linked in (XPath/XQuery), distributed

- Not solved in XML - how to access a phrase inside an article:

- access inside a PDF by coordinates? Format, content changes

- add IDs to every single element? Format, content, version changes?

- How to represent relations, even if we know where they link?

6. How can we better model discourse elements (and relations)?
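The idea of attaching metadata from outside the document, linked in by a path expression, can be sketched as a stand-off annotation. This is a minimal sketch under stated assumptions, not Project Harpoon's implementation: the article markup, the `make_annotation` and `resolve` helpers, and the annotation fields are all hypothetical. It uses the limited XPath subset supported by Python's `xml.etree.ElementTree`.

```python
import xml.etree.ElementTree as ET

# Hypothetical article markup; the real publisher XML is not shown in the talk.
ARTICLE = (
    "<article><sec id='results'><p>We generated an MCF-7 derivative that "
    "expresses the HPV16 E6 protein, which mediates degradation of p53."
    "</p></sec></article>"
)


def make_annotation(root, path, phrase, label):
    """Build a stand-off annotation: element path plus character offsets."""
    element = root.find(path)
    if element is None or element.text is None:
        raise ValueError("path does not resolve to a text element")
    start = element.text.find(phrase)
    if start < 0:
        raise ValueError("phrase not found in element text")
    # The quoted phrase is stored so that later drift can be detected.
    return {"path": path, "start": start, "end": start + len(phrase),
            "quote": phrase, "label": label}


def resolve(root, annotation):
    """Return the annotated phrase, or None if the document has changed."""
    element = root.find(annotation["path"])
    if element is None or element.text is None:
        return None
    found = element.text[annotation["start"]:annotation["end"]]
    return found if found == annotation["quote"] else None


root = ET.fromstring(ARTICLE)
# The annotation lives outside the document and never modifies it.
note = make_annotation(root, "./sec[@id='results']/p",
                       "degradation of p53", "claim")

if __name__ == "__main__":
    print(resolve(root, note))
```

The stored quote makes the fragility raised on the slide visible: as soon as the content or version of the paragraph changes, the offsets no longer point at the phrase and `resolve` returns `None` instead of silently returning the wrong text.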

Page 133: Unknown Unknowns

7. And publishers should stop making all those papers.

Page 134: Unknown Unknowns

7. And publishers should stop making all those papers.

- 6 uses of a RA:

Page 135: Unknown Unknowns

7. And publishers should stop making all those papers.

- 6 uses of a RA:

- job application

Page 136: Unknown Unknowns

7. And publishers should stop making all those papers.

- 6 uses of a RA:

- job application

- report card

Page 137: Unknown Unknowns

7. And publishers should stop making all those papers.

- 6 uses of a RA:

- job application

- report card

- thesis

Page 138: Unknown Unknowns

7. And publishers should stop making all those papers.

- 6 uses of a RA:

- job application

- report card

- thesis

- conference tickets

Page 139: Unknown Unknowns

7. And publishers should stop making all those papers.

- 6 uses of a RA:

- job application

- report card

- thesis

- conference tickets

- research assessment

Page 140: Unknown Unknowns

7. And publishers should stop making all those papers.

- 6 uses of a RA:

- job application

- report card

- thesis

- conference tickets

- research assessment

- and yes, by the way, reporting on scientific work.

Page 141: Unknown Unknowns

7. And publishers should stop making all those papers.

- 6 uses of a RA:

- job application

- report card

- thesis

- conference tickets

- research assessment

- and yes, by the way, reporting on scientific work.

- Scientists are evaluated largely based on publications: this enables their production to be evaluated by non-specialists

Page 142: Unknown Unknowns

7. And publishers should stop making all those papers.

- 6 uses of a RA:

- job application

- report card

- thesis

- conference tickets

- research assessment

- and yes, by the way, reporting on scientific work.

- Scientists are evaluated largely based on publications: this enables their production to be evaluated by non-specialists

- This places undue stress on quantity, conformity (for fear of being rejected), and publishing for its own sake.

Page 143: Unknown Unknowns

7. And publishers should stop making all those papers.

Page 144: Unknown Unknowns

The real challenge:

7. And publishers should stop making all those papers.

Page 145: Unknown Unknowns

The real challenge:

- in Holland, chemistry departments are dwindling

7. And publishers should stop making all those papers.

Page 146: Unknown Unknowns

The real challenge:

- in Holland, chemistry departments are dwindling

- in large companies, nr. of PhDs is inversely proportional to power

7. And publishers should stop making all those papers.

Page 147: Unknown Unknowns

The real challenge:

- in Holland, chemistry departments are dwindling

- in large companies, nr. of PhDs is inversely proportional to power

- direction of scientific research determined by managers for adolescents

7. And publishers should stop making all those papers.

Page 148: Unknown Unknowns

The real challenge:

- in Holland, chemistry departments are dwindling

- in large companies, nr. of PhDs is inversely proportional to power

- direction of scientific research determined by managers for adolescents

For science to survive, we need:

7. And publishers should stop making all those papers.

Page 149: Unknown Unknowns

The real challenge:

- in Holland, chemistry departments are dwindling

- in large companies, nr. of PhDs is inversely proportional to power

- direction of scientific research determined by managers for adolescents

For science to survive, we need:

- ‘Hanny’, who found a Voorwerp on GalaxyZoo.org

7. And publishers should stop making all those papers.

Page 151: Unknown Unknowns

The real challenge:

- in Holland, chemistry departments are dwindling

- in large companies, nr. of PhDs is inversely proportional to power

- direction of scientific research determined by managers for adolescents

For science to survive, we need:

- ‘Hanny’, who found a Voorwerp on GalaxyZoo.org

- Prof. Twalib Ngoma, Professor of Oncology from Dar es Salaam, Tanzania

7. And publishers should stop making all those papers.

Page 153: Unknown Unknowns

The real challenge:

- in Holland, chemistry departments are dwindling

- in large companies, nr. of PhDs is inversely proportional to power

- direction of scientific research determined by managers for adolescents

For science to survive, we need:

- ‘Hanny’, who found a Voorwerp on GalaxyZoo.org

- Prof. Twalib Ngoma, Professor of Oncology from Dar es Salaam, Tanzania

- Prof. Zimitri Erasmus, Sociologist from Cape Town

7. And publishers should stop making all those papers.

Page 155: Unknown Unknowns

The real challenge:

- in Holland, chemistry departments are dwindling

- in large companies, nr. of PhDs is inversely proportional to power

- direction of scientific research determined by managers for adolescents

For science to survive, we need:

- ‘Hanny’, who found a Voorwerp on GalaxyZoo.org

- Prof. Twalib Ngoma, Professor of Oncology from Dar es Salaam, Tanzania

- Prof. Zimitri Erasmus, Sociologist from Cape Town

How can we access their science?

7. And publishers should stop making all those papers.

Page 156: Unknown Unknowns

The real challenge:

- in Holland, chemistry departments are dwindling

- in large companies, nr. of PhDs is inversely proportional to power

- direction of scientific research determined by managers for adolescents

For science to survive, we need:

- ‘Hanny’, who found a Voorwerp on GalaxyZoo.org

- Prof. Twalib Ngoma, Professor of Oncology from Dar es Salaam, Tanzania

- Prof. Zimitri Erasmus, Sociologist from Cape Town

How can we access their science?

7. How can we disentangle communication and evaluation (‘metric of attribution’ - virtual RFID)?

7. And publishers should stop making all those papers.

Page 157: Unknown Unknowns

The real challenge:

- in Holland, chemistry departments are dwindling

- in large companies, nr. of PhDs is inversely proportional to power

- direction of scientific research determined by managers for adolescents

For science to survive, we need:

- ‘Hanny’, who found a Voorwerp on GalaxyZoo.org

- Prof. Twalib Ngoma, Professor of Oncology from Dar es Salaam, Tanzania

- Prof. Zimitri Erasmus, Sociologist from Cape Town

How can we access their science?

7. How can we disentangle communication and evaluation (‘metric of attribution’ - virtual RFID)?

7. And publishers should stop making all those papers.

Page 158: Unknown Unknowns

Seven ‘Known Unknowns’ in Online Science Publishing

Page 159: Unknown Unknowns

Seven ‘Known Unknowns’ in Online Science Publishing

1. How can we model a user’s interest?

Page 160: Unknown Unknowns

Seven ‘Known Unknowns’ in Online Science Publishing

1. How can we model a user’s interest?

2. Can we create an ontology of doubt?

Page 161: Unknown Unknowns

Seven ‘Known Unknowns’ in Online Science Publishing

1. How can we model a user’s interest?

2. Can we create an ontology of doubt?

3. How can we better represent collections of online narrative?

Page 162: Unknown Unknowns

Seven ‘Known Unknowns’ in Online Science Publishing

1. How can we model a user’s interest?

2. Can we create an ontology of doubt?

3. How can we better represent collections of online narrative?

4. How do we model cognitive context?

Page 163: Unknown Unknowns

Seven ‘Known Unknowns’ in Online Science Publishing

1. How can we model a user’s interest?

2. Can we create an ontology of doubt?

3. How can we better represent collections of online narrative?

4. How do we model cognitive context?

5. How do we represent and access non-textual elements?

Page 164: Unknown Unknowns

Seven ‘Known Unknowns’ in Online Science Publishing

1. How can we model a user’s interest?

2. Can we create an ontology of doubt?

3. How can we better represent collections of online narrative?

4. How do we model cognitive context?

5. How do we represent and access non-textual elements?

6. How can we better model discourse elements and relations?

Page 165: Unknown Unknowns

Seven ‘Known Unknowns’ in Online Science Publishing

1. How can we model a user’s interest?

2. Can we create an ontology of doubt?

3. How can we better represent collections of online narrative?

4. How do we model cognitive context?

5. How do we represent and access non-textual elements?

6. How can we better model discourse elements and relations?

7. How can we disentangle communication and evaluation?