Delineating the Citation Impact of Scientific Discoveries Chaomei Chen 1 , Jian Zhang 1 , Weizhong Zhu 1 , Michael Vogeley 2 1 College of Information Science and Technology, Drexel University 2 Department of Physics, Drexel University This work is supported by the National Science Foundation under Grant No. 0612129. Thomson ISI provides the bibliographic data for the analysis.

Page 1: Delineating the Citation Impact of Scientific Discoveries

Delineating the Citation Impact of Scientific Discoveries

Chaomei Chen 1, Jian Zhang 1, Weizhong Zhu 1, Michael Vogeley 2

1 College of Information Science and Technology, Drexel University
2 Department of Physics, Drexel University

This work is supported by the National Science Foundation under Grant No. 0612129. Thomson ISI provides the bibliographic data for the analysis.

Page 2: Delineating the Citation Impact of Scientific Discoveries

As We May Think, by Vannevar Bush

There is a growing mountain of research. But there is increased evidence that we are being bogged down today as specialization extends. The investigator is staggered by the findings and conclusions of thousands of other workers—conclusions which he cannot find time to grasp, much less to remember, as they appear. Yet specialization becomes increasingly necessary for progress, and the effort to bridge between disciplines is correspondingly superficial.

Page 3: Delineating the Citation Impact of Scientific Discoveries

An Increasingly Strong Trend in Science (Gray & Szalay 2004)

Massive scientific data are being collected by one group of scientists and being analyzed by another group of scientists.

Two notable examples:
1. The SDSS project in astrophysics
2. The human genome project in biomedicine

Page 4: Delineating the Citation Impact of Scientific Discoveries

Sloan Digital Sky Survey
The most ambitious astronomical survey ever undertaken

Sloan Survey Data
• June, 2006: Data Release Five: 8000 square degrees, 1,048,960 spectra.
• June, 2005: Data Release Four: 6670 square degrees, 806,400 spectra.
• September, 2004: Data Release Three: 5282 square degrees, 528,640 spectra.
• March, 2004: Data Release Two: 3324 square degrees, 367,360 spectra.
• April, 2003: Data Release One: 2099 square degrees, 186,240 spectra.
• June, 2001: Early Data Release: 462 square degrees, 52,896 spectra.

There is an increasingly strong trend in science that massive scientific data are being collected by one group of scientists and being analyzed by another group of scientists (Gray & Szalay 2004). Two notable examples: the SDSS project in astrophysics and the human genome project in biomedicine.

SDSS Literature
• Total number of articles: 1,478
• Total citations: 47,282
• June 18, 2007: H = 95
• January 30, 2007: H = 89

Time Slice   Space   Node   Link
2001-2001     1699    300     7249
2002-2002     2703    519    14808
2003-2003     4294   1036    40133
2004-2004     5580   1218    43398
2005-2005     6692   1685    76009
2006-2006    10279   2815   139300
2007-2007     3136    496    15259

Page 5: Delineating the Citation Impact of Scientific Discoveries

Integrating Microscopic and Macroscopic perspectives

• Connecting text-level patterns (microscopic) and paper-level citation impacts (macroscopic)
    – improve our understanding of science in the making
    – develop data mining and visual analytics algorithms

Page 6: Delineating the Citation Impact of Scientific Discoveries
Page 7: Delineating the Citation Impact of Scientific Discoveries

Figure 3. Prominent keywords assigned by authors and burst terms extracted from titles and abstracts (2002-2006).

Page 8: Delineating the Citation Impact of Scientific Discoveries

Hc, Ht Split

Class I

Class II

Page 9: Delineating the Citation Impact of Scientific Discoveries

Fast-Growing SDSS Literature

• 1,400 papers
• 40,000 citations
• The total citation number doubled in the past 1.5 years.
• H-index of SDSS literature: 89 → 95

Page 10: Delineating the Citation Impact of Scientific Discoveries

As of June 18, 2007, 95 SDSS papers have 95 or more citations.

It was 89 in January 2007.
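The statement above is the definition of the H-index: the largest h such that h papers each have at least h citations. A minimal sketch of that computation follows. The function name and the sample citation counts are illustrative assumptions, not SDSS data.

```python
def h_index(citations):
    """Return the largest h such that h papers have at least h citations each."""
    ranked = sorted(citations, reverse=True)
    h = 0
    for rank, cites in enumerate(ranked, start=1):
        if cites >= rank:
            h = rank      # the top `rank` papers all have >= rank citations
        else:
            break         # counts only decrease from here, so stop
    return h


# Hypothetical citation counts for six papers (not taken from the slides).
example = [10, 8, 5, 4, 3, 0]
print(h_index(example))
```

Applied to the full citation counts of the SDSS literature, the same routine would yield the values quoted on the slide (89 in January 2007, 95 in June 2007), provided the underlying records match.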

Page 11: Delineating the Citation Impact of Scientific Discoveries

Measuring the Citation Impact

Sc discounts citations accumulated over a long period of time.
    – Sc is adjusted for publication age.

St measures the recent impact:
    – St gives heavier weights to relatively recent citations than earlier citations.

Page 12: Delineating the Citation Impact of Scientific Discoveries

Year  Cites  Sc      St      Title
2004  404    404.00  367.00  Cosmological parameters from SDSS and WMAP
1995  455    140.00  301.64  THE FIRST SURVEY - FAINT IMAGES OF THE RADIO SKY AT 20 CENTIMETERS
2003  371    296.80  263.47  Stellar population synthesis at the resolution of 2003
2001  307    175.43  255.07  Evidence for reionization at z similar to 6: Detection of a Gunn-Peterson trough in a z=6.28 quasar
2001  250    142.86  196.73  The luminosity function of galaxies in SDSS commissioning data
2003  195    156.00  175.80  A survey of z > 5.7 quasars in the Sloan Digital Sky Survey. II. Discovery of three additional quasars at z > 6
2001  226    129.14  174.87  A survey of z > 5.8 quasars in the Sloan Digital Sky Survey. I. Discovery of three new quasars and the spatial density of luminous quasars at z similar to 6
2002  211    140.67  170.00  Evolution of the ionizing background and the epoch of reionization from the spectra of z similar to 6 quasars
2001  221    126.29  168.21  Composite quasar spectra from the Sloan Digital Sky Survey
2004  224    224.00  167.00  The three-dimensional power spectrum of galaxies from the Sloan Digital Sky Survey
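The slides do not give a formula for Sc, but the Sc column in the table above is numerically consistent with Sc = γ · Cites · (Y_now − Year + 1)^(−δ), with γ = 4, δ = 1 and Y_now = 2007. This is the form used by the "contemporary h-index" score in the literature. The sketch below checks that reading against several rows. The parameter values are inferred from the table, not stated by the authors, and the function name is illustrative. St is not reproduced here because it depends on the year of each individual citation, which the slide does not list.

```python
def contemporary_score(cites, pub_year, now=2007, gamma=4.0, delta=1.0):
    """Age-discounted citation score; gamma, delta and `now` are inferred assumptions."""
    age = now - pub_year + 1          # a 2004 paper has age 4 in 2007
    return gamma * cites * age ** (-delta)


# (year, cites, Sc as printed on the slide)
rows = [
    (2004, 404, 404.00),
    (1995, 455, 140.00),
    (2003, 371, 296.80),
    (2001, 307, 175.43),
    (2002, 211, 140.67),
]

for year, cites, sc_on_slide in rows:
    sc = contemporary_score(cites, year)
    print(year, cites, f"{sc:.2f}", sc_on_slide)
```

With these parameters a paper published in 2004 keeps its raw citation count, while the 1995 paper's 455 citations are scaled by 4/13.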

Page 13: Delineating the Citation Impact of Scientific Discoveries

Hg Indices and Splits
• The 1,293 records
    – H-index = 65, including 3 papers that have exactly 65 citations
    – Hc index = 52
    – Ht index = 53
• The H split
    – 67 papers in the highly cited group
    – 1,226 remaining papers in the second group
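One reading of the H split, sketched below, is that every paper with at least H citations goes into the highly cited group. Ties at exactly H citations would then explain why that group (67 papers) is slightly larger than H itself (65). The split rule, the function names and the sample data are assumptions made for illustration.

```python
def h_index(citations):
    """Largest h such that h papers have at least h citations each."""
    ranked = sorted(citations, reverse=True)
    return sum(1 for rank, cites in enumerate(ranked, start=1) if cites >= rank)


def h_split(citations):
    """Split papers into (h, highly cited group, remaining group)."""
    h = h_index(citations)
    threshold = max(h, 1)             # avoid putting uncited papers in the top group
    high = [c for c in citations if c >= threshold]
    rest = [c for c in citations if c < threshold]
    return h, high, rest


# Hypothetical citation counts with ties at the threshold (not SDSS data).
h, high, rest = h_split([9, 7, 4, 4, 4, 2, 1])
print(h, len(high), len(rest))
```

In this toy example three papers tie at the threshold, so the highly cited group has one more paper than h, mirroring the 65 versus 67 pattern on the slide.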

Page 14: Delineating the Citation Impact of Scientific Discoveries

Class I

Class I

Class II

Page 15: Delineating the Citation Impact of Scientific Discoveries

Significant Noun Phrases

• 22,665 noun phrases identified by a part-of-speech tagging and pattern matching process.

• 290 of them are selected based on their log-likelihood ratios.
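The slides do not say which log-likelihood ratio variant was used. A common choice for term selection is the G² statistic on a 2×2 contingency table (term present or absent, by citation group), compared against a chi-square critical value with one degree of freedom. The sketch below follows that assumption. The function name and the term counts are hypothetical. Only the group sizes 379 and 914 come from the slides.

```python
import math


def log_likelihood_ratio(k11, k12, k21, k22):
    """G^2 for a 2x2 table.

    k11: high-group records containing the term   k12: high-group records without it
    k21: low-group records containing the term    k22: low-group records without it
    """
    total = k11 + k12 + k21 + k22
    row_high, row_low = k11 + k12, k21 + k22
    col_with, col_without = k11 + k21, k12 + k22
    cells = (
        (k11, row_high, col_with),
        (k12, row_high, col_without),
        (k21, row_low, col_with),
        (k22, row_low, col_without),
    )
    g2 = 0.0
    for observed, row_total, col_total in cells:
        if observed > 0:                       # 0 * log(0) is treated as 0
            expected = row_total * col_total / total
            g2 += observed * math.log(observed / expected)
    return 2.0 * g2


CRITICAL_P01 = 6.635  # chi-square critical value, 1 degree of freedom, p = 0.01

# Hypothetical term: appears in 60 of 379 high-group and 40 of 914 low-group records.
score = log_likelihood_ratio(60, 379 - 60, 40, 914 - 40)
print(score > CRITICAL_P01)
```

A term whose presence is independent of the citation group scores close to zero and would be discarded under this rule.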

Total terms: 22,665

                Sc               St
                A(Sc)   G(Sc)    A(St)   G(St)
Pivotal value   11.70   11.06    11.46   8.61
#High           379     379      328     401
#Low            914     914      965     892
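Figure 4 refers to a "geometric mean split", so G in the table above is read here as a geometric-mean pivot and A, by analogy, as an arithmetic-mean pivot. That reading of A is an assumption. The sketch below shows how the two pivots divide a set of scores into #High and #Low. The scores are hypothetical, and the strict "greater than" rule is assumed because the slides do not say how ties at the pivot are handled.

```python
import math


def arithmetic_mean(values):
    return sum(values) / len(values)


def geometric_mean(values):
    """Geometric mean of strictly positive values, computed via logarithms."""
    return math.exp(sum(math.log(v) for v in values) / len(values))


def split_by_pivot(scores, pivot):
    """Return (high, low): scores above the pivot and the rest."""
    high = [s for s in scores if s > pivot]
    low = [s for s in scores if s <= pivot]
    return high, low


# Hypothetical, strictly positive scores (not SDSS data).
scores = [1, 2, 4, 8, 16, 32]
for name, pivot in (("A", arithmetic_mean(scores)), ("G", geometric_mean(scores))):
    high, low = split_by_pivot(scores, pivot)
    print(name, round(pivot, 2), len(high), len(low))
```

For skewed scores the geometric mean sits below the arithmetic mean, so the geometric split tends to place more records in the high group.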

Page 16: Delineating the Citation Impact of Scientific Discoveries

Figure 4. An overview of a decision tree generated from 216 terms selected by log-likelihood ratio values (p<0.01) and a geometric mean split (74.44% classification accuracy). The tree should be read from the root downwards.

Page 17: Delineating the Citation Impact of Scientific Discoveries

Figure 5. A part of the tree shown in Figure 4. The presence (>0) or absence (<=0) of a term is associated with a citation status group, i.e. highly and timely cited group.
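The slides do not name the tree learner behind Figures 4 and 5. To illustrate how presence (>0) or absence (<=0) of a term can route a record to a citation group, the following is a small ID3-style sketch that splits on information gain. The terms, labels and function names are all hypothetical.

```python
import math
from collections import Counter


def entropy(labels):
    total = len(labels)
    return -sum((n / total) * math.log2(n / total) for n in Counter(labels).values())


def best_term(rows, labels, terms):
    """Pick the term whose presence/absence split has the largest information gain."""
    base, best, best_gain = entropy(labels), None, 0.0
    for term in terms:
        present = [lab for row, lab in zip(rows, labels) if term in row]
        absent = [lab for row, lab in zip(rows, labels) if term not in row]
        if not present or not absent:
            continue                          # term does not split the records
        remainder = (len(present) * entropy(present)
                     + len(absent) * entropy(absent)) / len(labels)
        if base - remainder > best_gain:
            best, best_gain = term, base - remainder
    return best


def build_tree(rows, labels, terms):
    if len(set(labels)) == 1:
        return labels[0]                      # pure leaf
    term = best_term(rows, labels, terms)
    if term is None:
        return Counter(labels).most_common(1)[0][0]   # majority leaf
    rest = [t for t in terms if t != term]
    yes = [(r, l) for r, l in zip(rows, labels) if term in r]
    no = [(r, l) for r, l in zip(rows, labels) if term not in r]
    return {
        "term": term,
        "present": build_tree([r for r, _ in yes], [l for _, l in yes], rest),
        "absent": build_tree([r for r, _ in no], [l for _, l in no], rest),
    }


def classify(tree, row):
    while isinstance(tree, dict):
        tree = tree["present"] if tree["term"] in row else tree["absent"]
    return tree


# Hypothetical records: each is the set of selected terms found in a paper.
rows = [{"dark energy", "quasar"}, {"quasar"}, {"dark energy"},
        {"photometry"}, {"photometry", "calibration"}, {"calibration"}]
labels = ["high", "high", "high", "low", "low", "low"]
tree = build_tree(rows, labels, ["dark energy", "quasar", "photometry", "calibration"])
print(tree)
```

Each internal node tests one term, and each leaf names a citation group, which is the structure the figure captions describe.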

Page 18: Delineating the Citation Impact of Scientific Discoveries

Figure 6. An ADTree derived from data selected using the same criteria, with 70.55% accuracy.

Page 19: Delineating the Citation Impact of Scientific Discoveries

Figure 7. A decision tree of 95.82% classification accuracy derived from 721 terms and 1,267 records.


Page 20: Delineating the Citation Impact of Scientific Discoveries

Figure 10. The citation history of timeliness papers shows recently published papers are moved up in the rankings.

Page 21: Delineating the Citation Impact of Scientific Discoveries

Future Work

• Unsupervised ontology construction to smooth the feature space

• Incremental classification of incoming new data and scholarly publications

• Self-directed optimization of existing decision trees based on new evidence

• Full-text analysis that can model associative relations between hypotheses and evidence and between facts and opinions
