Automatic Classification and Analysis of Interdisciplinary Fields in Computer Sciences

Tanmoy Chakraborty
Google India PhD Fellow
Indian Institute of Technology, Kharagpur, India

In collaboration with: Srijan Kumar, M Dastagiri Reddy, Suhansanu Kumar, Niloy Ganguly, Animesh Mukherjee (IIT-Kgp, India)

2013 ASE/IEEE SocialCom, Washington D.C., USA, September 8-14, 2013


Outline

o Problem Definition

o Dataset

o Indicators of Interdisciplinarity

o Unsupervised Classification Model

o Evolution Landscape of Interdisciplinarity

o Core-periphery Analysis

o Conclusion

How to seek a good toolkit?

[Figure: separate tools (knife, screw-driver, paper-punch, stitching machine, bottle opener, saw, nail-cutter) compared with a single combo pack]

Interdisciplinarity

[Figure: Biology, Engineering, Physics, Chemistry, Sociology, Economics and Mathematics combined into an interdisciplinary toolkit]

In The Lines of Great Thinkers

“We are not students of some subject matter, but students of problems. And problems may cut right across the borders of any subject matter or discipline.”

– Karl Popper

“Interdisciplinary research is the only way to do research in current times.”

– Fritjof Capra


Problem Definition

o Proper quantitative indicators of interdisciplinarity

o Unsupervised classification of core and interdisciplinary fields

o Evolution dynamics of interdisciplinarity

o Core-periphery analysis of citation network


Dataset

o Large DBLP dump used by Chakraborty et al. (ASONAM, 13)

o Publicly available: http://cnerg.org and http://cse.iitkgp.ac.in/~tanmoyc

o Bibliographic information during 1960-2008

  - Paper name
  - Author(s)
  - Publication venue
  - Year of publication
  - Abstract
  - References
  - Field

o Statistics

  Number of valid papers:        702,973
  Number of authors:             495,311
  Number of unique venue names:  1,705

o 24 Fields

  AI                  Bioinformatics      NLP
  Algorithm           Graphics            WWW
  Networking          Comp. Vision        Education
  Database            Data Mining         OS
  Dist Comp.          Prog. Lang.         Embedded Sys.
  Architecture        Security            Simulation
  Software Engg.      IR                  HCI
  Machine Learning    Scientific Comp.    Multimedia

Citation network

o Node: paper; Link: citation

o Aggregated Network: 1960-2005

o Time-stamp wise Networks: 5 years sliding window (1960-64, 1961-65, 1962-66, ..., 2001-2005)
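The time-stamp wise networks can be illustrated with a minimal Python sketch. It assumes inclusive five-year windows that advance by one year, as the listed examples suggest; the function names and the `papers` record layout (a dict with a `year` key) are hypothetical and not taken from the slides.

```python
def sliding_windows(first_year, last_year, width=5):
    """Return inclusive (start, end) year pairs for every window of `width` years."""
    return [
        (start, start + width - 1)
        for start in range(first_year, last_year - width + 2)
    ]


def papers_in_window(papers, window):
    """Select papers whose publication year falls inside the inclusive window."""
    start, end = window
    return [p for p in papers if start <= p["year"] <= end]


windows = sliding_windows(1960, 2005)
print(windows[0], windows[-1], len(windows))

# Hypothetical paper records, only to show the selection step.
papers = [{"id": "p1", "year": 1962}, {"id": "p2", "year": 1970}]
print(papers_in_window(papers, windows[0]))
```

Under these assumptions the period 1960-2005 yields 42 windows, each of which would define one citation network.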


Indicators of Interdisciplinarity

o Reference Diversity Index (RDI)

o Citation Diversity Index (CDI)

o Membership Diversity Index (MDI)

o Attraction Index

  - Most of the indices are entropy-based measures
  - More entropy => more diversity

Reference Diversity Index (RDI)

o RDI of a paper $X_i$:

  $RDI(X_i) = -\sum_j p_j \log p_j$

  $p_j$ = proportion of references of $X_i$ citing the papers of field $F_j$

o Example: $X_i$ has five references, three to field $F_j$ and two to field $F_k$, so $p_j = 3/5$ and $p_k = 2/5$

  $RDI(X_i) = -\frac{3}{5} \log \frac{3}{5} - \frac{2}{5} \log \frac{2}{5} = 0.67$ (using the natural logarithm)

o More RDI, more interdisciplinarity
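The worked RDI example can be reproduced with a short Python sketch. It assumes the natural logarithm (which matches the value 0.67 on the slide) and that each reference is already labelled with a single field; `diversity_index` is a hypothetical helper name.

```python
import math
from collections import Counter


def diversity_index(field_labels):
    """Shannon entropy (natural log) of the field distribution in `field_labels`."""
    counts = Counter(field_labels)
    total = sum(counts.values())
    if total == 0:
        return 0.0
    return -sum((c / total) * math.log(c / total) for c in counts.values())


# Fields of the five papers referenced by X_i: three in F_j, two in F_k.
reference_fields = ["Fj", "Fj", "Fj", "Fk", "Fk"]
rdi = diversity_index(reference_fields)
print(round(rdi, 2))  # 0.67
```

A paper whose references all fall in one field gets an index of 0, and the index grows as references spread more evenly over more fields.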

Citation Diversity Index (CDI)

o CDI of a paper $X_i$ at time $t_i$:

  $CDI_{t_i}(X_i) = -\sum_j p_j \log p_j$

  $p_j$ = proportion of citations received by $X_i$ from the papers of field $F_j$

o Drift of CDI between two successive time windows:

  $\Delta_{t_i}(f_i) = CDI_{t_{i+1}}(f_i) - CDI_{t_i}(f_i)$

[Figure: paper $X_i$ cited by papers of fields $F_j$ and $F_k$ in proportions $p_j$ and $p_k$; $CDI(X_i) = 0.67$]
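As a rough illustration of CDI and its drift, the sketch below computes the entropy of incoming citations per time window and then the difference between successive windows. The citation counts are invented, the natural logarithm is assumed, and the drift is taken as the later window minus the earlier one, which is an assumption about the sign convention.

```python
import math


def entropy(counts):
    """Shannon entropy (natural log) of a mapping field -> citation count."""
    total = sum(counts.values())
    if total == 0:
        return 0.0
    return -sum((c / total) * math.log(c / total) for c in counts.values() if c > 0)


def cdi_drift(citations_per_window):
    """Differences CDI(t_{i+1}) - CDI(t_i) between successive windows."""
    cdi = [entropy(counts) for counts in citations_per_window]
    return [later - earlier for earlier, later in zip(cdi, cdi[1:])]


# Hypothetical citation counts by citing field for three successive windows.
windows = [
    {"AI": 10},
    {"AI": 6, "IR": 4},
    {"AI": 5, "IR": 5},
]
drift = cdi_drift(windows)
print([round(d, 2) for d in drift])
```

A large positive drift means that citations started arriving from a noticeably wider mix of fields between two windows.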

Spikes in CDI

[Figure: plot of spikes in CDI]

Membership Diversity Index (MDI)

1. Identify overlapping communities [Xie et al., ICDM, 2011]

2. Tag the communities by the fields (major field in a group)

3. For each field $f_i$ in the dataset,

   3.1 Observe the belongingness of all papers in different field-tagged communities

   3.2 Measure MDI

$MDI(f_i) = -\sum_j p_j \log p_j$

where $p_j$ is the fraction of overlapped papers of field $f_i$ belonging to the communities tagged as $f_j$

[Figure: overlapping communities tagged AI and IR]

More MDI, more interdisciplinarity
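The three MDI steps can be sketched in Python under several assumptions. The overlapping communities are taken as given (the detection algorithm of Xie et al. is not implemented here), every community is non-empty, "overlapped" is read as "belonging to more than one community", and the natural logarithm is used. The function names and the toy data are hypothetical.

```python
import math
from collections import Counter


def tag_communities(communities, paper_field):
    """Tag each (non-empty) community with the most common field among its papers."""
    return [
        Counter(paper_field[p] for p in members).most_common(1)[0][0]
        for members in communities
    ]


def mdi(field, communities, paper_field):
    """Entropy of community tags over memberships of overlapping papers of `field`."""
    tags = tag_communities(communities, paper_field)
    membership = Counter()
    for paper, f in paper_field.items():
        if f != field:
            continue
        own_tags = [tag for tag, members in zip(tags, communities) if paper in members]
        if len(own_tags) > 1:  # keep only papers that sit in several communities
            membership.update(own_tags)
    total = sum(membership.values())
    if total == 0:
        return 0.0
    return -sum((c / total) * math.log(c / total) for c in membership.values())


# Toy data: p3 (AI) and p4 (IR) belong to both of the first two communities.
paper_field = {"p1": "AI", "p2": "AI", "p3": "AI",
               "p4": "IR", "p5": "IR", "p6": "IR", "p7": "DB"}
communities = [{"p1", "p2", "p3", "p4"}, {"p3", "p4", "p5", "p6"}, {"p7"}]
print(tag_communities(communities, paper_field))
print(mdi("AI", communities, paper_field), mdi("DB", communities, paper_field))
```

In this toy case the AI field has one overlapping paper shared between an AI-tagged and an IR-tagged community, while the DB field has no overlap and therefore an MDI of 0.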

External Evidence: Attraction index

$\chi_f = \frac{n_{i+4} - n_i}{c_i}$

• $n_i$: # unique authors up to the year $t_i$ (in field $f$)

• $n_{i+4}$: # unique authors up to the year $t_{i+4}$ (in field $f$)

• $c_i$: # publications in $f$ in the time window ($t_{i+4} - t_i$)

Example: $n_i = 2$, $n_{i+4} = 3$, $\chi_f = 1/4$

[Figure: timeline from 0 through $t_i$ to $t_{i+4}$]

More $\chi$, more interdisciplinarity
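The attraction index can be sketched as follows, assuming publication records of one field given as `(year, authors)` pairs and a window that covers the years after $t_i$ up to and including $t_{i+4}$. The exact window boundaries are not stated on the slide, so this reading, the function name and the sample records are all assumptions.

```python
def attraction_index(records, t_i, span=4):
    """(n_{i+4} - n_i) / c_i for one field, from (year, authors) publication records."""
    authors_before = set()   # unique authors up to year t_i
    authors_after = set()    # unique authors up to year t_i + span
    publications = 0         # publications inside the window (t_i, t_i + span]
    for year, authors in records:
        if year <= t_i:
            authors_before.update(authors)
        if year <= t_i + span:
            authors_after.update(authors)
        if t_i < year <= t_i + span:
            publications += 1
    if publications == 0:
        return 0.0
    return (len(authors_after) - len(authors_before)) / publications


# Hypothetical records: two authors by 2000, one newcomer across four later papers.
records = [
    (2000, {"a", "b"}),
    (2001, {"a"}),
    (2002, {"b"}),
    (2003, {"a", "c"}),
    (2004, {"b"}),
]
print(attraction_index(records, 2000))  # 0.25
```

The sample data mirrors the slide's example: the author count grows from 2 to 3 over a window with four publications, giving 1/4.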


Unsupervised Classification Model

o A field is represented by a vector of size 4 indicating four features

o Adjacency matrix A of size 24×24

  A(i,j) = Cosine similarity of field i and j

o Clustering algorithm proposed by Waltman et al. (J. Informetrics, 2010)
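A minimal sketch of the similarity matrix follows. It assumes the four features are the four indicators introduced earlier (RDI, CDI, MDI, attraction index), which the slide does not state explicitly, and it uses invented feature values for three fields instead of the full 24. The clustering step of Waltman et al. is not reproduced here.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length feature vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def similarity_matrix(features):
    """Return field names and the matrix A with A[i][j] = cosine similarity."""
    names = list(features)
    matrix = [
        [cosine_similarity(features[a], features[b]) for b in names]
        for a in names
    ]
    return names, matrix


# Hypothetical (RDI, CDI, MDI, attraction) vectors for three fields.
features = {
    "Algorithm": [0.20, 0.25, 0.10, 0.05],
    "Data Mining": [0.90, 0.85, 0.70, 0.40],
    "WWW": [0.80, 0.90, 0.65, 0.45],
}
names, A = similarity_matrix(features)
for name, row in zip(names, A):
    print(name, [round(x, 3) for x in row])
```

The resulting matrix is symmetric with ones on the diagonal, and it would serve as the weighted adjacency matrix handed to whichever clustering algorithm is used.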

Result of the Classification

[Figure: result of the classification]


Evolution dynamics

o Construct a field-field citation network

o 24 nodes in each time-stamp

o Draw directed and weighted edges based on citations

o Observe the citation distribution across the fields
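The construction of a field-field citation network can be sketched in Python. The sketch assumes paper-level citations given as `(citing, cited)` pairs and one field label per paper; whether citations within the same field are kept as self-loops is not stated on the slide, so they are kept here. All names and data are hypothetical.

```python
from collections import Counter


def field_citation_network(citations, paper_field):
    """Aggregate paper-level citations into weighted, directed field-to-field edges."""
    edges = Counter()
    for citing, cited in citations:
        edges[(paper_field[citing], paper_field[cited])] += 1
    return edges


def incoming_citations(edges):
    """Total incoming edge weight per field."""
    incoming = Counter()
    for (_, target), weight in edges.items():
        incoming[target] += weight
    return incoming


# Hypothetical papers and citations inside one time window.
paper_field = {"p1": "DM", "p2": "ML", "p3": "DB", "p4": "DM"}
citations = [("p1", "p2"), ("p1", "p3"), ("p4", "p2"), ("p4", "p1")]

edges = field_citation_network(citations, paper_field)
print(dict(edges))
print(dict(incoming_citations(edges)))
```

Repeating this for every time window would give one small weighted graph per time-stamp, from which the citation distribution across fields can be compared over time.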

Evolutionary Landscape

[Figure: evolutionary landscape of fields over time; visible labels include WWW and PL]

o Fields are grouped based on the connection proximity

o The size of the font indicates the relative importance (# of incoming citations) of a field


Core-periphery Analysis

[Figure: core-periphery plots; visible labels include ALGO and DM]
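The conclusions of the deck attribute the core-periphery results to k-core analysis. As a hedged illustration of that technique, the sketch below computes core numbers by repeatedly peeling the node of smallest remaining degree. It assumes an undirected graph stored as a dict from node to neighbour set; how the authors built and analysed their network is not shown on the slide, and the tiny graph is invented.

```python
def core_numbers(adjacency):
    """Core number of every node in an undirected graph (node -> set of neighbours)."""
    degree = {node: len(neigh) for node, neigh in adjacency.items()}
    remaining = set(adjacency)
    core = {}
    k = 0
    while remaining:
        # Peel the node of smallest remaining degree; k never decreases.
        node = min(remaining, key=degree.get)
        k = max(k, degree[node])
        core[node] = k
        remaining.remove(node)
        for neighbour in adjacency[node]:
            if neighbour in remaining:
                degree[neighbour] -= 1
    return core


# Hypothetical graph: a triangle (a, b, c) with one pendant node d attached to a.
graph = {
    "a": {"b", "c", "d"},
    "b": {"a", "c"},
    "c": {"a", "b"},
    "d": {"a"},
}
print(core_numbers(graph))
```

Nodes with a high core number sit in the densely connected core, while nodes such as `d` with a low core number belong to the periphery.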


Conclusions

o Quantitative indicators of interdisciplinarity

o Automated scheme to identify interdisciplinary fields

o Evolutionary landscape depicts the cross-hybridization among fields

o K-core analysis shows the steady movement of interdisciplinary fields towards the core

Future Directions:

o Identifying interdisciplinary papers

o Predicting the possible fields to be intermingled next

Acknowledgements

o Financial support: Google India Pvt. Ltd.

o Technical support: Complex Network Research Group (CNeRG), IIT-Kgp

http://cnerg.org/

Conclusion

o Quantitative indicators of interdisciplinarity

o Citation-based indices

  (i) suggest the practice of interdisciplinarity

  (ii) unfold the evolutionary landscape of interdisciplinarity

o A few interdisciplinary fields (Data Mining) come towards the core

o Future work:

  - Identification of interdisciplinary papers
  - Recommendation system to predict future combinations of fields

http://cse.iitkgp.ac.in/~tanmoyc/

http://cnerg.org/