Assortative Mixing in the Amazon.com Book Reviewer Network

CSE 5810 Complex Networks

Ben Collingsworth

April 21, 2010

Department of Computer Sciences, Florida Institute of Technology

email:[email protected]



Proposal

Demonstrate assortativity in the Amazon.com book reviewer network:

Are people balanced in the materials they read, or do they tend to stay within a range of their biases and inclinations?


Related Work

Identifying the role that individual animals play in their social network[5]:

• Examines assortativity in a community of 62 dolphins.

• Vertices represent dolphins.

• An edge exists between two dolphins if the association between the pair is higher than expected by chance.

Attribute    Assortativity
Gender       0.346
Age          0.148
Degree       0


Related Work

Complex network study of Brazilian soccer players[11]:

• Bipartite network containing soccer club and soccer player vertices taken from the set of clubs and players that participated in the Brazilian soccer championship during the period from 1971 to 2002.

• An edge is created between a club and a player if the player has been employed by the club.

• The network is found to be assortative with a value of 0.12.

• Assortativity rises over time from 0.02 in 1975, to 0.12 in 2002.

• Rise attributed to a growing segregationist pattern, in which preferential transfers of players between teams occur.


Related Work

Statistical Analysis of Network Data: Methods and Models[4]:

• Analysis of assortativity in Internet2 backbone (Abilene).

• Vertices consist of network components (aggregation points, connectors, exchanges, and participants).

• Edge exists between vertices if physically connected.

• Assortativity based on node type attribute.

• High negative assortativity found with value of -0.3162.

• Negative assortativity expected from hierarchical network.


The Book Reviewer Network

• Books are represented by vertices.

• An edge exists between two books if they are reviewed by the same reviewer.

• Edges are undirected.

• Multiple edges between two vertices are recorded as the edge weight.
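The construction described on this slide can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `(reviewer, book)` pair format and the function name `build_book_network` are hypothetical and do not come from the original Java/MySQL implementation.

```python
from collections import defaultdict
from itertools import combinations


def build_book_network(reviews):
    """Return {(book_a, book_b): weight} for an undirected co-review network.

    `reviews` is an iterable of (reviewer_id, book_id) pairs. Both names are
    hypothetical stand-ins for whatever the crawler stored in the database.
    """
    books_by_reviewer = defaultdict(set)
    for reviewer, book in reviews:
        books_by_reviewer[reviewer].add(book)

    weights = defaultdict(int)
    for books in books_by_reviewer.values():
        # Every pair of books sharing this reviewer gains one unit of weight.
        # Sorting makes (a, b) and (b, a) map to the same undirected key.
        for a, b in combinations(sorted(books), 2):
            weights[(a, b)] += 1
    return dict(weights)


# Made-up example data: two reviewers with overlapping reading lists.
reviews = [
    ("r1", "book_A"), ("r1", "book_B"),
    ("r2", "book_A"), ("r2", "book_B"), ("r2", "book_C"),
]
network = build_book_network(reviews)
```

Here `book_A` and `book_B` share two reviewers, so their single undirected edge carries weight 2, while the other two pairs carry weight 1.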


Tools Used to Create Network

• Java 2 Standard Edition (J2SE) Java development and runtime environment

• Eclipse Integrated Development Environment (IDE)

• MySQL Relational Database Management System

• BioLayout Express 3D

• Network Workbench


Network Data Collection

Web Crawler used to collect data:

• The Web Crawler was run in two phases to collect data in two disparate categories.

• The two categories for the book collection were "George W. Bush" and "Barack Obama".

• Five starting books were used to begin each phase of the data collection.

• The first phase was started using five books describing George W. Bush. These books describe Bush's background, beliefs, and accomplishments in a positive manner.

• In the second phase, five books describing Barack Obama were used. Similarly, these books describe Obama positively.


Data Collection Algorithm

while number of books is less than maximum
    for each book in book URL list
        extract book information
        if the book already exists in the database and the category of the book is different in the database
            change the category to "Common"
        otherwise
            store book information in database
        extract reviewer URLs for book
        for each reviewer
            extract review information and store in database
            extract URLs of other books read by reviewer
            add book URLs to next level book URL list
        end for
    end for
    assign next level book URL list to book URL list
end while
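The loop above might look roughly like the following Python sketch. It is not the original Java crawler: `site` is an in-memory stand-in for Amazon pages, `db` is a plain dictionary instead of MySQL, and all function and parameter names are hypothetical. The defaults of 12 reviews per book and 10 books per reviewer are the limits stated on the backup slide. The `visited` set is an addition so that the sketch terminates on a small, finite example.

```python
def crawl(seed_books, category, site, db, max_books,
          max_reviews=12, max_books_per_reviewer=10):
    """Level-by-level collection for one phase ("Bush" or "Obama").

    site["reviewers_of"][book] -> reviewers listed for that book
    site["books_by"][reviewer] -> books that reviewer has reviewed
    db["books"]   : dict mapping book -> category
    db["reviews"] : set of (reviewer, book) pairs
    """
    visited = set()  # assumption: avoid reprocessing a book within a phase
    current_level = list(seed_books)
    while current_level and len(db["books"]) < max_books:
        next_level = []
        for book in current_level:
            if book in visited:
                continue
            visited.add(book)
            known = db["books"].get(book)
            if known is not None and known != category:
                db["books"][book] = "Common"  # reached from both phases
            elif known is None:
                db["books"][book] = category
            for reviewer in site["reviewers_of"].get(book, [])[:max_reviews]:
                db["reviews"].add((reviewer, book))
                others = site["books_by"].get(reviewer, [])
                next_level.extend(others[:max_books_per_reviewer])
        current_level = next_level
    return db


# Made-up data: book "x" is reachable from both seed books.
site = {
    "reviewers_of": {"bush1": ["r1"], "obama1": ["r2"]},
    "books_by": {"r1": ["bush1", "x"], "r2": ["obama1", "x"]},
}
db = {"books": {}, "reviews": set()}
crawl(["bush1"], "Bush", site, db, max_books=100)
crawl(["obama1"], "Obama", site, db, max_books=100)
```

After both phases, `x` is relabeled "Common" because it was first stored during the "Bush" phase and then encountered again during the "Obama" phase.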


Book Reviewer Networks Nodes and Edges


Level 3 Network Visualization


Level 1 Network Visualization


Book Reviewer Network Level 3 - Degree Probability Distribution


Book Reviewer Network Level 3 - LOG10 Degree Probability


Book Reviewer Network Property Comparison


Book Reviewer Network Assortativity

Assortativity calculated using the Newman equation:
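The equation itself did not survive in this transcript. Assuming the slide used the coefficient for discrete vertex attributes from "Mixing patterns in networks" [9], which fits categorical book labels such as "Bush", "Obama", and "Common", it would read as follows. Here e_ij is the fraction of edges joining a vertex of type i to a vertex of type j, and a_i and b_j are the row and column sums of e.

```latex
r = \frac{\sum_i e_{ii} - \sum_i a_i b_i}{1 - \sum_i a_i b_i},
\qquad
a_i = \sum_j e_{ij},
\qquad
b_j = \sum_i e_{ij}
```

For an undirected network e is symmetric, so a_i = b_i. A value of r = 1 indicates perfect assortative mixing, r = 0 indicates no assortative mixing, and negative values indicate disassortative mixing.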


Conclusions

• The book reviewer networks were demonstrated to be assortative.

• The book reviewer networks remained assortative even as books and reviewers became further removed from the original "seed" books through additional iterations of the collection algorithm.

• The assortativity shown in the book reviewer networks reveals a heterogeneity in the types of books people read.

• People tend to read a range of books that match their personal biases and inclinations.


Future Work

• Further exploration into degradation of book reviewer network as levels deepen.

• Analysis of network with increase in number of “seed” books.

• Investigation into validity of assortativity calculation for unbalanced networks.


Bibliography

1. B. Collingsworth and R. Menezes. Identification of social tension in organizational networks. In Complex Networks, pages 209-223. Springer Berlin / Heidelberg, May 2009.

2. R. Dawkins. The God Delusion. Houghton Mifflin Harcourt, 2006.

3. H. Ebel, L.-I. Mielsch, and S. Bornholdt. Scale-free topology of e-mail networks. Phys. Rev. E, 66(3):035103(1-4), Sep 2002.

4. E. D. Kolaczyk. Statistical Analysis of Network Data: Methods and Models (Springer Series in Statistics). Springer, 1st edition, 2009.

5. D. Lusseau and M. E. J. Newman. Identifying the role that individual animals play in their social network. Proc. R. Soc. London B, 271:S477, 2004.

6. J. Matlis. Internet2. Computerworld, August 2006. Available for download at http://www.computerworld.com/s/article/9002735/Internet2.

7. M. McPherson, L. S. Lovin, and J. M. Cook. Birds of a feather: Homophily in social networks. Annual Review of Sociology, 27(1):415-444, 2001.

8. M. E. Newman. The structure of scientific collaboration networks. Proc Natl Acad Sci U S A, 98(2):404-409, January 2001.

9. M. E. J. Newman. Mixing patterns in networks. Phys. Rev. E, 67(2):026126, Feb 2003.

10. M. E. J. Newman. The structure and function of complex networks. SIAM Review, 45:167-256, 2003.

11. R. N. Onody and P. A. de Castro. Complex network study of Brazilian soccer players. Phys Rev E Stat Nonlin Soft Matter Phys, 70(3 Pt 2):037103, 2004.


“Bush” connected to “Obama”

• Reviews of books are sorted by "Most Helpful Customer Reviews".

• The next level book URL list, i.e. other books read by the reviewer, is sorted by "Most Recently Reviewed".

• Not every review of a book in the book URL list is saved; at most 12 reviews per book are kept.

• Not every book reviewed by a reviewer is used in the next level book URL list; at most 10 books are kept.

• A reviewer is saved in the list of reviews of a book in the "Bush" phase:

• In the "Obama" phase, a book reviewed by the same reviewer is seen that was not seen in the "Bush" phase.

• Hence, books from the "Bush" phase that were recorded from the reviewer will have edges to books in the "Obama" phase that were reviewed by the same reviewer.

• Edges from "Bush" books to "Obama" books are limited to a few hundred in the level 3 network.

• These edges are included in the assortativity calculation.
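To illustrate how cross-category edges enter the calculation, here is a minimal Python sketch of Newman's coefficient for a discrete attribute on a weighted, undirected network. It assumes the categorical form of the coefficient from [9]. The function name, the edge format, and the toy data are hypothetical and are not the study's data or code.

```python
from collections import defaultdict


def attribute_assortativity(edges, category):
    """Newman's r for a discrete attribute on an undirected weighted network.

    edges    : iterable of (book_a, book_b, weight)
    category : dict mapping book -> category label
    Undefined (division by zero) if every edge end has the same category.
    """
    e = defaultdict(float)  # e[(i, j)]: edge weight running from type i to type j
    total = 0.0
    for a, b, w in edges:
        i, j = category[a], category[b]
        # Count each undirected edge once in each direction so e is symmetric.
        e[(i, j)] += w
        e[(j, i)] += w
        total += 2 * w
    labels = {i for i, _ in e}
    trace = sum(e.get((i, i), 0.0) for i in labels) / total
    a_i = {i: sum(e.get((i, j), 0.0) for j in labels) / total for i in labels}
    sum_ab = sum(x * x for x in a_i.values())  # a_i == b_i when e is symmetric
    return (trace - sum_ab) / (1 - sum_ab)


# Made-up toy network: two within-category edges and one cross-category edge.
category = {"b1": "Bush", "b2": "Bush", "o1": "Obama", "o2": "Obama"}
within = [("b1", "b2", 3), ("o1", "o2", 3)]
r_without_cross = attribute_assortativity(within, category)
r_with_cross = attribute_assortativity(within + [("b1", "o1", 1)], category)
```

In this toy case the network is perfectly assortative (r = 1) without the cross edge, and adding a single "Bush" to "Obama" edge of weight 1 lowers r to 5/7. Cross-category edges therefore pull the coefficient down in proportion to their share of the total edge weight.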