COMS E6125 Web-enHanced Information Management (WHIM)
Prof. Gail Kaiser
Spring 2011
01 March 2011

Today’s Topics:

• What is Web 2.0?

• Information Sharing and Privacy

• Applications Beyond the Web

Tim O’Reilly, September 2005

Netscape vs. Google: The Web As Platform

• Netscape: free web browser as flagship to establish market for high-priced server products that push content to the “webtop” – but servers also turned out to be commodities

• Google: Native web application, never sold or packaged or ported, delivered as a service with no scheduled software releases, massively scalable - core competency is data management

Akamai vs. BitTorrent: Internet Decentralization

• Akamai: Treats network as platform at deeper level of stack, transparent caching and content delivery that eases bandwidth congestion – also limited by business model catering to large providers

• BitTorrent: P2P file fragment downloads, every client is also a server, the service automatically gets better the more people use it - architecture of participation

Harness Collective Intelligence

• Google PageRank using link structure

• eBay enabler of user activity requiring critical mass

• Amazon uses community activity to produce better search results (real-time “most popular” computation)

• Wikipedia – radical experiment in trust, profound change in content creation
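
The idea of ranking pages by link structure can be illustrated with a minimal power-iteration sketch. The four-page link graph, the damping factor of 0.85, and the function name `pagerank` are assumptions chosen for illustration; this is a textbook simplification, not Google's production algorithm.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Power iteration over a dict mapping page -> list of pages it links to."""
    pages = sorted(set(links) | {t for targets in links.values() for t in targets})
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}
    for _ in range(iterations):
        # Every page starts each round with the "random jump" share.
        new_rank = {p: (1.0 - damping) / n for p in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # A page splits its current rank evenly among its out-links.
                share = damping * rank[page] / len(targets)
                for t in targets:
                    new_rank[t] += share
            else:
                # Dangling page (no out-links): spread its rank over all pages.
                for t in pages:
                    new_rank[t] += damping * rank[page] / n
        rank = new_rank
    return rank


# Hypothetical four-page web in which every other page links to "a".
example_links = {"a": ["b"], "b": ["a"], "c": ["a"], "d": ["a", "b"]}
ranks = pagerank(example_links)
```

Pages that attract more in-links from well-ranked pages end up with a higher score, which is the sense in which the links themselves act as collective votes.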

Harness Collective Intelligence

• Web of connections grows organically

• Viral marketing – if a site or product relies on advertising to get the word out, it isn’t Web 2.0

• Peer-production open source development of much web infrastructure – Linux, Apache, MySQL, various Perl, PHP, Python

• Network effects from user contributions are the key to market dominance

Blogosphere

• Blogging vs. personal home pages: replaced the personal diary, daily opinion column, and NNTP (Usenet’s Network News Transfer Protocol); now being supplanted by Facebook and Twitter

• RSS (Really Simple Syndication) allows subscribing to a page – the incremental (or live) web

• Permalink builds bridges between weblogs, affects PageRank search results
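
Subscribing to a page via RSS amounts to polling a feed and keeping only the items not seen before. The sketch below shows this with Python's standard `xml.etree.ElementTree`; the feed content, the example.com URLs, and the helper name `new_items` are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical RSS 2.0 feed; each item's guid doubles as its permalink.
SAMPLE_FEED = """<rss version="2.0">
  <channel>
    <title>Example Weblog</title>
    <link>http://example.com/</link>
    <description>A made-up feed</description>
    <item>
      <title>First post</title>
      <link>http://example.com/2011/03/first-post</link>
      <guid isPermaLink="true">http://example.com/2011/03/first-post</guid>
    </item>
    <item>
      <title>Second post</title>
      <link>http://example.com/2011/03/second-post</link>
      <guid isPermaLink="true">http://example.com/2011/03/second-post</guid>
    </item>
  </channel>
</rss>"""


def new_items(feed_xml, seen_guids):
    """Return (title, permalink) pairs for items whose guid is not yet seen."""
    root = ET.fromstring(feed_xml)
    fresh = []
    for item in root.iter("item"):
        guid = item.findtext("guid")
        if guid not in seen_guids:
            fresh.append((item.findtext("title"), guid))
    return fresh


already_read = {"http://example.com/2011/03/first-post"}
unread = new_items(SAMPLE_FEED, already_read)
```

A real reader would fetch the feed over HTTP on a schedule and persist the set of seen guids between polls; both are omitted here.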

Perpetual Beta

• Software delivered as a service, not a product

• Upgrades every day vs. every 2-3 years

• Operations and monitoring must become core competencies

• Scripting languages as duct tape

• Innovation in assembly

AJAX: Rich User Experiences

• Standards-based presentation using XHTML and CSS

• Dynamic display and interaction using the Document Object Model

• Data Interchange and manipulation using XML and XSLT

• Asynchronous data retrieval using XMLHttpRequest

• JavaScript binding everything together

Infoware

• Database management as core competency

• Specialized databases: web crawl, distributed file databases

• Map databases: starting with Mapquest, many services now license the same data from NavTeq (digital street maps), Digital Globe (satellite images)

• Amazon licensed ISBN registry from Bowker but added publisher-supplied data and user annotations

• Mashups based on lightweight programming model create value-added data

Key issue: Who owns the data?

Information Sharing: Web 1.0

• The original purpose of the Web!

• Generally viewed as an information resource, download without upload

• Websites owned by “someone else” may store your information in a database – usually limited to basic identification (name, address, phone number, credit card) and “preferences”

• Personal websites (e.g., hosted by geocities) might be universally browse-able but in practice visited by few

• Key issue: Who owns the data?

Information Sharing: Web 2.0

• Message boards with user-supplied content

• Portals with user-selected content “portlets”

• Blogs, wikis, news feeds, texting

• Social networking, collaborative filtering

• RIAs (rich internet applications)

• The Web as Platform, widgets, mashups, user-supplied applications

Key issue: Who owns the data?

The Right To Privacy

• Secrecy (confidentiality): The extent to which we are known to others

• Anonymity: The extent to which we are the subject of others’ attention

• Solitude: The extent to which others have access to us

Rights to Sue (wrt Privacy)

• Intrusion upon seclusion or solitude, or into private affairs

• Public disclosure of embarrassing private facts

• Inaccurate reporting: Publicity that places a person in a false light in the public eye

• Appropriation of identity: “identity theft”

A New Yorker cartoon from 1993

But in 2011, your browser (and its add-ons, plugins, etc.) knows:

• You’ve searched for local veterinarians and groomers

• You’ve read reviews comparing flea powders

• You’ve ordered “chew sticks” and “squeaky toys”

• You’ve printed coupons for Alpo

• You’ve downloaded 101 Dalmatians and Lassie “on demand” movies

• Your email contains sales notices from petco.com

• Your “My Pictures” folder contains 100s of images of fire hydrants and frisbees

Web Tracking

• Bits: How Do They Track You?

• Data collection events:
  – Pages displayed
  – Search queries entered
  – Videos played
  – Advertising displayed (both same party and third party)

• In December 2007 alone, Yahoo collected 400 billion events, AOL 100 billion, Google 91 billion, Microsoft 51 billion
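
To make "data collection event" concrete, here is a small sketch of how one page view can generate several events for different collectors. The record layout, the site and ad-network names, and the function `events_for_page_view` are hypothetical and do not describe any real tracker.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class CollectionEvent:
    kind: str       # "page", "search", "video", or "ad"
    collector: str  # hypothetical name of the party recording the event
    detail: str     # e.g. the URL shown or the query entered


def events_for_page_view(site, url, ad_networks):
    """One page view yields a page event plus one ad event per ad shown."""
    events = [CollectionEvent("page", site, url)]
    for network in ad_networks:
        events.append(CollectionEvent("ad", network, url))
    return events


# One page carrying a same-party ad and a third-party ad.
log = events_for_page_view(
    "news.example", "/story/42", ["news.example", "ads.example"]
)
per_collector = Counter(event.collector for event in log)
```

Here a single page view produces three events, two recorded by the site itself and one by the third-party ad network.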

From study by comScore published in NY Times online 3/9/08

Caveats

• Not all of this data is useful

• Not all of it is retained by the companies with access to it

• Much of it cannot be traced back to individuals

• Several data collection events may be triggered by a single Web page

• Does not include user-volunteered data (website registration, social networking)

Why Track?

• Targeted advertising supports “free” services and content (ad serving was the first widely deployed mashup)

• But collected information can be used for other purposes…

• Ad choice (e.g., TACO, Evidon)

Privacy Before and After

• Before the Web, you participated in a variety of activities

• These might have involved groups of people, in public or private, possibly even “the press”

• Photos or recordings might have been taken, with or without your knowledge

• You might have borrowed or purchased books or magazines related to your activities

• You might have sent/received letters by snailmail

• What is different now? Does it matter?

Privacy Before and After

• Before the Web, you might have typed your name, address, phone number, birth date, social security number, bank account numbers, credit card numbers, etc. into your PC for personal storage

• It was unlikely anyone outside your household could access your PC

• Now you type at least part of that information into your PC all the time (if you make online purchases and/or sign up for online services)

• And you have no idea who might be reading it, from either your PC (if connected to the Internet) or the Websites you sent it to

Privacy Before and After

• Your name, phone number, address were always easily available (phone book, reverse listings)

• So was your birth date, although harder to obtain (birth records, driver’s license)

• And your SSN - lots of forms ask for it

• Your checking account and/or credit card numbers were available through the issuing banks and the merchants where you made purchases

• So what is different now? Does it matter?

Web 2.0 Applications for Scientific Communities

• Scientists collaborating together in the same lab on the same project share:
  – Data: specimens, samples, materials, observations, etc.
  – Tools: instruments, software, hardware
  – Knowledge: open discussion, whiteboard

Real-world social networking

• However, there are time and space constraints

• More significantly, this model does not scale well to communities of scientists working on different projects but who could possibly learn from each other’s expertise, experience, etc.

CSCW Approaches

• CSCW (Computer-Supported Collaborative Work) aims to augment same-time/same-place collaboration but, more significantly, different-time/different-place collaborations and communities

• Current generation CSCW systems support data sharing (e.g., PNNL Collaboratories) and/or tool sharing (e.g., UIUC BioCoRE)

• However, these systems do not address knowledge sharing: how/when/where/why to use tools and data

Knowledge Sharing

• Knowledge sharing is partially enabled through labor-intensive static approaches: publications, email lists, wikis, chat, shared display, etc.

• We seek to enable automatic knowledge sharing - without requiring “extra work” on the part of scientists

Social Networking Metaphor

• Some online social networking is a form of CSCW that is potentially enjoyable and profitable but still requires “extra work”, with dynamism limited by explicit user participation
  – Facebook, LinkedIn, Twitter, etc.

• Other social networking automatically records what people do online to aggregate, data mine and disseminate in an enjoyable and profitable fashion, with no “extra work” required - but can be enhanced by very simple user actions (e.g., ratings)
  – Collaborative filtering – “people like you …”
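
A "people like you" recommendation can be sketched as user-based collaborative filtering over recorded activity. The example below weights each other user's items by Jaccard similarity of histories; the users, items, and function names are invented for illustration, and real systems use far richer signals.

```python
def jaccard(a, b):
    """Overlap between two sets, from 0.0 (disjoint) to 1.0 (identical)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def recommend(target, histories, top_n=3):
    """Suggest items that users similar to `target` have but `target` lacks."""
    mine = histories[target]
    scores = {}
    for user, items in histories.items():
        if user == target:
            continue
        weight = jaccard(mine, items)
        for item in items - mine:
            scores[item] = scores.get(item, 0.0) + weight
    # Highest score first; ties broken alphabetically for determinism.
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [item for item, score in ranked[:top_n] if score > 0]


# Hypothetical purchase histories recorded with no "extra work" by the users.
histories = {
    "ann": {"flea powder", "chew sticks", "squeaky toys"},
    "bob": {"flea powder", "chew sticks", "dog bed"},
    "cat": {"cat litter", "dog bed"},
}
suggestions = recommend("ann", histories)
```

Because bob's history overlaps ann's and cat's does not, only bob's extra item is suggested to ann.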

genSpace Overview

• We combine implicit and explicit social networking concepts in our approach to knowledge sharing

• Prototype implemented as a set of plugins for geWorkbench, a platform for analysis and visualization tools for integrated genomics

• Records, aggregates, data mines and disseminates geWorkbench users’ activities with tools and tool sequences (workflows)

Questions genSpace Can Answer

• What do I do next?

• Which tools work well together?

• Where does this tool fit in a typical workflow?

• Who do I know who also uses this tool?

• How can I get help (from an expert who is online right now)?

genSpace Features

• Collaborative Workflow Composition: past history of analysis tool usage is used to identify commonly-occurring sequences/workflows

• Tool Suggestions: suggests analysis tools that may be useful, based on what tools were previously used

• Social Networking: allows users to associate with each other and share knowledge within groups

• Data Suggestions: suggests data sets based upon previous analyses and collaborative filtering (CF)
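
One simple way to picture workflow mining and tool suggestions is to count which tool follows which in logged sessions. This is only an assumed sketch: the slides do not describe genSpace's actual algorithm, and the tool names and functions below are hypothetical.

```python
from collections import Counter, defaultdict


def build_transitions(sessions):
    """Count how often each tool is immediately followed by another tool."""
    transitions = defaultdict(Counter)
    for tools in sessions:
        for current, following in zip(tools, tools[1:]):
            transitions[current][following] += 1
    return transitions


def suggest_next(transitions, tool, top_n=2):
    """Return the tools most often used right after `tool`."""
    return [name for name, _ in transitions[tool].most_common(top_n)]


# Hypothetical logged sessions; each list is one user's ordered tool usage.
sessions = [
    ["filter", "normalize", "t-test"],
    ["filter", "normalize", "clustering"],
    ["normalize", "t-test", "clustering"],
]
transitions = build_transitions(sessions)
after_normalize = suggest_next(transitions, "normalize")
```

Longer frequent sequences (whole workflows rather than pairs) could be found the same way by counting n-grams instead of adjacent pairs.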

genSpace Architecture

Privacy/Confidentiality Concerns

• Users can choose anonymous logging or disable it entirely

• Security/privacy of the activity logs is being investigated (data sets are NOT recorded*)

• Issues when users change their collaborative networks and/or opt-out preferences

• Must we provide privacy by default?
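
The logging choices above (identified, anonymous, or disabled) can be sketched as a per-user mode that controls what goes into each activity record. The mode names, record fields, and function below are assumptions for illustration, not genSpace's actual implementation.

```python
from enum import Enum


class LoggingMode(Enum):
    IDENTIFIED = "identified"
    ANONYMOUS = "anonymous"
    DISABLED = "disabled"


def make_log_record(user, tool, mode):
    """Build an activity record that respects the user's logging preference."""
    if mode is LoggingMode.DISABLED:
        return None  # nothing is recorded at all
    # Only the tool name is logged; data sets are never part of the record.
    record = {"tool": tool}
    if mode is LoggingMode.IDENTIFIED:
        record["user"] = user
    return record


identified = make_log_record("alice", "t-test", LoggingMode.IDENTIFIED)
anonymous = make_log_record("alice", "t-test", LoggingMode.ANONYMOUS)
disabled = make_log_record("alice", "t-test", LoggingMode.DISABLED)
```

Privacy by default would correspond to new users starting in the anonymous or disabled mode and opting in to identified logging.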

Research in the Cloud

• geWorkbench and most other analysis tools are “fat” desktop applications

• Why not create a browser-based client?

More open questions for genSpace

• What other CSCW techniques can help support researchers?

• How can we efficiently address privacy concerns while providing helpful recommendations?

Summary

• genSpace embodies an approach to knowledge sharing that is based on social networking metaphors

• genSpace is built on the geWorkbench platform for integrated genomics

• Potentially applicable to other kinds of scientists and engineers, including software engineers

Next Assignments

• Full paper due Tuesday March 8th

• Project Proposal due Tuesday March 8th
