How A Million People Could Save the Planet:
The Next Research Agenda for Collaborative Computing
2012 Brazilian Symposium on Collaborative Systems
David W. McDonald
[email protected]
University of Washington, The Information School
October 18, 2012
The Shifting Paradigm in Collaborative Computing
• There is a set of interesting problems at the intersection of computing and how people use computing
• The issues at the intersection can change as participation scales up to larger numbers
• Insight gained from studying the intersection can fundamentally change computing as a purposely designed and built artifact
• Theory and methods originating from one single perspective (computational, behavioral, or social) are insufficient to fully interpret the happenings in the intersection
Talk Outline
• Introduction
• The Shifting Paradigm in Collaborative Computing
• Insights from Prior Research
  – Expertise Locating
  – Proactive Displays
  – Lifestyle Behavior Change
• Patterns of Behavioral Observations in Wikipedia
  – Collective Behavioral Observation
  – Machine Learning Experiments
  – Candidate Patterns
  – Validation and Limitations
• Social Computational Systems Research Agenda
  – High-Level Research Challenges
  – Research Openings
• Research Questions
  – How do people find necessary expertise?
  – How can we build systems to support natural expertise locating behavior?
• Methods
  – Qualitative, 9-month ethnographic field study
  – Grounded Theory analysis
  – System building/design (wrote code)
  – Quantitative evaluation of locating and matching heuristics
Expertise Locating
• Findings
  – Locating process – Identification, Selection, Escalation
  – Identification – Work products and byproducts can be used to generate recommendations of individuals with ‘localized’ expertise
  – Selection – Social networks for contextualizing social recommendation are only partially effective
  – Pluggable software architecture (ERArch) to allow extension and addition of Identification and Selection techniques
Expertise Locating
Proactive Displays
• Design Goals/Questions
  – Enhance the feeling of community among conference attendees.
  – Mesh with common social practices at the conference.
  – Manage the privacy concerns of all participants.
• Methods
  – System building/design
  – Field trial (deployment) at an academic conference
  – Observation, ad-hoc interviews, post-conference survey
Auto Speaker ID
Ticket 2 Talk
Neighborhood Window
Proactive Displays
• Findings
  – Proactive Display as an Open Region – an area where people of different status are socially allowed to interact (Goffman, Behavior in Public Places, 1963)
  – Shared Interactions – You don’t really “interact” with a “proactive” system
  – Design Implication for Public Displays – Context(s), Content, Control
Auto Speaker ID
Ticket 2 Talk
Neighborhood Window
Lifestyle Behavior Change - UbiFit
• Research Question
  – How can technology help people move from the behaviors that define the lifestyle they have to a new lifestyle they want?
• Methods
  – System building/design
  – Field trial – 3 weeks
  – Field experiment – 3 months
  – Interviews, surveys, activity data
  – Analysis
• Presentation of Self (Goffman)
• Cognitive Dissonance Theory (Festinger)
• Transtheoretical Model of Behavior Change (Prochaska et al.)
Lifestyle Behavior Change - UbiFit
• Findings
  – Traditional models of validation for inference systems are problematic when deployed in the real world
  – Theories being used for UbiComp fitness/health applications are somewhat problematic (TTM)
  – Awareness of behavior through a personal ambient display can overcome avoidance
  – Fitness behavior patterns are not very regular (exceptions are the rule)
Talk Outline
• Introduction
• The Shifting Paradigm in Collaborative Computing
• Insights from Prior Research
  – Expertise Locating
  – Proactive Displays
  – Lifestyle Behavior Change
• Patterns of Behavioral Observations in Wikipedia
  – Collective Behavioral Observation
  – Machine Learning Experiments
  – Candidate Patterns
  – Validation and Limitations
• Social Computational Systems Research Agenda
  – High-Level Research Challenges
  – Research Openings
Collective Behavioral Observations
• People make behavioral observations
  – Everyday social/behavioral science
  – Motivating example – Driving
Collective Behavioral Observations
• People make behavioral observations
  – Everyday social/behavioral science
  – Motivating example – Driving
• Online Communities
  – Observations are attenuated
  – Leverage the power of the crowd, many people
• Wikipedia Behavioral Observations
  – Barnstars
Observational Patterns
• Can we identify patterns of user activity through non-specialist observations?
• Possible problems …
  – Pro-social recognition (piling on)
  – Singular activity – popular
  – Singular activity – extraordinary efforts
Generate Train & Test Sets
• Previous work (became Train Set)
  – Mined Nov. 2006 Wikipedia data dump
  – Over 14K unique barnstars, ~4,900 recipients
  – Created coding scheme, 7 top-level categories
  – 3 coders, ~2,126 barnstars
• Additional Coding (new Test Set)
  – Random selection, cleaning
  – 2 coders, ~478 barnstars
Train & Test Set Distributions
Dimension of Observed Activity          Train Codes   Train %   Test Codes   Test %
Editing Work                                    852      27.8          180     29.1
Social and Community Support Action             763      24.9          150     24.2
Border Patrol                                   342      11.2           81     13.1
Administrative                                  284       9.3           54      8.7
Collaborative Action and Disposition            244       8.0           41      6.6
Meta-Content Work                               128       4.2           23      3.7
Undifferentiated Work                           447      14.6           90     14.5
Classification Experiments
• General Multi-label Classification Approaches
  – Problem Transformation (PT)
  – Algorithm Adaptation
• Features
  – n-gram, barnstar name, barnstar image name, policy named, policy linked, link to a page, link to a specific edit, …
• What worked reasonably well
  – PT1 – Independent binary classification
  – PT4 – Classifier for every set of applied labels
  – AA – MLkNN, multi-label version of k Nearest Neighbors
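The PT1 transformation above can be sketched in a few lines: it trains one independent binary classifier per label and applies all of them to each item. The tiny corpus and the keyword-overlap "classifier" below are hypothetical stand-ins for the real n-gram/link features and learners (logistic regression, Naïve Bayes, random forests, kNN) used in the experiments.

```python
# Sketch of PT1 (binary relevance): one independent binary classifier
# per label. Corpus and classifier are hypothetical illustrations, not
# the features or learners actually used in the talk's experiments.

LABELS = ["Editing", "Border Patrol", "Administrative"]

# Toy labeled barnstar messages: (text, set of applied labels).
train = [
    ("tireless copyedit work on articles", {"Editing"}),
    ("great job reverting vandalism", {"Border Patrol"}),
    ("thanks for closing deletion discussions", {"Administrative"}),
    ("warning vandals and protecting pages", {"Border Patrol"}),
]

def train_binary(label):
    """Collect words seen in positive training examples for one label."""
    vocab = set()
    for text, labels in train:
        if label in labels:
            vocab.update(text.split())
    return vocab

models = {label: train_binary(label) for label in LABELS}

def predict(text):
    """Apply each per-label classifier independently (the PT1 idea)."""
    words = set(text.split())
    return {label for label, vocab in models.items() if words & vocab}

print(predict("reverting vandalism again"))  # → {'Border Patrol'}
```

Because each label gets its own classifier, an item can receive several labels at once, which matches the multi-label nature of barnstar messages.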
PT1 – Results (AUC)
Dimension of Activity     Logistic Regression   Naïve Bayes   Random Forest (1k Trees)   KNN (k=10)
Administrative                          0.833         0.949                      0.942        0.903
Border Patrol                           0.922         0.941                      0.952        0.956
Collaborative Action                    0.750         0.722                      0.743        0.725
Editing                                 0.878         0.875                      0.879        0.884
Meta-Content                            0.835         0.842                      0.883        0.800
Social and Community                    0.802         0.796                      0.797        0.805
Undifferentiated Work                   0.847         0.848                      0.844        0.854
Avg. AUC                                0.838         0.853                      0.862        0.847
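The AUC values in the table have a simple interpretation: the probability that a randomly chosen positive example is scored above a randomly chosen negative one. The scores below are hypothetical, just to show the computation:

```python
# ROC AUC via the Mann-Whitney formulation: the fraction of
# positive/negative pairs that the classifier ranks correctly
# (ties count as half). Example scores are hypothetical.

def auc(scores, labels):
    """AUC for binary labels (1 = positive, 0 = negative)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

scores = [0.9, 0.8, 0.4, 0.3, 0.2]
labels = [1,   1,   0,   1,   0]
print(auc(scores, labels))  # 5 of 6 pairs ranked correctly ≈ 0.833
```

This rank-based view also explains why AUC is a reasonable choice here despite the unbalanced label distribution noted later in the talk: it does not depend on a classification threshold.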
Identifying Candidates
• Select Barnstar Recipients
  – Recipients with 9 or more barnstars
  – 259 candidates, 4,327 barnstars
• Applied the Random Forest
  – Label the received barnstars
• Candidate Recipients
  – Predominant observed activity if the same label applied to more than half
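The candidate rule above reduces to a majority test over a recipient's labeled barnstars: a predominant activity exists only if one label covers more than half of them. A minimal sketch, with a hypothetical recipient:

```python
# Sketch of the candidate rule: a recipient's predominant observed
# activity is the label applied to more than half of their barnstars.
# The sample recipient below is hypothetical.
from collections import Counter

def predominant_activity(labels, threshold=0.5):
    """Return the majority label, or None if no label exceeds threshold."""
    if not labels:
        return None
    label, count = Counter(labels).most_common(1)[0]
    return label if count / len(labels) > threshold else None

# A hypothetical recipient with 9 labeled barnstars.
received = ["E", "E", "E", "E", "E", "B", "S", "E", "A"]
print(predominant_activity(received))  # → 'E' (6 of 9 labels)
```

Requiring strictly more than half guarantees at most one predominant label per recipient, so each of the 259 candidates maps to either one activity dimension or none.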
Candidates
Dimension of Activity    Label   Avg %   Candidates
Editing                      E    67.9           25
Border Patrol                B    73.2           13
Social and Community         S    61.5           54
Administrative               A    66.4           75
Collaborative Actions        C    52.0            1
Meta-Content                 M    76.8            4
Undifferentiated Work        U    60.0           10
Total                                           182
Review Candidates & Labels
• Random selection of pattern candidates
  – 39 of the 182 (21.4%), yielding 544 of 4,327 barnstars (12.6%)
• Validation
  – Possible duplicates, possible non-barnstars
  – Mislabeled applications
Reviewing Patterns
• Independence of the Observations
  – Seem relatively independent
  – No evidence of barnstars awarded to the same recipient for the exact same event
• Limitations
  – Skew in what the community “values” and in the numbers (a challenge for ML validation – unbalanced data)
• Future work
  – Link candidates and patterns to the actual edits
Working at the Intersection
• Contribution to Computing
  – Naturalistic datasets open interesting problems for ML algorithms
    • Massive datasets probably require application of ML techniques
  – Approaches for handling short text, incremental contributions
  – Unbalanced data
McDonald, D. W., S. Javanmardi and M. Zachry (2011) Finding Patterns in Behavioral Observations by Automatically Labeling Forms of Wikiwork in Barnstars. Proceedings of the 7th International Symposium on Wikis and Open Collaboration (WikiSym'11).
Sajnani, H., S. Javanmardi, D. W. McDonald and C. Lopes. (2011) Multi-Label Classification of Short Text: A Study on Wikipedia Barnstars. Presented at the “Analyzing Microtext” Workshop at the Twenty-Fifth Conference on Artificial Intelligence (AAAI-11).
Talk Outline
• Introduction
• The Shifting Paradigm in Collaborative Computing
• Insights from Prior Research
  – Expertise Locating
  – Proactive Displays
  – Lifestyle Behavior Change
• Patterns of Behavioral Observations in Wikipedia
  – Collective Behavioral Observation
  – Machine Learning Experiments
  – Candidate Patterns
  – Validation and Limitations
• Social Computational Systems Research Agenda
  – High-Level Research Challenges
  – Research Openings
The Shifting Paradigm in Collaborative Computing
• There is a set of interesting problems at the intersection of computing and how people use computing
• The issues at the intersection can change as participation scales up to larger numbers
• Insight gained from studying the intersection can fundamentally change computing as a purposely designed artifact
• Theory and methods originating from one single perspective (computational, behavioral, or social) are insufficient to fully interpret the happenings in the intersection
Social Computational Systems Research Agenda
• Defining SoCS
  – A Social Computational System (SoCS) interleaves machine activity and human activity to solve problems that neither machine nor human can solve alone.
• Properties of SoCS
  – Allow people to do what people do best
  – Allow machines to do what machines do best
  – Solve unique problems that interleave both
  – Could be a 1-with-1 system (one person with one machine)
  – Perhaps scaling of SoCS could solve more difficult problems
SoCS: Research Openings, Collaborative Substrate
Collaborative Substrate/Infrastructure
• Software Engineering
  – Architectures for effective interleaving
  – Toolkits to support new system development
• Languages
  – Support massive parallelization between people/machine
  – Expressive asynchrony
  – Support task decomposition & recomposition among people/machine
• Data Management
  – Data provenance
  – Who generated the data? (Person or machine?)
  – How does data (quality) change over time?
• Psychological
  – Motivations, incentives to make contributions
  – Promote high-quality contributions
  – Skill development and individual growth
• Interaction, Social
  – Support for prosocial or congenial interaction
  – Leveraging or minimizing conflict
  – Effective support for meta conversations about the system/tasks
  – Provide meaningful feedback on the work, tasks, contributions
SoCS: Research Openings, Human/Social
• Intelligent Systems
  – Understanding error rates of machines and people
  – Patterns across very large numbers of contributions
  – Patterns in very small contributions
• Data Mining
  – Effective use of user contributions
  – Working to minimize multiple collections
SoCS: Research Openings, Computational
SoCS: Research Openings, Interface
• Usability
  – Simplify making a contribution
  – Identifying tasks or places where contributions are needed
  – Administration tasks
• Visualizations
  – Understand, interpret the who, what, where of contributions
  – Where are groups of people, clusters of work?
  – Where are there gaps?
Social Computational Systems Research Agenda
• Three High-Level Challenges for SoCS
  – Methodological Challenge
    Effectively use existing methods to study the intersection and, where those methods fail, develop new methods to address the intersection.
  – Human Trait or Technical Quality (Trait/Quality) Challenge
    Understand the shifting influences of human traits and technical qualities across scales to accommodate shifting levels of participation in SoCS – potentially increasing or decreasing.
  – Design Challenge
    Communicate SoCS design principles so that the broader community of system builders and industry can readily utilize them.
Promising Domains
• Leverage human skills, insight, intuitions
• Leverage the ability of machines to model, calculate, aggregate, visualize
Promising Domains
• Leverage human skills, insight, intuitions
• Leverage the ability of machines to model, calculate, aggregate, visualize
• Social Computational Systems as Applications
  – Cognitive support – memory, understanding, comprehension
  – Social support – facilitate interactions with others, cross-cultural
  – Educational – interleave people and machines for teaching as well as learning
  – Government – grow participation in decision making
  – Work/Labor – enable new forms of work, potentially new economies
Promising Domains
• Leverage human skills, insight, intuitions
• Leverage the ability of machines to model, calculate, aggregate, visualize
• Social Computational Systems for Grand Challenges
  – Global warming
  – Preserve cultural knowledge from extinction
  – Sustainable economic development
  – Health and wellness