Presentation from O'Reilly Strata 2012 on Big Data
Humans, Machines, and the Dimensions of Microwork
Daniel Tunkelang (LinkedIn)
Claire Hunsaker (Samasource)
The advent of crowdsourcing has wildly expanded the ways we think of incorporating human judgments into computational workflows. Computer scientists, economists, and sociologists have explored how to effectively and efficiently distribute microwork tasks to crowds and use their work as inputs to create or improve data products. Simultaneously, crowdsourcing providers are exploring the bounds of mechanical QA flows, worker interfaces, and workforce management systems.
But what tasks should be performed by humans rather than algorithms? And what makes a set of human judgments robust? Quantity? Consensus? Quality or trustworthiness of the workers? Moreover, the robustness of judgments depends not only on the workers, but on the task design. Effective crowdsourcing is a cooperative endeavor.
In this talk, we will analyze various dimensions of microwork that characterize applications, tasks, and crowds. Drawing on our experience at companies that have pioneered the use of microwork (Samasource) and data science (LinkedIn), we will offer practical advice to help you design crowdsourcing workflows to meet your data product needs.
Humans, Machines, and the Dimensions of Microwork
Daniel Tunkelang, LinkedIn
Claire Hunsaker, Samasource
What is LinkedIn?
Identity: Connect, find and be found
– LinkedIn Profile, Address Book, Search
– Rolodex, Resume, Business Card
Insights: Be great at what you do
– Homepage, LinkedIn Today, Groups
– Newspapers, Trade Magazines, Events
Everywhere: Work wherever our members work
– Mobile, APIs, Plug-Ins
– Desktop
How do Microworkers help Data Scientists?
Data Collection
Human Judgments as Training Data
Evaluation
Quality Assurance
Foreign University Research Project
Stay Objective
vs.
At What Price Independence?
Independent judgments enable statistical reasoning.
– Can increase accuracy by requiring agreement of independent workers on the same task.
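A rough sketch of the statistical reasoning behind that bullet, under assumptions that are ours and not from the talk: the task is binary, every worker is correct with the same probability `p`, and workers err independently. The function name `majority_accuracy` is hypothetical.

```python
from math import comb


def majority_accuracy(p: float, n: int) -> float:
    """Probability that a strict majority of n workers is correct.

    Assumes a binary task and n independent workers, each correct
    with probability p. n must be odd so that ties cannot occur.
    """
    if n % 2 == 0:
        raise ValueError("use an odd number of workers to avoid ties")
    k_min = n // 2 + 1  # smallest number of correct votes that wins
    return sum(
        comb(n, k) * p**k * (1 - p) ** (n - k)
        for k in range(k_min, n + 1)
    )


# Accuracy of the majority for 1, 3, and 5 workers who are each 80% accurate.
for n in (1, 3, 5):
    print(n, round(majority_accuracy(0.8, n), 4))
```

Under these assumptions, three workers at 80% individual accuracy reach about 89.6% by majority, and five reach about 94.2%. The gain depends entirely on the independence assumption holding.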
But independent workers can’t help each other out.
– No benefit from collaboration = less accurate workers.
– Number of workers becomes a bottleneck, and workers may be incentivized to create fake alter egos.
Add slide of workers helping each other out at delivery center
Market for lemons (Akerlof, 1970)
Keep It Simple
Avoid unnecessary difficulty.
– Provide clear instructions with examples.
– Be transparent about what you’re trying to achieve.
– Check an early sample of the work closely.
– Set expectations on quality and accuracy and manage to those.
Trade-offs between task value and difficulty.
– Easier to select from options than answer open-ended questions.
– Even easier if there are only two options.
– But open-ended questions leverage more intelligence.
Watch out for systematic bias.
– Even independent judges may make the same mistakes.
– Especially if they use the same tools.
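To illustrate why shared mistakes limit what agreement can buy, here is a toy model that is our assumption and not from the talk: on a fraction `shared` of tasks every worker makes the same error (for example, because they rely on the same tool), and on the remaining tasks workers are independent with accuracy `p`. Both function names are hypothetical.

```python
from math import comb


def majority_accuracy(p: float, n: int) -> float:
    """Majority-vote accuracy for n (odd) independent workers, binary task."""
    k_min = n // 2 + 1
    return sum(
        comb(n, k) * p**k * (1 - p) ** (n - k)
        for k in range(k_min, n + 1)
    )


def accuracy_with_shared_errors(p: float, n: int, shared: float) -> float:
    """Majority-vote accuracy when a fraction `shared` of tasks trigger
    the same mistake in every worker (toy model, assumed for illustration).

    On those tasks the majority is always wrong; elsewhere workers are
    independent, so the usual majority-vote accuracy applies.
    """
    return (1 - shared) * majority_accuracy(p, n)


# With 10% shared errors, adding workers cannot push accuracy past 90%.
for n in (1, 5, 101):
    print(n, round(accuracy_with_shared_errors(0.8, n, 0.10), 4))
```

In this model accuracy is capped at `1 - shared` no matter how many workers vote, which is one way to read the warning that independent judges can still make the same mistakes.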
Takeaways
Independent judgments are nice for some tasks, but not always worth the cost.
Keep crowdsourcing tasks as simple as possible.
Manage the trade-off between task value and difficulty.
Watch out for systematic bias.
Questions?
Contact:
Daniel: [email protected]
Claire: [email protected]
We’re (both) hiring (a lot)!
Thank You!