Labor Marketplace Applications: Human Computer Interaction KAIST KSE Uichin Lee 9/26/2011


Soylent: A Word Processor with a Crowd Inside

Michael Bernstein, Greg Little, Rob Miller, David Karger, David Crowell, Katrina Panovich (MIT CSAIL); Björn Hartmann (UC Berkeley); Mark Ackerman (University of Michigan). UIST 2010.

Some of the slides are from http://projects.csail.mit.edu/soylent/

Soylent: a word processing interface

• Uses crowd contributions to aid complex writing tasks.

Challenges for Programming Crowds

• The authors have interacted with ~9000 Turkers on ~2000 different tasks

• Key problem: crowd workers often produce poor output on open-ended tasks
– High variance of effort: Lazy Turker vs. Eager Beaver
– Errors made by Turkers; e.g., shortening “Of Mice and Men” to “Of Mice”, or calling it a movie
• 30% Rule: roughly 30% of the results from open-ended tasks will be unsatisfactory
• Solution: the Find-Fix-Verify pattern

Find-Fix-Verify stages (example parameters): Find – e.g., given to 10 Turkers; a patch is kept when at least 20% of the Turkers agree on it. Fix – e.g., given to 5 Turkers. Verify – e.g., given to 5 Turkers.

Find-Fix-Verify Discussion

• Why split Find and Fix?
– Forces Lazy Turkers (who tend to choose the easiest work) to work on a problem of our choice
– Allows us to merge work completed in parallel
• Why add Verify?
– Quality rises when we place Turkers in productive tension
– Allows us to trade off lag time against quality
• Timeout at each stage keeps operations responsive (e.g., 10-20 minutes); a sketch of the pattern follows below
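A minimal sketch of the Find-Fix-Verify pipeline, assuming a hypothetical post_task() helper that publishes a HIT to a crowd platform (e.g., via TurKit or the Mechanical Turk API) and returns worker responses; the worker counts and the 20% agreement threshold follow the example parameters above, and this is not the authors' actual implementation.

from collections import Counter

AGREEMENT = 0.20  # keep a patch when at least 20% of Find workers flag it

def post_task(prompt, n_workers):
    """Hypothetical helper: publish a HIT to n_workers and return their answers."""
    raise NotImplementedError

def find_fix_verify(paragraph, n_find=10, n_fix=5, n_verify=5):
    # Find: workers independently mark phrases that need work
    # (each Find answer is assumed to be a list of flagged phrases)
    finds = post_task("Identify at least one phrase to shorten:\n" + paragraph, n_find)
    votes = Counter(patch for answer in finds for patch in answer)
    patches = [p for p, v in votes.items() if v / n_find >= AGREEMENT]

    results = []
    for patch in patches:
        # Fix: a second group of workers proposes rewrites of each agreed-upon patch
        fixes = post_task("Rewrite this phrase to be shorter:\n" + patch, n_fix)
        kept = []
        for fix in fixes:
            # Verify: a third group votes out rewrites that change meaning or introduce errors
            # (each ballot is assumed to be a "yes"/"no" string)
            ballots = post_task("Does this rewrite preserve the meaning? (yes/no)\n" + patch + " -> " + fix, n_verify)
            if ballots.count("yes") > n_verify // 2:
                kept.append(fix)
        results.append((patch, kept))
    return results

Splitting Find from Fix also makes the "merge work completed in parallel" point concrete: independent Find votes are aggregated before any Fix HIT is posted.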

Evaluation

• Implementation
– Microsoft Word plug-in built with MS Visual Studio Tools for Office (VSTO) and Windows Presentation Foundation (WPF)
– TurKit Mechanical Turk toolkit
• Evaluation: Shortn, Crowdproof, Human Macro

• Metrics: quality, delay, cost

Shortn Evaluation

• Setting:
– Find: 6-10 workers ($0.08 per Find)
– Fix/Verify: 3-5 workers ($0.05 per Fix, $0.04 per Verify)
– Delay: wait time, work time

• Results:
– Paragraphs shortened to 78%-90% of their original length
– Wait time across all stages: median 18.5 minutes
– Work time: median 118 seconds
– Average cost: $1.41 per paragraph ($0.55 + $0.48 + $0.38)
– Caveat: some modifications are grammatically appropriate but stylistically incorrect

Crowdproof Evaluation

• Tested 5 input texts
– Manually labeled all spelling, grammatical, and style errors in each of the five inputs (49 errors total)
• Ran Crowdproof with a 20-minute stage timeout and measured how many corrections Crowdproof made
– Task setup: Find: $0.06, Fix: $0.08, Verify: $0.04
• Soylent caught 67% of the errors vs. 30% for the MS Word grammar checker
– Combined: 82%

Human Macro Evaluation

• Gave 5 prompts (Input and Output) to users (CS students, an administrator, an author)
• Asked them to generate descriptions (Request) to be used for Human Macro tasks

Discussion

• Delay in interface outsourcing: minutes to hours
• Privacy? Confidential documents?
• Legal ownership?
• Workers may lack the domain knowledge or shared context needed to usefully contribute

VizWiz: Nearly Real-time Answers to Visual Questions

Jeffrey P. Bigham, Chandrika Jayant, Hanjie Ji, Greg Little, Andrew Miller, Robert C. Miller, Robin Miller, Aubrey Tatarowicz, Brandyn White, Samuel White, and Tom Yeh. UIST 2010.

Some of the slides are from: http://husk.eecs.berkeley.edu/courses/cs298-52-sp11/images/8/8d/Vizwiz_soylent.pdf

VizWiz

• “automatic and human-powered services to answer general visual questions for people with visual impairments.”

• Lets blind people use a mobile phone to (see the sketch below):
1. Take a photo
2. Speak a question
3. Receive multiple spoken answers
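An illustrative sketch of that client-side flow, assuming hypothetical endpoints (UPLOAD_URL, ANSWERS_URL) and device helpers (capture_photo, record_question, speak); it is not the actual VizWiz implementation.

import time
import requests  # assumed HTTP client; the endpoints below are hypothetical

UPLOAD_URL = "https://example.org/vizwiz/questions"
ANSWERS_URL = "https://example.org/vizwiz/questions/{qid}/answers"

def ask_vizwiz(capture_photo, record_question, speak, poll_interval=5, max_wait=300):
    photo = capture_photo()        # 1. take a photo
    question = record_question()   # 2. speak a question (recorded audio)
    qid = requests.post(UPLOAD_URL, files={"photo": photo, "question": question}).json()["id"]

    spoken = set()
    deadline = time.time() + max_wait
    while time.time() < deadline:  # 3. poll for answers and speak each new one aloud
        for answer in requests.get(ANSWERS_URL.format(qid=qid)).json():
            if answer["id"] not in spoken:
                speak(answer["text"])
                spoken.add(answer["id"])
        time.sleep(poll_interval)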

Motivation

• Current technology uses automatic approaches to help blind people access visual information
– Optical Character Recognition (OCR); e.g., Kurzweil knfbReader: ~$1,000
• Problems: error-prone, limited in scope, expensive
– Ex: OCR cannot read graphic labels, handwritten menus, or street signs

Motivation

• Solution: ask real people
• Questions can be phrased naturally
– “What is the price of the cheapest salad?” vs. OCR reading the entire menu
• Feedback
– Real people can guide blind people to take better photos
• Focus on blind people’s needs, not on current technology

Human-Powered Services

• Problem: response latency
• Solution: quikTurkit (and some tricks)
– “First attempt to get work done by web based workers in nearly real-time”
– Maintain a pool of workers ready to answer questions

quikTurkit

• “requesters create their own web site on which Mechanical Turk workers answer questions.”

• “answers are posted directly to the requester’s web site, which allows [them] ... to be returned before an entire HIT is complete.”

• “workers are required to answer multiple previously-asked questions to keep workers around long enough to possibly answer new questions”
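A minimal sketch of the pooling idea behind quikTurkit, assuming hypothetical helpers post_hits() (publishes HITs that point workers at the answering site) and recycled_questions() (returns previously asked questions); this illustrates the strategy only and is not the actual quikTurkit implementation.

import queue
import time

class WorkerPool:
    """Keep Turkers busy on old questions so new questions get near real-time answers."""

    def __init__(self, post_hits, recycled_questions, target_workers=5):
        self.post_hits = post_hits
        self.recycled_questions = recycled_questions
        self.target_workers = target_workers
        self.pending = queue.Queue()   # new questions waiting for answers
        self.answers = {}              # question id -> answers received so far

    def maintain_pool(self):
        # Periodically post enough HITs to keep roughly target_workers active on the answering site.
        while True:
            self.post_hits(self.target_workers)
            time.sleep(30)

    def next_question_for(self, worker_id):
        # Serve a fresh question if one is waiting; otherwise recycle an old one to keep the worker around.
        try:
            return self.pending.get_nowait()
        except queue.Empty:
            return self.recycled_questions()

    def submit_answer(self, question_id, text):
        # Record answers immediately so they can be returned before the whole HIT completes.
        self.answers.setdefault(question_id, []).append(text)

maintain_pool() would run in the background while the answering site calls next_question_for() and submit_answer() for each worker request.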

Answering Site

Deployment

• Setting: 11 blind iPhone users
• quikTurkit:
– Reward range: $0.01 (half of the jobs), $0.02-$0.03 (the other half)
• Results: 86.6% of first answers were “correct”
– Average latency of 133.3 seconds for the first answer

Problems: Photos too dark or too blurry and thus unanswerable.

VizWiz 2.0 detects and alerts users if photo is too dark or blurry
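One common way to flag dark or blurry photos, sketched here with OpenCV; the thresholds are arbitrary and the paper does not say that VizWiz 2.0 uses exactly this method.

import cv2  # assumed: OpenCV Python bindings

def photo_problems(path, dark_thresh=40, blur_thresh=100.0):
    """Return warnings if a photo is likely too dark or too blurry to be answerable."""
    gray = cv2.imread(path, cv2.IMREAD_GRAYSCALE)
    warnings = []
    if gray.mean() < dark_thresh:  # low average brightness -> probably too dark
        warnings.append("too dark")
    if cv2.Laplacian(gray, cv2.CV_64F).var() < blur_thresh:  # low edge energy -> probably blurry
        warnings.append("too blurry")
    return warnings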

VizWiz: LocateIt

• Combine VizWiz with computer vision to help blind people locate objects

LocateIt mobile: sensor (zoom and filter) and sonification modules; web interface. Sonification cues convey the angle and distance to the target object.
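An assumed illustration of how a sonification module could encode those two cues (angle to stereo pan, distance to beep rate); the paper's actual audio mapping may differ.

import math

def sonification_cue(angle_deg, distance_m, max_distance_m=10.0):
    """Map the angle and distance to the target onto simple audio parameters."""
    # Angle -> stereo pan: -90 deg (left) .. +90 deg (right) mapped to [-1, 1]
    pan = max(-1.0, min(1.0, angle_deg / 90.0))
    # Distance -> beep rate: closer targets beep faster
    closeness = 1.0 - min(distance_m, max_distance_m) / max_distance_m
    beeps_per_second = 1.0 + 7.0 * closeness
    # Pitch rises slightly as the phone points more directly at the target
    pitch_hz = 440.0 * (1.0 + 0.5 * math.cos(math.radians(angle_deg)))
    return {"pan": pan, "beeps_per_second": beeps_per_second, "pitch_hz": pitch_hz}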

Future Work

• Expand worker pools beyond Mechanical Turk (e.g., social networks)

• Reduce cost by using games, volunteers, and friends

• Improve the interface to make photo-taking easier for blind people

• Combine with automatic approaches to reduce delay

Discussion

• Resource usage vs. delay (is it acceptable to waste resources for better responses or near real-time service?)
– Any better approach than quikTurkit?
• Quality control? How do we make sure that workers correctly identified the photos?
– How should the system accept/reject submissions if it becomes a large-scale service?
• Other application scenarios for quikTurkit?
• Adding inertial sensors to LocateIt?
