Introduction to Amazon Mechanical Turk
Applications
Demographics and statistics
The value of using MTurk
Repeated labeling
  A machine-learning perspective
Slide 3
Automaton chess player (the original "Mechanical Turk"), built in the late 18th century.
Slide 4
Human Intelligence Task (HIT): a task that is hard for computers
Developer: prepay the money, publish HITs, get results
Worker: complete the HITs, get paid
Slide 5
User Survey
Slide 6
Image Tagging
Slide 7
Data Collection
Slide 8
Audio transcription: split the audio into 30-second pieces
Image filtering: filter pornographic or otherwise inappropriate images
Many other applications
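The audio-splitting step can be sketched as a computation of segment boundaries. This is only an illustration: the function name `segment_bounds` and its parameters are hypothetical, and the actual cutting of the audio file (and publishing each piece as a HIT) is left out.

```python
def segment_bounds(duration_sec, piece_sec=30.0):
    """Return (start, end) pairs in seconds that cover the whole recording.

    Hypothetical helper: each pair would become one transcription HIT.
    The last piece may be shorter than piece_sec.
    """
    duration_sec = float(duration_sec)
    piece_sec = float(piece_sec)
    if duration_sec < 0 or piece_sec <= 0:
        raise ValueError("duration must be >= 0 and piece length must be > 0")

    bounds = []
    start = 0.0
    while start < duration_sec:
        end = min(start + piece_sec, duration_sec)
        bounds.append((start, end))
        start = end
    return bounds


if __name__ == "__main__":
    # A 95-second recording yields three full pieces and one 5-second piece.
    for start, end in segment_bounds(95):
        print(f"{start:6.1f} s to {end:6.1f} s")
```

Short pieces keep each HIT small enough to be priced at a few cents and let many workers transcribe in parallel.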
Slide 9
It depends on the task. Some information:
Payment >= 0.01: 586
Payment >= 0.05: 357
Payment >= 0.10: 264
Payment >= 0.50: 74
Payment >= 1.00: 48
Payment >= 5.00: 5
Slide 10
Slide 11
Survey of 1,000 Turkers
Survey conducted twice (Oct. 2008 and Dec. 2008), with consistent statistics
Blog post: A Computer Scientist in a Business School
Where are Turkers from?
United States: 76.25%
India: 8.03%
United Kingdom: 3.34%
Canada: 2.34%
Slide 12
Degree Age Gender Income/year
Slide 13
Comparison with the general Internet population, using data from ComScore
In summary, Turkers are
  younger: 51% are 21-35 years old vs. 22% of Internet users
  mainly female: 70% female vs. 50% of Internet users
  lower income: 65% of Turkers have income < 60k/year vs. 45% of Internet users
  in smaller families: 55% of Turkers have no children vs. 40% of Internet users
Slide 14
Slide 15
Slide 16
Slide 17
Victor S. Sheng, Foster Provost, and Panagiotis G. Ipeirotis
New York University KDD 2008
Slide 18
Imperfect labeling
  Amazon Mechanical Turk
  Games with a purpose
Repeated labeling
  Improve supervised induction
  Increase the single-label accuracy
  Decrease the cost of acquiring training data
Slide 19
Increase single-label accuracy
Decrease the cost of training data
  Labeling is cheap (using MTurk or GWAP)
  Obtaining a data sample might be expensive (taking new pictures, feature extraction)
Slide 20
How repeated labeling influences
  the quality of the labels
  the accuracy of the model
  the cost of acquiring the data and the labels
Selection of data points to label repeatedly
Slide 21
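Uniform labeler quality
  All labelers exhibit the same quality p
  p is the probability that a labeler labels an example correctly
  For 2N+1 labelers, the label quality q (the probability that the majority label is correct) is
    $q = \sum_{i=0}^{N} \binom{2N+1}{i} \, p^{2N+1-i} (1-p)^{i}$
Label quality for different settings of p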
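The label quality for different settings of p can be tabulated with a short script. This is a minimal sketch that assumes the labels are combined by majority vote and that labelers err independently; the function name `majority_quality` is illustrative, not taken from the paper.

```python
from math import comb


def majority_quality(p, n):
    """Probability that the majority of 2n+1 independent labelers is correct.

    Each labeler is correct with probability p. The sum runs over i, the
    number of wrong labels, which must be at most n for the majority to win.
    """
    total = 2 * n + 1
    return sum(
        comb(total, i) * p ** (total - i) * (1 - p) ** i
        for i in range(n + 1)
    )


if __name__ == "__main__":
    # One row per labeler quality p, one column per number of labelers.
    sizes = [0, 1, 2, 5]  # n, i.e. 1, 3, 5 and 11 labelers
    print("p     " + "  ".join(f"{2 * n + 1:>5d}" for n in sizes))
    for p in (0.4, 0.5, 0.6, 0.8):
        row = "  ".join(f"{majority_quality(p, n):5.3f}" for n in sizes)
        print(f"{p:<5.1f} {row}")
```

With p above 0.5 the quality rises as labelers are added, at p = 0.5 it stays at 0.5, and below 0.5 extra labels make the majority label worse.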
Slide 22
Different labeler quality
  Repeated labeling is helpful in some cases
  An example: three labelers with quality p, p+d, p-d
  Repeated labeling is preferable to a single labeler with quality p+d when the setting (p, d) falls in the blue region
  No detailed analysis in the paper
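The three-labeler example can be checked numerically. The sketch below assumes independent labelers combined by majority vote and enumerates every correct/incorrect outcome; the function names are hypothetical and the region plotted on the slide is not reproduced here.

```python
from itertools import product


def majority_quality_hetero(qualities):
    """Exact probability that a strict majority of independent labelers is correct.

    qualities: per-labeler probabilities of labeling correctly.
    """
    if not all(0.0 <= q <= 1.0 for q in qualities):
        raise ValueError("each quality must lie in [0, 1]")

    n = len(qualities)
    total = 0.0
    # Enumerate every pattern of correct (True) / wrong (False) labelers.
    for outcome in product([True, False], repeat=n):
        if 2 * sum(outcome) > n:  # strict majority is correct
            prob = 1.0
            for correct, q in zip(outcome, qualities):
                prob *= q if correct else 1.0 - q
            total += prob
    return total


def repeated_beats_best(p, d):
    """True if majority vote over {p, p+d, p-d} beats the best labeler alone."""
    return majority_quality_hetero([p, p + d, p - d]) > p + d


if __name__ == "__main__":
    for p, d in [(0.8, 0.05), (0.6, 0.3)]:
        q = majority_quality_hetero([p, p + d, p - d])
        print(f"p={p}, d={d}: majority vote {q:.4f} vs. best labeler {p + d:.2f}")
```

When the labelers are close in quality (small d), the vote beats the best single labeler; when one labeler is far better than the others, using that labeler alone wins.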
Slide 23
Majority voting (MV)
  Simple and intuitive
  Drawback: information is lost
Uncertainty-preserving labeling
  Multiplied Examples procedure (ME)
  Uses the label frequency as the weight of the label
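The two ways of integrating multiple labels can be contrasted in a few lines. This is a sketch under assumptions: the ME weights are taken to be the relative frequency of each label among the labels collected for an example, and the function names and the tuple layout are illustrative only.

```python
from collections import Counter


def majority_vote(labels):
    """Collapse a list of labels into the single most frequent one.

    On a tie, Counter.most_common returns the label that was seen first.
    """
    return Counter(labels).most_common(1)[0][0]


def multiplied_examples(features, labels):
    """Keep the uncertainty: one weighted copy of the example per distinct label.

    Assumption: the weight is the label's relative frequency, so the
    weights of all copies of one example sum to 1.
    """
    counts = Counter(labels)
    n = len(labels)
    return [(features, label, count / n) for label, count in counts.items()]


if __name__ == "__main__":
    labels = ["+", "+", "-"]
    print(majority_vote(labels))               # single hard label
    print(multiplied_examples("x1", labels))   # weighted copies of x1
```

Majority voting discards the fact that one labeler disagreed, while the weighted copies pass that disagreement on to any learner that accepts example weights.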
Slide 24
Round-robin strategy
  Label the example with the fewest labels
  Repeatedly label the examples in a fixed order
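Both round-robin variants fit in a few lines. This sketch reads the two bullets as separate selection rules, which is an assumption; the function names and the dictionary layout (example id mapped to the labels collected so far) are hypothetical.

```python
from itertools import cycle, islice


def next_example_fewest_labels(label_sets):
    """Pick the example that currently has the fewest labels.

    label_sets: dict mapping example id -> list of labels collected so far.
    Ties go to the example that was inserted first.
    """
    return min(label_sets, key=lambda ex: len(label_sets[ex]))


def fixed_order_schedule(example_ids, num_labels):
    """Cycle through the examples in a fixed order for num_labels requests."""
    return list(islice(cycle(example_ids), num_labels))


if __name__ == "__main__":
    collected = {"a": ["+", "-"], "b": ["+"], "c": ["-"]}
    print(next_example_fewest_labels(collected))
    print(fixed_order_schedule(["a", "b", "c"], 5))
```

Neither rule looks at the labels themselves, so both spread the labeling budget evenly across the examples.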
Slide 25
Definition of the cost
  C_U: the cost of acquiring the unlabeled portion of an example
  C_L: the cost of labeling
Single labeling (SL): acquiring a new training example costs C_U + C_L
Repeated labeling with majority vote (MV): getting another label for an existing example costs C_L
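The cost definitions translate into simple arithmetic. This is a sketch with made-up cost values and hypothetical function names, showing how the same budget can buy either many singly labeled examples or fewer examples with several labels each.

```python
def single_labeling_cost(num_examples, c_u, c_l):
    """Total cost when every example is acquired once and labeled once."""
    return num_examples * (c_u + c_l)


def repeated_labeling_cost(num_examples, labels_per_example, c_u, c_l):
    """Total cost when every example is acquired once and labeled several times.

    The first label costs C_U + C_L together with the acquisition;
    each further label on the same example costs only C_L.
    """
    return num_examples * (c_u + labels_per_example * c_l)


if __name__ == "__main__":
    c_u, c_l = 3, 1  # illustrative values only: acquiring data costs 3x a label
    print(single_labeling_cost(100, c_u, c_l))       # 100 examples, 1 label each
    print(repeated_labeling_cost(50, 5, c_u, c_l))   # 50 examples, 5 labels each
```

With these illustrative values both plans cost the same, so the choice between them depends on whether 50 cleaner labels or 100 noisier ones train the better model.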