
Page 1: The Dynamics of Micro-Task Crowdsourcing

The Dynamics of Micro-TaskCrowdsourcing

The Case of Amazon MTurk

Djellel Eddine Difallah, Michele Catasta, Gianluca Demartini, Panos Ipeirotis, Philippe Cudré-Mauroux

WWW’15 - 20th May 2015 - Florence

1

Page 2: The Dynamics of Micro-Task Crowdsourcing

Background

Crowdsourcing is an effective solution to certain classes of problems

2

Page 3: The Dynamics of Micro-Task Crowdsourcing

Background

A crowdsourcing platform allows requesters to publish a crowdsourcing request (batch) composed of multiple tasks (HITs)

Requesters can programmatically invoke the crowd via APIs

3
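Programmatic invocation can be sketched as follows. This is a minimal Python sketch in the style of the MTurk `CreateHIT` operation; the title, reward, and question values are hypothetical placeholders, and the actual client call appears only in a comment.

```python
# Sketch: assembling the parameters for publishing one HIT programmatically.
# Parameter names follow the MTurk CreateHIT operation; values are illustrative.

def hit_request(title, reward_usd, max_assignments, question_xml):
    """Assemble the parameter dict for a CreateHIT call."""
    return {
        "Title": title,
        "Reward": f"{reward_usd:.2f}",        # reward is a dollar-amount string
        "MaxAssignments": max_assignments,    # workers per HIT (repetitions)
        "AssignmentDurationInSeconds": 600,   # time a worker gets per task
        "LifetimeInSeconds": 86400,           # how long the HIT stays visible
        "Question": question_xml,             # QuestionForm XML payload
    }

params = hit_request("Label an image", 0.05, 3, "<QuestionForm>...</QuestionForm>")
# With boto3 this would be: boto3.client("mturk").create_hit(**params)
print(params["Reward"])  # 0.05
```

A whole batch is then just a loop publishing many such HITs.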

Page 4: The Dynamics of Micro-Task Crowdsourcing

Background

Paid micro-task crowdsourcing scales out but remains highly unpredictable

4

Page 5: The Dynamics of Micro-Task Crowdsourcing

Background

Paid micro-task crowdsourcing scales out but remains highly unpredictable

5

[Figure: batch throughput (#HITs/minute) over time]

Page 6: The Dynamics of Micro-Task Crowdsourcing

SLAs are expensive

6

Page 7: The Dynamics of Micro-Task Crowdsourcing

MTurk is a Marketplace for HITs

Direct factors: price, time of day, #workers, #HITs, etc.

Other: forums, reputation systems (TurkOpticon), recommendation systems (Openturk)

7

Page 8: The Dynamics of Micro-Task Crowdsourcing

A Data Driven Approach

8

Page 9: The Dynamics of Micro-Task Crowdsourcing

9

Page 10: The Dynamics of Micro-Task Crowdsourcing

...Five Years Later [2009–2014]

mturk-tracker collected 2.5 million distinct batches

with over 130 million HITs

10

Page 11: The Dynamics of Micro-Task Crowdsourcing

mturk-tracker.com

● Collects metadata about each visible batch (title, description, rewards, required qualifications, HITs available, etc.)

● Records batch progress (every ~20 minutes)

Note that the tracker reports data only periodically and does not capture fine-grained information (e.g., real-time variations)

11
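Given snapshots taken every ~20 minutes, batch throughput (#HITs/minute) can be estimated by differencing consecutive snapshots. A minimal sketch, assuming each snapshot is an illustrative (minutes, HITs-available) pair:

```python
def throughput(snapshots):
    """HITs completed per minute between consecutive tracker snapshots."""
    rates = []
    for (t0, h0), (t1, h1) in zip(snapshots, snapshots[1:]):
        completed = max(h0 - h1, 0)  # a requester may also add HITs; clamp at 0
        rates.append(completed / (t1 - t0))
    return rates

# Three snapshots, ~20 minutes apart
print(throughput([(0, 1000), (20, 900), (40, 860)]))  # [5.0, 2.0]
```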

Page 12: The Dynamics of Micro-Task Crowdsourcing

Menu

1. Notable Facts Extracted from the Data

2. Large-scale HIT Type Classification

3. Analyzing the Features Affecting Batch Throughput

4. Market Analysis

12

Page 13: The Dynamics of Micro-Task Crowdsourcing

1) Notable Facts Extracted from the Data

13

Page 14: The Dynamics of Micro-Task Crowdsourcing

Country-Specific HITs

14

US and India?

Page 15: The Dynamics of Micro-Task Crowdsourcing

Country-Specific HITs

Workers from the US, India, and Canada are the most sought after.

15

Page 16: The Dynamics of Micro-Task Crowdsourcing

Distribution of Batch Size

16

“Power-law”

Page 17: The Dynamics of Micro-Task Crowdsourcing

Evolution of Batch Sizes

Very large batches start to appear

17

Page 18: The Dynamics of Micro-Task Crowdsourcing

HIT Pricing

18

Is 1-cent per HIT the norm?

Page 19: The Dynamics of Micro-Task Crowdsourcing

HIT Pricing

19

5 cents is the new 1 cent

Page 20: The Dynamics of Micro-Task Crowdsourcing

Requesters and Reward Evolution

20

Increasing number of new and distinct requesters

Page 21: The Dynamics of Micro-Task Crowdsourcing

2) Large-scale HIT Type Classification

21

Page 22: The Dynamics of Micro-Task Crowdsourcing

HIT Classes

We classify HITs into types (Gadiraju et al. 2014):
- Information Finding (IF)
- Verification and Validation (VV)
- Interpretation and Analysis (IA)
- Content Creation (CC)
- Surveys (SU)
- Content Access (CA)

22

Page 23: The Dynamics of Micro-Task Crowdsourcing

Supervised Classification with the Crowd

We trained a Support Vector Machine (SVM) model

- Features: HIT title, description, keywords, reward, date, allocated time, and batch size

- Created labeled data on MTurk for 5,000 uniformly sampled HITs
- Our labeling HIT used 3 repetitions
- Consensus was reached for 89% of the tasks

- 10-fold cross validation
- Precision of 0.895
- Recall of 0.899
- F-measure of 0.895

- We then performed a large-scale classification of all 2.5M HITs

23
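The 3-repetition labeling step implies a majority vote over worker answers. A minimal sketch of that consensus rule (the 2-out-of-3 threshold is our reading of the slide, not stated explicitly):

```python
from collections import Counter

def consensus(labels, threshold=2):
    """Return the majority label, or None when no label reaches the threshold."""
    label, count = Counter(labels).most_common(1)[0]
    return label if count >= threshold else None

print(consensus(["CC", "CC", "SU"]))  # CC   (2 of 3 workers agree)
print(consensus(["CC", "SU", "IF"]))  # None (no consensus)
```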

Page 24: The Dynamics of Micro-Task Crowdsourcing

Distribution of HIT Types

Fewer Content Access batches

Content Creation is the most popular type

24

Page 25: The Dynamics of Micro-Task Crowdsourcing

3) Analyzing the Features Affecting Batch Throughput

25

[Figure: batch throughput (#HITs/minute) over time]

Page 26: The Dynamics of Micro-Task Crowdsourcing

Batch Throughput Prediction

29 Features

HIT Features

HITs available, Start time, Reward, Description length, Title length, Keywords, Requester ID, Time allotted, Task type, Age (minutes), etc.

Market Features

Total HITs available, HITs arrived, Rewards arrived, % of HITs completed, etc.

26

Page 27: The Dynamics of Micro-Task Crowdsourcing

Batch Throughput Prediction

[Diagram: training samples drawn from the time window [T - delta, T)]

- Predict batch throughput at time T by training a Random Forest regression model on samples taken in the [T - delta, T) time span
- 29 features (including the type of the batch)
- Hourly data in the range June–October 2014
- We sampled 50 time points for evaluation purposes

27
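The sliding training window can be sketched as a simple filter over timestamped samples (the field name `t` and the sample shape are illustrative):

```python
def training_window(samples, T, delta):
    """Keep the hourly samples whose timestamp falls in [T - delta, T)."""
    return [s for s in samples if T - delta <= s["t"] < T]

samples = [{"t": t, "throughput": t * 2.0} for t in range(10)]
print(len(training_window(samples, T=8, delta=4)))  # 4 (samples at t = 4, 5, 6, 7)
```

The regressor is then fit on this window and queried at time T.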

Page 28: The Dynamics of Micro-Task Crowdsourcing

Batch Throughput Prediction

We are interested in cases where the prediction works reasonably well

28

Page 29: The Dynamics of Micro-Task Crowdsourcing

Predicted vs. Actual Batch Throughput (delta=4 hours)

Prediction works best for larger batches with large momentum

29

Page 30: The Dynamics of Micro-Task Crowdsourcing

Significant Features

- Which features contribute most when the prediction works well?

- We proceed by feature ablation

- Re-run the prediction, removing 1 feature at a time

- 1000 samples

30
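The ablation loop can be sketched as follows; `evaluate` stands in for re-training and scoring the model (an assumed callback returning a prediction-error score), and the toy weights below are purely illustrative:

```python
def ablation_impact(features, evaluate):
    """Error increase caused by removing each feature, one at a time."""
    baseline = evaluate(features)
    return {f: evaluate([g for g in features if g != f]) - baseline
            for f in features}

# Toy error function: dropping an informative feature raises the error.
weights = {"hits_available": 0.5, "age_minutes": 0.3, "title_length": 0.05}
def toy_error(feats):
    return 1.0 - sum(weights[f] for f in feats)

impact = ablation_impact(list(weights), toy_error)
print(max(impact, key=impact.get))  # hits_available
```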

Page 31: The Dynamics of Micro-Task Crowdsourcing

Significant Features


HITs_Available (Number of tasks in the batch)

Age_Minutes (how long ago the batch was created)

31

Page 32: The Dynamics of Micro-Task Crowdsourcing

4) Market Analysis

32

Demand: the number of new tasks published on the platform by requesters

Supply: the workforce that the crowd provides

Page 33: The Dynamics of Micro-Task Crowdsourcing

Supply Elasticity

How does the market react when new tasks arrive on the platform?

33

Page 34: The Dynamics of Micro-Task Crowdsourcing

Supply Elasticity

We regressed the percentage of work done (within 1 hour) against the number of new HITs

34

Page 35: The Dynamics of Micro-Task Crowdsourcing

Supply Elasticity

Intercept = 2.5, Slope = 0.5%

20% of new work gets completed within an hour

35
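The regression behind these numbers is a one-variable least-squares fit. A self-contained sketch; the synthetic points below are constructed to reproduce the slide's intercept (2.5) and slope, and the x-scaling is our illustrative choice, not stated on the slide:

```python
def ols(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    slope = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
             / sum((x - mx) ** 2 for x in xs))
    return my - slope * mx, slope

# Synthetic (new HITs, % work done in 1h) points lying on y = 2.5 + 0.005 * x
xs, ys = [0, 100, 200, 300], [2.5, 3.0, 3.5, 4.0]
print(ols(xs, ys))  # (2.5, 0.005)
```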


Page 37: The Dynamics of Micro-Task Crowdsourcing

Demand and Supply Periodicity

[Figures: Demand (left), Supply (right)]

37

Page 38: The Dynamics of Micro-Task Crowdsourcing

Demand and Supply Periodicity

Strong weekly periodicity (7–10 days)

38
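One simple way to surface such periodicity is the lag of maximum autocorrelation. A minimal sketch in pure Python, with a synthetic 7-day signal standing in for the tracker series:

```python
import math

def autocorr_peak(series, max_lag):
    """Lag in [1, max_lag] with the highest autocorrelation."""
    n = len(series)
    mean = sum(series) / n
    dev = [v - mean for v in series]
    var = sum(d * d for d in dev)

    def ac(lag):
        return sum(dev[i] * dev[i + lag] for i in range(n - lag)) / var

    return max(range(1, max_lag + 1), key=ac)

# 70 daily samples of a signal with a 7-day cycle
daily = [math.sin(2 * math.pi * d / 7) for d in range(70)]
print(autocorr_peak(daily, 10))  # 7
```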

Page 39: The Dynamics of Micro-Task Crowdsourcing

Conclusions

- Long-term data analysis uncovers hidden trends

- Large-scale HIT classification

- Important features in throughput prediction (HITs available, Age_minutes)

- Supply is elastic (more work available -> more work done)

- Supply and demand are periodic (7–10 days)

39

Page 40: The Dynamics of Micro-Task Crowdsourcing

Is a Crowdsourcing Marketplace the right paradigm for efficient and predictable crowdsourcing?

40

Page 41: The Dynamics of Micro-Task Crowdsourcing


41

Q&A

Djellel Difallah

[email protected]