Trust-Aware Optimal Crowdsourcing With Budget Constraint

Xiangyang Liu 1, He He 2, and John S. Baras 1

1 Institute for Systems Research and Department of Electrical and Computer Engineering, University of Maryland, College Park, MD
2 Department of Computer Science, University of Maryland, College Park, MD




Motivation

• The requester has a budget constraint.
• Workers on the AMT platform have varying reliability. Some are even malicious.
• Each task incurs a certain cost. Difficult tasks are more expensive and easy tasks are cheaper.
• Workers with higher reliability should expect to receive higher pay, and workers with lower reliability are cheaper to recruit to answer questions.

Goal: optimally assign tasks to workers with varying trust and reliability under a budget constraint.


Problem Setting

[Diagram: the Crowdsourcing Assignment Engine distributes tasks to Amazon Turkers, who range from malicious workers through more reliable workers to pure experts. Components: Trust Evaluation and True Label Inference. Objective: minimize estimation error.]


Problem Formulation

[Diagram: the Crowdsourcing Assignment Engine distributes tasks to Turkers.]

Objective: minimize estimation error

• The estimate for question i depends on the choice of estimation algorithm.
• Each question i has a truth value.
• w are the estimated trust values of workers, given by an independent component introduced in the previous subproblem. They are assumed to be fixed and serve as input to the assignment engine.


Trust-Aware Budget Allocation

Applying the Probably Approximately Correct (PAC) framework from learning theory, we relax the previous nondeterministic optimization problem into a convex optimization problem:

Proof:

Let the right-hand side equal


Proof continued:

With probability , the following holds:

We express as:

If , question i is always estimated correctly. Otherwise we get the wrong answer with probability



Proof continued: Therefore, we obtain the upper bound on the error rate, which we are going to minimize:

The last inequality holds since the w's take values in [0, 1]. Therefore, we relax the optimization problem to minimizing the new upper bound:



Trust-Aware Budget Allocation

Applying the Probably Approximately Correct (PAC) framework from learning theory, we get a relaxed problem:

Intuition behind the solution: when the budget is insufficient, assign it to the most efficient workers. The most efficient worker is the one with the highest ratio of squared reliability to cost.
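The intuition above can be illustrated with a minimal sketch for a single question. It is an assumption-laden illustration, not the slides' analytical solution: the function name `allocate_to_most_efficient`, the per-crowd lists `trust` and `cost`, and the single-question setting are all hypothetical. It picks the crowd with the highest squared-reliability-to-cost ratio and buys as many answers from it as the budget allows, using a floor to obtain an integer count.

```python
import math


def allocate_to_most_efficient(trust, cost, budget):
    """Return how many answers to buy from each crowd for one question.

    trust  -- estimated trust values in [0, 1], one per crowd (hypothetical input)
    cost   -- positive cost of one answer from each crowd (hypothetical input)
    budget -- total budget available for this question
    """
    if budget < 0:
        raise ValueError("budget must be non-negative")
    if len(trust) != len(cost) or not trust:
        raise ValueError("trust and cost must be non-empty and of equal length")
    if any(c <= 0 for c in cost):
        raise ValueError("costs must be positive")

    # Efficiency of a crowd: squared reliability divided by cost.
    efficiency = [w * w / c for w, c in zip(trust, cost)]
    best = max(range(len(trust)), key=lambda j: efficiency[j])

    # Spend the whole budget on the most efficient crowd; the floor
    # turns the fractional allocation into an integer number of answers.
    counts = [0] * len(trust)
    counts[best] = math.floor(budget / cost[best])
    return counts


# Efficiencies are 0.36, 0.32 and 0.1805, so the first crowd is chosen.
print(allocate_to_most_efficient([0.6, 0.8, 0.95], [1.0, 2.0, 5.0], 11))  # [11, 0, 0]
```

Because only one crowd ever receives budget, the result is sparse, and the floor can leave part of the budget unspent when the chosen crowd's cost does not divide it evenly.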

Analytical solution


Trust-Aware Budget Allocation With Penalty

Intuition: when the budget is high, we want to allocate budget to expensive (more trustworthy) workers instead of only the most efficient ones. The penalty term pushes the allocation toward a uniform strategy, which serves exactly this purpose.


Theoretical Guarantees

• We provide an upper bound on the error probability of the trust-aware budget allocation scheme given budget B, i.e.,

The error bound above has the following characteristics:
• It decreases exponentially with budget B.
• It decreases when cost is lower.
• It decreases when workers are more reliable.


Theoretical Guarantees

• Proof: Assuming weighted majority vote, the labels contributed by workers are aggregated by

The Hoeffding concentration bound gives us:

Plugging in the optimal solution, we obtain the bound directly.
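The proof steps above can be illustrated numerically with a minimal sketch. It rests on assumptions that are not in the slides: labels are +1 or -1, worker j answers correctly and independently with probability p_j (the `accuracies` list), and votes are weighted by w_j (the `weights` list). With S the weighted sum of votes signed so that +1 means correct, the vote is wrong only if S <= 0, and the standard Hoeffding inequality gives P(S <= 0) <= exp(-mu^2 / (2 * sum_j w_j^2)), where mu = sum_j w_j (2 p_j - 1) > 0. All function names are hypothetical.

```python
import math
import random


def weighted_majority_vote(labels, weights):
    """Aggregate +1/-1 labels by weighted sum; ties resolve to +1 (arbitrary)."""
    score = sum(w * l for w, l in zip(weights, labels))
    return 1 if score >= 0 else -1


def hoeffding_error_bound(weights, accuracies):
    """Hoeffding upper bound on the probability that the weighted vote is wrong."""
    # Expected margin of the weighted sum in favour of the true label.
    margin = sum(w * (2 * p - 1) for w, p in zip(weights, accuracies))
    if margin <= 0:
        return 1.0  # the bound is vacuous without a positive margin
    # Each term w_j * X_j lies in [-w_j, w_j], a range of width 2 * w_j.
    return math.exp(-margin ** 2 / (2 * sum(w * w for w in weights)))


def simulate_error_rate(weights, accuracies, trials=20000, seed=0):
    """Monte Carlo estimate of the weighted-vote error rate (true label fixed to +1)."""
    rng = random.Random(seed)
    truth = 1
    errors = 0
    for _ in range(trials):
        labels = [truth if rng.random() < p else -truth for p in accuracies]
        if weighted_majority_vote(labels, weights) != truth:
            errors += 1
    return errors / trials


weights = [0.9, 0.8, 0.6]
accuracies = [0.9, 0.8, 0.6]
print(simulate_error_rate(weights, accuracies), hoeffding_error_bound(weights, accuracies))
```

The simulated error rate should sit below the bound. The bound is loose for a handful of workers and tightens as more weighted answers are purchased.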


Theoretical Guarantees

• If , with probability at least , the total error probability satisfies:


Experiment Benchmarks

UA: the algorithm tends to allocate the same number of people from each available crowd to answer a question. If the budget is not used up, it randomly chooses an expert from the set of crowds for each question.

CQSA: for each question, the algorithm only chooses people from the most trustworthy crowd to assign

If the budget is not fully consumed, it iterates over the question set again and randomly chooses an expert from the set of crowds for each question.

CA: the algorithm only chooses the cheapest crowd (the least trustworthy crowd) for questions according to


Experiment Results

• TAAP performs best among all benchmarks across the whole budget range.
• When the budget is small (<200), the improvement of TAA and TAAP over CQSA and UA is up to 30%.
• TAA performs poorly when the budget is high, due to the floor function used during assignment and the sparsity of the optimal solution of the relaxed problem (only workers from the most efficient group are chosen). This is fixed by TAAP.

TAA: Trust-aware allocation.

TAAP: Trust-aware allocation with penalty.


Experiment Results

• In this experiment, we do not assume that the trust values of workers can be perfectly estimated. We add Gaussian noise to the true trust values of workers and run experiments with varying noise levels (increasing mean of the Gaussian noise).

• TAAP is affected to some extent when the noise is high. However, when the budget is high, the allocation scheme is very robust against all noise levels.
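The perturbation described above can be sketched as follows. This is a hypothetical illustration, not the experiment's actual code: the function name `perturb_trust`, the default standard deviation, and the clipping of noisy values back into [0, 1] are assumptions made so that the result remains a valid trust value.

```python
import random


def perturb_trust(true_trust, noise_mean, noise_std=0.1, seed=0):
    """Add Gaussian noise to each true trust value and clip to [0, 1].

    noise_mean -- mean of the Gaussian noise (the noise level being varied)
    noise_std  -- standard deviation of the noise (assumed, not from the slides)
    """
    rng = random.Random(seed)
    noisy = []
    for w in true_trust:
        value = w + rng.gauss(noise_mean, noise_std)
        # Clipping is an assumption: it keeps the estimate a valid trust value.
        noisy.append(min(1.0, max(0.0, value)))
    return noisy


true_trust = [0.3, 0.6, 0.9]
for level in (0.0, 0.1, 0.2):
    print(level, perturb_trust(true_trust, noise_mean=level))
```

Feeding the noisy values to the allocation step while scoring against the true values is one way to reproduce a robustness study of this kind.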


Conclusions

• We formalize the problem of trust-aware task allocation in crowdsourcing and provide a principled way to solve it.

• We model the workers' trustworthiness as reliability, and the cost depends on both the workers' trust and the questions' difficulty. Our method is flexible in that more complicated aggregation methods than weighted majority vote can be plugged in.

• We provide a theoretical guarantee for our trust-aware allocation scheme.


Future Work

• The theoretical guarantee for the trust-aware algorithm when the weight (trust) cannot be perfectly estimated has not yet been addressed. It would be interesting to investigate this.

• Currently, the trust estimate is assumed to be fixed and given by another component. We will consider the case where trust is dynamically updated and crowdsourcing assignment is done online instead of offline.


Thank you