Game Theoretical Modeling and Studies of Peer-Reviewing Methods

Marius C. Silaghi

Florida Tech CS Seminar, Fall 2012

• Game Theory Concepts

• Peer-Reviewing Background

• Model Peer-Reviewing as Game

• Theoretical analysis of PR Games

• Experimental analysis of PR Games

Peer-Reviewing Games

Game Theory

It is not so much about computer games.

Game Theory

It is about understanding motivations (utilities $$$).

Fundamentals of Game Theory

Understanding what will happen in a given situation

Typical example: “Prisoner’s Dilemma”

Golden Balls Trust

Golden Balls strategies

Each can rat out the other or remain silent, resulting in 4 possible outcomes

Perfectly rational players

only care for their own felicity (utility)

Payoff matrix (each cell: utility of x, utility of y)

              y defects    y cooperates
x defects     -5, -5       0, -20
x cooperates  -20, 0       -1, -1
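As a minimal sketch of "understanding what will happen", the payoff matrix above can be searched for pure-strategy Nash equilibria, i.e. outcomes where neither player gains by changing only their own action. The helper name `pure_nash_equilibria` is an assumption made for this illustration.

```python
from itertools import product

ACTIONS = ("defect", "cooperate")

# (x's action, y's action) -> (utility of x, utility of y), copied from the slide.
PAYOFF = {
    ("defect", "defect"): (-5, -5),
    ("defect", "cooperate"): (0, -20),
    ("cooperate", "defect"): (-20, 0),
    ("cooperate", "cooperate"): (-1, -1),
}

def pure_nash_equilibria(payoff, actions):
    """Return all action pairs where no player gains by deviating alone."""
    equilibria = []
    for ax, ay in product(actions, repeat=2):
        ux, uy = payoff[(ax, ay)]
        # x keeps ay fixed and tries every alternative action.
        x_ok = all(payoff[(alt, ay)][0] <= ux for alt in actions)
        # y keeps ax fixed and tries every alternative action.
        y_ok = all(payoff[(ax, alt)][1] <= uy for alt in actions)
        if x_ok and y_ok:
            equilibria.append((ax, ay))
    return equilibria

print(pure_nash_equilibria(PAYOFF, ACTIONS))  # [('defect', 'defect')]
```

Mutual defection is the only equilibrium, even though mutual cooperation would leave both players better off.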

Mechanism design is about selecting the right payoffs to encourage a “social choice” function: W. Vickrey received the Nobel Prize in 1996; his 2nd Price Auctions encourage “truthful bidding”.

Iterated Prisoner’s Dilemma

The game repeats every 20 years for 1000 years

Strategies:

Tit for Tat

Forgiving Tit for Tat

Optimistic Tit for Tat
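A minimal sketch of the iterated game, assuming the usual definition of Tit for Tat (cooperate first, then copy the opponent's previous move) and reusing the Prisoner's Dilemma payoffs from the earlier slide. The `always_defect` baseline and the function names are assumptions for illustration; the forgiving and optimistic variants are not modeled.

```python
# (x's move, y's move) -> (utility of x, utility of y); C = cooperate, D = defect.
PAYOFF = {
    ("D", "D"): (-5, -5),
    ("D", "C"): (0, -20),
    ("C", "D"): (-20, 0),
    ("C", "C"): (-1, -1),
}

def tit_for_tat(opponent_history):
    """Cooperate first, then repeat the opponent's previous move."""
    return opponent_history[-1] if opponent_history else "C"

def always_defect(opponent_history):
    """Baseline strategy (not on the slide): defect in every round."""
    return "D"

def play(strategy_x, strategy_y, rounds):
    """Play the iterated game and return the total utilities (x, y)."""
    moves_x, moves_y = [], []
    total_x = total_y = 0
    for _ in range(rounds):
        move_x = strategy_x(moves_y)  # each player only sees the other's past moves
        move_y = strategy_y(moves_x)
        ux, uy = PAYOFF[(move_x, move_y)]
        total_x += ux
        total_y += uy
        moves_x.append(move_x)
        moves_y.append(move_y)
    return total_x, total_y

ROUNDS = 1000 // 20  # one game every 20 years for 1000 years
print(play(tit_for_tat, tit_for_tat, ROUNDS))    # (-50, -50)
print(play(tit_for_tat, always_defect, ROUNDS))  # (-265, -245)
```

Two Tit for Tat players sustain cooperation, while against a constant defector Tit for Tat loses only the first round and then matches the defection.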

Strategy equilibrium studies try to predict behavior in existing games by theoretically or experimentally analyzing the impact of strategies on a player’s utilities.

A player’s utilities define its type.

A player will select the best strategy given the strategies currently used by the other participants.

Rational Players

(… = predictable)

Rational players are ones that predictably try to maximize their utility.

Utility can be expressed in $$$.

Can most people be assumed rational?

(if given enough time and help to think)

One has to take into account the utilities as defined by the beliefs (type) of the given player

Rational Players

(… = predictable)

Are avaricious/epicurean/workaholic people rational?

One has to take into account the utilities as defined by the beliefs (type) of the given player.

- obviously, avaricious people believe in the value of the dollar.

If one is alive, one likely has (subjective) beliefs.

Can they be manipulated?

Yes, since they are predictable: promise’em what they want.

Rational Players

(… = predictable)

Are seekers of fame rational?

One has to take into account the utilities as defined by the beliefs (type) of the given player.

- a player that believes in immortality through fame will value fame (quantifiable in money).

Tiberius

Temple of Artemis in Ephesus, burned in July 356 BC by Herostratus, who wished that:

“His name be spread to the whole Earth.”

To dissuade copycats, the Ephesians ruled that his name should never be pronounced.

Can they be manipulated?

Rational Players

(… = predictable)

Is a “fanatical altruist” rational?

One has to take into account the utilities as defined by the beliefs (type) of the given player.

- an altruist will be happier if he believes that “others” (family, country, humanity, animals) are happier.

Can he be manipulated?

Coax him by claiming that “others” love to be bombed!

Rational Players

(… = predictable)

Is a religious person rational (maximizing a utility)?

He (concurring with cynics) likes to claim that he is not looking for $ (i.e., that he is not rational).

One has to take into account the utilities as defined by the beliefs (type) of the given player.

- for a player that has a degree of belief in afterlife,

his utilities are a function of that religion (believed mechanisms to reach afterlife).

Can he be manipulated?

The devil lies in the details: claim that the correct interpretation of ‘A’ is ‘D’.

Game Theory

It is about understanding motivations (utilities).

Used in real war

Used in designing how an enterprise/country works

Used commonly in macro-economy

Can be used for peer-reviewing

For computer science:

• Used in Multi-agent Systems

• It is a computational problem (requires simulations, models, etc.)

Motivation Machine

Game Theory: Used in real war

Why do soldiers obey their commanders in the army of the enemy?

• Could one offer them something such that they defect?

• Could one make them think that they (or their country) are better off capitulating?

• (whether that is true or false)

Why does the president of the enemy country fight for that system?

• Could one blackmail/bribe/menace/convince/confuse him into quitting?

Most research and funding for game theory seems to be here.

Game Theory: Used in designing a country

Niccolò Machiavelli, 1469-1527

Italian ex-politician of the Republic of Florence

Discourses on Livy

Il Principe

Game Theory: Used in designing a country

People obey laws (for fear of the police).
• besides brainwashing in schools/media

The police obey (for fear of the army).

The army obeys (for fear of the secret services).

Secret services obey (for high pay, or fear of another secret service).
• checks and balances
• good fences make good neighbors

Poorly designed countries (where the incentives are missing or disappear for some link in the chain) have been seen collapsing in spectacular ways:

Roman Empire (motivation of soldiers?)

USSR (motivation of KGB?)

Yugoslavia (brainwashing failure, motivation of states in confederation?)

9 countries (11 states)

Game Theory: Used commonly in macro-economy

What combination of fees/taxes/subsidies would lead to a strong economy?

• (where resources end up in the hands of those who know how to, or can, make the most of them, etc.)

E.g.: granting subsidies only to people owning over 50 ha concentrates farming land in the hands of those who have money for machines and technology.

Game Theory

Can be used for peer-reviewing:

Peer-reviewing is the foundation of modern scientific research and

• controls the speed of development and

• drives significant decisions on the allocation of funding.

Index

• Game Theory Concepts

• Peer-Reviewing Background

• Model of Peer-Reviewing as Game

• Theoretical analysis of PR Games

• Experimental analysis of PR Games

• Conclusions

Peer-Review

Features of PR mechanisms keep getting richer to improve and encourage research quality.

o Blind reviewing
o Author's reply to comments by reviewers
o Reviewers bid for papers
o Authors rate reviewers
o Authors blacklist reviewers

Most scientists regarded the new streamlined peer-review process as ‘quite an improvement.’

scienceforseo.com

Common Blind Peer-Reviewing for Conferences

1. Chair assigns each paper to a Senior PC

2. SPC distributes a paper to 3 PC members (bidding)

3. PC gives a paper to a reviewer student

4. Each student reviews and assigns score.

5. Author sees reviews and answers

6. Student/PC may change review

7. PC forwards review to SPC

8. SPC gathers 3 reviews: rejects if any reject

9. Chair makes last changes:

10. Applies threshold (1/10 papers)

11. Answers complaints
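The two decision steps in this pipeline (the SPC rejecting on any negative review, and the chair applying a 1/10 threshold) can be sketched as follows. This is only an illustration: the `(recommendation, score)` pair layout, ranking the surviving papers by mean score, and both function names are assumptions, not details given on the slide.

```python
def spc_decision(reviews):
    """SPC step: the paper survives only if no reviewer recommends rejection.

    `reviews` is a list of (recommendation, score) pairs (assumed layout).
    """
    return all(recommendation == "accept" for recommendation, _ in reviews)

def chair_selection(papers, one_in=10):
    """Chair step: keep at most 1 paper in `one_in`, best mean score first.

    `papers` maps a paper id to its list of reviews. Ranking by mean score
    is an assumption made for this sketch.
    """
    survivors = [
        (sum(score for _, score in reviews) / len(reviews), paper_id)
        for paper_id, reviews in papers.items()
        if spc_decision(reviews)
    ]
    quota = len(papers) // one_in
    survivors.sort(reverse=True)
    return [paper_id for _, paper_id in survivors[:quota]]

# Three reviews of one paper: a single "reject" is enough to drop it.
print(spc_decision([("accept", 7), ("reject", 3), ("accept", 5)]))  # False
```

Under this rule a single reviewer can veto a paper, which is why the motivation of each individual reviewer matters in the analysis that follows.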

[Figure: review hierarchy (Chair, Senior Program Committee, Program Committee) with example review scores: Accept 7, Reject 3, Accept 5]

Open Peer-Review (e.g. Material Thinking Design Workshop, 2007)

Mainly with journals (biology)

Reviewers bid on papers. Papers distributed to 3 reviewers.

Each reviewer writes a short article with the review of the paper.

Authors see reviews and:

• can withdraw paper, or

• may write for each review a short article with an answer.

Reviewers see answers and can withdraw their reviews.

Papers are published together with reviews and answers.

Papers with negative reviews (not withdrawn) are published as technical reports together with the reviews.

[Figure: proceedings containing the main article, the review articles, and the review-answer articles]

Open Peer-Review Papers without reviews. What to do?

Understanding facts/possible motivations/conclusions

1. nobody accepted to review

• likely not relevant to community, or

• reluctance to write negative reviews

• but may also be a boycott

fair: tech rep for an archiving fee

2. all reviews are withdrawn / not submitted in time:

• (paper’s fault) could be irrelevant or poor quality

• (reviewer’s fault) overcommitted, malicious strategy

• assigned reviewer names should be published

accepted or rejected? A 3rd category!

The number of non-reviewed reports is a measure of the quality of the symposium/community.

usable for deciding whether to submit similar articles in the future.

[Figure: proceedings containing the main article, with the review articles and review-answer articles missing (“?”)]

Players

Game players:

o The researchers (authors and reviewers): a repeated/iterated game at each conference

o Funding agency: rewards researchers based on their publications; it is the mechanism designer, or a player in a game with the researchers

next slides with help of R. Vishen

Concepts

• Model paper quality by a paper's worth - utility to the society.

• Worth is evaluated by expert reviewers.

• Assumption: All reviewers in the symposium are equally expert. Reviewers have the same type (a mapping that associates each paper with its worth):

$t : \Gamma \to \mathbb{R}$

Assumption of equally expert reviewers

• Note: we fail to model people emotionally attached to antagonistic scientific beliefs in a community:

• scientists believing in climate warming vs. unbelievers

• scientists believing (or not) in the relevance of a given metric:

o is/isn’t privacy more important than verifiability (in voting)

o is network logic runtime more relevant than real runtime?

• scientists with a given emotional belief should probably create their own communities/conferences.

Model

• Authors and reviewers expect rewards from a funding agency.

• Assumption: the funding agency intends to maximize social value.

• Social value is defined as the “sum of quality of the endorsed papers.”

• An article is endorsed if it is published with favorable reviews by experts.

• Given a set of texts Γ appearing in a community of type t, the social choice function is:

$$f_\Gamma(t) = \{\, s \mid \forall \gamma \in \Gamma,\ (t(\gamma) > 0 \Rightarrow \gamma \in \mathrm{endorsed}(s)) \wedge (t(\gamma) < 0 \Rightarrow \gamma \notin \mathrm{endorsed}(s)) \,\}$$

Maximizing the total utility:

$$\sum_{\{\gamma \mid \mathrm{endorsed}(\gamma)\}} t(\gamma)$$

trouble

Publication venues and social value

• Conferences have multiple venues:

o orally presented papers

o posters

o technical reports

Venues give a way to automate the accounting of paper worth, via its impact on visibility (number of citations):

o Technical reports are less endorsed than posters or orally presented papers.

o The social value given a set of publication venues Ψ (posters, regular papers, etc.) consists of the weighted sum of the worth of the published papers (assumed measurable via citation influence).

Publication venues and social value

$$\sum_{\psi \in \Psi} \Big( \sum_{\gamma \in \psi} w_\psi \cdot t(\gamma) \Big)$$
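As a small worked example of this weighted sum, the sketch below evaluates the social value for a handful of made-up paper worths. The weights are borrowed from the experiment settings given later in the talk (0.5, 0.3, 0.1); the venue keys, the example worths, and the function name `social_value` are assumptions for illustration.

```python
# Venue weights w_psi (values taken from the talk's experiment settings).
WEIGHTS = {"oral": 0.5, "poster": 0.3, "tech_report": 0.1}

def social_value(published, weights):
    """Sum over venues psi and papers gamma in psi of w_psi * t(gamma).

    `published` maps a venue name to the list of worths t(gamma) of the
    papers published at that venue.
    """
    return sum(
        weights[venue] * worth
        for venue, worths in published.items()
        for worth in worths
    )

# Hypothetical worths, chosen only to exercise the formula.
example = {"oral": [8.0, 4.0], "poster": [2.0], "tech_report": [-3.0]}
print(social_value(example, WEIGHTS))  # about 6.3 = 0.5*12 + 0.3*2 + 0.1*(-3)
```

Moving a worthy paper from the oral venue to a technical report lowers the total even though the paper is still published, which is the effect the venue weights are meant to capture.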

• Funding agency settings for distributing funds (rewards):

o Citation Influence,

o Publications count.

Assumption: the funding agency cannot access the worth of a paper directly.

Utilities (Motivation)

Let us convert it all to

o The citation influence (CI) of an author at a given moment is a metric of the influence of his publications; it estimates the weighted sum of the worth of his publications at each of the three venues in Ψ = {regular, poster, technical report}.

Citation Influence

$$CI = \sum_{\psi \in \Psi} \Big( \sum_{\gamma \in \psi(\mathrm{author})} w_\psi \cdot t(\gamma) \Big)$$

• A researcher gets reputation (positive utility) when papers are cited.

o Often one cannot automatically distinguish good vs. bad citations.

• A researcher can get a bad reputation (negative utility):

o for publishing erroneous articles (as pointed out by citations/reviews), or

o if his/her review is proven to be incorrect (only with Open PR).

Utilities (Motivation)

Let us convert it all to:

E=mc2

What about reviewing (what is the motivation)?

Being asked to review is like a citation (proof of reputation).

To be asked again, one has to review promptly when requested.

But why write good reviews rather than random ones?

- Fear of “authors scoring reviewers” (not typically valuable: tit for tat)

- The SPC cannot generally notice poor reviews.

- (when noticed, there is no mechanism to disseminate it)

Utilities (Motivation)

Let us convert it all to:

Deviation from truthful reviewing may pay.

• Reviewing takes time (writing random reviews earns you time).

• Conferences with thresholds are zero-sum games.

• A paper “p” is superseded by a newer paper “n” when the new paper “n” steals the show from “p” (reduces future citations of p).

Utilities (Motivation)

Let us convert it all to:

o The citation influence (CI) of an author at a given moment is a metric of the influence of his publications, and is given by the weighted sum of the worth of his un-superseded publications at each of the three venues in Ψ = {regular, poster, technical report}.

Citation Influence with superseding

$$CI = \sum_{\psi \in \Psi} \Big( \sum_{\gamma \in \text{un-superseded}(\psi(\mathrm{author}))} w_\psi \cdot t(\gamma) \Big)$$
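A minimal sketch of citation influence with superseding, assuming each paper carries a flag saying whether a newer paper has superseded it. The `Paper` record, the venue keys, the example data, and the weight values (borrowed from the talk's experiment settings) are assumptions for illustration.

```python
from dataclasses import dataclass

@dataclass
class Paper:
    author: str
    venue: str               # "oral", "poster" or "tech_report"
    worth: float             # t(gamma)
    superseded: bool = False

# Venue weights w_psi (values taken from the talk's experiment settings).
WEIGHTS = {"oral": 0.5, "poster": 0.3, "tech_report": 0.1}

def citation_influence(papers, author, weights=WEIGHTS):
    """Weighted worth of the author's papers that are not superseded."""
    return sum(
        weights[p.venue] * p.worth
        for p in papers
        if p.author == author and not p.superseded
    )

papers = [
    Paper("a", "oral", 6.0),
    Paper("a", "poster", 4.0, superseded=True),  # no longer counts for "a"
    Paper("b", "oral", 10.0),
]
print(citation_influence(papers, "a"))  # 3.0
print(citation_influence(papers, "b"))  # 5.0
```

Because a superseded paper drops out of the sum, a reviewer has a direct incentive to keep papers that would supersede his own work from being published.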

Index

• Game Theory Concepts

• Peer-Reviewing Background

• Model of Peer-Reviewing as Game

• Theoretical analysis of PR Games (#-based)

• Experimental analysis of PR Games (CI-based)

• Conclusions

• Rewards author i based on the number and venue of publications.

• Paper superseding not relevant.

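• Conferences:

o with a threshold on the paper acceptance rate (1/10, 1/20)

o without thresholds on paper acceptance

Funding Based on Counting articles (Trusted Peer-Reviewing)

$$R = n_o^i \cdot w_o + n_p^i \cdot w_p + n_t^i \cdot w_t$$

(Here n_o^i, n_p^i and n_t^i count author i's orally presented papers, posters and technical reports, and w_o, w_p and w_t are the corresponding venue weights.)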

Funding Based on Counting articles (Trusted Peer-Reviewing)

No thresholds on paper acceptance rates

• Two-player conference, with one submission each and {accept, reject} decisions; the payoff matrix is:

              accept x’s   reject x’s
accept y’s    1, 1         0, 1
reject y’s    1, 0         0, 0

One-shot game: Best strategy is “make random decision”.

Iterated game: Effective strategy is “(forgiving) Tit-for-Tat”.
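Why a random decision is as good as any other in the one-shot game can be checked mechanically. The sketch below reads the matrix with rows as x's decision on y's paper, columns as y's decision on x's paper, and cells as (utility of x, utility of y); this reading and the function names are assumptions, chosen because they are consistent with every cell of the matrix.

```python
ACTIONS = ("accept", "reject")

# (x's decision on y's paper, y's decision on x's paper) -> (utility of x, utility of y)
PAYOFF = {
    ("accept", "accept"): (1, 1),
    ("accept", "reject"): (0, 1),
    ("reject", "accept"): (1, 0),
    ("reject", "reject"): (0, 0),
}

def x_is_indifferent(payoff, actions):
    """True if x's utility never depends on x's own decision."""
    return all(
        len({payoff[(ax, ay)][0] for ax in actions}) == 1
        for ay in actions
    )

def y_is_indifferent(payoff, actions):
    """True if y's utility never depends on y's own decision."""
    return all(
        len({payoff[(ax, ay)][1] for ay in actions}) == 1
        for ax in actions
    )

print(x_is_indifferent(PAYOFF, ACTIONS), y_is_indifferent(PAYOFF, ACTIONS))  # True True
```

Each author's count depends only on what the other player decides, so no decision is better than another in a single round; only repetition gives a reason to reciprocate.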

• To remain relevant to the funding agency, the conference puts a threshold on the ratio of accepted papers (CBR → CBRz).

• In this new version the actions available to players are not {accept, reject}, but the scores {low, high}.

• In case of tie, a paper is randomly selected.

• 2-players case: Zero-sum game.

Funding Based on Counting articles (Trusted Peer-Reviewing)

Paper Acceptance Thresholds

              high x’s     low x’s
high y’s      0.5, 0.5     0, 1
low y’s       1, 0         0.5, 0.5
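With a threshold, the two-player matrix can be checked for a dominant strategy. As a sketch, rows are read as the score x gives to y's paper, columns as the score y gives to x's paper, and cells as (expected publications of x, of y); this reading and the helper name are assumptions for illustration.

```python
ACTIONS = ("high", "low")

# (score x gives y's paper, score y gives x's paper) -> (utility of x, utility of y)
PAYOFF = {
    ("high", "high"): (0.5, 0.5),
    ("high", "low"): (0.0, 1.0),
    ("low", "high"): (1.0, 0.0),
    ("low", "low"): (0.5, 0.5),
}

def strictly_dominant_for_x(payoff, actions):
    """Return x's action that beats every alternative against every action of y."""
    for candidate in actions:
        others = [a for a in actions if a != candidate]
        if all(
            payoff[(candidate, ay)][0] > payoff[(alt, ay)][0]
            for alt in others
            for ay in actions
        ):
            return candidate
    return None

print(strictly_dominant_for_x(PAYOFF, ACTIONS))  # low
# Every outcome shares exactly one publication between the two players.
print(all(ux + uy == 1 for ux, uy in PAYOFF.values()))  # True
```

Scoring the competitor low is better for x whatever y does, and since the game is symmetric the same holds for y.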

(Trusted Peer-Reviewing) Multiple players

Paper Acceptance Thresholds

• n-players case (accepting n/k articles, equally worthy submissions):

o Zero-sum game.

o Not a pair-wise zero-sum game.

• Utility of rejecting a paper (one less competitor):

$$\frac{1}{k(n-1)} = \frac{n/k}{n-1} - \frac{n/k}{n}$$

              high x’s          low x’s
high y’s      0, 0              0, 1/(k(n-1))
low y’s       1/(k(n-1)), 0     1/(k(n-1)), 1/(k(n-1))
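The identity for the utility of rejecting a paper can be verified with exact arithmetic: with n/k slots shared among equally worthy papers, the chance of acceptance is (n/k)/n, and it rises to (n/k)/(n-1) once one competitor is eliminated. The function name and the sample values of n and k are assumptions for illustration.

```python
from fractions import Fraction

def gain_from_rejecting_one(n, k):
    """Increase in one's own acceptance chance after removing one competitor.

    n papers compete for n/k slots, all submissions being equally worthy.
    """
    slots = Fraction(n, k)
    return slots / (n - 1) - slots / n

# The gain equals 1 / (k (n - 1)) for every community size tried here.
for n, k in [(20, 10), (100, 4), (1000, 10)]:
    assert gain_from_rejecting_one(n, k) == Fraction(1, k * (n - 1))

print(gain_from_rejecting_one(20, 10))  # 1/190
```

The gain shrinks as n grows, which matches the observation that "always low" matters most in small communities.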

For small CBR communities, a dominant strategy is “always low”.

For huge CBR communities, a dominant strategy tends to “random review”.

Always “low” is a Nash equilibrium (observed in some small communities)!

With Tit-for-Tat opponents, the Nash equilibrium is “always low” if the opponent is not met again for (n-1) rounds.

With Hits-for-Tat opponents, the Nash equilibrium is “always high”.

Hits-for-Tat: an opponent can strike back many (m >> 1) times for one Tat; expected penalty:

$$\frac{m}{k(n-1)}$$

Truthful reviewing for TPR

Conclusion

With counting of articles: Truthful reviewing is never in equilibrium.

(under the working assumptions: mainly that nobody notices how you review).

Experimental studies

Simulations

Assumption: Funding based on CI

Compared mechanisms:

Open Peer-Review (SelectivitY)

Common Blind Review (CBR)

Common Blind Review with paper acceptance threshold (CBRz)

• Generate 100 random research communities (i.e. simulations)

o 20 researchers

o 20 conferences (i.e. 20 years)

• All participants are considered equally expert and inventive.

• Researchers get ideas for articles with a Uniform distribution at an average of (only) 2 articles per year, and a worth that is uniformly random in [-10, 10].

• Each paper is superseded each year with a probability of 1/5.

• Experiments for the weights wo = 0.5, wp = 0.3, and wt = 0.1.
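The settings above can be turned into a small generator of random communities. This is only a sketch under stated assumptions: the number of ideas per researcher per year is drawn uniformly from 0..4 (one way to obtain an average of 2), superseding is modeled as an independent yearly coin flip per paper, and the reviewing mechanisms themselves are not simulated. All names are hypothetical.

```python
import random

def simulate_community(seed, researchers=20, years=20, supersede_prob=0.2):
    """Generate one random community and return its list of papers."""
    rng = random.Random(seed)
    papers = []
    for year in range(years):
        # Each still-relevant paper is superseded this year with probability 1/5.
        for paper in papers:
            if not paper["superseded"] and rng.random() < supersede_prob:
                paper["superseded"] = True
        for author in range(researchers):
            # Assumption: 0..4 ideas per year, uniform, so the mean is 2.
            for _ in range(rng.randint(0, 4)):
                papers.append({
                    "author": author,
                    "year": year,
                    "worth": rng.uniform(-10, 10),  # worth uniformly random in [-10, 10]
                    "superseded": False,
                })
    return papers

# 100 random communities, one per seed.
communities = [simulate_community(seed) for seed in range(100)]
worthy = sum(p["worth"] > 0 for p in communities[0])
print(len(communities[0]), "papers in the first community,", worthy, "with positive worth")
```

A review strategy would then be applied to each year's submissions to decide the venue of every paper before computing each author's citation influence.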

Evaluation – Simulation Experiment

• Compared Review Strategies

i. Truthful reviewing.

ii. Truthful reviewing except for papers superseding one’s work, which are rejected.

iii. Random reviewing except for papers superseding one’s work, which are rejected.

iv. Giving the opposite possible score to all papers (reject good papers and accept poor papers).

v. Giving the lowest possible score to all papers.

(Tit-for-Tat was not explored here: assumed to have limited relevance)

Evaluation - Reviewer Types

• A misclassifying comment has a negative worth.

o If published, it will be accounted only as bad reputation for the reviewer.

• Misclassifying comment worth: the difference between the corresponding value and the real worth of the paper (always negative).

• The worth of an author’s answer to a negatively misclassifying comment is the same as the “absolute value of the worth of the misclassifying comment”. Otherwise, the answer has worth zero.

Evaluation – Experiment

Assumptions about worth of reviews

a. All reviewers review truthfully.

b. All reviewers review truthfully, except for one reviewer who rejects articles superseding his work but reviews truthfully submissions not superseding his work

c. All reviewers review truthfully, except for one reviewer who rejects articles superseding his work and reviews randomly submissions not superseding his work.

d. All reviewers review truthfully submissions not superseding their work and reject the other submissions.

e. All reviewers review randomly submissions not superseding their work and reject the other submissions.

Evaluation - Experiment Cases

Combinations of strategies tested for equilibria

• For both reviewing mechanisms the goal of the funding agency (social value) is maximized with truthful reviewing – case (a).

• Reduced in other cases.

• In SY with cases (b)-(e), even if all worthy papers are published, the total worth is reduced compared to the case(a) - [remember, technical reports have less weight]

Evaluation - Experiment Results

• To evaluate whether truthful reviewing is in equilibrium, researcher 1 performs non-truthful reviews.

• The experiments show the extent of the implications of the use of different strategies with CBR and SY.

• Confirms that truthful reviewing is not in Nash equilibrium using CBR, but it is in Nash equilibrium when SY is used under given assumptions and strategies.

Evaluation - Experiment Results for equilibriums with CI

Experiments with funding based on counting

Settings:

• 100 researchers

• each paper reviewed by 4 people (assumed truthful except for 1)

• ¼ are selected for publication

• 500000 randomized simulation runs

Truthful reviewing always led to fewer benefits for the reviewer:

• with strategy iv (inverting), gain 11.19% more publications

• with strategy v (always low), gain 15% more publications

Conclusions

We gave an example of how to analyze peer-reviewing mechanisms.

Introduced “Peer-Reviewing Games”, an abstraction of real peer-reviewing processes:

• sufficiently complex to capture interesting trade-offs

• sufficiently simple to enable some theoretical and experimental analysis

Proved that truthful reviewing is not in Nash equilibrium for Common Blind Review with the given assumptions.

Proved that truthful reviewing is in Nash equilibrium for the simplified Open Peer-Review SY under the studied assumptions and strategies.

For OPR with a threshold on the acceptance rate, Tit-for-Tat is a rational strategy.

For CBR, a rational strategy given assumptions is: reject superseding, random review for others.

Next?
