
1

Computer Science and the Socio-Economic Sciences

Fred Roberts, Rutgers University

2

CS and SS
• Many recent applications in CS involve issues/problems of long interest to social scientists:
  – preference, utility
  – conflict and cooperation
  – allocation
  – incentives
  – consensus
  – social choice
  – measurement
• Methods developed in SS are beginning to be used in CS.

3

CS and SS
• CS applications place great strain on SS methods:
  – Sheer size of the problems addressed
  – Computational power of agents is an issue
  – Limitations on the information possessed by players
  – Sequential nature of repeated applications
• Thus: need for a new generation of SS methods.
• Also: these new methods will provide powerful tools to social scientists.

4

CS and SS: Outline

1. CS and Consensus/Social Choice
2. CS and Game Theory
3. Algorithmic Decision Theory

5

CS and SS: Outline

1. CS and Consensus/Social Choice
2. CS and Game Theory
3. Algorithmic Decision Theory

6

CS and Consensus/Social Choice
• Relevant social science problems: voting, group decision making
• Goal: based on everyone's opinions, reach a "consensus"
• Typical opinions:
  – "first choice"
  – ranking of all alternatives
  – scores
  – classifications
• Long history of research on such problems.

7

CS and Consensus/Social Choice
Background: Arrow's Impossibility Theorem: there is no "consensus method" that satisfies certain reasonable axioms about how societies should reach decisions.
  – Input: rankings of alternatives.
  – Output: consensus ranking.

Kenneth Arrow, Nobel prize winner

8

CS and Consensus/Social Choice

There are widely studied and widely used consensus methods.

One well-known consensus method: "Kemeny-Snell medians": given a set of rankings, find a ranking minimizing the sum of distances to the other rankings.

Kemeny-Snell medians are having surprising new applications in CS.

John Kemeny, pioneer in time sharing in CS

9

CS and Consensus/Social Choice
Kemeny-Snell distance between rankings: twice the number of pairs of candidates i and j for which i is ranked above j in one ranking and below j in the other, plus the number of pairs that are ranked (strictly ordered) in one ranking and tied in the other.

Kemeny-Snell median: given rankings a1, a2, ..., ap, find a ranking x so that

    d(a1,x) + d(a2,x) + ... + d(ap,x)

is minimized. Sometimes just called the Kemeny median.

10

CS and Consensus/Social Choice

    a1         a2         a3
    Fish       Fish       Chicken
    Chicken    Chicken    Fish
    Beef       Beef       Beef

Median = a1. If x = a1:

    d(a1,x) + d(a2,x) + d(a3,x) = 0 + 0 + 2

is minimized. If x = a3, the sum is 4. For any other x, the sum is at least 1 + 1 + 1 = 3.
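To make the computation concrete, here is a minimal Python sketch (not from the slides). It restricts attention to strict rankings, ignoring ties, so the distance is simply twice the number of pairs the two rankings order oppositely, and the median is found by brute force over all orderings, which is only feasible for a handful of alternatives and is consistent with the NP-completeness theorem cited a few slides below.

    from itertools import combinations, permutations

    def ks_distance(r1, r2):
        # Kemeny-Snell distance restricted to strict rankings (no ties):
        # 2 for every pair of alternatives the two rankings order oppositely.
        pos1 = {item: i for i, item in enumerate(r1)}
        pos2 = {item: i for i, item in enumerate(r2)}
        return sum(2 for a, b in combinations(r1, 2)
                   if (pos1[a] - pos1[b]) * (pos2[a] - pos2[b]) < 0)

    def kemeny_median(rankings):
        # Brute force: try every strict ranking of the alternatives and keep
        # the one minimizing the sum of distances to the input rankings.
        return min(permutations(rankings[0]),
                   key=lambda x: sum(ks_distance(x, r) for r in rankings))

    a1 = ("Fish", "Chicken", "Beef")
    a2 = ("Fish", "Chicken", "Beef")
    a3 = ("Chicken", "Fish", "Beef")
    print(kemeny_median([a1, a2, a3]))  # ('Fish', 'Chicken', 'Beef'), total distance 0 + 0 + 2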

11

CS and Consensus/Social Choice

    a1         a2         a3
    Fish       Chicken    Beef
    Chicken    Beef       Fish
    Beef       Fish       Chicken

Three medians: a1, a2, a3.

This is the "voter's paradox" situation.

12

CS and Consensus/Social Choice

    a1         a2         a3
    Fish       Chicken    Beef
    Chicken    Beef       Fish
    Beef       Fish       Chicken

Note that sometimes we wish to minimize

    d(a1,x)^2 + d(a2,x)^2 + ... + d(ap,x)^2

A ranking x that minimizes this is called a Kemeny-Snell mean.

In this example, there is one mean: the ranking declaring all three alternatives tied.

13

CS and Consensus/Social Choice

    a1         a2         a3
    Fish       Chicken    Beef
    Chicken    Beef       Fish
    Beef       Fish       Chicken

If x is the ranking declaring Fish, Chicken, and Beef tied, then

    d(a1,x)^2 + d(a2,x)^2 + d(a3,x)^2 = 3^2 + 3^2 + 3^2 = 27.

It is not hard to show this is the minimum.

14

CS and Consensus/Social Choice

Theorem (Bartholdi, Tovey, and Trick, 1989; Wakabayashi, 1986): Computing the Kemeny median of a set of rankings is an NP-complete problem.

15

Meta-search and Collaborative Filtering
Meta-search
• A consensus problem
• Combine page rankings from several search engines
• Dwork, Kumar, Naor, Sivakumar (2000): Kemeny-Snell medians are good for spam resistance in meta-search (a page spams if it causes the meta-search engine to rank it too highly)
• Approximation methods make this computationally tractable

16

Meta-search and Collaborative Filtering

Collaborative Filtering

• Recommending books or movies
• Combine book or movie ratings
• Produce an ordered list of books or movies to recommend
• Freund, Iyer, Schapire, Singer (2003): "boosting" algorithm for combining rankings
• Related topic: recommender systems

17

Meta-search and Collaborative Filtering

A major difference from SS applications:

• In SS applications, number of voters is large, number of candidates is small.

• In CS applications, number of voters (search engines) is small, number of candidates (pages) is large.

• This makes for major new complications and research challenges.

18

Large Databases and Inference

• Real data often come in the form of sequences
• GenBank has over 7 million sequences comprising 8.6 billion bases.
• The search for similarity or patterns has extended from pairs of sequences to finding patterns that appear in common in a large number of sequences or throughout the database: "consensus sequences".
• Emerging field of "Bioconsensus": applies SS consensus methods to biological databases.

19

Large Databases and Inference

Why look for such patterns?

Similarities between sequences or parts of sequences lead to the discovery of shared phenomena.

For example, it was discovered that the sequence for platelet-derived growth factor, which causes growth in the body, is 87% identical to the sequence for v-sis, a cancer-causing gene. This led to the discovery that v-sis works by stimulating growth.

20

Large Databases and Inference
Example

Bacterial promoter sequences studied by Waterman (1989):

    RRNABP1:  ACTCCCTATAATGCGCCA
    TNAA:     GAGTGTAATAATGTAGCC
    UVRBP2:   TTATCCAGTATAATTTGT
    SFC:      AAGCGGTGTTATAATGCC

Notice that if we are looking for patterns of length 4, each sequence has the pattern TAAT.


22

Large Databases and Inference
Example

However, suppose that we add another sequence:

    M1 RNA:   AACCCTCTATACTGCGCG

The pattern TAAT does not appear here. However, it almost appears, since the pattern TACT appears, and this has only one mismatch from the pattern TAAT.


So, in some sense, the pattern TAAT is a good consensus pattern.

24

Large Databases and Inference
Example

We make this precise using the best-mismatch distance.

Consider two sequences a and b, with b longer than a.

Then d(a,b) is the smallest number of mismatches over all possible alignments of a as a consecutive subsequence of b.

25

Large Databases and Inference
Example

a = 0011, b = 111010

Possible alignments (a slid along b):

    111010    111010    111010
    0011       0011        0011

with 3, 3, and 2 mismatches respectively.

The best-mismatch distance is 2, which is achieved in the third alignment.

26

Large Databases and Inference
Example

Now we are given a database of sequences a1, a2, ..., an and look for a pattern of length k. One standard method (Smith-Waterman): look for a consensus sequence b that minimizes

    Σi [k - d(b,ai)] / d(b,ai),

where d is the best-mismatch distance.

In fact, this turns out to be equivalent to calculating medians like Kemeny-Snell medians.

Algorithms for computing consensus sequences are important in modern molecular biology.
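A small Python sketch (not from the slides) of the best-mismatch distance and a naive consensus-pattern search. As a simplification, it scores a candidate k-mer by the plain sum of best-mismatch distances, the median-style criterion just mentioned, rather than by the exact scoring formula above.

    def best_mismatch(a, b):
        # Smallest number of mismatches over all alignments of the shorter
        # string as a consecutive substring of the longer one.
        if len(a) > len(b):
            a, b = b, a
        return min(sum(x != y for x, y in zip(a, b[i:i + len(a)]))
                   for i in range(len(b) - len(a) + 1))

    def consensus_kmer(sequences, k):
        # Naive search: among all k-mers occurring in the database, return one
        # minimizing the total best-mismatch distance to all sequences.
        candidates = {s[i:i + k] for s in sequences for i in range(len(s) - k + 1)}
        return min(candidates,
                   key=lambda w: sum(best_mismatch(w, s) for s in sequences))

    print(best_mismatch("0011", "111010"))  # 2, as in the alignment example above

    promoters = ["ACTCCCTATAATGCGCCA", "GAGTGTAATAATGTAGCC",
                 "TTATCCAGTATAATTTGT", "AAGCGGTGTTATAATGCC",
                 "AACCCTCTATACTGCGCG"]
    # TAAT attains the minimum total best-mismatch distance (1) on this example;
    # ties, if any, are broken arbitrarily.
    print(consensus_kmer(promoters, 4))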

27

Large Databases and Inference
Preferential Queries
• Look for a flight from New York to Beijing
• Have preferences for:
  – airline
  – itinerary
  – type of ticket
• Try to combine responses from multiple travel-related websites
• Sequential decision making: the next query or information access depends on prior responses.

28

Consensus Computing, Image Processing
• Old SS problem: dynamic modeling of how individuals change opinions over time, eventually reaching consensus.
• Often uses dynamic models on graphs
• Related to neural nets
• CS application: distributed computing
• Values of processors in a network are updated until all have the same value.

29

Consensus Computing, Image Processing
• CS application: noise removal in digital images
• Does a pixel level represent noise?
• Compare neighboring pixels.
• If values differ beyond a threshold, replace the pixel value with the mean or median of the values of its neighbors.
• Related application in distributed computing: values of faulty processors are replaced by those of neighboring non-faulty ones.
• Berman and Garay (1993) use a "parliamentary procedure" called cloture.
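A toy Python illustration (not from the slides) of the neighbor-comparison idea for a single pixel; real noise-removal filters operate on whole images and choose the neighborhood and threshold more carefully.

    def denoise_pixel(value, neighbors, threshold):
        # Replace the pixel by the (upper) median of its neighbors when it
        # differs from that median by more than the threshold; otherwise keep it.
        s = sorted(neighbors)
        median = s[len(s) // 2]
        return median if abs(value - median) > threshold else value

    print(denoise_pixel(200, [9, 10, 10, 11, 11, 12, 12, 13], threshold=20))  # 11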

30

Computational Intractability of Consensus Functions

• Bartholdi, Tovey and Trick: There are voting schemes where it can be computationally intractable to determine who won an election.

• Computational intractability can be a good thing in an election: design voting systems where it is computationally intractable to "manipulate" the outcome of an election by "insincere voting", for example by:
  – Adding voters
  – Declaring voters ineligible
  – Adding candidates
  – Declaring candidates ineligible

31

Electronic Voting

• Issues:
  – Correctness
  – Anonymity
  – Availability
  – Security
  – Privacy

32

Electronic Voting
Security Risks in Electronic Voting
• Threat of "denial of service" attacks
• Threat of penetration attacks involving a delivery mechanism to transport a malicious payload to the target host (through a Trojan horse or remote-control program)
• Private and correct counting of votes
• Cryptographic challenges to keep votes private
• Relevance of work on secure multiparty computation

33

Electronic Voting

Other CS Challenges:

• Resistance to "vote buying"
• Development of user-friendly interfaces
• Vulnerabilities of the communication path between the voting client (where you vote) and the server (where votes are counted)
• Reliability issues: random hardware and software failures

34

Software & Hardware Measurement
• Theory of measurement developed by mathematical social scientists
• Measurement theory studies ways to combine scores obtained on different criteria.
• A statement involving scales of measurement is considered meaningful if its truth or falsity is unchanged under acceptable transformations of all scales involved.
• Example: it is meaningful to say that I weigh more than my daughter.
• That is because if it is true in kilograms, then it is also true in pounds, in grams, etc.

35

Software & Hardware Measurement
• Measurement theory has studied what statements you can make after averaging scores.
• Think of averaging as a consensus method.
• One general principle: to say that the average score of one set of tests is greater than the average score of another set of tests is not meaningful (it is meaningless) under certain conditions.
• This is often the case if the averaging procedure is the arithmetic mean: if s(xi) is the score of xi, i = 1, 2, ..., n, then the arithmetic mean is

    (1/n) Σi s(xi).

• There is a long literature on what averaging methods lead to meaningful conclusions.

36

Software & Hardware Measurement
A widely used method in hardware measurement:
• Score a computer system on different benchmarks.
• Normalize each score relative to the performance of one base system.
• Average the normalized scores.
• Pick the system with the highest average.
Fleming and Wallace (1986): the outcome can depend on the choice of base system.
  – Meaningless in the sense of measurement theory
  – Leads to a theory of merging normalized scores

37

Software & Hardware Measurement
Hardware Measurement

                   BENCHMARK
    PROCESSOR      E      F      G      H        I
    R              417    83     66     39,449   772
    M              244    70     153    33,527   368
    Z              134    70     135    66,000   369

Data from Heath, Comput. Archit. News (1984)

38

Software & Hardware Measurement
Normalize Relative to Processor R

                   BENCHMARK
    PROCESSOR      E      F      G      H      I
    R              1.00   1.00   1.00   1.00   1.00
    M               .59    .84   2.32    .85    .48
    Z               .32    .85   2.05   1.67    .45

39

Software & Hardware Measurement
Take Arithmetic Mean of Normalized Scores

Normalized scores (base R) as in the table above; arithmetic means of the normalized scores:

    R = 1.00    M = 1.01    Z = 1.07


Conclude that machine Z is best

41

Software & Hardware Measurement
Now Normalize Relative to Processor M

                   BENCHMARK
    PROCESSOR      E      F      G      H      I
    R              1.71   1.19    .43   1.18   2.10
    M              1.00   1.00   1.00   1.00   1.00
    Z               .55   1.00    .88   1.97   1.00

42

Software & Hardware Measurement
Take Arithmetic Mean of Normalized Scores

Normalized scores (base M) as in the table above; arithmetic means of the normalized scores:

    R = 1.32    M = 1.00    Z = 1.08


Conclude that machine R is best

44

Software and Hardware Measurement
• So, the conclusion that a given machine is best by taking the arithmetic mean of normalized scores is meaningless in this case.
• Above example from Fleming and Wallace (1986); data from Heath (1984).
• Sometimes, the geometric mean is helpful.
• The geometric mean is

    [ Πi s(xi) ]^(1/n)
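The reversal can be checked directly from the benchmark data above; a small Python sketch (not from the slides):

    from math import prod

    # Benchmark scores from Heath (1984), as tabulated above:
    # benchmarks E, F, G, H, I for processors R, M, Z.
    raw = {"R": [417, 83, 66, 39449, 772],
           "M": [244, 70, 153, 33527, 368],
           "Z": [134, 70, 135, 66000, 369]}

    def normalized_means(base):
        # Normalize every score by the base processor's score on that benchmark,
        # then report (arithmetic mean, geometric mean) of the normalized scores.
        result = {}
        for proc, scores in raw.items():
            norm = [s / b for s, b in zip(scores, raw[base])]
            result[proc] = (round(sum(norm) / len(norm), 2),
                            round(prod(norm) ** (1 / len(norm)), 2))
        return result

    print(normalized_means("R"))  # arithmetic means rank Z highest
    print(normalized_means("M"))  # arithmetic means now rank R highest
    # The geometric means rank R highest under either choice of base.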

45

Software & Hardware Measurement
Normalize Relative to Processor R

Normalized scores (base R) as in the earlier table; geometric means of the normalized scores:

    R = 1.00    M = .86    Z = .84

Conclude that machine R is best

46

Software & Hardware Measurement
Now Normalize Relative to Processor M

Normalized scores (base M) as in the earlier table; geometric means of the normalized scores:

    R = 1.17    M = 1.00    Z = .99

Still conclude that machine R is best

47

Software and Hardware Measurement
• In this situation, it is easy to show that the conclusion that a given machine has the highest geometric mean normalized score is a meaningful conclusion.
• Even meaningful: a given machine has a geometric mean normalized score 20% higher than another machine.
• Fleming and Wallace give general conditions under which comparing geometric means of normalized scores is meaningful.
• Research area: what averaging procedures make sense in what situations? Large literature.
• Note: there are situations where comparing arithmetic means is meaningful but comparing geometric means is not.

48

Software and Hardware Measurement

• Message from measurement theory to computer science:

Do not perform arithmetic operations on data without paying attention to whether the conclusions you get are meaningful.

49

CS and SS: Outline

1. CS and Consensus/Social Choice
2. CS and Game Theory
3. Algorithmic Decision Theory

50

CS and Game Theory
• Game theory has a long history in economics; also in operations research and mathematics.
• Recently, computer scientists have been discovering its relevance to their problems.
• Increasingly complex games arise in practical applications: auctions, the Internet.
• Need new game-theoretic methods for CS problems.
• Need new CS methods to solve modern game theory problems.

51

CS and Game Theory: Algorithmic Issues

Nash Equilibrium
• Each player chooses a strategy.
• If no player can benefit by changing his strategy while the others leave theirs unchanged, we are in Nash equilibrium.
• In 1951, Nash showed every finite game has a Nash equilibrium (in mixed strategies).
• How hard is this to compute?

John Nash, Nobel prize winner

52

Example: Nash Equilibrium
• 2-player game
• Strategy = a number between 0 and 3
• Both players win the lower of the two amounts chosen.
• The player with the higher amount pays $2 to the player with the lower amount.

Payoffs (player 1, player 2); rows = player 1's strategy, columns = player 2's strategy:

            0        1        2        3
    0      0,0      2,-2     2,-2     2,-2
    1     -2,2      1,1      3,-1     3,-1
    2     -2,2     -1,3      2,2      4,0
    3     -2,2     -1,3      0,4      3,3

Source: Wikipedia

53

Example: Nash Equilibrium
• (0, 0) is the unique Nash equilibrium.
• At any other strategy pair, one player can lower his number to below the other's and improve his payoff.
• E.g.: from (2, 2), player 1 lowers his number to 1 (or player 2 lowers his to 1).

(Payoff matrix as above.)

Source: Wikipedia
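A brute-force check of the pure equilibria of this game, as a minimal Python sketch (not from the slides):

    # Payoff matrices for the 4x4 game above: A[i][j] and B[i][j] are the payoffs
    # to players 1 and 2 when player 1 plays i and player 2 plays j (i, j in 0..3).
    A = [[0, 2, 2, 2], [-2, 1, 3, 3], [-2, -1, 2, 4], [-2, -1, 0, 3]]
    B = [[0, -2, -2, -2], [2, 1, -1, -1], [2, 3, 2, 0], [2, 3, 4, 3]]

    def pure_nash_equilibria(A, B):
        # A pair (i, j) is a pure Nash equilibrium if neither player can gain
        # by deviating unilaterally.
        return [(i, j)
                for i in range(len(A)) for j in range(len(A[0]))
                if all(A[i][j] >= A[k][j] for k in range(len(A)))
                and all(B[i][j] >= B[i][k] for k in range(len(A[0])))]

    print(pure_nash_equilibria(A, B))  # [(0, 0)], the unique pure equilibrium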


57

CS and Game Theory: Algorithmic Issues

Nash Equilibrium
• 2-player games: can use linear programming methods.
• Recent powerful result (Daskalakis, Goldberg, Papadimitriou 2005): for 4-player games, the problem is PPAD-complete.
• (PPAD: a class of search problems where a solution is known to exist by graph-theoretic arguments.)
• PPAD-complete means: if a polynomial algorithm exists, then one exists for computing Brouwer fixed points, which seems unlikely.

58

CS and Game Theory: Algorithmic Issues

Other Algorithmic Challenges
• Repeated games
• Issues of sequential decision making
• Issues of learning to play
• Other "solution concepts" in multi-player games: "power indices" (Shapley, Banzhaf, Coleman)
  – Need to calculate them for huge games
  – Mostly computationally intractable
  – Arise in many applications in CS, e.g., multicasting

59

Computational Issues in Auction Design

• Auctions increasingly used in business and government.

• Information technology allows complex auctions with huge number of bidders.

• Auctions are unusually complicated games.

60

Computational Issues in Auction Design

Bidding functions maximizing expected profit can be exceedingly difficult to compute.

Determining the winner of an auction can be extremely hard. (Rothkopf, Pekec, Harstad 1998)

61

Computational Issues in Auction Design

Combinatorial Auctions

• Multiple goods are auctioned off.
• Bidders submit bids for combinations of goods.
• This leads to NP-complete allocation problems.
• Bidders might not even be able to feasibly express all possible preferences for all subsets of goods.
• Rothkopf, Pekec, Harstad (1998): determining the winner is computationally tractable for many economically interesting kinds of combinations.

62

Computational Issues in Auction Design
Some other issues:
• Internet auctions: unsuccessful bidders learn from previous auctions.
• Issues of learning in repeated plays of a game.
• Related to software agents acting on behalf of humans in electronic marketplaces based on auctions.
• Cryptographic methods needed to preserve the privacy of participants.

63

Allocating/Sharing Costs & Revenues
• Game-theoretic solutions have long been used to allocate costs to different users in shared projects:
  – Allocating runway fees in airports
  – Allocating highway fees to trucks of different sizes
  – Universities sharing library facilities
  – Fair allocation of telephone calling charges among users sharing complex phone systems (Cornell's experiment)

64

Allocating/Sharing Costs & Revenues
Shapley Value
• The Shapley value assigns a payoff to each player in a multi-player game.
• Consider a game in which some coalitions of players win and some lose, with no subset of a losing coalition winning.
• Consider a coalition forming at random, one player at a time.
• A player i is pivotal if the addition of i throws the coalition from losing to winning.
• Shapley value of i = probability that i is pivotal when an order of players is chosen at random.
• In such games with winners/losers, this is called the Shapley-Shubik power index.

Lloyd Shapley

65

Allocating/Sharing Costs & Revenues
Shapley Value

Example: Board of Directors of a Company
  – Shareholder 1 holds 3 shares.
  – Shareholders 2, 3, 4, 5, 6, 7 hold 1 share each.
  – A majority of the shares is needed to make a decision.
  – Coalition {1,4,6} is winning.
  – Coalition {2,3,4,5,6} is winning.

Shareholder 1 is pivotal if he is 3rd, 4th, or 5th in the order.
So shareholder 1's Shapley value is 3/7.
The Shapley values sum to 1 (since they are probabilities).
Thus, each other shareholder has Shapley value (4/7)/6 = 2/21.
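The 3/7 and 2/21 values can be reproduced by direct enumeration of all 7! orderings; a minimal Python sketch (not from the slides):

    from itertools import permutations
    from fractions import Fraction

    shares = {1: 3, 2: 1, 3: 1, 4: 1, 5: 1, 6: 1, 7: 1}  # 9 shares in total
    quota = 5                                             # a strict majority

    def shapley_shubik(shares, quota):
        # Fraction of the orderings of the players in which each player is
        # pivotal, i.e. tips the growing coalition from losing to winning.
        players = list(shares)
        pivotal = {p: 0 for p in players}
        orders = list(permutations(players))
        for order in orders:
            total = 0
            for p in order:
                total += shares[p]
                if total >= quota:
                    pivotal[p] += 1
                    break
        return {p: Fraction(pivotal[p], len(orders)) for p in players}

    print(shapley_shubik(shares, quota))
    # player 1: 3/7; each of players 2..7: 2/21, matching the slide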

66

Allocating/Sharing Costs & Revenues
Shapley Value

Allocating Runway Fees at Airports
• Larger planes require longer runways.
• Divide runways into meter-long segments.
• Each month, we know how many landings a plane has made.
• Given a runway of length y meters, consider a game in which the players are landings and a coalition "wins" if the runway is not long enough for the planes in the coalition.

67

Allocating/Sharing Costs & Revenues
Shapley Value

Allocating Runway Fees at Airports
• A landing is pivotal if it is the first landing added that makes the coalition require a longer runway.
• The Shapley value gives the cost of the yth meter of runway to a given landing.
• We then add up these costs over all runway lengths a plane requires and all landings it makes.

68

Allocating/Sharing Costs & Revenues

Multicasting

• Applications in multicasting.
• Unicast routing: each packet sent from a source is delivered to a single receiver.
• Sending it to multiple sites: send multiple copies and waste bandwidth.
• In multicast routing: use a directed tree connecting the source to all receivers.
• At branch points, a packet is duplicated as necessary.

69

Multicasting

70

Allocating/Sharing Costs & Revenues

Multicasting

• Multicast routing: Use a directed tree connecting source to all receivers.

• At branch points, a packet is duplicated as necessary.

• Bandwidth is not directly attributable to a single receiver.

• How should costs be distributed among receivers?
• One idea: use the Shapley value.

71

Allocating/Sharing Costs & Revenues
• Feigenbaum, Papadimitriou, Shenker (2001): no feasible implementation for the Shapley value in multicasting.
• Note: the Shapley value is uniquely characterized by four simple axioms.
• Sometimes we state axioms as general principles we want a solution concept to have.
• Jain and Vazirani (1998): polynomial-time computable cost-sharing algorithm
  – Satisfying some important axioms
  – Calculating the cost of the optimum multicast tree within a factor of two of optimal.

72

Bounded Rationality

• Traditional game theory assumption: Strategic agents are fully rational; can completely reason about consequences of their actions.

• But: Consider bounded computational power.

73

Bounded Rationality
Some issues:

• Looking at bounded rationality as bounded recall in repeated games.

• Modeling bounded rationality when strategies are limited to those implementable on finite state automata

• What are optimal strategies in large, complex games arising in CS applications for players with bounded computational power?

• E.g.: How do players with limited computational power determine minimal bid increases in an auction to transform losing bids into winning ones?

74

Streaming Data in Game Theory

Streaming Data Analysis:
• When you only have one shot at the data as it streams by
• Widely used to detect trends and sound alarms in applications in telecommunications and finance

• AT&T uses this to detect fraudulent use of credit cards or impending billing defaults

• Other relevant work: methods for detecting fraudulent behavior in financial systems

75

Streaming Data in Game Theory

Streaming Data Analysis:
• A "one pass" mechanism is of interest in game-theory-based allocation schemes in multicasting: Herzog, Shenker, Estrin (1997)
• Arises in on-line auctions:
  – Need to develop bidding strategies if only one pass is allowed

76

CS and SS: Outline

1. CS and Consensus/Social Choice
2. CS and Game Theory
3. Algorithmic Decision Theory

77

Algorithmic Decision Theory
• Decision makers in many fields (engineering, medicine, economics, ...) have:
  – Remarkable new technologies to use
  – Huge amounts of information to help them
  – The ability to share information at unprecedented speeds and quantities

78

Algorithmic Decision Theory
• These tools bring daunting new problems:
  – Massive amounts of data are often incomplete, unreliable, or distributed
  – Interoperating/distributed decision makers and decision-making devices need coordination
  – Many sources of data need to be fused into a good decision
• There are few highly efficient algorithms to support such decisions.

79

Sequential Decision Making
• Making some decisions before all the data are in.
• Sequential decision problems arise in:
  – Communication networks: testing connectivity, paging cellular customers, sequencing tasks
  – Manufacturing: testing machines, fault diagnosis, routing customer service calls

80

Sequential Decision Making
• Sequential decision problems arise in:
  – Artificial intelligence: optimal derivation strategies in knowledge bases, best-value satisficing search, coding decision tables
  – Medicine: diagnosing patients, sequencing treatments

81

Sequential Decision Making

Online Text Filtering Algorithms

• We seek to identify “interesting” documents from a stream of documents

• Widely studied problem in machine learning

82

Sequential Decision Making
Online Text Filtering Algorithms: A Model
• As a document arrives, we need to decide whether or not to present it to an oracle.
• If the document is presented to the oracle and is interesting, we get r reward units.
• If it is presented and not interesting, we get a penalty of c units.
• What is a strategy for maximizing expected payoff?
• See Fradkin and Littman (2005) for recent work using sequential decision making methods.
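For intuition (not stated on the slide): if the filter estimates a probability p that the incoming document is interesting, the myopic rule presents it exactly when the expected payoff p·r - (1 - p)·c is positive, i.e. when p > c/(r + c). A minimal sketch:

    def present(p, r, c):
        # Myopic rule: show the document iff its expected payoff is positive,
        # i.e. iff the estimated probability p exceeds the threshold c / (r + c).
        return p * r - (1 - p) * c > 0

    print(present(0.4, r=1.0, c=1.0))  # False: the threshold is 0.5
    print(present(0.4, r=3.0, c=1.0))  # True: the threshold is 0.25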

83

Inspection Problems
• Inspection problem: in what order should we do tests to inspect containers for drugs, bombs, etc.?
• Do we inspect? What test do we do next? How do the outcomes of earlier tests affect this decision?
• Simplest case: entities being inspected need to be classified as ok (0) or suspicious (1).
• Binary decision tree model for testing.
• Follow the left branch if ok, the right branch if suspicious.
• Find a cost-minimizing binary decision tree.

84

Inspection Problems

Follow left branch if ok, right branch if suspicious.

85

Sequential Decision Making Problem
Some More Details:
• Containers have attributes, each in a number of states.
• Sample attributes:
  – Levels of certain kinds of chemicals or biological materials
  – Whether or not there are items of a certain kind in the cargo list
  – Whether the cargo was picked up in a certain port

86

Sequential Decision Making Problem
• Simplest case: attributes are in state 0 or 1.
• State 1 means the container has the attribute, and that is suspicious.
• Then: a container is a binary string like 011001.
• So: the classification is a decision function F that assigns each binary string to a category 0 or 1: a Boolean function.

    011001  ->  F(011001)

If attributes 2, 3, and 6 are present and the others are not, assign the container to category F(011001).

87

Binary Decision Tree Approach
• Reach category 1 from the root by:
  – a0 L, then a1 R, then a2 R, then 1; or
  – a0 R, then a2 R, then 1
• A container is classified in category 1 iff it has a1 and a2 and not a0, or it has a0 and a2 (and possibly a1).
• Corresponding Boolean function: F(111) = F(101) = F(011) = 1, F(abc) = 0 otherwise.

88

Binary Decision Tree Approach
• This binary decision tree corresponds to the same Boolean function:

    F(111) = F(101) = F(011) = 1, F(abc) = 0 otherwise.

However, it has one less observation node ai. So, it is more efficient if all observations are equally costly and equally likely.
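The tree diagrams on these slides are not reproduced in this text version, so the following Python sketch is only illustrative: it encodes the Boolean function F(111) = F(101) = F(011) = 1 (equivalently, F = a2 AND (a0 OR a1)) as two decision trees, the second of which uses one fewer observation node.

    def tree_with_four_nodes(a0, a1, a2):
        # Test a0 first; the left branch then tests a1 and a2, the right branch
        # tests a2 again (four observation nodes in total).
        if a0 == 0:
            return 1 if (a1 == 1 and a2 == 1) else 0
        return 1 if a2 == 1 else 0

    def tree_with_three_nodes(a0, a1, a2):
        # Equivalent tree that tests a2 first (three observation nodes).
        if a2 == 0:
            return 0
        if a0 == 1:
            return 1
        return 1 if a1 == 1 else 0

    # The two trees classify every container (a0, a1, a2) identically.
    assert all(tree_with_four_nodes(a, b, c) == tree_with_three_nodes(a, b, c)
               for a in (0, 1) for b in (0, 1) for c in (0, 1))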

89

Binary Decision Tree Approach
• The realistic problem is much more difficult:
  – Test result errors
  – Tests cost different amounts of money and take different amounts of time
  – There are queues to wait for testing
  – One can adjust the thresholds of detectors
  – There are penalties for false negatives and false positives
• Challenging problems for computer science

Gamma ray detector

90

Inspection Problems

• The problem of finding an optimal binary decision tree has many other uses:
  – AI: rule-based systems
  – Circuit complexity
  – Reliability analysis
  – Theory of programming/databases
• In general, the problem is NP-complete.

91

Inspection Problems

• Some cases of decision functions where the problem is tractable:
  – k-out-of-n systems
  – Certain series-parallel systems
  – Read-once systems
  – "Regular" systems
  – Horn systems
• Recent results in the case of inspection problems at ports: Stroud and Saeger (2004), Anand et al. (2006).

92

Computational Approaches to Information Management in Decision Making

Representation and Elicitation

• Successful decision making requires efficient elicitation of information and efficient representation of the information elicited.

• These are old problems in the social sciences.
• Computational aspects are becoming a focal point because of the need to deal with massive and complex information.

93

Computational Approaches to Information Management in Decision Making

Representation and Elicitation

• Example I: Social scientists study preferences: “I prefer beef to fish”

• Extracting and representing preferences is key in decision making applications.

94

Computational Approaches to Information Management in Decision Making

Representation and Elicitation

• “Brute force” approach: For every pair of alternatives, ask which is preferred to the other.

• Often computationally infeasible.

95

Computational Approaches to Information Management in Decision Making

Representation and Elicitation

• In many applications (repeated games, collaborative filtering), it is important to elicit preferences automatically.
• CP-nets have been introduced as a tool to represent preferences succinctly and provide ways to make inferences about preferences (Boutilier, Brafman, Domshlak, Hoos, Poole 2004).

96

Computational Approaches to Information Management in Decision Making

Representation and Elicitation

• Example II: combinatorial auctions.
• The decision maker needs to elicit preferences from all agents for all plausible combinations of items in the auction.

• Similar problem arises in optimal bundling of goods and services.

• Elicitation requires exponentially many queries in general.

97

Computational Approaches to Information Management in Decision Making

Representation and Elicitation
• Challenge: recognize situations in which efficient elicitation and representation are possible.
• One result: Fishburn, Pekec, Reeds (2002).
• Even more complicated: when the objects in the auction have complex structure.
• The problem arises in: legal reasoning, sequential decision making, automatic decision devices, collaborative filtering.

98

Concluding Comment
• In recent years, the interplay between CS and biology has transformed major parts of biology into an information science.
• It has led to major scientific breakthroughs in biology such as the sequencing of the human genome.
• It has led to significant new developments in CS, such as database search.
• The interplay between CS and SS is not nearly as far along.
• Moreover: the problems are spread over many disciplines.

99

Concluding Comment

• However, CS-SS interplay has already developed a unique momentum of its own.

• One can expect many more exciting outcomes as partnerships between computer scientists and social scientists expand and mature.
