Cognitive Modeling

Aakash Hingu (IIIrd Year)
Amit Kumar Swami (IVth Year)
Himanshu Singh (IIIrd Year)
Saikiran Boga (IIIrd Year)
Saiteja Sirikonda (IIIrd Year)
Sudanshu Gaur (IIIrd Year)

Date: 19th May 2013
Motivation
• A cognitive model is an approximation of cognitive processes (predominantly human) for the purposes of comprehension and prediction.
• We try to imitate the workings of the human brain and implement them through a mathematical model.
• In the coming slides, we explain the following models:
  – Reinforcement Learning with Inertia (IRL)
  – Normalized Reinforcement Learning with Inertia
  – The BOSS Model
Task

• 60 problems
• Each problem was played by 20 participants
• Each problem was played for 100 trials by each participant
• Each trial involves a choice between a risky and a safe option
  – Example:
    • Risky: -$32 with p = 0.1; $0 otherwise
    • Safe: -$3 with p = 1
Dataset

Column 1: The proportion of risky choices, averaged over all participants and time periods, for each problem
Column 2: High outcome of the risky option
Column 3: Probability of the high outcome on the risky option
Column 4: Low outcome of the risky option
Column 5: Medium outcome of the safe option
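As a reference for the model sketches on the later slides, here is a minimal loading sketch for these five columns; the file name and the comma-delimited layout are assumptions, not part of the original slides.

import numpy as np

# Hypothetical file name; one row per problem, five columns as above.
data = np.loadtxt("tpt_problems.csv", delimiter=",")

prop_risky = data[:, 0]  # Column 1: observed proportion of risky choices
high       = data[:, 1]  # Column 2: high outcome of the risky option
ph         = data[:, 2]  # Column 3: probability of the high outcome
low        = data[:, 3]  # Column 4: low outcome of the risky option
med        = data[:, 4]  # Column 5: medium outcome of the safe option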
Models

Inertia Reinforcement Learning (IRL):

• It is a basic reinforcement learning method to which we add inertia.
• That is, the previous choice tends to be sustained, while the model keeps "learning" from outcomes with the weight provided.
• We therefore call it the Inertia Reinforcement Learning (IRL) model; a minimal simulation sketch follows below.
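The sketch below simulates one participant on one problem. It assumes the inertia rule repeats the previous choice with probability pinert and that option values are updated as a weighted average with weight w; the exact choice rule of our implementation is not shown on the slides, so a simple value comparison stands in for it.

import numpy as np

def simulate_irl(high, ph, low, med, pinert=0.656, w=0.545,
                 n_trials=100, rng=None):
    """One participant, one problem, under the IRL model (sketch).

    Returns the proportion of risky choices over n_trials.
    """
    rng = rng or np.random.default_rng()
    # Initial values: expected value of each option (an assumption).
    v_risky = high * ph + low * (1 - ph)
    v_safe = med
    last = None
    risky_count = 0
    for t in range(n_trials):
        if last is not None and rng.random() < pinert:
            choice = last  # inertia: repeat the previous choice
        else:
            choice = "risky" if v_risky > v_safe else "safe"
        if choice == "risky":
            payoff = high if rng.random() < ph else low
            v_risky = (1 - w) * v_risky + w * payoff  # learn with weight w
            risky_count += 1
        else:
            v_safe = (1 - w) * v_safe + w * med
        last = choice
    return risky_count / n_trials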
Result:
• MSD for the best TPT model: 0.0094 (MSD is defined in the sketch below)

*NOTE: We optimized the parameter values (pinert and the weight w) with a global optimization tool.

• pinert: 0.656
• w (weight): 0.545
• MSD for the IRL model: 0.00768
• The IRL model's MSD is better than the best TPT model's MSD by a factor of about 1.22.
• The reason is that our model learns through the inertia we have applied to the most recent favourable outcomes.
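Throughout these slides, MSD is the mean squared deviation between a model's predicted proportion of risky choices and the observed proportion (Column 1 of the dataset), taken over the 60 problems. A minimal sketch:

import numpy as np

def msd(predicted, observed):
    """Mean squared deviation between per-problem predictions and data."""
    predicted = np.asarray(predicted, dtype=float)
    observed = np.asarray(observed, dtype=float)
    return float(np.mean((predicted - observed) ** 2))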
Reinforcement Learning With Normalization
The probability of selecting the risky prospect at trial t is given by

P_t(risky) = exp(μ * WV_t(risky) / D_t) / [exp(μ * WV_t(risky) / D_t) + exp(μ * WV_t(safe) / D_t)]

where
• WV_t(k) is the weighted value of action k at trial t,
• μ is a free payoff sensitivity parameter, and
• D_t is a measure of experienced payoff variability.
Reinforcement Learning With Normalization (Cont.)
If strategy k was selected at trial t, its weighted value at trial t+1 is a weighted average of WV_t(k) and v_t, the payoff obtained at t:

WV_{t+1}(k) = (1 - ω) * WV_t(k) + ω * v_t

The parameter 0 < ω < 1 captures the weight given to recent outcomes. The initial value is

WV_1(k) = 0.5 * (high * ph + low * (1 - ph)) + 0.5 * med

The payoff variability term D_t is the weighted average of the absolute difference between the payoffs obtained at trials t and t-1:

D_{t+1} = (1 - ω) * D_t + ω * |v_t - v_{t-1}|

where v_0 is assumed to equal A(1), and D_1 is assumed to equal μ.
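A minimal simulation sketch under the equations above. The fitted values from the next slide are used as defaults, v_0 is approximated by the initial weighted value, and the clipping of the logit argument is our own numerical safeguard.

import numpy as np

def simulate_nrl(high, ph, low, med, mu=0.59, omega=0.395,
                 n_trials=100, rng=None):
    """One participant, one problem, under the normalized RL model (sketch)."""
    rng = rng or np.random.default_rng()
    wv1 = 0.5 * (high * ph + low * (1 - ph)) + 0.5 * med
    wv = {"risky": wv1, "safe": wv1}
    d = mu          # D_1 is assumed to equal mu (as on the slide)
    v_prev = wv1    # stand-in for v_0 = A(1)
    risky_count = 0
    for t in range(n_trials):
        # Logit choice rule, normalized by the payoff variability D_t.
        z = np.clip(mu * (wv["risky"] - wv["safe"]) / d, -50, 50)
        p_risky = 1.0 / (1.0 + np.exp(-z))
        choice = "risky" if rng.random() < p_risky else "safe"
        if choice == "risky":
            v = high if rng.random() < ph else low
            risky_count += 1
        else:
            v = med
        # Update the chosen option's weighted value, then D_t.
        wv[choice] = (1 - omega) * wv[choice] + omega * v
        d = (1 - omega) * d + omega * abs(v - v_prev)
        v_prev = v
    return risky_count / n_trials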
Normalized Reinforcement Learning with Inertia

• Inertia is embedded into the NRL model.
• Here inertia is treated as a free parameter and optimized with a genetic algorithm (GA); a sketch of the inertia rule follows below.
• Optimized value of inertia: 0.537
• Optimized value of ω: 0.395
• Optimized value of the payoff sensitivity λ: 0.59
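A sketch of how the inertia rule can wrap the NRL choice, assuming the same repeat-with-probability rule as in the IRL model (the function name is ours):

def choose_with_inertia(p_risky, last_choice, pinert, rng):
    """With probability pinert, repeat the previous choice;
    otherwise fall back to the underlying NRL choice rule.

    rng is e.g. numpy.random.default_rng().
    """
    if last_choice is not None and rng.random() < pinert:
        return last_choice
    return "risky" if rng.random() < p_risky else "safe"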
Result:
The resulting MSD is 0.0278.
Models (Cont.)
2. BOSS Model
Basic Idea:
Tactically, when a person gambles, he sets aside a certain amount of money that he can afford to lose. He takes more and more risk as he gains more money, and tends to choose safer options as he loses.
BOSS Model (Cont.)

Algorithm:
For each trial:
    If it is the first trial, take a random choice
    Else if the cumulative outcome so far >= compare, take a random choice
    Else, take the safe choice
    If the choice is risky, add an outcome drawn randomly between low and high to the cumulative outcome
    Else, add the medium outcome to the cumulative outcome
BOSS Model (Cont.)

where compare is decided based on the medium outcome:

compare = max_num_loss * med   if med < 0
        = 0                    otherwise

max_num_loss is the only free parameter here; its optimized value, found using a GA on the given dataset, is 46. A runnable sketch of the model follows below.
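The sketch below follows the algorithm on the previous slide; the names are ours, and the "random choice" is assumed to be an even 50/50 pick between risky and safe.

import numpy as np

def simulate_boss(high, ph, low, med, max_num_loss=46,
                  n_trials=100, rng=None):
    """One participant, one problem, under the BOSS model (sketch)."""
    rng = rng or np.random.default_rng()
    # Loss budget: a negative threshold when the safe outcome is a loss.
    compare = max_num_loss * med if med < 0 else 0.0
    total = 0.0
    risky_count = 0
    for t in range(n_trials):
        if t == 0 or total >= compare:
            choice = "risky" if rng.random() < 0.5 else "safe"
        else:
            choice = "safe"  # losses exceeded the budget: play safe
        if choice == "risky":
            total += high if rng.random() < ph else low
            risky_count += 1
        else:
            total += med
    return risky_count / n_trials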
Result:
MSD = 0.0238