The Rescorla-Wagner

Introduction

The Rescorla-Wagner model is a classical model of associative learning, which describes how expectations about outcomes are updated based on observed outcomes. Originally, it was developed to model reinforcement learning tasks, where an organism learns to associate actions with various rewards. Since then, it has been applied in a variety of contexts, and is now a canonical model in reinforcement learning and cognitive psychology for modelling how agents learn from their environment. In ActionModels, the Rescorla-Wagner model is provided as a premade model. It functions as a perceptual model, and can be combined with various response models to make a full action model. In the following the Rescorla-Wagner model is described mathematically, and it's binary and categorical variants are introduced. In the next sections, we will demonstrate how it can be used with ActionModels

The continuous Rescorla-Wagner model

The classic rescorla wagner model has a single changing state, the expected value $V_t$ of an outcome at time $t$. This expectation is updated by the Rescorla-Wagner update rule:

\[V_t = V_{t-1} + \alpha (o_t - V_{t-1})\]

where $o_t$ is the observed outcome at time $t$, and $\alpha$ is the learning rate. The continuous Rescorla-Wagner model therefore has a total of two parameters:

The initial expected value $V_0 \in \mathbb{R}$ (by default set to 0)
The learning rate $\alpha \in [0,1]$ (by default set to 0.1)

And one state:

The expected value $V_t \in \mathbb{R}$, which is updated on each timestep

The expected value $V_t$ can be used to determine the action in various ways, depending on the response model of choice. A classic response model, which is the default in the ActionModels implementation, is a report with Gaussian noise. Here an observation is used to update the Rescorla-Wagner, and the expectation is then reported as the action, with some noise. The report action is sampled from a Gaussian distribution with the expected value $V_t$ as mean and a noise parameter $\beta$ as standard deviation:

\[a_t \sim \mathcal{N}(V_t, \beta)\]

This report action model then has one additional parameter:

The action noise $\beta \in [0, \infty]$ (by default set to 1)

Takes one observation:

The observation $o_t \in \mathbb{R}$

And returns one action:

The report action $a_t \in \mathbb{R}$, which is sampled from the Gaussian distribution described above.

The binary Rescorla-Wagner model

In its binary variant, the Rescorla-Wagner model receives a binary observation (as opposed to a continuous one), and its task is to learn the probability of the binary outcome. Here, the expected value $V_t$ is transformed with a logistic or sigmoid function to ensure it is between 0 and 1, before it is used to calculate the prediction error:

\[V_t = V_{t-1} + \alpha (o_t - \sigma(V_{t-1}))\]

where $\sigma(x) = \frac{1}{1 + e^{-x}}$ is the logistic function. The binary rescorla Wagner thus has a total of two parameters:

The initial expected value $V_0 \in \mathbb{R}$ (by default set to 0)
The learning rate $\alpha \in [0,1]$ (by default set to 0.1)

And one state:

The expected value $V_t \in \mathbb{R}$

A classic response model, which is the default in the ActionModels implementation, is a binary report with Bernoulli noise. Here an observation is used to update the Rescorla-Wagner, and the expectation is then reported as the action, with some noise. The binary report action is sampled from a Bernoulli distribution with the sigmoid-transformed expected value $V_t$ as probability. An action precision (the inverse action noise $\beta$) is used to control the noise of the action:

\[a_t \sim \text{Bernoulli}(\sigma(V_t * \beta^{-1}))\]

This report action model then has one additional parameter:

The action noise $\beta \in [0, \infty]$ (by default set to 1)

Takes one observation:

The binary observation $o_t \in \{0, 1\}$

And returns one action:

The report action $a_t \in \{0, 1\}$, which is sampled from the Bernoulli distribution described above.

The categorical Rescorla-Wagner model

In its categorical variant, the Rescorla-Wagner model receives a categorical observation (as opposed to a continuous one), and its task is to learn the probability of each observation category occuring. Here the observation is transformed into a one-hot encoded vector, representing for each category whether it was observed or not. For each category, the expected value $V_t$ is updated with the binary Rescorla-Wagner update rule:

\[V_{t, c} = V_{t-1, c} + \alpha (o_{t, c} - \sigma(V_{t-1, c}))\]

where $o_{t, c}$ is the observation for category $c$ at time $t$, and

\[\sigma(x) = \frac{1}{1 + e^{-x}}\]

is the logistic function. The categorical Rescorla-Wagner model therefore has a total of two parameters:

The initial expected values $V_0 \in \mathbb{R}$, which is a vector of values for each category (by default set to a vector of zeros)
The learning rate $\alpha \in [0,1]$ (by default set to 0.1)

And one state:

The expected values $V_t \in \mathbb{R}$, which is a vector of expected values for each category, updated on each timestep

A classic response model, which is the default in the ActionModels implementation, is a categorical report with noise. Here a categorical observation is used to update the Rescorla-Wagner, and the category with the highest expected probability is then reported as the action, with some noise. The categorical report action is sampled from a Categorical distribution with the softmax-transformed expected values $V_t$ as probabilities, weighted by an action precision (the inverse action noise $\beta$).

\[a_t \sim \text{Categorical}(σ(V_t * \beta^{-1}))\]

where $\sigma(x)$ is the softmax function

\[σ(x) = \frac{e^{x_i}}{\sum_{j=1}^n e^{x_j}}\]

for each category $i$, which ensures that the probabilities sum to 1. This report action model then has one additional parameter:

The action noise $\beta \in [0, \infty]$ (by default set to 1)

Takes one observation:

The categorical observation $o_t \in \{1, 2, \ldots, n\}$, which is the category observed at time $t$

And returns one action:

The report action $a_t \in \{1, 2, \ldots, n\}$, which is sampled from the Categorical distribution described above.

The premade Rescorla-Wagner constructor

In this section, we demonstrate how to use the premade Rescorla-Wagner model constructor provided by ActionModels. This can construct continuous, binary and categorical Rescorla-Wagner model variants, as described above. It can be used with the standard report actions above, or with custom response models.

First we load ActionModels, as well as StatsPlots for plotting the results:

using ActionModels
using StatsPlots

We can then create the configuration struct for the Rescorla-Wagner model.

rescorla_wagner_config = RescorlaWagner();

And use that to create an action model.

action_model = ActionModel(rescorla_wagner_config)

-- ActionModel --
Action model function: rescorla_wagner_act_after_update
Number of parameters: 1
Number of states: 0
Number of observations: 1
Number of actions: 1
submodel type: ActionModels.ContinuousRescorlaWagner

Which can now be used to simulate or fit the model as usual. By default, the RescorlaWagner constructor creates a continuous Rescorla-Wagner model with a Gaussian report action. It can also create binary and categorical Rescorla-Wagner models, by specifying the type keyword argument. Finally, it is possible to specify whether the expected value should be updated before or after the action is sampled, by setting the act_before_update keyword argument. This is useful with datasets that are structured so that the observation for a given timestep is not available until after the action has been sampled, such as in some reinforcement learning tasks.

continuous_RW_config = RescorlaWagner(
    type = :continuous,
    initial_value = 0.0,        # Initial expected value
    learning_rate = 0.1,        # Learning rate
    action_noise = 1.0,         # Action noise (only applicable with default report action)
    act_before_update = false,  # Whether to act before updating the expected value
)
continuous_RW = ActionModel(continuous_RW_config)

binary_RW_config = RescorlaWagner(
    type = :binary,
    initial_value = 0.0,        # Initial expected value
    learning_rate = 0.1,        # Learning rate
    action_noise = 1.0,         # Action noise (only applicable with default report action)
    act_before_update = false,  # Whether to act before updating the expected value
)
binary_RW = ActionModel(binary_RW_config)

categorical_RW_config = RescorlaWagner(
    type = :categorical,
    n_categories = 4,           # Number of categories
    initial_value = zeros(4),   # Initial expected values for each category - should be equal to the number of categories
    learning_rate = 0.1,        # Learning rate
    action_noise = 1.0,         # Action noise (only applicable with default report action)
    act_before_update = false,  # Whether to act before updating the expected value
)
categorical_RW = ActionModel(categorical_RW_config);

This means that users do not have to create an action model manually, but can proceed directly to simulating or fitting the model. With a categorical Rescorla-Wagner model, it looks like this (first simulating and thenf itting a single session):

#Load packages
using ActionModels
using StatsPlots

#Create action model
action_model = ActionModel(RescorlaWagner(type = :categorical, n_categories = 4))

#Initialize agent
agent = init_agent(action_model, save_history = :expected_value)

#Create observation
observations = [2, 1, 2, 2, 1, 2, 3, 4, 3, 2, 2, 2, 1, 1]

#Simulate actions
simulated_actions = simulate!(agent, observations)

#Fit the model to the simulated actions
model = create_model(
    action_model,
    (
        learning_rate = LogitNormal(),
        action_noise = LogNormal(),
        initial_value = MvNormal(zeros(4), I),
    ),
    observations,
    simulated_actions,
)

chns = sample_posterior!(model)

Chains MCMC chain (1000×18×2 Array{Float64, 3}):

Iterations        = 501:1:1500
Number of chains  = 2
Samples per chain = 1000
Wall duration     = 9.54 seconds
Compute duration  = 8.44 seconds
parameters        = learning_rate.session[1], action_noise.session[1], initial_value.session[1, 1], initial_value.session[2, 1], initial_value.session[3, 1], initial_value.session[4, 1]
internals         = lp, n_steps, is_accept, acceptance_rate, log_density, hamiltonian_energy, hamiltonian_energy_error, max_hamiltonian_energy_error, tree_depth, numerical_error, step_size, nom_step_size

Summary Statistics
                   parameters      mean       std      mcse    ess_bulk    ess_tail      rhat   ess_per_sec
                       Symbol   Float64   Float64   Float64     Float64     Float64   Float64       Float64

     learning_rate.session[1]    0.5360    0.2011    0.0048   1797.3865   1226.9289    1.0032      212.8848
      action_noise.session[1]    2.7045    2.2651    0.0628   1497.7765   1336.2854    0.9992      177.3986
  initial_value.session[1, 1]    0.2544    0.9467    0.0219   1875.1452   1290.0730    1.0036      222.0947
  initial_value.session[2, 1]   -0.2320    0.9376    0.0220   1807.0650   1453.7936    1.0000      214.0311
  initial_value.session[3, 1]    0.0686    0.9509    0.0208   2115.2037   1550.8753    0.9995      250.5275
  initial_value.session[4, 1]    0.0130    0.9459    0.0217   1896.8762   1601.7501    1.0026      224.6685

Quantiles
                   parameters      2.5%     25.0%     50.0%     75.0%     97.5%
                       Symbol   Float64   Float64   Float64   Float64   Float64

     learning_rate.session[1]    0.1444    0.3884    0.5436    0.6978    0.8750
      action_noise.session[1]    0.7666    1.3662    2.0360    3.1835    8.6920
  initial_value.session[1, 1]   -1.5943   -0.3938    0.2592    0.8709    2.1623
  initial_value.session[2, 1]   -2.0185   -0.8442   -0.2238    0.4007    1.6054
  initial_value.session[3, 1]   -1.8300   -0.5809    0.0700    0.7321    1.8760
  initial_value.session[4, 1]   -1.8618   -0.6396    0.0251    0.6434    1.8700

And we can see that the parameters have been succesfully estimated. The initial_value has an index for each session, and for each category. This action model can be used normally as described in the rest of the documentation.

Custom response models

In the examples above, we have used the default report action with the rescorla-wagner model. The precreated model, however, also allows for using custom response models. Here we show how to create the Gaussian report action as a custom response model - this is the default response model when using the continuous Rescorla-Wagner model. We will make a minor change, and add a bias parameter to the response model, which will be added to the expected value before sampling the action.

The response model itself should be a function that takes the model attributes as input and returns a distribution. The Rescorla-Wagner will have either been updated or not, depending on the act_before_update keyword, so the response model should be able to handle either case.

response_model = function biased_gaussian_report(attributes::ModelAttributes)
    rescorla_wagner = attributes.submodel
    Vₜ = rescorla_wagner.expected_value
    β = load_parameters(attributes).action_noise
    b = load_parameters(attributes).bias
    return Normal(Vₜ + b, β)
end;

In addition to the response model function, we need to specify the parameters, observations, and actions that the response model uses. This uses the usual syntax for creating action models.

response_model_parameters = (; action_noise = Parameter(1.0), bias = Parameter(0.0));
response_model_observations = (; observation = Observation(Float64));
response_model_actions = (; report = Action(Normal));

We can now create the Rescorla-Wagner action model with the custom response model.

action_model = ActionModel(
    RescorlaWagner(
        type = :continuous,
        response_model = response_model,
        response_model_parameters = response_model_parameters,
        response_model_observations = response_model_observations,
        response_model_actions = response_model_actions,
    ),
)

-- ActionModel --
Action model function: rescorla_wagner_act_after_update
Number of parameters: 2
Number of states: 0
Number of observations: 1
Number of actions: 1
submodel type: ActionModels.ContinuousRescorlaWagner

And continue to simulate with the biased report response model.

#Initialize agent
agent = init_agent(action_model, save_history = :expected_value)

#Set a strong bias but low noise
set_parameters!(agent, (; bias = 2.0, action_noise = 0.1))

#Create observation
observations = collect(0:0.1:2) .+ randn(21) * 0.1

#Simulate actions
simulated_actions = simulate!(agent, observations)

#Plot the trajectory
plot(agent, :expected_value)
plot!(simulated_actions, seriestype = :scatter, label = "Actions")
plot!(observations, seriestype = :scatter, label = "Observations")
title!("Biased actions relative to the expected value")

We can also fit the the model to estimate the action bias

model = create_model(
    action_model,
    (
        learning_rate = LogitNormal(),
        action_noise = LogNormal(),
        bias = Normal(),
    ),
    observations,
    simulated_actions,
)

chns = sample_posterior!(model)

plot(chns)

And we can see that the bias parameter has been estimated correctly, as has the other parameters.

Multiple observation sources

In some experiments, there are multiple soruces of observations. It is then a common strategy to use a single Rescorla-Wagner model to learn the expected value of each observation source, and then combine these expected values in a custom response model.

#TODO: This needs multiple submodels for implementation

The RescorlaWagner submodel

In the previous section, we demonstrated how to use the premade Rescorla-Wagner model constructor provided by ActionModels. ActionModels also provides a RescorlaWagner submodel, which can be used to create custom action models with the Rescorla-Wagner updates as a submodel. In this section, we will demonstrate how to use the RescorlaWagner submodel to create custom action models, which can be useful in more complex scenarios. This also serves as a demonstration of how the premade Rescorla-Wagner model is implemented in ActionModels.

In the following, we will demonstrate each of the three Rescorla-Wagner variants: continuous, binary and categorical. We will in each case combine them with the standard report action described above, but they can also be combined with other response models.

First we load the ActionModels package, as well as StatsPlots for plotting the results:

using ActionModels
using StatsPlots

Continuous variant

We can then create a Rescorla-Wagner submodel with the Rescorla-Wagner constructor.

submodel = ActionModels.ContinuousRescorlaWagner(
    initial_value = 0.0,  # Initial expected value
    learning_rate = 0.1,  # Learning rate
);

And an accompanying action model function with a Gaussian report action

model_function = function rescorla_wagner_gaussian_report(
    attributes::ModelAttributes,
    observation::Float64,
)
    #Load the action noise parameter
    parameters = load_parameters(attributes)
    β = parameters.action_noise

    #Extract Rescorla-Wagner submodel
    rescorla_wagner = attributes.submodel

    #Update the Rescorla-Wagner expectation based on the observation
    ActionModels.update!(rescorla_wagner, observation)

    #Extract the expected value from the Rescorla-Wagner submodel
    Vₜ = rescorla_wagner.expected_value

    #Return the action distribution, which is a Gaussian with the expected value Vₜ as mean and the action noise β as standard deviation
    action_distribution = Normal(Vₜ, β)

    return action_distribution
end;

We can then create an ActionModel object

action_model = ActionModel(
    model_function,
    submodel = submodel,
    parameters = (; action_noise = Parameter(1.0)),
    observations = (; observation = Observation()),
    actions = (; report = Action(Normal)),
)

-- ActionModel --
Action model function: rescorla_wagner_gaussian_report
Number of parameters: 1
Number of states: 0
Number of observations: 1
Number of actions: 1
submodel type: ActionModels.ContinuousRescorlaWagner

And proceed to simulate

#Initialize agent
agent = init_agent(action_model, save_history = :expected_value)

#Set parameters
set_parameters!(agent, (; action_noise = 0.1, learning_rate = 0.5))

#Create observations
observations = collect(0:0.1:2) .+ randn(21) * 0.1

#Simulate actions
simulated_actions = simulate!(agent, observations)

#Plot trajectory of expected values
plot(agent, :expected_value)
plot!(simulated_actions, seriestype = :scatter, label = "Actions")
plot!(observations, seriestype = :scatter, label = "Observations")
title!("Expectation trajectory and sampled actions", legend = :topright)

Or fit the model

#Fit the model to the simulated actions
model = create_model(
    action_model,
    (learning_rate = LogitNormal(), action_noise = LogNormal(), initial_value = Normal()),
    observations,
    simulated_actions,
)

chns = sample_posterior!(model)

plot(chns)

Binary variant

We can also create a binary Rescorla-Wagner submodel:

submodel = ActionModels.BinaryRescorlaWagner(
    initial_value = 0.0,  # Initial expected value
    learning_rate = 0.5,  # Learning rate
);

And an accompanying action model function with a Bernoulli report action. Now, the observation is a binary value (0 or 1), and the expected value is transformed with a sigmoid function before being used to calculate the action distribution.

model_function = function rescorla_wagner_bernoulli_report(
    attributes::ModelAttributes,
    observation::Int64,
)
    #Load the action noise parameter
    parameters = load_parameters(attributes)
    β = parameters.action_noise

    #Transform the action noise into a precision (or inverse noise)
    β⁻¹ = 1 / β

    #Extract Rescorla-Wagner submodel
    rescorla_wagner = attributes.submodel

    #Update the Rescorla-Wagner expectation based on the binary observation
    ActionModels.update!(rescorla_wagner, observation)

    #Extract the expected value from the Rescorla-Wagner submodel
    Vₜ = rescorla_wagner.expected_value

    #Transform the expected value with a logistic function to get the action probability, weighted by the action precision β⁻¹
    action_probability = logistic(Vₜ * β⁻¹)

    #Return the action distribution, which is a Bernoulli distribution with the action probability as parameter
    action_distribution = Bernoulli(action_probability)

    return action_distribution
end;

We can then create an ActionModel object, with the observation as an integer (0 or 1) and the action as a Bernoulli distribution.

action_model = ActionModel(
    model_function,
    submodel = submodel,
    parameters = (; action_noise = Parameter(1.0)),
    observations = (; observation = Observation(Int64)),
    actions = (; report = Action(Bernoulli)),
)

-- ActionModel --
Action model function: rescorla_wagner_bernoulli_report
Number of parameters: 1
Number of states: 0
Number of observations: 1
Number of actions: 1
submodel type: ActionModels.BinaryRescorlaWagner

And proceed to simulate

#Initialize agent
agent = init_agent(action_model, save_history = :expected_value)

#Create observation
observations = [0, 1, 0, 0, 0, 1, 0, 0, 0, 1, 0, 0, 0, 1]

#Simulate actions
simulated_actions = simulate!(agent, observations)

#Get the history of expected values and transform them to probability space
transformed_expected_values = logistic.(get_history(agent, :expected_value))

#Plot the trajectory of expected values (starting at time 0, before receiving inputs)
plot(0:length(transformed_expected_values)-1, transformed_expected_values, label = "Expected value (probability space)")
plot!(simulated_actions, seriestype = :scatter, label = "Actions")
plot!(observations, seriestype = :scatter, label = "Observations")

Or to fit the model

model = create_model(
    action_model,
    (learning_rate = LogitNormal(), action_noise = LogNormal(), initial_value = Normal()),
    observations,
    Int64.(simulated_actions),
)

chns = sample_posterior!(model)

plot(chns)

Categorical variant

We can also create a categorical Rescorla-Wagner submodel:

submodel = ActionModels.CategoricalRescorlaWagner(
    n_categories = 4,           # Number of categories
    initial_value = zeros(4),   # Initial expected values for each category
    learning_rate = 0.5,        # Learning rate
);

And an accompanying action model function with a categorical report action. Now, the observation is a categorical value (1, 2, 3 or 4), and the expected value is transformed with a softmax function before being used to calculate the action distribution. We additionally rely on LogExpFunctions for the softmax function.

using LogExpFunctions

model_function = function rescorla_wagner_categorical_report(
    attributes::ModelAttributes,
    observation::Int64,
)
    #Load the action noise parameter
    parameters = load_parameters(attributes)
    β = parameters.action_noise

    #Transform the action noise into a precision (or inverse noise)
    β⁻¹ = 1 / β

    #Extract Rescorla-Wagner submodel
    rescorla_wagner = attributes.submodel

    #Update the Rescorla-Wagner expectation based on the categorical observation
    ActionModels.update!(rescorla_wagner, observation)

    #Extract the vector of expected values from the Rescorla-Wagner submodel
    Vₜ = rescorla_wagner.expected_value

    #Transform the expected value with a logistic function to get the action probability, weighted by the action precision β⁻¹
    action_probabilities = softmax(Vₜ .* β⁻¹)

    #Return the action distribution, which is a Bernoulli distribution with the action probability as parameter
    action_distribution = Categorical(action_probabilities)

    return action_distribution
end;

We can then create an ActionModel object, with the observation as an integer (0 or 1) and the action as a Bernoulli distribution.

action_model = ActionModel(
    model_function,
    submodel = submodel,
    parameters = (; action_noise = Parameter(1.0)),
    observations = (; observation = Observation(Int64)),
    actions = (; report = Action(Categorical)),
)

-- ActionModel --
Action model function: rescorla_wagner_categorical_report
Number of parameters: 1
Number of states: 0
Number of observations: 1
Number of actions: 1
submodel type: ActionModels.CategoricalRescorlaWagner

And proceed to simulate:

#Initialize agent
agent = init_agent(action_model, save_history = :expected_value)

#Create observation
observations = [2, 1, 2, 2, 1, 2, 3, 4, 3, 2, 2, 2, 1, 1]

#Simulate actions
simulated_actions = simulate!(agent, observations)

#Get number of timesteps
n_timesteps = length(observations)

Get the history of expected values, which is a vector of vectors for each category

expected_values = get_history(agent, :expected_value)

#Collect the expected values for each category in a matrix, and transform them to probability space
expected_values = hcat(
    logistic.([expected_values[i][1] for i = 1:length(expected_values)]),
    logistic.([expected_values[i][2] for i = 1:length(expected_values)]),
    logistic.([expected_values[i][3] for i = 1:length(expected_values)]),
    logistic.([expected_values[i][4] for i = 1:length(expected_values)]))

#Use a softmax to normalize the expected values for each category into a proper categorical distribution
expected_values = transpose(hcat(softmax.(eachrow(expected_values))...))

15×4 transpose(::Matrix{Float64}) with eltype Float64:
 0.25      0.25      0.25      0.25
 0.241989  0.274032  0.241989  0.241989
 0.266433  0.262324  0.235621  0.235621
 0.255127  0.285061  0.229906  0.229906
 0.245237  0.305257  0.224753  0.224753
 0.268674  0.288169  0.221578  0.221578
 0.256168  0.308003  0.217915  0.217915
 0.249406  0.292533  0.240297  0.217765
 0.244302  0.279621  0.236824  0.239253
 0.238863  0.267488  0.258928  0.23472
 0.232814  0.288708  0.249072  0.229406
 0.227152  0.308107  0.240416  0.224326
 0.222059  0.325262  0.232985  0.219694
 0.242634  0.306948  0.230859  0.219559
 0.264285  0.289743  0.227802  0.21817

Create plot title

plot_title = plot(
    title = "Estimated probability of each category over time",
    grid = false,
    showaxis = false,
    bottom_margin = -180Plots.px,
)
#Create plot of probabilities changing
prob_plot = plot(0:n_timesteps, expected_values, label = ["Cat 1" "Cat 2" "Cat 3" "Cat 4"])
#Create plot of observations and simulated actions
obs_plot = plot(observations, seriestype = :scatter, label = "Observed category")
plot!(simulated_actions .+ 0.1, seriestype = :scatter, label = "Prediction for next timestep")
#Create full plot
plot(plot_title, prob_plot, obs_plot, layout = (3, 1))

Or to fit the model

model = create_model(
    action_model,
    (
        learning_rate = LogitNormal(),
        action_noise = LogNormal(),
        initial_value = MvNormal(zeros(4), I),
    ),
    observations,
    simulated_actions,
)

chns = sample_posterior!(model)

plot(chns)

This page was generated using Literate.jl.