2  Quantitative Techniques

THIS IS A CHAPTER STUB FOR THE SOCIAL SCIENCE FOR SUSTAINABILITY BOOK. IT IS CURRENTLY JUST THE FOLLOWING DRAFT VERSION OF THE CHAPTER THAT WAS MUCH MORE AGENT-BASED MODELING FOCUSED. CALCULUS AND OTHER MATHEMATICAL TOPICS WILL BE INCLUDED IN THE UPDATED VERSION FOR FALL 2025 QUARTER.

2.1 Old Draft Version of this Chapter

This section introduces agent-based modeling in socmod starting with basic functional and object-oriented programming in R. It therefore builds on an assumed knowledge of R basics which can be gained by studying, for example, Hands-On Programming with R or other basic R tutorial introductions.

To model social behavior it helps to have intuitive software tools grounded in an empirically-motivated, self-consistent scientific theory. socmod provides those tools, but to put it to best use it’s helpful to know how it works, especially for people used to using R in a data science context. Many of the same tools are transferable, especially declarative/functional programming, as is used and encouraged by the tidy approach in R for Data Science.

To create simulations that generate social behavior we have to go beyond tabular representations of the world to more complex representations of people and their interactions that can be measured. These measurements are recorded in a standard tabular format, e.g., CSV, which can then be analyzed using tidy/R for Data Science strategies.

In agent-based modeling, we create software representations of simulated people, i.e. agents, or other interacting entities, including the agents’ environment. Object-oriented programming is the natural choice for software design in this case because it provides a structure for defining custom objects like socially interacting agents. An object is one bit of data that could be a number or character type, but could also be something more complicated. In object-oriented programming we define custom classes that specify various data fields and function methods for maintaining and modifying the state of objects. The fields and methods of a class therefore are sets of related data and functions to represent things in the world.

To summarize, objects are a way to track and modify the state of different software entities. We can create simulations of real-world systems by defining custom object types called classes whose state and behaviors are modeled on relevant real-world features and behaviors.

In socmod, we define custom R objects for agent-based models of social behavior using the R6Class method that creates a new object type, i.e., a new class. We use tidy-style functional programming where helpful to represent model features and dynamics.

In the remainder of this chapter, I will first review variable assignment and data structures in R including vectors, lists, data.frames, and tibbles. We then review relevant topics in functional programming, then demonstrate how to create custom classes in R with the library R6. This tutorial then closes with a demonstration of a simple four-agent agent-based model of social behavior, kept simple to highlight the functional and object-oriented design patterns in socmod, representing Agent, an AgentBasedModel, and Trialclasses defined using the R6Class function.

2.2 Data collection types in R

In R, there is a hierarchy of data collection types that are necessary to know about for functional programming, which often involves applying a function across every element of a collection of data. Data collections include vectors, lists, data.frames, and tibbles, which we’ll cover here.

2.2.1 Collections of basic data types

In computer science, a collection is an abstract data type that organizes and stores instances of other data types. In R, the most basic data types are numbers (e.g., double or integer) and text, whose data type is character in R whether it’s a single character or several:

print(typeof('c'))
[1] "character"
print(typeof("social science is kewl"))
[1] "character"

2.2.2 Vectors

R vectors are defined using the c() function, e.g., vec <- c(0, 0, 1). The data type of all vector elements must be the same. This is enforced by R coercing data to different types, for example:

print(c(0, 0, 1))
[1] 0 0 1
print(c(0, 0, "yo"))
[1] "0"  "0"  "yo"

We index vectors using single square brackets, with R indexing starting from 1:

vec <- c(1, 2, 3, 4, 5) # equiv to vec <- 1:5 or seq(1, 5)
print(vec[1])
[1] 1
print(vec[length(vec)])
[1] 5

2.2.3 Lists

If one wants to keep elements of different types in a single collection, use the R list:

l <- list(0, 542, "yo")
print(l)
[[1]]
[1] 0

[[2]]
[1] 542

[[3]]
[1] "yo"

Note the visual representation has changed on printout. We now need to use double-square brackets to index list elements themselves:

print(l[[3]])
[1] "yo"

If we leave off one of the square brackets we effectively get a sub-list with just one element:

print(l[3])
[[1]]
[1] "yo"

This is useful if you want to create a new sub-list with more than one element:

print(l[c(2, 3)])
[[1]]
[1] 542

[[2]]
[1] "yo"

2.2.4 Named lists

Named lists are the primary key-value store in R, just like dict in Python. It allows us to label entries of the list and access them using double square brackets with the character name or using the $ access operator:

named_l <- list(a = c(0, 5, 6), b = c(7, 8, 9))
print(named_l$a == named_l[["a"]]) # compares element-by-element
[1] TRUE TRUE TRUE
print(all(named_l$b == named_l[["b"]]))
[1] TRUE

2.2.5 tibble (and data.frame)

Both the tibble and data.frame classes represent tabular data, meaning data that can be represented in table format, e.g., in comma- or tab-separated value format. There are some subtle differences listed below, but we’ll use tibble for representing our tables.

Tibbles are like fancy lists that have special properties that make data manipulation more efficient. The details of how this works aren’t super important at this point. The important thing to know is how one can initialize tibbles and access tibble columns similarly to lists, as shown in the following example:

tbl <- tibble::tibble(a = c(0, 5, 6), b = c(7, 8, 9))
print(tbl$a)
[1] 0 5 6
print(tbl[["b"]])
[1] 7 8 9

Using tibbles ensures that data manipulation and analysis using the tidyverse will work as expected. For example, the tidyverse provides functions for analyzing different groups of data within a larger dataset. This is a common data analysis pattern called split-apply-combine, which in the tidyverse translates to group-by and summarise when using the dplyr library. For example, we can calculate the mean of measurements in “experimental” and “control” conditions in some fake data:

observations <- tibble::tibble(
  condition = c("experimental", "experimental", "control", "control"),
  measurement = c(13.5, 14.6, 3.4, 5.4)
)

mean_measurement_tbl <- 
  observations %>% 
  dplyr::group_by(condition) %>% 
  dplyr::summarise(mean_measurement = mean(measurement))

print(mean_measurement_tbl)
# A tibble: 2 × 2
  condition    mean_measurement
  <chr>                   <dbl>
1 control                   4.4
2 experimental             14.0

2.2.6 Difference between data.frame and tibble

The tibble library provides a table data representation (also called tibble) that is a bit more flexible and intuitive than the R built-in data.frame. One reason I prefer tibbles is because traditional data.frames automatically convert strings to factors unless you tell them not to. Tibbles don’t, so you’re never surprised.

Column naming is also more flexible with tibbles. A data.frame requires syntactically valid R names, while tibbles can handle column names that include spaces or even non-standard characters. I often use this feature to use column names like Mean adaptation success that print nicely when used as labels in ggplot.

Finally, tibble operations always return a tibble. For example, if df were a data.frame, the operation df[, 1] would return a vector by default. If it were a tibble it would return another tibble, making behavior more predictable in data analysis pipelines.

2.3 Functional programming

Functional programming is especially useful for writing code that applies the same function to several inputs, but wants to leave it up to the user to specify exactly which function should be applied. In socmod we pass functions as arguments to other functions to specify how agents pick their interaction partners (i.e., teachers in a learning context) and how social learning works. Here are some simple examples demonstrating the key concepts and techniques of functional programming that could help with socmod programming.

2.3.1 Higher-order functions: functions with function arguments

In R, and other programming languages supporting the functional style, one can treat functions like any other data and pass it as an argument to other functions. A function that accepts a function as an argument is called a higher-order function. Here is a simple example of a higher order function that takes some data as a first argument and applies the function f to that data twice, putting the result of each calculation in a two-element vector.

repeat_2_higher_order_func <- function(data, f) {
  return (c(f(data), f(data)))
}

Below we call this higher order function by providing data=2 as the first argument and an anonymous function as the second argument, which can be written using \(arg) { ...function body... } syntax:

# \(x) {...} is equivalent to function(x) { ... }; these are 
# anonymous functions.
# Expecting to return c(4, 4)
repeat_2_higher_order_func(data = 2, f = \(x) { return (x * 2) }) 
[1] 4 4

The above is the representation of a vector printed to screen, so we see that our expectations were matched.

c(4, 4)
[1] 4 4

2.3.2 map: a common, useful higher-order function used often in socmod

One of the most useful and common higher-order functions is the map function. This function maps a function, denoted .f below, onto every element in a collection, denoted .x. This notation is from the map family of functions in the purrr library for tidy functional programming.

library(purrr)

Here is an example of mapping an anonymous function that multiplies its input by 3 onto a vector with entries 2 and 8:

# Now get we use map_vec that applies .f to every element of .x,
# expecting the following to return a vector with elements 3*2 
# and 3*8, i.e., c(6, 24).
purrr::map_vec(.x = c(2, 8), .f = \(x) { return(3 * x)})
[1]  6 24

If we want to apply a function that takes two variables, the value and index of an element in a collection, we can use the imap family of functions as follows to return a vector that contains the original element multiplied by its place in the input vector:

input_vec <- c(2, 5050, 6)
purrr::imap_vec(input_vec, \(el, idx) { return (idx * el) })
[1]     2 10100    18

2.4 Custom objects: R6 classes in socmod

Classes are ways to encapsulate diverse distinct, but related, processes, behaviors, data, attributes, and other types of information in a single object, i.e., a software representation of an instance of that entity. R6 is a library for creating our own custom classes that serve as an abstract template that specifies what distinguishes different types of objects/entities. Below we show first how to create a new agent, i.e., a new instance of the Agent class that is provided by socmod. After that is a fun example of how we can design a different social behavioral model, one of football/soccer matches. We write classes for players and teams and develop a play_match(team1, team2) function that pits two teams against each other.

2.4.1 The Agent class in socmod

Below we create a new instance of the Agent class using the class constructor, the function written socmod::Agent$new() below that creates a new instance of the class.

a1 <- socmod::Agent$new(1, name = "Matt", 
                        behavior = "Adaptive", 
                        fitness=1e6)

We can use the access operator, $ in R, to access the fields (i.e., attributes) of agent a1 like so:

print(a1$get_fitness())
[1] 1e+06
print(a1$get_name())
[1] "Matt"
print(a1$get_id())
[1] 1
print(a1$get_behavior())
[1] "Adaptive"

We can also use purrr::map functions over neighbors like so:

# Assign neighbors to a1.

a1$set_neighbors(c(
  socmod::Agent$new(id = 2, name = "n2"),
  socmod::Agent$new(id = 3, name = "n3")
))

# Get the list of neighbors back to check it worked.
neighbors <- a1$get_neighbors()
print(class(neighbors))  # should be [1] "Neighbors" "R6"
[1] "Neighbors" "R6"       
# Neighbors$map() returns a list...
neighbor_names <- neighbors$map(\(n) n$get_name())
print(neighbor_names)
[[1]]
[1] "n2"

[[2]]
[1] "n3"
# ...use `unlist` to convert it to a vector:
print(unlist(neighbor_names))
[1] "n2" "n3"

2.4.2 Exercise: design and define classes from scratch

In class we started creating our model of a soccer player agent called Footballer, defined below. We ran out of time at the end of class to write methods for Footballer, i.e., ways that a football player could interact with the world, or that the world could act upon a soccer player. However we only had time to create a stub for two methods. A stub is a minimal chunk of code that does very little to nothing, but doesn’t get in the way by causing errors or anything like that. It enables us to document our plans for future development in the exact place where it would happen in the code.

Below we have stubs for score_goal and get_penalty methods for in-game behaviors. Other possibilities could include get_traded that would change its team and perhaps get_signed for cases where a player is a free agent.

library(R6)

Footballer <- R6Class("Footballer",
                        
  public = list(
    # Listing attributes as fields and 
    # setting to zero for their definition.
    speed = 0.0,  # units of max km/h
    accuracy = 0.0, # probability of scoring on a shot
    market_value = 0.0, # 
    aggressiveness = 0.0, # units of penalties per match
    team = "",
    
    initialize = function(speed = 15, 
                          accuracy = 0.2, market_value = 1e6, 
                          aggressiveness = 0.5, 
                          team = "Free agent") {
      self$speed = speed
      self$accuracy = accuracy
      self$market_value = market_value
      self$aggressiveness = aggressiveness
      self$team = team
    },
    
    # Stub two SoccerPlayer class methods...
    # ...one for scoring a goal in a game...
    scored_goal = function() {
      return (ifelse(runif(1) < self$accuracy, 1, 0))
    },
    # ...and one for getting a penalty in a game.
    get_penalty_on_play = function() {
      return (runif(1) < self$aggressiveness)
    }
  )
)

Team <- R6Class("Team",
                
  public = list(
    name = "",
    players = list(),
    wins = 0,
    ties = 0,
    payroll = 0,
    games_played = 0,
    
    # Team name is required, with players optionally specified.
    initialize = function(name, players = list()) {

      self$name = name

      # Initialize players and payroll.
      for (player in players) {
        self$payroll <- self$payroll + player$market_value
        player$team <- name
      }

      self$players <- players
    },
    
    # Add a player to the roster.
    sign_player = function(player) {
      # Add the player to the team.
      self$players <- c(players, player)
      # Update payroll.
      self$payroll <- self$payroll + player$market_value
      # Update the player's team to be this team's name.
      player$team <- self$name
    }
  )
)
footballer <- Footballer$new(accuracy = 0.5)
footballer$scored_goal()
[1] 1
# sum(purrr::map_vec(1:10, \(.) p$scored_goal()))

We can define a function that simulates playing a football match where the team that scores more goals wins, or a tie if teams have the same score. We track penalties received but assume they don’t have an effect on the match outcome. We assume each player gets ten chances to score a goal. We leave it to the reader as an exercise to make use of the Footballer$get_penalty_on_play() method within the game to add penalties and consequences in the game (e.g., penalty kicks or something), and to use the Team$sign_player() add more players to each roster, which should increase the mean total scores in each game based on the simple way we’ve modeled gameplay in play_match.

# Define a function that models a football match.
play_match <- function(team1, team2) {
  
  # Walk over each player to see how many scores they get, 
  # summing to get the total team score. 
  team1_score <- sum(
    purrr::map_vec(
      team1$players,
      \(player) {
      # Calculate the total goals scored by the current player.
        sum(purrr::map_vec(1:10, \(.) player$scored_goal()))
      }
    ) 
  )
  
  team2_score <- sum(
    purrr::map_vec(
      team2$players,
      \(player) {
      # Calculate the total goals scored by the current player.
        sum(purrr::map_vec(1:10, \(.) player$scored_goal()))
      }
    ) 
  )
  
  if (team1_score == team2_score) {
    team1$ties <- team1$ties + 1
    team2$ties <- team2$ties + 1
    cat("Tie game!\n")
  } else if (team1_score > team2_score) {
    cat(team1$name, "wins!!!\n")
    team1$wins <- team1$wins + 1
  } else {
    cat(team2$name, "wins!!!\n")
    team2$wins <- team2$wins + 1
  }
  
  cat(team1$name, ": ", team1_score, "   ",  
      team2$name, ": ", team2_score, "\n")
  
  team1$games_played <- team1$games_played + 1
  team2$games_played <- team2$games_played + 1
}

First we need teams, which we create and to which we assign players using the constructor, Team$new() in the code below.

# Initialize two teams, Whales and Squirrels, each with two
# players. All players have identical default attributes.
whales <- Team$new(
  name = "Whales", 
  players = c(Footballer$new(), Footballer$new())
)
squirrels <- Team$new(
  name = "Squirrels", 
  players = c(Footballer$new(), Footballer$new())
)

# Play three matches.
play_match(whales, squirrels)
Squirrels wins!!!
Whales :  3     Squirrels :  4 
play_match(squirrels, whales)
Tie game!
Squirrels :  3     Whales :  3 
play_match(squirrels, whales)
Whales wins!!!
Squirrels :  2     Whales :  5 
# Print how many games each time won.
cat("\nAfter three games...", "\nThe Whales have won", 
    whales$wins, "games and the Squirrels have won", squirrels$wins)

After three games... 
The Whales have won 1 games and the Squirrels have won 1

2.5 An aside on copy vs reference semantics in R: assignment and mutation

When you change something in R, sometimes R makes a copy of the object so the original stays the same, and sometimes it changes the object by reference, meaning it updates the original directly. Understanding whether an operation creates a new copy or modifies an object in place (called mutation) is important because it affects whether you need to assign the result back to a variable or not. Even though this idea is usually taught in more advanced programming, it’s useful for beginners working with modeling and simulation because different parts of R (like data frames vs. agent-based models) follow different rules, and knowing which is which can help avoid confusing bugs.

2.5.1 1. The underlying tension: how R handles object copying and modification

At the heart of R’s behavior lies a subtle technical distinction: whether an object is copied or modified in place when you change it. By default, R uses copy-on-modify semantics, meaning that if you change an object, R creates a new copy behind the scenes and leaves the original unchanged. This protects your data from unintended side effects and reflects R’s origins as a functional programming language for statistics. But as R evolved to support more object-oriented programming and to interface with other languages like C and C++, it introduced types of objects—such as environments, external pointers, and R6 classes—that instead use reference semantics. With these objects, changes happen directly to the object itself without making a copy. Internally, R’s i nterpreter keeps track of when it needs to copy or reuse memory, but these mechanics aren’t always visible, which can lead to surprises when working with more complex objects.

2.5.2 2. A practical guide for navigating copy and reference semantics

For most core R data types—vectors, lists, data frames—you can assume that R makes a copy whenever you modify an object. But for certain kinds of objects, especially environments, R6 classes (used in socmod), and graphs from the igraph package, the behavior is different: these follow reference semantics, meaning they’re modified directly. In socmod, all models and agents are implemented using R6 classes, so when you call something like model$set_parameter("n_agents", 100), the model’s state is updated directly and immediately—no need to reassign model <- .... This makes working with simulations feel more natural, but it also means that habits from working with regular R objects don’t fully apply. A simple rule of thumb: if it’s an R6 object or a special object like an igraph graph, expect it to behave by reference; otherwise, expect copy-on-modify.

2.5.3 3. How igraph fits in: working with C-backed graphs in R

Much of igraph isn’t written in R itself, but in another programming language called C. C allows developers to write code that’s much faster and more efficient than what’s possible in R alone. This efficiency is great for users—igraph can handle large networks quickly—but it also introduces a subtle complication to how we work with graphs in R. When you modify an igraph object, you’re actually asking C code to change the graph behind the scenes. But instead of directly changing the graph you already have, igraph’s functions usually return a new graph object with the changes applied. This means that even though igraph mostly follows reference semantics (where an object can be updated without copying everything), you still need to explicitly reassign the result of a modification to keep the updated graph. It’s different from how socmod’s models and agents work: those are built with R6 classes, which really do modify objects “in place” without needing reassignment.

2.6 Agent-based models of social behavior

Here we develop our first agent-based models of social behavior. We’ll start small, with the following four-household network of solar panel electrification using only success-biased learning (Figure 2.1). Then we’ll do a detailed comparison of different social learning strategies with different parameterizations.

Figure 2.1: Example social network and sustainable adaptation diffusion problem.

In the example, agent 2 has adopted rooftop solar. It has some adaptive fitness, which can be thought of as the financial gain from adopting a sustainable, adaptive behavior. We can directly create the social network using igraph::make_graph() then set it as the model’s social network, along with setting the learning strategy to be contagion learning.

The adoption_rate parameter specifies how likely another agent is to adopt the adaptive behavior when exposed to it. The drop_rate is how likely an agent doing the adaptive behavior is to stop doing it and revert to the non-adaptive behavior.

2.6.1 Example 1: Time series of adaptation diffusion

adoption_rate = 0.8
 
# Run model with 
g <- igraph::make_graph(~ 1-2, 1-3, 1-4, 3-2)
mps <- make_model_parameters(
  contagion_model_dynamics, graph = g, adoption_rate = 0.8, drop_rate = 0.1
)
cabm <- make_abm(mps); 
cabm$get_agent(2)$set_behavior("Adaptive"); 
trial <- run_trial(cabm, stop = fixated)
plot_prevalence(trial, tracked_behaviors = c("Adaptive", "Legacy"))
`geom_line()`: Each group consists of only one observation.
ℹ Do you need to adjust the group aesthetic?

# Same, but now call the adaptive behavior "Baglowie" instead of "Adaptive".
g <- igraph::make_graph(~ 1-2, 1-3, 1-4, 3-2)
mps <- make_model_parameters(
  contagion_model_dynamics, graph = g, adoption_rate = 0.8, drop_rate = 0.1
)

cabm <- make_abm(mps)
cabm$get_agent(2)$set_behavior("Baglowie"); 

trial <- run_trial(cabm, stop = fixated, adaptive_behavior = "Baglowie")
plot_prevalence(trial, tracked_behaviors = c("Legacy", "Baglowie"))
`geom_line()`: Each group consists of only one observation.
ℹ Do you need to adjust the group aesthetic?