Abstract:
Humans excel at learning from the ever-present social stimuli in their environment. To understand this cognitive capacity, computational models of social learning often borrow from the longstanding tradition of computationally modelling individual learning; reinforcement learning models can be augmented with social information to model imitation behaviour, and planning models can be inverted to model inference behaviour. While a fruitful approach, this influence comes at a cost: social cognition research commonly adapts experimental designs and analysis methods from individual learning, treating social information as little more than a reward signal. This strips social cognition of the richness of stimuli we experience in even the most banal everyday situations, and risks a systematic underestimation of our social learning capabilities. To address this, I applied findings from three key domains of individual learning to social learning contexts: generalization, learning rate biases, and cost-benefit strategy arbitration. I modified existing paradigms to bring them closer to naturalistic social learning settings, and critically examined to what extent findings from the individual learning literature can be applied to social learning, and how they differ in their application.
In the first project, we examined how humans integrate social information into individual decision-making when learning from others with similar, though not identical, reward functions. Prior studies in this domain have focused on cases where the demonstrator and the observer share the same reward function, limiting our understanding of how flexible our social learning abilities are. We extended the spatially correlated multi-armed bandit paradigm, which is commonly used to investigate individual generalization, to social settings. This allowed us to better capture how humans may share some general preferences while maintaining specific individual tastes. Participants used social information more flexibly than previous paradigms indicated, and our novel Social Generalization model, which treats social information as noisier individual information, provided the best fit. Social information was used as an exploration tool, outsourcing costly individual exploration to others – a resource-rational approach consistent with findings from the individual learning literature showing that humans often use heuristics to simplify cognitively complex tasks. Additionally, participants performed better in group settings than in solo settings, in line with existing results on collective intelligence.
In the second project, we investigated how learning rate biases differ between individual and social learning settings. We tested whether the positivity bias frequently reported in individual reinforcement learning also manifests in observational learning settings, and whether this bias benefits agent performance. In individual learning, positivity bias refers to the tendency to weight positive outcomes more strongly than negative outcomes; in social learning, it translates to weighting a demonstrator’s positive feedback more strongly than their negative feedback. While we replicated the persistent positivity bias found in individual learning, we found a social positivity bias only in environments where it was adaptive. This suggests that learning rate biases are more flexible in social settings than in individual contexts.
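As an illustration only, a positivity bias of this kind is often formalized as a value update with separate learning rates for positive and negative prediction errors. The sketch below assumes a simple Rescorla-Wagner-style update; the function names, parameter values, and outcome sequence are hypothetical and are not taken from the models fitted in this thesis. In an observational setting, the same update could be applied to a demonstrator's outcomes rather than one's own.

```python
def update_value(value, reward, alpha_pos, alpha_neg):
    """One value update with asymmetric learning rates (illustrative assumption)."""
    prediction_error = reward - value
    # A positivity bias corresponds to alpha_pos > alpha_neg.
    alpha = alpha_pos if prediction_error > 0 else alpha_neg
    return value + alpha * prediction_error


def learn(rewards, alpha_pos, alpha_neg, initial_value=0.5):
    """Apply the update to a sequence of outcomes and return the final value."""
    value = initial_value
    for reward in rewards:
        value = update_value(value, reward, alpha_pos, alpha_neg)
    return value


# Hypothetical outcome sequence: alternating wins (1) and losses (0).
outcomes = [1, 0, 1, 0]
unbiased = learn(outcomes, alpha_pos=0.3, alpha_neg=0.3)
biased = learn(outcomes, alpha_pos=0.4, alpha_neg=0.2)
```

With identical outcomes, the learner with the larger positive learning rate ends up with a higher value estimate than the unbiased learner, which is the signature such models use to detect the bias.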
In the third project, we tested whether the resource-rational cost-benefit arbitration between strategies observed in individual learning extends to social learning. Inspired by distinctions between model-free learning and model-based learning, we designed a task to investigate how humans switch between imitation, value inference, and model-based inference when the relative costs and benefits of each strategy vary. This extends previous literature on arbitration between levels of social learning by incorporating a distinction between value inference and model-based inference, and by accounting for cognitive cost in addition to strategy reliability. Participants adjusted their learning strategy based on task demands, favouring imitation when higher-level strategies were more costly. This mirrors findings from individual settings, where humans favour simpler strategies when planning is complex and unreliable. However, some participants persisted in using higher-level strategies even when doing so was not beneficial for them, hinting at a general bias towards social inference. Neural analysis provides insight into how humans make this trade-off: participants imitated when they were uncertain about the structure of the environment and still attempting to infer it. When they were more certain about the environment, they no longer relied on social inference, instead basing their choices on the information they had inferred previously.
Across three domains, this thesis demonstrates that social learning shows distinct behavioural patterns when individual learning principles are tested in settings that preserve some of the richness of our social environments. Participants often performed better in social contexts, underscoring the importance of investigating human intelligence in the settings we evolved to excel in. More broadly, this thesis contributes to descriptions of human behaviour by providing computational models, contrasted with normative strategies and individual learning in the same settings, which help us better understand the unique aspects of human social cognition.