Agent-Based Simulation (ABS) is not yet a commonplace term in demographic circles: a quick search of some of the major demographic journals yields no more than a handful of articles that use this or similar terms, and fewer still that have tackled research questions from this perspective. Despite this, the methodology has a lot to offer, and interest in it within the discipline is growing (see, for example, Diaz et al. (2011), Billari et al. (2007), Entwisle (2007), and the discussion in Courgeau (2012)). I hope to use this piece to explain the logic behind ABS, and to argue for its utility in helping demographers develop causal theories. In this, I follow the work of Francesco Billari and the contributors to the two volumes he co-edited, which have provided the impetus for current demographic endeavours in this direction.
So what is Agent-Based Simulation? In common with many other approaches in social science, it aims to understand the real world by creating a model (Epstein, 2008). These models are representations of the real world: sometimes statistical, sometimes mathematical, and in other cases purely theoretical or even implicit. Whatever the form, the hope is that by analysing the model we can better understand the phenomena it represents. Simulation is no different, except that the model is formalised as a computer programme instead of, say, a regression equation (cf. Rossiter, 2010).
In an agent-based simulation, individuals are modelled explicitly (rather than in aggregate), and they are endowed with agency, by which I mean that their decision-making process is modelled in some way, however simply. Often this means positing behavioural rules or strategies for agents, in forms such as “if X, do Y”. Generally, agent-based models also include interactions between agents and represent how these interactions may affect decisions.
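To make the idea of behavioural rules and interaction concrete, here is a minimal sketch in Python. It is an illustration under stated assumptions rather than a model from the literature: the `Agent` class, the threshold rule, the randomly assigned contacts, and all parameter values are hypothetical. Each agent follows one rule of the “if X, do Y” kind: if a large enough share of my contacts has adopted some behaviour, I adopt it too.

```python
import random


class Agent:
    """A hypothetical agent with a single threshold rule (illustrative only)."""

    def __init__(self, threshold, adopted=False):
        self.threshold = threshold  # share of contacts needed to trigger adoption
        self.adopted = adopted
        self.contacts = []          # other agents this agent observes

    def share_of_contacts_adopted(self):
        if not self.contacts:
            return 0.0
        return sum(c.adopted for c in self.contacts) / len(self.contacts)

    def decide(self):
        # "If X, do Y": if enough of my contacts have adopted, adopt too.
        # Adoption is assumed to be irreversible in this sketch.
        return self.adopted or self.share_of_contacts_adopted() >= self.threshold


def build_population(n_agents, n_contacts, seed_share, rng):
    """Create agents with random thresholds and randomly chosen contacts."""
    agents = [
        Agent(threshold=rng.uniform(0.1, 0.6), adopted=rng.random() < seed_share)
        for _ in range(n_agents)
    ]
    for agent in agents:
        others = [a for a in agents if a is not agent]
        agent.contacts = rng.sample(others, n_contacts)
    return agents


def step(agents):
    """Synchronous update: every agent decides first, then all decisions apply."""
    decisions = [agent.decide() for agent in agents]
    for agent, decision in zip(agents, decisions):
        agent.adopted = decision


def run(n_agents=200, n_contacts=5, seed_share=0.1, n_steps=20, seed=1):
    """Return the share of adopters at the start and after each step."""
    rng = random.Random(seed)
    agents = build_population(n_agents, n_contacts, seed_share, rng)
    history = [sum(a.adopted for a in agents) / n_agents]
    for _ in range(n_steps):
        step(agents)
        history.append(sum(a.adopted for a in agents) / n_agents)
    return history
```

Calling `run()` returns a population-level series (the share of adopters over time) that emerges from individual rules and interactions; it is this kind of synthetic output that would be compared with empirical data.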
In many ways, ABS follows a similar logic to statistical modelling (see Figure), in that a model is created to test a hypothesis about some real-world social process. In statistical modelling, the outcomes of interest are generally the parameters on particular model covariates, which are estimated from empirical data. The direction and size of these parameters allow us to draw inferences about the hypothesis. In ABS, synthetic data are generated by running a model that formalises a particular theory about how people behave. Because the simulation is a representation of some social process, these data should be analogous to corresponding quantities in the real world, allowing meaningful comparisons of synthetic and empirical data. A high degree of concordance between the generated data and the observed data gives us confidence that the theory may be sensible (Gilbert and Troitzsch, 2005).
So why should we, as demographers, care about ABS? The most important reason, I would argue, is that ABS allows us to examine the plausibility of causal theories linking individual behaviour to population-level outcomes. Given some hypothesis about how demographic change is caused by particular decision-making strategies, a simulation can be created that codes behavioural rules into artificial agents in such a way as to formalise this theory. By comparing synthetic simulation outputs (hazard rates, for instance) with empirical data, we can get an idea of whether the behavioural mechanism we have posited is plausible.
ABS is particularly powerful because it allows a great deal of freedom and flexibility in the types of model that can be specified. The micro (individual) and the macro (population) levels can be modelled simultaneously and dynamically. Not only that, but we can include relationships and even feedback loops between the different levels of aggregation: for example, one could simulate not only how migration rates depend on individuals’ cost/benefit analyses, but also how these analyses depend on population densities and thus on past migration rates. The social context of demographic behaviour does not have to be overlooked either. Because interaction between agents is central to ABS, phenomena such as social influence, opinion diffusion, and information spread are natural candidates for inclusion in demographic simulations (see, for example, Billari et al., 2007).
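The migration example can be sketched as a micro-macro feedback loop. What follows is a deliberately crude toy, and every element of it is an assumption made for illustration: there are two regions, the function name `simulate_migration` and the `crowding` parameter are hypothetical, a region's benefit is assumed to fall linearly with its population share, and only about one agent in ten reconsiders their location in any step.

```python
import random


def simulate_migration(n_agents=1000, n_steps=30, crowding=1.0, seed=0):
    """Toy two-region model; returns the share living in region 1 after each step."""
    rng = random.Random(seed)
    # Assumed starting point: roughly 90% of agents live in region 0.
    locations = [0 if rng.random() < 0.9 else 1 for _ in range(n_agents)]
    # Each agent has an individual, fixed cost of moving.
    move_costs = [rng.uniform(0.0, 0.5) for _ in range(n_agents)]

    shares = []
    for _ in range(n_steps):
        # Macro level: densities are the outcome of all past migration decisions.
        share_in_1 = sum(locations) / n_agents
        density = (1.0 - share_in_1, share_in_1)
        # Macro -> micro: a region becomes less attractive as it fills up.
        benefit = (1.0 - crowding * density[0], 1.0 - crowding * density[1])

        for i in range(n_agents):
            here = locations[i]
            there = 1 - here
            considers_moving = rng.random() < 0.1
            # Micro rule: move if the gain in benefit exceeds the personal cost.
            if considers_moving and benefit[there] - benefit[here] > move_costs[i]:
                locations[i] = there

        # Micro -> macro: individual moves change next step's densities.
        shares.append(sum(locations) / n_agents)
    return shares
```

With `crowding=0.0` the two regions are equally attractive and nobody moves; with a positive value, movement out of the crowded region slows as the densities even out, because each move changes the calculation faced by everyone else.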
Modelling within the confines of a computer programme has further advantages, namely that the researcher has complete control over the variables and processes in the model. This means that simulation can be treated as an experiment; we can start a simulation with one assumption, examine the outcome, and then ‘run back time’ and rerun the model under a counterfactual case, holding everything else constant, and examine the difference. This affords us the ability to determine what causes what, at least in the simulated world.
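A small sketch may help to show what rerunning a model under a counterfactual can look like in practice. The model here is a hypothetical toy: `run_model`, the single `hazard` parameter, and the values used are assumptions for illustration. The point is the design: both runs use the same random seed and the same pre-drawn random numbers, so the only thing that differs between the two simulated worlds is the assumption being varied.

```python
import random


def run_model(hazard, n_agents=500, n_steps=10, seed=42):
    """Toy model: each agent at risk experiences an event with probability `hazard`.

    Returns the number of events in each step.
    """
    rng = random.Random(seed)
    # Pre-draw every random number so that two runs with the same seed share
    # exactly the same "luck", whatever the value of `hazard`.
    draws = [[rng.random() for _ in range(n_steps)] for _ in range(n_agents)]

    at_risk = [True] * n_agents
    events_per_step = []
    for t in range(n_steps):
        events = 0
        for i in range(n_agents):
            if at_risk[i] and draws[i][t] < hazard:
                at_risk[i] = False  # the event can happen only once per agent
                events += 1
        events_per_step.append(events)
    return events_per_step


# Baseline world and counterfactual world, holding everything else constant.
baseline = run_model(hazard=0.05)
counterfactual = run_model(hazard=0.08)

# Within the simulated world, this difference is attributable to the change
# in `hazard` alone, because nothing else differs between the two runs.
effect = sum(counterfactual) - sum(baseline)
```

Because the random draws are shared, `effect` cannot be negative in this sketch: any agent who experiences the event under the baseline also experiences it, no later, under the higher hazard.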
Not everything is sunshine and rainbows for Agent-Based Simulators, however. The biggest problem surrounds validation. Any simulation created by a modeller is an abstraction from and simplification of reality. A case must be made as to why the model is thought to bear enough relation to its real-world analogue to give us useful insight. In particular, although a simulation based on some behavioural assumptions might generate patterns that match empirical data, this does not guarantee that the assumptions are correct: the model might be ‘right in the wrong way’, as it were, and in fact another mechanism might be driving the process under study (Oreskes, 1994).
A couple of strategies can be adopted to try to mitigate these problems. Firstly, a stringent approach can be taken to matching synthetic to empirical data. One can attempt to match at multiple levels of aggregation, at different times, and in different contexts. Calibrating a model using one set of empirical data while using a different set for validation is also an option. Secondly, confirmation and support can be sought from other methodologies through a process of triangulation. For example, the simulation model might have implications that can be tested through the collection of new quantitative data. Qualitative studies and insights from psychological and cognitive theory might also give confidence that the behavioural strategies suggested are reasonable (e.g. Hills and Todd, 2008).
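The first of these strategies, calibrating on one data set and validating on another, can be sketched as follows. This is a rough illustration under several assumptions: the toy model, the function names, the squared-error distance, and the grid of candidate values are all hypothetical. No real data are used; the two "empirical" series are stand-ins generated from the toy model itself with a known hazard of 0.10, which in a real application would be replaced by observed data.

```python
import random


def simulate_event_shares(hazard, n_agents, n_steps, seed):
    """Toy model: share of all agents experiencing the event in each step."""
    rng = random.Random(seed)
    at_risk = n_agents
    shares = []
    for _ in range(n_steps):
        events = sum(rng.random() < hazard for _ in range(at_risk))
        shares.append(events / n_agents)
        at_risk -= events
    return shares


def distance(simulated, observed):
    """Sum of squared differences between two series (an assumed measure)."""
    return sum((s - o) ** 2 for s, o in zip(simulated, observed))


def calibrate(target, candidates, n_agents, n_steps, seed):
    """Pick the candidate hazard whose simulated series lies closest to the target."""
    return min(
        candidates,
        key=lambda h: distance(simulate_event_shares(h, n_agents, n_steps, seed), target),
    )


N_AGENTS, N_STEPS = 10_000, 10

# Stand-ins for two independent empirical data sets (NOT real data): both are
# generated from the toy model with a known hazard, using different seeds.
calibration_data = simulate_event_shares(0.10, N_AGENTS, N_STEPS, seed=1)
validation_data = simulate_event_shares(0.10, N_AGENTS, N_STEPS, seed=2)

# Step 1: calibrate the free parameter against the first data set only.
candidates = [0.05, 0.10, 0.15, 0.20]
best = calibrate(calibration_data, candidates, N_AGENTS, N_STEPS, seed=3)

# Step 2: check the calibrated model against data it has not seen.
validation_error = distance(
    simulate_event_shares(best, N_AGENTS, N_STEPS, seed=4), validation_data
)
```

A small `validation_error` relative to the errors of the rejected candidates would lend some support to the calibrated model, although, as noted above, a good fit alone does not show that the posited mechanism is the right one.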
We must be humble about the utility of ABS, as the problems of validation remain challenging. However, Agent-Based Simulation has a lot to offer demographers. More than anything, it offers an opportunity to model dynamic causal links between individual behaviour and population outcomes, and to examine the ways in which social interactions mediate between these levels.
The author would appreciate comments, questions or critiques relating to any element of this piece.
Contact Jason by email: firstname.lastname@example.org.
Jason Hilton is a PhD student at the Institute for Complex Systems Simulation and the Division of Social Statistics and Demography, University of Southampton. His research examines how agent-based modelling can be used to develop psychologically viable models of demographic decision-making and interaction, and how such models can be validated and calibrated with empirical data. He holds an MSc in Demography from the University of Southampton and a BA in Politics from the University of York. He has also worked in Market Research Analysis at the Nielsen Company. His interests include music and politics.