**A workshop that offered a selection of shorter snapshot presentations of the exciting interface between data, algorithms and models. Read more about these interesting lectures below!**

**Emulation-driven inference for complex spatial meta-population models**

*Dr T J McKinley, Lecturer in Mathematical Biology, University of Exeter*

Calibration of complex stochastic infectious disease models is challenging. These often have high-dimensional input spaces, with the models exhibiting complex, non-linear dynamics. Coupled with this is a paucity of necessary data, resulting in a large number of hidden states. Likelihood-based approaches to this missing data problem are very flexible, but challenging to code and optimise due to having to monitor and update these hidden states. Methods based on simulating the hidden states directly from the model-of-interest have the advantage that they are often much more straightforward to code, and thus are easier to implement and adapt to changing model structures. However, they often require very large numbers of simulations in order to adequately explore the input space, which can render them infeasible for many large-scale problems. Here we discuss key challenges from ongoing work when applying and adapting emulation-based methods to calibrate a large-scale, stochastic, age-structured, spatial meta-population model of COVID-19 transmission in England and Wales. In particular we discuss problems of seeding, data assimilation and model discrepancy.

**Spatio-temporal prediction of positivity for the COVID-19 test in Uppsala County**

*Vera van Zoest, Researcher at the Department of Information Technology, Division of Systems and Control*

During the COVID-19 pandemic, test positivity rates have been an important measure of the spread of infection. We aimed to predict spatio-temporal trends in test positivity using a full year of information including direct and indirect indicators of transmission. We developed and compared four models for week-ahead predictions of test positivity in 50 service areas in Uppsala Län, Sweden. Our findings indicate that the collection of a wide variety of data can contribute to spatio-temporal predictions of COVID-19 test positivity.

**The use of data on animal holdings and animal transports – a route towards sustainability**

*Uno Wennergren, Professor at the Department of Physics, Chemistry and Biology (IFM), Linköping University*

Data is king, we usually say, but the question is: what is the kingdom? I will show how one can use the same data, possibly combined with other datasets, to tackle different urgent issues. These include the spread of animal diseases, animal welfare, eutrophication, food security and health. I will also add a short note on time-dependent rates in disease modeling.

**Parameter and state inference using marginalized particle Gibbs for vector-borne disease datasets**

*Anna Wigren, PhD Student, Department of Information Technology, Uppsala University*

In this talk I will explain how a stochastic, non-linear state-space model that describes outbreaks of vector-borne diseases can be formulated in a way that allows for aggregating information from different time series. More specifically, we consider three outbreaks where each outbreak shares either location or disease with at least one of the other outbreaks. To solve the inference problem we use a marginalized particle Gibbs sampler designed for inference in multiple state-space models with shared parameters. The choice of inference method imposes some constraints on the model that will also be highlighted.

**Are there different kinds of Bayesian inference?**

Does your model have parameters? Do you care about the interpretation of probability? Do you spend time carefully specifying priors? Are you prepared to go beyond a probability to represent uncertainty about your model? There are different answers to these questions which represent different practices, contexts, and fundamental views on Bayesian inference. The theory behind Bayesian inference is agnostic to interpretations and choices made by modelers, but the results from different applications of the theory are not. I will exemplify this claim with my favorite contrasts: parametric versus predictive inference, subjective versus objective probability and precise versus approximate probability. Adopting a flexible and mindful approach is useful when interacting with Bayesian modelers as well as with non-modelling experts from different backgrounds and contexts. This talk builds on my experience from working with the implementation of quantitative expressions of uncertainty in scientific assessment in food safety.

**Approximate Bayesian computation (ABC) in SimInf**

*Stefan Widgren, Veterinary epidemiologist, PhD at the National Veterinary Institute (SVA)*

The SimInf R package provides an efficient and very flexible framework to conduct data-driven infectious disease spread modeling. The framework integrates infection dynamics in subpopulations as continuous-time Markov chains using the Gillespie stochastic simulation algorithm and incorporates available data such as births, deaths, and movements as scheduled events at predefined time-points. Using C code for the numerical solvers and ‘OpenMP’ (if available) to divide work over multiple processors ensures high performance when simulating a sample outcome. Recently, functionality was added to fit models to time series data using the Approximate Bayesian Computation Sequential Monte Carlo (ABC-SMC) algorithm of Toni and others (2009). I will illustrate how to specify an infectious disease spread model in SimInf and apply the recently implemented ABC functionality.
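For readers unfamiliar with the Gillespie stochastic simulation algorithm mentioned in this abstract, the following is a minimal Python sketch of its direct method for a single-subpopulation SIR model. It is not SimInf's API or implementation (SimInf is an R package with C solvers). The function name `gillespie_sir`, the frequency-dependent infection rate and all parameter values are assumptions chosen purely for illustration. Scheduled events and multiple subpopulations are left out.

```python
import random


def gillespie_sir(s, i, r, beta, gamma, t_end, rng):
    """Simulate one SIR trajectory with the Gillespie direct method.

    Illustrative only: a single well-mixed subpopulation with
    hypothetical rates beta (transmission) and gamma (recovery).
    """
    t = 0.0
    path = [(t, s, i, r)]
    while i > 0:
        n = s + i + r
        infection_rate = beta * s * i / n  # S -> I
        recovery_rate = gamma * i          # I -> R
        total_rate = infection_rate + recovery_rate
        # Waiting time to the next event is exponential with the total rate.
        t += rng.expovariate(total_rate)
        if t > t_end:
            break
        # Pick which event fires, with probability proportional to its rate.
        if rng.random() * total_rate < infection_rate:
            s, i = s - 1, i + 1
        else:
            i, r = i - 1, r + 1
        path.append((t, s, i, r))
    return path


rng = random.Random(1)  # fixed seed so the run is reproducible
path = gillespie_sir(s=99, i=1, r=0, beta=0.3, gamma=0.1, t_end=100.0, rng=rng)
print(path[-1])  # final (time, S, I, R) state of this sample trajectory
```

Each run produces one sample outcome of the continuous-time Markov chain. Simulation-based fitting methods such as ABC compare many such simulated outcomes with observed data.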