### Applied evolutionary biology: tracking and predicting the spread of disease

Richard Neher

Biozentrum, University of Basel

slides at neherlab.org/201712_ICTP2.html

### Sequences record the spread of pathogens

##### The resolution is limited by the number of mutations!

images by Trevor Bedford
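
The resolution limit can be made concrete with a little arithmetic. The sketch below estimates how often a lineage picks up a new mutation; the substitution rate and segment length are illustrative assumptions, not values taken from the slides.

```python
def expected_mutations(rate_per_site_per_year, genome_length, years):
    """Expected number of substitutions accumulated along one lineage."""
    return rate_per_site_per_year * genome_length * years


def days_per_mutation(rate_per_site_per_year, genome_length):
    """Average waiting time between substitutions, in days."""
    return 365.0 / (rate_per_site_per_year * genome_length)


# Illustrative values only (assumptions for this sketch)
RATE = 4e-3    # substitutions per site per year
LENGTH = 1700  # nucleotides in the sequenced segment

per_year = expected_mutations(RATE, LENGTH, 1.0)
wait = days_per_mutation(RATE, LENGTH)
print(f"{per_year:.1f} mutations/year, one every {wait:.0f} days")
```

Under these assumptions, two samples taken less than a couple of months apart will often be identical in sequence, so transmission events on shorter time scales cannot be resolved from this segment alone.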
### Human seasonal influenza viruses

slide by Trevor Bedford
### Influenza seasonality - USA

- Influenza viruses evolve to avoid human immunity
- Vaccines need frequent updates

## Beyond tracking: can we predict?

slide by Trevor Bedford
### Clonal interference and traveling waves

RN, Annual Reviews, 2013; Desai & Fisher; Brunet & Derrida; Kessler & Levine

- Epitope mutations: association with antigenic change
- Non-epitope mutations: likely deleterious
- Nonlinear component: synonymous mutations

$$W = \frac{x_i(t+1)}{x_i(t)} = e^{f_0 + \alpha f_{ep} + \beta f_{ne} + \gamma f_{nl}}$$
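
A minimal sketch of how the growth factor $W$ could be used to project clade frequencies one step forward. The weights and per-clade scores below are hypothetical; the renormalisation step absorbs any offset $f_0$ shared by all clades, since a common factor cancels.

```python
import math


def growth_factor(f0, f_ep, f_ne, f_nl, alpha, beta, gamma):
    """W = exp(f0 + alpha*f_ep + beta*f_ne + gamma*f_nl) for one clade."""
    return math.exp(f0 + alpha * f_ep + beta * f_ne + gamma * f_nl)


def project_frequencies(freqs, growth):
    """Multiply each frequency x_i(t) by W_i and renormalise to sum to 1."""
    raw = [x * w for x, w in zip(freqs, growth)]
    total = sum(raw)
    return [r / total for r in raw]


# Hypothetical weights and clade scores, for illustration only
ALPHA, BETA, GAMMA = 1.0, -1.0, 0.0
clades = {
    "A": dict(f_ep=1.0, f_ne=0.0, f_nl=0.0),  # one extra epitope change
    "B": dict(f_ep=0.0, f_ne=0.0, f_nl=0.0),
}
freqs = [0.5, 0.5]
growth = [
    growth_factor(0.0, c["f_ep"], c["f_ne"], c["f_nl"], ALPHA, BETA, GAMMA)
    for c in clades.values()
]
projected = project_frequencies(freqs, growth)
print(dict(zip(clades, projected)))
```

With these made-up numbers clade A grows from 50% to about 73% in one step, because its growth factor is $e$ times that of clade B.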

#### Typical tree

#### Bolthausen-Sznitman Coalescent

RN, Hallatschek, PNAS, 2013; see also Brunet and Derrida, PRE, 2007
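
To illustrate what distinguishes the Bolthausen-Sznitman coalescent from the Kingman coalescent, the sketch below evaluates its merger rates exactly. With $b$ lineages, any particular set of $k$ lineages merges at rate $\lambda_{b,k} = (k-2)!\,(b-k)!/(b-1)!$, so multiple mergers ($k > 2$) occur at a substantial rate. The function names are chosen for this example.

```python
from fractions import Fraction
from math import comb, factorial


def lambda_bk(b, k):
    """Rate at which one particular set of k out of b lineages merges."""
    return Fraction(factorial(k - 2) * factorial(b - k), factorial(b - 1))


def k_merger_rate(b, k):
    """Total rate of any k-merger among b lineages; equals b / (k (k - 1))."""
    return comb(b, k) * lambda_bk(b, k)


def total_rate(b):
    """Total rate of any merger among b lineages; equals b - 1."""
    return sum(k_merger_rate(b, k) for k in range(2, b + 1))


b = 10
binary_fraction = k_merger_rate(b, 2) / total_rate(b)
print(f"total rate: {total_rate(b)}, fraction of binary mergers: {binary_fraction}")
```

For ten lineages only 5/9 of merger events are binary, whereas in the Kingman coalescent every merger is binary. This is one way to see why trees from rapidly adapting populations look so different from neutral ones.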
### Predicting evolution

### Given the branching pattern:

- can we predict fitness?
- pick the closest relative of the future?

RN, Russell, Shraiman, eLife, 2014
### Fitness inference from trees

$$P(\mathbf{x}|T) = \frac{1}{Z(T)} p_0(x_0) \prod_{i=0}^{n_{int}} g(x_{i_1}, t_{i_1}| x_i, t_i)g(x_{i_2}, t_{i_2}| x_i, t_i)$$

RN, Russell, Shraiman, eLife, 2014
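
The product structure of $P(\mathbf{x}|T)$ can be sketched in a few lines: one prior term for the root and one propagator term per parent-child branch. The Gaussian propagator used here is only a placeholder assumption, not the propagator $g$ derived in the paper, and the normalisation $Z(T)$ is left out.

```python
import math


def log_gauss_propagator(x_child, t_child, x_parent, t_parent, diffusion=1.0):
    """Placeholder g: child fitness diffuses around the parent's value.

    Stand-in for illustration only; assumes a branch of non-zero length.
    """
    var = 2.0 * diffusion * abs(t_child - t_parent)
    return -0.5 * math.log(2 * math.pi * var) - (x_child - x_parent) ** 2 / (2 * var)


def log_standard_normal(x0):
    """Hypothetical prior p_0 on the fitness of the root."""
    return -0.5 * math.log(2 * math.pi) - 0.5 * x0 ** 2


def log_unnormalised_posterior(children, x, t, root, log_p0, log_g):
    """log[p0(x_root) * prod_i g(child_1 | i) g(child_2 | i)] = log P + log Z."""
    total = log_p0(x[root])
    for parent, kids in children.items():  # loop over internal nodes i
        for child in kids:                 # the two children i_1 and i_2
            total += log_g(x[child], t[child], x[parent], t[parent])
    return total


# Smallest possible tree: a root with two tips
children = {"root": ("a", "b")}
t = {"root": 0.0, "a": 1.0, "b": 1.0}
x = {"root": 0.0, "a": 0.5, "b": -0.5}
score = log_unnormalised_posterior(
    children, x, t, "root", log_standard_normal, log_gauss_propagator
)
print(score)
```

In practice one would not score a single assignment like this but marginalise over the fitness of all nodes; the sketch only shows how the tree topology enters the expression.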
### Validation on simulated data

RN, Russell, Shraiman, eLife, 2014
### Prediction of the dominating H3N2 influenza strain

- no influenza-specific input
- how can the model be improved? (see model by Luksza & Laessig)
- in what other contexts might this apply?

RN, Russell, Shraiman, eLife, 2014
### Hemagglutination Inhibition assays

Slide by Trevor Bedford
### HI data sets

- Long list of distances between sera and viruses
- Tables are sparse: only close-by pairs are measured
- Structure of space is not immediately clear
- MDS in 2 or 3 dimensions

Smith et al, Science 2002

Slide by Trevor Bedford
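
A minimal sketch of how raw HI titers might be turned into the "distances" listed above. It assumes one common convention, the log2 ratio of the homologous titer (serum against its own reference virus) to the titer against the test virus; the titer values and names are made up.

```python
import math


def log2_titer_distance(homologous_titer, test_titer):
    """Number of two-fold dilutions by which the test virus titer drops."""
    return math.log2(homologous_titer / test_titer)


# Hypothetical sparse table: (test virus, serum) -> HI titer
titers = {
    ("A", "serum_A"): 1280,
    ("B", "serum_A"): 320,
    ("C", "serum_A"): 160,
    ("B", "serum_B"): 640,
    ("A", "serum_B"): 80,
}
# Reference virus each serum was raised against (assumed known)
homologous = {"serum_A": "A", "serum_B": "B"}

distances = {
    (virus, serum): log2_titer_distance(titers[(homologous[serum], serum)], titer)
    for (virus, serum), titer in titers.items()
    if virus != homologous[serum]
}
print(distances)
```

Note that virus C was never measured against `serum_B`, which is the kind of gap that makes the table sparse, and that the A-B distance differs depending on which serum is used.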
### Integrating antigenic and molecular evolution

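- $H_{a\beta} = v_a + p_\beta + \sum_{i\in (a,b)} d_i$, with virus avidity $v_a$, serum potency $p_\beta$, and a sum over the branches $i$ on the tree path between test virus $a$ and the reference virus $b$ of serum $\beta$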
- each branch contributes $d_i$ to antigenic distance
- sparse solution for $d_i$ through $l_1$ regularization
- related model where $d_i$ are associated with substitutions

RN et al, PNAS, 2016
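
The tree model can be sketched as a sum of branch contributions along the path between two tips. The code below assumes that $b$ denotes the reference virus against which serum $\beta$ was raised; the tree, branch effects, avidity and potency values are all hypothetical.

```python
def path_branches(parent, a, b):
    """Branches on the tree path between nodes a and b.

    Each branch is named by the node at its lower (child) end.
    """
    def ancestors(node):
        chain = [node]
        while node in parent:
            node = parent[node]
            chain.append(node)
        return chain

    up_a, up_b = ancestors(a), ancestors(b)
    common = set(up_a) & set(up_b)
    branches = []
    for chain in (up_a, up_b):
        for node in chain:
            if node in common:  # reached the most recent common ancestor
                break
            branches.append(node)
    return branches


def predicted_titer_distance(parent, d, avidity, potency, virus, serum, ref_virus):
    """H = v_a + p_beta + sum of d_i over the path from virus to ref_virus."""
    path = path_branches(parent, virus, ref_virus)
    return avidity[virus] + potency[serum] + sum(d.get(br, 0.0) for br in path)


# Hypothetical tree: root -> n1 -> {A, B}, root -> C
parent = {"A": "n1", "B": "n1", "n1": "root", "C": "root"}
d = {"n1": 2.0, "B": 0.5}  # sparse: most branches contribute nothing
avidity = {"A": 0.25, "B": 0.0, "C": 0.0}
potency = {"serum_C": 0.5}

H = predicted_titer_distance(parent, d, avidity, potency, "A", "serum_C", "C")
print(path_branches(parent, "A", "C"), H)
```

Because most $d_i$ are zero, a prediction for a virus that was never tested only requires knowing where it sits on the tree.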
### Integrating antigenic and molecular evolution

- MDS: $(d+1)$ parameters per virus
- Tree model: $2$ parameters per virus
- Sparse solution

→ identify branches or substitutions that cause titer drop

RN et al, PNAS, 2016
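
To show why $l_1$ regularization yields a sparse set of branch effects, here is a generic lasso solved by iterative soft-thresholding (ISTA). This is a textbook method swapped in for illustration, not the solver used in the paper. The matrix `A` is assumed to have one row per measurement and one column per branch, and the step size must not exceed the inverse of the largest eigenvalue of $A^T A$.

```python
def soft_threshold(z, thresh):
    """Shrink z towards zero by thresh; small values become exactly zero."""
    if z > thresh:
        return z - thresh
    if z < -thresh:
        return z + thresh
    return 0.0


def ista(A, h, lam, step, n_iter=200):
    """Minimise 0.5 * |A d - h|^2 + lam * |d|_1 by proximal gradient steps."""
    n_rows, n_cols = len(A), len(A[0])
    d = [0.0] * n_cols
    for _ in range(n_iter):
        resid = [
            sum(A[r][c] * d[c] for c in range(n_cols)) - h[r]
            for r in range(n_rows)
        ]
        grad = [sum(A[r][c] * resid[r] for r in range(n_rows)) for c in range(n_cols)]
        d = [
            soft_threshold(d[c] - step * grad[c], step * lam)
            for c in range(n_cols)
        ]
    return d


# Toy problem: each "measurement" sees exactly one branch
A = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
h = [3.0, 0.5, -2.0]
d_hat = ista(A, h, lam=1.0, step=1.0)
print(d_hat)
```

In the toy problem the weak signal on the second branch is set exactly to zero while the two strong ones are kept (shrunk by the penalty), which is the behaviour that lets the model single out the few branches responsible for titer drops.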
### Are antigenic distances tree-like?

### Rate of antigenic evolution

- Cumulative antigenic evolution since the root: $\sum_i d_i$
- A/H3N2 evolves faster antigenically
- A/H3N2 has a more rapid population turn-over
- Proportion of children is higher among B than among A/H3N2 infections

### How many sites are involved?

| Mutation | Effect |
|---|---|
| K158N/N189K | 3.64 |
| K158R | 2.31 |
| K189N | 2.18 |
| S157L | 1.29 |
| V186G | 1.25 |
| S193F | 1.2 |
| K140I | 1.1 |
| F159Y | 1.08 |
| K144D | 1.08 |
| K145N | 0.91 |
| S159Y | 0.89 |
| I25V | 0.88 |
| Q1L | 0.85 |
| K145S | 0.85 |
| K144N | 0.85 |
| N145S | 0.85 |
| N8D | 0.73 |
| T212S | 0.69 |
| N188D | 0.65 |

### Predicting successful influenza clades

### HI distances on the phylogenetic tree

## NextStrain architecture

#### Using treetime to rapidly compute timetrees
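
The simplest ingredient of a time tree is an estimate of the molecular clock rate. The sketch below fits a strict clock by regressing root-to-tip divergence on sampling date. It is a plain least-squares illustration with made-up data, not TreeTime's API, and TreeTime itself does considerably more.

```python
def fit_clock(dates, divergences):
    """Least-squares fit of divergence = rate * (date - root_date).

    Returns (rate, root_date); assumes at least two distinct dates
    and a non-zero fitted rate.
    """
    n = len(dates)
    mean_t = sum(dates) / n
    mean_d = sum(divergences) / n
    cov = sum((t - mean_t) * (d - mean_d) for t, d in zip(dates, divergences))
    var = sum((t - mean_t) ** 2 for t in dates)
    rate = cov / var
    intercept = mean_d - rate * mean_t
    return rate, -intercept / rate


# Hypothetical tips: sampling date (years) and root-to-tip distance (subs/site)
dates = [2000.0, 2001.0, 2002.0]
divergences = [0.010, 0.013, 0.016]

rate, root_date = fit_clock(dates, divergences)
print(f"rate = {rate:.4f} subs/site/year, root date = {root_date:.1f}")
```

The fitted slope is the substitution rate and the date at which the line crosses zero divergence is an estimate of when the root of the tree existed.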

## Summary

- Evolutionary biology can help track and fight disease
- Theory shows how to infer fit clades
- Future influenza population can be anticipated
- Automated real-time analysis can make up-to-date results available to everybody

### Influenza and Theory acknowledgments

- Boris Shraiman
- Colin Russell
- Trevor Bedford
- Oskar Hallatschek

### nextstrain.org

- Trevor Bedford
- Colin Megill
- Pavel Sagulenko
- Sidney Bell
- James Hadfield
- Wei Ding