At the end of last week, you may have seen some rather scary news: the R value, the average number of people infected by each person who has Covid-19, had gone back up.
Professor John Edmunds of the London School of Hygiene and Tropical Medicine appeared before the Science and Technology Committee of the House of Commons and said as much. The R value was now between 0.6 and 1, he thought, but "if you'd asked me two weeks ago I'd have given lower numbers, about 0.6 or 0.7, maybe up to 0.8." It got quite a lot of attention.
But I want to suggest (as in fact Edmunds himself did) that this isn't the bad news people seem to think; in fact it's a product of our success, rather than of our failure. Here's why.
There's an interesting statistical anomaly called Simpson's paradox: you can find a trend going one way in each of several individual datasets, but when you combine those datasets, the trend can appear to go the other way.
That sounds quite dry, so let me use a famous example. In the autumn of 1973, 8,442 men and 4,321 women applied to graduate school at UC Berkeley. Of those, 44% of the men were admitted, compared with just 35% of the women.
But this is not the clear-cut example of sex bias that it seems. In most of the departments of the university, female applicants were more likely than male applicants to be admitted. In the most popular, 82% of women were admitted compared to just 62% of men; in the second most popular, 68% of women compared to 65% of men. Overall, there was a "small but statistically significant bias in favour of women".
What's going on? Well: men and women were applying to different departments. For instance, of the 933 applicants to the most popular department (let's call it A), just 108 were women; but of the 714 applicants to the sixth most popular department (call it B), 341 were women.
Let's take a look at just those two departments. Women were more likely to be admitted to both. But, crucially, the two departments had hugely different rates of acceptance: in Department A, 82% of female applicants and 62% of male ones were accepted; in Department B, 7% and 6%.
So of the 108 women applying to Department A, 89 of them were admitted (82%), while of the 825 men applying, 511 got in (62%). Meanwhile, of the 341 women who applied to Department B, just 24 (7%) were admitted; for men, it was 22 out of 373 (6%).
You see what's going on here? In both departments, women were more likely to be accepted. But added together, it's a different story. Of the 449 women who applied to the two departments, just 113 were accepted: 25%. Whereas of the 1,198 men who applied to the two, 533 got in: 44%.
So, to repeat: even though a woman applying to either department had a higher chance of being admitted than a man applying to the same one, a smaller proportion of women were admitted overall, because women tended to apply to the more competitive departments.
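If you want to check that arithmetic for yourself, here is a minimal Python sketch using exactly the figures above; the code structure, and nothing else, is my own.

```python
# Simpson's paradox with the Berkeley figures above: women have the
# higher admission rate within each department, but the lower rate
# once the two departments are pooled together.

# (applicants, admitted) for each department and sex
data = {
    "A": {"women": (108, 89), "men": (825, 511)},
    "B": {"women": (341, 24), "men": (373, 22)},
}

# Within each department, women come out ahead.
for dept, groups in data.items():
    for sex, (applied, admitted) in groups.items():
        print(f"Dept {dept}, {sex}: {admitted / applied:.0%}")

# Pool the two departments and the trend reverses.
for sex in ("women", "men"):
    applied = sum(data[d][sex][0] for d in data)
    admitted = sum(data[d][sex][1] for d in data)
    print(f"Pooled, {sex}: {admitted} of {applied} = {admitted / applied:.0%}")
```

Run it and you get 82% and 7% for women against 62% and 6% for men within the departments, but 25% against 44% once the departments are pooled.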
This doesn't mean that we've once-and-forever ruled out any form of sex bias (it might be, for instance, that there is a lack of investment in the popular but female-dominated classes), but it does show that the simple, top-line, aggregated numbers can mislead. When divided up into smaller groups, they can tell a very different story.
Simpson's paradox is a specific case of a wider class of problem known as the "ecological fallacy": the mistake of drawing conclusions about individuals by looking at group data. A topical example: local authorities with above-average numbers of over-65s actually have a lower rate of death from Covid-19 than those with below-average numbers. But we know that older people are individually at greater risk. What's going on seems to be that younger areas tend also to be denser, poorer, and more ethnically diverse, all of which drive risk up.
The apparent rise of the R value seems to be something like that. According to Edmunds, the rise is not because lockdown isn't working ("it's not that people are going about and mixing more"), but because it is working. There are, he says, separate epidemics in the community and in care homes and hospitals.
"We had a very wide-scale community epidemic," he told the committee, "and when we measured the R it was primarily the community epidemic. But that's been brought down: the lockdown has worked, breaking chains of transmission in the community … now if you measure the R it's being dominated by care homes and hospitals."
Let's imagine that we had two epidemics of equal size, one in the community and one in care homes. Say 1,000 people are infected in each, and in the community each infected person on average infects two people, while in the care homes each on average infects three. The overall R, averaged across all 2,000 infected people, is 2.5.
But now imagine you lock down and reduce both the R and the number of people infected, but by more in the community than in the care homes. Say that now there are 100 people infected in the community, and they each pass it on to an average of one person; and there are 900 people infected in the care homes, and they pass it on to an average of 2.8 people.
Now your average R is 2.62: 100 people passing it on to one person each, plus 900 passing it on to 2.8 people each, averaged over the 1,000 infected. It's gone up! But, just as with the Berkeley graduate students above, when you divide the data into its constituent parts, it has actually gone down in each category.
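To make the averaging explicit, here is a minimal Python sketch of the same toy numbers; the helper overall_r is mine, purely for illustration, and simply weights each setting's R by how many infected people are in it.

```python
def overall_r(groups):
    """Average R across settings, weighted by the number infected in each.

    groups: list of (number infected, average R in that setting) pairs.
    """
    total_infected = sum(n for n, _ in groups)
    return sum(n * r for n, r in groups) / total_infected

# (infected, R) for the community and the care homes,
# before and after lockdown, using the toy numbers above
before = [(1000, 2.0), (1000, 3.0)]
after = [(100, 1.0), (900, 2.8)]

print(overall_r(before))  # 2.5
print(overall_r(after))   # 2.62 -- up, even though both settings fell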
I don't know the numbers, but according to Edmunds something like this has gone on in the real world. The collapse in the number of people with and passing on the disease in the community means that now the epidemic in care homes is a much greater share of the average. And that means, even though the R in care homes hasn't gone up, the average R in total has, because the average in care homes was higher to start with.
To be clear: this doesn't mean that everything is fine, or that we've won, or anything like that. "The thing that worries me is that it might be the overall R that matters," says Kevin McConway, an emeritus professor of statistics at the Open University, who helped me understand these numbers. It's not that the epidemics in the community and in care homes and hospitals are truly separate, islands cut off from each other; they're interlinked, so if the disease spreads in care homes it can reinfect those of us outside them.
Edmunds said as much to the Science and Technology Committee: "Strictly speaking you have one R: there's one epidemic and linked sub-epidemics; the epidemic in hospitals is not completely separate from the one in the community." But to understand how it works, you need to look in this more granular fashion: the overall R is not much use on its own.
And while it doesn't mean that we've won, it certainly shouldn't be taken to mean that the British population has been lax in its approach to lockdown. Compliance has been very high, much higher than modellers anticipated.
But it does show that simple numbers can hide more complex stories. They feed, for instance, into modelling. One simple model is the SIR (susceptible, infected, recovered) model, where you assume everyone just interacts at random, mixing uniformly like molecules in a gas; but if the epidemics in care homes and the community behave very differently, then those models might give out very misleading numbers.
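For the curious, a uniform-mixing SIR model really is only a few lines of arithmetic. Here is a minimal sketch; the parameter values are purely illustrative, not fitted to Covid-19, and, tellingly, there is no separate care-home compartment anywhere in it.

```python
# Textbook SIR model with uniform random mixing -- the simplification
# described above -- stepped forward with simple Euler integration.
beta = 0.3    # transmission rate per day (illustrative)
gamma = 0.1   # recovery rate per day, so R0 = beta / gamma = 3
dt = 0.1      # time step, in days

# Fractions of the population: susceptible, infected, recovered
s, i, r = 0.999, 0.001, 0.0

for _ in range(int(100 / dt)):          # simulate 100 days
    new_infections = beta * s * i * dt  # uniform mixing: everyone meets everyone
    new_recoveries = gamma * i * dt
    s -= new_infections
    i += new_infections - new_recoveries
    r += new_recoveries

print(f"After 100 days: S={s:.3f}, I={i:.3f}, R={r:.3f}")
```

If care homes and the community actually behave very differently, a single well-mixed model like this one will average over that difference, which is precisely how it can mislead.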
That's why models such as the Imperial one try to simulate human behaviour to some degree; the extent to which it got that right is far from clear, but it was at least trying. Some simpler models that went around the internet did not. McConway, a statistician not a modeller, is profoundly wary of those: "I know enough [about modelling] to say I wouldn't touch it because I'm not an expert; I see people getting it wrong in ways that I can recognise, whereas I'd get it wrong in ways I don't recognise." These subtle misunderstandings can drive major errors.
We've seen examples of this throughout the crisis. Early on, people (Donald Trump, notably) paid an awful lot of attention to the case numbers; but the case numbers didn't really tell us how many people had the disease, just how many people had been tested. And people have tried to place countries in a league table of death rates, but that's not particularly informative either (although that's not to say comparisons are entirely useless). Trying to boil this messy, complicated situation down to single numbers and declaring them good or bad is rarely a good idea.
The really key thing, which I keep coming back to, is just how much uncertainty there is. "Modelling is bloody hard," says McConway. "Prediction is bloody hard. The map is not the territory. We'll know what's happened when it's happened." Even something as apparently simple as the R value has to be treated with immense caution.