Because weirdly, the meaning, the predictive value, of an individual’s positive or negative test is changed in different situations, depending on the background rarity of the event that the test is trying to detect. The rarer the event in your population, the worse your test becomes, even though it is the same test.

This is easier to understand with concrete figures. Let’s say the HIV infection rate among high-risk men in a particular area is 1.5 per cent. We use our excellent blood test on 10,000 of these men, and we can expect 151 positive blood results overall: 150 will be our truly HIV-positive men, who will get true positive blood tests; and one will be the one false positive we could expect from giving roughly 10,000 HIV-negative men a test that is wrong one time in 10,000. So, if you get a positive HIV blood test result, in these circumstances your chances of being truly HIV positive are 150 out of 151. It’s a highly predictive test.

Let’s now use the same test where the background HIV infection rate in the population is about one in 10,000. If we test 10,000 people, we can expect two positive blood results overall: one from the person who really is HIV positive; and the one false positive that we could expect, again, from testing 10,000 HIV-negative men with a test that is wrong one time in 10,000.

Suddenly, when the background rate of an event is rare, even our previously brilliant blood test becomes a bit rubbish. For the two men with a positive HIV blood test result, in this population where only one in 10,000 has HIV, it’s only 50:50 odds on whether they really are HIV positive.
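The two scenarios above can be sketched in a few lines of code. This is my own illustration, not a calculation from the book: it assumes, as the text does, that the test catches every true infection and gives one false positive per 10,000 uninfected people.

```python
# Sketch: the same test, two different populations.
# Assumption (mine, for illustration): the test detects every true
# infection, and falsely flags 1 in 10,000 uninfected people.

def positives(population, prevalence, false_positive_rate=1 / 10_000):
    """Return (true positives, expected false positives)."""
    infected = population * prevalence
    uninfected = population - infected
    return infected, uninfected * false_positive_rate

# High-risk population: 1.5 per cent infected.
tp, fp = positives(10_000, 0.015)
print(tp, fp, tp / (tp + fp))   # 150 true, ~1 false: a positive result is almost certainly real

# Low-risk population: 1 in 10,000 infected.
tp, fp = positives(10_000, 1 / 10_000)
print(tp, fp, tp / (tp + fp))   # 1 true, ~1 false: a positive result is a coin-toss
```

The only thing that changes between the two calls is the prevalence; the test’s error rate is identical, yet the meaning of a positive result collapses from near-certainty to 50:50.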

Let’s think about violence. The best predictive tool for psychiatric violence has a ‘sensitivity’ of 0.75, and a ‘specificity’ of 0.75. It’s tougher to be accurate when predicting an event in humans, with human minds and changing human lives. Let’s say 5 per cent of patients seen by a community mental health team will be involved in a violent event in a year. Using the same maths as we did for the HIV tests, your ‘0.75’ predictive tool would be wrong eighty-six times out of a hundred. For serious violence, occurring at 1 per cent a year, with our best ‘0.75’ tool, you inaccurately finger your potential perpetrator ninety-seven times out of a hundred. Will you preventively detain ninety-seven people to prevent three violent events? And will you apply that rule to alcoholics and assorted nasty antisocial types as well?

For murder, the extremely rare crime in question in this report, for which more action was demanded, occurring at one in 10,000 a year among patients with psychosis, the false positive rate is so high that the best predictive test is entirely useless.
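The violence figures above follow from the standard positive-predictive-value arithmetic, and it is worth seeing it reproduce the numbers in the text. The sensitivity and specificity of 0.75 are from the text; the function itself is my sketch.

```python
# Sketch: how often the '0.75' prediction tool is wrong about the
# people it flags, at different base rates of violence.

def wrong_per_hundred(prevalence, sensitivity=0.75, specificity=0.75):
    true_pos = prevalence * sensitivity            # violent, correctly flagged
    false_pos = (1 - prevalence) * (1 - specificity)  # non-violent, wrongly flagged
    ppv = true_pos / (true_pos + false_pos)        # chance a flagged person is violent
    return round(100 * (1 - ppv))                  # flagged people who are NOT violent

print(wrong_per_hundred(0.05))       # any violence, 5% a year -> 86
print(wrong_per_hundred(0.01))       # serious violence, 1% a year -> 97
print(wrong_per_hundred(1 / 10_000)) # murder, 1 in 10,000 a year -> 100
```

At a base rate of one in 10,000, essentially everyone the tool flags is a false positive, which is why the text calls the best predictive test entirely useless for murder.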

This is not a counsel of despair. There are things that can be done, and you can always try to reduce the number of actual stark cock-ups, although it’s difficult to know what proportion of the ‘one murder a week’ represents a clear failure of a system, since when you look back in history, through the retrospecto-scope, anything that happens will look as if it was inexorably leading up to your one bad event. I’m just giving you the maths on rare events. What you do with it is a matter for you.

Locking you up

In 1999 solicitor Sally Clark was put on trial for murdering her two babies. Most people are aware that there was a statistical error in the prosecution case, but few know the true story, or the phenomenal extent of the statistical ignorance that went on in the case.

At her trial, Professor Sir Roy Meadow, an expert in parents who harm their children, was called to give expert evidence. Meadow famously quoted ‘one in seventy-three million’ as the chance of two children in the same family dying of Sudden Infant Death Syndrome (SIDS).

This was a very problematic piece of evidence for two very distinct reasons: one is easy to understand, the other is an absolute mindbender. Because you have the concentration span to follow the next two pages, you will come out smarter than Professor Sir Roy, the judge in the Sally Clark case, her defence teams, the appeal court judges, and almost all the journalists and legal commentators reporting on the case. We’ll do the easy reason first.

The ecological fallacy

The figure of ‘one in seventy-three million’ itself is iffy, as everyone now accepts. It was calculated as 8,543 × 8,543, as if the chances of two SIDS episodes in this one family were independent of each other. This feels wrong from the outset, and anyone can see why: there might be environmental or genetic factors at play, both of which would be shared by the two babies. But forget how pleased you are with yourself for understanding that fact. Even if we accept that two SIDS in one family is much more likely than one in seventy-three million – say, one in 10,000 – any such figure is still of dubious relevance, as we will now see.

The prosecutor’s fallacy
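The trial calculation can be reproduced in two lines: squaring the single-SIDS rate of one in 8,543, which is exactly the independence assumption that everyone now agrees was wrong.

```python
# The 'one in seventy-three million' figure, as calculated at trial:
# squaring the single-SIDS rate, i.e. treating the two deaths as
# independent events. Reproducing it shows where the number came from,
# not that the assumption was valid.
p_one_sids = 1 / 8_543
p_double_if_independent = p_one_sids ** 2
print(round(1 / p_double_if_independent))  # one in ~73 million
```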

The real question in this case is: what do we do with this spurious number? Many press reports at the time stated that one in seventy-three million was the likelihood that the deaths of Sally Clark’s two children were accidental: that is, the likelihood that she was innocent. Many in the court process seemed to share this view, and the factoid certainly sticks in the mind. But this is an example of a well-known and well-documented piece of flawed reasoning known as ‘the prosecutor’s fallacy’.

Two babies in one family have died. This in itself is very rare. Once this rare event has occurred, the jury needs to weigh up two competing explanations for the babies’ deaths: double SIDS or double murder. Under normal circumstances – before any babies have died – double SIDS is very unlikely, and so is double murder. But now that the rare event of two babies dying in one family has occurred, the two explanations – double murder or double SIDS – are suddenly both very likely. If we really wanted to play statistics, we would need to know which is relatively more rare, double SIDS or double murder. People have tried to calculate the relative risks of these two events, and one paper says it comes out at around 2:1 in favour of double SIDS.
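The jury’s real question reduces to a ratio. Once the two deaths have happened, and each explanation accounts fully for them, the posterior odds of double SIDS versus double murder are simply the ratio of their prior probabilities. The rates below are hypothetical placeholders I have chosen only to reproduce the roughly 2:1 ratio the text mentions; they are not the published figures.

```python
# Sketch of the prosecutor's-fallacy correction. The absolute rarity of
# double SIDS is beside the point; what matters is its rarity RELATIVE
# to double murder. Rates below are hypothetical, for illustration only.
p_double_sids = 2 / 1_000_000    # hypothetical prior
p_double_murder = 1 / 1_000_000  # hypothetical prior

odds = p_double_sids / p_double_murder
print(f"{odds:.0f}:1 in favour of double SIDS")
```

Note that both priors are tiny, yet neither tiny number matters on its own: only their ratio survives into the verdict-relevant comparison.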

Not only was this crucial nuance of the prosecutor’s fallacy missed at the time – by everyone in the court – it was also clearly missed in the appeal, at which the judges suggested that instead of ‘one in seventy-three million’, Meadow should have said ‘very rare’. They recognised the flaws in its calculation (the ecological fallacy, the easy problem above), but they still accepted his number as establishing ‘a very broad point, namely the rarity of double SIDS’.

That, as you now understand, was entirely wrongheaded: the rarity of double SIDS is irrelevant, because double murder is rare too. An entire court process failed to spot the nuance of how the figure should be used. Twice.

Meadow was foolish, and has been vilified (some might say this process was exacerbated by the witch-hunt against paediatricians who work on child abuse), but if it is true that he should have spotted and anticipated the problems in the interpretation of his number, then so should the rest of the people involved in the case: a paediatrician has no more unique responsibility to be numerate than a lawyer, a judge, journalist, jury member or clerk. The prosecutor’s fallacy is also highly relevant in DNA evidence, for example, where interpretation frequently turns on complex mathematical and contextual issues. Anyone who is going to trade in numbers, and use them, and think with them, and persuade with them, let alone lock people up with them, also has a responsibility to understand them. All you’ve done is read a popular science book on them, and already you can see it’s hardly rocket science.

Losing the lottery

You know, the most amazing thing happened to me tonight. I was coming here, on the way to the lecture, and I came in through the parking lot. And you won’t believe what happened. I saw a car with the license plate ARW 357. Can you imagine? Of all the millions of license plates in the state, what was the chance that I would see that particular one tonight? Amazing …

Richard Feynman

It is possible to be very unlucky indeed. A nurse called Lucia de Berk has been in prison for six years in Holland, convicted of seven counts of murder and three of attempted murder. An unusually large number of people died when she was on shift, and that, essentially, along with some very weak circumstantial evidence, is the substance of the case against her. She has never confessed, she has continued to protest her innocence, and her trial has generated a small collection of theoretical papers in the statistics literature.

The judgement was largely based on a figure of ‘one in 342 million against’. Even if we found errors in this figure – and believe me, we will – as in our previous story, the figure itself would still be largely irrelevant. Because, as we have already seen repeatedly, the interesting thing about statistics is not the tricky maths, but what the numbers mean.

There is also an important lesson here from which we could all benefit: unlikely things do happen. Somebody wins the lottery every week; children are struck by lightning. It’s only weird and startling when something very, very specific and unlikely happens if you have specifically predicted it beforehand.

Here is an analogy.

Imagine I am standing near a large wooden barn with an enormous machine gun. I place a blindfold over my eyes and – laughing maniacally – I fire off many thousands and thousands of bullets into the side of the barn. I then drop the gun, walk over to the wall, examine it closely for some time, all over, pacing up and down. I find one spot
