Courts’ use of statistics should be put on trial
The Rev. Thomas Bayes was, as the honorific the Rev. suggests, a clergyman. Too bad he wasn’t a lawyer. Maybe if he had been, lawyers today wouldn’t be so reluctant to enlist his mathematical insights in the pursuit of justice.
In many sorts of court cases, from whether talcum powder causes ovarian cancer to The People v. O.J. Simpson, statistics play (or ought to play) a vital role in evaluating the evidence. Sometimes the evidence itself is statistical, as with the odds of a DNA match or the strength of a scientific research finding. Even more often the key question is how evidence should be added up to assess the probability of guilt. In either circumstance, the statistical methods devised by Bayes are often the only reasonable way of drawing an intelligent conclusion.
Yet the courts today seem suspicious of statistics of any sort, and not without reason. In several famous cases, flawed statistical reasoning has sent innocent people to prison. But in most such instances the statistics applied in court have been primarily the standard type that scientists use to test hypotheses (producing numbers for gauging “statistical significance”). These are the same approaches that have been so widely criticized for rendering many scientific results irreproducible. Many experts believe Bayesian statistics, the legacy of a paper by Bayes published posthumously in 1763, offers a better option.
“The Bayesian approach is especially well suited for a broad range of legal reasoning,” write mathematician Norman Fenton and colleagues in a recent paper in the Annual Review of Statistics and Its Application.
But Bayes has for the most part been neglected by the legal system. “Outside of paternity cases its impact on legal practice has been minimal,” say Fenton, Martin Neil and Daniel Berger, all of the School of Electronic Engineering and Computer Science at Queen Mary University of London.
That’s unfortunate, they contend, because non-Bayesian statistical methods have severe shortcomings when applied in legal contexts. Most famously, the standard approach is typically misinterpreted in a way known as the “prosecutor’s fallacy.”
In formal logical terms, the prosecutor’s fallacy is known as “the error of the transposed conditional,” as British pharmacologist David Colquhoun explains in a recent blog post. Consider a murder on a hypothetical island, populated by 1,000 people. Police find a DNA fragment at the crime scene, a fragment that would be found in only 0.4 percent of the population. For no particular reason, the police arrest Jack and give him a DNA test. Jack’s DNA matches the crime scene fragment, so he is charged and sent to trial. The prosecutor proclaims that since only 0.4 percent of innocent people have this DNA fragment, it is 99.6 percent certain that Jack is the killer — evidence beyond reasonable doubt.
But that reasoning is fatally (for Jack) flawed. Unless there was some good reason to suspect Jack in the first place, he is just one of 1,000 possible suspects. Among those 1,000, four people (0.4 percent) should have the same DNA fragment found at the crime scene. Jack is therefore just one of four possible culprits, so the probability that he’s the killer is merely 25 percent, not 99.6 percent.
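For readers who want to check the arithmetic, here is a minimal Python sketch of the island example. It uses only the two numbers given above (a population of 1,000 and a 0.4 percent match rate); everything else follows from them.

```python
# Island example: why a 0.4 percent match rate does not
# translate into 99.6 percent certainty of guilt.

population = 1_000   # possible suspects on the island
match_rate = 0.004   # fraction of people carrying the DNA fragment

# Expected number of islanders whose DNA matches the crime-scene fragment
expected_matches = population * match_rate   # 4 people

# Absent any other evidence, Jack is just one of those matches,
# each equally likely to be the killer.
p_jack_guilty = 1 / expected_matches         # 0.25, not 0.996

print(f"Expected matches: {expected_matches:.0f}")
print(f"P(Jack is the killer | DNA match): {p_jack_guilty:.2f}")
```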
Bayesian reasoning averts this potential miscarriage of justice by including the “prior probability” of guilt when calculating the probability of guilt after the evidence is in.
Suppose, for instance, that the crime in question is not murder, but theft of cupcakes from a bakery employing 100 people. Security cameras reveal 10 employees sneaking off with the cupcakes but without a good view of their identities. So the prior probability of any given employee’s guilt is 10 percent. Police sent to investigate choose an employee at random and conduct a frosting residue test known to be accurate 90 percent of the time. If the employee tests positive, the police might conclude there is therefore a 90 percent probability of guilt. But that’s another example of the prosecutor’s fallacy — it neglects the prior probability. Well-trained Bayesian police would use the formula known as Bayes’ theorem to calculate that given a 10 percent prior probability, 90 percent reliable evidence yields an actual probability of guilt of only 50 percent.
You don’t even need to know Bayes’ formula to reason out that result. If the test is 90 percent accurate, it will erroneously flag nine of the 90 innocent employees as guilty, and it will correctly identify only nine of the 10 truly guilty employees. If the police tested all 100 people, then, 18 would appear guilty, but nine of those 18 (half of them) would actually be innocent. So a positive frosting test means only a 50 percent chance of guilt. Bayesian math would in this case (and in many real-life cases) prevent a rush to injustice.
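Both routes to that 50 percent figure, Bayes’ theorem and the simple head count, can be checked in a few lines of Python. All the numbers below come from the cupcake scenario; nothing is added.

```python
# Cupcake example: Bayes' theorem versus head-counting.

prior = 0.10      # 10 of 100 employees were seen taking cupcakes
accuracy = 0.90   # the frosting test is right 90 percent of the time

# Bayes' theorem:
# P(guilty | positive) = P(positive | guilty) * P(guilty) / P(positive)
p_pos_given_guilty = accuracy         # true-positive rate
p_pos_given_innocent = 1 - accuracy   # false-positive rate
p_positive = (p_pos_given_guilty * prior
              + p_pos_given_innocent * (1 - prior))
posterior = p_pos_given_guilty * prior / p_positive
print(f"Bayes' theorem: P(guilty | positive) = {posterior:.2f}")  # 0.50

# Head count over all 100 employees gives the same answer:
true_positives = accuracy * 10          # 9 guilty employees test positive
false_positives = (1 - accuracy) * 90   # 9 innocent employees test positive
print(f"Head count:     {true_positives / (true_positives + false_positives):.2f}")
```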
“Unfortunately, people without statistical training — and this includes most highly respected legal professionals — find Bayes’ theorem both difficult to understand and counterintuitive,” Fenton and colleagues lament.
One major problem is that real criminal cases are rarely as simple as the cupcake example. “Practical legal arguments normally involve multiple hypotheses and pieces of evidence with complex causal dependencies,” Fenton and colleagues note. Adapting Bayes’ formula to complex situations is not always straightforward. Combining testimony and various other sorts of evidence requires mapping out a network of interrelated probabilities; the math quickly can become much too complicated for pencil and paper — and, until relatively recently, even for computers.
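To get a feel for what such a network involves, here is a toy Python sketch with one hypothesis and just two pieces of evidence, assumed conditionally independent given guilt. Every probability in it is invented for illustration; none comes from Fenton and colleagues’ paper. Brute-force enumeration is fine at this scale, but the work grows exponentially as hidden variables multiply, which is exactly why efficient algorithms mattered.

```python
# Toy two-evidence Bayesian calculation (all numbers are made up).
# H = "defendant is guilty"; E1 = DNA match; E2 = eyewitness identification.
# E1 and E2 are assumed conditionally independent given H.

p_guilty = 0.10                          # prior probability of guilt
p_evidence = {                           # P(evidence | H)
    "E1": {True: 0.99, False: 0.004},    # DNA: sensitivity vs. random match
    "E2": {True: 0.70, False: 0.20},     # witness: hit rate vs. false ID
}

def posterior(observed):
    """P(H | observed) by enumerating over H. With many unobserved
    variables, this kind of enumeration blows up exponentially."""
    weight = {}
    for h in (True, False):
        w = p_guilty if h else 1 - p_guilty
        for name, seen in observed.items():
            p = p_evidence[name][h]
            w *= p if seen else 1 - p
        weight[h] = w
    return weight[True] / (weight[True] + weight[False])

print(posterior({"E1": True, "E2": True}))   # both pieces of evidence point to guilt
print(posterior({"E1": True, "E2": False}))  # DNA match, but no witness ID
```

Real cases add many more nodes and, crucially, dependencies among them, which is where pencil-and-paper methods, and early software, gave out.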
“Until the late 1980s there were no known efficient computer algorithms for doing the calculations,” Fenton and colleagues point out.
But nowadays, better computers — and more crucially, better algorithms — are available to compute the probabilities in just the sorts of complicated Bayesian networks that legal cases present. So Bayesian math now provides the ideal method for weighing competing evidence in order to reach a sound legal judgment. Yet the legal system seems unimpressed.
“Although Bayes is the perfect formalism for this type of reasoning, it is difficult to find any well-reported examples of the successful use of Bayes in combining diverse evidence in a real case,” Fenton and coauthors note. “There is a persistent attitude among some members of the legal profession that probability theory has no role in the courtroom.”
In one case in England, in fact, an appeals court denounced the use of Bayesian calculations, asserting that members of the jury should apply “their individual common sense and knowledge of the world” to the evidence presented.
Apart from the obvious idiocy of using common sense to resolve complex issues, the court’s call to apply “knowledge of the world” to the evidence describes exactly what Bayesian math does. Bayesian reasoning provides guidance for applying prior knowledge properly in assessing new knowledge (or evidence) to reach a sound conclusion. Which is what the judicial system is supposed to do.
Bayesian statistics offers a technical tool for avoiding fallacious reasoning. Lawyers should learn to use it. So should scientists. And then maybe someday justice will be done, and science and the law can work more seamlessly together. But as Fenton and colleagues point out, there remain “massive cultural barriers between the fields of science and law” that “will only be broken down by achieving a critical mass of relevant experts and stakeholders, united in their objectives.”