Big Data Alone Can’t Fix a Broken Bail System

(Bloomberg View) -- In the U.S., getting arrested can be extremely punishing, even for the innocent. People who can’t afford bail often languish in pretrial detention for months, missing school, work and other responsibilities crucial to their livelihoods. Some plead guilty just to get out. The burden falls disproportionately on people from heavily policed minority neighborhoods.

Authorities in Philadelphia think an algorithm might help where the human-run system has failed. Actually, it could make things even worse.

Philly has long been one of the most incarcerated cities in America. Families commonly do time together. At any given moment, about 3 percent of residents are on probation. People in pretrial detention — that is, people who should be presumed innocent — account for 30 percent of the overall prison population.

Now, Philly is joining the nationwide movement for bail reform. As part of a plan backed by a $3.5 million MacArthur Foundation grant, the city intends to employ a computer algorithm to help judges decide who can be trusted to return for a trial date. Using data such as age, the nature of the offense, and the number of previous arrests, the algorithm will spit out a risk-assessment score.

Objective as this might sound, activists are concerned. Hannah Sassaman, policy director of the Media Mobilizing Project, a Philadelphia-based anti-poverty group, sees a number of problems with such algorithms: They don’t necessarily get more people out of jail, they can’t compensate for judges’ biases, and they can actually reinforce biases by using inputs that serve as proxies for race.

The evidence backs up such concerns. In Lucas County, Ohio, the introduction of one commonly used algorithm — the Laura and John Arnold Foundation pretrial risk tool — had mixed results. In Kentucky, pretrial risk assessments had only a small and temporary effect on judges’ decisions, and did not change the racial disparity in detentions. In Chicago, despite an encouraging decline in the use of money bonds, the algorithms didn’t reliably change judges’ behavior.

That’s not to say that judges should strictly follow an algorithm’s recommendations. If a computer uses inputs that are correlated with race and class, it can be as unfair to certain groups as any bigoted human. Consider zip codes: If an algorithm puts weight on them, it will automatically be biased against people from neighborhoods where a lot of people get arrested and miss trial dates — which, in our segregated country, tends to mean black neighborhoods. The same goes for including prior arrests: In a country where whites and blacks smoke pot at similar rates but blacks are four times more likely to be arrested for the offense, this will unfairly lead to worse scores for blacks.
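To make the prior-arrests point concrete, here is a rough sketch in Python. Every number and function in it is hypothetical, invented for illustration: the usage rate, the yearly arrest probability and the linear scoring rule are assumptions, not the Philadelphia tool or any real risk model. The only figure taken from the argument above is the four-to-one arrest disparity.

```python
# Illustrative sketch only: the rates and the scoring rule are hypothetical,
# not the Philadelphia tool or any real risk-assessment model.

def expected_prior_arrests(years, use_rate, arrest_prob_per_year):
    """Expected arrests per person over `years`, given the share of the group
    engaging in the behavior and the yearly arrest probability for those who do."""
    return years * use_rate * arrest_prob_per_year


def toy_risk_score(prior_arrests, weight=2.0, base=1.0):
    """A made-up linear score: more prior arrests means a higher score."""
    return base + weight * prior_arrests


USE_RATE = 0.15          # assumed, and identical for both groups
BASE_ARREST_PROB = 0.02  # assumed yearly arrest probability for group A
DISPARITY = 4.0          # the "four times more likely" figure from the text

arrests_a = expected_prior_arrests(10, USE_RATE, BASE_ARREST_PROB)
arrests_b = expected_prior_arrests(10, USE_RATE, BASE_ARREST_PROB * DISPARITY)

score_a = toy_risk_score(arrests_a)
score_b = toy_risk_score(arrests_b)

print(f"group A: {arrests_a:.3f} expected arrests, score {score_a:.2f}")
print(f"group B: {arrests_b:.3f} expected arrests, score {score_b:.2f}")
```

Both groups behave identically in this toy setup, yet group B ends up with the higher score, purely because of how often that behavior leads to an arrest. The score never sees race; the arrest count carries it in.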

So, what variables will the Philly algorithm use? This has so far been kept secret, but the team of developers includes Dr. Richard Berk, whose probation model heavily weights zip codes and pays attention to prior charges (as opposed to convictions).

Sassaman and her group are trying to get in on the planning and testing of the algorithms before they get used. Her group wants more accountability, which means knowing exactly what the algorithms are trying to predict, what inputs they are using, and how “effectiveness” is defined. She says validity studies are often too long and technical for people without a Ph.D. in math to understand.

This is all part of a larger problem of accountability in big data algorithms, especially where policing and the justice system are involved. Almost none of the algorithms currently in use are transparent, and almost none come with publicly available audits that could expose the racial disparities in their scores. In New Orleans, police reportedly kept even the city council in the dark about their use of a Palantir Technologies system to predict crime.
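What might such an audit look like? One simple check, sketched here in Python, compares how often each group is wrongly flagged as high risk. The choice of metric is an assumption, since no particular audit method is specified here, and the records are invented for illustration; `false_positive_rates` is a hypothetical helper, not part of any real tool.

```python
from collections import defaultdict


def false_positive_rates(records):
    """records: iterable of (group, flagged_high_risk, missed_court) tuples.

    Returns, for each group, the share of people who did NOT miss court
    but were flagged high risk anyway.
    """
    flagged = defaultdict(int)
    total = defaultdict(int)
    for group, flagged_high_risk, missed_court in records:
        if not missed_court:  # only people who showed up can be false positives
            total[group] += 1
            if flagged_high_risk:
                flagged[group] += 1
    return {group: flagged[group] / total[group] for group in total}


# Hypothetical records, invented purely for illustration.
sample = [
    ("A", True, False), ("A", False, False), ("A", False, False),
    ("A", False, False), ("A", True, True),
    ("B", True, False), ("B", True, False), ("B", False, False),
    ("B", False, False), ("B", True, True),
]

rates = false_positive_rates(sample)
for group, rate in sorted(rates.items()):
    print(f"group {group}: false positive rate {rate:.2f}")
```

A gap between groups on a measure like this is the kind of disparity a published audit would let outsiders see and challenge.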

Here’s the irony: Bail reform doesn’t require algorithms. Changing standards alone can be much more effective. In Washington, D.C., for example, the courts have achieved much lower rates of pretrial detention, primarily without the aid of big data and computers, and without compromising public safety.

To be sure, algorithms can be used for good. Sassaman offers one promising idea: They could be employed to predict where police are most likely to commit acts of violence, so such areas can be more carefully monitored. Unlike information on regular folks, though, the input data might be hard to find. In New York City, for example, police misconduct records are kept secret by law.

This column does not necessarily reflect the opinion of the editorial board or Bloomberg LP and its owners.

Cathy O'Neil is a mathematician who has worked as a professor, hedge-fund analyst and data scientist. She founded ORCAA, an algorithmic auditing company, and is the author of "Weapons of Math Destruction."

©2018 Bloomberg L.P.