Uber Must Go Slow When Drivers Rate Riders

Scott Duke Kominers

13 Jun 2019, 09:00 PM IST

(Bloomberg Opinion) -- One notable feature of the mobile ride-hailing company Uber Technologies Inc. is that customers give drivers ratings, with five stars signaling excellence, and one star indicating unacceptable service. Drivers who get consistently low ratings are forced off the platform. At the same time, the platform allows drivers to rate riders -- although thus far those ratings haven’t been put to much use except in extreme cases.

But the company just announced a shift: it plans to start screening out riders whose ratings don’t make the cut.

Perhaps understandably, drivers have applauded the move. It might give customers an incentive to be more courteous, or at least establish some baseline level of civility. And it brings about a pleasing symmetry: Drivers have stressed about their ratings for years; it seems only fair to make riders sweat a bit, too.

The economist in me might even want Uber to go further; instead of simply banning rude or insufferable riders, it could charge nuisance riders higher prices. That’s more or less the way credit ratings work: they raise the cost of borrowing for individuals deemed to be bigger credit risks.

The growing ubiquity and impact of ratings in our lives is a real concern, as my Bloomberg Opinion colleague Noah Smith discussed the other day. Nevertheless, in ride-hailing, ratings-based pricing may have significant advantages.

It could motivate riders to improve their behavior, in much the same way that consumers try to raise their credit scores by making payments on time or avoiding taking on excessive debt. This might actually do more to balance behavior in the Uber ecosystem than outright disqualification would, since barred riders might just try to skirt the ban by opening new accounts. Indeed, without collecting more detailed user information, there isn't much Uber can do to stop riders from sidestepping their ratings by getting new phone numbers and credit cards.

But there are challenges in using Uber’s rider scores like credit ratings.

Whereas credit ratings are based on verifiable transactions and repayment history, Uber ratings are based on hearsay and subjective assessments of driver and rider performance. That means they’re more likely to reflect prejudices. Moreover, when there’s a dispute, it’s hard to ascertain what happened. That has already caused problems. For example, there’s a scam called -- I kid you not -- vomit fraud, in which drivers claim riders got sick while in transit and tack on phony clean-up fees of as much as $150.

A second problem is reciprocity: credit companies rate us, but we never get to rate them in return, no matter how much we may want to. Uber drivers and riders rate each other, and it’s already common for drivers to ask riders to give them five stars in exchange for a reciprocal five-star rating. The more riders’ ratings matter, the more willing riders will be to make these sorts of exchanges. That would reduce the amount of actual information the ratings contain.

Theoretically, that’s where algorithms come in -- a good algorithm should be able to separate the signal in ratings from the noise. If a driver or rider gets a large number of five-star ratings that are followed by complaints after the fact, for example, the algorithm should infer that some sort of ratings pressure may be going on.
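To make that idea concrete, here is a minimal sketch in Python of the kind of check such an algorithm might run. It is purely illustrative and does not describe Uber’s actual system: the `Trip` record, the function names, and the cutoffs (at least 20 five-star trips, more than 15 percent of them followed by complaints) are all assumptions chosen for the example.

```python
from dataclasses import dataclass


@dataclass
class Trip:
    stars: int             # rating received on this trip, 1 to 5
    complaint_later: bool  # True if a complaint was filed after the trip


def complaint_share(trips):
    """Share of five-star trips that were later followed by a complaint."""
    five_star = [t for t in trips if t.stars == 5]
    if not five_star:
        return 0.0
    return sum(t.complaint_later for t in five_star) / len(five_star)


def looks_pressured(trips, min_five_star=20, threshold=0.15):
    """Flag an account whose five-star ratings are unusually often
    contradicted by later complaints. Both cutoffs are hypothetical."""
    five_star_count = sum(1 for t in trips if t.stars == 5)
    return five_star_count >= min_five_star and complaint_share(trips) > threshold


# Hypothetical history: 25 five-star trips, 5 of them followed by complaints.
history = [Trip(5, True)] * 5 + [Trip(5, False)] * 20
flagged = looks_pressured(history)
```

A flag like this would at most be a prompt for closer review; it cannot by itself distinguish ratings pressure from, say, a handful of unrelated complaints.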

But even algorithms sometimes reflect inherent biases in the data and/or implicit biases in the data-generating process. Perhaps taking customers to the airport is especially lucrative, which makes drivers prefer those rides (and thus give those riders higher ratings); this could result in a bias in favor of wealthier jetsetters or businesspeople. Or if drivers don’t enjoy traveling to pick up passengers in low-income areas, the algorithm might infer a problem with riders based there, and this could quickly translate into a bias against the poor. Of course, we’d like our rating algorithms to self-correct these problems, too -- but while there’s hope, this problem is notoriously difficult.

And indeed, Uber has already faced concerns about whether its rating system indirectly reinforces discrimination on the platform. (Of course, credit-rating companies have the same sort of problem all the time and have tried to develop solutions.)

So as Uber starts making riders’ ratings count, there’s serious work to be done to ensure that the system is as objective and evenhanded as possible.

Yet Uber already has experience dealing with these issues on the driver side. And almost nobody -- other than bad apples themselves -- would argue that the ride-hailing ecosystem would be better if Uber didn’t screen out unsafe or otherwise unacceptable drivers. It’s hard to make a case that enforcement on the rider side should be any different.

Disclosure: Although I have not received compensation from Uber, I study ride-hailing marketplaces and have periodically talked about strategy and potential collaborative projects with Uber and other ride-hailing companies. I am involved in one early-stage project with Uber unrelated to the topic of this column. But I have not discussed the ideas in this column with anyone at Uber except in attempts to learn about the company’s rating system.

Even Uber’s competitors benefit when Uber eliminates bad drivers, since that bolsters the public perception that ride-hailing services are safe and efficient.

To contact the editor responsible for this story: James Greiff at jgreiff@bloomberg.net

This column does not necessarily reflect the opinion of the editorial board or Bloomberg LP and its owners.

Scott Duke Kominers is the MBA Class of 1960 Associate Professor of Business Administration at Harvard Business School, and a faculty affiliate of the Harvard Department of Economics. Previously, he was a junior fellow at the Harvard Society of Fellows and the inaugural research scholar at the Becker Friedman Institute for Research in Economics at the University of Chicago.