Can Twitter Get Us to Be Nice?

Social networks are designed to make us angry and keep us coming back for more. One of the worst offenders is trying to change that.

Twitter is great for lots of things. It’s one of the best places on the internet to get news. It’s full of funny and interesting commentary by comedians, celebrities, and journalists. It’s also a great place to watch people ruthlessly mock one another and very good for picking a fight with a stranger. No other technology is referred to as a cesspool more often. The app is great at being a cesspool.

But Twitter Inc. is trying to change that. It has spent the past year experimenting with subtle product tweaks designed to encourage healthier online behavior. It now alerts people who are about to retweet misinformation on topics such as elections and Covid-19, and it recently began asking people to actually read articles before retweeting. In some cases, if users try to tweet something mean or offensive, automated pop-ups now ask them to think twice before doing so.

These changes may sound modest, but they’re radical as far as tech companies go. The big social networks—Facebook, Twitter, and YouTube—have historically relied on rules to keep users in line, and even those haven’t always been clear or consistently enforced. But Twitter is unusual in that it’s been exploring changes that would discourage users from deliberate provocation or belligerence—behaviors that the service (like its peers) tacitly encouraged by turning the number of followers and likes into a sort of game. Chief Executive Officer and co-founder Jack Dorsey has said that the prominence of these metrics, in retrospect, was a mistake. Likes, he said at a 2019 conference, don’t “actually push what we believe now to be the most important thing, which is healthy contribution back to the network and conversation to the network.” As Twitter’s head of product, Kayvon Beykpour, puts it, the company wants “to build health more natively into the product.”

Twitter, in other words, is trying to do what may sound impossible: make its users nice—or at least nicer. The challenge may seem hilarious for a social network best known for the bellicosity of Donald Trump, whom it finally banned in early 2021. Twitter is the home of the “ratio” and the birthplace of the “dunk.” To get ratioed is to have thousands of strangers shout at you for saying something about Bitcoin, climate change, Covid-19, or any number of other polarizing topics. To be dunked on is to have somebody take your tweet and add his own commentary, often with a witty or clever insult appended—though not all dunks are witty. Among the most popular is the classic “F--- you.”

Dunk if you must, but Twitter’s efforts here seem at least a little bit promising, especially in light of new revelations from Facebook whistleblower Frances Haugen about that company’s reluctance to do anything about its impact on mental health and the spread of misinformation. If Twitter can somehow make people more civil, it would have implications for Facebook and other companies. “We don’t know whether these are just tweaks that mainly produce some good PR for Twitter” or if they might be “fundamentally shifting things in the right direction,” says Susan Benesch, faculty associate at the Berkman Klein Center for Internet & Society at Harvard. “Only Twitter can really find that out.”
 

In its early years, likes and follower counts were Twitter’s main draw. They gave newcomers a sense of validation—and the prospect of amassing a giant audience drew in celebrities and world leaders. In 2009 actor Ashton Kutcher was dubbed the King of Twitter by the Queen of Daytime Television, Oprah Winfrey, when he beat CNN in a race to become the first account with 1 million followers. He celebrated the feat by popping a bottle of Champagne in a livestream video.

In retrospect, the stunt probably helped normalize unhealthy behavior, but back at Twitter headquarters in San Francisco, executives were just happy to be growing quickly. “Had we known our little project was going to become a big deal—which was a stretch at the time—then yeah, I wish we had sat down and said, ‘What would we like people ideally to do?’ ” recalls co-founder Biz Stone. Like Dorsey, Stone knew that displaying metrics such as a user’s follower count meant people would try to make those numbers go up. It just didn’t seem like an issue at the time. “We were like, ‘This is fun,’ ” Stone says.

The attraction for Kutcher and other famous people back in 2009 is the same thing that draws in users today: reach. Not only can you amass an audience, but there is a mechanism, the retweet, to get your message beyond your fans. Retweets, which let a user broadcast someone else’s message to her own followers, also happen to be ideal for spreading misinformation and encouraging harassment. Dunks are just retweets with a dollop of mockery on top.

The retweet evolved organically. In Twitter’s early days, people were reposting manually by typing out the original tweet and then adding “RT” and the person’s name before hitting send. Twitter decided to make this easier and hired software developer Chris Wetherell to lead the project. Wetherell, who now runs the audio startup Myxt, would later have misgivings, though retweets didn’t feel dangerous at the time. Twitter “wasn’t known for being a place where a great deal of harm could happen,” he says. His co-workers and bosses seemed more worried about keeping up with Facebook Inc., Wetherell recalls, and users just wanted their likes and followers to keep going up.

It wasn’t until Wetherell saw how retweets could quickly spread hate during GamerGate, a 2014 trolling campaign, that he really started to worry. Alt-right trolls harassed female game developers and journalists, publishing their home addresses and even threatening to kill or rape them. A prominent developer, Zoë Quinn, had to leave home because of these threats. Meanwhile, Twitter just kept growing. “Social media companies don’t have an obvious stake in what we become when we use their products,” Wetherell says. “Like if people become radical and they abandon truth, the companies aren’t harmed, because often outraged people use their product more.”

Shortly before the 2020 U.S. election, the company started a test where it showed a pop-up if a user tried to retweet a news article without first clicking on it. “Headlines don’t tell the full story,” the alert read. Twitter said the prompt was intended to “promote informed discussion,” a polite way of saying it was meant to cut down on the sharing of misinformation and inflammatory headlines.

The prompt was effective. People who saw it opened articles 40% more often, the company said when it announced that the change would become permanent. Another retweet-focused test from the same time led to more ambiguous results. When people clicked to retweet, Twitter would automatically open a text box encouraging them to add their own commentary to the post (known as a “quote tweet”) instead of just sharing it on its own. The prompt led to a 26% increase in quote tweets and a 23% decrease in retweets, Twitter said, but quote tweets and retweets in total declined 20%. It was a clear sign that the prompt was discouraging people from simply passing along someone else’s post to their own followers, and Twitter acknowledged that it “slowed the spread of misleading information by virtue of an overall reduction in the amount of sharing on the service.” Yet Twitter ultimately abandoned the prompt, claiming that many people didn’t really add their own thoughts to a post before sharing it. In fact, 45% of those quote tweets contained just one word. “Quote Tweets didn’t appear to increase context,” the company concluded.

A few other tests have been more successful. If someone tries to say something that Twitter deems offensive in reply to a tweet, the company will run interference. It uses software to detect curse words, slurs, and epithets, some of which trigger a pop-up. “Want to review this before tweeting?” the warning asks. The intervention seems to be working: In 34% of cases, people either edit their post or don’t send it at all, Twitter says. Another test it started in October tries to warn people about the “vibe” of a conversation before they chime in, using an algorithm based on, according to the company, “the Tweet topic and the relationship between the Tweet author and replier” to detect if a conversation “could get heated or intense.” Fittingly, Twitter was dunked on when it made the announcement by users who pointed out that it still won’t let you edit a tweet after it’s been sent. “Everyone: can we get an edit button?” tweeted @AngryBlackLady. “Twitter: we’re introducing the vibe-o-matic 3000.”

Other fixes attempted to address the “reply guy” problem. The issue, a subset of the ratio, involves men (reply guys) who chime in and mansplain to strangers (typically women) at every opportunity. Twitter started letting users hide replies, and it’s testing downvotes on replies, which the company will eventually use to decide how prominent they should be. Downvotes won’t be public, but they’ll serve as a signal to Twitter that people aren’t happy with a particular reply. What, exactly, is Twitter trying to weed out? “Low-quality replies,” Beykpour tweeted. “Jerk-like behavior or irrelevant commentary.”

Beykpour tends to emphasize that Twitter’s new interventions are modest—“speed bumps,” he calls them—but they represent a radical change in strategy for a social network. Tech firms, as a rule, design products that push people to interact more often and more quickly. Bumps of any kind are unnatural. “It is a bit counterintuitive for designers, because we generally want to design for engagement,” says Anita Patwardhan Butler, Twitter’s design director for “health,” the word it uses for its effort to make people nicer to one another.

Not all the changes are as simple as tweaking design. The site has historically operated as a place where people follow other people, but the company has added “topics” so that users can follow tweets about an interest area instead of a person. It’s building “communities” as well, where people can share tweets with small groups of others with a common interest. In both instances, the features diminish the need to follow other individual users.

Changing the follower mindset will be “a huge fundamental shift,” Dorsey said in 2019, and pushing people toward following topics could decrease the importance of follower counts while helping people find the stuff they care about faster. “More relevance means less time on the service, and that’s perfectly fine,” he said.

But is it really? Like Facebook, Twitter makes money through advertising, and the more time people spend on Twitter, the more ads they’re shown. It’s been a lucrative business model—Twitter generated more than $1 billion in advertising revenue in the second quarter, though that’s a drop in the bucket compared with Facebook’s $28.6 billion for the same period. It’s a conflict that may not be fixable, says Rebekah Tromble, a George Washington University professor who has studied Twitter for years. “Until the business model changes in a way that people are no longer incentivized to just say outrageous things over and over and over again, Twitter—and really just about every other large social media platform—is likely never going to be a truly healthy place,” she says.

Twitter still shows the metrics associated with inflammatory or aggressive behavior online, such as like counts and retweets, even though Dorsey and others have said they regret their existence. Unlike competitors such as Instagram, Twitter has never tested hiding “like” counts inside the app. Beykpour says the idea is on the table. “There are no sacred cows here,” he says.

Moreover, Twitter’s recent history gives critics reasons to be skeptical. The company has often been slow to make changes of any kind, and it hasn’t been especially responsive to outside researchers hoping to clean up the service. In 2018 the Berkman Klein Center’s Benesch announced an experiment in which Twitter would send specially crafted messages to users reminding them about the company’s content rules to see which reminders would discourage offensive and inappropriate tweets. After she negotiated details of the study with Twitter employees and got the company’s permission, her project was shut down unexpectedly after just one month. (Benesch says Twitter blamed a coding error.)

She worked with Twitter to launch the experiment a second time in 2019, but the company again shuttered it just after it started. In the end, she says, she received only a small subset of the data she needed from the first effort and none from the second. “We were trying to shift the behavior of Twitter users on a very large scale,” Benesch says. “It was very disappointing.”

Tromble, the researcher from George Washington, had a similarly frustrating experience. At the beginning of Twitter’s push to make cleaning up the service a priority, Dorsey announced plans to create metrics to measure health on the service. “If you want to improve something, you have to be able to measure it,” he tweeted.

Tromble was part of a group of outside researchers selected by Twitter to help create these new metrics, including one that’s designed to see if people will engage with others who have different beliefs and another to measure intolerance. But red tape delayed the project, and it took Tromble almost three years before she started getting data. A second research group selected for the project abandoned the effort entirely.

Finally starting the research has been encouraging, Tromble says. “It was a long, very, very bumpy road, but I think Twitter is showing a lot of signs of finally understanding what this process needs to look like,” she says. Twitter says it regularly works with researchers and created a Trust and Safety Council of outside groups. The Dangerous Speech Project, a group founded by Benesch that studies online speech, is on the council.

The goal to measure Twitter’s health is an important step. “When you give a team a metric to optimize for, you can get really good at optimizing for that metric,” Beykpour says. Among the new metrics Twitter is now tracking: “uninformed sharers,” which measures how many people retweet articles before reading them. The objective is to get the figure down. Twitter is also tracking (and trying to reduce) “violative impressions,” the number of times people see tweets that violate the company rules before they’re removed.

These efforts have been greeted warmly even by those who’ve been frustrated by the company’s past failures. “I am encouraged that there are quite a lot of people at Twitter who are really trying hard,” says Benesch. After almost 15 years fostering epic ratios and world-class dunks, Twitter is trying to change its own behavior. Soon we’ll see if it can do the same for the rest of us.
 

©2021 Bloomberg L.P.
