Twitter trials anti-troll tool that automatically blocks abusive users

The feature is being trialled among a small group of users, with an emphasis on female journalists and members of marginalised communities. Photograph: Matt Rourke/AP

Twitter is trialling an anti-troll feature that will automatically block accounts sending abuse to users.

Once Twitter’s new “safety mode” is activated by a user, it will temporarily block accounts for seven days if the tech firm’s systems spot them using harmful language or sending repetitive, uninvited replies and mentions.

It comes as social media firms face continued pressure to protect users from online abuse, such as the targeting of black England footballers after the Euro 2020 final.

“We want you to enjoy healthy conversations, so this test is one way we’re limiting overwhelming and unwelcome interactions that can interrupt those conversations,” said Jarrod Doherty, a senior product manager at Twitter.

“Our goal is to better protect the individual on the receiving end of tweets by reducing the prevalence and visibility of harmful remarks.”

The feature will be trialled initially among a small group of users, described as a “feedback group”, with a particular emphasis on female journalists and members of marginalised communities.

The safety mode will be available on iOS, Android and desktop and can be turned on via the Twitter settings. It will not block accounts that users follow or interact with frequently.

Accounts that are autoblocked will not be able to follow your account, see your tweets or send direct messages for one week. However, users will be able to view the details of blocked accounts at any time in order to rapidly undo any misinterpretations by Twitter’s systems.

“We won’t always get this right and may make mistakes, so safety mode autoblocks can be seen and undone at any time in your settings. We’ll also regularly monitor the accuracy of our safety mode systems to make improvements to our detection capabilities,” said Doherty.

Twitter said it had consulted experts in online safety, mental health and human rights as it developed the anti-abuse feature, with those same people helping nominate members of the feedback group.

Article 19, a UK-based digital rights group that took part in the talks, called the feature “another step in the right direction”.

Doherty said the feature would undergo changes before being introduced to the site’s more than 200 million active users: “We’ll observe how safety mode is working and incorporate improvements and adjustments before bringing it to everyone on Twitter.”

Twitter permanently suspended 56 users the day after the Euro 2020 final in July after a public and political outcry over abusive tweets directed at Marcus Rashford, Jadon Sancho and Bukayo Saka. However, the Guardian learned last month that 30 of the suspended users had since reposted on the network, often under slightly altered usernames.

A Guardian study of Twitter messages directed at and naming the England team during the three group stage matches also identified more than 2,000 abusive messages, including scores of racist posts.

Twitter has already rolled out a new prompt to users who are about to send a tweet that its algorithms believe could be “harmful or offensive”. Those who try to send such a message will be asked if they “want to review this before tweeting”, with the options to edit, delete, or send anyway. In May the company said trials had shown the feature had helped reduce the posting of abuse.

Richard Hartley

