The Data from Facebook Doesn’t Go Far Enough

Facebook recently published its latest quarterly update on how it enforces its rules for users. On the surface, the global numbers look hugely impressive: across the world, Facebook detected and removed millions of posts containing hate speech or inciting violence during the period. But because the company does not disclose the total volume of hate speech on the platform and does not make country-level data available, the information is of little practical value.

Facebook has set the industry standard for transparency on enforcement, and for this it should be acknowledged. But if Facebook is serious about preventing the most grievous harms associated with its platforms, it needs to release country-level data so that leaders and community groups can take action that could save lives.

In the third quarter of 2020, Facebook removed 22.1 million pieces of content for hate speech violations. This figure is global, but hate speech is highly language- and context-dependent. For those trying to understand and prevent online hate speech and related offline violence in countries such as Ethiopia or Myanmar, or even in the United States, a global metric is of no use.

The addition of a global ‘prevalence’ metric for hate speech to Facebook’s most recent Community Standards Enforcement Report (CSER) is both welcome and frustrating. Prevalence is an estimate of how much hate speech is actually on Facebook: it tells us how large the problem is and puts Facebook’s moderation figures in context.

Facebook’s new prevalence metric tells us that roughly one in every 1,000 content views on Facebook is a view of hate speech. Extrapolated across Facebook’s 2.7 billion monthly active users, this figure is striking. But the global figure doesn’t tell the whole story. That’s because Facebook is heavily reliant on automated, AI-driven systems to detect hate speech. And we know that Facebook’s automated systems are unable to consistently identify hate speech in many major languages. Indeed, in the majority of the world’s languages automation doesn’t work at all.
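
To make the scale concrete, here is a minimal back-of-the-envelope sketch. The prevalence rate and the monthly active user count come from Facebook’s own reporting; the number of content views per user per day is a hypothetical assumption made purely for illustration, since Facebook does not publish that figure.

```python
# Illustrative back-of-the-envelope sketch only.
# PREVALENCE and MONTHLY_ACTIVE_USERS come from Facebook's Q3 2020 reporting;
# ASSUMED_VIEWS_PER_USER_PER_DAY is a hypothetical figure, not a published number.

PREVALENCE = 1 / 1000                    # ~0.1% of content views contain hate speech
MONTHLY_ACTIVE_USERS = 2_700_000_000     # 2.7 billion monthly active users
ASSUMED_VIEWS_PER_USER_PER_DAY = 100     # hypothetical assumption for illustration

daily_views = MONTHLY_ACTIVE_USERS * ASSUMED_VIEWS_PER_USER_PER_DAY
daily_hate_speech_views = daily_views * PREVALENCE

print(f"Assumed daily content views: {daily_views:,.0f}")
print(f"Implied daily views of hate speech: {daily_hate_speech_views:,.0f}")
```

Under that (hypothetical) viewing assumption, a 0.1% prevalence rate implies hundreds of millions of hate speech views every day, which is why even a small-sounding global rate matters.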

Of the hate speech that Facebook found and took action on globally in the last quarter, around 95% was detected through automation, with the rest reported to the platform by users. This high percentage of automation is presented, and has largely been accepted, as an unalloyed good. Certainly it shows huge improvements in AI, and represents major investment in this key area. But if the significant majority of the hate speech Facebook takes action on is in the handful of languages where its automated systems perform very well, then for the rest of the world hate speech on Facebook remains largely unmoderated. Automated enforcement is not uniform, and global metrics serve to mask this disparity.

This is why prevalence metrics (assuming they accurately capture the data in the first place) are essential to assessing Facebook’s enforcement. If your local police department told you that 95% of the murders solved in an area were solved through AI, you would think the AI was working incredibly well. But if it also turned out that 99% of murders in your community remained unsolved, the AI would no longer look so impressive. Facebook is focusing our attention on the first measurement and not providing the other.
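
The distinction in that analogy can be shown with a small sketch using toy numbers; none of the figures below are real Facebook or police statistics, they simply separate “share of enforcement done by automation” from “share of the overall problem addressed.”

```python
# Toy numbers only, to illustrate the analogy above.

total_cases = 1000        # total instances of the problem (e.g. all hate speech posts)
cases_actioned = 20       # instances the system actually catches and acts on
actioned_by_ai = 19       # of those, how many were caught by automation

ai_share_of_enforcement = actioned_by_ai / cases_actioned     # the figure Facebook reports
share_of_problem_missed = 1 - cases_actioned / total_cases    # the figure prevalence would reveal

print(f"Share of enforcement done by AI: {ai_share_of_enforcement:.0%}")        # 95%
print(f"Share of the problem left unaddressed: {share_of_problem_missed:.0%}")  # 98%
```

Both numbers can be true at once: automation can handle nearly all of what gets caught while nearly all of the problem still goes uncaught. Only a prevalence figure exposes the second half of that picture.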

Following extensive lobbying from civil society, Myanmar is the only country in the world for which Facebook has provided limited localized enforcement figures for hate speech content. These numbers highlight the critical role of localization in moderation. They show that in the last quarter of 2017, when violence against the Rohingya was at its peak, just 13% of the hate speech that Facebook took action on in Myanmar was detected through automation. By the second quarter of 2020 this figure had risen to 97.8%, reflecting two and a half years of investment in and prioritization of automated Burmese-language hate speech detection in response to the crisis: a huge improvement.

Unfortunately, Facebook has still not disclosed any localized prevalence figures for hate speech in Myanmar, or anywhere else for that matter, so we still don’t know how much hate speech its systems are missing. But Facebook does make these estimates, and could share them if it chose.

Localized prevalence measurements have real potential to save lives. Facebook shouldn’t keep these numbers to itself. It was recently revealed that in the US Facebook has an internal metric for “violence and incitement,” which it tracks in real time. If Facebook shared this data, it might help us understand whether an increase in the volume of violent speech is an accurate predictor of real-world violence.

We do not know whether Facebook tracks these metrics in countries beyond the US, but it should. In countries like Ethiopia and India, online hate speech and incitement continue to be linked to real-world ethnic and religious conflict. If Facebook has internal data showing a significant rise in hate speech or incitement to violence in these areas, data that could help activists and others take preemptive action before it spills into real-world harm, why not disclose it?

Facebook has a history of contributing data insights to humanitarian causes. Localized prevalence and enforcement data could magnify these efforts and help monitor and mitigate crises before they turn into humanitarian emergencies. Beyond helping Facebook to track and communicate its performance, such data could save lives.

Rafiq Copeland serves as the Global Platform Accountability Advisor at Internews, a nonprofit that supports digital rights and independent media in 100 countries.

(Banner photo courtesy of PxHere)