Anjana Susarla, Michigan State University
Meta’s decision to change its content moderation policies by replacing centralized fact-checking teams with user-generated community labeling has stirred up a storm of reactions. But taken at face value, the changes raise the question of how effective Meta’s old approach, fact-checking, was, and how effective its new one, community labeling, is likely to be.
With billions of people worldwide accessing their services, platforms such as Meta’s Facebook and Instagram have a responsibility to ensure that users are not harmed by consumer fraud, hate speech, misinformation or other online ills. Given the scale of this problem, combating online harms is a serious societal challenge. Content moderation plays a role in addressing these online harms.
Moderating content involves three steps. The first is scanning online content – typically, social media posts – to detect potentially harmful words or images. The second is assessing whether the flagged content violates the law or the platform’s terms of service. The third is intervening in some way. Interventions include removing posts, adding warning labels to posts, and diminishing how much a post can be seen or shared.
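To make these three steps concrete, here is a minimal, hypothetical sketch in Python of a moderation pipeline. The keyword list, thresholds and actions are illustrative placeholders of my own, not any platform’s actual rules or systems.

```python
# Hypothetical sketch of the three-step moderation flow described above.
# The flagged terms, decisions and actions are illustrative stand-ins only.

FLAGGED_TERMS = {"miracle cure", "guaranteed returns"}  # stand-in for a detection model

def scan(post_text: str) -> bool:
    """Step 1: detect potentially harmful content (here, a naive keyword match)."""
    text = post_text.lower()
    return any(term in text for term in FLAGGED_TERMS)

def assess(post_text: str) -> str:
    """Step 2: decide whether flagged content violates policy (placeholder logic)."""
    if "guaranteed returns" in post_text.lower():
        return "violation"   # e.g., likely consumer fraud
    return "borderline"      # needs context; a label rather than removal

def intervene(decision: str) -> str:
    """Step 3: apply an intervention such as removal, labeling or reduced reach."""
    return {
        "violation": "remove post",
        "borderline": "add warning label and reduce distribution",
    }.get(decision, "no action")

post = "Invest now for guaranteed returns!"
if scan(post):
    print(intervene(assess(post)))  # -> "remove post"
```

Real systems replace the keyword match with machine learning classifiers and human review, but the scan–assess–intervene structure is the same.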
Content moderation can range from user-driven moderation models on community-based platforms such as Wikipedia to centralized content moderation models such as those used by Instagram. Research shows that both approaches are a mixed bag.
Does fact-checking work?
Meta’s previous content moderation policy relied on third-party fact-checking organizations, which brought problematic content to the attention of Meta staff. Meta’s U.S. fact-checking organizations were AFP USA, Check Your Fact, FactCheck.org, Lead Stories, PolitiFact, Science Feedback, Reuters Fact Check, TelevisaUnivision, The Dispatch and USA TODAY.
Fact-checking relies on impartial expert review. Research shows that it can reduce the effects of misinformation but is not a cure-all. Its effectiveness also depends on whether users perceive fact-checkers, and the organizations they work for, as trustworthy.
Crowdsourced content moderation
In his announcement, Meta CEO Mark Zuckerberg highlighted that content moderation at Meta would shift to a community notes model similar to that of X, formerly Twitter. X’s Community Notes is a crowdsourced fact-checking approach that allows users to write notes to inform others about potentially misleading posts.
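To illustrate the basic idea behind this kind of crowdsourced labeling, here is a simplified, hypothetical sketch in Python of one way a platform could decide whether to display a note: only when raters from different viewpoint groups agree that it is helpful. The group names, thresholds and scoring logic are my own illustrative assumptions, not X’s actual algorithm, which uses a more sophisticated statistical model of rater viewpoints.

```python
# Simplified, hypothetical sketch of crowdsourced labeling: a note is shown only
# when raters from different viewpoint groups broadly agree it is helpful.
# This is not X's actual Community Notes scoring; it illustrates only the basic
# "cross-perspective agreement" idea.

from collections import defaultdict

def note_is_shown(ratings, min_per_group=2, threshold=0.7):
    """ratings: list of (rater_group, is_helpful) tuples for one note."""
    by_group = defaultdict(list)
    for group, helpful in ratings:
        by_group[group].append(helpful)
    # Require enough ratings from at least two groups, and broad agreement in each.
    qualified = [
        votes for votes in by_group.values()
        if len(votes) >= min_per_group and sum(votes) / len(votes) >= threshold
    ]
    return len(qualified) >= 2

ratings = [("group_a", True), ("group_a", True), ("group_b", True), ("group_b", False)]
print(note_is_shown(ratings))  # False: group_b does not reach the agreement threshold
```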
Studies are mixed on the effectiveness of X-style content moderation efforts. A large-scale study found little evidence that the introduction of community notes significantly reduced engagement with misleading tweets on X. Rather, it appears that such crowd-based efforts might be too slow to effectively reduce engagement with misinformation in the early and most viral stage of its spread.
There have been some successes from quality certifications and badges on platforms. However, community-provided labels might not be effective in reducing engagement with misinformation, especially when they’re not accompanied by appropriate training about labeling for a platform’s users. Research also shows that X’s Community Notes is subject to partisan bias.
Crowdsourced initiatives such as the community-edited online reference Wikipedia depend on peer feedback and rely on having a robust system of contributors. As I have written before, a Wikipedia-style model needs strong mechanisms of community governance to ensure that individual volunteers follow consistent guidelines when they authenticate and fact-check posts. People could game the system in a coordinated manner and up-vote interesting and compelling but unverified content.
Content moderation and consumer harms
A safe and trustworthy online space is akin to a public good, but without motivated people willing to invest effort for the greater common good, the overall user experience could suffer.
Algorithms on social media platforms aim to maximize engagement. But policies that encourage engagement can also result in harm, which makes content moderation a matter of consumer safety and product liability as well.
This aspect of content moderation has implications for businesses that use Meta to advertise or to connect with their customers. Content moderation is also a brand safety issue because platforms must balance keeping the social media environment safe against driving greater engagement.
AI content everywhere
Content moderation is likely to be further strained by growing amounts of content generated by artificial intelligence tools. AI detection tools are flawed, and developments in generative AI are challenging people’s ability to differentiate between human-generated and AI-generated content.
In January 2023, for example, OpenAI launched a classifier that was supposed to differentiate between texts generated by humans and those generated by AI. However, the company discontinued the tool in July 2023 due to its low accuracy.
There is potential for a flood of inauthentic accounts – AI bots – that exploit algorithmic and human vulnerabilities to monetize false and harmful content. For example, they could commit fraud and manipulate opinions for economic or political gain.
Generative AI tools such as ChatGPT make it easier to create large volumes of realistic-looking social media profiles and content. AI-generated content primed for engagement can also exhibit significant racial and gender biases. In fact, Meta faced a backlash over its own AI-generated profiles, with commentators labeling them “AI-generated slop.”
More than moderation
Regardless of the type of content moderation, the practice alone is not effective at reducing belief in misinformation or at limiting its spread.
Ultimately, research shows that combining fact-checking with platform audits and partnerships with researchers and citizen activists is important for ensuring safe and trustworthy community spaces on social media.
Anjana Susarla, Professor of Information Systems, Michigan State University
This article is republished from The Conversation under a Creative Commons license. Read the original article.