Meta Reports 50% Drop in Moderation Errors with Community Notes

Meta says its Community Notes system slashed moderation mistakes by half, but shifting enforcement to users raises fresh challenges for creators and brands.

Meta’s latest transparency reports show a 50% decrease in moderation enforcement errors, a shift the company credits to its growing reliance on the Community Notes system and, more broadly, on AI systems integrated into product development and management. For creators, brands, and marketers who depend on social networks for growth, the change could influence both visibility and brand safety in unpredictable ways.

In the reports, Meta argues that empowering users to apply Community Notes has led to better moderation and fewer wrongful takedowns, especially in the US. Community Notes now cover more surfaces, including Reels and replies in Threads, and users can even request that a note be added to a particular post.

Increasing AI Reliance Across User Experience and Content Moderation

As Meta CEO Mark Zuckerberg highlighted in a recent overview of AI’s impact, Meta is increasingly using AI-powered systems for internal development and management tasks, including coding, ad targeting, and product risk assessment. According to internal company documents cited by NPR, privacy and integrity reviews were until recently handled mostly by humans, but “up to 90% of all risk assessments will soon be automated” across Facebook and Instagram, covering product development and rule changes.

Key changes introduced include:

  • 50% reduction in reported moderation mistakes in the US
  • Expanded Community Notes coverage to Reels and Threads replies
  • Option for users to request new Community Notes
  • Increase in content removals for nudity/sexual and dangerous-organizations content, partially due to detection bugs
  • Rise in suicide, self-harm, and eating disorder content identified, linked to improved detection algorithms
  • Slight decline in automated detection for bullying, harassment, and hate speech
  • Proactive removal of more spam content on Instagram
  • Fake accounts estimated at just 3% of active users, down from Meta’s earlier estimate of roughly 5%

This shift toward community-driven enforcement does have precedent. Twitter (now X) implemented a similar system, but research shows over 85% of X’s Community Notes never reach users’ feeds, especially on divisive issues. If Meta’s approach follows suit, much misinformation could escape moderation, and the celebrated drop in misplaced enforcement might simply reflect fewer posts being reviewed by moderators at all.

Meta’s deployment of large language models has also altered content detection standards. While AI now surpasses prior machine-learning tools in some policy areas, the company admits it has scaled back stringent automation in categories where false positives ran high. In line with this, Meta announced earlier this year a change in its approach to “less severe” policy violations, deactivating automated systems found to be error-prone and requiring “a much higher degree of confidence before a piece of content is taken down.” Meta describes this as getting rid of most content demotions and reducing restrictive actions except when there is high certainty of violation.

For creators, that means some questionable material may slip through, while other legitimate content stands a better chance of surviving. Data for Q1 2025 shows that as Meta dialed back certain automated systems, automated detection of bullying and harassment on Facebook declined by 12%, meaning more of this content now reaches users as a result of the platform’s more cautious enforcement stance. In raw numbers, this could amount to millions of additional posts and comments seen within Meta’s apps.

Government data requests to Meta held steady, with India at the top of the list and the US, Brazil, and Germany not far behind. Meanwhile, the portion of fake accounts flagged is lower than before, signaling confidence in Meta’s detection models but raising questions about reporting accuracy.

For creators and publishers, the visibility of genuine news links remains low. During Q1 2025, over 97% of US Facebook post views involved no outbound link—a figure consistent with previous quarters. While some outlets have noticed marginal improvement, broad trends still sideline external referrals in favor of on-platform content, viral stories, and sensational posts.

On Instagram, product updates and content moderation shifts could further affect how creators build engagement. Instagram’s recent updates to visual presentation may also influence how Reels and other surfaces integrate with Community Notes.

Meta’s moderation shift introduces both opportunities and risks. For marketers and creators, relaxed policies might mean fewer erroneous takedowns but potentially greater exposure to harmful or misleading content. The reduced sensitivity in automated detection—especially in areas like harassment and hate speech—suggests a trade-off between user freedom and platform safety. While Meta is confident in the ability of its AI systems to manage moderation and integrity at scale, critics warn that such automation could expose users to more adverse content if the systems fail or err.

Looking ahead, Meta not only plans to enhance detection technology using advanced language models and extend Community Notes’ reach, but is also investing in AI to automate code development and internal processes. Mark Zuckerberg has projected that “sometime in the next 12 to 18 months,” most of Meta’s evolving code base will be written by AI. Still, Meta maintains that only “low-risk” product review decisions are being automated for now—although the ongoing shift offers a glimpse at a future where automated systems directly shape user experience at massive scale.

As platforms increasingly let communities self-police content while handing more responsibility to AI, the results will shape brand safety, audience trust, and long-term strategy for creators. The effectiveness of Community Notes and the scalability of AI moderation remain open questions as Meta adapts its enforcement playbook for 2025 and beyond.
