Most platforms approach online hate the same way: detect it, flag it, remove it. The focus is on protecting users from exposure to harmful content.
But what if we could do more? What if instead of just removing content, we could create the conditions for people to change?
Read the full design document: AI-Powered Continuous Improvement System for Responding to Online Hate (PDF)
The vision
Imagine a system that:
- Detects harmful content in real time using NLP
- Responds with contextually appropriate interventions—not just removal, but counterspeech, reflective questions, peer engagement
- Amplifies constructive responses through a curated "Peace Feed"
- Learns which interventions work through continuous feedback
This isn't just content moderation—it's behavior change at scale.
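As a rough illustration of the "learns which interventions work" step, the sketch below uses an epsilon-greedy bandit to pick among intervention types and update their estimated success from feedback. The bandit is a swapped-in technique for illustration only, not something the design document is known to specify. The class name, the intervention labels, and the reward signal (for example, 1.0 if a follow-up reply is judged constructive) are all assumptions.

```python
import random
from dataclasses import dataclass, field

# Hypothetical labels, loosely mirroring the intervention types discussed in this post.
INTERVENTIONS = ["counterspeech", "reflective_question", "peer_visibility", "nudge"]


@dataclass
class InterventionLearner:
    """Epsilon-greedy choice over intervention types (illustrative assumption)."""

    epsilon: float = 0.1
    counts: dict = field(default_factory=lambda: {name: 0 for name in INTERVENTIONS})
    values: dict = field(default_factory=lambda: {name: 0.0 for name in INTERVENTIONS})

    def choose(self, rng: random.Random) -> str:
        # Explore with probability epsilon; otherwise exploit the best estimate so far.
        if rng.random() < self.epsilon:
            return rng.choice(INTERVENTIONS)
        return max(INTERVENTIONS, key=lambda name: self.values[name])

    def record(self, name: str, reward: float) -> None:
        # Update the running mean reward observed for this intervention type.
        self.counts[name] += 1
        self.values[name] += (reward - self.values[name]) / self.counts[name]


if __name__ == "__main__":
    learner = InterventionLearner(epsilon=0.2)
    rng = random.Random(42)
    choice = learner.choose(rng)
    learner.record(choice, reward=1.0)  # reward definition is an assumption
    print(choice, learner.values[choice])
```

How the reward is defined matters more than the selection rule: a system that rewards silence rather than changed behavior would learn to suppress, not to repair.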
Why Bluesky?
Bluesky's decentralized architecture (the AT Protocol) makes new approaches possible:
- Custom feeds can curate content algorithmically—imagine a "Peace Feed" that surfaces restorative interactions
- Decentralized identity (accounts are tied to portable DIDs rather than to a single server) means interventions can follow users across the network
- Open protocol allows experimentation without platform gatekeeping
Centralized platforms optimize for engagement. Decentralized networks let us optimize for different values.
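To make the custom feed idea concrete: a Bluesky feed generator answers `app.bsky.feed.getFeedSkeleton` requests with a list of post URIs and an optional cursor. The sketch below builds that response shape from posts that have already been scored. The function name, the "restorative score", the example URIs, and the offset-based cursor are assumptions for illustration; serving the endpoint and producing the scores are out of scope here.

```python
from typing import Optional


def build_feed_skeleton(scored_posts, limit: int = 30, cursor: Optional[str] = None) -> dict:
    """Build a getFeedSkeleton-style response from (at_uri, score) pairs.

    The score is a hypothetical "restorative" score computed elsewhere.
    The cursor is a plain offset encoded as a string (an assumption).
    """
    start = int(cursor) if cursor else 0
    # Highest-scoring posts first.
    ranked = sorted(scored_posts, key=lambda item: item[1], reverse=True)
    page = ranked[start:start + limit]
    response = {"feed": [{"post": uri} for uri, _ in page]}
    next_start = start + len(page)
    if next_start < len(ranked):
        # Only include a cursor when more posts remain.
        response["cursor"] = str(next_start)
    return response


if __name__ == "__main__":
    # Placeholder URIs, not real posts.
    posts = [
        ("at://did:example:alice/app.bsky.feed.post/1", 0.2),
        ("at://did:example:bob/app.bsky.feed.post/2", 0.9),
        ("at://did:example:carol/app.bsky.feed.post/3", 0.5),
    ]
    print(build_feed_skeleton(posts, limit=2))
```

The skeleton carries only URIs, so the curation logic stays entirely in how posts are scored and ranked.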
Intervention types
Different situations call for different responses. Drawing on the framework in the design document, potential interventions include:
Counterspeech
Responding to hate with speech, not censorship. Research by Susan Benesch and others shows counterspeech can be effective—especially when it comes from in-group members, uses humor or empathy, and provides alternative narratives.
Reflective questions
Prompts that encourage self-reflection without attacking: "What made you feel that way?" "How do you think they might respond?" These draw on restorative practices and motivational interviewing.
Peer visibility
Making constructive responses visible and socially rewarded. Research on complex contagion suggests that when people see several of their peers engaging positively, they're more likely to do the same.
Virtual restorative circles
Facilitated dialogue between affected parties. Not always possible online, but structured formats can create space for accountability and repair.
Behavioral nudges
Small friction points before posting (e.g., "Are you sure you want to share this?") and positive reinforcement for constructive engagement.
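A pre-posting nudge could be as simple as a threshold check on a classifier score. The sketch below assumes a toxicity score between 0 and 1 produced by some upstream model; the threshold value and function name are hypothetical, and the prompt text is the example given above.

```python
from typing import Optional

# Hypothetical cutoff; a real system would tune this against feedback data.
NUDGE_THRESHOLD = 0.8


def pre_post_nudge(toxicity_score: float, threshold: float = NUDGE_THRESHOLD) -> Optional[str]:
    """Return a friction prompt if the draft scores at or above the threshold, else None."""
    if not 0.0 <= toxicity_score <= 1.0:
        raise ValueError("toxicity_score must be between 0 and 1")
    if toxicity_score >= threshold:
        return "Are you sure you want to share this?"
    return None


if __name__ == "__main__":
    print(pre_post_nudge(0.93))  # draft flagged by the assumed upstream classifier
    print(pre_post_nudge(0.10))  # draft passes without friction
```

Returning the prompt rather than blocking the post keeps the decision with the user, which is the point of a nudge.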
The feedback loop
What makes this approach different from static interventions is continuous learning: