Perfil monitorado

Anthropic

@AnthropicAI

We're an AI safety and research company that builds reliable, interpretable, and steerable AI systems. Talk to our AI assistant @claudeai on https://t.co/FhDI3KQh0n.

Posts coletados: 10 posts
Última publicação: Último · 11 de mai, 13:56
Frequência: Sync · 15 min

11 mai · 13:56·ver no X
Claude's Constitution is now an audiobook, read by two of its authors, Amanda Askell and Joe Carlsmith. It includes a Q&A on the writing process, the philosophies that shaped the document, and how it might change as models become more capable. Listen at anthropic.com/constitution
Vídeo
8 mai · 14:52·ver no X
New Anthropic research: Teaching Claude why. Last year we reported that, under certain experimental conditions, Claude 4 would blackmail users. Since then, we’ve completely eliminated this behavior. How?
8 mai · 14:52·ver no X
We found that training Claude on demonstrations of aligned behavior wasn’t enough. Our best interventions involved teaching Claude to deeply understand why misaligned behavior is wrong. Read more: anthropic.com/research/teach…