TL;DR
Anthropic released Claude Opus 4.8 with reduced hallucination rates, mid-conversation system prompt injection, and lower prompt cache minimums while maintaining previous pricing.
Key Points
- 4x lower incorrect-rate than predecessor by abstaining on uncertain questions rather than confident false claims
- Mid-conversation system messages now supported, enabling dynamic prompt steering without restating full context and preserving prompt cache hits
- Prompt cache minimum reduced from previous threshold to 1,024 tokens, lowering costs for agentic loops
- Pricing unchanged: $5/million input, $25/million output tokens; knowledge cutoff remains January 2026
Why It Matters
For developers building LLM applications, reduced hallucination rates improve reliability in production systems. Mid-conversation system messages enable more sophisticated agentic patterns without cache invalidation, directly reducing costs and latency in long-running conversations. The honesty improvements matter for code generation workflows where flagging uncertainty prevents silent bugs.
Source: simonwillison.net