Anthropic Unveils Claude Sonnet 4.5, Claiming Leap in AI Agent and Coding Capabilities

Key Points
- Claude Sonnet 4.5 ran autonomously for 30 hours, producing roughly 11,000 lines of code for a Slack‑like chat app.
- Anthropic claims the model leads the market in real‑world agents, coding, and computer use.
- Beta tester Canva praised the model’s handling of complex, long‑context tasks across engineering and research.
- Anthropic says the model is especially strong in cybersecurity, financial services, and research applications.
- Anthropic introduced developer tools—virtual machines, memory, context management, and multi‑agent support—to aid custom AI agent creation.
- OpenAI’s recent Pulse feature highlights the competitive push for consumer‑focused AI assistants.
- Product leaders highlighted capabilities such as meeting scheduling, dashboard analysis, and automated status‑update generation.
- Anthropic says the model is more than three times as skilled at browser navigation and computer use as its technology from the previous October.
Anthropic announced its new Claude Sonnet 4.5 model, highlighting a 30‑hour autonomous coding run that produced roughly 11,000 lines of code for a chat application. The company touts the model as the leading solution for real‑world agents, coding, and computer use, noting strong performance in cybersecurity, financial services, and research. Early testers such as Canva reported success with complex, long‑context tasks. Anthropic also introduced developer‑focused updates—including virtual machines, memory, and multi‑agent support—to help build custom AI agents, positioning the launch amid fierce competition from OpenAI and Google.
Breakthrough Model Release
Anthropic introduced Claude Sonnet 4.5, describing it as the most capable model for real‑world agents, coding, and computer use. In a demonstration, the model operated autonomously for 30 hours, building a chat app comparable to Slack or Teams and producing about 11,000 lines of code before completing the task. The company contrasted this with its earlier Opus 4 model, which had run autonomously for seven hours.
Enhanced Computer Use and Skill Level
Dianne Penn, Anthropic’s head of product management, said the new model is more than three times as skilled at navigating browsers and using a computer as the company’s technology from the previous October. Feedback from early‑access customers, described as “the GitHubs and Cursors of the world,” drove an intensive month of development focused on improving these capabilities.
Beta Tester Feedback
Canva, a beta tester, reported that Claude Sonnet 4.5 helped with “complex, long‑context tasks—from engineering in our codebase to in‑product features and research.” The model’s strengths were also highlighted in sectors such as cybersecurity, financial services, and research.
Competitive Landscape
The launch comes as other AI leaders, including OpenAI and Google, continue to roll out incremental updates aimed at both consumer assistants and enterprise tools. OpenAI recently announced a new ChatGPT feature called Pulse, designed for users’ morning routines and research needs.
Developer‑Focused Enhancements
Anthropic paired the model release with a suite of developer tools, promising access to virtual machines, memory, context management, and multi‑agent support. These components are described as the building blocks that power Claude Code, enabling developers to construct advanced AI agents.
Practical Applications
Scott White, product lead for Claude.ai, illustrated the model’s ability to schedule meetings, analyze data dashboards, and generate status updates based on one‑on‑one meetings. Penn also shared her own use case, employing the model for deep web searches, profile sourcing, and automatically generating spreadsheets of LinkedIn profiles for hiring purposes.