Today, we’re announcing Claude 3.7 Sonnet1, our most intelligent model to date and the first hybrid reasoning model on the market. Claude 3.7 Sonnet can produce near-instant responses or extended, step-by-step thinking that is made visible to the user. API users also have fine-grained control over how long the model can think for.

Claude 3.7 Sonnet shows particularly strong improvements in coding and front-end web development. Along with the model, we’re also introducing a command line tool for agentic coding, Claude Code. Claude Code is available as a limited research preview, and enables developers to delegate substantial engineering tasks to Claude directly from their terminal.

  • simple@lemm.ee
    link
    fedilink
    English
    arrow-up
    11
    ·
    2 days ago

    in developing our reasoning models, we’ve optimized somewhat less for math and computer science competition problems, and instead shifted focus towards real-world tasks that better reflect how businesses actually use LLMs.

    I was just about to say how useless these benchmarks are. Plenty of LLMs claim to be better than Claude and GPT4, but in real world use they’ve always been more reliable. Claude especially. Good to hear they’re not just chasing scores.