MiroThinker v1.5 Open-Source Agent Achieves 80.8% on GAIA Benchmark

TL;DR

MiroThinker v1.5 introduces interactive scaling for agentic AI, supporting up to 400 tool calls per task within a 256K context window and surpassing commercial agents on multiple benchmarks.

Key Points

Achieves 80.8% on GAIA-Val-165 and 71.5% on BrowseComp-ZH, outperforming Kimi-K2-Thinking while using 1/30th of its parameters
Introduces interactive scaling as a third performance dimension, alongside model size and context length
Available at 30B and 235B parameter scales, supporting a 256K context window and up to 400 tool calls per task
Includes MiroFlow agent framework (82.4% GAIA score) and MiroVerse training dataset (147k samples) on HuggingFace
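
One way to picture the tool-call and context limits listed above is as two budgets on an agent loop. The sketch below is illustrative only and is not MiroThinker's or MiroFlow's actual code: `run_agent`, `AgentState`, `estimate_tokens`, the `policy` callback, and the `search` tool are all hypothetical names, the four-characters-per-token estimate is a crude stand-in for a real tokenizer, and reading "256K" as 256,000 tokens is an assumption.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional, Tuple

MAX_TOOL_CALLS = 400          # per-task cap quoted in the summary
MAX_CONTEXT_TOKENS = 256_000  # "256K" read as 256,000 tokens (assumption)


@dataclass
class AgentState:
    history: List[str] = field(default_factory=list)
    tool_calls: int = 0
    tokens_used: int = 0


def estimate_tokens(text: str) -> int:
    # Crude stand-in for a real tokenizer: roughly 4 characters per token.
    return max(1, len(text) // 4)


def run_agent(
    task: str,
    policy: Callable[[AgentState], Tuple[str, str]],
    tools: Dict[str, Callable[[str], str]],
    max_tool_calls: int = MAX_TOOL_CALLS,
    max_context_tokens: int = MAX_CONTEXT_TOKENS,
) -> Tuple[Optional[str], AgentState]:
    """Call tools until the policy answers or one of the budgets runs out."""
    state = AgentState(history=[task], tokens_used=estimate_tokens(task))
    while state.tool_calls < max_tool_calls:
        action, arg = policy(state)
        if action == "answer":
            return arg, state
        observation = tools[action](arg)
        state.tool_calls += 1
        cost = estimate_tokens(observation)
        if state.tokens_used + cost > max_context_tokens:
            break  # the observation no longer fits in the context budget
        state.history.append(observation)
        state.tokens_used += cost
    return None, state  # a budget was exhausted before an answer was produced


# Hypothetical policy and tool, for demonstration only.
def demo_policy(state: AgentState) -> Tuple[str, str]:
    if state.tool_calls >= 3:
        return ("answer", f"done after {state.tool_calls} tool calls")
    return ("search", f"query {state.tool_calls}")


demo_tools = {"search": lambda q: f"results for {q}"}

if __name__ == "__main__":
    answer, final_state = run_agent("find X", demo_policy, demo_tools)
    print(answer, final_state.tool_calls)
```

In this framing, raising `max_tool_calls` gives the agent more rounds of interaction with its environment without changing the model size or the context length, which is roughly the extra dimension the summary refers to.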

Why It Matters

Open-source agentic AI is rapidly closing the gap with commercial systems. MiroThinker demonstrates that careful architectural choices around tool-use scaling can achieve competitive performance at lower computational cost, making advanced research agents accessible to developers without massive budgets. The interactive scaling approach offers a new optimization dimension for the community to explore.

Source: github.com