
TabICLv2 Achieves State-of-the-Art Tabular ML Performance

TL;DR

TabICLv2, a transformer-based tabular foundation model, outperforms heavily tuned XGBoost and CatBoost on about 80% of TabArena datasets with no hyperparameter tuning of its own, and runs 10x faster than TabPFN-2.5.

Key Points

  • 10x faster inference than TabPFN-2.5 on an H100 GPU; fits and predicts on 50K samples × 100 features in under 10 seconds
  • Outperforms heavily tuned XGBoost/CatBoost on ~80% of TabArena datasets without any hyperparameter tuning
  • Handles datasets from 300 to 600K+ samples and up to 2,000 features; runtime complexity is O(n² + nm²) for n samples and m features
  • Scikit-learn compatible, pip-installable, open source with permissive license; supports KV caching for faster repeated inference
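
To make the O(n² + nm²) figure concrete, here is a minimal sketch that evaluates the two terms for a few dataset shapes within the ranges quoted above. It assumes n is the number of samples and m the number of features, which the summary does not spell out, and it ignores constant factors, so the numbers indicate relative growth only, not actual runtimes. The helper names `cost_terms` and `dominant_term` are hypothetical and not part of TabICLv2.

```python
# Assumption (not stated in the article): n = number of samples,
# m = number of features. Constant factors are ignored, so these
# values show relative growth only, not measured runtimes.

def cost_terms(n_samples: int, n_features: int) -> dict[str, int]:
    """Return the two terms of O(n^2 + n*m^2) as raw operation counts."""
    return {
        "n^2": n_samples ** 2,                  # grows with the sample count
        "n*m^2": n_samples * n_features ** 2,   # grows with the feature count
    }


def dominant_term(n_samples: int, n_features: int) -> str:
    """Name the larger of the two terms for a given dataset shape."""
    terms = cost_terms(n_samples, n_features)
    return max(terms, key=terms.get)


# Illustrative shapes chosen from the ranges quoted in the key points.
for n, m in [(50_000, 100), (50_000, 2_000), (600_000, 100)]:
    terms = cost_terms(n, m)
    print(f"n={n:>7,} m={m:>5,}  n^2={terms['n^2']:.1e}  "
          f"n*m^2={terms['n*m^2']:.1e}  dominant={dominant_term(n, m)}")
```

Under these assumptions the n² term is the larger one whenever n > m², so the sample count drives cost for the 50K × 100 case, while the feature term takes over for wide tables such as 50K × 2,000.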

Why It Matters

TabICLv2 democratizes high-performance tabular ML by eliminating the hyperparameter tuning burden that makes XGBoost/CatBoost expensive to deploy. For data scientists and ML engineers, this means faster experimentation cycles and production models that work out of the box on structured data, the most common ML use case.

Source: github.com