TL;DR
Feldera, a novel incremental computation engine supporting full SQL syntax, processes millions of events per second while avoiding expensive full dataset recomputation.
Key Points
- Evaluates arbitrary SQL incrementally including joins, aggregates, window functions, correlated subqueries, and recursive queries
- Achieves millions of events per second on laptop hardware without tuning; handles datasets exceeding RAM via NVMe spilling
- Backed by formal DBSP mathematical model (VLDB 2023 paper); guarantees strong consistency matching batch system semantics
- Supports Kafka, HTTP, CDC streams, S3, data lakes, and warehouses; includes fault tolerance with exactly-once semantics
Why It Matters
This addresses a fundamental gap in data processing: existing systems force tradeoffs between batch (slow, comprehensive) and streaming (fast, limited SQL). Incremental computation at full SQL expressiveness enables unified pipelines for real-time feature engineering, ETL, and analytics without recomputation overhead—critical for cost-sensitive and latency-sensitive workloads.
Source: github.com