Dropbox Builds Custom Feature Store for Real-Time AI Ranking

TL;DR

Dropbox engineered a hybrid feature store that combines Feast, Spark, and a custom Go serving layer to deliver sub-100ms latency for AI-powered search ranking across millions of documents.

Key Points

  • Rewrote Python feature serving in Go to achieve p95 latencies of 25-35ms, handling thousands of requests per second
  • Reduced batch ingestion write volumes from 100M+ to under 1M records per run, cutting update times from 1+ hour to 5 minutes
  • Integrated Feast orchestration with Dynovault (in-house DynamoDB-compatible storage) for ~20ms client-side latency without public internet calls
  • Implemented three-part ingestion system: batch processing, streaming signals, and direct writes to balance freshness with reliability at scale
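The Go rewrite mentioned above works because goroutines let feature lookups fan out in parallel without Python's GIL serializing them. Below is a minimal, hypothetical sketch of that pattern: an in-process map stands in for the real online store (Dropbox's layer reads from Dynovault, not local memory), and all type and function names are illustrative, not from Dropbox's codebase.

```go
package main

import (
	"fmt"
	"sync"
)

// FeatureStore is a toy online store guarded by an RWMutex so many
// ranking requests can read concurrently while ingestion writes stay safe.
type FeatureStore struct {
	mu       sync.RWMutex
	features map[string]map[string]float64 // docID -> feature name -> value
}

func NewFeatureStore() *FeatureStore {
	return &FeatureStore{features: make(map[string]map[string]float64)}
}

// Put upserts a single feature value for a document.
func (s *FeatureStore) Put(docID, name string, value float64) {
	s.mu.Lock()
	defer s.mu.Unlock()
	if s.features[docID] == nil {
		s.features[docID] = make(map[string]float64)
	}
	s.features[docID][name] = value
}

// Get fetches one feature value; ok is false when it is missing.
func (s *FeatureStore) Get(docID, name string) (float64, bool) {
	s.mu.RLock()
	defer s.mu.RUnlock()
	v, ok := s.features[docID][name]
	return v, ok
}

// BatchGet fans lookups out across goroutines -- the kind of cheap
// request-level parallelism that motivated moving serving from Python to Go.
func (s *FeatureStore) BatchGet(docIDs []string, name string) map[string]float64 {
	out := make(map[string]float64, len(docIDs))
	var outMu sync.Mutex
	var wg sync.WaitGroup
	for _, id := range docIDs {
		wg.Add(1)
		go func(id string) {
			defer wg.Done()
			if v, ok := s.Get(id, name); ok {
				outMu.Lock()
				out[id] = v
				outMu.Unlock()
			}
		}(id)
	}
	wg.Wait()
	return out
}

func main() {
	s := NewFeatureStore()
	s.Put("doc-1", "click_rate", 0.42)
	s.Put("doc-2", "click_rate", 0.17)
	got := s.BatchGet([]string{"doc-1", "doc-2", "doc-3"}, "click_rate")
	fmt.Println(len(got)) // doc-3 has no features, so only 2 results
}
```

In a real serving path the per-document fetch would be a network call to the online store, which is exactly when this fan-out pays off: total latency approaches the slowest single lookup rather than the sum of all of them.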

Why It Matters

This deep-dive reveals production architecture decisions for scaling ML inference at enterprise scale. Engineers building high-throughput ranking systems will find concrete solutions to common bottlenecks: Python's GIL limitations, feature freshness tradeoffs, and hybrid on-prem/cloud infrastructure challenges. The approach of combining open-source foundations (Feast) with custom infrastructure offers a practical middle ground between building from scratch and vendor lock-in.

Source: dropbox.tech