What is the difference between lexical and semantic retrieval?

Lexical retrieval scores documents by exact term overlap using inverted indexes (e.g., BM25), which gives precise keyword matches but misses synonyms. Semantic retrieval uses dense vector embeddings to capture meaning and paraphrases, but it requires computing embeddings for every document up front, and its scores are harder to interpret.
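
To make the contrast concrete, here is a minimal sketch in Python. The three-dimensional vectors are hand-assigned toy values chosen for illustration, not the output of any real embedding model, and the helper names (term_overlap, cosine) are hypothetical.

```python
import math

def term_overlap(query, doc):
    """Lexical signal: number of distinct terms shared by query and document."""
    return len(set(query.lower().split()) & set(doc.lower().split()))

def cosine(u, v):
    """Semantic signal: cosine similarity between two dense vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

# ASSUMPTION: hand-assigned toy vectors, not produced by a real model.
toy_embeddings = {
    "cheap car": [0.9, 0.1, 0.0],
    "affordable automobile": [0.8, 0.2, 0.1],
    "banana bread recipe": [0.0, 0.1, 0.9],
}

query = "cheap car"
others = [d for d in toy_embeddings if d != query]

# Lexical: no shared terms, so the paraphrase looks as irrelevant as the recipe.
lexical = {d: term_overlap(query, d) for d in others}

# Semantic: the paraphrase sits close to the query in vector space.
semantic = {d: cosine(toy_embeddings[query], toy_embeddings[d]) for d in others}
```

With these toy values, the lexical score cannot tell "affordable automobile" apart from "banana bread recipe", while the cosine score separates them clearly.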

Why is BM25 still the standard for production search?

BM25 remains the production standard because it effectively balances term frequency saturation, inverse document frequency, and length normalization. It is computationally efficient, interpretable, and highly effective for exact-match queries, often outperforming neural methods on specific keyword tasks.
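
The three components named above can be sketched in a few lines of Python. This is one common formulation of BM25 (with a smoothed, non-negative IDF), not a reference implementation; the function name bm25_scores and the defaults k1=1.5 and b=0.75 are assumptions chosen for illustration, and production engines differ in details.

```python
import math
from collections import Counter

def bm25_scores(query, docs, k1=1.5, b=0.75):
    """Score each tokenized document in `docs` against a tokenized `query`."""
    n_docs = len(docs)
    avg_len = sum(len(d) for d in docs) / n_docs
    # Document frequency: how many documents contain each term.
    df = Counter(term for d in docs for term in set(d))
    scores = []
    for d in docs:
        tf = Counter(d)
        # Length normalization: longer-than-average documents are penalized.
        norm = 1 - b + b * len(d) / avg_len
        score = 0.0
        for term in query:
            if term not in tf:
                continue
            # Inverse document frequency: rare terms weigh more.
            idf = math.log(1 + (n_docs - df[term] + 0.5) / (df[term] + 0.5))
            # Term frequency saturation: repeated terms give diminishing returns.
            score += idf * tf[term] * (k1 + 1) / (tf[term] + k1 * norm)
        scores.append(score)
    return scores

docs = [
    "the cat sat on the mat".split(),
    "dogs chase cats in the park".split(),
    "the stock market fell sharply today".split(),
]
scores = bm25_scores("cat mat".split(), docs)
```

Only the first document scores above zero here. The second contains "cats" but not "cat", which shows the exact-match behavior: without stemming or synonyms, near misses contribute nothing.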

How do modern systems handle the scale of billions of documents?

Modern systems use layered architectures: offline indexing with optimized data structures (like HNSW or B-trees), online candidate selection to retrieve top-k documents quickly, and a reranking phase that applies expensive models only to promising candidates.
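
The layered flow can be sketched as follows. This is a toy illustration under stated assumptions: the inverted index is a plain in-memory dictionary, and expensive_score is a hypothetical stand-in (Jaccard overlap) for a costly reranking model. All function names are invented for this example.

```python
import heapq
from collections import defaultdict

def build_inverted_index(docs):
    """Offline step: map each term to the set of document ids containing it."""
    index = defaultdict(set)
    for doc_id, text in enumerate(docs):
        for term in text.lower().split():
            index[term].add(doc_id)
    return index

def select_candidates(index, query, k):
    """Online step: cheaply pick the k documents matching the most query terms."""
    hits = defaultdict(int)
    for term in set(query.lower().split()):
        for doc_id in index.get(term, ()):
            hits[doc_id] += 1
    # Ties are broken in favor of the lower document id.
    return heapq.nlargest(k, hits, key=lambda d: (hits[d], -d))

def expensive_score(query, text):
    """ASSUMPTION: stand-in for a costly reranker; here, Jaccard overlap."""
    q, t = set(query.lower().split()), set(text.lower().split())
    return len(q & t) / len(q | t)

def search(docs, index, query, k=3, top_n=2):
    """Run candidate selection, then rerank only the k candidates."""
    candidates = select_candidates(index, query, k)
    reranked = sorted(
        candidates,
        key=lambda d: expensive_score(query, docs[d]),
        reverse=True,
    )
    return reranked[:top_n]

docs = [
    "fast vector search with graphs",
    "vector search",
    "cooking pasta at home",
    "search engines rank documents",
]
index = build_inverted_index(docs)
results = search(docs, index, "vector search")
```

The point of the design is that expensive_score runs on at most k documents per query regardless of corpus size, while the cheap index lookup does the work of discarding everything else.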

The Mechanics of Information Retrieval: Ranking Relevance at Scale
