How BM25 and RAG Retrieve Information Differently?
When you type a query into a search engine, something has to decide which documents are actually relevant — and how to rank them. BM25 (Best Matching 25) , the algorithm powering search engines like Elasticsearch and Lucene, has been the dominant answer to that question for decades. It scores documents by looking at three things: how often your query terms appear in a document, how rare those terms are across the entire collection, and whether a document is unusually long. The clever part is that BM25 doesn’t reward keyword stuffing — a word appearing 20 times doesn’t make a document 20 times more relevant, thanks to term frequency saturation. But BM25 has a fundamental blind spot: it only matches the words you typed, not what you meant. Search for “finding similar content without exact word overlap” and BM25 returns a blank stare. This is exactly the gap that Retrieval-Augmented Generation (RAG) with vector embeddings was built to fill — by matching meaning, not just ke...
