RAGs to Riches

There’s something incredibly exciting about recognizing patterns in technology cycles.

The current AI wave feels like a sudden gold rush. But gold rushes don’t create gold. They reveal what was already there.

Large language models (LLMs) are trained on vast amounts of data, but that knowledge is static at the moment training ends. They do not automatically know your latest documents, your internal data, or what changed yesterday. Retrieval-Augmented Generation (RAG) systems bridge that gap. By retrieving relevant information in real time and injecting it as context, we augment what the model sees and improve what it predicts next.

And beneath that entire mechanism lies a discipline that has been evolving for decades: information retrieval.

This post explores how modern AI architectures build on earlier search technologies, what truly changed when retrieval began feeding generative models, and why the real riches belong to those who understand the foundations.

Long before RAG, there was Search

Long before RAG had a name, teams were solving the same fundamental problem we are solving today:

How do you represent knowledge so that machines can retrieve the right piece of information at the right time?

In the early days of large-scale enterprise systems, organizations were drowning in documents, emails, contracts, research, logs, and reports. The challenge was not generation. It was retrieval. Companies like Vivisimo, Endeca, FAST, and Autonomy were building systems that indexed enormous corpora and ranked results by relevance.

Then came open ecosystems like Apache Solr and Elasticsearch, which democratized powerful indexing and retrieval infrastructure. Under the hood, these systems relied on:

  • TF-IDF
  • Cosine similarity
  • Vector space models
  • Relevance ranking algorithms

Documents were transformed into mathematical representations.
Queries were transformed into mathematical representations.
Similarity was computed.
Results were ranked.
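Those four steps fit in a few lines of Python. This is a toy illustration with a made-up three-document corpus, not how Solr or Elasticsearch implement it, but the mechanics are the same:

```python
import math
from collections import Counter

# A tiny corpus; a real index would hold millions of documents.
docs = [
    "the contract terms and renewal dates",
    "quarterly research report on retrieval systems",
    "email log of contract negotiations",
]

def tf_idf_vectors(corpus):
    """Transform each document into a sparse TF-IDF vector (term -> weight)."""
    n = len(corpus)
    tokenized = [doc.split() for doc in corpus]
    # Document frequency: how many documents contain each term.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    return [
        {
            term: (count / len(tokens)) * math.log(n / df[term])
            for term, count in Counter(tokens).items()
        }
        for tokens in tokenized
    ]

def cosine(a, b):
    """Cosine similarity between two sparse vectors."""
    norm = lambda v: math.sqrt(sum(w * w for w in v.values()))
    if norm(a) == 0 or norm(b) == 0:
        return 0.0
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    return dot / (norm(a) * norm(b))

# Vectorize the query alongside the corpus (a simplification: real
# systems compute IDF from the corpus alone), then rank by similarity.
query = "contract renewal"
vectors = tf_idf_vectors(docs + [query])
doc_vecs, query_vec = vectors[:-1], vectors[-1]
ranked = sorted(
    enumerate(doc_vecs),
    key=lambda pair: cosine(pair[1], query_vec),
    reverse=True,
)
```

Running this ranks the contract document first, the unrelated research report last. Everything that follows in this post is a refinement of those thirty lines.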

That is information retrieval.

And it has been evolving quietly for decades.

What Actually Changed

Here’s the distinction that matters:

The big shift isn’t retrieval itself. The big shift is that now retrieval feeds a generative model.

For years, retrieval systems returned ranked lists. Links. Documents. Snippets.

Now, retrieval happens first and generation happens second.

The retrieved content is passed into a large language model as additional context. That augmented context expands what the model sees before it begins predicting what to say next.

Instead of presenting results, we use those retrieved documents to condition the model’s output. Instead of asking users to read ten documents, we compress them into one synthesized response grounded in that retrieved information.

The model is still doing what language models do best: predicting the next token.
It’s just doing it with better, fresher, more relevant context.

That shift is significant.

But the retrieval layer?

That’s evolutionary, not revolutionary.

The intelligence didn’t replace search.

It stands on top of it.

Where RAG Lives

RAG is not a new type of model. It’s an architectural pattern.

A RAG system retrieves relevant information and injects that information into a language model’s prompt before generation occurs.

The language model does not query databases.
It does not browse your documents.
It does not have live access to your knowledge base.

An external retrieval system finds the most relevant pieces of information and places them inside the model’s context window as part of the input.

The model then generates a response conditioned on that supplied context.

Retrieval becomes a preprocessing step.
The prompt becomes a container for retrieved knowledge.
Generation becomes synthesis over selected documents.
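A minimal sketch of that pattern, assuming you already have some retriever to call (here a hard-coded stand-in), might look like this. The prompt template and helper names are illustrative, not any particular framework's API:

```python
def build_rag_prompt(question, retrieve):
    """Retrieve first, then pack the results into the model's prompt.

    `retrieve` is whatever search backend you already have; it just
    needs to return a list of relevant text chunks for the question.
    """
    chunks = retrieve(question)
    context = "\n\n".join(f"[{i + 1}] {c}" for i, c in enumerate(chunks))
    # The prompt is the container: retrieved knowledge goes in,
    # and the model generates conditioned on it.
    return (
        "Answer the question using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}\nAnswer:"
    )

# A stand-in retriever; in practice this is your search system.
def fake_retrieve(question):
    return ["Solr indexes documents.", "Elasticsearch ranks by relevance."]

prompt = build_rag_prompt("How do search engines rank results?", fake_retrieve)
```

Note what is absent: the model never touches a database. The retrieval step runs entirely outside it, and the model only ever sees the assembled string.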

RAG doesn’t replace search. It operationalizes it inside the prompt.

The Engine Beneath RAG: Vector Databases

If RAG is the architecture, vector databases are often the engine.

They are the systems responsible for storing and retrieving the representations that make augmented generation possible.

Modern tools like Chroma, FAISS, and pgvector, along with a growing ecosystem of managed vector stores, power much of today’s retrieval infrastructure.

But let’s strip away the branding.

Today’s vector databases:

  • Store high-dimensional vectors
  • Index them for approximate nearest neighbor search
  • Return the top-K most similar results using cosine similarity, dot product, or another distance metric
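Assuming the embeddings are already computed, and substituting a brute-force scan for the approximate index a real vector database would use, the top-K step is just a few lines of NumPy:

```python
import numpy as np

def top_k(query_embedding, embeddings, k=2):
    """Brute-force top-K by cosine similarity over dense vectors.

    Real vector databases use approximate nearest neighbor indexes
    (HNSW, IVF, and so on) to avoid scanning every vector, but the
    scoring idea is the same.
    """
    normed = embeddings / np.linalg.norm(embeddings, axis=1, keepdims=True)
    q = query_embedding / np.linalg.norm(query_embedding)
    sims = normed @ q                # cosine similarity against every document
    idx = np.argsort(-sims)[:k]      # indices of the K best matches
    return idx, sims[idx]

# Toy 4-dimensional "embeddings"; real ones have hundreds of dimensions.
embeddings = np.array([
    [1.0, 0.0, 0.0, 0.0],
    [0.9, 0.1, 0.0, 0.0],
    [0.0, 0.0, 1.0, 0.0],
])
query_embedding = np.array([0.9, 0.1, 0.0, 0.0])
idx, scores = top_k(query_embedding, embeddings, k=2)
```

The query matches document 1 exactly (similarity 1.0) and document 0 closely; document 2 is orthogonal and never makes the cut.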

If you remove the hype, that’s still similarity search in a vector space.

The math changed from sparse TF-IDF vectors to dense learned embeddings.

But conceptually?

We are still:

  • Representing documents as vectors
  • Representing queries as vectors
  • Measuring similarity
  • Ranking results

We just swapped hand-crafted signals for learned representations.

It’s a massive engineering improvement.

It is not a new category of problem.

The Illusion of Overnight Intelligence

Every technological wave creates amnesia.

  • We rename ideas.
  • We polish them.
  • We add compute.
  • We improve the interface.

And suddenly it feels new.

But retrieval, ranking, and representation have been hard problems for decades.

In fact, search has always been one of the hardest problems in computer science. We just finally built a front end that makes it feel like magic.

The language model is the elegant interface. The retrieval system underneath is the result of years of refinement.

Old Skills Don’t Disappear

Here’s the most important lesson: Old skills don’t disappear. They get renamed.

Relevance tuning becomes prompt context strategy.
Index design becomes embedding pipelines.
Query optimization becomes RAG orchestration.

The vocabulary shifts. The foundations remain.

Which should be encouraging.

AI is not a lightning strike from nowhere.

It is layered innovation.

Search.
Statistics.
Linear algebra.
Distributed systems.
Optimization.

Decades of work, now recombined.

If you’ve ever worked on search, indexing, ranking, or data modeling, you weren’t early to something obsolete.

You were early to this.

So Where Does the Leverage Come From?

Leverage belongs to the people who understand what’s underneath. The ones who:

  • Know that what you retrieve determines what you generate
  • Care about data quality
  • Understand ranking
  • Think in vectors, whether sparse or dense

When a new wave hits, it’s easy to focus on what feels new. But the real advantage often belongs to those who recognize the patterns beneath it.

AI didn’t appear overnight. It compounded.

And if you’ve been thinking about how knowledge is represented, compared, and ranked for years, this isn’t your first gold rush. It’s the moment the rest of the world finally noticed.

The riches aren’t in the hype.

They’re in the depth.

— Andre Lessa