RAG vs. Non-RAG LLMs: What DOTs Need to Know
Who Comes Up with All These Acronyms?
If you’ve spent any time reading about AI (which based on the conversations we’re having is…all of you!), you’ve probably run into the acronym RAG—which stands for Retrieval-Augmented Generation. (And of course, LLMs — Large Language Models).
And you probably sighed, rolled your eyes, and muttered: “Another acronym? Really?” Don’t ask us who names these things.
But despite the clunky name, RAG is an important concept for state DOTs considering AI-driven solutions. And understanding why RAG exists and what its limitations are is critical for making informed decisions when evaluating vendors.
Let’s break it down in plain terms.
LLMs Have Token Limits (And RAG Tries to Fix Them)
Most people assume that AI models—like ChatGPT or Google Gemini—can just “read” a document like a person does. That’s not how they work.
LLMs have a token limit: a maximum amount of text they can process at one time. Think of it as the model’s “working memory.” Some models, like Gemini, can handle thousands of pages at once. But eventually, you hit an upper limit.
🔹 What if you have a million pages? What about ten million? 🔹
At that scale, the AI simply can’t keep everything in memory at once. That’s where RAG comes in.
How RAG Works (And Where It Fails for DOTs)
Instead of trying to process an entire 600-page standard operations manual at once, RAG breaks it into smaller chunks and retrieves only the most relevant pieces when you ask a question.
It’s like using a search engine: instead of remembering everything, the model pulls up the snippets it thinks are useful and stitches them together to generate an answer – it cherry-picks.
✅ This is great for searching and retrieving specific facts.
🚨 But it introduces a major problem for what transportation agencies are trying to solve: loss of context.
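To make the retrieval step concrete, here’s a minimal sketch, not any vendor’s actual system. Real tools score chunks with vector embeddings and a similarity index; simple word overlap stands in for that scoring here, and the toy “manual” text is made up for illustration.

```python
def chunk(text: str, size: int) -> list[str]:
    """Split a document into fixed-size chunks of `size` words."""
    words = text.split()
    return [" ".join(words[i:i + size]) for i in range(0, len(words), size)]

def retrieve(question: str, chunks: list[str], top_k: int = 2) -> list[str]:
    """Return the top_k chunks sharing the most words with the question.
    (Stand-in for the embedding-similarity search real RAG systems use.)"""
    q_words = set(question.lower().split())
    score = lambda c: len(q_words & set(c.lower().split()))
    return sorted(chunks, key=score, reverse=True)[:top_k]

# A toy "manual" -- the model never sees the whole thing, only the
# chunks the retriever hands back.
manual = (
    "Lane closures on I-85 require advance signage and coning. "
    "During rush hour the standard closure protocol does not apply. "
    "Contact the regional traffic management center for all incidents."
)
retrieved = retrieve("What is the lane closure protocol", chunk(manual, size=9))
```

Notice what happens even in this tiny example: the retriever correctly surfaces the chunk mentioning the closure protocol, but the fixed nine-word split slices that sentence off mid-thought, so the word “apply” lands in a different chunk. That kind of severed context is exactly the problem discussed next.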
Chunking: The Root of Many AI Mistakes
When AI systems chop up a long document, they don’t always do it intelligently.
💡 Think about an SOP for a traffic incident on I-85:
- The first few pages define the contents, key terms, and stakeholders.
- Middle sections contain various response procedures.
- Later pages reference specific contacts and geographies.
If a RAG system retrieves a few disconnected chunks and loses the connection between them, it can generate an answer that sounds confident—but is wildly incorrect.
🚦 Example: A model might retrieve a protocol for a lane closure but fail to reference the section that says it doesn’t apply during rush hour.
Many AI hallucinations in RAG systems stem from the chosen chunking strategy (how documents are split up).
- Generally, bigger chunks preserve more context, but are probably slower to process,
- whereas smaller chunks are probably faster, but lose more context.
The errors come in when the model tries to stitch together what was retrieved with too little context.
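Here’s a small sketch of that tradeoff, again with made-up SOP text. The rule and its exception only make sense together: a small fixed chunk size severs them, a big one keeps them in a single chunk, and overlapping chunks (a common middle ground in real systems) repeat words at the seams to soften the cuts.

```python
def split_words(text: str, size: int, overlap: int = 0) -> list[str]:
    """Split text into word chunks of `size`, sliding by size - overlap."""
    words = text.split()
    step = size - overlap
    return [" ".join(words[i:i + size]) for i in range(0, len(words), step)]

sop = ("Close the left lane and stage equipment on the shoulder. "
       "This procedure does not apply during rush hour.")

small = split_words(sop, size=6)                  # exception lands in another chunk
big = split_words(sop, size=30)                   # rule and exception stay together
overlapped = split_words(sop, size=6, overlap=2)  # repeats words at the seams
```

With `size=6`, the closure instruction and the rush-hour exception end up in separate chunks—retrieve one without the other, and you get exactly the wrong-but-confident answer described above.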
So What Does This Mean for State DOTs?
The reality is that most DOTs aren’t going to build tools on top of LLMs themselves. But they will be buying solutions that incorporate them!
🔹 Key Takeaway: When evaluating vendors, ask how they handle context and chunking. They should be able to give you a good (and nuanced) answer. 🔹
Some key questions to consider:
1️⃣ How does the system retrieve relevant information? Does it rely solely on RAG, or can it process long documents holistically using other strategies (to be covered in another article!)?
2️⃣ Does it preserve document structure? Can it understand the relationship between sections of an SOP, or does it treat each page as an independent fact?
3️⃣ How does it prevent hallucinations? If it loses context, does it have a fallback mechanism to avoid misleading answers?
AI in transportation isn’t just about having the latest tools—it’s about making sure those tools work in the real-world complexity of DOT operations.
The Future of AI in Transportation: RAG Is Not a Silver Bullet
RAG is an important step forward in AI-powered retrieval, but it’s not a one-size-fits-all solution. State DOTs need to understand where it excels—and where it falls short—so they can make smart investments in AI-powered tools.