If you're building RAG (Retrieval-Augmented Generation) pipelines for clients or your own agency, you've probably wondered whether the work qualifies for R&D tax credits. The short answer is: it can. But not all RAG work qualifies, and HMRC is paying closer attention to AI-related claims than ever before.
This article explains exactly what HMRC looks for, what doesn't qualify, and how to structure a defensible R&D tax credit claim for RAG pipeline work. We'll use real examples from agency work, not theoretical scenarios.
What Is a RAG Pipeline?
RAG stands for Retrieval-Augmented Generation. It's a pattern where you combine a large language model (LLM) with an external knowledge base. The pipeline retrieves relevant information from your data store and passes it to the LLM as context before the model generates a response. This gives you more accurate, grounded outputs than relying on the model's training data alone.
A typical RAG pipeline involves:
- Chunking documents into searchable pieces
- Creating embeddings using a model (OpenAI, Cohere, open-source alternatives)
- Storing those embeddings in a vector database (Pinecone, Weaviate, Qdrant, pgvector)
- Setting up a retrieval mechanism that finds the most relevant chunks for a given query
- Feeding those chunks into an LLM as context for generation
- Handling prompt engineering, chunk overlap, reranking, and fallback logic
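The steps above can be sketched end to end in a few lines of Python. This is a toy under stated assumptions, not a production design: the word-count `embed` function stands in for a real embedding model, the in-memory list stands in for a vector database, and the function names (`chunk`, `build_index`, `retrieve`, `build_prompt`) are illustrative rather than taken from any framework. The final LLM call is left out; the sketch stops at the prompt that would be sent.

```python
import math
import re
from collections import Counter


def chunk(text: str, size: int = 8, overlap: int = 2) -> list[str]:
    """Split text into windows of `size` words that share `overlap` words."""
    if size <= overlap:
        raise ValueError("size must be larger than overlap")
    words = text.split()
    if not words:
        return []
    step = size - overlap
    last_start = max(len(words) - overlap, 1)
    return [" ".join(words[i:i + size]) for i in range(0, last_start, step)]


def embed(text: str) -> Counter:
    """Toy embedding: word counts. A real pipeline would call an embedding model."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def build_index(text: str) -> list[tuple[str, Counter]]:
    """In-memory stand-in for a vector database: (chunk text, embedding) pairs."""
    return [(c, embed(c)) for c in chunk(text)]


def retrieve(query: str, index: list[tuple[str, Counter]], k: int = 2) -> list[str]:
    """Return the k chunks most similar to the query."""
    q = embed(query)
    ranked = sorted(index, key=lambda item: cosine(q, item[1]), reverse=True)
    return [text for text, _ in ranked[:k]]


def build_prompt(query: str, chunks: list[str]) -> str:
    """Assemble the context-plus-question prompt that would go to the LLM."""
    context = "\n".join(f"- {c}" for c in chunks)
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"
```

Every function here is a place where a real project might run into uncertainty: the window size in `chunk`, the model behind `embed`, the ranking in `retrieve`. Swapping in standard components is routine work; having to invent a replacement because the standard one fails on your data is where the R&D question starts.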
If you're building this from scratch or adapting it to a novel domain, you're dealing with technical uncertainty. That's where R&D relief starts to apply.
What HMRC Looks for in R&D Claims
HMRC's R&D definition hasn't changed with the arrival of AI. The test is still whether your project sought to resolve scientific or technological uncertainty. The question is not whether you built something new to the world, but whether it was uncertain at the outset how to achieve the result.
For an R&D tax credit claim on RAG pipeline work, HMRC will want to see:
- Technical uncertainty that couldn't readily be resolved by a competent professional in the field
- A systematic approach to resolving that uncertainty (experiments, iterations, testing)
- Recorded evidence of the uncertainty and the work done to resolve it
- Qualifying costs (staff time, subcontractor costs, software licences, consumables)
HMRC's AI guidance, published in April 2024, specifically acknowledges that integrating AI into existing systems can qualify if it involves resolving technical uncertainty. But simply calling an API and piping the output into a frontend does not.
When Does RAG Work Qualify?
Custom chunking strategies for domain-specific documents
Standard chunking (splitting documents into fixed-size pieces) often fails with legal contracts, medical records, or highly technical documentation. If you're developing a chunking strategy that preserves semantic meaning across complex document structures, that's likely qualifying work. You're experimenting with different approaches, measuring retrieval accuracy, and iterating.
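As an illustration of what "preserving document structure" can mean, here is a minimal sketch of structure-aware chunking. It assumes a hypothetical document convention in which headings look like `2.1 Title` (a dotted number followed by a capitalised title); real legal or medical documents will need their own rules, and the name `hierarchical_chunks` is made up for this example. Each chunk carries the titles of its parent sections, so a retrieved clause still knows where it sits in the document.

```python
import re

# Assumed heading convention: dotted number, space, capitalised title ("2.1 Notice").
HEADING = re.compile(r"^(\d+(?:\.\d+)*)\s+([A-Z].*)$")


def hierarchical_chunks(text: str) -> list[dict]:
    """Group body lines under their nearest heading and record the heading path."""
    titles: dict[str, str] = {}   # section number -> title, e.g. "1.1" -> "Notice"
    chunks: list[dict] = []
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        match = HEADING.match(line)
        if match:
            number, title = match.groups()
            titles[number] = title
            parts = number.split(".")
            # Walk up the numbering ("1", then "1.1") to collect parent titles.
            prefixes = [".".join(parts[:i]) for i in range(1, len(parts) + 1)]
            path = [titles[p] for p in prefixes if p in titles]
            current = {"section": number, "path": path, "body": []}
            chunks.append(current)
        elif current is not None:
            current["body"].append(line)
    # Drop headings with no body text of their own and join the body lines.
    return [{**c, "body": " ".join(c["body"])} for c in chunks if c["body"]]
```

Comparing retrieval accuracy between this kind of chunker and a fixed-size splitter, on the same test questions, is the sort of experiment that produces usable evidence.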
Building retrieval systems for low-resource domains
If your RAG pipeline needs to work with a specialised vocabulary (say, maritime logistics or pharmaceutical compliance) where off-the-shelf embedding models perform poorly, you're facing technical uncertainty. Fine-tuning embedding models or building hybrid retrieval systems that combine vector search with keyword-based methods qualifies.
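One widely used way to combine a vector ranking with a keyword ranking is reciprocal rank fusion (RRF). The sketch below is an example of the general idea, not a claim about how any particular agency built its system: each document scores `1 / (k + rank)` in every list it appears in, and the scores are summed. The constant `k = 60` is a commonly quoted default and should be treated as a tunable assumption.

```python
def reciprocal_rank_fusion(rankings: list[list[str]], k: int = 60) -> list[str]:
    """Merge several ranked lists of document ids into one fused ranking."""
    scores: dict[str, float] = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    # Highest fused score first; ties broken alphabetically for stable output.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


if __name__ == "__main__":
    vector_hits = ["a", "b", "c"]    # hypothetical ids from vector search
    keyword_hits = ["c", "a", "d"]   # hypothetical ids from keyword search
    print(reciprocal_rank_fusion([vector_hits, keyword_hits]))
```

Dropping in a fusion function like this is a standard technique. The potentially qualifying work is what happens when it is not enough: for example, when neither ranking surfaces the right documents because the embedding model does not understand the domain vocabulary.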
Optimising for latency or cost constraints
Agency clients often want RAG systems that run under strict latency budgets or on limited hardware. If you're developing novel caching strategies, model distillation approaches, or hybrid architectures to meet those constraints, that's R&D work. You're solving a problem with no obvious off-the-shelf solution.
Building multi-step reasoning pipelines
Simple RAG (retrieve one set of chunks, generate one answer) is well understood. But if you're building a pipeline that decomposes a question, retrieves different information for each sub-question, synthesises results, and handles contradictions across sources, you're in R&D territory. The uncertainty lies in how to chain these steps reliably.
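The shape of such a pipeline can be sketched with the hard parts left as pluggable functions. Everything here is hypothetical: `decompose`, `retrieve`, and `answer_from` are placeholders you would supply (typically backed by an LLM and a retriever), and the contradiction check is deliberately naive, flagging a sub-question whenever its sources yield more than one distinct answer.

```python
from typing import Callable, Optional


def multi_step_answer(
    question: str,
    decompose: Callable[[str], list[str]],
    retrieve: Callable[[str], list[str]],
    answer_from: Callable[[str, str], Optional[str]],
) -> list[dict]:
    """Answer each sub-question per source and flag disagreement between sources."""
    findings = []
    for sub_question in decompose(question):
        # Ask the same sub-question of every retrieved chunk separately.
        answers = [answer_from(sub_question, chunk) for chunk in retrieve(sub_question)]
        distinct = sorted({a for a in answers if a is not None})
        findings.append(
            {
                "sub_question": sub_question,
                "answers": distinct,
                "conflict": len(distinct) > 1,  # naive: any disagreement is a conflict
            }
        )
    return findings
```

The skeleton is the easy part. Making decomposition reliable, deciding what counts as a real contradiction, and choosing how to resolve one are the open questions, and the experiments you run to settle them are what a claim would describe.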
When Does RAG Work Not Qualify?
Using an existing RAG framework out of the box
LangChain, LlamaIndex, and Haystack are mature frameworks. If you're following their tutorials and connecting standard components, you're not doing R&D. You're implementing known solutions. That's valuable client work, but it's not qualifying.
Simple API wrappers
Taking a user query, sending it to OpenAI, and displaying the response is not R&D. Even if you add a vector database with precomputed embeddings, if the approach is well documented and the implementation is straightforward, HMRC will reject the claim.
Prompt engineering alone
Tweaking prompts to get better outputs is not R&D. It's optimisation, not innovation. HMRC explicitly excludes work that "could be resolved by a competent professional applying standard techniques." Prompt engineering, in most cases, falls into that category.
Maintaining an existing RAG system
Bug fixes, performance monitoring, and routine updates are not R&D. The qualifying work ends when the technical uncertainty is resolved. Ongoing maintenance, however complex, doesn't qualify.
Real Examples from Agency Work
Example 1: A 15-person digital agency in Shoreditch built a RAG pipeline for a legal publishing client. The documents were 500-page PDFs with complex cross-references, footnotes, and appendices. Standard chunking broke the semantic connections. The agency spent 6 weeks developing a hierarchical chunking strategy that preserved document structure. They ran 40+ experiments measuring retrieval accuracy against a test set. This qualifies.
Example 2: A 6-person web design agency in Bristol's Harbourside built a chatbot for a local council's website using LangChain and Pinecone. They followed the standard tutorial, adjusted prompts, and deployed. Total time: 3 days. This does not qualify.
Example 3: A 20-person creative agency in Manchester's Northern Quarter developed a RAG pipeline for a fashion retailer that needed to answer questions about 50,000 product SKUs with high accuracy. The challenge was handling product descriptions that changed weekly, inconsistent metadata, and multilingual queries. The agency built a custom embedding pipeline with incremental update logic and a hybrid retrieval system combining vector search with product category filters. This qualifies.
How to Structure Your Claim
Identify the technical uncertainty early
Before you start building, document what you don't know. "Can we achieve 95% retrieval accuracy on legal documents with standard chunking?" is a specific technical question. Write it down. If you later prove the answer is no, that's evidence of uncertainty.
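A question like the 95% one only becomes evidence if "retrieval accuracy" is measured the same way in every experiment. A minimal sketch, assuming accuracy means hit rate at k (the share of test questions for which at least one known-relevant chunk appears in the top k results); the chunk ids and the choice of metric are illustrative assumptions, and your project may need a different measure.

```python
def hit_rate_at_k(
    retrieved: dict[str, list[str]],
    relevant: dict[str, set[str]],
    k: int = 5,
) -> float:
    """Share of test queries with at least one relevant chunk in the top k results."""
    if not relevant:
        return 0.0
    hits = sum(
        1
        for query, gold in relevant.items()
        if gold & set(retrieved.get(query, [])[:k])
    )
    return hits / len(relevant)


def meets_target(rate: float, target: float = 0.95) -> bool:
    """Compare a measured rate against the target written down at the outset."""
    return rate >= target
```

Recording the measured rate for each approach, alongside the date and what was changed, turns a series of failed attempts into a documented trail from uncertainty to resolution.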
Track time by project, not by task
Your developers should record time against specific R&D projects, not generic categories like "RAG work" or "AI development." Use project codes in Xero or QuickBooks. HMRC wants to see that the time relates to resolving specific uncertainties.
Keep technical notes
Notebooks, experiment logs, Slack threads where your team discusses failed approaches, pull requests that document iterations. These are your evidence. HMRC doesn't expect formal lab reports. But they need to see that real problem-solving happened.
Separate qualifying from non-qualifying work
Most agencies mix R&D work with routine implementation. You need to separate them in your records. If a developer spends 3 days building a custom chunking strategy (qualifying) and 2 days integrating it into the client's CMS (not qualifying), record both separately.
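The 3-day and 2-day split above can be captured with a very small record structure. This is a sketch of the idea rather than a recommended system: the `TimeEntry` fields and the project code `RD-2024-01` are hypothetical, and in practice the same information would usually live in your time-tracking tool.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TimeEntry:
    developer: str
    project_code: str   # hypothetical code tying the entry to one R&D project
    days: float
    qualifying: bool    # True only for work that addresses the recorded uncertainty
    note: str


def split_days(entries: list[TimeEntry]) -> dict[str, float]:
    """Total the days spent on qualifying and non-qualifying work."""
    totals = {"qualifying": 0.0, "non_qualifying": 0.0}
    for entry in entries:
        key = "qualifying" if entry.qualifying else "non_qualifying"
        totals[key] += entry.days
    return totals


if __name__ == "__main__":
    week = [
        TimeEntry("dev-a", "RD-2024-01", 3, True, "custom chunking strategy"),
        TimeEntry("dev-a", "RD-2024-01", 2, False, "CMS integration"),
    ]
    print(split_days(week))
```

The note field matters as much as the flag: it is what lets someone reviewing the claim later see why a given entry was treated as qualifying.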
Common Mistakes Agencies Make
Claiming for the whole project. HMRC will look at the specific activities. If 70% of your RAG pipeline was standard implementation, only the 30% that resolved uncertainty qualifies.
Not documenting the uncertainty. HMRC doesn't accept "we built something new" as a justification. You need to show what was uncertain and how you resolved it.
Using subcontractors without proper agreements. If you subcontract RAG development to another company, the qualifying costs are limited to 65% of the subcontractor payments (under the SME scheme). Make sure your contracts specify the R&D work being done.
Claiming for software costs incorrectly. OpenAI API credits, Pinecone subscriptions, and cloud compute costs can qualify as consumables. But only the portion used directly in the R&D activity. If you're using the same Pinecone instance for production client work, you need to apportion the costs.
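The apportionment logic in these points comes down to simple arithmetic, sketched below. The 65% subcontractor rate is the figure quoted above; every amount and share in the usage example is made up for illustration, and the function name `qualifying_costs` is hypothetical. The sketch only totals qualifying expenditure and does not attempt to calculate the relief itself.

```python
from decimal import Decimal

SUBCONTRACTOR_RATE = Decimal("0.65")  # rate quoted in the article for the SME scheme


def qualifying_costs(
    staff_cost: Decimal,
    staff_rnd_share: Decimal,         # fraction of staff time spent resolving uncertainty
    subcontractor_payments: Decimal,
    shared_tool_cost: Decimal,        # e.g. API credits or a vector database subscription
    tool_rnd_share: Decimal,          # fraction of that usage attributable to the R&D
) -> dict[str, Decimal]:
    """Apportion each cost category and return the qualifying amounts."""
    for share in (staff_rnd_share, tool_rnd_share):
        if not Decimal("0") <= share <= Decimal("1"):
            raise ValueError("shares must be between 0 and 1")
    staff = staff_cost * staff_rnd_share
    subcontractors = subcontractor_payments * SUBCONTRACTOR_RATE
    tools = shared_tool_cost * tool_rnd_share
    return {
        "staff": staff,
        "subcontractors": subcontractors,
        "tools": tools,
        "total": staff + subcontractors + tools,
    }


if __name__ == "__main__":
    # Hypothetical figures: 30% of staff time and 25% of tool usage were R&D.
    print(qualifying_costs(
        Decimal("100000"), Decimal("0.30"),
        Decimal("20000"),
        Decimal("6000"), Decimal("0.25"),
    ))
```

The shares are the inputs HMRC is most likely to question, which is why they should come from time records and usage data rather than an estimate made at year-end.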
How Agency Founder Finance Can Help
As ICAEW qualified accountants working exclusively with agency founders, we've handled R&D claims for agencies building everything from RAG pipelines to custom AI training platforms. We know what HMRC expects because we deal with them regularly.
We'll help you:
- Identify which parts of your RAG work qualify
- Set up time tracking and cost recording systems
- Prepare the technical narrative HMRC wants to see
- File the claim through the correct HMRC channels
- Handle any HMRC enquiries that follow
If you're building RAG pipelines and wondering whether the work qualifies, get in touch. We'll tell you honestly whether a claim makes sense for your situation.
Final Thoughts
RAG pipeline work can qualify for R&D tax credits, but only where it resolves genuine technical uncertainty. The agencies that claim successfully are the ones that document their uncertainty from day one, track their experiments systematically, and separate qualifying work from routine implementation.
Claiming R&D tax credits for RAG pipeline work isn't a loophole. It's a legitimate relief for agencies doing genuine innovation. If that describes your work, it's worth pursuing.
If your contractor mix has changed in the last 12 months, or you've taken on RAG work that pushed your team into unfamiliar technical territory, ask your accountant before year-end. The deadlines for amended R&D claims are strict, and missing them costs real money.

