Public Preview

Retrieval-Augmented and Grounded Generation (RAG)

Definition

RAG augments prompts with retrieved evidence (documents, passages, tables, images, or tool outputs) and instructs the model to generate outputs grounded in that evidence, with citations or other source support.
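
As an illustration of the definition above, the following minimal sketch assembles a prompt from retrieved passages and asks for numbered citations. The `Passage` type, the `build_grounded_prompt` helper, the instruction wording, and the sample passages are all hypothetical; real systems vary in how they phrase the grounding instruction.

```python
from dataclasses import dataclass


@dataclass
class Passage:
    source_id: str  # hypothetical identifier, e.g. a document path or URL
    text: str


def build_grounded_prompt(question: str, passages: list[Passage]) -> str:
    """Number each passage so the model can cite it as [n]."""
    lines = [
        "Answer the question using only the numbered sources below.",
        "Cite sources inline as [n]. If the sources do not contain the answer, say so.",
        "",
        "Sources:",
    ]
    for n, p in enumerate(passages, start=1):
        lines.append(f"[{n}] ({p.source_id}) {p.text}")
    lines += ["", f"Question: {question}", "Answer:"]
    return "\n".join(lines)


# Sample passages are made up for illustration.
passages = [
    Passage("kb/returns.md", "Items can be returned within 30 days of delivery."),
    Passage("kb/shipping.md", "Standard shipping takes 3 to 5 business days."),
]
prompt = build_grounded_prompt("How long do I have to return an item?", passages)
print(prompt)
```

The resulting string would be sent to whichever model the deployment uses; the model call itself is left out here.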

Why It Matters

  • Improves factuality and recency by drawing on external data at query time
  • Reduces hallucinations and supports compliance through source traceability
  • Reduces the need for fine-tuning by keeping knowledge outside the model weights

2025 State of the Art

  • Multi-hop and multi-vector RAG (dense, sparse, and hybrid retrieval)
  • Structured grounding (citations, quote-attribution, claim-checking)
  • Agentic RAG: iterative retrieval, query planning, tool-use
  • Multimodal RAG: images/video/audio retrieval with text grounding
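
One common way to combine dense and sparse results into a hybrid ranking is reciprocal rank fusion (RRF). The sketch below is an assumption about how a hybrid step might be built, not a description of any vendor's implementation. The document ids are made up, and `k = 60` is a conventional default rather than a tuned value.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings: list[list[str]], k: int = 60) -> list[str]:
    """Fuse ranked lists of document ids; each input list is ordered best-first."""
    scores: dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Sort by fused score (descending), breaking ties by id for determinism.
    return sorted(scores, key=lambda d: (-scores[d], d))


dense_hits = ["doc_a", "doc_b", "doc_c"]   # hypothetical dense-retriever output
sparse_hits = ["doc_b", "doc_d", "doc_a"]  # hypothetical sparse (e.g. BM25) output
fused = reciprocal_rank_fusion([dense_hits, sparse_hits])
print(fused)  # ['doc_b', 'doc_a', 'doc_d', 'doc_c']
```

RRF uses only rank positions, so it avoids having to normalize dense and sparse scores onto a common scale.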

Key Players

  • Microsoft (Azure AI Search + Azure OpenAI)
  • Google (Vertex AI/Gemini function calling, Search APIs)
  • Open-source: LlamaIndex, LangChain, vLLM-based serving

Challenges

  • Retrieval quality (index freshness, chunking/windowing)
  • Latency/cost of iterative retrieval and long-context reading
  • Faithfulness measurement and citation correctness
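
To make the chunking/windowing point concrete, here is a minimal sketch of fixed-size word windows with overlap. The `chunk_words` helper and its parameters are hypothetical. Production chunkers often split on tokens, sentences, or document structure instead of whitespace-separated words.

```python
def chunk_words(text: str, size: int, overlap: int) -> list[str]:
    """Split text into windows of `size` words; consecutive windows share `overlap` words."""
    if size <= 0 or not 0 <= overlap < size:
        raise ValueError("require size > 0 and 0 <= overlap < size")
    words = text.split()
    step = size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + size]))
        if start + size >= len(words):
            break  # this window already reached the end of the text
    return chunks


chunks = chunk_words("a b c d e f g h i j", size=4, overlap=1)
print(chunks)  # ['a b c d', 'd e f g', 'g h i j']
```

Larger overlap reduces the chance that a relevant sentence is split across chunk boundaries, at the cost of a bigger index.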

Reference Architectures

  • Index (vector/sparse/hybrid) + Retriever + Reranker + Prompt Assembler + LLM + Citation post-processor
  • Agentic loop for query refinement and tool selection
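
The first architecture above can be sketched as a chain of small functions. Everything here is a toy stand-in: the retriever and reranker use term overlap, `fake_llm` returns a canned answer in place of a real model call, and all function names and sample documents are hypothetical. The agentic loop is not shown.

```python
import re
from typing import Callable


def retrieve(query: str, index: list[dict], top_k: int = 3) -> list[dict]:
    """Toy sparse retriever: keep documents sharing at least one query term."""
    terms = set(query.lower().split())
    hits = [d for d in index if terms & set(d["text"].lower().split())]
    return hits[:top_k]


def rerank(query: str, docs: list[dict]) -> list[dict]:
    """Toy reranker: order by number of shared terms, highest first."""
    terms = set(query.lower().split())
    return sorted(docs, key=lambda d: -len(terms & set(d["text"].lower().split())))


def assemble_prompt(query: str, docs: list[dict]) -> str:
    sources = "\n".join(f"[{n}] {d['text']}" for n, d in enumerate(docs, start=1))
    return f"Use only these sources and cite them as [n].\n{sources}\nQuestion: {query}"


def extract_citations(answer: str, docs: list[dict]) -> list[str]:
    """Citation post-processor: map [n] markers to ids, ignoring out-of-range numbers."""
    ids: list[str] = []
    for n in re.findall(r"\[(\d+)\]", answer):
        i = int(n) - 1
        if 0 <= i < len(docs) and docs[i]["id"] not in ids:
            ids.append(docs[i]["id"])
    return ids


def answer_query(query: str, index: list[dict], llm: Callable[[str], str]):
    docs = rerank(query, retrieve(query, index))
    text = llm(assemble_prompt(query, docs))
    return text, extract_citations(text, docs)


index = [
    {"id": "shipping", "text": "Shipping takes 3 to 5 business days"},
    {"id": "returns", "text": "Items can be returned within 30 days"},
    {"id": "giftcards", "text": "Gift cards never expire"},
]


def fake_llm(prompt: str) -> str:
    # Stand-in for a model call; [7] simulates a citation to a source that was never supplied.
    return "Returns are accepted within 30 days [1]. See also [7]."


text, citations = answer_query("how many days until items can be returned", index, fake_llm)
print(citations)  # ['returns']
```

Keeping each stage behind a plain function boundary makes it easier to swap in a real index, reranker, or model without touching the rest of the chain.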

Opportunities

  • Learned retrievers/rerankers tuned to domain
  • Structured citation enforcement and auto-evidence checks
  • Freshness pipelines and semantic cache layers
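
A semantic cache returns a stored answer when a new query is close enough to one already answered. The sketch below is a minimal illustration under stated assumptions: `embed` is a bag-of-words placeholder for a real embedding model, the `SemanticCache` class is hypothetical, and the 0.8 threshold is arbitrary.

```python
import math
from collections import Counter
from typing import Optional


def embed(text: str) -> Counter:
    """Placeholder embedding: bag-of-words counts. A real system would call an embedding model."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


class SemanticCache:
    def __init__(self, threshold: float = 0.8):
        self.threshold = threshold  # arbitrary value chosen for the example
        self.entries: list[tuple[Counter, str]] = []

    def put(self, query: str, answer: str) -> None:
        self.entries.append((embed(query), answer))

    def get(self, query: str) -> Optional[str]:
        """Return the cached answer of the most similar stored query, if similar enough."""
        q = embed(query)
        best = max(self.entries, key=lambda e: cosine(q, e[0]), default=None)
        if best is not None and cosine(q, best[0]) >= self.threshold:
            return best[1]
        return None


cache = SemanticCache()
cache.put("what is the return window", "30 days [1]")
hit = cache.get("what is the return window please")
miss = cache.get("how long does shipping take")
print(hit, miss)  # 30 days [1] None
```

Cached answers can go stale when the index is refreshed, so a cache like this would normally be cleared or versioned alongside the reindexing pipeline.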

Design Checklist & Acceptance Criteria

  • Define chunking, overlap, and metadata; choose dense/sparse/hybrid
  • Measure end-to-end answer faithfulness and citation precision/recall
  • Add guardrails for missing-evidence states
  • Implement freshness and reindexing SLAs
  • Log retrieval/query traces for observability
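
Two of the checklist items can be sketched directly: citation precision/recall and a missing-evidence guardrail. This is a minimal sketch that assumes a labeled set of relevant source ids (`gold`) is available for evaluation and that the retriever exposes relevance scores. The function names and the `min_score` default are hypothetical.

```python
def citation_precision_recall(cited: set[str], gold: set[str]) -> tuple[float, float]:
    """Precision: share of cited sources that are relevant.
    Recall: share of relevant sources that were cited."""
    hits = len(cited & gold)
    precision = hits / len(cited) if cited else 0.0
    recall = hits / len(gold) if gold else 0.0
    return precision, recall


def evidence_missing(scores: list[float], min_score: float = 0.5) -> bool:
    """Guardrail: True when nothing was retrieved or the best score is below the threshold."""
    return not scores or max(scores) < min_score


precision, recall = citation_precision_recall({"a", "b", "c"}, {"a", "b", "d", "e"})
print(round(precision, 3), recall)  # 0.667 0.5

if evidence_missing([0.21, 0.18]):
    print("No sufficiently relevant sources found; decline to answer or ask for clarification.")
```

When the guardrail fires, the application can return a fixed "insufficient evidence" response instead of calling the model.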

References

  • Azure AI Search + RAG (RAG overview). Microsoft. https://learn.microsoft.com/azure/search/retrieval-augmented-generation-overview (accessed 2025-08-14; version: provider-reported)
  • Function Calling (Gemini API). Google. https://ai.google.dev/gemini-api/docs/function-calling (accessed 2025-08-14; version: provider-reported)
  • OpenAI Structured Outputs & Tools. OpenAI. https://platform.openai.com/docs/guides/structured-outputs (accessed 2025-08-14; version: provider-reported)