03. CRAG (Corrective RAG)
CRAG: Corrective RAG
This tutorial covers how to improve RAG-based systems using the Corrective RAG (CRAG) strategy.
CRAG is an approach that refines the retrieval-generation pipeline by adding self-reflection and self-evaluation steps for the retrieved documents.

What is CRAG?
Corrective-RAG (CRAG) is a methodology that extends the RAG (Retrieval Augmented Generation) strategy with a step that evaluates the documents found during retrieval and refines the knowledge they contain. It comprises a series of processes that check the search results before generation, perform a supplementary search if necessary, and finally produce a high-quality answer.
The key ideas of CRAG, described in the paper Corrective Retrieval Augmented Generation, are as follows:
If at least one of the retrieved documents exceeds the predefined relevance threshold, proceed to the generation phase.
Before generation, perform a knowledge refinement step.
Segment the documents into "knowledge strips" (where k denotes the number of retrieved documents).
Grade each knowledge strip and score its relevance. (In this tutorial, the evaluation is done per document chunk.)
If every document falls below the relevance threshold, or if the grader is not confident in its assessment, supplement the context with additional data sources (e.g. web search).
When supplementing via web search, optimize the search results through query rewriting (Query-Rewrite).
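These steps amount to a simple corrective control flow. The sketch below illustrates it in plain, framework-free Python; the grader, rewriter, web search, and generator are passed in as callables and are implemented with LangChain/LangGraph later in this tutorial, so this is only an outline, not the tutorial's actual code.

```python
from typing import Callable, List

def corrective_rag(
    question: str,
    documents: List[str],
    grade: Callable[[str, str], str],        # returns "yes" / "no" per document
    rewrite_query: Callable[[str], str],      # rewrites the question for web search
    web_search: Callable[[str], List[str]],   # returns supplementary documents
    generate: Callable[[str, List[str]], str],
) -> str:
    # 1) Keep only the documents that pass the relevance check.
    relevant = [doc for doc in documents if grade(question, doc) == "yes"]
    # 2) If nothing survives, rewrite the query and fall back to a web search.
    if not relevant:
        relevant = web_search(rewrite_query(question))
    # 3) Generate the answer from the corrected context.
    return generate(question, relevant)
```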
Main contents
In this tutorial, LangGraph is used to implement some ideas from the CRAG approach.
Here, the knowledge refinement step is omitted, but the graph is designed so that it can be added as a node later if needed.
Also, if no relevant document is found, we supplement the retrieval with a web search.
The web search uses Tavily Search, and question re-writing is introduced to optimize the search.
Key steps overview
Retrieval Grader : Evaluate the relevance of the retrieved documents
Generate : Generate answers via an LLM
Question Re-writer : Optimize search quality by rewriting the question
Web Search Tool : Perform web searches through Tavily Search
Create Graph : Build the CRAG graph with LangGraph
Use the graph : How to utilize the generated graph
Reference
Corrective Retrieval Augmented Generation (Yan et al., 2024), arXiv:2401.15884
Environment Setup
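The setup cells are not included here; the sketch below shows one plausible setup. The package list (langchain, langchain-openai, langchain-community, langchain-text-splitters, langgraph, faiss-cpu, pdfplumber, tavily-python, python-dotenv) and the use of a local .env file are assumptions based on the tools used later in the tutorial.

```python
# Minimal environment-setup sketch (package list and key names are assumptions).
# %pip install -qU langchain langchain-openai langchain-community langchain-text-splitters \
#     langgraph faiss-cpu pdfplumber tavily-python python-dotenv

import os
from dotenv import load_dotenv

# Load OPENAI_API_KEY and TAVILY_API_KEY from a local .env file.
load_dotenv()

assert os.getenv("OPENAI_API_KEY"), "OPENAI_API_KEY is required"
assert os.getenv("TAVILY_API_KEY"), "TAVILY_API_KEY is required for the web search step"
```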
Evaluation of the relevance of the retrieved documents (Question-Retrieval Evaluation)
Evaluating relevance means checking whether each retrieved document is actually related to the question.
First, we create the evaluator (retrieval-grader) that grades the retrieved documents, and then evaluate documents with retrieval_grader.
Note that the evaluation is performed on a single document, not on the whole set of retrieved documents.
The result is returned as yes/no, indicating whether that single document is relevant.
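A sketch of such a grader, following the common LangChain pattern of a structured-output LLM call; the model name (gpt-4o-mini), the prompt wording, and the GradeDocuments schema are assumptions rather than the tutorial's exact code.

```python
from pydantic import BaseModel, Field
from langchain_core.prompts import ChatPromptTemplate
from langchain_openai import ChatOpenAI

class GradeDocuments(BaseModel):
    """Binary relevance score for a single retrieved document."""
    binary_score: str = Field(
        description="Document is relevant to the question, 'yes' or 'no'"
    )

llm = ChatOpenAI(model="gpt-4o-mini", temperature=0)
structured_llm_grader = llm.with_structured_output(GradeDocuments)

system = (
    "You are a grader assessing the relevance of a retrieved document to a user question. "
    "If the document contains keywords or semantic meaning related to the question, "
    "grade it as relevant. Give a binary score 'yes' or 'no'."
)
grade_prompt = ChatPromptTemplate.from_messages(
    [
        ("system", system),
        ("human", "Retrieved document:\n\n{document}\n\nUser question: {question}"),
    ]
)

retrieval_grader = grade_prompt | structured_llm_grader

# Example: grade one retrieved chunk against the question.
# docs = pdf_retriever.invoke(question)
# print(retrieval_grader.invoke({"question": question, "document": docs[0].page_content}))
```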
Documents utilized for practice
Software Policy Institute (SPRi), AI Brief, December 2023 issue
File name: SPRI_AI_Brief_2023년12월호_F.pdf
Link: https://spri.kr/posts/view/23669
Authors: Jaeheung Lee (AI Policy Research Lab), Ji-soo Lee (AI Policy Research Lab)
Please copy the downloaded file into the data folder used for practice.
Basic PDF-based Retrieval Chain creation
Here we create a Retrieval Chain based on the PDF document, using the simplest possible structure.
Reference
Since this was covered in the previous tutorial, the detailed explanation is omitted. Note, however, that LangGraph creates the Retriever and the Chain separately; only then can each node apply its own detailed processing.
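A sketch of a minimal PDF retriever, assuming the practice PDF sits at data/SPRI_AI_Brief_2023년12월호_F.pdf and that PDFPlumberLoader, FAISS, and OpenAI embeddings are acceptable choices; the chunk sizes and the variable name pdf_retriever are illustrative. The answer chain is created separately in the Answer generation chain section below.

```python
from langchain_community.document_loaders import PDFPlumberLoader
from langchain_text_splitters import RecursiveCharacterTextSplitter
from langchain_community.vectorstores import FAISS
from langchain_openai import OpenAIEmbeddings

# 1) Load the PDF and split it into chunks.
docs = PDFPlumberLoader("data/SPRI_AI_Brief_2023년12월호_F.pdf").load()
splitter = RecursiveCharacterTextSplitter(chunk_size=500, chunk_overlap=50)
splits = splitter.split_documents(docs)

# 2) Index the chunks in a FAISS vector store and expose it as a retriever.
vectorstore = FAISS.from_documents(splits, OpenAIEmbeddings())
pdf_retriever = vectorstore.as_retriever()
```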
Web Search Tool
A web search tool is used to reinforce the context.
The need for web search : when no document meets the relevance threshold, or when the grader is unsure, we fetch additional data through a web search.
Using Tavily Search : the web search is performed with Tavily Search, which optimizes the search query and provides more relevant results.
Rewriting the question : the search query is improved by rewriting the question, which optimizes the web search.
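A sketch of the Tavily tool setup; max_results=3 is an arbitrary choice and the example query is made up.

```python
from langchain_community.tools.tavily_search import TavilySearchResults

# Tavily web search tool; max_results limits how many results are returned.
web_search_tool = TavilySearchResults(max_results=3)

# Example call: returns a list of {"url": ..., "content": ...} dictionaries.
# web_search_tool.invoke({"query": "Samsung Gauss generative AI announcement"})
```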
Question Re-writer
Query rewriting is a step that rewrites the question to optimize it for web search.
We use question_rewriter to rewrite the question.
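One way to build such a rewriter is a simple prompt-to-LLM chain, sketched below; the prompt wording and model name are assumptions.

```python
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.output_parsers import StrOutputParser
from langchain_openai import ChatOpenAI

system = (
    "You are a question re-writer that converts an input question into a better version "
    "optimized for web search. Reason about the underlying semantic intent of the question."
)
re_write_prompt = ChatPromptTemplate.from_messages(
    [
        ("system", system),
        ("human", "Here is the initial question:\n\n{question}\n\nFormulate an improved question."),
    ]
)

# Chain: prompt -> LLM -> plain string output.
question_rewriter = re_write_prompt | ChatOpenAI(model="gpt-4o-mini", temperature=0) | StrOutputParser()

# Example:
# question_rewriter.invoke({"question": "What did Samsung announce about generative AI?"})
```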
Answer generation chain
The answer generation chain generates answers based on the retrieved documents.
It is the ordinary Naive RAG chain we already know.
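A sketch of that Naive RAG chain, using the public rlm/rag-prompt from the LangChain Hub as a stand-in for the tutorial's own prompt; format_docs is a hypothetical helper that flattens the retrieved documents into a single context string.

```python
from langchain import hub
from langchain_core.output_parsers import StrOutputParser
from langchain_openai import ChatOpenAI

# Standard RAG prompt expecting {context} and {question}.
prompt = hub.pull("rlm/rag-prompt")
llm = ChatOpenAI(model="gpt-4o-mini", temperature=0)

def format_docs(docs):
    # Flatten the retrieved Document objects into one context string.
    return "\n\n".join(doc.page_content for doc in docs)

rag_chain = prompt | llm | StrOutputParser()

# Example:
# docs = pdf_retriever.invoke("What did SPRi report about generative AI in December 2023?")
# rag_chain.invoke({"context": format_docs(docs), "question": "..."})
```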
State
First, define the State used by the CRAG graph.
web_search indicates whether a web search should be used, expressed as yes or no (yes: web search required, no: not required).
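A sketch of the state definition as a TypedDict; the field names beyond web_search (question, generation, documents) are assumptions consistent with the nodes described below.

```python
from typing import Annotated, List
from typing_extensions import TypedDict
from langchain_core.documents import Document

class GraphState(TypedDict):
    """State passed between the nodes of the CRAG graph."""
    question: Annotated[str, "The user question (possibly rewritten)"]
    generation: Annotated[str, "The generated answer"]
    web_search: Annotated[str, "Whether a web search is needed: 'Yes' or 'No'"]
    documents: Annotated[List[Document], "Retrieved (and supplemented) documents"]
```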
Nodes
Define the nodes to be used in the CRAG graph.
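A sketch of the five nodes, reusing the objects built in the earlier sketches (pdf_retriever, retrieval_grader, question_rewriter, web_search_tool, rag_chain, format_docs); the node logic and names are illustrative, not the tutorial's exact implementation.

```python
from langchain_core.documents import Document

def retrieve(state: GraphState):
    # Retrieve document chunks related to the question.
    documents = pdf_retriever.invoke(state["question"])
    return {"documents": documents}

def grade_documents(state: GraphState):
    # Keep only the chunks graded as relevant; if none survive, request a web search.
    filtered_docs = []
    for doc in state["documents"]:
        grade = retrieval_grader.invoke(
            {"question": state["question"], "document": doc.page_content}
        )
        if grade.binary_score == "yes":
            filtered_docs.append(doc)
    web_search = "Yes" if not filtered_docs else "No"
    return {"documents": filtered_docs, "web_search": web_search}

def generate(state: GraphState):
    # Generate the final answer from the filtered documents.
    generation = rag_chain.invoke(
        {"context": format_docs(state["documents"]), "question": state["question"]}
    )
    return {"generation": generation}

def query_rewrite(state: GraphState):
    # Rewrite the question so it works better as a web-search query.
    return {"question": question_rewriter.invoke({"question": state["question"]})}

def web_search_node(state: GraphState):
    # Run the Tavily search and append the results as an extra document.
    results = web_search_tool.invoke({"query": state["question"]})
    web_doc = Document(page_content="\n".join(r["content"] for r in results))
    return {"documents": state["documents"] + [web_doc]}
```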
Functions for conditional edges
decide_to_generate : after the relevance assessment is complete, this function routes to the next node depending on whether a web search is needed.
If web_search is Yes, the query is first rewritten in the query_rewrite node and then a web search is performed.
If web_search is No, the generate node is executed to produce the final answer.
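A sketch of that routing function; it simply returns the name of the branch to follow, and the mapping to concrete nodes is applied when the graph is built below.

```python
def decide_to_generate(state: GraphState) -> str:
    # Route to query rewriting (followed by web search) or straight to generation.
    if state["web_search"] == "Yes":
        return "query_rewrite"
    return "generate"
```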
Graph generation
Now define the nodes and connect the edges to complete the graph.
Visualize the graph.
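A sketch of the graph wiring and visualization, using the node names from the sketches above; the tutorial's actual graph may name or order the nodes differently.

```python
from langgraph.graph import StateGraph, START, END

workflow = StateGraph(GraphState)

# Register the nodes.
workflow.add_node("retrieve", retrieve)
workflow.add_node("grade_documents", grade_documents)
workflow.add_node("generate", generate)
workflow.add_node("query_rewrite", query_rewrite)
workflow.add_node("web_search_node", web_search_node)

# Connect the edges, including the conditional branch after grading.
workflow.add_edge(START, "retrieve")
workflow.add_edge("retrieve", "grade_documents")
workflow.add_conditional_edges(
    "grade_documents",
    decide_to_generate,
    {"query_rewrite": "query_rewrite", "generate": "generate"},
)
workflow.add_edge("query_rewrite", "web_search_node")
workflow.add_edge("web_search_node", "generate")
workflow.add_edge("generate", END)

app = workflow.compile()

# Visualize the compiled graph (e.g. in a Jupyter notebook).
# from IPython.display import Image
# Image(app.get_graph().draw_mermaid_png())
```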
Graph execution
Now run the graph and check the results.
Run the graph.
Run the graph in streaming mode.
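A sketch of both invocation styles; the question is a made-up example.

```python
inputs = {"question": "Summarize the key AI policy trends in the December 2023 SPRi AI Brief."}

# Run the whole graph once and read the final answer from the state.
result = app.invoke(inputs)
print(result["generation"])

# Or stream the execution and watch each node complete in turn.
for output in app.stream(inputs):
    for node_name, _update in output.items():
        print(f"== Completed node: {node_name} ==")
```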