05. Add query rewrite module

Steps

Perform Naive RAG
Check relevance for searched documents (Groundedness Check)
Web Search
(This tutorial) Query Rewrite

Reference

This tutorial extends the previous one, so some parts overlap. For anything not fully explained here, please refer to the previous tutorial.

Environment setup

# !pip install -U langchain-teddynote

# Configuration file for managing API keys as environment variables
from dotenv import load_dotenv

# Load API key information
load_dotenv()

 True

# Set up LangSmith tracking. https://smith.langchain.com
# !pip install -qU langchain-teddynote
from langchain_teddynote import logging

# Enter a project name
logging.langsmith("CH17-LangGraph-Structures")

 Start tracking LangSmith. 
[Project name] 
CH17-LangGraph-Structures

Basic PDF-based Retrieval Chain creation

Here, we create a Retrieval Chain based on PDF documents. It is a Retrieval Chain with the simplest possible structure.

In LangGraph, however, the Retriever and the Chain are created separately. Only then can each node perform its own detailed processing.

Reference

This was covered in the previous tutorial, so the detailed description is omitted here.

from rag.pdf import PDFRetrievalChain

# Load a PDF document.
pdf = PDFRetrievalChain(["data/SPRI_AI_Brief_2023년12월호_F.pdf"]).create_chain()

# Create a retriever and a chain.
pdf_retriever = pdf.retriever
pdf_chain = pdf.chain

State definition

State: defines the state shared between the nodes of the Graph.

The state is usually defined as a TypedDict. This time, we add the result of the relevance check to the state.

Reference

This time, question is defined as a list so that the rewritten Query can be stored in addition to the original question.

from typing import Annotated, TypedDict
from langgraph.graph.message import add_messages


# GraphState State Definition
class GraphState(TypedDict):
    question: Annotated[list, add_messages]  # Questions (cumulative list)
    context: Annotated[str, "Context"]  # Search results for the document
    answer: Annotated[str, "Answer"]  # answer
    messages: Annotated[list, add_messages]  # Message (cumulative list)
    relevance: Annotated[str, "Relevance"]  # Relevance
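
To make the difference between a cumulative key and a plain key concrete, here is a minimal, dependency-free sketch. The names `append_items`, `SketchState`, and `apply_update` are hypothetical and exist only for this illustration. `append_items` is a simplified stand-in for `add_messages`, which additionally converts items to message objects and matches them by ID. The sketch only shows the general idea: a key annotated with a reducer accumulates values, while any other key is overwritten.

```python
from typing import Annotated, TypedDict, get_type_hints


def append_items(existing: list, new) -> list:
    """Simplified, hypothetical stand-in for add_messages: append, never overwrite."""
    if not isinstance(new, list):
        new = [new]
    return existing + new


class SketchState(TypedDict):
    question: Annotated[list, append_items]  # cumulative, like `question` above
    relevance: Annotated[str, "Relevance"]  # plain value, overwritten on update


def apply_update(state: dict, update: dict) -> dict:
    """Merge a node's partial update into the state, key by key."""
    hints = get_type_hints(SketchState, include_extras=True)
    merged = dict(state)
    for key, value in update.items():
        # A callable in the Annotated metadata is treated as the reducer.
        metadata = getattr(hints[key], "__metadata__", ())
        reducer = next((m for m in metadata if callable(m)), None)
        merged[key] = reducer(state.get(key, []), value) if reducer else value
    return merged


state = {"question": ["original question"], "relevance": "no"}
state = apply_update(state, {"question": "rewritten question", "relevance": "yes"})
print(state["question"])   # ['original question', 'rewritten question']
print(state["relevance"])  # yes
```

Because the question list keeps growing, the nodes in this tutorial read the most recent entry with `state["question"][-1]`.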

Node definition

Nodes: the units that handle each step. They are usually implemented as Python functions whose input and output are state values.

Reference

A node takes the State as input, performs its defined logic, and returns the updated State.

from langchain_openai import ChatOpenAI
from langchain_teddynote.evaluator import GroundednessChecker
from langchain_teddynote.messages import messages_to_history
from langchain_teddynote.tools.tavily import TavilySearch
from rag.utils import format_docs


# Document Search Node
def retrieve_document(state: GraphState) -> GraphState:
    # Get the question from the state.
    latest_question = state["question"][-1].content

    # Search the PDF for documents relevant to the question.
    retrieved_docs = pdf_retriever.invoke(latest_question)

    # Format the retrieved documents (for input into the prompt).
    retrieved_docs = format_docs(retrieved_docs)

    # Store the retrieved documents in the context key.
    return GraphState(context=retrieved_docs)


# Generate Answer Node
def llm_answer(state: GraphState) -> GraphState:
    # Get the question from the state.
    latest_question = state["question"][-1].content

    # Get the retrieved documents from the state.
    context = state["context"]

    # Call the chain to generate an answer.
    response = pdf_chain.invoke(
        {
            "question": latest_question,
            "context": context,
            "chat_history": messages_to_history(state["messages"]),
        }
    )
    # Store the generated answer and the (user question, answer) messages in the state.
    return GraphState(
        answer=response, messages=[("user", latest_question), ("assistant", response)]
    )


# Relevance check node
def relevance_check(state: GraphState) -> GraphState:
    # Create a relevance evaluator.
    question_answer_relevant = GroundednessChecker(
        llm=ChatOpenAI(model="gpt-4o-mini", temperature=0), target="question-retrieval"
    ).create()

    # Run a relevance check ("yes" or "no")
    response = question_answer_relevant.invoke(
        {"question": state["question"][-1].content, "context": state["context"]}
    )

    # Note: The relevance evaluator can be customized with your own prompt. Try building your own Groundedness Check!
    return GraphState(relevance=response.score)


# Function to check relevance (router); returns the score ("yes" or "no"), not a state
def is_relevant(state: GraphState) -> str:
    return state["relevance"]


# Web Search Node
def web_search(state: GraphState) -> GraphState:
    # create a search tool
    tavily_tool = TavilySearch()

    search_query = state["question"][-1].content

    # Search example using various parameters
    search_result = tavily_tool.search(
        query=search_query,  # search query
        topic="general",  # general topics
        max_results=3,  # maximum number of search results
        format_output=True,  # format the results
    )

    return GraphState(context="\n".join(search_result))
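
The `is_relevant` router above does not update the state. It only returns the stored score so that the graph can choose the next node. The sketch below shows that idea in plain Python. The node names in `ROUTES` and the helper `next_node` are assumptions made for illustration and are not taken from the code above. LangGraph's `add_conditional_edges` accepts a router function together with a similar mapping from return values to node names.

```python
# Hypothetical routing table: relevance score -> name of the next node.
# The keys mirror the "yes"/"no" scores produced by the relevance check.
ROUTES = {"yes": "llm_answer", "no": "web_search"}


def is_relevant(state: dict) -> str:
    # Same logic as the router above: return the stored relevance score.
    return state["relevance"]


def next_node(state: dict) -> str:
    """Resolve the name of the node to run after the relevance check."""
    return ROUTES[is_relevant(state)]


print(next_node({"relevance": "yes"}))  # llm_answer
print(next_node({"relevance": "no"}))   # web_search
```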

Add Query Rewrite node

Rewrite the existing question using a prompt written for Query rewriting.
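
A minimal sketch of what such a node could look like is given below. It rests on several assumptions: the prompt text in `REWRITE_PROMPT`, the factory `make_query_rewrite_node`, and the `rewriter` callable are all hypothetical. In practice, `rewriter` could wrap an LLM call, but here it is injected so that the node can be exercised without an API key. `Message` is a stand-in for a message object, of which only `.content` is used.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Message:
    """Hypothetical stand-in for a chat message; only .content is used."""
    content: str


# Hypothetical prompt text; adapt the wording to your own use case.
REWRITE_PROMPT = (
    "Rewrite the question below so that it is clearer and easier to search for. "
    "Return only the rewritten question.\n\nQuestion: {question}"
)


def make_query_rewrite_node(rewriter: Callable[[str], str]):
    """Build a node that rewrites the latest question.

    `rewriter` is an assumption: any callable that maps a prompt string
    to the rewritten question (for example, a thin wrapper around an LLM).
    """

    def query_rewrite(state: dict) -> dict:
        # Get the latest question from the state.
        latest_question = state["question"][-1].content

        # Fill the prompt and ask the rewriter for a better question.
        prompt = REWRITE_PROMPT.format(question=latest_question)
        rewritten = rewriter(prompt).strip()

        # Return only the new question; a cumulative `question` key
        # is expected to append it to the existing list.
        return {"question": rewritten}

    return query_rewrite


# Usage with a fake rewriter in place of a real LLM call.
node = make_query_rewrite_node(lambda prompt: "  What did the report say about AI policy?  ")
update = node({"question": [Message("ai policy report?")]})
print(update)  # {'question': 'What did the report say about AI policy?'}
```

Returning only the `question` key keeps the node focused on one task. The original question stays in the list, and the later nodes pick up the rewritten one through `state["question"][-1]`.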
