# 02. Naive RAG

## Naive RAG <a href="#naive-rag" id="naive-rag"></a>

**Steps**

1. Perform Naive RAG

![](https://wikidocs.net/images/page/267809/langgraph-naive-rag.png)

### Environment Setup <a href="#id-1" id="id-1"></a>

```python
# Configuration file for managing API keys as environment variables
from dotenv import load_dotenv

# Load API key information
load_dotenv()
```

```
 True 
```

```python
# Set up LangSmith tracking. https://smith.langchain.com
# !pip install -qU langchain-teddynote
from langchain_teddynote import logging

# Enter a project name.
logging.langsmith("CH17-LangGraph-Structures")
```

```
 Start tracking LangSmith. 
[Project name] 
CH17-LangGraph-Structures 
```

### Basic PDF-based Retrieval Chain creation <a href="#pdf-retrieval-chain" id="pdf-retrieval-chain"></a>

Here, we create a Retrieval Chain based on PDF documents, using the simplest possible structure.

In LangGraph, however, the Retriever and the Chain are created separately. Only then can each node perform its own detailed processing.

```python
from rag.pdf import PDFRetrievalChain

# Load a PDF document.
pdf = PDFRetrievalChain(["data/SPRI_AI_Brief_2023년12월호_F.pdf"]).create_chain()

# Create a retriever and a chain.
pdf_retriever = pdf.retriever
pdf_chain = pdf.chain
```

First, use `pdf_retriever` to get search results.

```python
search_result = pdf_retriever.invoke("Please tell me the companies and amounts invested in Anthropic.")
search_result
```

```
 [Document(metadata={'source': 'data/SPRI_AI_Brief_2023년12월호_F.pdf', 'file_path': 'data/SPRI_AI_Brief_2023년12월호_F.pdf', 'page': 13, 'total_pages': 23, ...}, page_content='1. Policy/Legal 2. Enterprise/Industry 3. Technology/Research 4. Workforce/Education\nGoogle agrees to invest up to $2 billion in Anthropic, strengthening AI cooperation\nKEY Contents\nn Google has agreed to invest up to $2 billion in Anthropic, investing $500 million upfront, and Anthropic has also signed a contract to use its cloud services\nn The three major cloud providers Google, Microsoft, and Amazon ... next-generation AI models ... up to $2 billion investment agreement and cloud service delivery to Anthropic'), Document(metadata={'source': 'data/SPRI_AI_Brief_2023년12월호_F.pdf', 'file_path': 'data/SPRI_AI_Brief_2023년12월호_F.pdf', 'page': 13, 'total_pages': 23, 'Author': 'dj', 'ModDate': "D:20231208132838+09'00'", 'PDFVersion': '1.4'}, page_content='Google, up to $2 billion investment agreement in Anthropic ...'), ...]
```

Pass the previously retrieved results as the context to the chain.

```python
# Generate answers based on search results.
answer = pdf_chain.invoke(
    {
        "question": "Please tell us the companies and amounts invested in Anthropic.",
        "context": search_result,
        "chat_history": [],
    }
)
print(answer)
```

```
Google has agreed to invest up to $2 billion in Anthropic, of which $500 million has been invested upfront. In addition, Google had already invested $550 million in February 2023. Amazon has announced an investment plan of up to $4 billion in Anthropic. 

**Source** 
- data/SPRI_AI_Brief_2023년12월호_F.pdf (page 13) 
```

### State definition <a href="#state" id="state"></a>

`State` : Defines the state shared between nodes in the graph.

It typically uses the `TypedDict` format.

```python
from typing import Annotated, TypedDict
from langgraph.graph.message import add_messages


# GraphState State Definition
class GraphState(TypedDict):
    question: Annotated[str, "Question"]  # question
    context: Annotated[str, "Context"]  # Search results for the document
    answer: Annotated[str, "Answer"]  # answer
    messages: Annotated[list, add_messages]  # messages (accumulated list)
```
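To see what a reducer such as `add_messages` does conceptually, here is a toy sketch in plain Python (not LangGraph's actual implementation): a channel without a reducer is overwritten on every update, while a channel annotated with a reducer merges the old and new values, so messages accumulate across node executions.

```python
from typing import Annotated, TypedDict


# Toy reducer mimicking the idea behind add_messages:
# append new messages instead of overwriting the list.
def append_messages(old: list, new: list) -> list:
    return old + new


class ToyState(TypedDict):
    answer: Annotated[str, "Answer"]            # overwritten on every update
    messages: Annotated[list, append_messages]  # accumulated across updates


def apply_update(state: ToyState, update: dict) -> ToyState:
    # Merge an update into the state, honoring any reducer in the annotation.
    merged = dict(state)
    for key, value in update.items():
        metadata = ToyState.__annotations__[key].__metadata__[0]
        if callable(metadata):
            merged[key] = metadata(state[key], value)  # reduce (accumulate)
        else:
            merged[key] = value  # plain overwrite
    return merged  # type: ignore[return-value]


state: ToyState = {"answer": "", "messages": []}
state = apply_update(state, {"answer": "A1", "messages": [("user", "Q1")]})
state = apply_update(state, {"answer": "A2", "messages": [("assistant", "A2")]})
print(state["answer"])         # "A2" -- overwritten
print(len(state["messages"]))  # 2 -- accumulated
```

This is why `llm_answer` below can return only the new `(user, question)` and `(assistant, response)` pair: LangGraph's `add_messages` reducer appends them to the existing history instead of replacing it.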

### Node definition <a href="#node" id="node"></a>

* `Nodes` : Nodes that handle each step. Usually implemented as a Python function, with state values as both input and output.

**Reference**

* A node receives the current `State` as input, performs its defined logic, and returns an updated `State`.

```python
from langchain_teddynote.messages import messages_to_history
from rag.utils import format_docs


# Document Search Node
def retrieve_document(state: GraphState) -> GraphState:
    # Get the question from the state.
    latest_question = state["question"]

    # Search the documentation to find relevant articles.
    retrieved_docs = pdf_retriever.invoke(latest_question)

    # Formats the retrieved document (for input into the prompt)
    retrieved_docs = format_docs(retrieved_docs)

    # Stores the searched document in the context key.
    return GraphState(context=retrieved_docs)


# Generate Answer Node
def llm_answer(state: GraphState) -> GraphState:
    # Get the question from the state.
    latest_question = state["question"]

    # Get the searched documents in status.
    context = state["context"]

    # Call the chain to generate an answer.
    response = pdf_chain.invoke(
        {
            "question": latest_question,
            "context": context,
            "chat_history": messages_to_history(state["messages"]),
        }
    )
    # Stores generated answers, (user's questions, answers) messages in the state.
    return GraphState(
        answer=response, messages=[("user", latest_question), ("assistant", response)]
    )
```

### Edges <a href="#edges" id="edges"></a>

* `Edges` : A Python function that determines the next `Node` to run based on the current `State`.

There are general edges, conditional edges, and more.
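As a toy illustration in plain Python (not the LangGraph API), a conditional edge is just a function that inspects the current state and returns the name of the next node to run. The node names and routing condition below are hypothetical, chosen only to mirror this tutorial's retrieve-then-answer flow:

```python
from typing import Callable, TypedDict


class State(TypedDict):
    question: str
    context: str
    answer: str


# Hypothetical nodes: a retriever that may or may not find context,
# and two downstream nodes chosen by a conditional edge.
def retrieve(state: State) -> State:
    found = "Anthropic" in state["question"]
    return {**state, "context": "some documents" if found else ""}


def llm_answer(state: State) -> State:
    return {**state, "answer": f"Answer based on: {state['context']}"}


def rewrite_question(state: State) -> State:
    return {**state, "question": state["question"] + " (rephrased)"}


# The conditional edge: route on whether retrieval found anything.
def route_on_context(state: State) -> str:
    return "llm_answer" if state["context"] else "rewrite_question"


nodes: dict[str, Callable[[State], State]] = {
    "retrieve": retrieve,
    "llm_answer": llm_answer,
    "rewrite_question": rewrite_question,
}

# Minimal runner: execute retrieve, then follow the conditional edge once.
state: State = {"question": "Who invested in Anthropic?", "context": "", "answer": ""}
state = nodes["retrieve"](state)
next_node = route_on_context(state)
state = nodes[next_node](state)
print(next_node)  # "llm_answer" -- context was found, so we answer
```

The Naive RAG graph below uses only general (unconditional) edges; conditional routing like this appears in later, more advanced structures.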

```python
from langgraph.graph import END, StateGraph
from langgraph.checkpoint.memory import MemorySaver

# create a graph
workflow = StateGraph(GraphState)

# node definition
workflow.add_node("retrieve", retrieve_document)
workflow.add_node("llm_answer", llm_answer)

# edge definition
workflow.add_edge("retrieve", "llm_answer")  # retrieval -> answer
workflow.add_edge("llm_answer", END)  # answer -> end

# Setting the graph entry point
workflow.set_entry_point("retrieve")

# Set checkpoint
memory = MemorySaver()

# compile
app = workflow.compile(checkpointer=memory)
```

Visualize the compiled graph.

```python
from langchain_teddynote.graphs import visualize_graph

visualize_graph(app)
```

### Graph execution <a href="#id-2" id="id-2"></a>

* `config` : Passes the configuration information needed when running the graph.
* `recursion_limit` : Sets the maximum number of recursions when running the graph.
* `inputs` : Passes the input information required when running the graph.

**Reference**

* For message output streaming, refer to [Everything about LangGraph streaming modes](https://wikidocs.net/265770).

The `stream_graph` function below streams the output of only the specified nodes, so you can easily check the streaming output of a specific node.

```python
from langchain_core.runnables import RunnableConfig
from langchain_teddynote.messages import stream_graph, random_uuid

# config settings (max recursion count, thread_id)
config = RunnableConfig(recursion_limit=20, configurable={"thread_id": random_uuid()})

# Enter your question
inputs = GraphState(question="Please tell me the companies and amounts invested in Anthropic.")

# Running the graph
stream_graph(app, inputs, config, ["llm_answer"])
```

```
 ================================================== 
🔄 Node: llm_answer🔄 
- - - - - - - - - - - - - - - - - - - - - - - - - - - -  
Google has agreed to invest up to $2 billion in Anthropic, of which $500 million has been invested upfront. Amazon has announced an investment plan of up to $4 billion in Anthropic. 

**Source** 
- data/SPRI_AI_Brief_2023년12월호_F.pdf (page 14) 
```

```python
outputs = app.get_state(config).values

print(f'Question: {outputs["question"]}')
print("===" * 20)
print(f'Answer:\n{outputs["answer"]}')
```

```
 Question: Please tell me the companies and amounts invested in Anthropic. 
============================================================ 
Answer: 
Google has agreed to invest up to $2 billion in Anthropic, of which $500 million has been invested upfront. Amazon has announced an investment plan of up to $4 billion in Anthropic. 

**Source** 
- data/SPRI_AI_Brief_2023년12월호_F.pdf (page 14) 
```

<br>
