06. MultiQueryRetriever

MultiQueryRetriever

Distance-based vector database searches are a way to find documents with similar embedding based on query embedding (expression) and'distance' in high-dimensional space. But query Search results may vary if detailed China embedding does not properly capture the meaning of the data There is. In addition, prompt engineering or tuning to manually adjust this can be cumbersome.

To solve this problem, MultiQueryRetriever Automates the prompt tuning process by utilizing the Language Learning Model (LLM), which automatically generates multiple queries from various perspectives for a given user input query.

This method searches for a set of related documents for each query, extracts a sum of unique documents that encompass all queries, and allows you to get a larger set of potentially related documents.

By creating the same question from multiple perspectives, MultiQueryRetriever Can overcome the limitations of distance-based searches in some part and provide richer search results.

# API A configuration file for managing keys as environment variables.
from dotenv import load_dotenv

# API Load key information
load_dotenv()

True

# LangSmith set up tracking. https://smith.langchain.com
# !pip install langchain-teddynote
from langchain_teddynote import logging

# Enter a project name.
logging.langsmith("CH11-Retriever")

 Start tracking LangSmith. 
[Project name] 
CH11-Retriever

# Building a sample vector DB
from langchain_community.document_loaders import WebBaseLoader
from langchain.vectorstores import FAISS
from langchain_openai import OpenAIEmbeddings
from langchain_text_splitters import RecursiveCharacterTextSplitter

# Load blog post
loader = WebBaseLoader(
    "https://teddylee777.github.io/openai/openai-assistant-tutorial/", encoding="utf-8"
)

# Split document
text_splitter = RecursiveCharacterTextSplitter(chunk_size=500, chunk_overlap=0)
docs = loader.load_and_split(text_splitter)

# Definition of embedding
openai_embedding = OpenAIEmbeddings()

# Create vector DB
db = FAISS.from_documents(docs, openai_embedding)

# retriever generation
retriever = db.as_retriever()

# Document Search
query = "OpenAI Assistant API의 Functions 사용법에 대해 알려주세요."
relevant_docs = retriever.invoke(query)

# Output the number of documents found
len(relevant_docs)

USER_AGENT environment variable not set, consider setting it to identify your requests.

Output the contents of 1 document of the searched results.

# Print document 1.
print(relevant_docs[1].page_content)

As the most powerful tool, you can specify custom functions for Assistant. This is very similar to function calls from the Chat Completionions API. 


The Functioning tool allows you to intelligently return the function you need to call along with the factor by explaining the custom function to Assistant. 


The Assistant API pauses execution when a function is called during execution, and can continue running Run by providing the result of the function call again. (This also means that you can get user feedback and get it back. The tutorial below covers in detail).

Usage

MultiQueryRetriever When specifying the LLM to use for and using it for the creation of the vagina, retriever takes care of the rest of the work.

from langchain.retrievers.multi_query import MultiQueryRetriever
from langchain_openai import ChatOpenAI


# ChatOpenAI Initialize the language model. temperature is set to 0.
llm = ChatOpenAI(temperature=0, model="gpt-4o-mini")

multiquery_retriever = MultiQueryRetriever.from_llm(  # MultiQueryRetriever Initialize using a language model.
    # Pass the retriever and language model of the vector database.
    retriever=db.as_retriever(),
    llm=llm,
)

Below is the code you run to debug the intermediate process that creates multiple queries.

first "langchain.retrievers.multi_query" Get the logger.

This is logging.getLogger() This is done using functions. Then, the logarithm level of this logger INFO By setting it to, INFO Only log messages above the level can be output.

# Setting up logging for queries
import logging

logging.basicConfig()
logging.getLogger("langchain.retrievers.multi_query").setLevel(logging.INFO)

This code retriever_from_llm Object invoke Given using methods question Search for related documents.

Documents retrieved unique_docs The larvae are stored in variables, and by checking the length of these variables, you can see the total number of related documents retrieved. This course allows you to effectively find relevant information about your questions and determine the amount.

# Define the question.
question = "OpenAI Assistant API의 Functions Please tell me how to use it."
# Document Search
relevant_docs = multiquery_retriever.invoke(question)

# Returns the number of unique documents found.
print(
    f"===============\nNumber of documents searched: {len(relevant_docs)}",
    end="\n===============\n",
)

# Prints the contents of the searched document.
print(relevant_docs[0].page_content)

INFO:langchain.retrievers.multi_query:Generated queries: ['Explain how to use Functions in the OpenAI Assistant API.  'How to utilize Functions from the OpenAI Assistant API?  ','Can you provide a guide to using Functions in the OpenAI Assistant API?']

=============== 
Number of documents searched: 5 
=============== 
OpenAI's new Assistants API provides powerful tool accessibility along with dialogue. This tutorial covers using the OpenAI Assistants API. In particular, it covers how to utilize Code Interpreter, Retrieval, Functions, which are the tools provided by Assistant APIs. In addition, uploading files and submitting user feedback are also included at the end of the tutorial. 



Main content

How to utilize LCEL Chain

Define custom prompts, and create a Chain with a defined prompt.
When Chain receives a user's question (in the example below), it creates 5 questions, "\n" Returns 5 questions created separated by delimiters.

from langchain_core.runnables import RunnablePassthrough
from langchain_core.prompts import PromptTemplate
from langchain_core.output_parsers import StrOutputParser

# 프Define a prompt template (I created a prompt to generate 5 questions)
prompt = PromptTemplate.from_template(
    """You are an AI language model assistant. 
Your task is to generate five different versions of the given user question to retrieve relevant documents from a vector database. 
By generating multiple perspectives on the user question, your goal is to help the user overcome some of the limitations of the distance-based similarity search. 
Your response should be a list of values separated by new lines, eg: `foo\nbar\nbaz\n`

#ORIGINAL QUESTION: 
{question}

#Answer in Korean:
"""
)

# Create a language model instance.
llm = ChatOpenAI(temperature=0, model="gpt-4o-mini")

# LLMChain creates.
custom_multiquery_chain = (
    {"question": RunnablePassthrough()} | prompt | llm | StrOutputParser()
)

# Define the question.
question = "OpenAI Assistant API의 Functions 사용법에 대해 알려주세요."

# Run the chain to see the generated multi-query.
multi_queries = custom_multiquery_chain.invoke(question)
# Check the results (generate 5 questions)
multi_queries

'Explain how to use the Functions feature of the OpenAI Assistant API. I am curious how to utilize Functions in the \nOpenAI Assistant API. Please tell me how you can use Functions from the \nOpenAI Assistant API. Use \nFunctions to explain how to utilize the OpenAI Assistant API. \nOpenAI Assistant API  '

Chain previously created MultiQueryRetriever You can retrieve by passing it to.

multiquery_retriever = MultiQueryRetriever.from_llm(
    llm=custom_multiquery_chain, retriever=db.as_retriever()
)

MultiQueryRetriever Use to search for documents and check results.

# result
relevant_docs = multiquery_retriever.invoke(question)

# Returns the number of unique documents found.
print(
    f"===============\nNumber of documents searched: {len(relevant_docs)}",
    end="\n===============\n",
)

# Prints the contents of the searched document.
print(relevant_docs[0].page_content)

 INFO:langchain.retrievers.multi_query:Generated queries: [Explain how to use Functions in the'OpenAI Assistant API.  'How can I use Functions in the OpenAI Assistant API?  ','Please provide information about the Functions feature of the OpenAI Assistant API.  ','Please tell me how you can use the OpenAI Assistant API using Functions.  ','I would like to know more about using Functions in the OpenAI Assistant API.']

=============== 
Number of documents searched: 5 
=============== 
OpenAI's new Assistants API provides powerful tool accessibility along with dialogue. This tutorial covers using the OpenAI Assistants API. In particular, it covers how to utilize Code Interpreter, Retrieval, Functions, which are the tools provided by Assistant APIs. In addition, uploading files and submitting user feedback are also included at the end of the tutorial.

Previous05. ParentDocumentRetriever NextWikiDocs 07. MultiVectorRetriever

Last updated 1 year ago

hashtagMultiQueryRetriever

hashtagUsage

hashtagHow to utilize LCEL Chain

MultiQueryRetriever

Usage

How to utilize LCEL Chain