08. Self Query Retriever
Self-querying
SelfQueryRetriever is a retriever that can construct and run queries on its own.
Given a natural-language query from the user, it uses a query-constructing LLM chain to create a structured query. This structured query is then applied to the underlying vector store (VectorStore) to perform the search.
Through this process, SelfQueryRetriever goes beyond simply comparing the user's query with the content of the stored documents: it extracts filters on the documents' metadata from the user's query and runs those filters to find related documents.
[Note]
The list of self-query retrievers that LangChain supports is available in the LangChain documentation.
# Configuration for managing API keys as environment variables.
from dotenv import load_dotenv
# Load the API key information.
load_dotenv()
True
# Set up LangSmith tracing. https://smith.langchain.com
# !pip install langchain-teddynote
from langchain_teddynote import logging
# Enter a project name.
logging.langsmith("CH11-Retriever")
Sample data generation
Using the descriptions and metadata of cosmetic products, we build a vector store that supports similarity search.
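A minimal sketch of what such sample data could look like, assuming each document is a short product description with `category`, `year`, and `user_rating` metadata. Every product text and value below is invented for illustration, and plain dictionaries stand in for a real vector store so the snippet runs without external services.

```python
from collections import Counter

# Hypothetical sample data: every description and value below is invented
# for illustration and is not the tutorial's original dataset.
sample_docs = [
    {
        "page_content": "A moisturizing cream rich in hyaluronic acid for dry skin.",
        "metadata": {"year": 2024, "category": "skincare", "user_rating": 4.7},
    },
    {
        "page_content": "A matte-finish foundation with long-lasting coverage.",
        "metadata": {"year": 2023, "category": "makeup", "user_rating": 4.5},
    },
    {
        "page_content": "A gentle plant-based cleansing foam for sensitive skin.",
        "metadata": {"year": 2023, "category": "cleanser", "user_rating": 4.8},
    },
    {
        "page_content": "A lightweight sunscreen with broad-spectrum protection.",
        "metadata": {"year": 2024, "category": "suncare", "user_rating": 4.9},
    },
    {
        "page_content": "A vivid lipstick with a smooth, hydrating formula.",
        "metadata": {"year": 2023, "category": "makeup", "user_rating": 4.4},
    },
]

# Quick sanity check: count the products per release year.
products_per_year = Counter(doc["metadata"]["year"] for doc in sample_docs)
print(products_per_year)
```

With LangChain, records like these would typically be wrapped in Document objects and indexed in a vector store together with an embedding model.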
SelfQueryRetriever
You can now instantiate the retriever. To do so, you must provide in advance the metadata fields that the documents support and a brief description of the document contents.
The AttributeInfo class is used to define information about each cosmetic metadata field.
Category (category): a string giving the cosmetic category, one of ['skincare', 'makeup', 'cleanser', 'suncare'].
Year (year): an integer giving the year the cosmetic was released.
User rating (user_rating): a float giving the user rating, in the range 1-5.
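To make the three fields concrete, the sketch below writes the same information as plain dictionaries with `name`, `description`, and `type` keys, mirroring the fields that AttributeInfo takes. The dictionaries, the category labels, and the `check_metadata` helper are hypothetical stand-ins so the snippet runs without LangChain; they are not part of the library.

```python
# Plain-dictionary stand-ins for AttributeInfo entries (assumption: these
# mirror its name/description/type fields; they are not the library class).
metadata_field_info = [
    {
        "name": "category",
        "description": "The category of the cosmetic product. "
        "One of ['skincare', 'makeup', 'cleanser', 'suncare']",
        "type": "string",
    },
    {
        "name": "year",
        "description": "The year the cosmetic product was released",
        "type": "integer",
    },
    {
        "name": "user_rating",
        "description": "A user rating for the cosmetic product, from 1 to 5",
        "type": "float",
    },
]

# Map the declared type names to Python types for a simple check.
PYTHON_TYPES = {"string": str, "integer": int, "float": (int, float)}


def check_metadata(metadata: dict) -> list:
    """Hypothetical helper: list fields that are missing or have the wrong type."""
    problems = []
    for field in metadata_field_info:
        value = metadata.get(field["name"])
        expected = PYTHON_TYPES[field["type"]]
        # bool is a subclass of int, so reject it explicitly.
        if value is None or isinstance(value, bool) or not isinstance(value, expected):
            problems.append(field["name"])
    return problems


print(check_metadata({"category": "makeup", "year": 2023, "user_rating": 4.5}))
print(check_metadata({"category": "makeup", "year": "2023"}))
```

The descriptions matter more than they might seem: they are the text the query-constructing LLM reads when deciding which field a phrase in the user's query refers to.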
Create the retriever object using the SelfQueryRetriever.from_llm() method.
llm: the language model
vectorstore: the vector store
document_contents: a description of the contents of the documents
metadata_field_info: the metadata field information
Query test
Run a search by entering a query that applies a filter.
You can perform a search using complex filters.
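As a rough illustration of what a compound filter does once it has been extracted from a query, the sketch below evaluates nested `and`/`or` conditions against document metadata in plain Python. The tuple-based filter format, the `matches` function, and the product records are assumptions made for this example; in the retriever, the filter is produced by the LLM and executed by the vector store.

```python
import operator

# Comparison names mapped to Python operators (names chosen for this sketch).
COMPARATORS = {
    "eq": operator.eq,
    "gt": operator.gt,
    "gte": operator.ge,
    "lt": operator.lt,
    "lte": operator.le,
}


def matches(metadata: dict, flt: tuple) -> bool:
    """Evaluate a nested filter against one document's metadata.

    A filter is either ("and" | "or", [sub_filters]) or
    (comparator, attribute, value), e.g. ("gte", "year", 2023).
    """
    head = flt[0]
    if head == "and":
        return all(matches(metadata, sub) for sub in flt[1])
    if head == "or":
        return any(matches(metadata, sub) for sub in flt[1])
    comparator, attribute, value = flt
    if attribute not in metadata:
        return False
    return COMPARATORS[comparator](metadata[attribute], value)


# Hypothetical metadata records for three products.
products = [
    {"name": "cream", "category": "skincare", "year": 2024, "user_rating": 4.7},
    {"name": "foundation", "category": "makeup", "year": 2023, "user_rating": 4.5},
    {"name": "lipstick", "category": "makeup", "year": 2023, "user_rating": 4.4},
]

# Roughly: "makeup released in 2023 or later with a rating of at least 4.5".
compound_filter = (
    "and",
    [
        ("eq", "category", "makeup"),
        ("gte", "year", 2023),
        ("gte", "user_rating", 4.5),
    ],
)

hits = [p["name"] for p in products if matches(p, compound_filter)]
print(hits)
```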
k is the number of documents to retrieve.
SelfQueryRetriever can also take k from the query itself. To enable this, pass enable_limit=True to the constructor.
There are three products released in 2023, but because k is set to 2, only two are returned.
Alternatively, without explicitly specifying search_kwargs in code, you can limit the number of search results by stating a count such as "1" or "2" in the query.
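The interaction between a `k` set in code and a count stated in the query can be pictured as a merge of search arguments in which a limit extracted from the query takes precedence. This is a simplified sketch of that idea, not the library's code; `resolve_search_kwargs` and the result list are hypothetical.

```python
from typing import Optional


def resolve_search_kwargs(
    default_kwargs: dict, query_limit: Optional[int], enable_limit: bool
) -> dict:
    """Return the search arguments to use for one query.

    default_kwargs: arguments configured in code, e.g. {"k": 2}
    query_limit:    a count extracted from the query text, or None
    enable_limit:   whether limits found in the query may be applied
    """
    resolved = dict(default_kwargs)  # copy so the defaults are not mutated
    if enable_limit and query_limit is not None:
        resolved["k"] = query_limit
    return resolved


# Hypothetical result list: three products released in 2023.
released_2023 = ["foundation", "cleansing foam", "lipstick"]

# k fixed in code: only two of the three products are returned.
kwargs = resolve_search_kwargs({"k": 2}, query_limit=None, enable_limit=True)
print(released_2023[: kwargs["k"]])

# No k in code, but the query asked for one product.
kwargs = resolve_search_kwargs({}, query_limit=1, enable_limit=True)
print(released_2023[: kwargs["k"]])
```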
Going deeper
To see what happens under the hood and to gain finer-grained control, we can rebuild the retriever from scratch.
The process starts with creating a query-construction chain.
Creating the query_constructor chain
Create the query_constructor chain, which generates structured queries. Use the get_query_constructor_prompt function to get the query constructor prompt.
Call the query_constructor.invoke() method to process a given query.
Let's check the generated query.
The key element of the self-query retriever is the query constructor. To build a good retrieval system, you need to make sure the query constructor works well.
To achieve this, you need to tune the prompt, the examples within the prompt, the attribute descriptions, and so on.
Converting structured queries with the structured query translator
The next important component is the structured query translator.
It is responsible for converting a generic StructuredQuery object into a metadata filter that fits the syntax of the vector store in use.
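To illustrate the idea of translation, the sketch below converts a generic filter tree into a nested dictionary with `$`-prefixed operators, the style used by Chroma's metadata filters. The `Comparison` and `Operation` dataclasses and the `to_store_filter` function are simplified stand-ins defined for this example, not LangChain's own classes or translator.

```python
from dataclasses import dataclass
from typing import Union


@dataclass
class Comparison:
    """A single condition on one metadata field (stand-in class)."""

    comparator: str  # e.g. "eq", "gte"
    attribute: str  # metadata field name
    value: object


@dataclass
class Operation:
    """A logical combination of conditions (stand-in class)."""

    operator: str  # "and" or "or"
    arguments: list


def to_store_filter(node: Union[Comparison, Operation]) -> dict:
    """Recursively translate the generic tree into a store-specific dict."""
    if isinstance(node, Comparison):
        return {node.attribute: {f"${node.comparator}": node.value}}
    return {f"${node.operator}": [to_store_filter(arg) for arg in node.arguments]}


# Generic form of "makeup released in 2023 or later".
generic_filter = Operation(
    "and",
    [
        Comparison("eq", "category", "makeup"),
        Comparison("gte", "year", 2023),
    ],
)

store_filter = to_store_filter(generic_filter)
print(store_filter)
```

Keeping the generic form separate from the store-specific form is what lets the same query constructor work with different vector stores: only the translator changes.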
Use the retriever.invoke() method to retrieve the documents relevant to a given question.