04. LangSmith dataset generation
Let's find out how to build your own RAG evaluation dataset.
First, building a dataset requires a broad understanding of the three evaluation cases.
Case: Evaluate whether the Retrieval is relevant to the Question
Question - Retrieval
Case: Evaluate whether the Answer is relevant to the Question
Case: Evaluate whether the Answer was answered from within the retrieved documents (Hallucination Check)
Therefore, three pieces of information are generally needed: Question, Retrieval, and Answer. In practice, however, building Ground Truth for Retrieval is difficult.
If Ground Truth for Retrieval exists, store all three and use them as the dataset; otherwise, build and use the dataset with Question and Answer only.
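The storage rule above can be sketched in a few lines of Python. This is only an illustration under assumptions: the helper name `build_record` and the field names `question`, `answer`, and `retrieval` are made up for this sketch and are not part of any library.

```python
from typing import Optional


def build_record(
    question: str,
    answer: str,
    retrieval_ground_truth: Optional[list[str]] = None,
) -> dict:
    """Build one dataset record (hypothetical helper, not a LangSmith API)."""
    # Question and Answer are always stored.
    record = {"question": question, "answer": answer}
    # Retrieval ground truth is stored only when it actually exists.
    if retrieval_ground_truth:
        record["retrieval"] = list(retrieval_ground_truth)
    return record


# With retrieval ground truth: all three pieces are kept.
full = build_record("Q1", "A1", ["doc-1", "doc-2"])
# Without it: only Question and Answer are kept.
qa_only = build_record("Q2", "A2")
```

Keeping the retrieval field optional lets the same record shape serve both situations, so later code only has to check whether the key is present.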
# Install
# !pip install -qU langsmith langchain-teddynote
# Configuration file for managing the API key as an environment variable
from dotenv import load_dotenv
# Load the API key information
load_dotenv()
# Set up LangSmith tracing. https://smith.langchain.com
# !pip install -qU langchain-teddynote
from langchain_teddynote import logging
# Enter the project name.
logging.langsmith("CH16-Evaluations")
Generate dataset
Use inputs and outputs to create the dataset.
The dataset consists of question and answer pairs.
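As a rough sketch of that layout, the snippet below splits question/answer pairs into parallel inputs and outputs lists. The pairs are placeholders, and the key names `question` and `answer` are an assumption taken from the description above.

```python
# Placeholder question/answer pairs (illustrative only; replace with real data).
qa_pairs = [
    {"question": "<question 1>", "answer": "<answer 1>"},
    {"question": "<question 2>", "answer": "<answer 2>"},
]

# inputs hold what the system under test receives ...
inputs = [{"question": pair["question"]} for pair in qa_pairs]
# ... and outputs hold the reference answers to compare against.
outputs = [{"answer": pair["answer"]} for pair in qa_pairs]

# Both lists must line up one-to-one.
assert len(inputs) == len(outputs)
```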

Alternatively, you can use the Synthetic Dataset generated in the previous tutorial.
The code below is an example that uses a dataset uploaded to HuggingFace. (Note) Uncomment and run the line below to update the datasets library before proceeding.

Creating a dataset for LangSmith testing
Create a new dataset under Datasets & Testing.
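One way this step might look with the LangSmith Python client is sketched below. It relies on the client methods `list_datasets`, `create_dataset`, and `create_examples`; the helper names and the dataset name `RAG_EVAL_DATASET` are hypothetical, and the usage lines are commented out because they need a LangSmith API key.

```python
def get_or_create_dataset(client, dataset_name, description=None):
    """Return the dataset named dataset_name, creating it if it is missing."""
    # Reuse an existing dataset so a re-run does not fail on a duplicate name.
    for dataset in client.list_datasets():
        if dataset.name == dataset_name:
            return dataset
    return client.create_dataset(dataset_name=dataset_name, description=description)


def upload_qa_dataset(client, dataset_name, qa_pairs):
    """Create (or reuse) the dataset and add one example per question/answer pair."""
    dataset = get_or_create_dataset(client, dataset_name)
    client.create_examples(
        inputs=[{"question": p["question"]} for p in qa_pairs],
        outputs=[{"answer": p["answer"]} for p in qa_pairs],
        dataset_id=dataset.id,
    )
    return dataset


# Usage (requires a LangSmith API key in the environment, so it is commented out):
# from langsmith import Client
# pairs = [{"question": "<question>", "answer": "<answer>"}]
# upload_qa_dataset(Client(), "RAG_EVAL_DATASET", pairs)  # hypothetical name
```

Note that this sketch reuses an existing dataset but does not deduplicate examples, so running the upload twice would add the same pairs twice.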
You can also create a dataset directly from a csv file using the LangSmith UI.
Please refer to the documents below for details.
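For the csv route, a file with one column per field is probably the simplest starting point. The sketch below writes placeholder pairs to a csv file with Python's standard `csv` module; the file name `rag_eval_dataset.csv` and the column names `question` and `answer` are assumptions, and which columns count as inputs or outputs is chosen at upload time.

```python
import csv
import os
import tempfile

# Placeholder rows (illustrative only; replace with real question/answer pairs).
qa_pairs = [
    {"question": "<question 1>", "answer": "<answer 1>"},
    {"question": "<question 2>", "answer": "<answer 2>"},
]

# Write to a temporary directory so the sketch does not clutter the project.
csv_path = os.path.join(tempfile.mkdtemp(), "rag_eval_dataset.csv")
with open(csv_path, "w", newline="", encoding="utf-8") as f:
    writer = csv.DictWriter(f, fieldnames=["question", "answer"])
    writer.writeheader()
    writer.writerows(qa_pairs)

# Read the file back to confirm the header and rows before uploading it.
with open(csv_path, newline="", encoding="utf-8") as f:
    rows = list(csv.DictReader(f))
print(f"Wrote {len(rows)} rows to {csv_path}")
```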
You can also add examples to the dataset later.
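Adding examples afterwards could look like the sketch below, which calls the client's `create_examples` method against an existing dataset id. The helper name `add_examples` and the dataset name in the commented usage are hypothetical.

```python
def add_examples(client, dataset_id, questions, answers):
    """Append new question/answer examples to an existing dataset."""
    # Each question needs exactly one reference answer.
    if len(questions) != len(answers):
        raise ValueError("questions and answers must have the same length")
    client.create_examples(
        inputs=[{"question": q} for q in questions],
        outputs=[{"answer": a} for a in answers],
        dataset_id=dataset_id,
    )
    return len(questions)


# Usage (requires a LangSmith API key in the environment, so it is commented out):
# from langsmith import Client
# client = Client()
# dataset = client.read_dataset(dataset_name="RAG_EVAL_DATASET")  # hypothetical name
# add_examples(client, dataset.id, ["<new question>"], ["<new answer>"])
```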
Congratulations! The dataset is now ready.