09. Experiment evaluation comparison
The Compare feature provided by LangSmith makes it easy to compare experimental results.
# installation
# !pip install -qU langsmith langchain-teddynote

# Configuration file for managing the API KEY as an environment variable
from dotenv import load_dotenv

# Load API KEY information
load_dotenv()

True

# Set up LangSmith tracking. https://smith.langchain.com
# !pip install -qU langchain-teddynote
from langchain_teddynote import logging
# Enter a project name.
logging.langsmith("CH16-Evaluations")

Start tracking LangSmith.
[Project name]
CH16-Evaluations

Define functions for RAG performance testing
We will create a RAG system to use for testing.
Define answer-generation functions, one backed by the GPT-4o-mini model and one backed by an Ollama model.
Evaluate the answers produced by each model.
Run the evaluation once for each of the two chains, as sketched below.
Use the comparison view to examine the results side by side.
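The following is a minimal sketch of what the two target functions and the two evaluation runs could look like. The stand-in chains, the Ollama model name, the dataset name `RAG_EVAL_DATASET`, and the `ask_question_*` helpers are illustrative assumptions rather than the exact code of this tutorial; in a real test you would plug in the full RAG chains built in the earlier chapters, and the `langchain-openai` / `langchain-ollama` packages are assumed to be installed.

```python
from langsmith.evaluation import evaluate, LangChainStringEvaluator
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.output_parsers import StrOutputParser
from langchain_openai import ChatOpenAI
from langchain_ollama import ChatOllama

# Stand-in chains: in the real test these would be full RAG chains
# (retriever + prompt + LLM) built as in the earlier RAG chapters.
prompt = ChatPromptTemplate.from_template(
    "Answer the question concisely.\n\nQuestion: {question}"
)
gpt_chain = prompt | ChatOpenAI(model="gpt-4o-mini", temperature=0) | StrOutputParser()
ollama_chain = prompt | ChatOllama(model="EEVE-Korean-10.8B:latest") | StrOutputParser()

# Target functions: each receives one dataset example and returns the answer.
def ask_question_gpt(inputs: dict) -> dict:
    return {"answer": gpt_chain.invoke({"question": inputs["question"]})}

def ask_question_ollama(inputs: dict) -> dict:
    return {"answer": ollama_chain.invoke({"question": inputs["question"]})}

# Grade answers against the reference outputs with a GPT-4o-mini judge.
qa_evaluator = LangChainStringEvaluator(
    "qa", config={"llm": ChatOpenAI(model="gpt-4o-mini", temperature=0)}
)

dataset_name = "RAG_EVAL_DATASET"  # hypothetical dataset name

# Run the same evaluation once per chain; each call creates a separate
# experiment that can later be selected and compared in the LangSmith UI.
evaluate(
    ask_question_gpt,
    data=dataset_name,
    evaluators=[qa_evaluator],
    experiment_prefix="MODEL_COMPARE_EVAL",
    metadata={"model": "gpt-4o-mini"},
)
evaluate(
    ask_question_ollama,
    data=dataset_name,
    evaluators=[qa_evaluator],
    experiment_prefix="MODEL_COMPARE_EVAL",
    metadata={"model": "ollama"},
)
```

Both runs target the same dataset, so the resulting experiments appear together on the dataset's Experiments tab and can be selected for comparison.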
How to make a comparison view


On the dataset's Experiments tab, select the experiments you want to compare.
Click the "Compare" button at the bottom.
A comparison view