04. CSV
CSV
A Comma-Separated Values (CSV) file is a delimited text file that uses commas to separate values. Each line of the file is a data record.
Each record consists of one or more fields separated by commas.
CSVLoader
Load CSV data with one row per document.
from langchain_community.document_loaders.csv_loader import CSVLoader
# Create a CSV loader
loader = CSVLoader(file_path="./data/titanic.csv")
# load data
docs = loader.load()
print(len(docs))
print(docs[0].metadata)
891
{'source': './data/titanic.csv', 'row': 0}
Customizing CSV parsing and loading
See the csv module documentation for more information on supported csv args.
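As a rough illustration of what those csv args control, the sketch below calls the standard library's csv.DictReader directly on an in-memory sample. The sample text, the semicolon delimiter, and the reduced set of field names are assumptions made up for this example. CSVLoader hands its csv_args dictionary to csv.DictReader, so the same keys are expected to work there.

```python
import csv
import io

# Hypothetical sample: semicolon-delimited, no header row, quoted name field
sample = (
    '1;0;"Braund, Mr. Owen Harris";male\n'
    '3;1;"Heikkinen, Miss. Laina";female\n'
)

# The same dictionary could be passed to CSVLoader as csv_args
csv_args = {
    "delimiter": ";",
    "quotechar": '"',
    "fieldnames": ["PassengerId", "Survived", "Name", "Sex"],
}

# Each parsed row is a dict keyed by the supplied field names
rows = list(csv.DictReader(io.StringIO(sample), **csv_args))
print(len(rows))
print(rows[0]["Name"])
```

Because fieldnames is supplied, the first line of the sample is treated as data rather than as a header row.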
Use the source_column argument to specify the source of the document generated for each row. Otherwise, file_path is used as the source for all documents.
This is useful when documents loaded from a CSV file are used in chains that answer questions using sources.
UnstructuredCSVLoader
You can also load tables using UnstructuredCSVLoader. One advantage of using UnstructuredCSVLoader is that when used in "elements" mode, the metadata provides an HTML representation of the table.
Output the HTML text stored in the metadata of the first document.
DataFrameLoader
Query the first 5 rows.
   PassengerId  Survived  Pclass  Name                                               Sex     Age   SibSp  Parch  Ticket            Fare     Cabin  Embarked
0  1            0         3       Braund, Mr. Owen Harris                            male    22.0  1      0      A/5 21171         7.2500   NaN    S
1  2            1         1       Cumings, Mrs. John Bradley (Florence Briggs Th...  female  38.0  1      0      PC 17599          71.2833  C85    C
2  3            1         3       Heikkinen, Miss. Laina                             female  26.0  0      0      STON/O2. 3101282  7.9250   NaN    S
3  4            1         1       Futrelle, Mrs. Jacques Heath (Lily May Peel)       female  35.0  1      0      113803            53.1000  C123   S
4  5            0         3       Allen, Mr. William Henry                           male    35.0  0      0      373450            8.0500   NaN    S