# 04. Qdrant

## Using Qdrant with LangChain

### Introduction to Qdrant

Qdrant is an open-source, high-performance vector database optimized for storing and retrieving high-dimensional embeddings. It is designed for applications that require efficient similarity search, such as recommendation systems, semantic search, and retrieval-augmented generation. Unlike traditional databases, Qdrant is built specifically for vector data and offers features like payload-based filtering, horizontal scaling, and real-time updates.

### Setting Up Qdrant

#### 1. Installing Qdrant

To use Qdrant from Python, install the official client library:

```bash
pip install qdrant-client
```

If you want to run a local instance of Qdrant, you can use Docker:

```bash
docker run -p 6333:6333 -p 6334:6334 qdrant/qdrant
```

This starts a Qdrant server locally, with the REST API accessible at `http://localhost:6333` (port 6334 serves the gRPC API).

#### 2. Creating a Qdrant Client

Once installed, initialize a Qdrant client in Python:

```python
from qdrant_client import QdrantClient

client = QdrantClient(host="localhost", port=6333)
```

If you're using Qdrant Cloud, replace `localhost` with your cluster's endpoint URL and pass your API key for authentication.
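A Qdrant Cloud connection might look like the following sketch; the URL and key below are placeholders, not real values:

```python
from qdrant_client import QdrantClient

# Placeholder endpoint and key — substitute your own Qdrant Cloud values.
client = QdrantClient(
    url="https://YOUR-CLUSTER-ID.cloud.qdrant.io",
    api_key="YOUR_API_KEY",
)
```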

### Integrating Qdrant with LangChain

LangChain provides seamless integration with Qdrant for vector-based storage and retrieval. The `Qdrant` wrapper in LangChain simplifies adding and retrieving vector embeddings.

#### 1. Creating a Qdrant Collection

Before storing vectors, define a collection (Qdrant's equivalent of an index):

```python
from qdrant_client.http.models import Distance, VectorParams

# Note: recreate_collection drops the collection if it already exists;
# use create_collection instead when existing data must be preserved.
client.recreate_collection(
    collection_name="langchain_docs",
    vectors_config=VectorParams(size=1536, distance=Distance.COSINE),
)
```

This creates a collection named `langchain_docs` that stores 1536-dimensional vectors (the output size of OpenAI's `text-embedding-ada-002` model) and uses cosine similarity as the distance metric.

#### 2. Storing Embeddings in Qdrant

To store vectors, first generate embeddings using an embedding model (e.g., OpenAI or Hugging Face) and wrap the client in LangChain's `Qdrant` vector store:

```python
from langchain.embeddings import OpenAIEmbeddings  # in newer releases: langchain_openai
from langchain.vectorstores import Qdrant          # in newer releases: langchain_qdrant

embeddings = OpenAIEmbeddings()  # requires the OPENAI_API_KEY environment variable
vector_db = Qdrant(client=client, collection_name="langchain_docs", embeddings=embeddings)
```

Now, store some text data in Qdrant:

```python
documents = ["This is a sample document.", "LangChain makes working with LLMs easier."]
vector_db.add_texts(texts=documents)
```

#### 3. Performing Similarity Search

Retrieve documents similar to a given query:

```python
query = "How does LangChain help with LLMs?"
results = vector_db.similarity_search(query, k=2)

for result in results:
    print(result.page_content)
```

This fetches the top 2 documents that are most semantically similar to the query.

### Best Practices and Optimization

* **Use the Right Distance Metric**: Choose the similarity metric (Cosine, Euclidean, Dot Product) based on your use case and how your embedding model was trained.
* **Index Maintenance**: Regularly update and clean up stale embeddings to keep the collection lean.
* **Filtering and Metadata**: Use Qdrant's payload filtering to refine search results for better precision.
* **Cloud Deployment**: For production, consider Qdrant Cloud for scalability and reliability.

### Conclusion

Qdrant provides a powerful open-source alternative to proprietary vector databases such as Pinecone. Its integration with LangChain makes it a great choice for building scalable, efficient, and cost-effective AI applications. With proper setup and optimization, you can leverage Qdrant to enhance search, recommendation, and retrieval-augmented generation (RAG) applications.
