Correct way to implement RAG with vllm #103

Open
Hel1zor opened this issue Aug 22, 2024 · 0 comments

Hel1zor commented Aug 22, 2024

Is this the correct way to implement RAG with the vLLM API on RunPod, or is there a better or easier way to do it?

```python
import os

from langchain.chains import RetrievalQA
from langchain_huggingface import HuggingFaceEmbeddings  # open-source embedding model
from langchain_openai import ChatOpenAI
from langchain_pinecone import PineconeVectorStore

# Environment variables (replace with your keys)
os.environ['RUNPOD_API_KEY'] = '<YOUR_RUNPOD_API_KEY>'
os.environ['PINECONE_API_KEY'] = '<YOUR_PINECONE_API_KEY>'

# Set up the language model. The RunPod vLLM worker exposes an
# OpenAI-compatible endpoint, so ChatOpenAI can talk to it directly
# once it is pointed at the RunPod base URL.
llm = ChatOpenAI(
    openai_api_key=os.environ['RUNPOD_API_KEY'],
    openai_api_base="https://api.runpod.ai/v2/vllm-cf5z42rtrzdc2o/openai/v1",
    model_name='openchat/openchat-3.5-1210',  # use your model's name
    temperature=0.0,
)

# Set up SentenceTransformer embeddings through the LangChain wrapper,
# so the vector store receives an Embeddings object rather than a bare
# encode function
embedding_model_name = 'sentence-transformers/all-MiniLM-L6-v2'  # example model
embeddings = HuggingFaceEmbeddings(model_name=embedding_model_name)

# Connect to the existing Pinecone index (the API key is read from the env)
index_name = "chatbot-v1"
vectorstore = PineconeVectorStore(index_name=index_name, embedding=embeddings)

# Set up the Retrieval QA chain: retrieve matching chunks from Pinecone
# and "stuff" them into the prompt for the RunPod-hosted model
qa = RetrievalQA.from_chain_type(
    llm=llm,
    chain_type="stuff",
    retriever=vectorstore.as_retriever(),
)

# Example query
query = "Who was Benito Mussolini?"
response = qa.invoke({"query": query})

# Print the response
print(response["result"])
```
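
For comparison, here is the simpler variant I was considering: skip the RetrievalQA chain, query Pinecone for the top matches myself, and call the RunPod endpoint directly with the plain `openai` client. This is only a rough sketch under the same assumptions as above (the `chatbot-v1` index already exists, the endpoint is OpenAI-compatible, and the model and index names are mine):

```python
import os

from langchain_huggingface import HuggingFaceEmbeddings
from langchain_pinecone import PineconeVectorStore
from openai import OpenAI

# Same embedding model and index as above
embeddings = HuggingFaceEmbeddings(model_name='sentence-transformers/all-MiniLM-L6-v2')
vectorstore = PineconeVectorStore(index_name="chatbot-v1", embedding=embeddings)

# Plain OpenAI client pointed at the RunPod vLLM endpoint
client = OpenAI(
    api_key=os.environ['RUNPOD_API_KEY'],
    base_url="https://api.runpod.ai/v2/vllm-cf5z42rtrzdc2o/openai/v1",
)

query = "Who was Benito Mussolini?"

# Retrieve the top-matching chunks and stuff them into the prompt by hand
docs = vectorstore.similarity_search(query, k=4)
context = "\n\n".join(doc.page_content for doc in docs)

response = client.chat.completions.create(
    model='openchat/openchat-3.5-1210',
    temperature=0.0,
    messages=[
        {"role": "system", "content": "Answer using only the provided context."},
        {"role": "user", "content": f"Context:\n{context}\n\nQuestion: {query}"},
    ],
)
print(response.choices[0].message.content)
```

This drops the chain abstraction (no prompt templating or source tracking), but it makes it obvious exactly what gets sent to the model, which is handy for debugging the endpoint.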