Building a Private RAG System with Ollama, LangChain and Chroma

Part 2: Create the vector retriever

Posted by Kike Bodí on December 10, 2025

Series Index

  1. Prerequisites
  2. Populate the Vector Database
  3. Vector Retriever
  4. RAG implementation
  5. Chat UI
  6. Evaluation
  7. Performance improvements

2. Retriever

At this point, we have our vector database set up with the embedding model all-MiniLM-L6-v2. Now we are going to create a retriever: a component that searches our 384-dimensional embedding space and returns the "chunks" that sit closest to the query.

Note: It is very important to query with the same embedding model that generated the embeddings in the first place; otherwise the retriever won't be able to find the information.

from langchain_huggingface import HuggingFaceEmbeddings
from langchain_chroma import Chroma

DB_NAME = "vector_db"  # the same persist directory created in Part 1

embeddings = HuggingFaceEmbeddings(model_name="all-MiniLM-L6-v2")
vectorstore = Chroma(persist_directory=DB_NAME, embedding_function=embeddings)
retriever = vectorstore.as_retriever()
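
Once the retriever exists, it is worth a quick sanity check. The sketch below assumes the database from Part 1 is populated; the question string and the k value are illustrative placeholders, and invoke is LangChain's standard retriever entry point:

# Return only the 4 nearest chunks instead of the default
retriever = vectorstore.as_retriever(search_kwargs={"k": 4})

# Run a similarity search against the vector database
docs = retriever.invoke("your question here")
for doc in docs:
    print(doc.page_content[:200])  # preview each retrieved chunk

If the printed chunks look unrelated to the query, double-check that the embedding model here matches the one used to populate the database in Part 1.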

Nice work! With this short step, we’ve covered about 35% of our private RAG system.

Missing something? It is probably covered in the previous section, Part 1: Populate your private vector database.

