Series Index
- Prerequisites
- Populate the Vector Database
- Vector Retriever
- RAG implementation
- Chat UI
- Evaluation
- Performance improvements
2. Retriever
At this point, we have our vector database set up and populated with embeddings from the model all-MiniLM-L6-v2. Now we are going to create a Retriever: a component that embeds the user's query, searches our 384-dimensional map, and retrieves the "chunks" whose embeddings sit closest to it.
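To make the "384-dimensional map" idea concrete, here is a minimal sketch (not part of the series code) that uses the sentence-transformers library directly. The two example sentences are made up for illustration:

from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("all-MiniLM-L6-v2")
vectors = model.encode(["refund policy", "How do I get my money back?"])
print(vectors.shape)                         # (2, 384): one 384-dim vector per text
print(util.cos_sim(vectors[0], vectors[1]))  # high similarity despite different wording

Texts with similar meaning land near each other in this space, which is exactly what the retriever exploits.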
Note: It is very important to use the same embedding model that generated the vectors stored in the database. A different model maps text into a different vector space, so the retriever would not be able to find the information.
from langchain_chroma import Chroma
from langchain_huggingface import HuggingFaceEmbeddings

embeddings = HuggingFaceEmbeddings(model_name="all-MiniLM-L6-v2")
vectorstore = Chroma(persist_directory=DB_NAME, embedding_function=embeddings)  # DB_NAME from Part 1
retriever = vectorstore.as_retriever()
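As a quick sanity check, you can query the retriever directly. This sketch assumes the database from Part 1 already contains documents; the query string and the k value are only illustrative:

# search_kwargs lets you tune how many chunks come back (4 is LangChain's default)
retriever = vectorstore.as_retriever(search_kwargs={"k": 4})

docs = retriever.invoke("What is the refund policy?")  # hypothetical query
for doc in docs:
    print(doc.metadata, "->", doc.page_content[:80])

If this prints relevant chunks, the retriever is wired up correctly and ready to be plugged into the RAG pipeline in the next part.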
Nice work! With this short step, we've covered about 35% of our private RAG system.
Missing something? It is probably covered in the previous section, Part 1: Populate your private vector database.
More LLM Engineering articles
- LLM Engineering | Token optimization – caching, thin system prompts, and cost-optimized production usage.
- LLM Engineering | Running local LLMs and APIs – Ollama, OpenAI, Anthropic, OpenRouter, LangChain, and LiteLLM.