Series Index
- Prerequisites
- Populate the Vector Database
- Vector Retriever
- RAG Implementation
- Chat UI
- Evaluation
- Performance Improvements
4. RAG Implementation
In addition to the retriever, we need an auto-regressive (conversational) LLM:

```python
from langchain_openai import ChatOpenAI

# MODEL is the model name configured in the Prerequisites post.
llm = ChatOpenAI(temperature=0, model_name=MODEL)
```
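The rest of the snippets also rely on a few shared LangChain imports, collected here so each block stays focused:

```python
from langchain_core.documents import Document
from langchain_core.messages import HumanMessage, SystemMessage, convert_to_messages
```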
Together with the retriever we defined in the previous post, we have everything we need to build the RAG pipeline:
```python
SYSTEM_PROMPT_TEMPLATE = """
You are a knowledgeable, friendly assistant representing Desert Leaves.
You are chatting internally with a technician from Desert Leaves.
If relevant, use the given context to answer any question.
If you don't know the answer, say so.
Context:
{context}
"""
```
```python
def fetch_context(question: str) -> list[Document]:
    """Retrieve relevant context documents for a question."""
    # RETRIEVAL_K caps how many documents the retriever returns per query.
    return retriever.invoke(question, k=RETRIEVAL_K)
```
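As a quick recap, the retriever came out of the vector store we populated earlier. A minimal sketch, assuming a Chroma store with OpenAI embeddings (your store, path, and k may differ):

```python
from langchain_chroma import Chroma
from langchain_openai import OpenAIEmbeddings

RETRIEVAL_K = 5  # illustrative value; tune for your corpus

vectorstore = Chroma(
    persist_directory="vector_db",  # hypothetical path from the earlier post
    embedding_function=OpenAIEmbeddings(),
)
retriever = vectorstore.as_retriever(search_kwargs={"k": RETRIEVAL_K})
```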
```python
def combined_question(question: str, history: list[dict] | None = None) -> str:
    """Combine all the user's messages into a single retrieval query."""
    history = history or []
    prior = [m["content"] for m in history if m["role"] == "user"]
    return "\n".join([*prior, question])
```
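For example, given a short history (values are illustrative), the retrieval query stitches the user turns together so follow-up questions keep their context:

```python
history = [
    {"role": "user", "content": "Which sensors do we install?"},
    {"role": "assistant", "content": "Soil-moisture and temperature sensors."},
]
combined_question("How often do they report?", history)
# -> 'Which sensors do we install?\nHow often do they report?'
```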
```python
def answer_question(question: str, history: list[dict] | None = None) -> tuple[str, list[Document]]:
    """Answer the given question with RAG; return the answer and the context documents."""
    history = history or []
    # Retrieve using the whole conversation, not just the latest message.
    combined = combined_question(question, history)
    docs = fetch_context(combined)
    # Inject the retrieved documents into the system prompt.
    context = "\n\n".join(doc.page_content for doc in docs)
    system_prompt = SYSTEM_PROMPT_TEMPLATE.format(context=context)
    messages = [SystemMessage(content=system_prompt)]
    messages.extend(convert_to_messages(history))
    messages.append(HumanMessage(content=question))
    response = llm.invoke(messages)
    return response.content, docs
```
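Before wiring up a UI, we can sanity-check the pipeline directly (the question below is illustrative):

```python
answer, sources = answer_question("What maintenance schedule do we recommend?")
print(answer)
for doc in sources:
    print(doc.metadata)  # see which documents grounded the answer
```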
5. UI with Gradio
gr.ChatInterface expects the chat function to take (message, history) and return the reply text, but answer_question also returns the source documents, so a small wrapper drops them for display:

```python
import gradio as gr

def chat(message: str, history: list[dict]) -> str:
    answer, _docs = answer_question(message, history)
    return answer

# type="messages" gives history as role/content dicts, matching our functions.
gr.ChatInterface(chat, type="messages").launch()
```

As simple as that!
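By default, launch() serves the app on localhost only. If technicians elsewhere on the internal network need access, Gradio's launch parameters let you bind to other interfaces (values here are illustrative):

```python
gr.ChatInterface(chat, type="messages").launch(server_name="0.0.0.0", server_port=7860)
```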
Nice work! We now have our own private RAG system.
A little tune-up for production follows in the next sections: Part 6: Evaluation (available soon).
More LLM Engineering articles
- LLM Engineering | Token optimization – caching, thin system prompts, and cost-optimized production usage.
- LLM Engineering | Running local LLMs and APIs – Ollama, OpenAI, Anthropic, OpenRouter, LangChain, and LiteLLM.