RAG for Dialogue (Lab) — Conversational Agents

Large language models can produce fluent but factually wrong or unsupported answers — a failure mode known as hallucination. Retrieval-augmented generation (RAG) mitigates this by retrieving relevant evidence from an external knowledge source and conditioning the model's response on that evidence.
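The retrieve-then-condition idea above can be sketched in a few lines. This is a toy illustration, not the lab's pipeline: the token-overlap scorer stands in for a real retriever, and the names `overlap_score`, `retrieve`, and `build_prompt`, the prompt wording, and the two sample passages are all hypothetical.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def overlap_score(query, passage):
    """Toy relevance score: number of query tokens also found in the passage."""
    q = Counter(tokenize(query))
    p = Counter(tokenize(passage))
    return sum(min(q[t], p[t]) for t in q)


def retrieve(query, passages, k=1):
    """Return the k passages with the highest overlap score."""
    ranked = sorted(passages, key=lambda p: overlap_score(query, p), reverse=True)
    return ranked[:k]


def build_prompt(question, evidence):
    """Condition the model on the evidence by placing it ahead of the question."""
    lines = ["Answer using only the evidence below."]
    lines += [f"Evidence: {e}" for e in evidence]
    lines.append(f"Question: {question}")
    lines.append("Answer:")
    return "\n".join(lines)


# Hypothetical two-passage knowledge source.
passages = [
    "The Eiffel Tower is located in Paris and was completed in 1889.",
    "Mount Everest is the highest mountain above sea level.",
]
question = "When was the Eiffel Tower completed?"
evidence = retrieve(question, passages, k=1)
prompt = build_prompt(question, evidence)
print(prompt)
```

The resulting prompt would be passed to the language model in place of the bare question, so the answer can draw on the retrieved passage rather than on the model's parameters alone.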

In this lab you build a RAG pipeline for information-seeking dialogue. You use the HybriDialogue dataset (open-domain, information-seeking) and Contriever as the dense retriever to fetch candidate passages. You then inject the retrieved text into the prompt, generate responses both with and without retrieval, and compare the two sets of responses for factual accuracy and relevance. By the end you will have a minimal RAG-for-dialogue pipeline and a concrete sense of how retrieval improves grounding.
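The dense-retrieval and prompt-injection steps can be sketched without loading any model. The sketch assumes the scoring scheme shown in Contriever's published usage example (token embeddings averaged over non-padding positions, passages ranked by inner product); the notebook may differ. The 2-dimensional vectors are made-up stand-ins for encoder output, and the function names, passages, and prompt layout are hypothetical.

```python
def mean_pool(token_vectors, attention_mask):
    """Average the token vectors whose mask is 1, skipping padding positions."""
    dim = len(token_vectors[0])
    total = [0.0] * dim
    count = 0
    for vec, keep in zip(token_vectors, attention_mask):
        if keep:
            count += 1
            for i, x in enumerate(vec):
                total[i] += x
    return [x / count for x in total]


def dot(a, b):
    """Inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def rank_passages(query_vec, passage_vecs):
    """Return passage indices sorted by descending inner product with the query."""
    scores = [dot(query_vec, p) for p in passage_vecs]
    return sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)


def build_dialogue_prompt(history, evidence=None):
    """Format the dialogue; prepend evidence only when retrieval is enabled."""
    lines = []
    if evidence:
        lines.append("Evidence: " + evidence)
    lines += [f"{speaker}: {utterance}" for speaker, utterance in history]
    lines.append("Assistant:")
    return "\n".join(lines)


# Toy token embeddings standing in for encoder output (assumption).
query_tokens = [[1.0, 0.0], [0.0, 1.0], [9.0, 9.0]]
query_mask = [1, 1, 0]  # the last position is padding and is ignored
query_vec = mean_pool(query_tokens, query_mask)

passages = ["passage A", "passage B", "passage C"]
passage_vecs = [[1.0, 1.0], [2.0, -2.0], [0.5, 0.0]]
order = rank_passages(query_vec, passage_vecs)

history = [("User", "Tell me about passage A.")]
baseline_prompt = build_dialogue_prompt(history)  # without retrieval
rag_prompt = build_dialogue_prompt(history, evidence=passages[order[0]])  # with retrieval
```

Generating a response from each of the two prompts gives the with-retrieval and without-retrieval outputs that the lab compares.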

The lab (Lab4) requires a GPU and sufficient disk space for the Wikipedia dump. Follow the notebook to set up dependencies, build the retriever index, and run the dialogue evaluation.