A Coding Guide to Build Modular and Self-Correcting QA Systems with DSPy

by Brenden Burgess


In this tutorial, we explore how to build an intelligent, self-correcting question-answering system using the DSPy framework, integrated with Google's Gemini 1.5 Flash model. We start by defining structured Signatures that clearly describe input-output behavior, which DSPy uses as a foundation for building reliable pipelines. With DSPy's declarative programming approach, we build composable modules, such as AdvancedQA and SimpleRAG, to answer questions using both supplied context and retrieval-augmented generation. By combining DSPy's modularity with Gemini's reasoning, we set up an AI system capable of providing accurate, step-by-step answers. As we progress, we also use DSPy's optimization tools, such as BootstrapFewShot, to automatically improve performance based on training examples.

!pip install dspy-ai google-generativeai


import dspy
import google.generativeai as genai
import random
from typing import List, Optional


GOOGLE_API_KEY = "Use Your Own API Key"  
genai.configure(api_key=GOOGLE_API_KEY)


dspy.configure(lm=dspy.LM(model="gemini/gemini-1.5-flash", api_key=GOOGLE_API_KEY))

We start by installing the required libraries: DSPy for declarative AI pipelines and google-generativeai to access Google's Gemini models. After importing the necessary modules, we configure Gemini using our API key. Finally, we configure DSPy to use the Gemini 1.5 Flash model as the language model backend.
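Hardcoding the key in a notebook makes it easy to leak. As a minimal sketch, assuming the key is stored in an environment variable, a small helper can load it instead. The helper name `load_api_key` and the default variable name `GOOGLE_API_KEY` are illustrative assumptions, not part of DSPy or the Gemini SDK.

```python
import os


def load_api_key(env_var: str = "GOOGLE_API_KEY") -> str:
    """Return the key stored in env_var, or raise a clear error if it is unset."""
    # load_api_key is a hypothetical helper, not a DSPy or Gemini SDK function.
    key = os.environ.get(env_var, "").strip()
    if not key:
        raise RuntimeError(f"Set the {env_var} environment variable before running.")
    return key


# Intended use in the setup cell above (left as a comment so this block
# does not fail when the variable is not set):
# GOOGLE_API_KEY = load_api_key()
```

The returned value can then be passed to `genai.configure` and `dspy.LM` exactly as the literal string is in the setup code.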

class QuestionAnswering(dspy.Signature):
    """Answer questions based on given context with reasoning."""
    context: str = dspy.InputField(desc="Relevant context information")
    question: str = dspy.InputField(desc="Question to answer")
    reasoning: str = dspy.OutputField(desc="Step-by-step reasoning")
    answer: str = dspy.OutputField(desc="Final answer")


class FactualityCheck(dspy.Signature):
    """Verify if an answer is factually correct given context."""
    context: str = dspy.InputField()
    question: str = dspy.InputField()
    answer: str = dspy.InputField()
    is_correct: bool = dspy.OutputField(desc="True if answer is factually correct")

We define two DSPy Signatures to structure the inputs and outputs of our system. First, QuestionAnswering expects a context and a question, and it returns both reasoning and a final answer, allowing the model to explain its thought process. Then, FactualityCheck is designed to verify the factual correctness of an answer by returning a simple Boolean, helping us build a self-correcting QA system.

class AdvancedQA(dspy.Module):
    def __init__(self, max_retries: int = 2):
        super().__init__()
        self.max_retries = max_retries
        self.qa_predictor = dspy.ChainOfThought(QuestionAnswering)
        self.fact_checker = dspy.Predict(FactualityCheck)
       
    def forward(self, context: str, question: str) -> dspy.Prediction:
        prediction = self.qa_predictor(context=context, question=question)
       
        for attempt in range(self.max_retries):
            fact_check = self.fact_checker(
                context=context,
                question=question,
                answer=prediction.answer
            )
           
            if fact_check.is_correct:
                break
               
            refined_context = f"{context}\n\nPrevious incorrect answer: {prediction.answer}\nPlease provide a more accurate answer."
            prediction = self.qa_predictor(context=refined_context, question=question)
       
        return prediction

We create an AdvancedQA module to add self-correction to our QA system. It first uses a ChainOfThought predictor to generate an answer with reasoning. Then, it checks factual accuracy with a fact-checking predictor. If the answer is judged incorrect, we refine the context and try again, up to a specified number of times, to produce more reliable outputs.
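To make the control flow of the retry loop easier to follow, here is a minimal sketch that mirrors it in plain Python, with no model calls. The function `answer_with_retries` and the two stubs are hypothetical stand-ins for the ChainOfThought predictor and the fact checker, so the numbers and strings are for illustration only.

```python
from typing import Callable


def answer_with_retries(
    answer_fn: Callable[[str, str], str],
    check_fn: Callable[[str, str, str], bool],
    context: str,
    question: str,
    max_retries: int = 2,
) -> str:
    """Generate an answer, then re-ask while the checker rejects it."""
    answer = answer_fn(context, question)
    for _ in range(max_retries):
        # The check always runs against the original, unmodified context.
        if check_fn(context, question, answer):
            break
        # Tell the next attempt which answer was rejected.
        refined_context = (
            f"{context}\n\nPrevious incorrect answer: {answer}\n"
            "Please provide a more accurate answer."
        )
        answer = answer_fn(refined_context, question)
    return answer


# Hypothetical stubs: the first attempt is wrong, the retry is right.
def stub_answer(context: str, question: str) -> str:
    return "330 meters" if "Previous incorrect answer" in context else "300 meters"


def stub_check(context: str, question: str, answer: str) -> bool:
    return answer in context


result = answer_with_retries(
    stub_answer,
    stub_check,
    context="The Eiffel Tower stands 330 meters tall.",
    question="How tall is the Eiffel Tower?",
)
print(result)  # 330 meters
```

One consequence of this loop shape is that the answer produced by the final retry is returned without being checked again, so `max_retries` bounds the number of checks, not the quality of the last answer.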

class SimpleRAG(dspy.Module):
    def __init__(self, knowledge_base: List[str]):
        super().__init__()
        self.knowledge_base = knowledge_base
        self.qa_system = AdvancedQA()

    def retrieve(self, question: str, top_k: int = 2) -> str:
        # Simple keyword-based retrieval (in practice, use vector embeddings)
        scored_docs = []
        question_words = set(question.lower().split())

        for doc in self.knowledge_base:
            doc_words = set(doc.lower().split())
            score = len(question_words.intersection(doc_words))
            scored_docs.append((score, doc))

        # Return top-k most relevant documents
        scored_docs.sort(reverse=True)
        return "\n\n".join(doc for _, doc in scored_docs[:top_k])
   
    def forward(self, question: str) -> dspy.Prediction:
        context = self.retrieve(question)
        return self.qa_system(context=context, question=question)

We build a SimpleRAG module to simulate retrieval-augmented generation (RAG) using DSPy. We provide a knowledge base and implement a basic keyword-based retriever to fetch the most relevant documents for a given question. These documents serve as context for the AdvancedQA module, which then performs reasoning and self-correction to produce an accurate answer.
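The keyword-overlap scoring can be tried on its own, without a language model. The sketch below reimplements the same idea as a standalone function; the three-document knowledge base is a hypothetical example made up for illustration.

```python
from typing import List


def retrieve(knowledge_base: List[str], question: str, top_k: int = 2) -> str:
    """Rank documents by the number of words they share with the question."""
    question_words = set(question.lower().split())
    scored = [
        (len(question_words & set(doc.lower().split())), doc)
        for doc in knowledge_base
    ]
    # Highest overlap first; ties are broken by the document text itself.
    scored.sort(reverse=True)
    return "\n\n".join(doc for _, doc in scored[:top_k])


# Hypothetical knowledge base, for illustration only.
docs = [
    "The Eiffel Tower stands 330 meters tall including antennas.",
    "Python was created by Guido van Rossum.",
    "Machine learning algorithms learn from data.",
]

context = retrieve(docs, "How tall is the Eiffel Tower?", top_k=1)
print(context)  # The Eiffel Tower stands 330 meters tall including antennas.
```

Because the text is split on whitespace only, a token such as `tower?` does not match `tower`. Stripping punctuation before comparing, or switching to embeddings as the code comment suggests, would make the retriever less brittle.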

knowledge_base = [
    "Use Your Context and Knowledge Base Here"
]


training_examples = [
    dspy.Example(
        question="What is the height of the Eiffel Tower?",
        context="The Eiffel Tower is located in Paris, France. It was constructed from 1887 to 1889 and stands 330 meters tall including antennas.",
        answer="330 meters"
    ).with_inputs("question", "context"),

    dspy.Example(
        question="Who created Python programming language?",
        context="Python is a high-level programming language created by Guido van Rossum. It was first released in 1991 and emphasizes code readability.",
        answer="Guido van Rossum"
    ).with_inputs("question", "context"),

    dspy.Example(
        question="What is machine learning?",
        context="ML focuses on algorithms that can learn from data without being explicitly programmed.",
        answer="Machine learning focuses on algorithms that learn from data without explicit programming."
    ).with_inputs("question", "context")
]

We define a small knowledge base to serve as the source of context for retrieval. The placeholder entry should be replaced with your own facts, for example on history, programming, and science. At the same time, we prepare a set of training examples to guide the DSPy optimization process. Each example includes a question, its relevant context, and the correct answer, helping our system learn to answer more accurately.

def accuracy_metric(example, prediction, trace=None):
    """Simple accuracy metric for evaluation"""
    return example.answer.lower() in prediction.answer.lower()


print("🚀 Initializing DSPy QA System with Gemini...")
print("📝 Note: Using Google's Gemini 1.5 Flash (free tier)")
rag_system = SimpleRAG(knowledge_base)


basic_qa = dspy.ChainOfThought(QuestionAnswering)


print("\n📊 Before Optimization:")
test_question = "What is the height of the Eiffel Tower?"
test_context = knowledge_base[0]
initial_prediction = basic_qa(context=test_context, question=test_question)
print(f"Q: {test_question}")
print(f"A: {initial_prediction.answer}")
print(f"Reasoning: {initial_prediction.reasoning}")


print("\n🔧 Optimizing with BootstrapFewShot...")
optimizer = dspy.BootstrapFewShot(metric=accuracy_metric, max_bootstrapped_demos=2)
optimized_qa = optimizer.compile(basic_qa, trainset=training_examples)


print("\n📈 After Optimization:")
optimized_prediction = optimized_qa(context=test_context, question=test_question)
print(f"Q: {test_question}")
print(f"A: {optimized_prediction.answer}")
print(f"Reasoning: {optimized_prediction.reasoning}")

We start by defining a simple accuracy metric that checks whether the predicted answer contains the correct answer. After initializing our SimpleRAG system and a baseline ChainOfThought QA module, we test the baseline on a sample question before any optimization. Then, using DSPy's BootstrapFewShot optimizer, we tune the QA module with our training examples. This lets DSPy automatically select few-shot demonstrations for the prompt, which can improve accuracy; we check this by comparing the answers before and after optimization.
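It helps to see what a substring metric accepts and rejects before trusting the scores it reports. The sketch below applies the same check to plain objects. `substring_match` and the `SimpleNamespace` stand-ins for `dspy.Example` and `dspy.Prediction` are assumptions made so the example runs without DSPy.

```python
from types import SimpleNamespace


def substring_match(example, prediction, trace=None):
    """Case-insensitive substring check, the same logic as accuracy_metric."""
    return example.answer.lower() in prediction.answer.lower()


# Hypothetical stand-ins for dspy.Example and dspy.Prediction objects.
gold = SimpleNamespace(answer="330 meters")
verbose = SimpleNamespace(answer="The tower is 330 Meters tall.")
reworded = SimpleNamespace(answer="It is 330 m tall.")

print(substring_match(gold, verbose))   # True
print(substring_match(gold, reworded))  # False
```

The second case shows the main limitation: a correct answer phrased differently from the reference counts as a miss. This matters most for the third training example, whose reference answer is a full sentence.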

def evaluate_system(qa_module, test_cases):
    """Evaluate QA system performance"""
    correct = 0
    total = len(test_cases)
   
    for example in test_cases:
        prediction = qa_module(context=example.context, question=example.question)
        if accuracy_metric(example, prediction):
            correct += 1
   
    return correct / total


print("\n📊 Evaluation Results:")
print(f"Basic QA Accuracy: {evaluate_system(basic_qa, training_examples):.2%}")
print(f"Optimized QA Accuracy: {evaluate_system(optimized_qa, training_examples):.2%}")


print("\n✅ Tutorial Complete! Key DSPy Concepts Demonstrated:")
print("1. 🔤 Signatures - Defined input/output schemas")
print("2. 🏗️  Modules - Built composable QA systems")
print("3. 🔄 Self-correction - Implemented iterative improvement")
print("4. 🔍 RAG - Created retrieval-augmented generation")
print("5. ⚡ Optimization - Used BootstrapFewShot to improve prompts")
print("6. 📊 Evaluation - Measured system performance")
print("7. 🆓 Free API - Powered by Google Gemini 1.5 Flash")

We run an advanced RAG demo by asking several questions across different domains. For each question, the SimpleRAG system retrieves the most relevant context, then uses the self-correcting AdvancedQA module to generate a well-reasoned answer. We print the answers along with a preview of the reasoning, showing how DSPy combines retrieval and careful generation to provide reliable responses.
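A demo loop of the kind described here could look like the following minimal sketch. The function `run_demo`, its `preview_chars` parameter, and the stub system are hypothetical: the stub replaces the real `rag_system` so the block runs without an API key.

```python
from types import SimpleNamespace


def run_demo(qa_system, questions, preview_chars=100):
    """Ask each question and print the answer with a truncated reasoning preview."""
    results = []
    for question in questions:
        result = qa_system(question=question)
        print(f"Q: {question}")
        print(f"A: {result.answer}")
        print(f"Reasoning: {result.reasoning[:preview_chars]}...")
        results.append(result)
    return results


# Hypothetical stub standing in for rag_system; it makes no model calls.
def stub_system(question):
    return SimpleNamespace(
        answer=f"(answer to: {question})",
        reasoning="(model reasoning would appear here)",
    )


demo_results = run_demo(
    stub_system,
    ["What is the height of the Eiffel Tower?", "Who created Python?"],
)
```

In the notebook, passing `rag_system` instead of `stub_system` would send each question through retrieval and the self-correction loop, assuming the knowledge base has been filled in.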

In conclusion, we have demonstrated how DSPy can be used to build advanced QA pipelines. We saw how DSPy simplifies the design of intelligent modules with clear interfaces, supports self-correction loops, integrates basic retrieval, and enables prompt optimization with minimal code. With only a few lines, we configure and evaluate our models using real-world examples and measure the performance gains. This hands-on experience shows how DSPy, combined with Google's Gemini API, lets us prototype, test, and scale sophisticated language applications without boilerplate or complex logic.





Sana Hassan, a consulting intern at Marktechpost and dual-degree student at IIT Madras, is passionate about applying technology and AI to address real-world challenges. With a keen interest in solving practical problems, Sana brings a fresh perspective to the intersection of AI and real-life solutions.
