Tracking OpenAI Agents' Responses Using MLflow

by Brenden Burgess


MLflow is an open-source platform for managing and tracking machine learning experiments. When used with the OpenAI Agents SDK, MLflow automatically:

  • Logs all agent interactions and API calls
  • Captures tool usage, input/output messages, and intermediate decisions
  • Tracks runs for debugging, performance analysis, and reproducibility

This is particularly useful when you build multi-agent systems in which different agents collaborate or call functions dynamically.


In this tutorial, we will walk through two key examples: a simple handoff between agents, and the use of agent guardrails, all while tracing their behavior with MLflow.

Setting up the dependencies

Installing the libraries

pip install openai-agents mlflow pydantic python-dotenv

OpenAI API key

To get an OpenAI API key, visit https://platform.openai.com/settings/organization/api-keys and generate a new key. If you are a new user, you may need to add billing details and make a minimum payment of $5 to activate API access.

Once the key has been generated, create a .env file and add the following line:

OPENAI_API_KEY=<YOUR_API_KEY>

Replace <YOUR_API_KEY> with the key you generated.
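For illustration, here is a simplified sketch of what python-dotenv does with each KEY=VALUE line of the .env file when load_dotenv() runs. The parse_env_line helper is hypothetical, and the real library handles many more cases (export prefixes, inline comments, variable interpolation, multiline values):

```python
def parse_env_line(line: str):
    # Hypothetical helper: a simplified version of how a KEY=VALUE line
    # in a .env file is parsed into an environment variable pair.
    line = line.strip()
    if not line or line.startswith("#") or "=" not in line:
        return None
    key, _, value = line.partition("=")
    return key.strip(), value.strip().strip('"').strip("'")

print(parse_env_line('OPENAI_API_KEY="sk-your-key-here"'))
# → ('OPENAI_API_KEY', 'sk-your-key-here')
```

Once loaded, the OpenAI client picks the key up from the process environment, so the scripts below never need to handle it explicitly.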

Multi-agent system (multi_agent_demo.py)

In this script (multi_agent_demo.py), we build a simple multi-agent assistant using the OpenAI Agents SDK, designed to route user requests to either a coding expert or a cooking expert. We enable mlflow.openai.autolog(), which automatically traces and logs all agent interactions with the OpenAI API, including inputs, outputs, and agent handoffs, making it easy to monitor and debug the system. MLflow is configured to use a local file-based tracking URI (./mlruns) and records all activity under the experiment name "Agent-Coding-Cooking".

import mlflow, asyncio
from agents import Agent, Runner
import os
from dotenv import load_dotenv
load_dotenv()

mlflow.openai.autolog()                           # Auto-trace every OpenAI call
mlflow.set_tracking_uri("./mlruns")
mlflow.set_experiment("Agent-Coding-Cooking")

coding_agent = Agent(name="Coding agent",
                     instructions="You only answer coding questions.")

cooking_agent = Agent(name="Cooking agent",
                      instructions="You only answer cooking questions.")

triage_agent = Agent(
    name="Triage agent",
    instructions="If the request is about code, handoff to coding_agent; "
                 "if about cooking, handoff to cooking_agent.",
    handoffs=[coding_agent, cooking_agent],
)

async def main():
    res = await Runner.run(triage_agent,
                           input="How do I boil pasta al dente?")
    print(res.final_output)

if __name__ == "__main__":
    asyncio.run(main())

MLFLOW UI

To open the MLflow UI and view all logged agent interactions, run the following command in a new terminal:

mlflow ui

This will start the MLflow tracking server and print the URL and port where the UI is accessible, typically http://localhost:5000 by default.


We can see the entire interaction flow in the Traces section, from the user's initial input, to how the triage agent handed the request off to the appropriate agent, and finally to the response that agent generated. This end-to-end trace provides valuable insight into decision-making, handoffs, and outputs, helping you debug and optimize your agent workflows.
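Conceptually, the handoff performed by the triage agent can be modeled as a simple router. The sketch below is only an illustration of the flow; in the real system the LLM makes the routing decision from the triage agent's instructions, not from keyword matching:

```python
def triage(request: str) -> str:
    # Toy stand-in for the triage agent's routing decision. The real
    # triage agent hands off based on the LLM's reading of its
    # instructions; keyword matching here only illustrates the idea.
    coding_keywords = ("code", "python", "function", "bug")
    if any(k in request.lower() for k in coding_keywords):
        return "Coding agent"
    return "Cooking agent"

print(triage("How do I boil pasta al dente?"))
# → Cooking agent
```

With autolog enabled, each such routing decision appears as a span in the trace, so you can verify which agent actually handled the request.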

Guardrail tracing (guardrails.py)

In this example, we implement a customer support agent protected by an input guardrail, using the OpenAI Agents SDK with MLflow tracing. The agent is designed to help users with general queries but is restricted from answering medical questions. A dedicated guardrail agent checks incoming inputs and, if a medical question is detected, blocks the request. MLflow captures the entire flow, including the guardrail's activation, its reasoning, and the agent's response, offering complete traceability and insight into the safety mechanism.

import mlflow, asyncio
from pydantic import BaseModel
from agents import (
    Agent, Runner,
    GuardrailFunctionOutput, InputGuardrailTripwireTriggered,
    input_guardrail, RunContextWrapper)

from dotenv import load_dotenv
load_dotenv()

mlflow.openai.autolog()
mlflow.set_tracking_uri("./mlruns")
mlflow.set_experiment("Agent-Guardrails")

class MedicalSymptoms(BaseModel):
    medical_symptoms: bool
    reasoning: str


guardrail_agent = Agent(
    name="Guardrail check",
    instructions="Check if the user is asking you about medical symptoms.",
    output_type=MedicalSymptoms,
)


@input_guardrail
async def medical_guardrail(
    ctx: RunContextWrapper[None], agent: Agent, input: str
) -> GuardrailFunctionOutput:
    result = await Runner.run(guardrail_agent, input, context=ctx.context)

    return GuardrailFunctionOutput(
        output_info=result.final_output,
        tripwire_triggered=result.final_output.medical_symptoms,
    )


agent = Agent(
    name="Customer support agent",
    instructions="You are a customer support agent. You help customers with their questions.",
    input_guardrails=[medical_guardrail],
)


async def main():
    try:
        await Runner.run(agent, "Should I take aspirin if I'm having a headache?")
        print("Guardrail didn't trip - this is unexpected")

    except InputGuardrailTripwireTriggered:
        print("Medical guardrail tripped")


if __name__ == "__main__":
    asyncio.run(main())

This script defines a customer support agent with an input guardrail that detects medical questions. It uses a separate guardrail_agent to evaluate whether the user's input contains a request for medical advice. If such input is detected, the guardrail trips and prevents the main agent from responding. The entire process, including the guardrail check and its result, is automatically logged and traced by MLflow.
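The tripwire pattern above can be sketched without the SDK at all. The following is a stdlib-only illustration of the control flow, not the SDK's actual implementation; the class names are hypothetical stand-ins, and the keyword check stands in for the LLM-based guardrail agent:

```python
from dataclasses import dataclass
from typing import Any, Callable

@dataclass
class GuardrailResult:
    output_info: Any            # e.g. the guardrail agent's reasoning
    tripwire_triggered: bool

class TripwireTriggered(Exception):
    pass

def run_with_guardrail(user_input: str,
                       guardrail: Callable[[str], GuardrailResult],
                       handler: Callable[[str], str]) -> str:
    # The guardrail screens the input first; if its tripwire fires,
    # the main agent's handler is never called.
    result = guardrail(user_input)
    if result.tripwire_triggered:
        raise TripwireTriggered(result.output_info)
    return handler(user_input)

def medical_guardrail(text: str) -> GuardrailResult:
    # Stand-in for the LLM-based check: flag obviously medical wording.
    flagged = any(w in text.lower() for w in ("aspirin", "symptom", "headache"))
    return GuardrailResult(
        output_info="medical question detected" if flagged else None,
        tripwire_triggered=flagged,
    )

try:
    run_with_guardrail("Should I take aspirin for a headache?",
                       medical_guardrail,
                       lambda q: "support answer")
except TripwireTriggered as e:
    print("Guardrail tripped:", e)
```

In the real SDK, the InputGuardrailTripwireTriggered exception plays the role of TripwireTriggered here, which is why the main() function wraps Runner.run in a try/except.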

MLFLOW UI

To open the MLflow UI and view all logged agent interactions, run the following command in a new terminal:

mlflow ui


In this example, we asked the agent, "Should I take aspirin if I'm having a headache?", which triggered the guardrail. In the MLflow UI, we can clearly see that the input was flagged, along with the reasoning the guardrail agent provided to explain why the request was blocked.




I graduated in Civil Engineering (2022) from Jamia Millia Islamia, New Delhi, and I have a keen interest in data science, especially neural networks and their applications in various fields.
