In this tutorial, we will show how to create a knowledge graph from an unstructured document using an LLM. Although traditional NLP methods have long been used to extract entities and relationships, large language models (LLMs) like GPT-4o-mini make this process more accurate and context-aware. LLMs are particularly useful when working with messy, unstructured data. Using Python, Mirascope, and OpenAI's GPT-4o-mini, we will build a simple knowledge graph from a sample medical log.
Installing the dependencies
!pip install "mirascope[openai]" matplotlib networkx
OpenAI API key
To get an OpenAI API key, visit https://platform.openai.com/settings/organization/api-keys and generate a new key. If you are a new user, you may need to add billing details and make a minimum payment of $5 to activate API access.
import os
from getpass import getpass
os.environ["OPENAI_API_KEY"] = getpass('Enter OpenAI API Key: ')
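When the notebook is re-run or the code is moved into a script, it can be convenient to prompt only if the key is not already set. The following is a minimal sketch of that pattern; the helper name `ensure_api_key` and its `prompt_fn` parameter are hypothetical and not part of the original tutorial.

```python
import os
from getpass import getpass


def ensure_api_key(env_var: str = "OPENAI_API_KEY", prompt_fn=getpass) -> str:
    """Return the key stored in env_var, prompting for it only if it is missing.

    ensure_api_key and prompt_fn are illustrative names, not a library API.
    """
    key = os.environ.get(env_var, "").strip()
    if not key:
        # Nothing usable in the environment: ask interactively and store it.
        key = prompt_fn(f"Enter {env_var}: ").strip()
        if not key:
            raise ValueError(f"{env_var} must not be empty")
        os.environ[env_var] = key
    return key
```

Calling `ensure_api_key()` once at the top of the notebook leaves the key in `os.environ`, which is where the line above also stores it.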
Defining the graph schema
Before extracting information, we need a structure to represent it. In this step, we define a simple schema for our knowledge graph using Pydantic. The schema includes:
- Node: represents an entity with an ID, a type (such as “Doctor” or “Medicine”) and optional properties.
- Edge: represents a relationship between two nodes.
- KnowledgeGraph: a container for all nodes and edges.
from pydantic import BaseModel


class Edge(BaseModel):
    source: str  # id of the node the relationship starts from
    target: str  # id of the node the relationship points to
    relationship: str


class Node(BaseModel):
    id: str
    type: str
    properties: dict | None = None


class KnowledgeGraph(BaseModel):
    nodes: list[Node]
    edges: list[Edge]
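To make the schema concrete, the sketch below writes out a two-node graph as a plain dictionary with the same field names. The values are borrowed from the example in the extraction prompt later in this tutorial. Assuming Pydantic v2, a dictionary of this shape should be accepted by `KnowledgeGraph.model_validate(...)`, but the sketch itself uses only the standard library.

```python
import json

# Plain-dict form of a tiny graph that follows the schema:
# nodes carry id/type/properties, edges carry source/target/relationship.
example_graph = {
    "nodes": [
        {"id": "Mary", "type": "Patient", "properties": {}},
        {"id": "Fall Incident 1", "type": "Event", "properties": {"time": "3:45 AM"}},
    ],
    "edges": [
        {"source": "Mary", "target": "Fall Incident 1", "relationship": "reported"},
    ],
}

# Pretty-print the structure to see how nodes and edges fit together.
print(json.dumps(example_graph, indent=2))
```

Note that edges refer to nodes by their `id` string, so node ids need to be unique within a graph.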
Defining the patient log
Now that we have a schema, let's define the unstructured data we will use to generate our knowledge graph. Below is a sample patient log written in natural language. It contains key events, symptoms, and observations related to a patient named Mary.
patient_log = """
Mary called for help at 3:45 AM, reporting that she had fallen while going to the bathroom. This marks the second fall incident within a week. She complained of dizziness before the fall.
Earlier in the day, Mary was observed wandering the hallway and appeared confused when asked basic questions. She was unable to recall the names of her medications and asked the same question multiple times.
Mary skipped both lunch and dinner, stating she didn't feel hungry. When the nurse checked her room in the evening, Mary was lying in bed with mild bruising on her left arm and complained of hip pain.
Vital signs taken at 9:00 PM showed slightly elevated blood pressure and a low-grade fever (99.8°F). Nurse also noted increased forgetfulness and possible signs of dehydration.
This behavior is similar to previous episodes reported last month.
"""
Generating the knowledge graph
To turn unstructured patient logs into structured information, we use an LLM-powered function that extracts a knowledge graph. Each log entry is analyzed to identify entities (such as people, symptoms, and events) and their relationships (such as "reported" or "has symptom").
The generate_kg function is decorated with @openai.call, which uses the gpt-4o-mini model and the previously defined KnowledgeGraph schema as the response model. The prompt gives the model clear instructions on how to map the log to nodes and edges.
from mirascope.core import openai, prompt_template


@openai.call(model="gpt-4o-mini", response_model=KnowledgeGraph)
@prompt_template(
    """
    SYSTEM:
    Extract a knowledge graph from this patient log.
    Use Nodes to represent people, symptoms, events, and observations.
    Use Edges to represent relationships like "has symptom", "reported", "noted", etc.
    The log:
    {log_text}
    Example:
    Mary said help, I've fallen.
    Node(id="Mary", type="Patient", properties={{}})
    Node(id="Fall Incident 1", type="Event", properties={{"time": "3:45 AM"}})
    Edge(source="Mary", target="Fall Incident 1", relationship="reported")
    """
)
def generate_kg(log_text: str) -> openai.OpenAIDynamicConfig:
    return {"log_text": log_text}


kg = generate_kg(patient_log)
print(kg)
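The response model constrains the shape of the output, but the schema as defined does not check that every edge points at a node that actually exists. A small sanity check can catch that before the graph is used further. The sketch below is one possible check, not part of the original tutorial; the helper name `find_dangling_edges` is hypothetical, and it relies only on the attribute names from the schema (`nodes`, `edges`, `id`, `source`, `target`), so it is demonstrated here on stand-in objects rather than on real model output.

```python
from types import SimpleNamespace


def find_dangling_edges(kg):
    """Return edges whose source or target is not the id of any node in kg."""
    known_ids = {node.id for node in kg.nodes}
    return [
        edge
        for edge in kg.edges
        if edge.source not in known_ids or edge.target not in known_ids
    ]


# Stand-in objects with the same attribute names as the tutorial's schema.
# The second edge points at "Hip Pain", which has no matching node.
demo = SimpleNamespace(
    nodes=[SimpleNamespace(id="Mary"), SimpleNamespace(id="Dizziness")],
    edges=[
        SimpleNamespace(source="Mary", target="Dizziness", relationship="has symptom"),
        SimpleNamespace(source="Mary", target="Hip Pain", relationship="has symptom"),
    ],
)

for edge in find_dangling_edges(demo):
    print(f"dangling edge: {edge.source} -> {edge.target}")
```

Because the function only reads attributes, it should work the same way when called as `find_dangling_edges(kg)` on the graph returned by `generate_kg`.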
Querying the graph
Once the knowledge graph has been generated from the unstructured patient log, we can use it to answer medical or behavioral questions. We define a run() function that takes a natural-language question and the structured graph, and passes them into a prompt for the LLM to interpret and answer.
@openai.call(model="gpt-4o-mini")
@prompt_template(
    """
    SYSTEM:
    Use the knowledge graph to answer the user's question.
    Graph:
    {knowledge_graph}
    USER:
    {question}
    """
)
def run(question: str, knowledge_graph: KnowledgeGraph): ...


question = "What health risks or concerns does Mary exhibit based on her recent behavior and vitals?"
print(run(question, kg))
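The graph reaches the model as text inside the prompt, so its textual form affects both prompt length and readability. One alternative, which is not what the original tutorial does, is to flatten the edges into one line per relationship before passing them in. The sketch below shows such a flattening; `to_triples` is a hypothetical helper, it drops node types and properties, and using it would mean passing a string to `run` instead of the `KnowledgeGraph` object.

```python
from types import SimpleNamespace


def to_triples(kg) -> str:
    """Render each edge as one 'source -[relationship]-> target' line."""
    return "\n".join(
        f"{edge.source} -[{edge.relationship}]-> {edge.target}" for edge in kg.edges
    )


# Stand-in graph with the same edge attributes as the tutorial's schema.
demo = SimpleNamespace(
    edges=[
        SimpleNamespace(source="Mary", target="Fall Incident 1", relationship="reported"),
        SimpleNamespace(source="Mary", target="Dizziness", relationship="has symptom"),
    ]
)

print(to_triples(demo))
```

Whether the compact form helps or hurts answer quality would need to be checked on real questions, since information stored in node properties (such as times) is no longer visible to the model.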
Visualizing the graph
Finally, we use render_graph(kg) to generate a clear and interactive visual representation of the knowledge graph, helping us better understand the patient's condition and the links between the observed symptoms, behaviors, and medical concerns.
import matplotlib.pyplot as plt
import networkx as nx


def render_graph(kg: KnowledgeGraph):
    G = nx.DiGraph()
    # Add every node with its type as a label, plus any extra properties.
    for node in kg.nodes:
        G.add_node(node.id, label=node.type, **(node.properties or {}))
    # Add every relationship as a directed, labeled edge.
    for edge in kg.edges:
        G.add_edge(edge.source, edge.target, label=edge.relationship)

    plt.figure(figsize=(15, 10))
    pos = nx.spring_layout(G)
    nx.draw_networkx_nodes(G, pos, node_size=2000, node_color="lightgreen")
    nx.draw_networkx_edges(G, pos, arrowstyle="->", arrowsize=20)
    nx.draw_networkx_labels(G, pos, font_size=12, font_weight="bold")
    edge_labels = nx.get_edge_attributes(G, "label")
    nx.draw_networkx_edge_labels(G, pos, edge_labels=edge_labels, font_color="blue")
    plt.title("Healthcare Knowledge Graph", fontsize=15)
    plt.show()


render_graph(kg)
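Every node is drawn in the same color, which makes it hard to tell patients, symptoms, and events apart at a glance. A possible refinement, not part of the original tutorial, is to color nodes by their type. The sketch below only computes the color list and uses the standard library; the helper name `colors_by_type`, the palette, and the gray fallback for nodes without a known type are all illustrative choices.

```python
from itertools import cycle


def colors_by_type(
    node_ids,
    type_of,
    palette=("lightgreen", "lightblue", "salmon", "khaki", "plum"),
    default="lightgray",
):
    """Return one color per node id, giving each distinct node type its own color.

    node_ids: node ids in drawing order.
    type_of:  mapping from node id to node type; ids missing from it get `default`.
    """
    assigned = {}  # node type -> color, filled in first-seen order
    palette_iter = cycle(palette)  # reuse colors if there are more types than colors
    colors = []
    for node_id in node_ids:
        node_type = type_of.get(node_id)
        if node_type is None:
            colors.append(default)
            continue
        if node_type not in assigned:
            assigned[node_type] = next(palette_iter)
        colors.append(assigned[node_type])
    return colors


demo_types = {"Mary": "Patient", "Dizziness": "Symptom", "Hip Pain": "Symptom"}
print(colors_by_type(["Mary", "Dizziness", "Hip Pain", "Unknown"], demo_types))
# → ['lightgreen', 'lightblue', 'lightblue', 'lightgray']
```

Inside render_graph, one could build `type_of = {node.id: node.type for node in kg.nodes}` and pass `node_color=colors_by_type(list(G.nodes), type_of)` to `nx.draw_networkx_nodes`, which accepts a list of colors in node order.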
