Creating a Knowledge Graph Using an LLM

by Brenden Burgess


In this tutorial, we show how to create a knowledge graph from an unstructured document using an LLM. Traditional NLP methods have long been used to extract entities and relationships, but large language models (LLMs) such as GPT-4o-mini make this process more accurate and context-aware. LLMs are particularly useful when working with messy, unstructured data. Using Python, Mirascope, and OpenAI's GPT-4o-mini, we will build a simple knowledge graph from a sample medical log.

Installing the dependencies

!pip install "mirascope[openai]" matplotlib networkx

OpenAI API key

To get an OpenAI API key, visit https://platform.openai.com/settings/organization/api-keys and generate a new key. If you are a new user, you may need to add billing details and make a minimum payment of $5 to activate API access.

import os
from getpass import getpass
# os.environ is a mapping, so the key is assigned with square brackets.
os.environ["OPENAI_API_KEY"] = getpass('Enter OpenAI API Key: ')
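
When a notebook is re-run, it can be convenient to prompt only if the key is not already set. The sketch below assumes a hypothetical helper named ensure_api_key, which is not part of the original tutorial, Mirascope, or the OpenAI SDK. Its prompt_fn parameter exists only so the function can be exercised without an interactive prompt.

```python
import os
from getpass import getpass
from typing import Callable


def ensure_api_key(
    env_var: str = "OPENAI_API_KEY",
    prompt_fn: Callable[[str], str] = getpass,
) -> str:
    """Return the key stored in env_var, prompting only if it is missing.

    Hypothetical helper for illustration only.
    """
    key = os.environ.get(env_var, "").strip()
    if not key:
        # Nothing usable in the environment: ask once and remember the answer.
        key = prompt_fn(f"Enter value for {env_var}: ").strip()
        if not key:
            raise ValueError(f"{env_var} must not be empty")
        os.environ[env_var] = key
    return key
```

Calling ensure_api_key() once near the top of the notebook would then replace the direct assignment above.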

Defining the graph schema

Before extracting information, we need a structure to represent it. In this step, we define a simple schema for our knowledge graph using Pydantic. The schema includes:

  • Node: represents an entity with an ID, a type (such as “Doctor” or “Medicine”) and optional properties.
  • Edge: represents a relationship between two nodes.
  • KnowledgeGraph: a container for all nodes and edges.


from pydantic import BaseModel, Field

class Edge(BaseModel):
    source: str
    target: str
    relationship: str

class Node(BaseModel):
    id: str
    type: str
    properties: dict | None = None

class KnowledgeGraph(BaseModel):
    # Generic types are subscripted with square brackets, not called.
    nodes: list[Node]
    edges: list[Edge]

Defining the patient log

Now that we have a schema, let's define the unstructured data that we will use to generate our knowledge graph. Below is a sample patient log written in natural language. It contains key events, symptoms, and observations related to a patient named Mary.

patient_log = """
Mary called for help at 3:45 AM, reporting that she had fallen while going to the bathroom. This marks the second fall incident within a week. She complained of dizziness before the fall.

Earlier in the day, Mary was observed wandering the hallway and appeared confused when asked basic questions. She was unable to recall the names of her medications and asked the same question multiple times.

Mary skipped both lunch and dinner, stating she didn't feel hungry. When the nurse checked her room in the evening, Mary was lying in bed with mild bruising on her left arm and complained of hip pain.

Vital signs taken at 9:00 PM showed slightly elevated blood pressure and a low-grade fever (99.8°F). Nurse also noted increased forgetfulness and possible signs of dehydration.

This behavior is similar to previous episodes reported last month.
"""

Generating the knowledge graph

To transform unstructured patient logs into structured information, we use an LLM-powered function that extracts a knowledge graph. Each patient entry is analyzed to identify entities (such as people, symptoms, and events) and their relationships (such as “reported” or “has symptom”).

The generate_kg function is decorated with @openai.call, using the gpt-4o-mini model and the previously defined KnowledgeGraph schema as the response model. The prompt clearly instructs the model on how to map the log to nodes and edges.

from mirascope.core import openai, prompt_template

@openai.call(model="gpt-4o-mini", response_model=KnowledgeGraph)
@prompt_template(
    """
    SYSTEM:
    Extract a knowledge graph from this patient log.
    Use Nodes to represent people, symptoms, events, and observations.
    Use Edges to represent relationships like "has symptom", "reported", "noted", etc.

    The log:
    {log_text}

    Example:
    Mary said help, I've fallen.
    Node(id="Mary", type="Patient", properties={{}})
    Node(id="Fall Incident 1", type="Event", properties={{"time": "3:45 AM"}})
    Edge(source="Mary", target="Fall Incident 1", relationship="reported")
    """
)
def generate_kg(log_text: str) -> openai.OpenAIDynamicConfig:
    return {"log_text": log_text}

# Build the graph from the sample log and inspect the structured result.
kg = generate_kg(patient_log)
print(kg)
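
Because the schema does not force edges to point at existing nodes, it may be worth checking the extracted graph for edges that reference unknown IDs. The sketch below is one way to do that. The find_dangling_edges helper is hypothetical, and the dataclasses are lightweight stand-ins that mirror the field names of the tutorial's models so that the example runs without an API call.

```python
from dataclasses import dataclass


@dataclass
class Node:  # stand-in with the same id/type fields as the Pydantic Node
    id: str
    type: str


@dataclass
class Edge:  # stand-in with the same fields as the Pydantic Edge
    source: str
    target: str
    relationship: str


@dataclass
class KnowledgeGraph:  # stand-in container
    nodes: list
    edges: list


def find_dangling_edges(kg) -> list:
    """Return edges whose source or target is not the id of any node."""
    known_ids = {node.id for node in kg.nodes}
    return [
        edge
        for edge in kg.edges
        if edge.source not in known_ids or edge.target not in known_ids
    ]


# Invented sample: the second edge points at a node that was never declared.
sample = KnowledgeGraph(
    nodes=[Node("Mary", "Patient"), Node("Fall Incident 1", "Event")],
    edges=[
        Edge("Mary", "Fall Incident 1", "reported"),
        Edge("Mary", "Hip Pain", "has symptom"),
    ],
)

dangling = find_dangling_edges(sample)
for edge in dangling:
    print(f"dangling: {edge.source} -[{edge.relationship}]-> {edge.target}")
```

The function only reads .nodes, .edges, .id, .source, and .target, so it should work unchanged on the kg object returned by generate_kg.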

Querying the graph

Once the knowledge graph has been generated from the unstructured patient log, we can use it to answer medical or behavioral questions. We define a run() function that takes a natural-language question and the structured graph and passes them into a prompt for the LLM to interpret and answer.

@openai.call(model="gpt-4o-mini")
@prompt_template(
    """
    SYSTEM:
    Use the knowledge graph to answer the user's question.

    Graph:
    {knowledge_graph}

    USER:
    {question}
    """
)
def run(question: str, knowledge_graph: KnowledgeGraph): ...

# Ask a question that the model should answer from the graph alone.
question = "What health risks or concerns does Mary exhibit based on her recent behavior and vitals?"
print(run(question, kg))
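
The prompt above interpolates the KnowledgeGraph object directly, so the model sees its string form. One possible alternative, not used in the original tutorial, is to flatten the edges into one triple per line before building the prompt. The sketch below assumes a hypothetical helper named edges_to_triples and uses a NamedTuple stand-in for Edge. Whether this format improves answers has not been tested here.

```python
from typing import NamedTuple


class Edge(NamedTuple):  # stand-in with the same fields as the tutorial's Edge
    source: str
    target: str
    relationship: str


def edges_to_triples(edges) -> str:
    """Render each edge as '(source) -[relationship]-> (target)', one per line."""
    return "\n".join(
        f"({edge.source}) -[{edge.relationship}]-> ({edge.target})" for edge in edges
    )


# Invented sample edges, following the example in the extraction prompt.
edges = [
    Edge("Mary", "Fall Incident 1", "reported"),
    Edge("Mary", "Dizziness", "has symptom"),
]

triples_text = edges_to_triples(edges)
print(triples_text)
```

With the real graph, the call would be edges_to_triples(kg.edges), and the resulting string could be passed to a prompt in place of the graph object.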

Visualizing the graph

Finally, we use render_graph(kg) to generate a clear visual representation of the knowledge graph, helping us better understand the patient's condition and the links between the observed symptoms, behaviors, and medical concerns.

import matplotlib.pyplot as plt
import networkx as nx

def render_graph(kg: KnowledgeGraph):
    G = nx.DiGraph()

    for node in kg.nodes:
        G.add_node(node.id, label=node.type, **(node.properties or {}))

    for edge in kg.edges:
        G.add_edge(edge.source, edge.target, label=edge.relationship)

    plt.figure(figsize=(15, 10))
    pos = nx.spring_layout(G)
    nx.draw_networkx_nodes(G, pos, node_size=2000, node_color="lightgreen")
    nx.draw_networkx_edges(G, pos, arrowstyle="->", arrowsize=20)
    nx.draw_networkx_labels(G, pos, font_size=12, font_weight="bold")
    edge_labels = nx.get_edge_attributes(G, "label")
    nx.draw_networkx_edge_labels(G, pos, edge_labels=edge_labels, font_color="blue")
    plt.title("Healthcare Knowledge Graph", fontsize=15)
    plt.show()

render_graph(kg)
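
One edge case in render_graph is worth noting: the node type is passed as label= alongside **properties, so if the model ever extracts a property that is itself named "label", Python raises a TypeError for the duplicate keyword. The sketch below shows one way to merge the attributes first. The node_attributes helper is hypothetical, and add_node here is a tiny stand-in with the same keyword-argument shape as G.add_node so that the example runs without networkx.

```python
from typing import Optional


def node_attributes(node_type: str, properties: Optional[dict]) -> dict:
    """Merge extracted properties with the node type; the type wins on 'label'."""
    attrs = dict(properties or {})
    attrs["label"] = node_type
    return attrs


def add_node(node_id, **attrs):
    """Stand-in that accepts keyword attributes the way G.add_node does."""
    return node_id, attrs


# Invented properties that happen to contain a "label" key.
props = {"label": "first fall", "time": "3:45 AM"}

# Passing label= together with **props fails because "label" appears twice.
try:
    add_node("Fall Incident 1", label="Event", **props)
    collided = False
except TypeError:
    collided = True

# Merging first avoids the duplicate keyword.
safe = add_node("Fall Incident 1", **node_attributes("Event", props))
print(collided)
print(safe)
```

In render_graph, the equivalent call would be G.add_node(node.id, **node_attributes(node.type, node.properties)).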


I graduated in Civil Engineering (2022) from Jamia Millia Islamia, New Delhi, and I have a keen interest in data science, particularly neural networks and their applications in various fields.
