ai-knowledge-graph: The Intelligent Engine Transforming Text into Structured Knowledge
What is ai-knowledge-graph?
ai-knowledge-graph is an open-source project developed by Robert McDermott that aims to convert unstructured text—such as articles, reports, and books—into structured knowledge graphs. The system leverages large language models (LLMs) to extract subject-predicate-object (SPO) triples from text and visualizes the relationships between entities, helping users intuitively understand and analyze information.
Key Features
-
Text Segmentation:
Automatically splits large documents into manageable chunks to fit within the context window of LLMs. -
Knowledge Extraction:
Uses LLMs to extract subject-predicate-object triples from each text chunk, identifying entities and their relationships. -
Entity Normalization:
Ensures consistent naming of the same entities across the entire document to reduce ambiguity. -
Relationship Inference:
Infers implicit relationships not explicitly mentioned in the text to connect fragmented parts of the graph. -
Interactive Visualization:
Generates knowledge graphs in HTML format with features such as zooming, dragging, and community detection for enhanced user interaction. -
Multi-LLM Support:
Compatible with various OpenAI-compatible API endpoints, including Ollama, LM Studio, OpenAI, vLLM, LiteLLM, offering flexible model choices.
Technical Principles
-
Text Segmentation and Processing:
The input text is split into multiple chunks, each containing about 200 words with a 20-word overlap, to accommodate LLM context limits. -
SPO Triple Extraction:
Each text chunk is processed by an LLM to extract subject-predicate-object triples, forming the initial knowledge graph. -
Entity Normalization:
Recognized entities are standardized using LLM-based entity alignment to resolve inconsistent naming across text chunks. -
Relationship Inference:
Applies transitive closure and lexical similarity rules to infer relationships not explicitly stated, reducing graph fragmentation. -
Graph Visualization:
Uses the PyVis library to generate interactive HTML graphs supporting zoom, drag, and community detection, improving comprehension of knowledge structures.
Project Link
- GitHub Repository:
https://github.com/robert-mcdermott/ai-knowledge-graph
Application Scenarios
-
Academic Research:
Rapidly build knowledge graphs in research domains to assist literature reviews and research direction analysis. -
Enterprise Knowledge Management:
Integrate internal documents to construct knowledge graphs that enhance information retrieval and decision-making efficiency. -
Education and Training:
Transform educational materials into knowledge graphs to support teaching and learning. -
News Analysis:
Extract key information from news reports and build event relation graphs to assist public opinion analysis. -
Legal Field:
Analyze legal documents and construct case relation graphs to support legal research and case analysis.