🌍 Knowledge Graph Vol. 4: Making Neo4j Knowledge Graphs "Smart" with Taxonomies and Ontologies

Introduction

Hello. I am developing a farm work recording system on Neo4j and reading the "Knowledge Graph Construction Guide" alongside it.

Last time, I was impressed by the "self-explaining power" of knowledge graphs. This time, I look closely at the concrete construction principles that make that power possible, in particular the importance of taxonomy and ontology.

The system currently under development uses Neo4j's property graph model as its foundation and introduces "CAVOC," a farm work ontology, on top of it. In this article, I explore how these construction principles turn our graph into a "user-friendly system."


1. Property Graph Potential and Construction Principles as "Conventions"

The property graph model adopted by Neo4j is richer than a simple classical graph: nodes carry labels, relationships carry types, and both can hold properties (key-value attributes).

The following description regarding the flexibility of this model in the book was particularly striking:

"Importantly, some work can be accomplished just by leveraging the characteristics of the property graph model, without requiring prior domain knowledge." (Knowledge Graph Construction Guide, p. 12)

This shows the potential of graph databases: even at an early stage, value can be derived from the data just by using properties (attributes) and relationship types.
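As a rough illustration of what the structure alone can answer, here is a minimal in-memory sketch in Python. It does not use Neo4j, and every label, relationship type, and property name in it (`Person`, `WorkRecord`, `PERFORMED`, `IN_FIELD`, `minutes`, and so on) is an assumption made for this example, not the schema of the actual system.

```python
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class Node:
    node_id: str
    labels: set
    properties: dict = field(default_factory=dict)


@dataclass
class Relationship:
    start: str
    rel_type: str
    end: str
    properties: dict = field(default_factory=dict)


# Hypothetical sample data: one person, three work records, two fields.
nodes = [
    Node("p1", {"Person"}, {"name": "A"}),
    Node("w1", {"WorkRecord"}, {"work": "Rotary Tillage", "minutes": 90}),
    Node("w2", {"WorkRecord"}, {"work": "Manual Weeding", "minutes": 45}),
    Node("w3", {"WorkRecord"}, {"work": "Harvesting", "minutes": 120}),
    Node("f1", {"Field"}, {"name": "C"}),
    Node("f2", {"Field"}, {"name": "D"}),
]
relationships = [
    Relationship("p1", "PERFORMED", "w1"),
    Relationship("p1", "PERFORMED", "w2"),
    Relationship("p1", "PERFORMED", "w3"),
    Relationship("w1", "IN_FIELD", "f1"),
    Relationship("w2", "IN_FIELD", "f1"),
    Relationship("w3", "IN_FIELD", "f2"),
]

# Structure-only questions: no taxonomy or ontology is involved yet.
label_counts = Counter(label for n in nodes for label in n.labels)
rel_type_counts = Counter(r.rel_type for r in relationships)

# Degree = number of relationships touching each node.
degree = Counter()
for r in relationships:
    degree[r.start] += 1
    degree[r.end] += 1

most_connected = max(degree, key=degree.get)
```

Counting labels, relationship types, and node degrees needs no knowledge of what "tillage" or "weeding" means, which is the kind of early value the quotation describes.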

However, in large-scale systems or team development, this flexibility alone is not enough.

"For user-facing applications, construction principles are like conventions." (Knowledge Graph Construction Guide, p. 13)

These "conventions" are the ontology, which adds domain knowledge and consistency to the graph's structure, acting as a common language for data interpretation across the entire application.


2. Structure that Enables Inference: Investing in "Taxonomy"

A knowledge graph can provide "insights" beyond mere data retrieval because it incorporates a hierarchical classification system into its structure.

The Value of Expressive Classification (Taxonomy)

Categorizing data using expressive methods rather than just storing it can be considered a "valuable investment." At the heart of this is taxonomy.

Taxonomy is a hierarchical classification scheme that enables inferences such as "x is a type of y."

| Feature | Content | Implementation Example in Knowledge Graph |
| --- | --- | --- |
| Hierarchical Classification | Organizing categories into hierarchies from perspectives like broad/narrow or generalization/specialization. | Creating connections between nodes with relationships such as :SUB_CATEGORY_OF or :IS_A. |
| Providing Inference | Automatically determining the broader category (broad term) that a specific product (narrow term) belongs to. | Inference works such that a "query for all headphones" includes both "wired headphones" and "wireless headphones." |

Practical Example of Taxonomy (From the system currently under development):

Work (Broad term)
├── Tillage
│ ├── Rotary Tillage
│ └── Plow Tillage
├── Weeding
│ ├── Manual Weeding
│ └── Mechanical Weeding
└── Harvesting

With this hierarchical structure, simply querying for "work time of all tillage work" automatically includes both rotary tillage and plow tillage.
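The roll-up described here can be sketched in plain Python. This is only an illustration under stated assumptions: the `IS_A` dictionary mirrors the tree above as child-to-parent links, while the `records` list and its minute values are made up for the example. In Neo4j itself, the same idea would typically be expressed by following :IS_A relationships with a variable-length pattern rather than with a dictionary.

```python
# child -> parent, i.e. the direction of (child)-[:IS_A]->(parent)
IS_A = {
    "Tillage": "Work",
    "Weeding": "Work",
    "Harvesting": "Work",
    "Rotary Tillage": "Tillage",
    "Plow Tillage": "Tillage",
    "Manual Weeding": "Weeding",
    "Mechanical Weeding": "Weeding",
}


def is_a(term, ancestor):
    """Return True if `term` equals `ancestor` or lies anywhere below it."""
    while term is not None:
        if term == ancestor:
            return True
        term = IS_A.get(term)  # None once we step past the root
    return False


# Hypothetical work records: (work type, minutes worked).
records = [
    ("Rotary Tillage", 90),
    ("Plow Tillage", 60),
    ("Manual Weeding", 45),
    ("Harvesting", 120),
]


def total_minutes(category):
    """Sum minutes for every record whose work type falls under `category`."""
    return sum(minutes for work, minutes in records if is_a(work, category))


tillage_minutes = total_minutes("Tillage")  # 90 + 60 = 150
all_work_minutes = total_minutes("Work")    # 90 + 60 + 45 + 120 = 315
```

The records only ever mention narrow terms; the broad-term totals come entirely from walking the hierarchy.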

Integration of Multiple Hierarchical Structures

The true advantage of a knowledge graph lies in its ability to integrate multiple organizational hierarchical structures simultaneously, rather than forcing a single classification on the data. This allows for data analysis from different perspectives for each department or use case, leading to new insights.

Example: Multiple Taxonomies in the Farm Work Recording System

  1. Work Classification (CAVOC-compliant)
    • Soil Preparation → Tillage → Sowing → Management → Harvesting
  2. Geographical Classification
    • Region → Area → Field → Plot
  3. Material Classification
    • Agricultural Materials → Fertilizer → Organic Fertilizer → Compost
  4. Temporal Classification
    • Year → Season → Month → Week

By integrating these, complex queries such as "Soil preparation work using organic fertilizer in the O-chozu area during spring" become possible.
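A query like this amounts to applying one hierarchy test per dimension. The following Python sketch shows the idea with four independent child-to-parent maps. The field names ("Field 1", "Field 2"), the "Other Area" and "Chemical Fertilizer" entries, the month-to-season mapping, and the sample records are all assumptions for illustration, and treating Tillage as a narrower term of Soil Preparation is my reading of the list above, not a confirmed part of CAVOC.

```python
def make_is_a(parent_of):
    """Build an is_a(term, ancestor) test for one child -> parent hierarchy."""
    def is_a(term, ancestor):
        while term is not None:
            if term == ancestor:
                return True
            term = parent_of.get(term)
        return False
    return is_a


# One hierarchy per perspective (all entries are illustrative).
work_is_a = make_is_a({
    "Tillage": "Soil Preparation",
    "Rotary Tillage": "Tillage",
})
place_is_a = make_is_a({
    "Field 1": "O-chozu",
    "Field 2": "Other Area",
})
material_is_a = make_is_a({
    "Fertilizer": "Agricultural Materials",
    "Organic Fertilizer": "Fertilizer",
    "Chemical Fertilizer": "Fertilizer",
    "Compost": "Organic Fertilizer",
})
time_is_a = make_is_a({
    "April": "Spring",
    "August": "Summer",
})

# Hypothetical records that only mention narrow terms.
records = [
    {"id": 1, "work": "Rotary Tillage", "place": "Field 1",
     "material": "Compost", "month": "April"},
    {"id": 2, "work": "Rotary Tillage", "place": "Field 2",
     "material": "Compost", "month": "April"},
    {"id": 3, "work": "Tillage", "place": "Field 1",
     "material": "Chemical Fertilizer", "month": "April"},
    {"id": 4, "work": "Rotary Tillage", "place": "Field 1",
     "material": "Compost", "month": "August"},
]

# "Soil preparation work using organic fertilizer in the O-chozu area during spring"
matches = [
    r["id"]
    for r in records
    if work_is_a(r["work"], "Soil Preparation")
    and place_is_a(r["place"], "O-chozu")
    and material_is_a(r["material"], "Organic Fertilizer")
    and time_is_a(r["month"], "Spring")
]
```

Only record 1 satisfies all four conditions: record 2 is outside the area, record 3 uses a non-organic fertilizer, and record 4 falls in summer. Because each hierarchy is kept separate, a new perspective can be added without touching the others.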


3. CAVOC Ontology: The "Pillar of Classification" in the Farm Work Recording System

In the farm work recording system currently under development, "CAVOC" fulfills precisely this role of taxonomy and construction principles.

| Element | Role in the System | Correspondence to Knowledge Graph Theory |
| --- | --- | --- |
| Neo4j | Infrastructure for storing data and relationships | Property graph model |
| CAVOC | Defines relationships between farm work, resources, fields, etc. | Ontology, Taxonomy (Construction Principle) |

Introducing CAVOC takes the system beyond a simple record (property graph) stating that "Person A performed work B in field C" and enables capabilities such as the following:

  1. Inference: The data structure itself makes it possible to infer that "rotary tillage (narrow term) is always a type of tillage work (broad term)."
  2. Extensibility: More complex data analysis becomes possible by integrating hierarchical structures (e.g., fertilizer classification, pest classification) from other organizations (e.g., agricultural cooperatives or research institutions).
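Integrating a hierarchy from another organization can be pictured as merging two child-to-parent maps and then inferring across the combined chain. The sketch below is a simplified assumption of how that might look: `merge_taxonomies` and `ancestors` are hypothetical helpers, and the "external" fertilizer entries (including "Bark Compost") are invented sample data rather than any real cooperative's classification.

```python
def merge_taxonomies(base, extra):
    """Merge two child -> parent maps, rejecting contradictory parents."""
    merged = dict(base)
    for child, parent in extra.items():
        if child in merged and merged[child] != parent:
            raise ValueError(
                f"conflicting parent for {child!r}: "
                f"{merged[child]!r} vs {parent!r}"
            )
        merged[child] = parent
    return merged


def ancestors(term, parent_of):
    """Return all broader terms of `term`, nearest first; fail on cycles."""
    chain = []
    seen = {term}
    while term in parent_of:
        term = parent_of[term]
        if term in seen:
            raise ValueError(f"cycle detected at {term!r}")
        seen.add(term)
        chain.append(term)
    return chain


# Our own material hierarchy (top levels only).
own = {
    "Fertilizer": "Agricultural Materials",
    "Organic Fertilizer": "Fertilizer",
}
# Hypothetical finer-grained classification from another organization.
external = {
    "Compost": "Organic Fertilizer",
    "Bark Compost": "Compost",
}

merged = merge_taxonomies(own, external)
bark_compost_ancestors = ancestors("Bark Compost", merged)
```

After the merge, a term that exists only in the external classification ("Bark Compost") is still reachable from our own top-level category, so existing broad-term queries pick it up without changes. The conflict and cycle checks are one possible safeguard; a real integration would likely need term mapping between vocabularies as well.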

In conclusion, I am convinced that combining the flexibility of the property graph with the strict conventions and classification that CAVOC provides will turn our system into a truly user-friendly, insight-rich knowledge graph. I will continue development with this theoretical grounding in mind.

References

Knowledge Graph Construction Guide. Authors: Jesús Barrasa, Jim Webber
https://book.mynavi.jp/ec/products/detail/id=144556
