Translated by AI
🎬 Data Modeling — Prologue: Behind the Scenes of the Three-Schema Architecture
Preface: This is a Story. But it's Built on Facts.
What is written here is a blend of "facts + story."
Facts are explicitly marked as citations,
while the story parts are fiction built on imagination and inference.
However —
this story is not just an old tale.
In the world of data, for a long time:
- Modeling matured as a set of company-specific approaches, spreading as de facto practice
- SQL was established as a "standardized common language"
- RDBMS products differentiated themselves through "proprietary paths," creating axes for lock-in
Evolution has proceeded along these three slightly twisted pillars.
As a result,
a state where "meaning, structure, and reality do not face the same direction" continues today.
By reading this as a story,
I hope you can sense even a little of the background
behind why data modeling remains "fragmented" to this day
and why there is room for a new reorganization.
Chapter 1: Before 1975 — Silent Crustal Movements
■1960s–1970s: The Era of File Systems
Hierarchical DBMS (IMS) and network DBMS (CODASYL) were mainstream.
In the cold machine room, the exhaust heat of the mainframes slowly vibrated the laboratory floor.
People standing before giant computers still treated "data" as a single lump.
A world closely tied to hierarchies, networks, and programs.
Data was not something to be "managed," but something "embedded."
In the field, data structure was the very way programs were written.
Every time the structure changed, the application had to be rewritten.
If the person in charge changed, the meaning of the data vanished.
Something was wrong —
but no one could yet put that "wrongness" into words.
Meanwhile, there were those moving quietly.
They thought about separating "what data is" from the program.
That would become the spark that later ignited a revolution.
Chapter 2: Before 1980 — Collision of Two Ideologies
2.1 ANSI's "Three-Schema Architecture"
■1975, ANSI/SPARC Committee
The committee released the "Three-Schema Architecture (External, Conceptual, Internal)."
It is frequently cited in later years as a theoretical model aiming for data independence.
The air in the meeting room was dry.
Theoreticians were trying to layer data.
External, Conceptual, Internal.
Independence of structure.
Separation of application and data.
It was a "grand theory."
However, there were no diagrams or tools.
It was a map of a far, far distant future that showed no sign of working even if handed to practitioners.
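The layering the committee proposed can still be sketched in modern terms. Below is a minimal, hypothetical Python sketch (names like `Employee` and `payroll_view` are illustrative, not from any standard): the external view depends only on the conceptual model, so the internal layout can change without touching applications.

```python
from dataclasses import dataclass

# Conceptual schema: what an "employee" means, independent of storage or views.
@dataclass
class Employee:
    emp_id: int
    name: str
    salary: int

# Internal schema: one possible physical layout (here, a dict keyed by id).
_storage: dict[int, tuple[str, int]] = {
    1: ("Ada", 5000),
    2: ("Grace", 6000),
}

def load_employees() -> list[Employee]:
    """Map the internal layout onto the conceptual model."""
    return [Employee(i, n, s) for i, (n, s) in _storage.items()]

# External schema: a view tailored to one application (payroll needs no names).
def payroll_view() -> list[tuple[int, int]]:
    return [(e.emp_id, e.salary) for e in load_employees()]
```

Swapping `_storage` for a file or a B-tree would change only `load_employees`; `payroll_view` and its callers stay untouched. That is the "data independence" the theory was after.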
2.2 The Following Year, ER Diagrams Change the World
■1976, Peter Chen
Chen published the "Entity–Relationship model" in a paper.
The following year, 1976.
A single diagram appeared.
Though it was a diagram born from a researcher's paper,
it was a diagram the field could use "as is."
Entities.
Relationships.
Attributes.
Drawing them all on one sheet.
Concepts and logic melting into a single piece of paper.
People called it an "ER diagram."
In this moment, the world was quietly split in two.
- ANSI: Theoretical model (not used)
- ER: Field tool (exploded in popularity)
The two ideologies began to advance, passing each other by.
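What made the diagram usable "as is" was the small vocabulary behind it. A hypothetical illustration of Chen's three building blocks rendered as plain data (this is a sketch for intuition, not Chen's notation):

```python
from dataclasses import dataclass, field

# A minimal, hypothetical rendering of the ER vocabulary as data.
@dataclass
class Entity:
    name: str
    attributes: list[str] = field(default_factory=list)

@dataclass
class Relationship:
    name: str
    left: Entity
    right: Entity

customer = Entity("Customer", ["customer_id", "name"])
order = Entity("Order", ["order_id", "date"])
places = Relationship("places", customer, order)

def describe(r: Relationship) -> str:
    """One line of the 'single sheet': entity, relationship, entity."""
    return f"{r.left.name} --{r.name}--> {r.right.name}"
```

Three concepts, one picture: that economy is what let the field adopt it without waiting for a standard.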
2.3 Why It Wasn't Standardized
Normally, it would not have been strange for ER diagrams to be standardized,
because the ANSI Three-Schema theory and the ER model do not essentially compete with each other.
- ANSI: The philosophy of data independence
- ER: The technique of modeling
Originally, they could have been integrated.
However —
ANSI did not do that.
If they had integrated, the ANSI Three-Schema would have been "swallowed by ER."
The Three-Schema Architecture was handed down as theory,
but it sank into a "textbook concept" that no one in the field used,
while ER kept winning on the ground.
That misalignment was the beginning of the "fragmentation" that would last for the next 50 years.
Chapter 3: 1980–2000 — SQL Standardization and the Rise of Vendors
3.1 The Beginning of SQL Standardization
■1986, First Edition of ANSI SQL (SQL-86) Standardized
SQL was adopted by Oracle, IBM, Ingres, and others, and the RDB era began in earnest.
When SQL standardization began,
DB vendors rushed into an "implementation race" at a furious pace.
- Oracle
- IBM DB2
- Sybase
- Ingres
- Informix
Each developed their own proprietary extensions,
and "Standard SQL" became a mere formality.
At this time,
if ER had been standardized,
the story might have been a little different.
3.2 Proprietary Vendor Features Become the "Key"
RDB vendors added proprietary features for performance improvement and lock-in.
- Proprietary data types
- Proprietary functions
- Proprietary transaction control
- Proprietary optimizations
- Proprietary DDL extensions
Once customers chose a vendor, they couldn't leave.
The era of so-called vendor lock-in began.
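One concrete face of that lock-in: even the trivial intent "take the first N rows" diverged by dialect, and standard row limiting (FETCH FIRST) was only codified much later, in SQL:2008. A minimal sketch of the switch every application layer ended up carrying:

```python
# The same intent, "first 10 rows of a table", as it historically diverged
# across major dialects before SQL:2008 standardized row limiting.
ROW_LIMIT_SYNTAX = {
    "oracle":    "SELECT * FROM t WHERE ROWNUM <= 10",
    "db2":       "SELECT * FROM t FETCH FIRST 10 ROWS ONLY",
    "sqlserver": "SELECT TOP 10 * FROM t",
    "mysql":     "SELECT * FROM t LIMIT 10",
}

def limit_query(dialect: str) -> str:
    """Pick the vendor-specific phrasing for a row-limited query."""
    return ROW_LIMIT_SYNTAX[dialect]
```

Multiply this by data types, functions, and transaction semantics, and migrating away from a vendor became a rewrite, not a port.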
3.3 ER Diagrams Became "Diagramming Tools"
ER diagrams essentially spread in tandem with modeling tools.
However, the tools —
packed logical and physical aspects onto a single sheet,
and even began to draw indexes into the diagrams.
The diagrams became heavy,
and instead of being "diagrams that convey the original intent,"
they became "diagrams that output the implementation structure."
ER diagrams,
while forgetting their original purpose,
were dragged along by SQL and proprietary vendor features.
Chapter 4: From 2000 Onwards — The Rise of OSS and the Return of Chaos
4.1 The OSS Counterattack
■2005 onwards: PostgreSQL/MySQL spread to medium and large scales
OSS permeated rapidly, bolstered by the push of the cloud era.
Commercial DB prices continued to skyrocket.
Specifically, a certain Company O never softened its aggressive stance on licensing structures and maintenance costs, and user companies gradually grew exhausted.
As a reaction to that, the adoption of OSS proceeded at an accelerated pace.
4.2 Modeling Fragments Even Further
- NoSQL
- Document-oriented
- Column-oriented
- KVS
- Graph DB
- Cloud-native DB
"Correct modeling" no longer existed.
ER diagrams became "classics" in many areas, and the ANSI three-schema architecture became a "philosophy in textbooks."
The world moved on, carrying the legacy of an era in which modeling was never standardized.
Chapter 5: Epilogue — At the End of 50 Years of Misalignment
The 50 years from 1975 to 2025.
Data modeling was never once standardized.
The reason is simple:
Because "no one was looking in the same direction."
- ANSI: Protecting the philosophy
- ER: Saving the field
- Vendors: Competing with proprietary features
- Tools: Becoming devices that churn out implementations
And today,
the complexity of data structures has increased,
and modeling has regressed to being an "individual skill."
But —
it is precisely now that there is room for reconstruction.
Let's discuss that "next destination" in the main part of this series.