iTranslated by AI
Why ER Diagrams Quickly Become Outdated: Sustainable Practices for Real-World Development
Introduction
ER diagrams are often created at the very beginning of many projects. However, in practice, it is not uncommon for ER diagrams to stop being updated over time, eventually becoming a document that no one trusts.
In this article, I will organize the reasons why ER diagrams tend to become outdated and how to manage them in a way that is less likely to break in real-world scenarios. I will also explain a specific reading order to understand a database, which can be used regardless of whether you have history tables or not.
Design diagrams decay at different rates depending on their subject
Not all state transition diagrams become outdated at the same speed.
- Smartphone/Front-end screen states and UI transitions: Relatively long-lasting as long as the basic structure of screens and operations remains unchanged.
- DB design states and ER diagrams: Tend to become outdated frequently due to frequent column additions, attribute splitting, constraint changes, and table increases.
Based on this difference, it is realistic to operate DB-related diagrams not as "precise static documents," but as "regenerated reference materials."
ER diagrams become difficult to maintain because they are "heavy," not "weak"
It is not that ER diagrams themselves are bad, but the reality is that maintenance costs tend to skyrocket during periods of frequent change.
There are three main reasons for this:
- High layout adjustment costs.
- Incompatibility with high model change frequency.
- Tendency to diverge from the implementation.
1. High layout adjustment costs
Every time a table or FK is added, the layout is easily broken, often leading to the following situations:
- Boxes overlapping.
- Lines crossing, making it difficult to read.
- Need for manual rearrangement.
2. Incompatibility with high model change frequency
ER diagrams are effective when the domain is stable. On the other hand, during the initial design phase or when there are large requirement changes, updating the diagrams cannot keep up, and they quickly become outdated.
- Stable phase: ER diagrams are effective.
- Variable phase: ER diagrams tend to become obsolete.
3. Tendency to diverge from the implementation
Actual changes are concentrated in the following:
schema.sql- migration
- The actual DB
If the ER diagram is not synchronized with these, you end up with a state where only the diagram is incorrect.
Practical operations that are less likely to break
1. Assume automatic generation for ER diagrams
By simply not maintaining ER diagrams through manual editing, the operational burden is significantly reduced.
SchemaSpyMySQL WorkbenchDBeaverdbdocstbls
tbls doc mysql://user:pass@localhost/db
With this approach, you can regenerate the ER diagrams, table definitions, and HTML documents all at once.
2. Separate the overview from domain diagrams
If you consolidate a large-scale database into a single diagram, readability will drop. In practice, it is easier to maintain if you divide them as follows:
- user domain
- wallet domain
- trade domain
- campaign domain
3. Treat SQL and migrations as the source of truth
An operation that is hard to break is as follows:
- Make
schema.sqland migration files the primary information. - Treat the ER diagram as a generated product.
- Regenerate when necessary.
How to read a database in practice
This is the most important part of this article. Understanding not just the ER diagram, but the transitions of states will help you grasp the structure faster.
When there are history tables
First, read it along the "flow of time."
Points to check are as follows:
- Events:
event_type,occurred_at,actor_id,reason - History:
start_date,end_date,is_current,from_xxx,to_xxx - Current values: Which history record was folded into this result?
Accepting end_date NULL as a "necessary evil"
While the ideal is to be completely immutable (stacking events and recalculating the current value), applying this to all business domains can make implementation and operation costs too heavy.
Therefore, in practice, it is common to allow end_date NULL in period tables, and define NULL as "currently ongoing."
Examples of application:
- Company affiliation (
company_assignment_end_date) - Department/Section/Team assignment (
assigned_*_end_date) - Project assignment (
assignment_project_end_date) - Assumption of position (
assumption_of_position_end_date)
Points to note:
- Searches using
end_date IS NULLwill increase. - Queries can easily become scattered unless index design and SQL are unified.
- Integrity of start/end dates (
end_date IS NULL OR start_date <= end_date) should be enforced with CHECK constraints.
Cases where it is suitable / unsuitable (Distinction from complete immutability)
Cases where complete immutability (Event Sourcing type) is suitable
- Audit requirements are strict, and it is necessary to record who, when, and why strictly.
- Past reproducibility is paramount, such as in finance, billing, and accounting.
- The organization can maintain an operation where corrections are handled via "compensating events."
- The operational costs of event stores/projection tables are acceptable.
Cases where complete immutability is unsuitable (tends to become too heavy)
- You want to reduce operational costs for learning purposes or within small teams.
- The primary requirement is to quickly reference the "current affiliation/state."
- Business changes are frequent, and you want to prioritize implementation speed.
Cases where period tables (start_date / end_date NULL) are suitable
- Handling "from when to when" for affiliations or assignments is sufficient.
- You want to balance current value searches with history searches.
- You want to minimize implementation and maintenance costs while ensuring minimum auditability.
Decision criteria in practice
- Start with the period table approach, and extract only those domains with strict audit requirements into a completely immutable model.
- Do not attempt to make all domains immutable at once; apply it incrementally starting from areas with high business impact.
Mini example: How to read employee status
When reading this relationship, it is less confusing if you look at it in the following order:
- Check what happened in
employee_events. - Check the status transition in
employee_status_histories. - Check the current status of
employeesas the result.
When there are no history tables (UPDATE-oriented DBs)
Even in designs without history, you can establish a reading order.
For users.status, for example, organize it as follows:
- States:
active,suspended,banned - Transitions:
active -> suspended,suspended -> active,active -> banned - Update processes:
SuspendUser(),ActivateUser(),BanUser() - Table:
users.status
Advantages of this reading method
- Easier to map business flows to DB updates.
- Easier to trace why a state is what it is currently.
- Easier to detect inconsistencies.
- Easier to create testing perspectives.
For example, the following perspectives arise naturally:
- Does the leave event change the status to
leave? - Does the return-to-work event change
leave -> active? - Is current-equivalent data closed upon resignation?
- Is
bannednot updated via prohibited paths?
Summary
The problem with ER diagrams lies in the operational cost rather than the diagram itself. There are three key points for making them easier to handle in practice:
- Lean towards automatic generation for ER diagrams.
- Separate the overview from domain diagrams.
- Understand the DB in the order of State -> Transition -> Update -> Table.
By treating ER diagrams as tools for understanding and SQL/migrations as the primary information, it becomes easier to keep design and implementation consistent.
Discussion