
Comparing Approaches for Database Schema Change Management


Introduction

There are several ways to manage database schema changes. There is no single "correct" answer; which approach fits best depends on team size and the phase of operation. Here, I organize representative methods commonly seen in the field from a practical perspective.

Relying on Frameworks (ORM/Migration Tools)

Representative examples of this approach are Rails, Django, and Laravel migrations, GORM's AutoMigrate, Hibernate's schema generation, and so on.

Pros

  • Routine changes can be advanced quickly. Adding columns or tables can be completed with a command.
  • Procedures are easy to align within the team. Differences between developers are less likely to occur.
  • Change management and rollback mechanisms can be used, ensuring high reproducibility of development environments.
  • Easy to combine with CI/CD, making the construction of verification environments more stable.

Cons

  • Some changes are difficult to handle with automatic generation alone. Complex DDL or performance requirements often require manual work.
  • Behavior is influenced by the version of the ORM or tool. Unexpected differences may appear during updates.
  • Generated SQL is not always optimal, and situations requiring manual SQL adjustment will remain.
  • Relying heavily on automatic application in production carries the risk of unintended changes being introduced.

Explicitly Managing Migration Files (SQL or DSL)

This approach uses tools such as Flyway, Liquibase, sql-migrate, and migrate.

Pros

  • Strong for auditing and troubleshooting since change history can be preserved as files.
  • Since SQL can be written explicitly, it is easy to handle complex DDL and index optimization.
  • Once policies for up/down scripts and re-execution are established, drift between environments can be kept in check.
  • Changes bound for production can be reviewed beforehand, making it easier to operate safely.

Cons

  • As the number of files grows, organization becomes necessary; naming conventions and directory design are essential.
  • Operations break down easily if the team's understanding of SQL is shallow, and ORM-centric cultures face an adoption barrier.
  • Initialization procedures, including version tables, are required for setting up new environments.
  • Confusion can easily occur if tool selection and operation rules are not solidified beforehand.
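
As a rough illustration of the version-table mechanism mentioned above, here is a minimal sketch in Python using sqlite3. The `schema_migrations` table name and the inline `MIGRATIONS` list are illustrative assumptions; real tools such as Flyway or sql-migrate use their own metadata tables and load SQL from versioned files.

```python
import sqlite3

# Illustrative migrations: (version, SQL). Real tools load these from files
# named with a version prefix rather than from an in-code list.
MIGRATIONS = [
    ("001_create_users", "CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)"),
    ("002_add_email", "ALTER TABLE users ADD COLUMN email TEXT"),
]

def migrate(conn: sqlite3.Connection) -> list[str]:
    # Record applied versions in a metadata table so reruns are no-ops.
    conn.execute(
        "CREATE TABLE IF NOT EXISTS schema_migrations (version TEXT PRIMARY KEY)"
    )
    applied = {row[0] for row in conn.execute("SELECT version FROM schema_migrations")}
    ran = []
    for version, sql in MIGRATIONS:
        if version in applied:
            continue  # already applied in this environment
        conn.execute(sql)
        conn.execute("INSERT INTO schema_migrations (version) VALUES (?)", (version,))
        ran.append(version)
    conn.commit()
    return ran

conn = sqlite3.connect(":memory:")
first = migrate(conn)   # fresh database: both migrations run
second = migrate(conn)  # second run is a no-op: []
```

The point of the sketch is the "initialization including a version table" cost from the cons list: every new environment needs the metadata table bootstrapped before any migration can be tracked.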

Manually Managing SQL Files in Folders

Examples include managing files with dates, such as /db/sql/20240201_add_user_table.sql, or creating folders for each version, like /db/sql/3.3.0.0/, and gathering the SQL to be applied for that version.

In some cases, operations involve updating a master file representing the latest state in addition to the incremental SQL files.
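
To make the folder-based flow concrete, here is a hedged sketch that applies date-prefixed .sql files in lexicographic (and therefore chronological) order. The function name and file names are made up for illustration; note there is no version table, so tracking what has already been applied stays a manual concern.

```python
import sqlite3
import tempfile
from pathlib import Path

def apply_sql_folder(conn: sqlite3.Connection, folder: Path) -> list[str]:
    """Apply every *.sql file in the folder in sorted (date-prefix) order."""
    applied = []
    for path in sorted(folder.glob("*.sql")):  # 20240201_... sorts before 20240315_...
        conn.executescript(path.read_text())
        applied.append(path.name)
    conn.commit()
    return applied

# Demo: two files written out of order are still applied in date order.
with tempfile.TemporaryDirectory() as tmp:
    folder = Path(tmp)
    (folder / "20240315_add_orders.sql").write_text("CREATE TABLE orders (id INTEGER);")
    (folder / "20240201_add_user_table.sql").write_text("CREATE TABLE users (id INTEGER);")
    conn = sqlite3.connect(":memory:")
    applied = apply_sql_folder(conn, folder)
```

Because the script blindly re-runs everything, a second invocation against the same database would fail on the CREATE TABLE statements; in practice, someone has to remember which files each environment has already seen.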

Pros

  • The mechanism is simple, and the introduction cost is low.
  • It does not depend on specific tools, so it can be used regardless of the language or execution environment.
  • Since everything is expressed in SQL, it is easy to perform fine-grained control.
  • It aligns well with existing infrastructure operations and is easy to adopt in teams already familiar with this style.

Cons

  • Application order and tracking of which scripts have already been run tend to depend on individuals.
  • Rollbacks also require manual intervention, increasing the recovery burden in the event of a failure.
  • It tends to become difficult to identify discrepancies between environments.
  • If review criteria are not standardized, quality can fluctuate and knowledge can easily become siloed.
  • In operations where both a master file and incremental files are updated, there is a risk of updating one and forgetting the other.

Not Managing Migration Files (Recreating the Schema Every Time)

This method involves having only the latest schema, such as schema.sql, and rebuilding the environment using drop/create for each change.

The choice of this method is often strongly influenced by past operational experience. For example, in projects that have used Laravel migrations for a long time, problems such as the latest schema state becoming difficult to track or migration execution times increasing tend to occur. While there are methods to periodically consolidate and organize migration files, that task itself can be burdensome. As a result, some cases shift to an operation where "only the master file is treated as the source of truth." In practice, some teams operate by taking a dump after confirming changes in a QA environment, updating the master, and then reflecting it in each environment.
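
A minimal sketch of this rebuild-from-master operation, with `SCHEMA_SQL` standing in for the contents of a schema.sql file (the names and DDL here are illustrative):

```python
import sqlite3

# Stand-in for schema.sql: the single "latest state" master file.
SCHEMA_SQL = """
DROP TABLE IF EXISTS users;
CREATE TABLE users (
    id INTEGER PRIMARY KEY,
    name TEXT NOT NULL,
    email TEXT
);
"""

def rebuild(conn: sqlite3.Connection) -> None:
    # drop/create from the master file: no history, no incremental steps.
    conn.executescript(SCHEMA_SQL)
    conn.commit()

conn = sqlite3.connect(":memory:")
rebuild(conn)  # initial setup is just "apply the latest schema once"
conn.execute("INSERT INTO users (name) VALUES ('alice')")
rebuild(conn)  # a "change" is another rebuild; existing rows are gone
count = conn.execute("SELECT COUNT(*) FROM users").fetchone()[0]
```

The demo makes the core trade-off visible: setup is trivially simple, but any data in the rebuilt tables is discarded, which is why this only works where data can be thrown away or migrated by a separate process.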

Pros

  • Operational overhead is low since managing history files is unnecessary.
  • When setting up a new environment, applying the latest schema once is sufficient.
  • Issues originating from history, such as missing applications or skipped sequence numbers, are less likely to occur.
  • In PoCs or short-lived products, it is easier to achieve high speed through trial and error.

Cons

  • In environments with large production data, an operation based on rebuilding is not realistic.
  • It is difficult to track the reasons and processes of changes, making it weak for audit compliance.
  • It is not suitable for zero-downtime migrations or gradual incremental applications.
  • Data migration needs to be managed separately, which can result in more complex operations.
  • Even for small changes like adding a column, if the operation involves taking a full dump from the QA environment and reflecting it, the application process can take a long time.

Which One Should You Choose?

  • Validation phase / short-lived products: No file management (schema reconstruction). Easy to proceed with speed as the top priority.
  • Ultra-small scale (1-3 people): Manual SQL management. You can iterate quickly with minimal mechanisms.
  • Teams of 3-20 people: Migration file management. Offers a good balance of reproducibility, auditability, and history management.
  • Large-scale / long-term operation: Migration file management + CI/CD automatic application. Easier to balance stable operation and change control.

In practice, a realistic flow is to start with manual SQL management and then transition to migration file management once the team or product grows larger.

Summary

  • Relying on frameworks makes it easy to achieve development speed, but caution is needed regarding transparency in production operations.
  • Migration file management is easy to use as a standard solution, but it is predicated on establishing rules.
  • Manual SQL management is easy to introduce, but it tends to break down as the scale increases.
  • Not managing files (reconstruction type) provides fast initial speed but is not suitable for long-term operation or auditing.

Ultimately, making a choice based on team size, operation period, audit requirements, and the complexity of schema changes is the best way to avoid failure.
