iTranslated by AI

The content below is an AI-generated translation. This is an experimental feature, and may contain errors. View original article
📸

Understanding Metadata Locks in MySQL Online DDL for Safe ALTER TABLE Execution

に公開

Recently, I've been spending my spare time at work almost exclusively on activities to add NOT NULL constraints.

Now that this effort has reached a milestone, I'd like to introduce that initiative and what I learned about MySQL's Online DDL during the process.

Environment

  • MySQL 8.0.32
  • Transaction Isolation Level: REPEATABLE READ
mysql> select @@version;
+-----------+
| @@version |
+-----------+
| 8.0.32    |
+-----------+
1 row in set (0.00 sec)

mysql> SELECT @@GLOBAL.transaction_isolation, @@transaction_isolation;
+--------------------------------+-------------------------+
| @@GLOBAL.transaction_isolation | @@transaction_isolation |
+--------------------------------+-------------------------+
| REPEATABLE-READ      | REPEATABLE-READ         |
+--------------------------------+-------------------------+
1 row in set (0.00 sec)

Introduction of Terms

Regarding English terms that are likely to appear casually in this article:
Since they are all three-letter acronyms and can be confusing until you get used to them, I'll provide a brief explanation beforehand.

Abbreviation Full Name Description Main Command Examples
DDL Data Definition Language Operations for changing the definition of tables or databases CREATE, ALTER, DROP, RENAME, TRUNCATE
DML Data Manipulation Language Commands for manipulating the data content. Adding, updating, or deleting rows SELECT, INSERT, UPDATE, DELETE
DCL Data Control Language Operations for managing and controlling access permissions GRANT, REVOKE
MDL Metadata Lock (*Informal classification) Locks on table definitions. Internal processing used automatically behind DDL and DML No explicit SQL. Occurs internally during execution of SELECT, ALTER, etc.

ALTER TABLE Types and ALGORITHM

Even though we say we're executing an ALTER TABLE, the behavior varies slightly depending on what kind of ALTER TABLE operation you are running.
MySQL's official documentation provides a table summarizing the behavior for various ALTER TABLE operations.
For example, what I was doing this time was adding a NOT NULL constraint, so it corresponds to the "Making a column NOT NULL" row in the table.
MySQL 8.0 Online DDL Operations
MySQL 8.0 Online DDL Support for Generated Column Operations
https://dev.mysql.com/doc/refman/8.0/en/innodb-online-ddl-operations.html#online-ddl-generated-column-operations

Making a column NOT NULL results in:

  • Instant: No
  • In-place: Yes

These "Instant" and "In-place" terms refer to the types of algorithms used during a MySQL ALTER TABLE. There are three types: COPY, INPLACE, and INSTANT.

Roughly speaking, the differences in characteristics are:

  • COPY: Slow. You need to be careful as it takes a long lock.
  • INPLACE: The most common case. While concurrent DML is allowed, be aware that an exclusive metadata lock is still acquired.
  • INSTANT: Fast. No exclusive metadata lock is performed.

The differences can be summarized in the following table:

Feature COPY INPLACE INSTANT
Table Operation Creates a new table and copies data row by row Applies changes directly to the existing table Changes only the metadata within the data dictionary
Table Rebuild Required Generally not required Not required
Processing Speed Slow Moderate Very fast
Impact on DML Large (Concurrent DML is not allowed during the operation) Relatively small (Short-term exclusive metadata locks may occur during the preparation/execution phases, but concurrent DML is supported) Very small (No exclusive metadata locks are performed and table data is unaffected, allowing concurrent DML)
Exclusive Metadata Lock Exclusive lock on the entire table throughout the operation May be acquired for a short time during the preparation/execution phases Not acquired during preparation and execution
Main Usage Significant changes to table structure (Data type changes, character set changes, etc.) Many common ALTER TABLE operations (Adding/dropping columns, adding indexes, etc., where a rebuild is not required) Operations requiring only metadata changes, such as adding a column (at the end only) or renaming a column

Reference:
https://dev.mysql.com/doc/refman/8.0/en/alter-table.html#alter-table-performance

By the way, if you omit ALGORITHM, MySQL will attempt to use the supported methods in the order of INSTANT -> INPLACE -> COPY.

LOCK Types

When making a column NOT NULL, you can select the type of LOCK in addition to the ALGORITHM.

ALTER TABLE tbl_name MODIFY COLUMN column_name data_type NOT NULL, ALGORITHM=INPLACE, LOCK=NONE;
LOCK Option Read (SELECT) Write (INSERT/UPDATE/DELETE)
LOCK=NONE ✅ Executable ✅ Executable
LOCK=SHARED ✅ Executable ⏳ Blocked
LOCK=EXCLUSIVE ⏳ Blocked ⏳ Blocked

What is a Metadata Lock?

In the explanation so far, the term "metadata lock" has appeared. Here, let's review what metadata is and what a metadata lock is.

What is Metadata?

Simply put, it refers to the structure and definition of a table.

Classification Content Specific Examples Operation Examples
Metadata Structure and definition of a table - users table's name VARCHAR(255) NOT NULL definition
- PRIMARY KEY(id)
- DEFAULT CURRENT_TIMESTAMP for created_at
- INT type for age column
- ALTER TABLE
- CREATE INDEX
- DROP COLUMN
Data (not a formal name) Actual contents (records) - id: 1, name: 'Yuko', age: 29
- id: 2, name: 'Takashi', age: 33
- SELECT
- INSERT
- UPDATE
- DELETE

In other words, a metadata lock is a lock on the table definition, which is the metadata. A metadata lock locks the table's "definition" to prevent other transactions from changing the definition while a schema change is in progress.

Review of Shared Locks and Exclusive Locks

When performing SELECT or INSERT during application development, you might acquire row locks to maintain data consistency. These locks include shared locks and exclusive locks. The main difference lies in whether they allow SELECT from other transactions.

Lock Type DML Read from other transactions Write from other transactions
Shared Lock (S Lock) SELECT ... FOR SHARE Allowed Blocked
Exclusive Lock (X Lock) SELECT ... FOR UPDATE Blocked Blocked

Shared Metadata Locks / Exclusive Metadata Locks

Just as with the concepts described above, metadata also has "Shared Metadata Locks" (similar to shared locks) and "Exclusive Metadata Locks" (similar to exclusive locks).

A shared metadata lock is acquired even during operations like SELECT. While an exclusive metadata lock is held, a shared metadata lock cannot be acquired. This means that a SELECT statement, which needs a shared metadata lock, might be blocked from executing.

The detailed differences are summarized in the table below.

Lock Type Main Purpose When is it acquired? Metadata Read from other transactions Metadata Write from other transactions Data Read/Write from other transactions
Shared Metadata Lock Allows concurrent reference to metadata while preventing changes - Execution of SELECT statements
- Other operations that read table metadata, such as obtaining an EXPLAIN plan
Allowed Blocked Allowed
Exclusive Metadata Lock Guarantees exclusive access to metadata - Execution of ALTER TABLE statements, etc.
- Other operations that change table metadata (not always)
Blocked Blocked Blocked

The 3 Steps of ALTER TABLE Execution

So far, we have learned that when ALTER TABLE is executed, an ALGORITHM is selected, metadata locks occur in some cases, and these metadata locks consist of Shared Metadata Locks and Exclusive Metadata Locks.

Next, let's look more specifically at the flow of what happens when ALTER TABLE is run.
The execution of ALTER TABLE is broadly performed in three steps:

  1. Initialization: Determination of the ALGORITHM (Acquires a Shared Metadata Lock)
  2. Execution: Execution of operations such as table copying (Acquires an Exclusive Metadata Lock immediately after execution begins; returns to a Shared Metadata Lock after release)
  3. Commit: DDL Commit (Acquires an Exclusive Metadata Lock immediately after execution begins and commits)

These are the three steps.

Reference:
https://dev.mysql.com/doc/refman/8.0/en/innodb-online-ddl-performance.html#innodb-online-ddl-metadata-locks

By understanding these three steps, the points to be careful about when executing ALTER TABLE become clear.

Precautions When Executing ALTER TABLE

When transactions occur in other sessions before or during the execution of the three steps introduced above, there are some points to be aware of.

Caution 1: When DML on table t1 is running in another session and a shared metadata lock has been acquired before executing ALTER TABLE on table t1

In ALTER TABLE, an exclusive metadata lock is acquired immediately after transitioning from Phase 1 (Initialization) to Phase 2 (Execution).
However, if a long transaction is being held in a certain session at this time, the DDL session will have to wait to acquire the exclusive metadata lock.
Consequently, in another session trying to execute DML, it will wait for the release of the exclusive metadata lock by the DDL to acquire a shared metadata lock, causing even SELECT statements to get stuck.

This can be illustrated as follows:

  • Session 1: DML execution
    • A transaction is held on table t1 before ALTER TABLE is executed.
    • A shared metadata lock is acquired at the same time.
  • Session 2: ALTER TABLE execution
    • A shared metadata lock is acquired.
    • It attempts to upgrade the shared metadata lock to an exclusive metadata lock but waits for the transaction in Session 1 to end.
  • Session 3: DML execution
    • It attempts to acquire a shared metadata lock but waits for the exclusive metadata lock in Session 2 to be released.

This is the resulting state.

In other words, even with online DDL, if a long transaction is running on the table you want to ALTER, there is a risk that other DML will get stuck, so caution is required.

Caution 2: When DML on table t1 is running in another session and a shared metadata lock is acquired during ALTER TABLE execution (before DDL commit) on table t1

This is similar to Caution 1 above.
An exclusive metadata lock is acquired not only immediately after Phase 2 (Execution) of ALTER TABLE but also immediately after Phase 3 (Commit).
If a long transaction is running before acquiring the exclusive metadata lock in Phase 3, subsequent DML will also get stuck.

The diagram is as follows:

Caution 3: When a transaction is running in Session A but only contains DML for table t2 → DDL for table t1 runs in Session 2 → DML for table t1 is then executed in Session A's transaction

This is a point of caution when the transaction isolation level is REPEATABLE READ.
It's a bit complex, but the diagram below illustrates it:

  1. A transaction is started in Session A, but it only involved DML on table t2.
  2. Since the table Session 2 wants to perform ALTER TABLE on is table t1, it was able to acquire an exclusive metadata lock, so the DDL was executed and committed.
  3. DML is then run on table t1 within Session A's transaction.
  4. The definition of table t1 is different from what it was at the start of the transaction in step 1.
  5. A Table definition has changed, please retry transaction error occurs.

In essence, a snapshot is acquired at the start of the transaction. If the table definition at the time of DML execution differs from the one in that snapshot, the Table definition has changed, please retry transaction error will be triggered.

I've introduced these three points of caution. Since all of them are often caused by long-running transactions, it's a good idea to check for any long transactions before execution.

Conclusion

This concludes the long explanation of Online DDL.
I researched Online DDL because I was worried about errors when adding NOT NULL constraints myself.
Adding NOT NULL constraints was completed successfully, even on tables with tens of millions of records.
Next, I plan to start adding foreign key constraints as well.

By the way, adding a foreign key constraint uses ALGORITHM=INPLACE when SELECT @@foreign_key_checks; is 0, but it switches to ALGORITHM=COPY when it is 1.
I hope to continue learning more about MySQL through further research.

Discussion