
How I Secretly Used SQLite in a Company with a No-RDB Policy


Note: The system described in this article no longer exists.

Once upon a time, there was a system where the use of RDBs was all but prohibited.
The default choice for data stores was therefore NoSQL, and we were using DynamoDB.

By the way, that system had a ranking display feature.
It was a common feature that aggregated sales volumes by product category and displayed them in order starting from 1st place.
If you try to implement this on DynamoDB, you end up fetching every record and aggregating a huge pile of JSON in application code.
Naturally, that can never yield a realistic response time.
How did that system handle it?
Yes, as the title says, we used SQLite. A bootleg RDB.

Architecture

On the batch server, sales performance for a certain period is output as CSV and imported into SQLite.
Aggregations are performed as needed, and the files are uploaded to S3 periodically.
Because the API servers auto-scale, each instance downloads the SQLite file from S3 at startup and again on a periodic schedule, replacing its local copy with the latest file.
The SQLite file was quite large, so the s3 sync at startup took a while and delayed instance boot.
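The batch side of this pipeline can be sketched roughly as follows. This is a minimal sketch, not the actual code: the table layout, column names, and the build_ranking_db function are all hypothetical.

```python
import csv
import sqlite3

def build_ranking_db(csv_path: str, db_path: str) -> None:
    """Import a sales CSV into SQLite and precompute a per-category
    ranking table, so API servers only ever run cheap read queries."""
    conn = sqlite3.connect(db_path)
    conn.execute("CREATE TABLE sales (category TEXT, product TEXT, qty INTEGER)")
    with open(csv_path, newline="") as f:
        rows = [(r["category"], r["product"], int(r["qty"]))
                for r in csv.DictReader(f)]
    conn.executemany("INSERT INTO sales VALUES (?, ?, ?)", rows)
    # Aggregate once here, on the batch server, instead of per request.
    conn.execute("""
        CREATE TABLE ranking AS
        SELECT category, product, SUM(qty) AS total,
               RANK() OVER (PARTITION BY category
                            ORDER BY SUM(qty) DESC) AS rnk
        FROM sales
        GROUP BY category, product
    """)
    conn.commit()
    conn.close()
```

The resulting file is what gets uploaded to S3 for the API servers to pull down.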

When moving the file around with s3 sync and the like, there were rare cases where the SQLite file would arrive corrupted and couldn't be opened on the API server.
As a countermeasure we ran sqlite3 my.db "PRAGMA integrity_check"; before accepting a file, but I remember it being quite a struggle, since the check consumed significant CPU on the batch server.
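A check along those lines might look like this. This is a sketch only: is_intact is a hypothetical helper, not something taken from the actual system.

```python
import sqlite3

def is_intact(db_path: str) -> bool:
    """Return True only if PRAGMA integrity_check reports 'ok'."""
    try:
        # Open read-only so a corrupted file is never modified further.
        conn = sqlite3.connect(f"file:{db_path}?mode=ro", uri=True)
        try:
            rows = conn.execute("PRAGMA integrity_check").fetchall()
        finally:
            conn.close()
    except sqlite3.DatabaseError:
        # Raised for e.g. "file is not a database" on a bad download.
        return False
    return rows == [("ok",)]
```

On the API side, a natural pattern is to run this on the freshly downloaded copy and only then atomically swap it into place (os.replace), so readers never see a half-written file.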

Migration

Now, about that bootleg SQLite setup: a neighboring system had introduced Aurora MySQL as it became the trend, so we decided to take the opportunity to migrate as well.
I think we used some kind of Python-based tool for the SQLite to MySQL conversion...
On the API server we were using an O/R mapper, so I assumed swapping SQLite for MySQL wouldn't be that difficult, but performance degraded sharply and we had to roll back once.
Looking closely at the code, I realized N+1 queries were occurring.
With SQLite there is no network access, so each query responded fast enough that the N+1 pattern had never been a problem.
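The pattern looks roughly like this. It's a self-contained illustration with a hypothetical schema, not the system's actual code:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE categories (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE products (id INTEGER PRIMARY KEY, category_id INTEGER, name TEXT);
    INSERT INTO categories VALUES (1, 'books'), (2, 'games');
    INSERT INTO products VALUES (1, 1, 'a'), (2, 1, 'b'), (3, 2, 'c');
""")

# N+1: one query for the categories, then one more query per category.
# Nearly free against a local SQLite file, but over a network every
# iteration pays a full round trip.
n_plus_1 = {}
for cat_id, cat_name in conn.execute("SELECT id, name FROM categories"):
    n_plus_1[cat_name] = [name for (name,) in conn.execute(
        "SELECT name FROM products WHERE category_id = ? ORDER BY id",
        (cat_id,))]

# The fix: a single JOIN, one round trip no matter how many categories.
joined = {}
for cat_name, prod_name in conn.execute("""
    SELECT c.name, p.name
    FROM categories c JOIN products p ON p.category_id = c.id
    ORDER BY c.id, p.id
"""):
    joined.setdefault(cat_name, []).append(prod_name)
```

Both loops produce the same result; only the number of round trips differs, which is exactly why the problem stayed invisible until the database moved across the network.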
Through all that, we gradually migrated table by table and finally completed the migration over the course of several years.

Summary

When I heard about Cloudflare D1, I thought, "Oh, that's similar to what we had back then."
There might still be cases today where one wants a data store with centralized updates and read-only access.
Looking back now, I do wonder how it performed under concurrent connections, but there's no need to verify that anymore...

It might be useful for building a secure WordPress environment.
https://github.com/WordPress/sqlite-database-integration is still in beta, so it's for those confident in their technical prowess.
The setup would involve isolating the management screen for updates and distributing the SQLite file to the read-only WordPress instances.
In fact, you could even perform updates locally and just upload the SQLite file to S3.
Of course, if you have a significant amount of traffic, you're better off using MySQL normally.
