iTranslated by AI

The content below is an AI-generated translation. This is an experimental feature, and may contain errors. View original article
🧪

pg-testkit SQL Unit Test Benchmark (ZTD Support)

に公開

I conducted benchmark measurements for the SQL unit testing kit @rawsql-ts/pg-testkit (hereafter referred to as pg-testkit) and would like to share the results.

What is pg-testkit?

pg-testkit is a SQL unit testing kit written in TypeScript that runs on Node.js and has the following features:

  • Supports Postgres
  • Supports both the ZTD (Zero Table Dependency) method and the Traditional method (general SQL unit testing)

Specifically, support for the ZTD method is crucial.

The ZTD method is a way to unit test SQL without migrations or seeding, allowing for extremely fast verification of SQL.

For a detailed explanation of how ZTD works, please refer to the following article:

https://zenn.dev/mkmonaka/articles/c2413d99ae67bb

Purpose of the benchmark

While it is theoretically obvious that ZTD is fast, I had not yet provided quantitative data on how much of a difference there is compared to the Traditional method.

Therefore, I decided to measure the benchmark under identical conditions and compare the numbers.

Conditions

  • Number of trials: 5
  • Number of tests: 50, 100, 300
  • Test method: ZTD, Traditional
  • Parallelism: 1, 2, 4
  • DB connection method: perTest, shared

Differences in Test Methods

  • ZTD
    Analyzes SQL using the Zero Table Dependency method and performs tests at the semantic level.
    No migrations or seeding occur.
  • Traditional
    A general testing method.
    Requires migrations and seeding before the test, and cleanup processing after the test.

Since ZTD does not involve migrations or seeding, it should theoretically be faster.

Differences in DB Connection Methods

  • perTest
    Connects to and disconnects from the DB for each test.
  • shared
    Connects to the DB per worker and shares the connection across multiple tests.

While the shared method can reduce the cost of DB connection processing, it is more likely to cause contention (waiting) due to transaction sharing.

Since ZTD does not involve table operations, it is thought that shared connections can be utilized more effectively.

Test Code

The following is a simplified example of the actual code, but the testing concept remains the same. The same fixture definitions and repository code are used, and the tests are executed by switching between ZTD mode and Traditional mode.

For the complete code, please refer to the following repository:
https://github.com/mk3008/rawsql-ts/tree/main/benchmarks/sql-unit-test

Repository Class

import { customerSummarySql } from '../sql/customer_summary';
import { CustomerSummaryRepositoryClient } from './CustomerSummaryRepositoryClient';
import { CustomerSummaryRow } from './CustomerSummaryRow';

/** Repository that exposes customer summary aggregations without any additional inputs. */
export class CustomerSummaryRepository {
  constructor(private readonly client: CustomerSummaryRepositoryClient) {}

  customerSummary(): Promise<CustomerSummaryRow[]> {
    return this.client.query<CustomerSummaryRow>(customerSummarySql);
  }
}

Test Code

/**
 * Executes the same test scenario in both ZTD and Traditional modes.
 *
 * On the test code side, just by switching the mode,
 * the SQL, repository, and assertions remain exactly the same.
 */
async function runInlineScenario(mode: 'ztd' | 'traditional') {
  const client = await createTestkitClient(inlineFixtures, { mode });

  try {
    const repository = new CustomerSummaryRepository(client);
    const rows = await repository.customerSummary();
    expect(rows).toEqual(inlineExpectedRows);
  } finally {
    await client.close();
  }
}

SQL Issued in the Traditional Method

Migration

CREATE SCHEMA IF NOT EXISTS "ztd_traditional_mju5br8k_pvove"
;
SET search_path TO "ztd_traditional_mju5br8k_pvove", public
;
CREATE TABLE customer (
  customer_id bigint GENERATED BY DEFAULT AS IDENTITY PRIMARY KEY,
  customer_name text NOT NULL,
  customer_email text NOT NULL,
  registered_at timestamp NOT NULL
)
;
CREATE TABLE product (
  product_id bigint GENERATED BY DEFAULT AS IDENTITY PRIMARY KEY,
  product_name text NOT NULL,
  list_price numeric NOT NULL,
  product_category_id bigint
)
;
CREATE TABLE sales_order (
  sales_order_id bigint GENERATED BY DEFAULT AS IDENTITY PRIMARY KEY,
  customer_id bigint NOT NULL REFERENCES customer (customer_id),
  sales_order_date date NOT NULL,
  sales_order_status_code int NOT NULL
)
;
CREATE TABLE sales_order_item (
  sales_order_item_id bigint GENERATED BY DEFAULT AS IDENTITY PRIMARY KEY,
  sales_order_id bigint NOT NULL REFERENCES sales_order (sales_order_id),
  product_id bigint NOT NULL REFERENCES product (product_id),
  quantity int NOT NULL,
  unit_price numeric NOT NULL
)
;

Seeding

/*
(parameters: [10,"Widget","25.00",1], [11,"Gadget","75.00",2], [12,"Accessory","5.00",null])
*/
INSERT INTO "ztd_traditional_mju5br8k_pvove"."customer" ("customer_id", "customer_name", "customer_email", "registered_at") VALUES ($1, $2, $3, $4)
;
/*
(parameters: [100,1,"2025-12-04",2], [101,1,"2025-12-06",2], [200,2,"2025-12-05",2])
*/
INSERT INTO "ztd_traditional_mju5br8k_pvove"."sales_order" ("sales_order_id", "customer_id", "sales_order_date", "sales_order_status_code") VALUES ($1, $2, $3, $4)
;
/*
(parameters: [1001,100,10,2,"25.00"], [1002,101,11,1,"75.00"], [1003,200,12,3,"5.00"])
*/
INSERT INTO "ztd_traditional_mju5br8k_pvove"."sales_order_item" ("sales_order_item_id", "sales_order_id", "product_id", "quantity", "unit_price") VALUES ($1, $2, $3, $4, $5)

Test Query

SELECT
  c.customer_id,
  c.customer_name,
  c.customer_email,
  COUNT(DISTINCT o.sales_order_id) AS total_orders,
  COALESCE(SUM(oi.quantity * oi.unit_price), 0) AS total_amount,
  MAX(o.sales_order_date) AS last_order_date
FROM customer c
LEFT JOIN sales_order o ON o.customer_id = c.customer_id
LEFT JOIN sales_order_item oi ON oi.sales_order_id = o.sales_order_id
GROUP BY c.customer_id, c.customer_name, c.customer_email
ORDER BY c.customer_id;

Cleanup

DROP SCHEMA IF EXISTS "ztd_traditional_mju5br8k_pvove" CASCADE

SQL Issued in the ZTD Method

Only one selection query is executed. This query is automatically generated by pg-testkit.
There are no migrations or seeding.

with "public_customer" as (select 1::bigint as "customer_id", 'Alice'::text as "customer_name", 'alice@example.com'::text as "customer_email", '2025-12-01T08:00:00Z'::timestamp as "registered_at" union all select 2::bigint as "customer_id", 'Bob'::text as "customer_name", 'bob@example.com'::text as "customer_email", '2025-12-02T09:00:00Z'::timestamp as "registered_at" union all select 3::bigint as "customer_id", 'Cara'::text as "customer_name", 'cara@example.com'::text as "customer_email", '2025-12-03T10:00:00Z'::timestamp as "registered_at"), "public_sales_order" as (select 100::bigint as "sales_order_id", 1::bigint as "customer_id", '2025-12-04'::date as "sales_order_date", 2::int as "sales_order_status_code" union all select 101::bigint as "sales_order_id", 1::bigint as "customer_id", '2025-12-06'::date as "sales_order_date", 2::int as "sales_order_status_code" union all select 200::bigint as "sales_order_id", 2::bigint as "customer_id", '2025-12-05'::date as "sales_order_date", 2::int as "sales_order_status_code"), "public_sales_order_item" as (select 1001::bigint as "sales_order_item_id", 100::bigint as "sales_order_id", 10::bigint as "product_id", 2::int as "quantity", '25.00'::numeric as "unit_price" union all select 1002::bigint as "sales_order_item_id", 101::bigint as "sales_order_id", 11::bigint as "product_id", 1::int as "quantity", '75.00'::numeric as "unit_price" union all select 1003::bigint as "sales_order_item_id", 200::bigint as "sales_order_id", 12::bigint as "product_id", 3::int as "quantity", '5.00'::numeric as "unit_price") select "c"."customer_id", "c"."customer_name", "c"."customer_email", count(distinct "o"."sales_order_id") as "total_orders", coalesce(sum("oi"."quantity" * "oi"."unit_price"), 0) as "total_amount", max("o"."sales_order_date") as "last_order_date" from "public_customer" as "c" left join "public_sales_order" as "o" on "o"."customer_id" = "c"."customer_id" left join "public_sales_order_item" as "oi" on "oi"."sales_order_id" = "o"."sales_order_id" group by "c"."customer_id", "c"."customer_name", "c"."customer_email" order by "c"."customer_id"

Test Results

The following tables show the "total execution time required to run one set of test scenarios (50 tests / 100 tests / 300 tests)". Each value is the average of five trials.

50 tests - traditional, perTest

Method Connection Tests Parallel Mean (ms) Standard Error (ms) StdDev (ms)
traditional perTest (exclusive) 50 1 2434.68 31.96 71.46
traditional perTest (exclusive) 50 2 2554.37 16.63 37.19
traditional perTest (exclusive) 50 4 2772.67 37.82 84.56

Parallel processing on the DB side became a bottleneck, resulting in execution times worsening as the degree of parallelism increased.

50 tests - traditional, shared

Method Connection Tests Parallel Mean (ms) Standard Error (ms) StdDev (ms)
traditional shared (reused) 50 1 2484.17 28.09 62.82
traditional shared (reused) 50 2 2564.20 49.45 110.58
traditional shared (reused) 50 4 2780.85 48.54 108.54

Even when sharing DB connections, the trend did not change significantly, showing that the improvement effect of different connection methods is limited in the Traditional method.

50 tests - ztd, perTest

Method Connection Tests Parallel Mean (ms) Standard Error (ms) StdDev (ms)
ztd perTest (exclusive) 50 1 573.71 19.39 43.35
ztd perTest (exclusive) 50 2 633.39 24.14 53.99
ztd perTest (exclusive) 50 4 721.67 10.22 22.85

ZTD was up to approximately 4.2 times faster compared to Traditional (2434.68 / 573.71). However, like Traditional, no clear performance improvement from parallelization was observed.

50 tests - ztd, shared

Method Connection Tests Parallel Mean (ms) Standard Error (ms) StdDev (ms)
ztd shared (reused) 50 1 133.23 1.45 3.24
ztd shared (reused) 50 2 176.85 6.60 14.75
ztd shared (reused) 50 4 370.04 17.68 39.54

Under these conditions, ZTD resulted in being approximately 18.2 times faster than Traditional (2434.68 / 133.23).

This result indicates that in ZTD, the cost of opening and closing DB connections accounts for a very large proportion of the total execution time.


In the Traditional method, even if the number of open/close operations was reduced by sharing DB connections, the total execution time hardly changed.

This is thought to be because, in the Traditional method, actual table operations such as migrations, seeding, and cleanup occur on a per-test basis. As a result, shared connections might have increased transaction and lock contention, potentially worsening the execution time.

On the other hand, since the ZTD method does not operate on actual tables at all, contention due to shared connections almost never occurs. Therefore, the effect of reducing open/close operations through connection sharing directly manifested as a performance improvement.

I also measured 100 and 300 tests, but since the trends were exactly the same as the 50 tests, I have omitted them.

Summary

  • In the Traditional method, the actual table operations themselves are the dominant cost rather than connection management, so improvements through parallelization or connection method optimizations are limited.
  • The ZTD method is more than 18 times faster than the Traditional method. In particular, ZTD has excellent compatibility with shared connections.

It was quantitatively confirmed that ZTD is not only fast on its own (4x faster) but becomes even faster (18x faster) when connection sharing is used.

When performing SQL unit tests, I think it is practical to first consider whether testing can be done using the ZTD method and only use the Traditional method when it is necessary to verify DB-dependent behavior.

Side Note: Building the Benchmark

Since writing test code for pg-testkit can be quite a lot of work, I developed a CLI to write it efficiently. This has also turned out to be an interesting project, so I would like to introduce it in the near future.

Discussion