🗂️

【Python】LMDB, a database you install with pip: an in-process database

Published 2025/05/07

A database that needs no database installation (?)

To use $MySQL$, you need to install $MySQL$.
To use $PostgreSQL$, you need to install $PostgreSQL$.
To use $Redis$, you need to install $Redis$.
To use $CockroachDB$, you need to install $CockroachDB$.

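Obvious as that is, isn't it a hassle, on reflection, that the job cannot be completed with the programming language alone? 🤔
There is the effort of installing and configuring the database, the effort of managing its users, and the effort of communicating with it. Even if none of that is a burden, if all of it can be skipped, there are situations where you would want to skip it.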

So here is an introduction to "lmdb", a database that can be installed with pip.

https://pypi.org/project/lmdb/

Setup

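$LMDB$ is obtained with the pip command. Installation may be refused because of $PEP668$ (externally managed environments), but using a virtual environment solves that. How to create a virtual environment and other further details are omitted here.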

pip install lmdb

Usage

$LMDB$, like the $Redis$ mentioned above, is a $NoSQL$ database that handles data expressed as $key$ - $value$ pairs. Because $NoSQL$ does not use $SQL$, there is no need to learn $SQL$ either.

The program below is adapted from the following example.

main.py

# https://github.com/jnwatson/py-lmdb/blob/master/examples/address-book.py

import lmdb


def main():
    # create a database environment (lmdb.open(...) is an equivalent shortcut)
    db_env: lmdb.Environment = lmdb.Environment(path = 'sample_db', max_dbs = 10)

    # create sub-databases
    home_db: lmdb._Database = db_env.open_db(key = b'home')
    business_db: lmdb._Database = db_env.open_db(key = b'business')

    # write new data
    with db_env.begin(write = True) as write_transaction:
        write_transaction.put(db = home_db, key = b'mother', value = b'098-7654-3210')
        write_transaction.put(db = home_db, key = b'father', value = b'012-3456-7890')
        write_transaction.put(db = business_db, key = b'vendor', value = b'086-4209-7531')
        write_transaction.put(db = business_db, key = b'manager', value = b'013-5790-2468')

    # read data
    with db_env.begin() as read_transaction:
        home_cursor: lmdb.Cursor = read_transaction.cursor(db = home_db)
        business_cursor: lmdb.Cursor = read_transaction.cursor(db = business_db)
        for key, value in home_cursor:
            print(key, value)
        for key, value in business_cursor:
            print(key, value)

if __name__ == "__main__":
    main()

Output

> python main.py
b'father' b'012-3456-7890'
b'mother' b'098-7654-3210'
b'manager' b'013-5790-2468'
b'vendor' b'086-4209-7531'

After it runs, a folder named sample_db is created. This is the database itself.
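
Note that the output lists father before mother even though mother was written first. LMDB keeps keys sorted, by default in byte-wise lexicographic order, so iterating a cursor follows key order rather than insertion order. The sketch below imitates that ordering with built-in Python only; it does not call lmdb, and the helper name `encode_int_key` is a hypothetical illustration, not part of the library.

```python
import struct

# Keys in the order main.py wrote them.
written = [b'mother', b'father']

# Python compares bytes lexicographically, the same rule LMDB applies by default,
# so sorting reproduces the order the cursor returned.
print(sorted(written))  # [b'father', b'mother']

# Pitfall: numbers stored as decimal text do not sort numerically.
print(sorted([b'9', b'10']))  # [b'10', b'9']


def encode_int_key(n: int) -> bytes:
    """Hypothetical helper: fixed-width big-endian bytes sort numerically (non-negative n only)."""
    return struct.pack('>Q', n)


# With fixed-width big-endian keys, 9 sorts before 10 as expected.
print(encode_int_key(9) < encode_int_key(10))  # True
```

If numeric keys are needed, an encoding along these lines is one option; plain decimal strings are fine when ordering does not matter.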

Displaying its contents

> type .\sample_db\data.mdb
����
����

���business
0home   ���business
0home���
father012-3456-7890
mother098-7654-3210���business0home���
father012-3456-7890
mother098-7654-3210���
manager013-5790-2468
vendor086-4209-7531���(

(

                ���
father012-3456-7890
mother098-7654-3210
���
manager013-5790-2468
vendor086-4209-7531�
��((


���
manager013-5790-2468
vendor086-4209-7531�
���(

        (

Data that looks like what was written can be recognized.

It can also be distributed

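With pyinstaller, a $Python$ program can be turned into an executable (called "compiling" below, although pyinstaller actually bundles the interpreter and the modules rather than compiling them). Since the database used here, lmdb, is a $Python$ module, the database engine is bundled into the executable and can be distributed together with it. This is surely a feat that $Redis$ and the like cannot match.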

pip install pyinstaller

For how to use pyinstaller installed in a virtual environment, see this article.

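Note that if lmdb is installed in a virtual environment, pyinstaller must be installed in that same virtual environment. If you build the executable with a pyinstaller installed in an environment that lacks lmdb, you get an error saying lmdb cannot be found. Make sure the environments match.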

Compiling

pyinstaller main.py

After compiling, two folders, build and dist, are created. Of these, dist is the one meant for distribution.

This time a file named dist/main/main.exe was produced. Running it creates a folder named sample_db as the database, just as before.

Output

> .\dist\main\main.exe
b'father' b'012-3456-7890'
b'mother' b'098-7654-3210'
b'manager' b'013-5790-2468'
b'vendor' b'086-4209-7531'
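
Because main.py passes the relative path 'sample_db', the folder is created in whatever directory the program is started from. A minimal sketch of one way to pin it next to the executable instead, assuming PyInstaller's `sys.frozen` run-time attribute; the function name `database_dir` is hypothetical and not part of lmdb or pyinstaller.

```python
import sys
from pathlib import Path


def database_dir(name: str = 'sample_db') -> Path:
    """Hypothetical helper: choose where the LMDB folder should live."""
    if getattr(sys, 'frozen', False):
        # Running from a PyInstaller bundle: sys.executable is the .exe itself.
        base = Path(sys.executable).resolve().parent
    else:
        # Running as a normal script: keep the original behavior (current directory).
        base = Path.cwd()
    return base / name


print(database_dir())
```

The result could then be passed as `path = str(database_dir())` when creating the `lmdb.Environment`.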

Terminology: in-process database

$Python$ has lmdb, and $Rust$ has something called sled. It is used in much the same way as lmdb, and is likewise a $NoSQL$ database expressed as $key$ - $value$ pairs.

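I am not knowledgeable enough to say with confidence which concept this kind of database falls under, but "in-process database" seemed close. The Systems Architect Examination (Reiwa $4$ spring session, Morning $Ⅱ$) contains the following question.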

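When a $DBMS$ is used in an embedded system, an in-process database is sometimes used in order to avoid communication overhead and communication load, and to keep the required memory within resource limits. Which of the following is an appropriate description of such an in-process database?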

The correct answer was this option.

The database engine is provided as a library, linked with the application program, and runs in the same memory space.

https://qe.hpeo.jp/entry/ipa-sa/e22h04

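lmdb is provided as a library for $Python$, and sled as one for $Rust$. Both also read the file that constitutes the database from within the program itself, which presumably corresponds to running in the same memory space as the program.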

$SQLite$

Time for the reveal. Without going to the trouble of installing lmdb, $Python$ ships with a standard module called sqlite3. $SQLite$ is well known as an in-process database.

Like the ones above, there is no need to install anything called "$SQLite$", and the database is created as a file. However, whereas the earlier ones are $NoSQL$, $SQLite$ is a relational database that uses $SQL$.

tutorial.py

# https://docs.python.org/ja/3.13/library/sqlite3.html

import sqlite3

# connect or create db file
db_conn = sqlite3.connect(database = 'sample.db')
# cursor
db_cursor = db_conn.cursor()

# create table
db_cursor.execute("CREATE TABLE movie(title, year, score)")

# select table name → fetch
table_name = db_cursor.execute("SELECT name FROM sqlite_master").fetchone()

print(f'table name: {table_name}')

# insert data
db_cursor.execute(
    '''
    INSERT INTO movie VALUES
    ('Monty Python and the Holy Grail', 1975, 8.2),
    ('And Now for Something Completely Different', 1971, 7.5)
    '''
)
# commit
db_conn.commit()

# select score → fetch
score = db_cursor.execute("SELECT score FROM movie").fetchall()

print(f'score: {score}')

Output

> python .\tutorial.py
table name: ('movie',)
score: [(8.2,), (7.5,)]

What sample.db looks like

> type .\sample.db  
SQLite format 3@  .�)
��<YtablemoviemovieCREATE TABLE movie(title, year, score)
���8aAnd Now for Something Completely Different�@-KMonty Python and the Holy Grail�@ ffffff

$DuckDB$

Furthermore, although it is not in the standard library, $DuckDB$ can also be used. It too is a relational database operated through $SQL$.

pip install duckdb

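Usage differs slightly from $SQLite$, so here is a comparison using similar content. In particular, data types cannot be omitted in the CREATE TABLE statement. ~~If anything, I had wondered why the data types could be omitted in the first place.~~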

main.py

# https://duckdb.org/docs/stable/clients/python/overview#persistent-storage

import duckdb


def main():
    with duckdb.connect('sample.db') as db_conn:
        # create table
        db_conn.sql('CREATE TABLE movie (title VARCHAR(255), year INTEGER, score FLOAT)')
        # insert data
        db_conn.sql(
            '''
            INSERT INTO movie VALUES
            ('Monty Python and the Holy Grail', 1975, 8.2),
            ('And Now for Something Completely Different', 1971, 7.5)
            '''
        )
        # show table
        db_conn.table('movie').show()

if __name__ == "__main__":
    main()

Output

> python .\main.py
┌────────────────────────────────────────────┬───────┬───────┐
│                   title                    │ year  │ score │
│                  varchar                   │ int32 │ float │
├────────────────────────────────────────────┼───────┼───────┤
│ Monty Python and the Holy Grail            │  1975 │   8.2 │
│ And Now for Something Completely Different │  1971 │   7.5 │
└────────────────────────────────────────────┴───────┴───────┘

It takes little code, and the display is very easy to read.
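
Returning to the data types: SQLite can omit them in CREATE TABLE because it assigns a type to each stored value rather than to each column, so an untyped column accepts whatever it is given. A small sketch with the standard sqlite3 module shows this; the in-memory database and the sample rows 'A' and 'B' are assumptions for illustration.

```python
import sqlite3

db_conn = sqlite3.connect(':memory:')
db_conn.execute("CREATE TABLE movie(title, year, score)")
# The same untyped column receives an integer in one row and text in another.
db_conn.executemany(
    "INSERT INTO movie VALUES (?, ?, ?)",
    [('A', 1975, 8.2), ('B', 'unknown', None)],
)
types = db_conn.execute(
    "SELECT typeof(year), typeof(score) FROM movie ORDER BY title"
).fetchall()
print(types)  # [('integer', 'real'), ('text', 'null')]
db_conn.close()
```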

What sample.db looks like

> type .\sample.db
.�'���DUCK@v1.2.27c039464e4w���?�8-�\�`����������������1,��]J��������dcddfmaini����cddesamplefmaini�movie�ddtitleed��gh��dyeared
��gh��dscoreed��gh������ed��������e���fg����defd��ghdefgAnd Now Monty Py���*������edefd��ghdefg����������defde`��ghdefg�de����de���������edefd��ghdefg����������defdeh��ghdefg�de�@���de33A��������edefd��ghdefg����������dddefgAnd Now Monty Py���*����edefde�HYLL������ddefg�de����de�������edefde�HYLL������ddefg�de�@���de33A������edefde�HYLL��������������ede��e������defd����������d��������es��d��������e���g������������������Ү`=�k�IYIAnd Now for Something Completely DifferentMonty Python and the Holy Grail��33A�@

Summary of this article

To close, in place of an afterword, here is a table summarizing the characteristics of the four in-process databases introduced in this article.

	Language	$SQL$ $or$ $NoSQL$	Data format
$LMDB$	$Python$	$NoSQL$	$key$ - $value$
$Sled$	$Rust$	$NoSQL$	$key$ - $value$
$SQLite$	$C$ $Python$ etc.	$SQL$	tabular
$DuckDB$	$C$ $Python$ $Dart$ $Go$ $Julia$ $Java$ $R$ $Rust$ $Swift$ ︙	$SQL$	tabular
