⚙️
カラム型 vs JSON文字列格納のパフォーマンス比較

yushi-s
2025/08/31に公開
 はじめにデータベース設計において、データの格納方式はパフォーマンスに大きな影響を与えます。今回は、カラム型ストレージとJSON文字列格納のパフォーマンスを10万件のデータで比較検証しました。
特に注目したのは、JSON型が実際には「JSON構造体を文字列として格納」する方式であることによる性能への影響です。

 比較対象の実装方式
 カラム型ストレージ（正規化されたテーブル構造）CREATE TABLE users_column (
    id CHAR(36) PRIMARY KEY,
    name VARCHAR(255) NOT NULL,
    email VARCHAR(255) NOT NULL UNIQUE,
    age INTEGER NOT NULL,
    bio TEXT,
    avatar_url VARCHAR(500),
    preferences JSON,        -- 設定情報をJSONとして格納
    social_links JSON,       -- SNSリンクをJSONとして格納
    created_at TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP
);

 JSON文字列格納（非正規化されたテーブル構造）CREATE TABLE users_json (
    id CHAR(36) PRIMARY KEY,
    data JSON NOT NULL,      -- JSON構造体を文字列として格納
    created_at TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP
);
重要なポイント: JSON文字列格納は、実際にはJSON構造体を文字列として格納する方式です。つまり、複雑なデータ構造をJSON文字列としてシリアライズして保存し、読み取り時にデシリアライゼーションして使用します。

 具体的な格納方式の違いカラム型: 各フィールドが専用のカラムに格納
id: "uuid-123"
name: "John Doe"        ← 専用カラム
email: "john@example.com" ← 専用カラム
age: 25                 ← 専用カラム
JSON文字列格納: JSON構造体を文字列として格納
id: "uuid-123"
data: '{"name":"John Doe","email":"john@example.com","age":25,...}' ← JSON文字列

 ベンチマーク結果
 10万件データでの比較結果

件数
カラム型
JSON文字列格納
複雑JSON処理
勝者


1,000
8ms
4ms
44ms
JSON文字列格納

10,000
93ms
247ms
447ms
カラム型

50,000
280ms
638ms
2,262ms
カラム型

100,000
501ms
1,289ms
4,440ms
カラム型


 結果の分析
 1. データ量による性能の逆転現象
小規模（1,000件以下）: JSON文字列格納が優位

大規模（10,000件以上）: カラム型が圧倒的に優位

 2. スケーラビリティの違い
カラム型: 線形的な増加

JSON文字列格納: より急激な増加

複雑処理: 指数関数的な増加

 3. 複雑な処理の影響複雑なJSON処理は、シンプルなJSON読み取りの約3.4倍の時間
データ量が増えるほど、処理時間の差が拡大

 ベンチマークの詳細検証内容
 1. カラム型ストレージのベンチマーク検証内容: 正規化されたテーブル構造でのデータ読み取り性能
async fn benchmark_column(
    State(pool): State<AppState>,
    Path(count): Path<i32>,
) -> Result<Json<serde_json::Value>, StatusCode> {
    let start = std::time::Instant::now();
    
    // カラム型テーブルからデータを取得
    let rows = sqlx::query(
        r#"
        SELECT id, name, email, age, bio, avatar_url, preferences, social_links, created_at
        FROM users_column
        ORDER BY created_at DESC
        LIMIT ?
        "#
    )
    .bind(count)
    .fetch_all(&pool)
    .await?;

    // 取得したデータをUser構造体に変換
    let _users: Vec<User> = rows
        .iter()
        .map(|row| {
            let id_str: String = row.get("id");
            let preferences_str: String = row.get("preferences");
            let social_links_str: String = row.get("social_links");
            
            // JSONフィールドのデシリアライゼーション
            let preferences: HashMap<String, String> = 
                serde_json::from_str(&preferences_str).unwrap_or_default();
            let social_links: Vec<String> = 
                serde_json::from_str(&social_links_str).unwrap_or_default();

            User {
                id: Uuid::parse_str(&id_str).unwrap(),
                name: row.get("name"),
                email: row.get("email"),
                age: row.get("age"),
                profile: UserProfile {
                    bio: row.get("bio"),
                    avatar_url: row.get("avatar_url"),
                    preferences,
                    social_links,
                },
                created_at: row.get("created_at"),
            }
        })
        .collect();

    let duration = start.elapsed();
    
    Ok(Json(serde_json::json!({
        "storage_type": "column",
        "count": count,
        "duration_ms": duration.as_millis(),
        "records_processed": rows.len()
    })))
}
特徴:
各カラムから直接データを取得
JSONフィールド（preferences, social_links）のみデシリアライゼーション
インデックスが効率的に機能

 2. JSON文字列格納のベンチマーク検証内容: JSON構造体を文字列として格納した場合のデータ読み取り性能
async fn benchmark_json(
    State(pool): State<AppState>,
    Path(count): Path<i32>,
) -> Result<Json<serde_json::Value>, StatusCode> {
    let start = std::time::Instant::now();
    
    // JSON文字列として格納されたデータを取得
    let rows = sqlx::query(
        r#"
        SELECT data
        FROM users_json
        ORDER BY created_at DESC
        LIMIT ?
        "#
    )
    .bind(count)
    .fetch_all(&pool)
    .await?;

    // JSON文字列から構造体へのデシリアライゼーション
    let _users: Vec<User> = rows
        .iter()
        .filter_map(|row| {
            let data_str: String = row.get("data");
            // JSON文字列をRust構造体に変換
            serde_json::from_str(&data_str).ok()
        })
        .collect();

    let duration = start.elapsed();
    
    Ok(Json(serde_json::json!({
        "storage_type": "json_string_storage",
        "count": count,
        "duration_ms": duration.as_millis(),
        "records_processed": rows.len()
    })))
}
特徴:
JSON文字列として格納された全データを取得
全てのデータを文字列から構造体にデシリアライゼーション
シリアライゼーション/デシリアライゼーションのオーバーヘッドが発生
JSON構造体を文字列として格納する方式の特性が性能に影響

 3. 複雑なJSON処理のベンチマーク検証内容: JSON文字列として格納されたデータのアプリケーション側処理性能
async fn benchmark_complex_processing(
    State(pool): State<AppState>,
    Path(count): Path<i32>,
) -> Result<Json<serde_json::Value>, StatusCode> {
    let start = std::time::Instant::now();
    
    // JSON文字列として格納されたデータを取得
    let rows = sqlx::query(
        r#"
        SELECT data
        FROM users_json
        ORDER BY created_at DESC
        LIMIT ?
        "#
    )
    .bind(count)
    .fetch_all(&pool)
    .await?;

    // JSON文字列から構造体への変換と複雑な処理を実行
    let mut processed_users = Vec::new();
    
    for row in rows {
        let data_str: String = row.get("data");
        // JSON文字列を構造体に変換してから処理
        if let Ok(user_data) = serde_json::from_str::<serde_json::Value>(&data_str) {
            let mut processed_user = user_data.clone();
            
            // 1. 統計情報の計算（エンゲージメント率）
            if let Some(profile) = processed_user.get_mut("profile") {
                if let Some(stats) = profile.get_mut("statistics") {
                    if let Some(posts) = stats.get("posts_count").and_then(|v| v.as_u64()) {
                        if let Some(followers) = stats.get("followers_count").and_then(|v| v.as_u64()) {
                            let engagement_rate = if followers > 0 {
                                (posts as f64 / followers as f64) * 100.0
                            } else {
                                0.0
                            };
                            stats["engagement_rate"] = serde_json::json!(engagement_rate);
                        }
                    }
                }
            }
            
            // 2. タグの分析（検証率計算）
            if let Some(metadata) = processed_user.get_mut("metadata") {
                if let Some(tags) = metadata.get("tags").and_then(|v| v.as_array()) {
                    let tag_count = tags.len();
                    let verified_tags = tags.iter().filter(|tag| tag.as_str() == Some("verified")).count();
                    metadata["tag_analysis"] = serde_json::json!({
                        "total_tags": tag_count,
                        "verified_tags": verified_tags,
                        "verification_rate": if tag_count > 0 { 
                            (verified_tags as f64 / tag_count as f64) * 100.0 
                        } else { 
                            0.0 
                        }
                    });
                }
            }
            
            // 3. 実績の集計（ポイント合計）
            if let Some(profile) = processed_user.get("profile") {
                if let Some(achievements) = profile.get("achievements").and_then(|v| v.as_array()) {
                    let total_points: u64 = achievements.iter()
                        .filter_map(|achievement| achievement.get("points").and_then(|v| v.as_u64()))
                        .sum();
                    
                    if let Some(profile_mut) = processed_user.get_mut("profile") {
                        profile_mut["total_achievement_points"] = serde_json::json!(total_points);
                    }
                }
            }
            
            // 4. 文字列処理（単語数、文字数、文の分析）
            if let Some(profile) = processed_user.get("profile") {
                if let Some(bio) = profile.get("bio").and_then(|v| v.as_str()) {
                    let word_count = bio.split_whitespace().count();
                    let char_count = bio.chars().count();
                    let sentence_count = bio.split('.').count() - 1;
                    
                    if let Some(profile_mut) = processed_user.get_mut("profile") {
                        profile_mut["bio_analysis"] = serde_json::json!({
                            "word_count": word_count,
                            "char_count": char_count,
                            "sentence_count": sentence_count,
                            "avg_words_per_sentence": if sentence_count > 0 { 
                                word_count as f64 / sentence_count as f64 
                            } else { 
                                0.0 
                            }
                        });
                    }
                }
            }
            
            processed_users.push(processed_user);
        }
    }

    let duration = start.elapsed();
    
    Ok(Json(serde_json::json!({
        "storage_type": "complex_json_processing",
        "count": count,
        "duration_ms": duration.as_millis(),
        "records_processed": processed_users.len(),
        "processing_details": {
            "engagement_calculation": "completed",
            "tag_analysis": "completed", 
            "achievement_aggregation": "completed",
            "text_analysis": "completed"
        }
    })))
}
処理内容:

エンゲージメント率計算: 投稿数÷フォロワー数×100

タグ分析: 検証済みタグの割合計算

実績集計: 全実績のポイント合計

文字列分析: 単語数、文字数、文の数、平均単語数/文
重要なポイント: この処理では、JSON文字列として格納されたデータを一度構造体に変換してから、アプリケーション側で複雑な計算処理を行います。これにより、JSON型の「文字列として格納」という特性によるオーバーヘッドが明確に現れます。

 4. テストデータ生成検証内容: 各ストレージ方式でのデータ生成性能（JSON型は文字列として格納）
// カラム型データ生成
async fn generate_column_data(
    State(pool): State<AppState>,
    Path(count): Path<i32>,
) -> Result<Json<serde_json::Value>, StatusCode> {
    for i in 1..=count {
        let user_id = Uuid::new_v4().to_string();
        let preferences = serde_json::json!({
            "theme": if i % 2 == 0 { "dark" } else { "light" },
            "language": match i % 3 { 0 => "ja", 1 => "en", _ => "es" },
            "notifications": if i % 4 == 0 { "true" } else { "false" }
        });
        let social_links = serde_json::json!([
            format!("https://twitter.com/user{}", i),
            format!("https://github.com/user{}", i)
        ]);
        
        sqlx::query(
            r#"
            INSERT INTO users_column (id, name, email, age, bio, avatar_url, preferences, social_links)
            VALUES (?, ?, ?, ?, ?, ?, ?, ?)
            "#
        )
        .bind(user_id)
        .bind(format!("User {}", i))
        .bind(format!("user{}@example.com", i))
        .bind(20 + (i % 60))
        .bind(format!("Bio for user {}", i))
        .bind(if i % 3 == 0 { Some(format!("https://example.com/avatar{}.jpg", i)) } else { None })
        .bind(preferences.to_string())
        .bind(social_links.to_string())
        .execute(&pool)
        .await?;
    }

    Ok(Json(serde_json::json!({
        "message": format!("Generated {} records in users_column", count)
    })))
}

// 複雑なJSONデータ生成（文字列として格納）
async fn generate_complex_data(
    State(pool): State<AppState>,
    Path(count): Path<i32>,
) -> Result<Json<serde_json::Value>, StatusCode> {
    for i in 1..=count {
        let user_id = Uuid::new_v4().to_string();
        
        // 複雑なJSON構造体を文字列として格納するためのデータを生成
        let complex_data = serde_json::json!({
            "id": user_id,
            "name": format!("Complex User {}", i),
            "email": format!("complex.user{}@example.com", i),
            "age": 20 + (i % 60),
            "profile": {
                "bio": format!("Complex bio for user {} with very long description that includes multiple sentences and various details about their background, interests, and activities.", i),
                "avatar_url": if i % 3 == 0 { Some(format!("https://example.com/avatar{}.jpg", i)) } else { None },
                "preferences": {
                    "theme": if i % 2 == 0 { "dark" } else { "light" },
                    "language": match i % 3 { 0 => "ja", 1 => "en", _ => "es" },
                    "notifications": if i % 4 == 0 { "true" } else { "false" },
                    "timezone": "Asia/Tokyo",
                    "currency": "JPY",
                    "accessibility": {
                        "high_contrast": i % 2 == 0,
                        "screen_reader": i % 3 == 0,
                        "font_size": "medium"
                    }
                },
                "social_links": [
                    format!("https://twitter.com/complex_user{}", i),
                    format!("https://github.com/complex_user{}", i),
                    format!("https://linkedin.com/in/complex_user{}", i),
                    format!("https://facebook.com/complex_user{}", i)
                ],
                "achievements": [
                    {
                        "id": format!("achievement_{}", i),
                        "name": format!("Achievement {}", i),
                        "description": format!("Description for achievement {}", i),
                        "earned_at": "2024-08-30T08:00:00Z",
                        "points": 100 + (i * 10)
                    }
                ],
                "statistics": {
                    "posts_count": 100 + (i * 5),
                    "followers_count": 500 + (i * 20),
                    "following_count": 200 + (i * 10),
                    "likes_received": 1000 + (i * 50),
                    "comments_made": 50 + (i * 3)
                }
            },
            "metadata": {
                "created_at": "2024-08-30T08:00:00Z",
                "last_login": "2024-08-30T21:00:00Z",
                "login_count": 100 + i,
                "is_verified": i % 5 == 0,
                "is_premium": i % 7 == 0,
                "tags": [
                    format!("tag_{}", i),
                    format!("category_{}", i % 10),
                    if i % 2 == 0 { "active" } else { "inactive" },
                    if i % 3 == 0 { "verified" } else { "unverified" }
                ]
            }
        });
        
        // JSON構造体を文字列としてデータベースに格納
        sqlx::query(
            r#"
            INSERT INTO users_json (id, data)
            VALUES (?, ?)
            "#
        )
        .bind(user_id)
        .bind(complex_data.to_string())  // JSON構造体を文字列に変換して格納
        .execute(&pool)
        .await?;
    }

    Ok(Json(serde_json::json!({
        "message": format!("Generated {} complex records in users_json", count)
    })))
}

 技術的な実装詳細
 JSON型で保存される実際のデータ構造{
  "id": "uuid-string",
  "name": "Complex User 1",
  "email": "complex.user1@example.com",
  "age": 25,
  "profile": {
    "bio": "Complex bio with long description...",
    "avatar_url": "https://example.com/avatar1.jpg",
    "preferences": {
      "theme": "dark",
      "language": "ja",
      "notifications": "true",
      "timezone": "Asia/Tokyo",
      "currency": "JPY",
      "accessibility": {
        "high_contrast": true,
        "screen_reader": false,
        "font_size": "medium"
      }
    },
    "social_links": [
      "https://twitter.com/complex_user1",
      "https://github.com/complex_user1",
      "https://linkedin.com/in/complex_user1",
      "https://facebook.com/complex_user1"
    ],
    "achievements": [
      {
        "id": "achievement_1",
        "name": "Achievement 1",
        "description": "Description for achievement 1",
        "earned_at": "2024-08-30T08:00:00Z",
        "points": 110
      }
    ],
    "statistics": {
      "posts_count": 105,
      "followers_count": 520,
      "following_count": 210,
      "likes_received": 1050,
      "comments_made": 53
    }
  },
  "metadata": {
    "created_at": "2024-08-30T08:00:00Z",
    "last_login": "2024-08-30T21:00:00Z",
    "login_count": 101,
    "is_verified": false,
    "is_premium": false,
    "tags": ["tag_1", "category_1", "active", "unverified"]
  }
}

 複雑な処理の内容ベンチマークで実行された複雑な処理：

統計情報の計算: エンゲージメント率の算出

タグの分析: 検証率の計算

実績の集計: ポイントの合計計算

文字列処理: 単語数、文字数、文の分析

 なぜカラム型が優位なのか
 カラム型の優位性
データベースエンジンの最適化: MariaDBのInnoDBエンジンがカラム型データを効率的に処理

インデックスの効果: カラム型では個別のカラムにインデックスが効く

メモリ効率: カラム型は必要なデータのみを読み込み

SQL集計関数の活用: データベース側で集計処理が可能

 JSON型の制限
シリアライゼーションオーバーヘッド: JSON文字列の解析に時間がかかる

メモリ使用量: 全データを一度に読み込む必要

複雑な処理: アプリケーション側での処理が非効率

インデックスの制限: JSON文字列内の特定フィールドへのインデックスが困難

文字列処理のオーバーヘッド: JSON構造体を文字列として格納・取得する際の変換コスト

 実務での選択指針
 カラム型を選ぶべき場合大量のデータ（10,000件以上）
複雑な集計・分析処理
パフォーマンスが重要な場合
データの正規化が可能な場合
頻繁な検索・フィルタリングが必要な場合

 JSON文字列格納を選ぶべき場合小規模なデータ（1,000件以下）
スキーマが頻繁に変更される場合
柔軟性が重要な場合
プロトタイピング段階
構造が複雑で正規化が困難な場合

 実装コード例
 Rustでの実装// JSON型でのデータ保存
async fn save_json_user(pool: &MySqlPool, user_data: &serde_json::Value) -> Result<()> {
    sqlx::query("INSERT INTO users_json (id, data) VALUES (?, ?)")
        .bind(Uuid::new_v4().to_string())
        .bind(serde_json::to_string(user_data)?)
        .execute(pool)
        .await?;
    Ok(())
}

// カラム型でのデータ保存
async fn save_column_user(pool: &MySqlPool, user: &User) -> Result<()> {
    sqlx::query("INSERT INTO users_column (id, name, email, age, bio, avatar_url, preferences, social_links) VALUES (?, ?, ?, ?, ?, ?, ?, ?)")
        .bind(&user.id)
        .bind(&user.name)
        .bind(&user.email)
        .bind(user.age)
        .bind(&user.bio)
        .bind(&user.avatar_url)
        .bind(&user.preferences)
        .bind(&user.social_links)
        .execute(pool)
        .await?;
    Ok(())
}

 実行方法# アプリケーションの起動（自動でベンチマーク実行）
cargo run

# 手動でベンチマーク実行
curl -X POST http://localhost:3000/generate/complex/1000
curl http://localhost:3000/benchmark/complex/1000

 結論このベンチマーク結果は、データ量と処理の複雑さに応じて、適切なストレージ方式を選択する重要性を明確に示しています。

小規模データ: JSON文字列格納（JSON構造体を文字列として格納）が有利

大規模データ: カラム型（正規化されたテーブル）が圧倒的に有利

複雑な処理: カラム型の優位性がより明確になる
特に、JSON文字列格納は「JSON構造体を文字列として格納」する方式であるため、データ量が増加するとシリアライゼーション/デシリアライゼーションのオーバーヘッドが顕著になり、パフォーマンスが劣化することが確認できました。

 参考資料プロジェクトのGitHubリポジトリ
MariaDB JSON型のドキュメント
SQLx Rust SQL Toolkit
ベンチマークに使用したソースコード
件数	カラム型	JSON文字列格納	複雑JSON処理	勝者
1,000	8ms	4ms	44ms	JSON文字列格納
10,000	93ms	247ms	447ms	カラム型
50,000	280ms	638ms	2,262ms	カラム型
100,000	501ms	1,289ms	4,440ms	カラム型
コラボスタイル DevelopersPublication
「ワークスタイルの未来を切り拓く」を理念に掲げ、リアルとデジタルの2つの働く場に向けた事業展開に留まらず、自社が率先して新しいワークスタイルに挑戦し、発信を行うことでワークスタイルの未来を切り拓いていきます。
はじめに

比較対象の実装方式

カラム型ストレージ（正規化されたテーブル構造）

JSON文字列格納（非正規化されたテーブル構造）

具体的な格納方式の違い

ベンチマーク結果

10万件データでの比較結果

結果の分析

1. データ量による性能の逆転現象

2. スケーラビリティの違い

3. 複雑な処理の影響

ベンチマークの詳細検証内容

1. カラム型ストレージのベンチマーク

2. JSON文字列格納のベンチマーク

3. 複雑なJSON処理のベンチマーク

4. テストデータ生成

技術的な実装詳細

JSON型で保存される実際のデータ構造

複雑な処理の内容

なぜカラム型が優位なのか

カラム型の優位性

JSON型の制限

実務での選択指針

カラム型を選ぶべき場合

JSON文字列格納を選ぶべき場合

実装コード例

Rustでの実装

実行方法

結論

参考資料

Discussion