
How to Write Tests in MLOps: Improving the Quality of Machine Learning Projects

Published on 2025/03/14

Machine Learning

MLOps (Machine Learning Operations) is the discipline of running machine learning models reliably in production. As in conventional software development, testing plays an important role in MLOps. However, machine learning brings its own challenges, and the unit and integration tests used for ordinary software are not enough on their own.
This article explains how to write tests in MLOps. Specifically, it covers the following kinds of tests:
Unit testing
Data testing
Model testing
Integration testing
End-to-end testing
Monitoring and continuous testing

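 1. Unit Testing
Unit tests confirm that individual functions and classes behave correctly. In machine learning they apply to preprocessing functions, feature engineering steps, data transformation functions, and similar code.

 Example: unit test for a data preprocessing function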
import unittest
import pandas as pd
from sklearn.preprocessing import StandardScaler

# Preprocessing function: standardize every column to zero mean and unit variance
def preprocess_data(df: pd.DataFrame) -> pd.DataFrame:
    scaler = StandardScaler()
    # Keep the original column names and index on the scaled result
    df_scaled = pd.DataFrame(scaler.fit_transform(df), columns=df.columns, index=df.index)
    return df_scaled

class TestPreprocessing(unittest.TestCase):
    def test_preprocess_data(self):
        df = pd.DataFrame({'feature1': [1, 2, 3], 'feature2': [4, 5, 6]})
        processed_df = preprocess_data(df)

        # Check the shape
        self.assertEqual(processed_df.shape, df.shape)

        # Check that the mean of each column is close to 0
        # (summing the means could hide errors that cancel out)
        for column_mean in processed_df.mean():
            self.assertAlmostEqual(column_mean, 0, places=5)

if __name__ == '__main__':
    unittest.main()

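 Key points

assertEqual(): checks that the shape of the transformed data matches the shape of the original data.

assertAlmostEqual(): checks that the mean of the standardized data is approximately 0.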
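
Edge cases of a preprocessing function are also worth pinning down in unit tests. The following is a minimal sketch, assuming the same StandardScaler-based preprocess_data as in the example above. The two test functions and the constant-column case are illustrative additions, not part of the original example.

```python
import numpy as np
import pandas as pd
from sklearn.preprocessing import StandardScaler


def preprocess_data(df: pd.DataFrame) -> pd.DataFrame:
    # Same idea as the article's function: standardize every column
    scaler = StandardScaler()
    return pd.DataFrame(scaler.fit_transform(df), columns=df.columns, index=df.index)


def test_each_column_is_standardized():
    df = pd.DataFrame({"feature1": [1.0, 2.0, 3.0, 4.0], "feature2": [10.0, 20.0, 30.0, 50.0]})
    processed = preprocess_data(df)
    # StandardScaler uses the population standard deviation (ddof=0)
    np.testing.assert_allclose(processed.mean().to_numpy(), 0.0, atol=1e-9)
    np.testing.assert_allclose(processed.std(ddof=0).to_numpy(), 1.0, atol=1e-9)


def test_constant_column_does_not_produce_nan():
    df = pd.DataFrame({"constant": [5.0, 5.0, 5.0], "feature": [1.0, 2.0, 3.0]})
    processed = preprocess_data(df)
    # A zero-variance column is left at 0 rather than divided by zero
    assert not processed.isnull().any().any()
    assert (processed["constant"] == 0.0).all()
```

Checking each column separately avoids the case where a positive error in one column cancels a negative error in another.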

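 2. Data Testing
In machine learning, data quality has a large effect on model accuracy. Data tests therefore check that the data contains no outliers or missing values.

 Example: checking for missing values and outliers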
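import pytest
import pandas as pd
import numpy as np

# Data validation functions
def check_missing_values(df: pd.DataFrame) -> bool:
    # True when the DataFrame contains no missing values
    return df.isnull().sum().sum() == 0

def check_outliers(df: pd.DataFrame, threshold: float = 3) -> bool:
    # True when no value has an absolute z-score above the threshold
    z_scores = (df - df.mean()) / df.std()
    return (z_scores.abs() > threshold).sum().sum() == 0

@pytest.fixture
def sample_data():
    # feature2 deliberately contains one missing value
    return pd.DataFrame({'feature1': [1, 2, 3, 4, 5], 'feature2': [10, 15, 10, np.nan, 20]})

def test_missing_values(sample_data):
    # The fixture contains a NaN, so the check must report a failure
    assert not check_missing_values(sample_data), "The missing value was not detected!"

def test_outliers(sample_data):
    # No value lies more than 3 standard deviations from its column mean
    assert check_outliers(sample_data), "Outliers were detected!"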

 Key points

check_missing_values(): returns True when the data contains no missing values.

check_outliers(): uses z-scores to check that the data contains no outliers.

pytest.fixture supplies the test data.
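
One property of the z-score check is easy to miss: with the sample standard deviation that pandas uses by default (ddof=1), the largest possible absolute z-score in a column of n values is (n - 1) / sqrt(n). A five-row sample can therefore never exceed a threshold of 3. The sketch below illustrates this under the assumption that check_outliers is defined as in the example above; the test data and test names are made up for illustration.

```python
import pandas as pd


def check_outliers(df: pd.DataFrame, threshold: float = 3.0) -> bool:
    # True when no value has an absolute z-score above the threshold
    z_scores = (df - df.mean()) / df.std()
    return bool((z_scores.abs() > threshold).sum().sum() == 0)


def test_outlier_is_detected_with_enough_rows():
    # 21 rows: the extreme value has a z-score of about 4.36
    df = pd.DataFrame({"feature1": [10.0] * 20 + [1000.0]})
    assert not check_outliers(df)


def test_small_sample_hides_the_same_outlier():
    # 5 rows: the largest possible |z| is (5 - 1) / sqrt(5), about 1.79,
    # so the same extreme value passes a threshold of 3 unnoticed
    df = pd.DataFrame({"feature1": [10.0] * 4 + [1000.0]})
    assert check_outliers(df)
```

For small fixtures, a range check against known valid bounds may be a more dependable assertion than a z-score threshold.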

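 3. Model Testing
Model tests confirm that training and inference behave correctly. The main checks are:

Score check before and after training (is training progressing properly?)

Output type check for inference (is the data type appropriate?)

Prediction consistency test (does the same input give the same output?)

 Example: testing model training and prediction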
import numpy as np
import pytest
from sklearn.linear_model import LogisticRegression
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split

@pytest.fixture
def sample_data():
    # Synthetic binary classification data with a fixed seed for reproducibility
    X, y = make_classification(n_samples=100, n_features=5, random_state=42)
    return train_test_split(X, y, test_size=0.2, random_state=42)

def test_model_training(sample_data):
    X_train, X_test, y_train, y_test = sample_data
    model = LogisticRegression()
    model.fit(X_train, y_train)

    # Check that test accuracy exceeds 0.5
    assert model.score(X_test, y_test) > 0.5, "The model accuracy is too low!"

def test_model_prediction(sample_data):
    X_train, X_test, y_train, y_test = sample_data
    model = LogisticRegression()
    model.fit(X_train, y_train)
    y_pred = model.predict(X_test)

    # Check the output type
    assert isinstance(y_pred, np.ndarray), "The prediction output is not a NumPy array!"

 Key points

Use model.score() to confirm that training worked and the model reaches a minimum accuracy.
Check the data type of the predictions with isinstance().
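
The prediction consistency check listed at the start of this section is not part of the example above. One way it could look is sketched below, assuming the same synthetic data and LogisticRegression model; the helper train_model and the test name are hypothetical.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split


def train_model(X_train, y_train):
    # Hypothetical helper so the model can be retrained identically
    model = LogisticRegression()
    model.fit(X_train, y_train)
    return model


def test_prediction_consistency():
    X, y = make_classification(n_samples=100, n_features=5, random_state=42)
    X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

    model = train_model(X_train, y_train)

    # Same model, same input: the predictions must be identical
    assert np.array_equal(model.predict(X_test), model.predict(X_test))

    # Retraining on the same data should give the same predicted probabilities
    retrained = train_model(X_train, y_train)
    np.testing.assert_allclose(
        model.predict_proba(X_test), retrained.predict_proba(X_test), rtol=1e-6
    )
```

For models with random initialization or sampling, the second assertion only holds when the seed is fixed, for example through the estimator's random_state parameter.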

 4. Integration Testing
Integration tests exercise the whole flow: data preprocessing → training → inference.

 Example: integration test from training through inference
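from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

def train_and_predict():
    X, y = make_classification(n_samples=100, n_features=5, random_state=42)
    X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

    model = LogisticRegression()
    model.fit(X_train, y_train)
    return model.predict(X_test)

def test_pipeline():
    predictions = train_and_predict()

    # Check that predictions were returned
    assert len(predictions) > 0, "The prediction result is empty!"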
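
The example above goes from training to inference without a preprocessing step. A minimal sketch that also covers preprocessing is shown below, assuming scikit-learn's Pipeline with a StandardScaler in front of the same LogisticRegression model. The function names build_pipeline and test_preprocess_train_predict_flow are hypothetical.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler


def build_pipeline() -> Pipeline:
    # Preprocessing and model chained so they are fitted and applied together
    return Pipeline([
        ("scaler", StandardScaler()),
        ("classifier", LogisticRegression()),
    ])


def test_preprocess_train_predict_flow():
    X, y = make_classification(n_samples=100, n_features=5, random_state=42)
    X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

    pipeline = build_pipeline()
    pipeline.fit(X_train, y_train)
    predictions = pipeline.predict(X_test)

    # One prediction per test row, and only labels seen during training
    assert len(predictions) == len(X_test)
    assert set(predictions) <= set(y_train)
```

Chaining the steps means the scaler is fitted on the training split only, which is the behavior an integration test would usually want to lock in.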

 5. End-to-End Testing
End-to-end tests confirm that the model deployed to production behaves correctly.
Checking the behavior of the API endpoint
Checking the model's response time
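import requests

def test_model_api():
    # Assumes the prediction API is already running locally on port 5000
    url = "http://localhost:5000/predict"
    # The timeout keeps the test from hanging if the server does not respond
    response = requests.post(url, json={"features": [1.2, 3.4, 5.6, 7.8, 9.0]}, timeout=5)

    assert response.status_code == 200, "The API response is not correct!"
    assert "prediction" in response.json(), "No prediction was returned!"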
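
The response time check listed above is not covered by the example. The sketch below shows one possible shape for it. So that it runs on its own, it starts a stub server from the standard library in place of the deployed model and uses urllib instead of requests. StubPredictHandler, measure_latency, and the 1.0 second budget are all assumptions made for illustration.

```python
import json
import threading
import time
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer


class StubPredictHandler(BaseHTTPRequestHandler):
    """Hypothetical stand-in for the deployed model API."""

    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        payload = json.loads(self.rfile.read(length))
        # Dummy rule in place of a real model
        body = json.dumps({"prediction": int(sum(payload["features"]) > 0)}).encode()
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep test output quiet


def measure_latency(url, features):
    # Send one prediction request and time the full round trip
    data = json.dumps({"features": features}).encode()
    request = urllib.request.Request(
        url, data=data, headers={"Content-Type": "application/json"}
    )
    start = time.perf_counter()
    with urllib.request.urlopen(request, timeout=5) as response:
        status = response.status
        result = json.loads(response.read())
    elapsed = time.perf_counter() - start
    return status, result, elapsed


def test_model_api_latency():
    server = HTTPServer(("127.0.0.1", 0), StubPredictHandler)  # port 0: pick a free port
    thread = threading.Thread(target=server.serve_forever, daemon=True)
    thread.start()
    try:
        url = f"http://127.0.0.1:{server.server_port}/predict"
        status, result, elapsed = measure_latency(url, [1.2, 3.4, 5.6, 7.8, 9.0])
    finally:
        server.shutdown()
        server.server_close()

    assert status == 200
    assert "prediction" in result
    assert elapsed < 1.0  # assumed latency budget in seconds
```

Against a real deployment, the stub server would be dropped and the URL pointed at the actual endpoint, with the budget set from the service's own requirements.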

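 Summary
Testing in MLOps goes beyond unit tests. It also needs to cover data quality checks, model accuracy checks, integration tests, and end-to-end tests. Together these improve the quality of machine learning models and make stable operation possible.
As a next step, building automated tests that run as part of CI/CD makes practicing MLOps even more efficient.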
