Enhanced receipt OCR management: OCR accuracy is now easy to check

マッサン (Masanori Yoshida)

Published 2025/03/16

Introduction

I have enhanced the receipt OCR management tool described in an article I posted the other day.

Overview of the whole system after the enhancements

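This Chrome extension, "AI OCR Extension", photographs receipts with the PC camera and uses AI to extract the receipt information as text data.
It has the potential to streamline accounting work: it cuts down the effort of entering receipts by hand, while still letting a human check and edit what the AI extracted. Accuracy has room for improvement, but because the OCR result includes coordinates on the image, you can visually check how each extracted value corresponds to the image.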

The extension in use. The mapping between the extracted data and the image still has room for improvement.

Main features:

Multiple AI models:

Supports three recent AI models (Gemini, Claude, ChatGPT)
Users can choose based on performance and usage

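Camera capture and OCR processing:

Photograph the receipt with the PC webcam

The AI model's image recognition extracts the following information:

Payee company name, issue date, payment amount (tax included)
Currency, registration number (the qualified invoice issuer's registration number)
Other notes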
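
Since accuracy is not perfect, it can help to flag values that obviously need a second look before a human reviews them. The sketch below is not part of the extension; `FIELD_PATTERNS` and `findSuspectFields` are hypothetical names, and the regular expressions are copied from the JSON schema embedded in the prompt in popup.js.

```javascript
// Hypothetical helper (not in the extension). Patterns are copied from the
// JSON schema that popup.js sends to the model.
const FIELD_PATTERNS = {
  issueDate: /^[0-9]{4}-(0[1-9]|1[0-2])-(0[1-9]|1\d|2\d|3[01])$/,
  amountIncludingTax: /^\d{1,3}(,\d{3})*(\.\d{2})?$/,
  currency: /^[A-Z]{3}$/,
  registrationNumber: /^T\d{13}$/,
};

// Returns the names of fields whose value is missing, "N/A",
// or does not match the expected pattern.
function findSuspectFields(receipt) {
  const suspects = [];
  for (const [name, pattern] of Object.entries(FIELD_PATTERNS)) {
    const value = receipt[name]?.value;
    if (typeof value !== 'string' || value === 'N/A' || !pattern.test(value)) {
      suspects.push(name);
    }
  }
  return suspects;
}

const sample = {
  issueDate: { value: '2025-03-15' },
  amountIncludingTax: { value: '12,345' },
  currency: { value: 'JPY' },
  registrationNumber: { value: 'N/A' },
};
console.log(findSuspectFields(sample)); // [ 'registrationNumber' ]
```

A check like this only catches format problems; a well-formed but wrong amount still needs the visual comparison against the image.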

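Reviewing and editing results:

Visually check which part of the original image each extracted value came from (using JSON data with coordinates)
Selecting a form field highlights the corresponding area on the image
Inaccurate values can be edited by hand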
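
As a rough illustration of how a coordinate-tagged field maps onto the displayed image, here is a minimal sketch. `scaleBox` is a hypothetical helper, and it assumes the image is shown with its aspect ratio preserved, so a plain ratio between source size and displayed size is enough.

```javascript
// Hypothetical helper: convert a box reported in source-image pixels into
// on-screen pixels for an image that is displayed at a different size.
function scaleBox(box, sourceSize, displayedSize) {
  const scaleX = displayedSize.width / sourceSize.width;
  const scaleY = displayedSize.height / sourceSize.height;
  return {
    left: Math.round(box.x * scaleX),
    top: Math.round(box.y * scaleY),
    width: Math.round(box.width * scaleX),
    height: Math.round(box.height * scaleY),
  };
}

// One field as the model is asked to return it: a value plus its bounding box.
const field = { value: '12,345', x: 320, y: 240, width: 160, height: 40 };
const overlay = scaleBox(field, { width: 1280, height: 720 }, { width: 640, height: 360 });
console.log(overlay); // { left: 160, top: 120, width: 80, height: 20 }
```

The result can be applied as `left`/`top`/`width`/`height` on an absolutely positioned element inside a `position: relative` container that wraps the image.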

Saving data:

Edited data can be saved as JSON
The captured image can also be saved as PNG
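
For illustration only, the JSON side of saving could be prepared along these lines. `buildExport` and the `receipt-<timestamp>.json` naming are assumptions of mine, not the extension's actual behaviour.

```javascript
// Hypothetical helper: turn the edited form values into a JSON document
// and a timestamped file name (the naming scheme is an assumption).
function buildExport(editedFields, savedAt = new Date()) {
  // ISO timestamps contain ":" and ".", which are awkward in file names.
  const stamp = savedAt.toISOString().replace(/[:.]/g, '-');
  return {
    filename: `receipt-${stamp}.json`,
    content: JSON.stringify(editedFields, null, 2),
  };
}

const exported = buildExport(
  { payeeName: 'Example Store', currency: 'JPY' },
  new Date('2025-03-16T00:00:00Z')
);
console.log(exported.filename); // receipt-2025-03-16T00-00-00-000Z.json
```

In a browser page, `content` could then be wrapped in a `Blob` and offered through a temporary download link.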

API key management:

Manages the API key for each AI service securely
Masks keys for security
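
Masking can be as simple as hiding everything except the last few characters. The following is a minimal sketch under that assumption; `maskApiKey` is a hypothetical name and the extension's options page may mask keys differently.

```javascript
// Hypothetical helper: replace all but the last `visible` characters with "*".
function maskApiKey(key, visible = 4) {
  if (typeof key !== 'string' || key.length === 0) return '';
  // Very short keys are masked completely so nothing leaks.
  if (key.length <= visible) return '*'.repeat(key.length);
  return '*'.repeat(key.length - visible) + key.slice(-visible);
}

console.log(maskApiKey('abcdefgh')); // ****efgh
```

The same helper is handy before writing request details to the console, so that a full key never ends up in logs.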

Processing flow:

Launch the extension → camera screen → select AI model → capture → OCR processing
Review/edit screen → edit data → save JSON/image, or retake

Source code

manifest.json

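Configuration file for the Chrome extension
Name "AI OCR Extension": a Chrome extension that captures images with the PC camera and runs OCR through the Gemini/Claude/ChatGPT APIs
Permissions: storage (data storage), tabs (tab operations)
Background processing: service-worker.js
Options page: options.html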

manifest.json
{
    "name": "AI OCR Extension",
    "description": "PCカメラで撮影し、Gemini/Claude/ChatGPT APIでOCRを行うChrome拡張",
    "version": "1.0.0",
    "manifest_version": 3,
    "permissions": [
      "storage",
      "tabs"
    ],
    "action": {
      "default_title": "AI OCR Extension"
    },
    "background": {
      "service_worker": "service-worker.js"
    },
    "options_ui": {
      "page": "options.html",
      "open_in_tab": true
    }
}

service-worker.js

A simple script that opens popup.html in a new tab when the extension icon is clicked

service-worker.js
// service-worker.js
chrome.action.onClicked.addListener(() => {
  chrome.tabs.create({
    url: 'popup.html'
  });
});

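popup.html / popup.js

Main screen: camera startup, capture, and OCR processing
Choose from three AI models (Gemini 2.0 Flash, Claude 3.7 Sonnet, GPT-4o)
Camera controls (start, capture, retake)
Runs OCR by calling the API of the selected AI model
Defines a detailed prompt that makes the AI extract the receipt information in a specific JSON format
Saves the OCR result and navigates to comparison.html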
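
The per-model differences (display name and which stored API key to use) can also be expressed as a lookup table instead of a `switch`. This is an alternative sketch, not the code used in popup.js; `MODEL_CONFIG` and `resolveModel` are hypothetical names, while the storage key names are the ones popup.js reads.

```javascript
// Hypothetical lookup table. The storage key names (geminiApiKey, claudeApiKey,
// openaiApiKey) are the ones popup.js reads from chrome.storage.local.
const MODEL_CONFIG = new Map([
  ['gemini', { label: 'Gemini', storageKey: 'geminiApiKey' }],
  ['claude', { label: 'Claude', storageKey: 'claudeApiKey' }],
  ['chatgpt', { label: 'ChatGPT', storageKey: 'openaiApiKey' }],
]);

// Resolve the display name and API key for the selected model.
// `storedKeys` is a plain object such as the one chrome.storage.local.get() returns.
function resolveModel(selectedModel, storedKeys) {
  const config = MODEL_CONFIG.get(selectedModel);
  if (!config) throw new Error(`Unknown model: ${selectedModel}`);
  return { label: config.label, apiKey: storedKeys[config.storageKey] ?? null };
}

console.log(resolveModel('claude', { claudeApiKey: 'dummy-key' }));
// { label: 'Claude', apiKey: 'dummy-key' }
```

With a table like this, adding a fourth model means adding one entry rather than touching several `switch` statements.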

popup.html as displayed

popup.html
<!DOCTYPE html>
<html>
  <head>
    <meta charset="utf-8" />
    <meta name="viewport" content="width=device-width, initial-scale=1.0" />
    <title>領収書OCR</title>
    <style>
      body {
        font-family: sans-serif;
        max-width: 800px;
        margin: 0 auto;
        padding: 20px;
      }
      h1, h2 {
        margin: 0.5em 0;
      }
      button {
        margin: 8px 8px 8px 0;
        padding: 8px 16px;
        cursor: pointer;
      }
      /* Frame style for the video and the captured image */
      #video, #capturedImage {
        display: block;
        width: 100%;
        max-height: 480px;
        object-fit: cover;
        border: 1px solid #ccc;
        margin-bottom: 15px;
        background: #eee;
      }
      .section {
        margin-bottom: 20px;
      }
      /* Button group */
      .button-group {
        display: flex;
        gap: 10px;
        flex-wrap: wrap;
      }
      /* Retake button */
      #resetBtn {
        background-color: #f5f5f5;
        border: 1px solid #ddd;
      }
      /* Container layout */
      .container {
        max-width: 800px;
        margin: 0 auto;
      }
      /* AI model selector */
      .model-selection {
        margin-bottom: 15px;
        display: flex;
        align-items: center;
        gap: 10px;
      }
      .model-selection label {
        font-weight: bold;
      }
      .model-selection select {
        padding: 8px;
        border-radius: 4px;
        border: 1px solid #ccc;
      }
      /* Responsive layout */
      @media (max-width: 600px) {
        body {
          padding: 10px;
        }
        #video, #capturedImage {
          max-height: 360px;
        }
      }
    </style>
  </head>
  <body>
    <div class="container">
      <h1>領収書 OCR</h1>
      
      <!-- AI model selector -->
      <div class="model-selection">
        <label for="aiModel">AIモデル:</label>
        <select id="aiModel">
          <option value="gemini">Gemini 2.0 Flash</option>
          <option value="claude">Claude 3.7 Sonnet</option>
          <option value="chatgpt">ChatGPT-4o</option>
        </select>
      </div>
      
      <!-- Camera feed / captured image area -->
      <div class="section">
        <video id="video" autoplay></video>
        <img id="capturedImage" alt="撮影画像プレビュー" style="display:none;" />
        <!-- Hidden canvas used to grab the frame -->
        <canvas id="canvas" style="display:none;"></canvas>
      </div>
      <!-- Buttons -->
      <div class="section button-group">
        <button id="captureBtn">撮影</button>
        <button id="resetBtn" style="display:none;">撮り直し</button>
        <button id="ocrBtn" disabled>OCR解析</button>
      </div>
    </div>
    <script src="popup.js"></script>
  </body>
</html>

popup.js
// popup.js
let currentStream = null;

// Element references
const video = document.getElementById('video');
const canvas = document.getElementById('canvas');
const captureBtn = document.getElementById('captureBtn');
const ocrBtn = document.getElementById('ocrBtn');
const capturedImage = document.getElementById('capturedImage');
// Retake button
const resetBtn = document.getElementById('resetBtn');

// Start the camera (a function so it can be reused for retakes)
function startCamera() {
  // Stop the existing stream, if any
  if (currentStream) {
    currentStream.getTracks().forEach(track => track.stop());
  }
  
  // Request the camera
  return navigator.mediaDevices.getUserMedia({ video: true })
    .then(stream => {
      currentStream = stream;
      video.srcObject = stream;
      video.style.display = 'block';
      
      // Reset the UI state
      capturedImage.style.display = 'none';
      resetBtn.style.display = 'none';
      ocrBtn.disabled = true;
      
      return stream;
    })
    .catch(err => {
      console.error('Camera access error:', err);
      alert('カメラへのアクセスに失敗しました: ' + err.message);
    });
}

// 1. Start the camera on load
startCamera();

// 2. Capture: draw the frame to the canvas, hide the video, show the image preview
captureBtn.addEventListener('click', () => {
  if (!video.srcObject) {
    alert('カメラが起動していません。');
    return;
  }
  
  // Draw the current video frame to the canvas at the video's native size
  const width = video.videoWidth;
  const height = video.videoHeight;
  canvas.width = width;
  canvas.height = height;
  const ctx = canvas.getContext('2d');
  ctx.drawImage(video, 0, 0, width, height);
  
  // Convert the captured frame to a data URL and show it in the img element
  const dataUrl = canvas.toDataURL('image/png');
  capturedImage.src = dataUrl;
  
  // Stop and hide the camera feed
  if (currentStream) {
    currentStream.getTracks().forEach(track => track.stop());
    currentStream = null;
  }
  video.style.display = 'none';
  video.srcObject = null;
  
  // Show the captured image instead
  capturedImage.style.display = 'block';
  
  // Enable the OCR button
  ocrBtn.disabled = false;
  // Show the retake button
  resetBtn.style.display = 'block';
});

// Reference to the AI model selector
const aiModelSelect = document.getElementById('aiModel');

// Persist the selected AI model
aiModelSelect.addEventListener('change', () => {
  chrome.storage.local.set({ selectedAiModel: aiModelSelect.value });
});

// Restore the previously selected AI model
chrome.storage.local.get('selectedAiModel', (data) => {
  if (data.selectedAiModel) {
    aiModelSelect.value = data.selectedAiModel;
  }
});

// Prompt text (shared by all models)
const getPromptText = () => {
  return `
  あなたは優秀な経理担当者です。受け取った領収書を画像解析して文字や金額を起こしてください。
## 重要事項
- わからない項目がある場合は、正直に「N/A」と記入してください。
- 1枚の画像に複数の領収書が含まれている場合は、それぞれの領収書ごとに別々のJSONを作成してください。
- 回答はJSONのみで出力してください。
- 標準的でない形式や追加情報がある場合は、各行の注記として記載してください。
- テキスト出力した根拠となる画像の場所について、クロップできるように、それぞれ座標(x,y)と幅、高さも教えてください。単位はpxでお願いします。

## 項目の説明
- 支払先会社名
- 発行日
- 支払金額税込
- 通貨
- 登録番号

## 出力形式
以下の項目をJSON形式で出力してください。

## 出力項目（優先順位順）
1. 支払先会社名
2. 発行日
3. 支払金額税込
4. 通貨
5. 登録番号
6. 注記

## JSONの定義
{
"$schema": "http://json-schema.org/draft-07/schema#",
"title": "InvoiceFields",
"type": "object",
"properties": {
  "imageWidthPx": {
    "type": "integer",
    "description": "撮影した画像の幅(px)"
  },
  "imageHeightPx": {
    "type": "integer",
    "description": "撮影した画像の高さ(px)"
  },
  "payeeName": {
    "type": "object",
    "title": "支払先会社名",
    "description": "支払先の会社名。宛名や請求先ではなく、実際に支払う先の会社名を示します。",
    "properties": {
      "value": {
        "type": "string",
        "description": "実際の文字列値（支払先会社名）"
      },
      "x": {
        "type": "number",
        "description": "座標X"
      },
      "y": {
        "type": "number",
        "description": "座標Y"
      },
      "width": {
        "type": "number",
        "description": "幅"
      },
      "height": {
        "type": "number",
        "description": "高さ"
      }
    },
    "required": ["value", "x", "y", "width", "height"]
  },
  "issueDate": {
    "type": "object",
    "title": "発行日",
    "description": "領収書を発行した日付（YYYY-MM-DD形式）",
    "properties": {
      "value": {
        "type": "string",
        "description": "実際の文字列値（発行日）",
        "pattern": "^[0-9]{4}-(0[1-9]|1[0-2])-(0[1-9]|1\\d|2\\d|3[01])$",
        "example": "2025-03-15"
      },
      "x": {
        "type": "number",
        "description": "座標X"
      },
      "y": {
        "type": "number",
        "description": "座標Y"
      },
      "width": {
        "type": "number",
        "description": "幅"
      },
      "height": {
        "type": "number",
        "description": "高さ"
      }
    },
    "required": ["value", "x", "y", "width", "height"]
  },
  "amountIncludingTax": {
    "type": "object",
    "title": "支払金額税込",
    "description": "税込み合計金額（カンマ区切り、小数点以下2桁まで）。税抜き金額しかない場合は、税額を加算して税込みにしてください。",
    "properties": {
      "value": {
        "type": "string",
        "description": "実際の文字列値（支払金額税込）",
        "pattern": "^\\d{1,3}(,\\d{3})*(\\.\\d{2})?$",
        "example": "12,345.67"
      },
      "x": {
        "type": "number",
        "description": "座標X"
      },
      "y": {
        "type": "number",
        "description": "座標Y"
      },
      "width": {
        "type": "number",
        "description": "幅"
      },
      "height": {
        "type": "number",
        "description": "高さ"
      }
    },
    "required": ["value", "x", "y", "width", "height"]
  },
  "currency": {
    "type": "object",
    "title": "通貨",
    "description": "支払金額の通貨。例：JPY、USD、EUR",
    "properties": {
      "value": {
        "type": "string",
        "description": "実際の文字列値（通貨）",
        "pattern": "^[A-Z]{3}$",
        "example": "JPY"
      },
      "x": {
        "type": "number",
        "description": "座標X"
      },
      "y": {
        "type": "number",
        "description": "座標Y"
      },
      "width": {
        "type": "number",
        "description": "幅"
      },
      "height": {
        "type": "number",
        "description": "高さ"
      }
    },
    "required": ["value", "x", "y", "width", "height"]
  },
  "registrationNumber": {
    "type": "object",
    "title": "登録番号",
    "description": "適格請求書発行事業者の登録番号。法人番号がある場合は「T+法人番号」、ない場合は「T+13桁の固有番号」(例：T0000000000000)。",
    "properties": {
      "value": {
        "type": "string",
        "description": "実際の文字列値（登録番号）",
        "pattern": "^T\\d{13}$",
        "example": "T1234567890123"
      },
      "x": {
        "type": "number",
        "description": "座標X"
      },
      "y": {
        "type": "number",
        "description": "座標Y"
      },
      "width": {
        "type": "number",
        "description": "幅"
      },
      "height": {
        "type": "number",
        "description": "高さ"
      }
    },
    "required": ["value", "x", "y", "width", "height"]
  },
  "notes": {
    "type": "object",
    "title": "注記",
    "description": "領収書や支払に関して補足や特記事項があれば記入します。",
    "properties": {
      "value": {
        "type": "string",
        "description": "実際の文字列値（注記）"
      },
      "x": {
        "type": "number",
        "description": "座標X"
      },
      "y": {
        "type": "number",
        "description": "座標Y"
      },
      "width": {
        "type": "number",
        "description": "幅"
      },
      "height": {
        "type": "number",
        "description": "高さ"
      }
    },
    "required": ["value", "x", "y", "width", "height"]
  }
},
"required": [
  "imageWidthPx",
  "imageHeightPx",
  "payeeName",
  "issueDate",
  "amountIncludingTax",
  "currency",
  "registrationNumber",
  "notes"
]
}
`.trim();
};

// Run OCR with the Gemini API
async function processWithGemini(base64Data, apiKey) {
  const promptText = getPromptText();
  
  // Gemini 2.0 Flash API endpoint
  const apiUrl = `https://generativelanguage.googleapis.com/v1beta/models/gemini-2.0-flash:generateContent?key=${apiKey}`;

  // Request body
  const requestBody = {
    contents: [
      {
        parts: [
          {
            text: promptText
          },
          {
            inline_data: {
              mime_type: "image/png",
              data: base64Data
            }
          }
        ]
      }
    ]
  };

  const response = await fetch(apiUrl, {
    method: 'POST',
    headers: {
      'Content-Type': 'application/json'
    },
    body: JSON.stringify(requestBody)
  });

  if (!response.ok) {
    const errorData = await response.json().catch(() => ({}));
    console.error('Gemini OCR request failed:', errorData);
    throw new Error(`Gemini OCR request failed: ${response.status}`);
  }

  const ocrData = await response.json();

  let ocrText = '';
  if (ocrData.candidates && ocrData.candidates.length > 0) {
    const firstCandidate = ocrData.candidates[0];
    if (firstCandidate.content?.parts) {
      ocrText = firstCandidate.content.parts.map(part => part.text).join('\n');
    }
  }

  return ocrText;
}

// Run OCR with the Claude API
async function processWithClaude(base64Data, apiKey) {
  const promptText = getPromptText();
  
  // Claude API endpoint
  const apiUrl = 'https://api.anthropic.com/v1/messages';

  // Request body
  const requestBody = {
    // model: "claude-3-5-sonnet-20241022",
    model: "claude-3-7-sonnet-20250219",
    max_tokens: 1024,
    messages: [
      {
        role: "user",
        content: [
          {
            type: "image",
            source: {
              type: "base64",
              media_type: "image/png", // sent as PNG
              data: base64Data
            }
          },
          {
            type: "text",
            text: promptText
          }
        ]
      }
    ]
  };

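  // Request headers (header values must be strings)
  const headers = {
    'x-api-key': apiKey,
    'anthropic-version': '2023-06-01',
    'content-type': 'application/json',
    'anthropic-dangerous-direct-browser-access': 'true'
  };

  // Log the headers with the API key masked so the secret never reaches the console
  console.log('Claude API Request Headers:', JSON.stringify({ ...headers, 'x-api-key': '***' }));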
  console.log('Claude API Request Body (structure):', JSON.stringify({
    model: requestBody.model,
    max_tokens: requestBody.max_tokens,
    messages: [{
      role: "user",
      content: "[image and text content]"
    }]
  }));

  const response = await fetch(apiUrl, {
    method: 'POST',
    headers: headers,
    body: JSON.stringify(requestBody)
  });

  if (!response.ok) {
    const errorText = await response.text();
    let errorDetails;
    try {
      errorDetails = JSON.parse(errorText);
      console.error('Claude API Error Response:', errorDetails);
    } catch (e) {
      console.error('Claude API Error (non-JSON):', errorText);
    }
    throw new Error(`Claude OCR request failed: ${response.status} - ${errorText}`);
  }

  const ocrData = await response.json();
  
  // Collect the text parts of the Claude API response
  let ocrText = '';
  if (ocrData.content && ocrData.content.length > 0) {
    ocrText = ocrData.content
      .filter(part => part.type === 'text')
      .map(part => part.text)
      .join('\n');
  }

  return ocrText;
}

// Run OCR with the ChatGPT (OpenAI) API
async function processWithChatGPT(base64Data, apiKey) {
  const promptText = getPromptText();
  
  // OpenAI Chat Completions endpoint
  const apiUrl = 'https://api.openai.com/v1/chat/completions';

  // Request body
  const requestBody = {
    model: "gpt-4o",
    messages: [
      {
        role: "user",
        content: [
          {
            type: "text",
            text: promptText
          },
          {
            type: "image_url",
            image_url: {
              // The canvas is exported as PNG, so the data URL must say image/png
              url: `data:image/png;base64,${base64Data}`
            }
          }
        ]
      }
    ],
    max_tokens: 1024
  };

  console.log('ChatGPT API Request (structure):', JSON.stringify({
    model: requestBody.model,
    messages: [{
      role: "user",
      content: "[text and image content]"
    }]
  }));

  const response = await fetch(apiUrl, {
    method: 'POST',
    headers: {
      'Content-Type': 'application/json',
      'Authorization': `Bearer ${apiKey}`
    },
    body: JSON.stringify(requestBody)
  });

  if (!response.ok) {
    const errorText = await response.text();
    let errorDetails;
    try {
      errorDetails = JSON.parse(errorText);
      console.error('ChatGPT API Error Response:', errorDetails);
    } catch (e) {
      console.error('ChatGPT API Error (non-JSON):', errorText);
    }
    throw new Error(`ChatGPT OCR request failed: ${response.status} - ${errorText}`);
  }

  const ocrData = await response.json();
  
  // Read the text of the first choice in the response
  let ocrText = '';
  if (ocrData.choices && ocrData.choices.length > 0) {
    ocrText = ocrData.choices[0].message.content;
  }

  return ocrText;
}

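// 3. On OCR button click, send the image and fetch the result
ocrBtn.addEventListener('click', async () => {
// The canvas element always exists, so check for a captured image instead
if (!capturedImage.getAttribute('src')) {
  alert('画像が撮影されていません。');
  return;
}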

try {
  // Show that OCR is in progress
  ocrBtn.disabled = true;
  ocrBtn.textContent = '処理中...';
  
  // Get the image from the canvas as Base64
  const dataUrl = canvas.toDataURL('image/png');
  const base64Data = dataUrl.split(',')[1]; // strip the leading "data:image/png;base64," prefix

  // Get the selected AI model
  const selectedModel = aiModelSelect.value;
  
  // Load the stored API keys for all models
  const apiKeys = await chrome.storage.local.get(['geminiApiKey', 'claudeApiKey', 'openaiApiKey']);
  
  let apiKey;
  let modelName;
  
  // Pick the API key and display name for the selected model
  switch (selectedModel) {
    case 'gemini':
      apiKey = apiKeys.geminiApiKey;
      modelName = 'Gemini';
      break;
    case 'claude':
      apiKey = apiKeys.claudeApiKey;
      modelName = 'Claude';
      break;
    case 'chatgpt':
      apiKey = apiKeys.openaiApiKey;
      modelName = 'ChatGPT';
      break;
  }
  
  // If no API key is set, send the user to the options page
  if (!apiKey) {
    alert(`先にオプションページで${modelName}のAPIキーを設定してください。`);
    chrome.runtime.openOptionsPage();
    ocrBtn.textContent = 'OCR解析';
    ocrBtn.disabled = false;
    return;
  }
  
  // Run OCR with the selected AI service
  let ocrText;
  try {
    switch (selectedModel) {
      case 'gemini':
        ocrText = await processWithGemini(base64Data, apiKey);
        break;
      case 'claude':
        ocrText = await processWithClaude(base64Data, apiKey);
        break;
      case 'chatgpt':
        ocrText = await processWithChatGPT(base64Data, apiKey);
        break;
    }
  } catch (error) {
    console.error(`${modelName} OCR処理エラー:`, error);
    alert(`${modelName} OCR処理中にエラーが発生しました: ${error.message}`);
    ocrBtn.textContent = 'OCR解析';
    ocrBtn.disabled = false;
    return;
  }

  // Save the captured image and the OCR result to storage
  await chrome.storage.local.set({
    'ocrResults': ocrText,
    'capturedImageData': dataUrl,
    'usedAiModel': selectedModel // remember which AI model was used
  });

  // Navigate to the comparison screen (within the same tab)
  window.location.href = 'comparison.html';

} catch (error) {
  console.error('OCR処理エラー:', error);
  alert('OCR処理中にエラーが発生しました: ' + error.message);
  ocrBtn.textContent = 'OCR解析';
  ocrBtn.disabled = false;
}
});

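// 4. Retake button
resetBtn.addEventListener('click', () => {
  // Reset the UI state
  capturedImage.style.display = 'none';
  ocrBtn.disabled = true;
  
  // Restart the camera
  startCamera()
    .then(stream => {
      // startCamera() handles its own errors and resolves with undefined on failure,
      // so only report success when a stream was actually returned
      if (!stream) return;
      console.log('カメラ再起動成功');
      resetBtn.style.display = 'none';
    })
    .catch(err => {
      console.error('カメラ再起動エラー:', err);
      alert('カメラの再起動に失敗しました。ページをリロードしてください。');
    });
});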

// 5. Clean up when leaving the page
window.addEventListener('beforeunload', () => {
  if (currentStream) {
    currentStream.getTracks().forEach(track => track.stop());
  }
});

comparison.html / comparison.js

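Screen for displaying and editing the OCR result
Shows the captured image and the extracted data (payee company name, issue date, amount, and so on)
Selecting a form field highlights the corresponding area on the image
Editing of the extracted data
Saving the edited data as JSON and saving the image
Returning to popup.html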
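
Even when the prompt asks for JSON only, a model may wrap the object in extra text, so the review screen has to be tolerant when parsing. A minimal sketch of that idea follows; `extractJson` is a hypothetical name, and it assumes the reply contains a single JSON object.

```javascript
// Hypothetical helper: parse a model reply that should be JSON but may carry
// extra text before or after the object. Assumes one JSON object per reply.
function extractJson(responseText) {
  try {
    return JSON.parse(responseText);
  } catch {
    // Not pure JSON; fall through and look for an embedded object.
  }
  const start = responseText.indexOf('{');
  const end = responseText.lastIndexOf('}');
  if (start === -1 || end <= start) {
    throw new Error('No JSON object found in the response');
  }
  return JSON.parse(responseText.slice(start, end + 1));
}

const reply = 'Here is the result:\n{"currency": {"value": "JPY"}}\nLet me know if anything looks off.';
console.log(extractJson(reply).currency.value); // JPY
```

A reply that contains several separate JSON objects (for example, one per receipt in the same photo) would still fail here and needs its own handling.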

comparison.html as displayed

comparison.html
<!DOCTYPE html>
<html>
  <head>
    <meta charset="utf-8" />
    <title>OCR結果確認</title>
    <style>
      body {
        font-family: sans-serif;
        margin: 20px;
        max-width: 1200px;
      }
      h1, h2 {
        margin: 0.5em 0;
      }
      button {
        margin: 8px 8px 8px 0;
        padding: 8px 16px;
        cursor: pointer;
      }
      .container {
        display: flex;
        flex-wrap: wrap;
        gap: 20px;
      }
      .image-section, .results-section {
        flex: 1;
        min-width: 300px;
      }
      .image-section {
        display: flex;
        flex-direction: column;
      }
      #originalImage {
        max-width: 100%;
        border: 1px solid #ccc;
        margin-bottom: 10px;
      }
      #cropOverlay {
        position: absolute;
        border: 3px solid rgba(255, 0, 0, 0.8);
        background-color: rgba(255, 0, 0, 0.2);
        pointer-events: none;
        display: none;
        z-index: 10;
        box-shadow: 0 0 5px rgba(255, 0, 0, 0.8);
      }
      .image-container {
        position: relative;
        margin-bottom: 20px;
        overflow: visible;
      }
      #originalImage {
        display: block;
        max-width: 100%;
        height: auto;
      }
      .form-group {
        margin-bottom: 15px;
      }
      .form-group label {
        display: block;
        margin-bottom: 5px;
        font-weight: bold;
      }
      .form-group input[type="text"] {
        width: 100%;
        padding: 8px;
        box-sizing: border-box;
        border: 1px solid #ccc;
        border-radius: 4px;
      }
      .form-group textarea {
        width: 100%;
        height: 80px;
        padding: 8px;
        box-sizing: border-box;
        border: 1px solid #ccc;
        border-radius: 4px;
      }
      .button-group {
        margin-top: 20px;
        display: flex;
        gap: 10px;
      }
      #backBtn {
        background-color: #f5f5f5;
      }
    </style>
  </head>
  <body>
    <h1>OCR結果確認</h1>
    
    <div class="container">
      <!-- 左側: 画像表示エリア -->
      <div class="image-section">
        <h2>撮影された画像</h2>
        <div class="image-container">
          <img id="originalImage" alt="撮影画像" />
          <div id="cropOverlay"></div>
        </div>
      </div>
      
      <!-- 右側: OCR結果表示エリア -->
      <div class="results-section">
        <h2>OCR結果</h2>
        
        <div class="form-group">
          <label for="payeeName">支払先会社名</label>
          <input type="text" id="payeeName" />
        </div>
        
        <div class="form-group">
          <label for="issueDate">発行日</label>
          <input type="text" id="issueDate" />
        </div>
        
        <div class="form-group">
          <label for="amountIncludingTax">支払金額税込</label>
          <input type="text" id="amountIncludingTax" />
        </div>
        
        <div class="form-group">
          <label for="currency">通貨</label>
          <input type="text" id="currency" />
        </div>
        
        <div class="form-group">
          <label for="registrationNumber">登録番号</label>
          <input type="text" id="registrationNumber" />
        </div>
        
        <div class="form-group">
          <label for="notes">注記</label>
          <textarea id="notes"></textarea>
        </div>
      </div>
    </div>
    
    <!-- アクションボタン -->
    <div class="button-group">
      <button id="backBtn">戻る</button>
      <button id="saveImageBtn">画像を保存</button>
      <button id="saveJsonBtn">JSONを保存</button>
    </div>
    
    <!-- Debug JSON Display -->
    <div class="form-group" style="margin-top: 20px;">
      <h3>デバッグ用JSON</h3>
      <textarea id="debugJson" style="width: 100%; height: 150px; font-family: monospace;"></textarea>
    </div>
    
    <!-- Canvas for image data processing (hidden) -->
    <canvas id="hiddenCanvas" style="display:none;"></canvas>
    
    <script src="comparison.js"></script>
  </body>
</html>

comparison.js
// comparison.js
// DOM Elements
const originalImage = document.getElementById('originalImage');
const cropOverlay = document.getElementById('cropOverlay');
const hiddenCanvas = document.getElementById('hiddenCanvas');

// Form fields
const payeeNameInput = document.getElementById('payeeName');
const issueDateInput = document.getElementById('issueDate');
const amountInput = document.getElementById('amountIncludingTax');
const currencyInput = document.getElementById('currency');
const registrationInput = document.getElementById('registrationNumber');
const notesInput = document.getElementById('notes');

// Debug display
const debugJson = document.getElementById('debugJson');

// Buttons
const backBtn = document.getElementById('backBtn');
const saveImageBtn = document.getElementById('saveImageBtn');
const saveJsonBtn = document.getElementById('saveJsonBtn');

// Store the OCR data
let ocrData = null;
let imageScale = 1;
let originalOcrText = ''; // Store the original OCR text
let originalImageWidth = 0; // Original image width from OCR data
let originalImageHeight = 0; // Original image height from OCR data

// Function to recalculate image scaling factors
function recalculateImageScaling() {
  if (!originalImage) return;
  
  // Get current dimensions of the displayed image
  const displayedWidth = originalImage.clientWidth;
  const displayedHeight = originalImage.clientHeight;
  const naturalWidth = originalImage.naturalWidth;
  const naturalHeight = originalImage.naturalHeight;
  
  console.log(`Image dimensions updated - Natural: ${naturalWidth}x${naturalHeight}, Displayed: ${displayedWidth}x${displayedHeight}`);
  
  // If any overlay is currently visible, update its position by calling showCropOverlay again
  if (cropOverlay.style.display === 'block') {
    // Try to determine which field is currently active
    const activeFields = ['payeeName', 'issueDate', 'amountIncludingTax',
                         'currency', 'registrationNumber', 'notes'];
    
    // First, try to find which form field has focus
    const focusedElement = document.activeElement;
    let activeFieldName = null;
    
    if (focusedElement) {
      for (const fieldName of activeFields) {
        if (focusedElement.id === fieldName) {
          activeFieldName = fieldName;
          break;
        }
      }
    }
    
    // If no field has focus, fall back to the first field that has a non-empty box
    if (!activeFieldName) {
      for (const fieldName of activeFields) {
        if (ocrData && ocrData[fieldName] &&
            ocrData[fieldName].width > 0 && ocrData[fieldName].height > 0) {
          showCropOverlay(fieldName);
          return; // Only update one field
        }
      }
    } else {
      // Update the active field's overlay
      showCropOverlay(activeFieldName);
    }
  }
}

// Initialize the page
document.addEventListener('DOMContentLoaded', () => {
  // Get data passed from popup
  chrome.storage.local.get(['ocrResults', 'capturedImageData', 'usedAiModel'], (data) => {
    if (data.capturedImageData) {
      originalImage.src = data.capturedImageData;
      
      // Wait for image to load to set up scaling
      originalImage.onload = () => {
        // Calculate image scaling factors (actual displayed size vs original size)
        recalculateImageScaling();
        
        // Now that we have the scaling factors, we can display any highlights
        setupFormHighlighting();
        
        // Add resize event listener to handle window resizing
        window.addEventListener('resize', () => {
          recalculateImageScaling();
        });
      };
    } else {
      console.error('No image data found');
    }
    
    // Show which AI model was used
    if (data.usedAiModel) {
      let modelName;
      switch (data.usedAiModel) {
        case 'gemini':
          modelName = 'Gemini 2.0 Flash';
          break;
        case 'claude':
          modelName = 'Claude 3.7 Sonnet';
          break;
        case 'chatgpt':
          modelName = 'ChatGPT-4o';
          break;
        default:
          modelName = data.usedAiModel;
      }
      
      // Append the model name to the page heading
      const titleElement = document.querySelector('h1');
      if (titleElement) {
        titleElement.textContent = `OCR結果確認 (${modelName})`;
      }
    }
    
    if (data.ocrResults) {
      try {
        // Store original OCR text for debugging
        originalOcrText = data.ocrResults;
        
        // Display raw JSON in debug textarea
        debugJson.value = originalOcrText;
        
        // Try to parse as JSON
        try {
          ocrData = JSON.parse(data.ocrResults);
        } catch (e) {
          // If it's not JSON directly, look for JSON in the string
          // This handles the case where API might return extra text around the JSON
          const jsonMatch = data.ocrResults.match(/\{[\s\S]*\}/);
          if (jsonMatch) {
            ocrData = JSON.parse(jsonMatch[0]);
          } else {
            throw new Error('No valid JSON found in the response');
          }
        }
        
        // Extract image dimensions from OCR data if available
        if (ocrData.imageWidthPx && ocrData.imageHeightPx) {
          originalImageWidth = ocrData.imageWidthPx;
          originalImageHeight = ocrData.imageHeightPx;
          console.log(`Original image dimensions from OCR: ${originalImageWidth}x${originalImageHeight}`);
        }
        
        populateFormFields(ocrData);
      } catch (error) {
        console.error('Failed to parse OCR results:', error);
        // Display error message to user
        alert('OCR結果の解析に失敗しました。正しいJSON形式でない可能性があります。');
      }
    } else {
      console.error('No OCR results found');
    }
  });
});

// Populate form fields with OCR data
function populateFormFields(data) {
  if (!data) return;
  
  if (data.payeeName) {
    payeeNameInput.value = data.payeeName.value || '';
  }
  
  if (data.issueDate) {
    issueDateInput.value = data.issueDate.value || '';
  }
  
  if (data.amountIncludingTax) {
    amountInput.value = data.amountIncludingTax.value || '';
  }
  
  if (data.currency) {
    currencyInput.value = data.currency.value || '';
  }
  
  if (data.registrationNumber) {
    registrationInput.value = data.registrationNumber.value || '';
  }
  
  if (data.notes) {
    notesInput.value = data.notes.value || '';
  }
}

// Set up form field events to show crop highlights
function setupFormHighlighting() {
  const formFieldMap = {
    'payeeName': payeeNameInput,
    'issueDate': issueDateInput,
    'amountIncludingTax': amountInput,
    'currency': currencyInput,
    'registrationNumber': registrationInput,
    'notes': notesInput
  };
  
  // Add event listeners to all form fields
  for (const [key, element] of Object.entries(formFieldMap)) {
    element.addEventListener('focus', () => {
      showCropOverlay(key);
    });
    
    element.addEventListener('blur', () => {
      hideCropOverlay();
    });
  }
}

// Show crop overlay for a specific field
function showCropOverlay(fieldName) {
  if (!ocrData || !ocrData[fieldName]) return;
  
  // Skip fields with zero dimensions (like N/A fields)
  const field = ocrData[fieldName];
  if (field.width === 0 || field.height === 0) {
    console.log(`Skipping overlay for ${fieldName} - has zero dimensions`);
    return;
  }
  
  // Get the image dimensions
  const displayedWidth = originalImage.clientWidth;
  const displayedHeight = originalImage.clientHeight;
  
  console.log(`Displayed image dimensions: ${displayedWidth}x${displayedHeight}`);
  console.log(`OCR dimensions: ${originalImageWidth}x${originalImageHeight}`);
  
  // ===== SIMPLIFIED RELATIVE POSITIONING APPROACH =====
  // Calculate the scaling ratio between OCR dimensions and displayed dimensions
  let scaleX, scaleY;
  
  if (originalImageWidth > 0 && originalImageHeight > 0) {
    // Calculate the aspect ratios
    const ocrAspectRatio = originalImageWidth / originalImageHeight;
    const displayedAspectRatio = displayedWidth / displayedHeight;
    
    // Determine how scaling should be applied based on which dimension constrains the image
    if (Math.abs(ocrAspectRatio - displayedAspectRatio) < 0.01) {
      // Aspect ratios are virtually the same - simple scaling
      scaleX = displayedWidth / originalImageWidth;
      scaleY = displayedHeight / originalImageHeight;
    } else if (ocrAspectRatio > displayedAspectRatio) {
      // Width is the constraining dimension
      scaleX = displayedWidth / originalImageWidth;
      scaleY = scaleX; // Preserve aspect ratio
    } else {
      // Height is the constraining dimension
      scaleY = displayedHeight / originalImageHeight;
      scaleX = scaleY; // Preserve aspect ratio
    }
  } else {
    // Fallback if OCR dimensions aren't available
    const naturalWidth = originalImage.naturalWidth;
    const naturalHeight = originalImage.naturalHeight;
    scaleX = displayedWidth / naturalWidth;
    scaleY = displayedHeight / naturalHeight;
  }
  
  // Calculate the positions - scaled directly from OCR coordinates
  let x = Math.round(field.x * scaleX);
  let y = Math.round(field.y * scaleY);
  let width = Math.round(field.width * scaleX);
  let height = Math.round(field.height * scaleY);
  
  // Ensure minimum size for visibility
  if (width < 10) width = 10;
  if (height < 10) height = 10;
  
  // Position overlay directly using container-relative coordinates
  // No need for getBoundingClientRect or offsets - the container is the reference
  cropOverlay.style.left = `${x}px`;
  cropOverlay.style.top = `${y}px`;
  cropOverlay.style.width = `${width}px`;
  cropOverlay.style.height = `${height}px`;
  cropOverlay.style.display = 'block';
  
  // Debug logging
  console.log(`Field: ${fieldName}`);
  console.log(`OCR field data: x:${field.x}, y:${field.y}, w:${field.width}, h:${field.height}`);
  console.log(`Scaling factors: scaleX:${scaleX.toFixed(4)}, scaleY:${scaleY.toFixed(4)}`);
  console.log(`Final overlay position: left:${x}px, top:${y}px, width:${width}px, height:${height}px`);
}

// Hide crop overlay
function hideCropOverlay() {
  cropOverlay.style.display = 'none';
}

// Update OCR data with form values
function updateOcrData() {
  if (!ocrData) {
    ocrData = {};
  }
  
  // Preserve original image dimensions if they exist
  if (originalImageWidth > 0 && originalImageHeight > 0) {
    ocrData.imageWidthPx = originalImageWidth;
    ocrData.imageHeightPx = originalImageHeight;
  }
  
  // Only update values, not coordinates
  if (ocrData.payeeName) {
    ocrData.payeeName.value = payeeNameInput.value;
  } else if (payeeNameInput.value) {
    ocrData.payeeName = { value: payeeNameInput.value };
  }
  
  if (ocrData.issueDate) {
    ocrData.issueDate.value = issueDateInput.value;
  } else if (issueDateInput.value) {
    ocrData.issueDate = { value: issueDateInput.value };
  }
  
  if (ocrData.amountIncludingTax) {
    ocrData.amountIncludingTax.value = amountInput.value;
  } else if (amountInput.value) {
    ocrData.amountIncludingTax = { value: amountInput.value };
  }
  
  if (ocrData.currency) {
    ocrData.currency.value = currencyInput.value;
  } else if (currencyInput.value) {
    ocrData.currency = { value: currencyInput.value };
  }
  
  if (ocrData.registrationNumber) {
    ocrData.registrationNumber.value = registrationInput.value;
  } else if (registrationInput.value) {
    ocrData.registrationNumber = { value: registrationInput.value };
  }
  
  if (ocrData.notes) {
    ocrData.notes.value = notesInput.value;
  } else if (notesInput.value) {
    ocrData.notes = { value: notesInput.value };
  }
  
  return ocrData;
}

// Create formatted JSON output
function createFormattedJSON() {
  const updatedData = updateOcrData();
  
  // Create a simplified JSON structure with just the values
  const simplifiedData = {
    imageWidthPx: updatedData.imageWidthPx,
    imageHeightPx: updatedData.imageHeightPx,
    payeeName: updatedData.payeeName?.value || '',
    issueDate: updatedData.issueDate?.value || '',
    amountIncludingTax: updatedData.amountIncludingTax?.value || '',
    currency: updatedData.currency?.value || '',
    registrationNumber: updatedData.registrationNumber?.value || '',
    notes: updatedData.notes?.value || ''
  };
  
  // Return formatted JSON string
  return JSON.stringify(simplifiedData, null, 2);
}

// Save image button handler
saveImageBtn.addEventListener('click', () => {
  if (!originalImage.src) {
    alert('No image has been loaded.');
    return;
  }
  
  try {
    // Create a temporary canvas to get the image data
    const ctx = hiddenCanvas.getContext('2d');
    hiddenCanvas.width = originalImage.naturalWidth;
    hiddenCanvas.height = originalImage.naturalHeight;
    ctx.drawImage(originalImage, 0, 0);
    
    const dataUrl = hiddenCanvas.toDataURL('image/png');
    const a = document.createElement('a');
    a.href = dataUrl;
    
    // Create filename with timestamp
    const now = new Date();
    const timestamp = now.toISOString().replace(/[:.]/g, '-');
    a.download = `receipt_${timestamp}.png`;
    a.click();
  } catch (error) {
    console.error('Image save error:', error);
    alert('An error occurred while saving the image.');
  }
});

// Save JSON button handler
saveJsonBtn.addEventListener('click', () => {
  try {
    const jsonContent = createFormattedJSON();
    const blob = new Blob([jsonContent], { type: 'application/json' });
    const url = URL.createObjectURL(blob);
    const a = document.createElement('a');
    a.href = url;
    
    // Create filename with timestamp
    const now = new Date();
    const timestamp = now.toISOString().replace(/[:.]/g, '-');
    a.download = `ocr_result_${timestamp}.json`;
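    a.click();
    // Release the object URL only after the click has been dispatched
    setTimeout(() => URL.revokeObjectURL(url), 0);
  } catch (error) {
    console.error('JSON save error:', error);
    alert('An error occurred while saving the JSON.');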
  }
});

// Back button handler
backBtn.addEventListener('click', () => {
  // Save any edited data before going back; chrome.storage.local.set is
  // asynchronous, so wait for it to finish before leaving the page
  const updatedData = updateOcrData();
  chrome.storage.local.set({ 'ocrResults': JSON.stringify(updatedData) }, () => {
    // Navigate back to the camera screen (popup.html)
    window.location.href = 'popup.html';
  });
});
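
The scaling done in showCropOverlay can also be written as a small pure function, which makes the numbers easy to check by hand. The following is only a sketch under assumptions, not the extension's actual code: `computeOverlayRect` and its parameter names are hypothetical, and it assumes the image is scaled uniformly and centered inside its box (as CSS `object-fit: contain` would do), so it also adds the centering offsets that such a layout would need.

```javascript
// Hypothetical helper: maps an OCR bounding box (in source-image pixels)
// to a rectangle inside the displayed image box.
// Assumption: the image is scaled uniformly and centered (object-fit: contain).
function computeOverlayRect(field, ocrSize, displayedSize, minSize = 10) {
  // Uniform scale: the smaller ratio is the constraining dimension
  const scale = Math.min(
    displayedSize.width / ocrSize.width,
    displayedSize.height / ocrSize.height
  );

  // Empty margins left around the scaled image inside the displayed box
  const offsetX = (displayedSize.width - ocrSize.width * scale) / 2;
  const offsetY = (displayedSize.height - ocrSize.height * scale) / 2;

  return {
    left: Math.round(offsetX + field.x * scale),
    top: Math.round(offsetY + field.y * scale),
    // Keep very small boxes visible
    width: Math.max(minSize, Math.round(field.width * scale)),
    height: Math.max(minSize, Math.round(field.height * scale))
  };
}

// Example: a 1280x720 capture displayed in a 640x480 box
const rect = computeOverlayRect(
  { x: 100, y: 50, width: 200, height: 40 },
  { width: 1280, height: 720 },
  { width: 640, height: 480 }
);
console.log(rect); // { left: 50, top: 85, width: 100, height: 20 }
```

In this example the width is the constraining dimension (scale 0.5), so the image occupies 640x360 and sits 60px below the top of the box. Whether such an offset applies in the real page depends on how comparison.html lays out the image.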

options.html / options.js

API key settings screen
API key settings for three AI services (Google Gemini, Anthropic Claude, OpenAI ChatGPT)
Show/hide toggle and masking for API keys
Saving of the key settings
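
As a quick illustration of the masking behavior listed above, here is a minimal sketch that follows the same rule as maskApiKey in options.js (first and last four characters visible, everything else replaced with `*`). The sample keys are made up for the example.

```javascript
// Same masking rule as maskApiKey in options.js
function maskApiKey(key) {
  if (!key) return '';
  // Keys of 8 characters or fewer are fully masked
  if (key.length <= 8) return '*'.repeat(key.length);
  // Otherwise keep the first and last 4 characters
  return key.slice(0, 4) + '*'.repeat(key.length - 8) + key.slice(-4);
}

// Made-up sample values, not real keys
console.log(maskApiKey('abcd1234wxyz')); // abcd****wxyz
console.log(maskApiKey('short'));        // *****
console.log(maskApiKey(''));             // (empty string)
```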

options.html as displayed

options.html
<!DOCTYPE html>
<html>
  <head>
    <meta charset="utf-8" />
    <meta name="viewport" content="width=device-width, initial-scale=1.0" />
    <title>AI OCR Extension - Options</title>
    <style>
      body {
        font-family: sans-serif;
        margin: 0;
        padding: 20px;
      }
      
      .container {
        max-width: 800px;
        margin: 0 auto;
        padding: 20px;
        background-color: #f9f9f9;
        border-radius: 8px;
        box-shadow: 0 2px 10px rgba(0, 0, 0, 0.1);
      }
      
      h1 {
        margin: 0 0 20px 0;
        color: #333;
      }
      
      .form-group {
        margin-bottom: 20px;
      }
      
      label {
        display: block;
        margin-bottom: 8px;
        font-weight: bold;
      }
      
      input[type="text"],
      input[type="password"] {
        width: 100%;
        padding: 10px;
        border: 1px solid #ccc;
        border-radius: 4px;
        font-size: 16px;
        box-sizing: border-box;
      }
      
      button {
        background-color: #4285f4;
        color: white;
        border: none;
        padding: 10px 20px;
        border-radius: 4px;
        cursor: pointer;
        font-size: 16px;
      }
      
      button:hover {
        background-color: #3367d6;
      }
      
      .info {
        margin-top: 20px;
        padding: 15px;
        background-color: #e8f0fe;
        border-left: 4px solid #4285f4;
        border-radius: 4px;
      }
      
      .api-section {
        margin-bottom: 30px;
        padding-bottom: 20px;
        border-bottom: 1px solid #ddd;
      }
      
      .api-section:last-child {
        border-bottom: none;
      }
      
      @media (max-width: 600px) {
        .container {
          padding: 15px;
        }
      }
      
      /* Additional styles: API key show/hide toggle button */
      .input-with-toggle {
        display: flex;
        align-items: center;
        width: 100%;
      }
      
      .input-with-toggle input {
        flex: 1;
        margin-right: 8px;
      }
      
      .toggle-visibility {
        background-color: #f0f0f0;
        color: #333;
        border: 1px solid #ccc;
        border-radius: 4px;
        padding: 8px 12px;
        cursor: pointer;
        font-size: 14px;
        white-space: nowrap;
      }
      
      .toggle-visibility:hover {
        background-color: #e3e3e3;
      }
    </style>
  </head>
  <body>
    <div class="container">
      <h1>AI OCR API Key Settings</h1>
      
      <div class="api-section">
        <h2>Google Gemini</h2>
        <div class="form-group">
          <label for="geminiApiKey">API Key:</label>
          <div class="input-with-toggle">
            <input type="password" id="geminiApiKey" placeholder="Enter your Google Gemini API key" />
            <button type="button" class="toggle-visibility" data-for="geminiApiKey">Show</button>
          </div>
        </div>
        <div class="info">
          <p>A Gemini 2.0 Flash API key can be obtained from <a href="https://aistudio.google.com/" target="_blank" rel="noopener noreferrer">Google AI Studio</a>.</p>
        </div>
      </div>
      
      <div class="api-section">
        <h2>Anthropic Claude</h2>
        <div class="form-group">
          <label for="claudeApiKey">API Key:</label>
          <div class="input-with-toggle">
            <input type="password" id="claudeApiKey" placeholder="Enter your Anthropic Claude API key" />
            <button type="button" class="toggle-visibility" data-for="claudeApiKey">Show</button>
          </div>
        </div>
        <div class="info">
          <p>A Claude API key can be obtained from the <a href="https://console.anthropic.com/" target="_blank" rel="noopener noreferrer">Anthropic Console</a>.</p>
        </div>
      </div>
      
      <div class="api-section">
        <h2>OpenAI ChatGPT</h2>
        <div class="form-group">
          <label for="openaiApiKey">API Key:</label>
          <div class="input-with-toggle">
            <input type="password" id="openaiApiKey" placeholder="Enter your OpenAI ChatGPT API key" />
            <button type="button" class="toggle-visibility" data-for="openaiApiKey">Show</button>
          </div>
        </div>
        <div class="info">
          <p>An OpenAI API key can be obtained from the <a href="https://platform.openai.com/api-keys" target="_blank" rel="noopener noreferrer">OpenAI Platform</a>.</p>
        </div>
      </div>
      
      <button id="saveKey">Save</button>
      
      <div class="info">
        <p>The API keys you enter are used only within this extension and are required for OCR processing.</p>
      </div>
    </div>

    <script src="options.js"></script>
  </body>
</html>

options.js
// options.js
document.addEventListener('DOMContentLoaded', () => {
  // References to DOM elements
  const saveButton = document.getElementById('saveKey');
  const geminiApiKeyInput = document.getElementById('geminiApiKey');
  const claudeApiKeyInput = document.getElementById('claudeApiKey');
  const openaiApiKeyInput = document.getElementById('openaiApiKey');
  const toggleButtons = document.querySelectorAll('.toggle-visibility');
  
  // Mask an API key for display
  function maskApiKey(key) {
    if (!key) return '';
    // Show the first and last 4 characters and replace the rest with * (keys of 8 characters or fewer are fully masked)
    if (key.length <= 8) {
      return '*'.repeat(key.length);
    }
    return key.substring(0, 4) + '*'.repeat(key.length - 8) + key.substring(key.length - 4);
  }
  
  // Holds the actual (unmasked) API key values
  const actualKeys = {
    geminiApiKey: '',
    claudeApiKey: '',
    openaiApiKey: ''
  };
  
  // Load any previously saved API keys
  chrome.storage.local.get(['geminiApiKey', 'claudeApiKey', 'openaiApiKey'], (data) => {
    // Keep the actual key values in memory and show only the masked form
    if (data.geminiApiKey) {
      actualKeys.geminiApiKey = data.geminiApiKey;
      geminiApiKeyInput.value = maskApiKey(data.geminiApiKey);
      geminiApiKeyInput.dataset.masked = 'true';
    }
    
    if (data.claudeApiKey) {
      actualKeys.claudeApiKey = data.claudeApiKey;
      claudeApiKeyInput.value = maskApiKey(data.claudeApiKey);
      claudeApiKeyInput.dataset.masked = 'true';
    }
    
    if (data.openaiApiKey) {
      actualKeys.openaiApiKey = data.openaiApiKey;
      openaiApiKeyInput.value = maskApiKey(data.openaiApiKey);
      openaiApiKeyInput.dataset.masked = 'true';
    }
  });
  
  // Show/hide toggle buttons
  toggleButtons.forEach(button => {
    const targetId = button.dataset.for;
    const targetInput = document.getElementById(targetId);
    
    button.addEventListener('click', () => {
      const isMasked = targetInput.type === 'password';
      
      if (isMasked) {
        // Unmask and show the key
        targetInput.type = 'text';
        button.textContent = 'Hide';
        
        // If the field held the masked form, swap in the actual value
        if (targetInput.dataset.masked === 'true') {
          targetInput.value = actualKeys[targetId];
          targetInput.dataset.masked = 'false';
        }
      } else {
        // Mask and hide the key
        targetInput.type = 'password';
        button.textContent = 'Show';
        
        // If the value was changed, leave it unmasked (the user is still editing)
        if (targetInput.value !== actualKeys[targetId]) {
          targetInput.dataset.masked = 'false';
        } else {
          // If the value is unchanged, go back to the masked form
          targetInput.value = maskApiKey(actualKeys[targetId]);
          targetInput.dataset.masked = 'true';
        }
      }
    });
  });
  
  // Focus handler for the input fields
  const handleInputFocus = (input, keyName) => {
    // When a masked field gains focus, show the actual value
    if (input.dataset.masked === 'true') {
      input.type = 'text';
      input.value = actualKeys[keyName];
      input.dataset.masked = 'false';
      
      // Update the label of the matching toggle button
      const button = document.querySelector(`.toggle-visibility[data-for="${keyName}"]`);
      if (button) {
        button.textContent = 'Hide';
      }
    }
  };
  
  // Blur handler for the input fields
  const handleInputBlur = (input, keyName) => {
    // If the value was not changed, go back to the masked form
    if (input.value === actualKeys[keyName]) {
      input.type = 'password';
      input.value = maskApiKey(actualKeys[keyName]);
      input.dataset.masked = 'true';
      
      // Update the label of the matching toggle button
      const button = document.querySelector(`.toggle-visibility[data-for="${keyName}"]`);
      if (button) {
        button.textContent = 'Show';
      }
    }
  };
  
  // Attach the focus handler to each input field
  geminiApiKeyInput.addEventListener('focus', () => handleInputFocus(geminiApiKeyInput, 'geminiApiKey'));
  claudeApiKeyInput.addEventListener('focus', () => handleInputFocus(claudeApiKeyInput, 'claudeApiKey'));
  openaiApiKeyInput.addEventListener('focus', () => handleInputFocus(openaiApiKeyInput, 'openaiApiKey'));
  
  // Attach the blur handler to each input field
  geminiApiKeyInput.addEventListener('blur', () => handleInputBlur(geminiApiKeyInput, 'geminiApiKey'));
  claudeApiKeyInput.addEventListener('blur', () => handleInputBlur(claudeApiKeyInput, 'claudeApiKey'));
  openaiApiKeyInput.addEventListener('blur', () => handleInputBlur(openaiApiKeyInput, 'openaiApiKey'));
  
  // Save button handler
  saveButton.addEventListener('click', () => {
    // Resolve the actual values (a field still showing its masked form uses the stored key)
    const geminiKey = geminiApiKeyInput.dataset.masked === 'true' ?
                      actualKeys.geminiApiKey :
                      geminiApiKeyInput.value.trim();
                      
    const claudeKey = claudeApiKeyInput.dataset.masked === 'true' ?
                      actualKeys.claudeApiKey :
                      claudeApiKeyInput.value.trim();
                      
    const openaiKey = openaiApiKeyInput.dataset.masked === 'true' ?
                      actualKeys.openaiApiKey :
                      openaiApiKeyInput.value.trim();
    
    if (!geminiKey && !claudeKey && !openaiKey) {
      alert('Please enter at least one API key.');
      return;
    }
    
    // Show a temporary "saving" state
    const originalText = saveButton.textContent;
    saveButton.textContent = 'Saving...';
    saveButton.disabled = true;
    
    // Save all API keys to storage
    chrome.storage.local.set({
      geminiApiKey: geminiKey,
      claudeApiKey: claudeKey,
      openaiApiKey: openaiKey
    }, () => {
      // Report a failed write instead of showing the success message
      if (chrome.runtime.lastError) {
        console.error('Failed to save API keys:', chrome.runtime.lastError);
        alert('An error occurred while saving the API keys.');
        saveButton.textContent = originalText;
        saveButton.disabled = false;
        return;
      }
      
      // Update the in-memory key values
      actualKeys.geminiApiKey = geminiKey;
      actualKeys.claudeApiKey = claudeKey;
      actualKeys.openaiApiKey = openaiKey;
      
      // Put the input fields back into their masked form
      geminiApiKeyInput.type = 'password';
      geminiApiKeyInput.value = maskApiKey(geminiKey);
      geminiApiKeyInput.dataset.masked = 'true';
      
      claudeApiKeyInput.type = 'password';
      claudeApiKeyInput.value = maskApiKey(claudeKey);
      claudeApiKeyInput.dataset.masked = 'true';
      
      openaiApiKeyInput.type = 'password';
      openaiApiKeyInput.value = maskApiKey(openaiKey);
      openaiApiKeyInput.dataset.masked = 'true';
      
      // Reset the toggle button labels
      toggleButtons.forEach(button => {
        button.textContent = 'Show';
      });
      
      // Give feedback that the save succeeded
      saveButton.textContent = '✓ Saved';
      
      // Restore the original button text after a short delay
      setTimeout(() => {
        saveButton.textContent = originalText;
        saveButton.disabled = false;
      }, 2000);
    });
  });
  
  // Pressing Enter in any input field triggers a save
  const inputFields = [geminiApiKeyInput, claudeApiKeyInput, openaiApiKeyInput];
  inputFields.forEach(input => {
    input.addEventListener('keydown', (event) => {
      if (event.key === 'Enter') {
        saveButton.click();
      }
    });
  });
});
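
The save handler in options.js decides, for each field, whether to persist the stored key (when the field still shows its masked form) or the trimmed text the user typed. A minimal sketch of that decision as one reusable function follows; `resolveKeyToSave` is a hypothetical name that does not appear in the extension, and plain objects stand in for the `<input>` elements so the snippet can run outside a browser.

```javascript
// Hypothetical helper: picks the value to persist for one input field.
// `input` only needs `value` and `dataset.masked`, like the real <input> elements.
function resolveKeyToSave(input, actualKey) {
  return input.dataset.masked === 'true' ? actualKey : input.value.trim();
}

// Plain objects standing in for the three input elements (made-up values)
const fields = {
  geminiApiKey: { value: 'abcd****wxyz', dataset: { masked: 'true' } },
  claudeApiKey: { value: '  new-claude-key  ', dataset: { masked: 'false' } },
  openaiApiKey: { value: '', dataset: {} }
};

// Keys currently held in memory (made-up values)
const actualKeys = {
  geminiApiKey: 'abcd1234wxyz',
  claudeApiKey: 'old-claude-key',
  openaiApiKey: ''
};

// Build the object that would be passed to chrome.storage.local.set
const toSave = Object.fromEntries(
  Object.entries(fields).map(([name, input]) => [
    name,
    resolveKeyToSave(input, actualKeys[name])
  ])
);

console.log(toSave);
// { geminiApiKey: 'abcd1234wxyz', claudeApiKey: 'new-claude-key', openaiApiKey: '' }
```

Looping over the field names this way would also make it simpler to add a fourth provider later, since only the `fields` and `actualKeys` entries would need to grow.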

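Conclusion

That wraps up the walkthrough. Adapting this tool should make it relatively quick to evaluate OCR whenever a new generative AI model is released. The mapping between the extracted data and the image still has room for improvement, both in the accuracy of the generative AI models and probably in the Chrome extension implementation, so I am looking forward to progress there.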

Accenture Japan (有志)

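Run by volunteer employees of Accenture Japan Ltd. This publication collects posts written by Accenture employees. The content of each post reflects the employee's personal views and does not represent the organization they belong to.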
