Act 4: Building the Creators Dashboard — Preventing Image Swapping with SHA-256 Hashing

About this article

Claude Code drafted this based on AI conversation logs and experimental records, and I have added edits and corrections.
The content of this article represents decisions made for a personal development project during the PoC phase and does not claim to be the optimal solution for general cases.

What happens if an image confirmed as "safe" by SafeSearch turns into something else by the time it is displayed? As long as we handle external images, this issue is structurally unavoidable.

In my previous article, I introduced how I changed the method for embedding X posts three times. At the end of that article, I touched on the security risks of the pipeline that passes external data to the DOM. This time, I will discuss the security design of another type of external content — images.

On the Creators Dashboard, creators can specify images from their own repositories via URL to display as thumbnails on project cards. The image data itself is not stored on our server; the design references external URLs directly.

However, what if the image is replaced after being deemed "safe"? An image that the fetch job cleared with Cloud Vision SafeSearch may no longer be the same image by the time it is displayed.

This article documents the design decision and its implementation: record the SHA-256 hash during the fetch job, recalculate it in the browser at display time, and refuse to display the image if the two do not match.


Introduction

Goals

  • Explain the risk of external image "swapping" and its countermeasures
  • Introduce the implementation of hash calculation and SafeSearch verification on the fetch job side (Python)
  • Introduce the implementation of hash recalculation and display logic on the browser side (JavaScript)
  • Introduce the blob URL design that allows external image display without polluting CSP

Non-Goals

  • Usage instructions for the Cloud Vision SafeSearch API
  • Cryptographic explanation of SHA-256
  • Fundamental explanation of CSP (Content Security Policy)

The Problem — No guarantee that a "verified" image remains verified

The image display flow for Creators Dashboard is as follows:

Creator's GitHub repository
  → fetch job (Python / periodic execution)
    → Safety check via Cloud Vision SafeSearch
    → If safe, record the URL in JSON
  → Browser display
    → Read URL from JSON, display image

The problem is the time lag between the fetch job run and the moment the browser displays the image. Hours or days may pass between the job's "safe" verdict and the browser retrieving the image. If the image is swapped in the meantime, the SafeSearch verdict no longer applies to what is shown.

You might ask whether such a swap is realistic. No attacker is required; the following cases can easily occur:

  • A creator unintentionally swaps an image on GitHub (repository cleanup, screenshot updates)
  • The image is replaced with another after a CDN cache update
  • Image history changes due to a force push to the repository

Even without malicious intent, an image may be replaced with one that SafeSearch would reject. When that happens, the original SafeSearch result says nothing about the image actually displayed.


Design — Ensuring "the image is the same" with hashes

As a countermeasure, I adopted a method of recording the SHA-256 hash of the image during the fetch and recalculating/verifying it on the browser side during display.

Fetch job (Python)
  ① Download the image
  ② Calculate SHA-256 hash
  ③ Safety check with Cloud Vision SafeSearch
  ④ Record hash + verdict in JSON

Browser (JavaScript)
  ⑤ Fetch image via fetch API
  ⑥ Recalculate SHA-256 hash with crypto.subtle.digest
  ⑦ Compare with the recorded hash
  ⑧ If match → display via blob URL / If mismatch → use fallback

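If the hashes match, the image displayed in the browser is byte-for-byte identical to the image that was hashed in the fetch job (barring a SHA-256 collision). For the SafeSearch verdict to carry over as well, SafeSearch must be given those same downloaded bytes rather than the URL.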
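
The fetch-job half of this flow (steps ② to ④, applied to bytes already downloaded in step ①) can be sketched as follows. This is a minimal sketch under stated assumptions, not the project's actual code: `build_catalog_image` is a hypothetical name, and the `classify` callback is a hypothetical stand-in for the Cloud Vision SafeSearch call so that the example runs without network access. What it illustrates is that the hash and the safety check are both derived from the same bytes.

```python
import hashlib
from typing import Callable


def build_catalog_image(
    url: str,
    data: bytes,
    classify: Callable[[bytes], int],
) -> dict:
    """Build one JSON record from bytes that were already downloaded.

    `classify` is a hypothetical stand-in for the SafeSearch call. It is
    assumed to return 0 for a safe image and a non-zero value otherwise.
    """
    # Step 2: hash the exact bytes that were downloaded.
    digest = "sha256:" + hashlib.sha256(data).hexdigest()
    # Step 3: run the safety check on those same bytes, not on the URL,
    # so the verdict and the hash describe one and the same image.
    verdict = classify(data)
    # Step 4: record the hash together with the verdict.
    return {"url": url, "hash": digest, "verdict": verdict}
```

Passing the bytes to both steps is a design choice of this sketch: if the safety check fetched the URL a second time, the image could change between the two requests.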

Implementation on the fetch job side — Hash calculation and SafeSearch

Data model

Image data is recorded in JSON with the following structure:

type CatalogImage = {
  url: string       // Image URL
  hash: string      // "sha256:abc123..."
  etag?: string     // HTTP ETag (for caching)
  verdict: number   // 0: OK, >0: Bit flag (reason for concern)
}

verdict is a bit flag representation of the Cloud Vision SafeSearch results.

Bit  Value  Meaning
0    1      adult/sexual
1    2      violence
2    4      racy
3    8      medical
4    16     spoof

verdict: 0 means "no issues." Images that return LIKELY or VERY_LIKELY verdicts in SafeSearch are excluded and not recorded in the JSON.
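
As a small illustration of how such a bit flag can be packed and unpacked, here is a sketch using the bit values from the table above. The names `VERDICT_BITS`, `encode_verdict`, and `decode_verdict` are hypothetical, and the sketch deliberately leaves out how the job decides which categories to flag.

```python
from typing import Iterable, List

# Bit values taken from the table above.
VERDICT_BITS = {
    "adult": 1,
    "violence": 2,
    "racy": 4,
    "medical": 8,
    "spoof": 16,
}


def encode_verdict(flagged: Iterable[str]) -> int:
    """Combine flagged category names into a single integer."""
    verdict = 0
    for name in flagged:
        verdict |= VERDICT_BITS[name]
    return verdict


def decode_verdict(verdict: int) -> List[str]:
    """List the category names whose bit is set in `verdict`."""
    return [name for name, bit in VERDICT_BITS.items() if verdict & bit]
```

For example, `encode_verdict(["adult", "racy"])` gives 5, and `decode_verdict(5)` recovers the two category names. A single integer keeps the JSON compact while still recording the reason for concern.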

Hash calculation

import hashlib

def compute_hash(data: bytes) -> str:
    return "sha256:" + hashlib.sha256(data).hexdigest()

The raw byte sequence of the image is hashed using SHA-256 and recorded as a string with a sha256: prefix. This prefix is included to allow for identification should the algorithm be changed in the future.
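
To show how the prefix can be used on the verifying side, here is a minimal sketch that parses the algorithm name out of the recorded string before comparing. `verify_hash` and `SUPPORTED_ALGORITHMS` are hypothetical names, and accepting `sha512` is only an assumption to illustrate a future algorithm change. The browser performs the equivalent check in JavaScript; this Python version is for illustration.

```python
import hashlib

# Algorithms this sketch accepts; extend the set when a new prefix is introduced.
SUPPORTED_ALGORITHMS = {"sha256", "sha512"}


def verify_hash(data: bytes, recorded: str) -> bool:
    """Return True if `data` matches a recorded "<algorithm>:<hex>" string."""
    algorithm, sep, expected_hex = recorded.partition(":")
    # A missing prefix or an unknown algorithm is treated as a mismatch.
    if not sep or algorithm not in SUPPORTED_ALGORITHMS:
        return False
    actual_hex = hashlib.new(algorithm, data).hexdigest()
    # Hex digits are compared case-insensitively.
    return actual_hex == expected_hex.lower()
```

Treating an unknown prefix as a mismatch means a record written by a newer job can never be accepted by mistake by an older verifier.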

Caching with etag

The Cloud Vision API is billed per request, so calling it on every run for an image that has not changed is wasteful. I use the HTTP ETag header to reuse the previous result when the image has not been updated.

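import requests

# Hypothetical wrapper name; the original shows only the body
def reuse_if_unchanged(url, cached_etag, cached_result):
    # Retrieve only the ETag with a HEAD request (saves bandwidth)
    head_resp = requests.head(url, timeout=10)
    remote_etag = head_resp.headers.get("ETag", "")

    # Cache hit → reuse the previous result
    if remote_etag and remote_etag == cached_etag:
        return cached_result
    # Cache miss or no ETag → caller downloads and re-checks the image
    return None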

The image is downloaded, and the hash calculation and SafeSearch check are run, only when the ETag has changed or the server returns no ETag.
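
The cache decision can also be written so that it is testable without network access. The following is a sketch under assumptions: `check_image` is a hypothetical name, and `fetch_etag` and `download_and_check` are hypothetical callbacks standing in for the HEAD request and for the download + hash + SafeSearch step.

```python
from typing import Callable, Optional


def check_image(
    url: str,
    cached: Optional[dict],
    fetch_etag: Callable[[str], str],
    download_and_check: Callable[[str], dict],
) -> dict:
    """Reuse `cached` when the remote ETag is unchanged, otherwise re-check.

    `fetch_etag` returns the current ETag ("" if the server sends none).
    `download_and_check` returns a fresh record with url, hash, and verdict.
    """
    remote_etag = fetch_etag(url)
    # An empty ETag never counts as a hit: without it there is no cheap
    # way to tell whether the image changed.
    if cached and remote_etag and remote_etag == cached.get("etag"):
        return cached
    result = download_and_check(url)
    # etag is optional in the data model, so store it only when present.
    if remote_etag:
        result["etag"] = remote_etag
    return result
```

With this shape, the expensive callback runs only on a cache miss, which is where the API cost is incurred.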

Output example

{
  "name": "my-app",
  "image": {
    "url": "https://raw.githubusercontent.com/.../screenshot.png",
    "hash": "sha256:a1b2c3d4e5f6...",
    "etag": "\"686897696a7c876b7e\"",
    "verdict": 0
  }
}

Implementation on the browser side — Hash recalculation and display logic

HTML output (SSR)

On the server side, the image URL and hash are embedded into the card as data-* attributes. The <img> tag is not output at this stage.

<div class="card-image"
     data-img-url="https://raw.githubusercontent.com/.../screenshot.png"
     data-img-hash="sha256:a1b2c3d4e5f6..."
     data-img-verdict="0">
  <span class="card-image-name">my-app</span>
</div>

Initially, only the project name and background color are displayed. The image is added to the DOM only after being verified via JavaScript.

Hash verification and blob URL display

The browser-side processing consists of four steps:

  1. Fetch the image as an ArrayBuffer using the fetch API
  2. Recalculate the hash with crypto.subtle.digest('SHA-256', buf) — The Web Crypto API is a browser standard; no extra libraries are required.
  3. Compare with the recorded hash — If it does not match, use a fallback (keeping the name + background color).
  4. If it matches, generate a blob URL with URL.createObjectURL and append the <img> tag to the DOM

The key point is not to write <img src="external_url"> directly. The image only appears in the DOM after JavaScript completes the fetch → verification → blob URL conversion pipeline. If the verification fails, the image element is never created.


Blob URL method that keeps CSP clean

When displaying external images, a common approach is to add the external domain to the img-src directive in the CSP:

img-src 'self' https://raw.githubusercontent.com https://example.com ...

However, images might be specified from different domains for each creator. Continuously adding domains will cause the CSP to bloat and become unmanageable.

The blob URL method avoids this issue:

img-src 'self' blob:

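The image is retrieved by JavaScript via fetch(), converted to a blob URL using URL.createObjectURL(), and then set as the <img> source. Because the <img> element only ever loads a blob: URL created by the page itself, img-src needs nothing beyond 'self' and blob:, regardless of which external domain the bytes came from. Note that fetch() itself is governed by connect-src (or default-src when connect-src is absent), so a policy that restricts those directives must still allow the image hosts there, and the hosts must send CORS headers for the response body to be readable.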

The fetch → hash verification → blob URL display pipeline simultaneously solves both security verification and CSP management. Designing the process to retrieve byte sequences via fetch() for hash verification had the secondary benefit of resolving the CSP issue.

Trade-offs of this design

This approach comes with costs:

  • Computing the SHA-256 hash in the browser every time (CPU load)
  • Initial display delay (due to the fetch + digest process)
  • Normal browser image caching is less effective for blob URLs

Nevertheless, I prioritized "guaranteeing safety at the time of display." For this service, ensuring that inappropriate images are not shown is more important than image display speed.


Four display states

The image display falls into one of four states. In none of them does the card look broken or empty.

State             Condition                Display
Verified display  hash match + verdict=0   Image (blob URL)
Warning skip      verdict > 0              Fallback (name + background color)
Hash mismatch     hash differs             Fallback (name + background color)
Network error     fetch failure            Fallback (name + background color)

In the fallback state, the <span class="card-image-name"> output during SSR remains as is. Whether JavaScript does not run, the network is down, or the image has been swapped, the project name and background color are always visible.
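
The four states above can be modelled in a few lines. The browser implements this in JavaScript; the sketch below restates the same branching in Python only to make it explicit. The function name `display_state`, the state labels, and the order of the checks are assumptions, not the project's code.

```python
import hashlib
from typing import Optional


def display_state(entry: dict, fetched: Optional[bytes]) -> str:
    """Classify a card into one of the four display states.

    `entry` is a record with "hash" and "verdict" keys.
    `fetched` is the downloaded image, or None if the download failed.
    """
    # A non-zero verdict keeps the fallback without looking at the image.
    if entry["verdict"] > 0:
        return "warning-skip"
    if fetched is None:
        return "network-error"
    actual = "sha256:" + hashlib.sha256(fetched).hexdigest()
    if actual != entry["hash"]:
        return "hash-mismatch"
    # Only this state leads to an image being shown.
    return "verified"
```

Every branch except the last leaves the card in its fallback state, which is why the fallback has to be the default output rather than something added on error.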


What this design prevents and does not prevent

What it prevents

  • Image swapping after review: If an image is replaced with an inappropriate one after being marked OK by SafeSearch, it will not display due to a hash mismatch.
  • Image tampering in transit: HTTPS normally prevents man-in-the-middle tampering, but the hash check still catches altered bytes if something goes wrong at the CDN or proxy layer.
  • img-src bloat: External images can be displayed without adding each external domain to img-src.

What it does not prevent

  • Inappropriate images at the time of the fetch job: If an inappropriate image is already present when the job runs, it is excluded only if SafeSearch rejects it, so this case depends on SafeSearch accuracy.
  • Different images with the same hash (SHA-256 collision): Theoretically possible, but not a realistic risk.

Summary

In this article, I introduced a design that prevents "swapping" of external images using hash verification.

  • Double-check design: Calculate and record the SHA-256 hash on the fetch job side and recalculate/verify it on the browser side using crypto.subtle.digest. If the hashes do not match, the image is not displayed.
  • CSP management via blob URLs: By displaying images via fetch() → blob URL, there is no need to add external domains to img-src. Hash verification and CSP management are handled on the same path.
  • Fallback design: In all cases of verification failure, network errors, or SafeSearch warnings, the project name and background color are displayed. The absence of an image indicates that "defenses are functioning," not that it is "broken."

For services handling external content, safety at the time of review is not the same as safety at the time of display. The hash links those two points in time: by guaranteeing that the bytes are identical, it lets the review-time verdict be trusted at display time.

Remaining challenges

Safe image display has been solved. However, the design of referencing external URLs without self-hosting images has another constraint: Vercel's bandwidth limits. While the blob URL method does not consume Vercel bandwidth as the browser fetches the image directly, OGP images and card preview images must still be generated dynamically on the server side.

In Act Five, I will cover the design for generating dynamic images within Vercel Blob's bandwidth constraints.


Lastly, a brief introduction to the service. The overall picture of Creators Dashboard was introduced in Act One.


Performance Record (Development via docs × AI)

*In this series, I treat the trial-and-error process as a "performance" and describe the progress stages as "acts."

Part 7: Building Creators Dashboard (Planned for 5 acts)

Part 1: docs × AI (5 chapters)

Part 2: Organizing conversation logs (6 acts + interlude)

Part 3: Workflow and task management (4 acts)

Part 4: Building my first app (6 acts + curtain call)

Part 5: The aftermath of the SLM pipeline (3 acts)

Part 6: Using SLMs in production (Number of acts TBD)


Created: 2026-04-06
Source material: pg10 docs/architecture.md, docs/design.md, src/types.ts, src/app.tsx, static/js/card-stack.js, pg5-automation job1024 process.py / vision_safety.py, AI conversation summary slm.2026-03-28.claude.01

Article creation process

  • Plot creation: Claude Code (Material research + structural proposal)
  • Initial drafting: Claude Code
  • First review: Me
  • Editing: ChatGPT
  • Pre-posting: Zenn AI
  • Post-posting review: NotebookLM

*Uses the pg10 design docs (architecture.md, design.md, types.ts), implementation code (app.tsx, card-stack.js), pg5-automation's vision_safety.py / process.py, and 1 session of AI conversation summary as material.
