iTranslated by AI

The content below is an AI-generated translation. This is an experimental feature, and may contain errors. View original article
🛡️

Detecting Dangerous C Code with Static Analysis: A Revised 4-Layer Model

に公開

This is a revised version of my previous article, "Detecting Dangerous C Code with Static Analysis Tool creview". I received five essential points of feedback in the comments, which made me realize my organization was shallow, so I have rewritten it from scratch. Thank you to everyone who provided feedback.

Here are the points I have corrected:

  1. The section where I described static int g_initialized; as "uninitialized / CWE-457" → Changed to a design bug involving missing init() calls, categorized under CWE-665.
  2. The first choice for missing NULL checks → Changed from "NULL checks in the function body" to "Compile-time contracts with __attribute__((nonnull)) + preventing return value discard with warn_unused_result."
  3. Detection of L76 sizeof(buf) as pointer size → If it is a local array declaration, it is the array size. This was a false positive in creview.
  4. Magic number detection → Fixed the inconsistency of "picking up malloc arguments but ignoring array declarations" by adopting a unified rule: setting severity via flags and context for everything except {-1,0,1,2}.
  5. Article structure → Changed from starting with creview to a 4-layer model: L0: Compiler Warnings → L1: Mainstream OSS → L2: creview → L3: CSAF. The operational rule that PRs that do not result in zero warnings do not even enter the review process is the first priority.

All five of these were things I should have understood from the start based on my field experience (23 years in C/embedded freelance), and especially the 5th point—"not starting with compiler warnings"—was fatal to the structure. In this revised version, the order is straightened out.

SCQA

  • Situation: In C code reviews, 70-80% of comments given to junior developers converge on the same patterns (NULL / uninitialized / return value discard / recursion / goto).
  • Problem: Writing the same thing for every PR every week. If I miss something on a tiring day, it goes into production.
  • Question: How do we divide labor among compilers, existing OSS, and custom tools, and what should humans focus on?
  • Answer: Divide into 4 layers and delegate to different tools for each layer. PRs that fail to pass L0 are not reviewed at all.

L0: First, eliminate all compiler warnings (Most important)

Before getting into creview, we enforce a rule: code that cannot pass these checks will not proceed to review.

# GCC family
gcc -std=c11 -Wall -Wextra -Wpedantic -Werror -fanalyzer -O2 -c src/*.c

# Clang family
clang -std=c11 -Wall -Wextra -Wpedantic -Werror -Weverything -O2 -c src/*.c

With just this, many of the detection items I raised previously are caught at the compilation stage:

Detection Item Compiler Warning
Use of uninitialized auto variable -Wuninitialized / -Wmaybe-uninitialized
Passing NULL to nonnull argument -Wnonnull
Discarding return value of warn_unused_result function -Wunused-result
Trivial NULL dereference -Wnull-dereference (Clang) / -fanalyzer (GCC)
use-after-free / double-free -fanalyzer (GCC)
Buffer overflow (statically determinable) -Warray-bounds / -fanalyzer
sizeof(pointer) misuse for array length -Wsizeof-pointer-memaccess
Missing switch case (not all enums covered) -Wswitch-enum
Function without return type -Wreturn-type

Operational Rules (The core of this article)

PRs that do not result in zero L0 warnings are not subject to human review

If you include -Werror in your CI, you can physically enforce this. This eliminates more than half of the "not this again" cases before a human even looks at them.

At runtime, run -fsanitize=address,undefined (AddressSanitizer / UBSan) during the CI test phase to catch dynamic bugs that L0 cannot catch.

L1: clang-tidy / cppcheck — Static analysis by mainstream OSS

Once a PR has cleared L0, the next step is to apply mainstream OSS static analysis tools.

  • clang-tidy: clang-tidy --checks='cert-*,bugprone-*,readability-*,performance-*'
  • cppcheck: cppcheck --enable=all --inconclusive --std=c11 src/

These should also be integrated into the CI, and warnings of HIGH severity or above should be treated as -Werror to fail the build. The premise of modern C is that L0 + L1 can catch almost all of the "textbook standard bugs."

L2: creview — Project-specific + Japanese knowledge

This is where my custom tool, creview, finally comes into play. The significance of creview's existence is not as a replacement for L0/L1, but as a subsequent step, handling the following:

  • Detecting project-specific patterns (e.g., library calls prohibited by internal policy / handler naming convention violations / review policy violations).
  • Providing explanations in Japanese and knowledge for junior developers ("Why is this bad?" "How should I fix it?" within the business context).
  • Severity model (Critical / Design Unclear / Maintenance Risk) and CWE/MISRA mapping.
  • Scanning only the diff (creview --preset pr checks only the touched lines).
  • AI reinforcement (letting Claude judge ambiguous detections).
$ creview --local-only --preset pr --format json src/feature.c

"local-only" mode does not require a Claude API key and runs purely on a rule-based system.

L3: CSAF — Audit level

Finally, CSAF uses libclang AST + dependency graphs to automatically escalate risks to A/B/C. This is too heavy for individual development PR reviews, but it is used for pre-release audits, MISRA C compliance checks, and initial audits of old, large-scale codebases.

Verification Code (Revised Edition)

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* --- REQ-S1: Missing set for initialization flag (CWE-665 Improper Initialization) ---
   Static storage is automatically initialized to 0 by ISO C (it is not indeterminate).
   The problem is a design bug where the semantic "initialized" state is not reached if init() is not called.
   This is not CWE-457 (reading indeterminate values from auto variables). */
static int g_initialized;

void init(void) {
    g_initialized = 1;  /* A classic design bug where calling it from main is forgotten */
}

/* --- REQ-M1a: NULL contract not stated ---
   Expected form:
     void process_data(char *data) __attribute__((nonnull(1)));
   By doing this, process_data(NULL) will trigger a compile-time warning via -Wnonnull. */
void process_data(char *data) {
    printf("Data: %s\n", data);
    int len = strlen(data);
    printf("Length: %d\n", len);
}

/* --- REQ-M1b: Lack of mandatory return value inspection ---
   Expected form:
     int validate_data(const char *data)
         __attribute__((warn_unused_result, nonnull(1)));
   If the return value is discarded by the caller, it will fail with -Werror=unused-result.
   MISRA C 2012 Dir 4.7 / CWE-252. */
int validate_data(const char *data) {
    if (data == NULL) return -1;
    return 0;
}

/* --- REQ-M2: True use of uninitialized auto variable (CWE-457) --- */
int calculate_sum(int count) {
    int total;  /* Indeterminate value = UB */
    for (int i = 0; i < count; i++) total += i;
    return total;
}

int main(void) {
    /* REQ-S1: Forgot to call init(). g_initialized remains 0 */
    if (g_initialized) printf("Already initialized\n");

    process_data(NULL);   /* REQ-M1a: No nonnull contract, so no warning */
    validate_data(NULL);  /* REQ-M1b: Compiler does not complain about discarding return value */
    return 0;
}

CWE/MISRA Mapping (Corrected)

Detection Pattern CWE / MISRA Explanation
Use of uninitialized auto variable (= reading indeterminate value = UB) CWE-457 Reading an auto variable without initialization results in an indeterminate value/undefined behavior.
Missing set for static storage flag CWE-665 Improper Initialization Values are auto-initialized to 0, but the flag operation process is missed—a design bug, not CWE-457.
NULL contract not stated (no nonnull attribute) CWE-476 related Contract not conveyed to the caller. Should be declared with __attribute__((nonnull)).
Lack of mandatory return value check CWE-252 + MISRA C 2012 Dir 4.7 Attach __attribute__((warn_unused_result)) to functions returning errors and forbid discarding return values with -Werror=unused-result.
NULL pointer dereference CWE-476 Crashing upon NULL deref.
Memory leak CWE-401 Forgetting to free.
Double free CWE-415 -fanalyzer catches many of these.
Use-after-free CWE-416 Referencing a pointer after it has been freed.
Stack overflow CWE-121 Unverified length in strcpy, etc.
Insufficient boundary check CWE-129 Unverified array index.
sizeof(pointer) misuse for array length CWE-467 Array types decay in function parameters. Returns correct array size for local array declarations.
Hard-coded constants (magic numbers) CWE-547 Severities managed by flag/context for all except allowlist {-1,0,1,2}.

How to use creview (CLI execution example — corrected)

L29 [Critical]: Used uninitialized variable total. Reading indeterminate value of auto variable = UB
L33 [Critical]: validate_data missing warn_unused_result attribute. Return value discard cannot be forbidden (CWE-252 / Dir 4.7)
L40 [Maintenance Risk]: process_data missing nonnull attribute. Cannot detect NULL arguments via -Wnonnull
L45 [Maintenance Risk]: g_initialized is auto-initialized to 0. Not CWE-457, but a CWE-665-style flag set omission
L52 [Critical]: Magic number 256. malloc argument (#define / enum recommended)
L62 [Critical]: Magic number 100. Array declaration size

I have removed L76 [Critical]: sizeof(buf) is pointer size from my previous article as it was a false positive. sizeof(buf) for a local array declaration like char buf[100]; returns the array size (=100), which is standard C behavior and not a warning target. It only results in the pointer size for function parameter array types (which decay) and explicit pointer declarations.

4-Level Usage Table

Scenario Tools Used Where to use
L0: Build-time (Start here) gcc / clang warnings + -Werror -Wall -Wextra -Wpedantic -Werror -fanalyzer (GCC) / -Weverything -Werror (Clang)
L1: PR Check clang-tidy / cppcheck Force mainstream OSS checks in CI
L2: PR Review creview Project-specific / Japanese / severity / CWE/MISRA / scan only diff via --preset pr
L2': Self-check c-review-ai Copy-paste in browser. No setup required
L3: Audit CSAF Escalating risk A/B/C automatically via libclang AST + dependency graph

Operational Rules (The core of this article)

PRs that do not result in zero L0 warnings are not subject to human review.

L1 (clang-tidy / cppcheck) should also fail the build with -Werror equivalent in CI.

L2 (creview) is a subsequent step. Do not run it as a replacement for L0/L1.

In teams where this rule is effective, the cost of "not this again" for human reviewers drops significantly. Starting work on custom tools like creview should only come after L0/L1 are fully integrated into operations.

Summary

  • It was a structural mistake to start writing about creview in the previous article. Forcing compiler warnings (L0) in CI is the first priority; creview comes after.
  • Auto-zero initialization of static storage (CWE-665) and reading indeterminate values of auto variables (CWE-457) are different things. Do not confuse them.
  • The first choice for missing NULL checks is compile-time contracts with __attribute__((nonnull)). Functions returning errors must always use __attribute__((warn_unused_result)).
  • Magic number detection rule: flag/context severities for everything except {-1,0,1,2}.
  • sizeof returns array size for local/file-scope array declarations, struct member arrays, and typedef arrays. It becomes pointer size only during function parameter decay or explicit pointer declaration.

Thank you once again to everyone who provided feedback in the comments.


Information is as of May 2026. Versions of creview / CSAF are updated periodically.

GitHubで編集を提案

Discussion