iTranslated by AI

The content below is an AI-generated translation. This is an experimental feature, and may contain errors. View original article
🤖

AI-Augmented Formal Methods: Dramatically Enhancing Instruction Precision for Claude Code

に公開

Introduction: Discoveries Born from Two Encounters

In December 2025, I was simultaneously immersed in two areas of learning.

One was studying formal methods in the "Requirements Engineering" course at The Open University of Japan Graduate School. The other was the development experience using Claude Desktop and Claude Code. Initially, I thought these two were separate things. However, when I began engaging deeply with the texts just before my exams, a certain conviction was born.

"Goal-Oriented Analysis (KAOS) and Formal Specification (Alloy) dramatically improve the instruction accuracy for Claude Code."

And I discovered that this methodology already has a name. It is called "AI-Augmented Formal Methods."

Why Formal Methods?

Limits of Natural Language

When giving instructions to Claude Code, have you ever had an experience like this?

"Please create a login function."
→ Session management is missing...

"Data is mandatory."
→ Which data? All of it? Part of it?

"Please process it as quickly as possible."
→ How many seconds is "quickly"?

Natural language has inherent ambiguity. When relying on descriptions based on "tacit understanding" between engineers, AI cannot accurately understand the intent.
In fact, it took about 3 hours of conversation for Claude to understand how to give instructions using KAOS and the formal specification method Alloy.

Formal Methods as a Solution

Formal Specification is a technique for strictly defining system specifications using mathematical logic and set theory (according to Wikipedia).

Just as bookkeeping records transactions in a double-entry system of "debits and credits," requirements can eliminate ambiguity by being expressed in both natural language and formal notation.

What I chose was:

  • KAOS: Goal-Oriented Analysis (expressed in a graph structure)
  • Alloy: Lightweight formal method (enables logic verification)

The Appeal of KAOS: Visualizing "Why" with Graphs

What is KAOS?

According to Wikipedia:

KAOS (Keep All Objectives Satisfied) focuses on the background (goals) of "why" a system is being built rather than just "what" it does, and systematizes this using a graph structure.

Why Graph Structures Are Superior

  1. Visualizable with NetworkX

    • Automatically generate diagrams with Python
    • Can be presented directly to the AI
  2. Organizable Hierarchically

   Root Goal: Improve agricultural productivity
     ├─ G1: Accurate instructions can be given
     │   ├─ G11: Formalize requirements
     │   └─ G12: Convert to prompts
     └─ G2: Data is accumulated
         ├─ G21: Optimize input
         └─ G22: Reminder function
  1. Clear AND/OR Logic
    • AND gate: All are mandatory
    • OR gate: Achievement of any one is OK

The Appeal of Alloy: Automatic Verification of Logical Consistency

What is Alloy?

It is a type of formal specification method, a tool that allows you to express constraints using mathematical logic and automatically verify them with the Alloy Analyzer.

Example: Mandatory Field Constraints

Natural Language (Ambiguous):

Work date, work type, and field are mandatory

Alloy (Strict):

fact MandatoryFields {
  all record: WorkRecord |
    some record.date and
    some record.workType and
    some record.field
}

Verification:

Execute → check MandatoryFields
✅ No counterexample found

Mathematically proving that there are no logical contradictions

What is AI-Augmented Formal Methods?

Problems with Traditional Formal Methods

Learning period: Six months to several years
Writer: Human (expert)
Scope: Limited (safety-critical fields only)

Learning costs are too high for practical business use.

Innovations of AI-Augmented Formal Methods

Learning period: A few hours to a few days
Writer: AI (Human verifies)
Scope: Any project

My Workflow

1. Human: Explains requirements in natural language
   "I want to create a farm work recording system"
   
2. AI (Claude): Generates KAOS diagram
   Outputs NetworkX code
   
3. Human: Reviews KAOS diagram and provides corrections
   "Both G21 and G22 are mandatory"
   
4. AI: Generates Alloy model
   Writes constraints as facts
   
5. Human: Verifies with Alloy Analyzer
   Executes check / run
   
6. AI: Generates Claude Code prompt
   Instructions for implementation based on verified specifications
   
7. Claude Code: Implementation
   Generates code without logical contradictions

Important Discovery: AI writes both KAOS and Alloy

Initially, I aimed to "become able to write KAOS and Alloy myself." However, in reality:

❌ Not necessary:

  • To be able to write Alloy perfectly
  • To assemble complex logical formulas yourself

✅ Necessary:

  • To judge whether the model generated by AI is correct
  • To interpret the verification results of Alloy Analyzer
  • To give correction instructions when an error occurs

In other words, what is required is "reading comprehension" and "verification skills," not "writing skills."

Changing the Role of Humans: Toward Logical Architects

What I Learned from Papers

Reading papers on AI-augmented formal methods, one passage was particularly striking:

Natural language has inherent ambiguities and "tacit understandings among engineers," and logical consistency cannot be fully guaranteed by AI auto-translation alone. There is a need for a mechanism involving human intervention.

(Summary from the Req2LTL paper)

The New Role: Human as Reviewer

Conventional:

Human = Worker who writes code

AI Era:

Human = Logical Architect (Reviewer)

Roles:
1. Verbalize business intent
2. Verify AI outputs
3. Judge logical consistency
4. Reconcile mathematical correctness with business intent

Importance of the Feedback Loop

Human: "Future dates are not allowed"

AI: "Generates Alloy model"

Verification: "❌ Counterexample found: 2026-12-31 is permitted"

Human: "Add a constraint that date <= today"

AI: "Fix complete"

Verification: "✅ No counterexample"

Toward implementation

The essence is not to just leave everything to the AI, but to improve accuracy through dialogue.

Concrete Example: Farm Work Recording System

Requirements (Natural Language)

I want to record daily farm work.
Work date, work type, and field are mandatory.
I want to make sure future dates cannot be entered.

KAOS Diagram (AI-Generated)

G2: Data is continuously accumulated
├─ G21: Optimize input items (Limit to mandatory items)
├─ G22: Input reminder is functional
└─ G23: Save with Neo4j MCP

Alloy Constraints (AI-Generated)

// Mandatory field constraints
fact MandatoryFields {
  all record: WorkRecord |
    some record.date and
    some record.workType and
    some record.field
}

// Prohibition of future dates
fact NoFutureDate {
  all record: WorkRecord |
    record.date <= today
}

Verification (Executed by Human)

$ alloy constraints.als

Execute check MandatoryFields
 No counterexample found

Execute check NoFutureDate  
 No counterexample found

Claude Code Prompt (AI-Generated)

# Farm Work Recording Input Form

## Constraints (Verified with Alloy)
1. Mandatory fields: date, workType, field
2. Future dates not allowed (date <= today)

## Validation
- Mandatory items missing → Display error
- Future date entered → "Select a date on or before today" error

## Implementation
React + TypeScript + Zod

Results

Implementations that are logically consistent and have clear test perspectives are generated.

Comparison of Learning Costs

Approach Learning Time Required Skills
Traditional Formal Methods 6 months to several years Predicate logic, set theory, Alloy notation
AI-Augmented 5–10 hours Alloy Analyzer operation, reading verification results

Reduction Rate: 95% or more

What I Learn

  1. How to use Alloy Analyzer (2–3 hours)

    • Opening model files
    • Executing check / run
    • How to interpret results
  2. Interpreting Verification Results (2–3 hours)

    • ✅ No counterexample → OK
    • ❌ Counterexample found → Identifying the problem
  3. How to Give Correction Instructions (1–2 hours)

    • Telling the AI "what is different"
    • Improving accuracy step-by-step

Total: 5–7 hours

Path to Practice

Phase 1: Foundations (1–2 weeks)

□ Install Alloy Analyzer
□ Verify operation with the model generated this time
□ Try executing check / run
□ Master the basic workflow of Claude Code

Phase 2: Integration (2–3 weeks)

□ Explain requirements to Claude Desktop
□ Request generation of KAOS diagrams
□ Request generation of Alloy models
□ Execute and confirm verification
□ Implement with Claude Code

Phase 3: Practical Application (Continuous)

□ Farm work recording system (Actual project)
□ Establish a feedback loop
□ Improve the workflow

Pros and Cons

Pros

Dramatic improvement in instruction accuracy for AI

  • Elimination of ambiguity
  • Guarantee of logical consistency

Reduction of rework

  • Verify constraints before implementation
  • Clear test perspectives

Automatic generation of documentation

  • KAOS diagrams serve as requirement documents
  • Alloy models serve as specifications

Dramatic reduction in learning costs

  • Practical use possible in 1/20th of the time compared to traditional methods

Cons and Limitations

⚠️ Risk of AI errors

  • Possibility that the generated model is incorrect
  • Human verification is mandatory

⚠️ Limits with complex constraints

  • Advanced mathematical constraints may be difficult
  • Not a silver bullet

⚠️ Tool dependency

  • Scalability limits of Alloy Analyzer
  • Ingenuity required for large-scale systems

Projects Suitable for This Method

Suitable

✅ Systems with clear constraints

  • Input validation
  • Workflows
  • Access control

✅ Systems where correctness is critical

  • Business systems
  • Data management
  • Automation tools

✅ Individual development / Small teams

  • Cannot spend much time on requirement definition
  • But want to ensure quality

Not Suitable

❌ UI/UX design-centric projects
❌ Numerical / Scientific computing
❌ Ultra-large-scale systems (requires ingenuity)

Conclusion: Toward a New Development Style

AI-Augmented Formal Methods represent the democratization of formal methods.

What used to be the domain of experts—rigorous verification—is now accessible to everyone through AI support.

The role of humans is changing:

  • Workers who write code → Logical Architects
  • Implementers → Reviewers/Verifiers

In the coming era, what will be required of engineers is:

  1. The ability to clearly verbalize requirements
  2. The ability to verify AI outputs
  3. Domain knowledge
  4. Logical thinking skills

Moving forward, I will master KAOS and Alloy to improve instruction accuracy for Claude Code.

I will implement this new workflow, apply it to actual farming operations, and give back the knowledge to the community.


References

  • "Requirements Engineering" textbook, The Open University of Japan Graduate School
  • KAOS - Wikipedia
  • Alloy Documentation
  • Req2LTL Paper (Research on AI-Augmented Formal Methods)
  • [My GitHub] (KAOS diagrams and Alloy models introduced here)
  • [Claude Code Official Documentation]

Coming Soon: I will write an article about the process of building a sample app using this method. Stay tuned!

#Claude #FormalMethods #KAOS #Alloy #AI #RequirementsEngineering #SoftwareDevelopment

Discussion