Chapter 6: The Three-Layer Review Structure — Designing Quality and Safety Gates
Designing Reviews in the Era of AI-Driven Reviewing
In AI-native development, code reviews are also executed by AI.
Upon hearing this, you might wonder, "Is it safe to leave reviews to AI?" That is a valid concern. This is precisely why we need to design the review structure.
In this methodology, we adopt a three-layer review structure.
Coding Agent → Outputs implementation
↓
Code Reviewer → Performs quality review based on 7 perspectives (Quality Gate)
↓
System Auditor → Independent audit for safety, stability, and availability (Safety Gate)
↓
Operator → Reads the summaries from both gates, then approves or sends the work back
The implementation only reaches the operator's decision after passing through two independent gates (Quality Gate and Safety Gate).
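The flow above can be sketched as a small pipeline. This is a minimal illustration, not the series' actual tooling: the names `GateResult`, `run_gates`, and `ready_for_operator` are hypothetical, the two gate functions use placeholder string checks in place of real AI reviews, and stopping at the first failed gate is an assumption drawn from the sequential diagram.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class GateResult:
    gate: str                       # "quality" or "safety"
    passed: bool
    summary: str                    # the text the operator reads
    findings: List[str] = field(default_factory=list)


# A gate is any callable that inspects an implementation and reports a result.
Gate = Callable[[str], GateResult]


def code_reviewer(implementation: str) -> GateResult:
    """Quality Gate stand-in (placeholder check, not a real review)."""
    findings = ["unfinished TODO left in code"] if "TODO" in implementation else []
    return GateResult("quality", not findings, "7 perspectives checked", findings)


def system_auditor(implementation: str) -> GateResult:
    """Safety Gate stand-in (placeholder check, not a real audit)."""
    findings = ["dynamic eval of input"] if "eval(" in implementation else []
    return GateResult("safety", not findings, "safety/stability/availability", findings)


def run_gates(implementation: str, gates: List[Gate]) -> List[GateResult]:
    """Run the gates in order; assumed here to stop at the first failure."""
    results: List[GateResult] = []
    for gate in gates:
        result = gate(implementation)
        results.append(result)
        if not result.passed:
            break  # the work goes back before the next gate runs
    return results


def ready_for_operator(results: List[GateResult], gates: List[Gate]) -> bool:
    """The operator decides only after every gate has run and passed."""
    return len(results) == len(gates) and all(r.passed for r in results)
```

Keeping each gate as a separate callable mirrors the independence of the two reviewers: neither gate sees or depends on the other's result.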
Why Two Gates?
The reason for separating the Quality Gate (Code Reviewer) and the Safety Gate (System Auditor) is, as stated in Chapter 2, the "difference in lenses."
- Quality Gate: Is it clean? Is it maintainable? Is the design intent readable? Is it resilient to changes?
- Safety Gate: Is it safe? Is it robust? Will it exhaust resources?
If you try to apply these two lenses simultaneously, one usually becomes weaker. A reviewer seeking beautiful code often overlooks security holes, while a reviewer strict on security tends to ignore readability issues.
Therefore, we separate them. Each gate validates independently through its own specialized lens, and only implementations that pass both gates proceed.
Quality Gate: 7 Review Perspectives
This section explains the seven review perspectives that the Code Reviewer applies.
1. Elimination of Redundancy
- Are there overlapping responsibilities? (Judge by responsibility, not line count)
- Decide whether to merge or split code by the semantic boundaries of each function's role.
The key point is not to judge by line count. If the same three lines appear in two places but serve different responsibilities, they should not be merged. Conversely, a single 50-line function that mixes several responsibilities should be split.
2. Resistance to Change
- Has it been confirmed whether the number of branch types may grow?
- If not, stop the implementation and have it confirmed.
- If it may grow, drive the behavior from definitions such as enums instead of hardcoded values.
"There are three types now, but there may be more in the future" is a signal that needs confirmation. Hardcoding the three types in an if-else chain without checking leaves the code with low resistance to change.
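As a rough illustration of the enum-driven approach, the sketch below replaces an if-else chain with a lookup table keyed by an enum. The notification domain, `NotificationChannel`, `HANDLERS`, and `notify` are all hypothetical names chosen for the example.

```python
from enum import Enum


class NotificationChannel(Enum):
    EMAIL = "email"
    SMS = "sms"
    PUSH = "push"


def _send_email(message: str) -> str:
    return f"email:{message}"


def _send_sms(message: str) -> str:
    return f"sms:{message}"


def _send_push(message: str) -> str:
    return f"push:{message}"


# One entry per enum member; adding a channel means adding one member
# and one handler, with no if-else chain to edit.
HANDLERS = {
    NotificationChannel.EMAIL: _send_email,
    NotificationChannel.SMS: _send_sms,
    NotificationChannel.PUSH: _send_push,
}

# Fail at import time if a member was added without a handler.
_missing = set(NotificationChannel) - set(HANDLERS)
if _missing:
    raise RuntimeError(f"no handler registered for: {_missing}")


def notify(channel: NotificationChannel, message: str) -> str:
    return HANDLERS[channel](message)
```

The import-time check is a design choice for the sketch: a newly added branch type that nobody handled surfaces immediately rather than at the first call in production.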
3. Error Handling
- Is processing designed for cases where the user's intent cannot be fulfilled?
- Condition Not Met → Clarify reason + provide action guidance.
- System Error → Notify overview + suggest automatic recovery or human support.
- Does the design always capture critical logs and, where possible, warnings as well?
Error handling design is directly linked to the top-level principle: "Fulfill the user's intent." When a user tries to do something and fails, just saying "An error occurred" does not fulfill their intent. It is necessary to show why it failed and what to do next.
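The two failure categories above can be sketched as follows. This is one possible shape, assuming a hypothetical ordering function; `ConditionNotMet`, `UserMessage`, `place_order`, and `handle` are illustrative names, and the message wording is only an example.

```python
import logging
from dataclasses import dataclass

logger = logging.getLogger("orders")


class ConditionNotMet(Exception):
    """The request is valid but cannot be fulfilled right now."""

    def __init__(self, reason: str, next_action: str):
        super().__init__(reason)
        self.reason = reason
        self.next_action = next_action


@dataclass
class UserMessage:
    ok: bool
    text: str


def place_order(stock: int, quantity: int) -> str:
    if quantity > stock:
        raise ConditionNotMet(
            reason=f"Only {stock} item(s) are in stock.",
            next_action="Reduce the quantity or choose another item.",
        )
    return f"Ordered {quantity} item(s)."


def handle(action, *args) -> UserMessage:
    try:
        return UserMessage(True, action(*args))
    except ConditionNotMet as exc:
        # Condition not met: state the reason and guide the next action.
        logger.warning("condition not met: %s", exc.reason)
        return UserMessage(False, f"{exc.reason} {exc.next_action}")
    except Exception:
        # System error: always log as critical, give the user an overview
        # and a route to human support.
        logger.critical("system error in %s", action.__name__, exc_info=True)
        return UserMessage(
            False,
            "A system error occurred. Please try again shortly; "
            "if the problem persists, contact support.",
        )
```

For example, `handle(place_order, 1, 2)` returns a message containing both the reason and the suggested next step, rather than a bare "An error occurred."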
4. Performance
- UI experience criteria: Initial display under 200ms, search results under 100ms.
- If exceeded, compensate for the experience with asynchronous processing + completion notifications.
The thresholds are based on whether the user feels kept waiting. As a rule of thumb, users begin to notice latency once it exceeds about 200 ms. When a response cannot meet the threshold, show a loading indicator or switch to asynchronous processing so the user knows that work is in progress.
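A reviewer could check these thresholds with something like the sketch below. The budget values come from the criteria above; `LatencyBudget`, `measure`, and `needs_async_fallback` are hypothetical helper names, and measuring a single call is a simplification of real performance testing.

```python
import time
from enum import Enum


class LatencyBudget(Enum):
    """UI experience budgets in seconds (values from the criteria above)."""
    INITIAL_DISPLAY = 0.200
    SEARCH_RESULTS = 0.100


def measure(func, *args):
    """Return (result, elapsed_seconds) for one call of func."""
    start = time.perf_counter()
    result = func(*args)
    return result, time.perf_counter() - start


def needs_async_fallback(elapsed_seconds: float, budget: LatencyBudget) -> bool:
    """True when the response is not under budget and the UI should
    compensate with async processing plus a completion notification."""
    return elapsed_seconds >= budget.value
```

Naming the budgets in an enum also keeps the 200 ms and 100 ms figures out of the code as magic numbers.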
5. Resource Cost
- Impact on co-located systems; are CPU and memory limits configured?
- Risk of output files growing without bound; is rotation in place?
- Are memory leaks, infinite loops, and deadlock risks detected?
An often-overlooked example is a design whose log files grow indefinitely, which will eventually exhaust disk space in production. Another is batch processing that keeps growing because it never releases memory. Both work functionally but are operational time bombs.
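For the log-growth risk, one common remedy in Python is the standard library's `RotatingFileHandler`, which caps the size of each file and the number of backups kept. The sketch below is illustrative: `build_rotating_logger` and its default limits are assumptions, not values from this series.

```python
import logging
from logging.handlers import RotatingFileHandler


def build_rotating_logger(
    path: str,
    max_bytes: int = 1_000_000,   # assumed per-file cap
    backup_count: int = 5,        # assumed number of rotated files kept
) -> logging.Logger:
    """Create a logger whose disk usage is bounded by rotation."""
    logger = logging.getLogger(f"rotating:{path}")
    logger.setLevel(logging.INFO)
    if not logger.handlers:  # avoid attaching duplicate handlers on repeat calls
        handler = RotatingFileHandler(
            path, maxBytes=max_bytes, backupCount=backup_count
        )
        handler.setFormatter(
            logging.Formatter("%(asctime)s %(levelname)s %(message)s")
        )
        logger.addHandler(handler)
    return logger
```

With these settings, disk usage stays at roughly `max_bytes * (backup_count + 1)` no matter how long the process runs, which is the property the reviewer is looking for.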
6. Security
- Review depth according to data sensitivity (Personal information and payment information are at the deepest depth).
- Encryption of data storage, control of access paths.
- Are credentials exposed on the front end?
The depth of the review changes according to the data sensitivity identified in Phase 3. When handling personal or payment information, conduct verification at the deepest level.
7. Readability
- Prohibition of magic numbers (define with enums, etc.).
- Does the folder structure have design intent?
- Has any single method or class become bloated?
- Can others read it and understand the intent?
Judge readability not by "Can I read it now?" but by "Can someone else read it six months from now and understand the intent?"
Safety Gate: Perspectives of the System Auditor
The System Auditor validates from perspectives independent of the Code Reviewer.
While the Quality Gate asks "Is it good code?", the Safety Gate asks "Is it dangerous code?"
- Safety: Security holes, injection, flaws in authentication/authorization.
- Stability: Behavior during failure, recovery procedures, guarantee of data integrity.
- Availability: Presence of single points of failure, assurance of scalability.
The System Auditor's findings take the highest priority across all roles. If a security issue is found, it must be addressed, no matter how much it delays progress or how elegant the code is.
Operator's Review Decision
The operator does not perform a line-by-line review of the code. They read the summaries of the Quality Gate and Safety Gate to make a decision.
4 Points to Check
- Are all 7 perspectives included in the summary? Check for missing perspectives. If there is "no mention of performance," there is a possibility that verification from that perspective was not performed.
- Is the basis for each perspective's judgment concrete? "No issues" is insufficient. It must clearly state what was checked to judge it as having no issues.
- Is the criticality judgment of the identified issues appropriate? Pay special attention to whether problems with security or resource costs are being treated as "minor."
- Are there any overlooked viewpoints based on your knowledge of the business context? There is a possibility that the AI has overlooked industry-specific constraints or consistency with existing systems. This is where the operator's domain knowledge is applied.
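The first three checks are mechanical enough to sketch in code; the fourth depends on the operator's domain knowledge and stays manual. The summary format below (a dict per perspective with `basis`, `issues`, and `severity` keys) is a hypothetical structure invented for this example, as are the names `PERSPECTIVES`, `VAGUE_BASES`, and `check_summary`.

```python
from typing import Dict, List

PERSPECTIVES = [
    "redundancy", "resistance_to_change", "error_handling",
    "performance", "resource_cost", "security", "readability",
]
# Phrases that state a verdict without saying what was checked.
VAGUE_BASES = {"", "ok", "no issues", "none", "n/a"}
# Perspectives where a "minor" rating deserves a second look.
HIGH_RISK = {"security", "resource_cost"}


def check_summary(summary: Dict[str, dict]) -> List[str]:
    """Return flags the operator should follow up on."""
    flags: List[str] = []
    for name in PERSPECTIVES:
        entry = summary.get(name)
        if entry is None:
            # Check 1: a perspective missing from the summary.
            flags.append(f"{name}: not mentioned")
            continue
        basis = entry.get("basis", "").strip().lower().rstrip(".")
        if basis in VAGUE_BASES:
            # Check 2: a verdict without a concrete basis.
            flags.append(f"{name}: no concrete basis")
        if name in HIGH_RISK and entry.get("issues") and entry.get("severity") == "minor":
            # Check 3: a high-risk issue rated as minor.
            flags.append(f"{name}: issue rated minor, confirm severity")
    return flags
```

An empty list does not mean approval. It only means the summary is complete enough for the operator to start judging it.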
Do Not Unconditionally Accept AI Review Results
This is the most important principle.
Even if the AI's review result is "All items OK," the operator asks "Is it really OK?" and judges for themselves. If anything in the summary raises doubt, request a deeper review of that part.