Translated by AI
How to Improve Team Engineering Productivity in the AI Era
Introduction
Recently, I've finally been able to start using Claude Enterprise at work, and I've been trying it out hands-on.
There are certainly many situations where it feels convenient, such as code completion, suggesting fixes, creating draft test code, and identifying review perspectives.
Since it assists with or replaces some of the work I used to have to think about from scratch, it has significantly increased my work speed.
On my own personal Pro plan, I have also experimented with automated reviews, automated PRs, and having agent teams complete tasks, and I can see real potential there.
However, even if individual implementation speed increases, if the team is in a state where:
- Specifications move forward while still ambiguous
- Review perspectives vary from person to person
- Test perspectives are insufficient
- Large changes cannot be managed safely
- There is no shared understanding of how to handle AI-generated code
...then for the team as a whole, AI can actually increase confusion.
Ultimately, I believe AI is not development power in itself but a tool that amplifies existing development processes. A team that already has a good workflow will become stronger, but if the foundation is ambiguous, that ambiguity will be amplified too.
In this article, I will consider what is still necessary to improve a team's development power, assuming the use of Claude.
1. Increased individual speed does not equal team development power
With Claude, tasks like:
- Code generation
- Test generation
- Refactoring
- Review assistance
...become faster.
However, for the team as a whole, if:
- Specifications are ambiguous
- There are no acceptance criteria
- Review perspectives are not aligned
- Design decisions differ from person to person
- There is no testing strategy
- The impact of changes cannot be predicted
...the team will end up stuck anyway.
In other words, the starting point is that faster individual output does not equate to greater team development power.
2. Teams that can utilize Claude effectively have clear task units
Claude excels at work where "what to fix and how" has already been put into words to some extent, such as:
- Small, segmented tasks
- Clear input/output
- Well-defined constraints
- Expected results that the output can be checked against
For example, if the request is something like:
- Adding error-case tests for this API
- Extracting this branch into a Strategy pattern
- Confirming whether this PR meets the acceptance criteria
- Improving readability by separating the responsibilities of this method
...Claude is highly effective.
On the other hand, when junior members make requests to AI, they sometimes hand over the whole problem without having decomposed it first:
- It's hard to maintain, so please make it "better"
- This feature is complex overall, so please organize it
- Please fix the hard-to-use design
- I want to solve the team's issues
Such requests seem natural at first glance. Indeed, when consulting with a human, it's common to start with this level of granularity.
However, for generative AI, this level of granularity is too coarse. Because it is unclear what the problem is, how far the fix should go, and what counts as an improvement, the response tends to be abstract.
Less experienced engineers can often sense that something is wrong without yet being able to work out the unit at which the problem should be carved out.
That is precisely why I think teams that use Claude effectively should value decomposing tasks into implementable and reviewable units. To use AI well, the ability to break issues down into clear units matters even more than writing clever prompts.
3. Turning reviews into "perspectives" rather than "impressions"
When trying to leverage Claude in a team, the next important aspect is the review.
Once AI can write code, the initial speed of implementation increases. In exchange, the quality of how that code is evaluated becomes more important than ever.
If reviews remain based on impressions, such as:
- It feels off
- It's not my preference
- This is how I would write it
- It feels a little uncomfortable
...then team development will not stabilize, even with the use of AI. This is because the criteria for feedback change from person to person, making it difficult for the person being reviewed to understand what to fix.
In particular, code written with Claude often looks reasonable at first glance. The syntax is natural, the naming is plausible, and nothing stands out on a quick read. That is precisely why reviews done "by feel" are prone to missing things.
What reviews should be based on is not impressions but perspectives.
For example, at a minimum, I think it is better to align on perspectives such as the following:
- Does this implementation meet the specifications?
- Does it meet the acceptance criteria?
- Are error cases and boundary values considered?
- Is the placement of responsibilities appropriate?
- Is the structure fragile against future changes?
- Are the specifications guaranteed by tests?
When perspectives are put into words in this way, human reviewers are less likely to waver, and requests to Claude for review assistance become more precise.
For example, you can make requests such as:
- Please check if this PR meets the acceptance criteria
- Please point out error cases that might be missing from the specifications
- Please look for unnatural points from the perspective of responsibility separation
- Please identify missing test cases
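As one way to make such requests repeatable, the perspectives could be kept as shared data and turned into a request text. The following is a minimal Python sketch rather than an established tool: `REVIEW_PERSPECTIVES` and `build_review_request` are hypothetical names, and the function only assembles a string. How that string is then passed to Claude is deliberately left out.

```python
from typing import Sequence

# Hypothetical shared checklist; the wording mirrors the review perspectives above.
REVIEW_PERSPECTIVES = (
    "Does this implementation meet the specifications?",
    "Does it meet the acceptance criteria?",
    "Are error cases and boundary values considered?",
    "Is the placement of responsibilities appropriate?",
    "Is the structure fragile against future changes?",
    "Are the specifications guaranteed by tests?",
)


def build_review_request(
    diff: str,
    acceptance_criteria: Sequence[str],
    perspectives: Sequence[str] = REVIEW_PERSPECTIVES,
) -> str:
    """Assemble a plain-text review request that names every perspective explicitly."""
    if not acceptance_criteria:
        # Without acceptance criteria the review falls back to impressions.
        raise ValueError("acceptance criteria are required")
    lines = ["Review the change below against each perspective, one by one.", ""]
    lines.append("Perspectives:")
    lines.extend(f"{i}. {p}" for i, p in enumerate(perspectives, start=1))
    lines.append("")
    lines.append("Acceptance criteria:")
    lines.extend(f"- {c}" for c in acceptance_criteria)
    lines.append("")
    lines.append("Change:")
    lines.append(diff)
    return "\n".join(lines)


if __name__ == "__main__":
    print(build_review_request(
        diff="(paste the PR diff here)",
        acceptance_criteria=["Unknown IDs return a not-found error"],
    ))
```

Because the checklist lives in one place, human reviewers and AI-assisted reviews start from the same wording, and refusing to build a request without acceptance criteria makes the gap visible early.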
Conversely, if you show Claude code without review perspectives, the following are likely to happen:
- It talks only about code style
- It focuses on trivial matters like naming or refactoring
- It fails to pick up truly important specification gaps
- The evaluation changes depending on the person
In other words, I felt that whether Claude helps with reviews depends not only on the performance of the AI, but greatly on whether the team has verbalized what it is reviewing.
By focusing on reviews from the perspectives of specifications, design, responsibility, and testing, I believe the AI's output will finally turn into the team's strength.
4. Things to standardize as a team
Assuming AI will be used, I felt the key question is what the team should align on.
If this remains ambiguous, some people will use AI well, others will fail to master it, and variation in reviews and quality will increase.
Conversely, in a team that shares at least minimal rules and a common understanding, Claude becomes quite a practical assistant.
Regarding what should be standardized, the following are particularly important:
How to write Issues / Requests
Claude handles requests with clear objectives and constraints better than ambiguous ones.
Therefore, at the time of an Issue or task request, it is important to ensure that:
- What you want to fix
- Why you want to fix it
- What the completion criteria are
- Where the scope of impact is
...are clearly understood.
If the entry point of the work is ambiguous, the implementation and reviews that follow are likely to waver. This is the foundation of team development even before AI enters the picture.
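To make the four items above concrete, a request could be modeled as a small record that reports which parts are still missing. This is only a sketch under my own assumptions: `TaskRequest`, its field names, and `missing_fields` are hypothetical and not part of any issue tracker or of Claude.

```python
from dataclasses import dataclass, field, fields
from typing import List


@dataclass
class TaskRequest:
    """Hypothetical issue template covering the four items listed above."""
    what: str = ""                                                 # what you want to fix
    why: str = ""                                                  # why you want to fix it
    completion_criteria: List[str] = field(default_factory=list)   # when it counts as done
    impact_scope: List[str] = field(default_factory=list)          # what the change touches

    def missing_fields(self) -> List[str]:
        """Return the names of fields that are empty or whitespace-only."""
        missing = []
        for f in fields(self):
            value = getattr(self, f.name)
            if isinstance(value, str):
                value = value.strip()
            if not value:
                missing.append(f.name)
        return missing


if __name__ == "__main__":
    vague = TaskRequest(what="Make it better")
    print(vague.missing_fields())
```

A request such as "make it better" leaves three of the four fields empty, which is a cue to decompose the task further before handing it to anyone, human or AI.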
Acceptance criteria
If the criteria for "done" differ from person to person, quality will not stabilize even if you use Claude.
That is precisely why acceptance criteria should be written down explicitly.
For example, if the following are shared:
- Which cases are considered a success
- Which error cases should be considered
- What compatibility with existing specifications must be preserved
- What should be confirmed by tests
...then both implementation and reviews become significantly easier.
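One possible way to share criteria like these is to write them as a table of cases that can be run against an implementation. The sketch below is illustrative only: `parse_quantity` is a hypothetical function invented for this example (an order quantity from 1 to 99), and `Criterion` and `unmet_criteria` are likewise my own assumed names.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Sequence, Type


@dataclass(frozen=True)
class Criterion:
    """One acceptance criterion: an input plus the expected value or error."""
    description: str
    input_text: str
    expected: Optional[int] = None
    raises: Optional[Type[Exception]] = None


def parse_quantity(text: str) -> int:
    """Hypothetical function under review: parses an order quantity from 1 to 99."""
    value = int(text.strip())  # int() raises ValueError for non-numeric input
    if not 1 <= value <= 99:
        raise ValueError(f"quantity out of range: {value}")
    return value


# Success cases, boundary values, and error cases written down in one place.
CRITERIA = [
    Criterion("success: a typical quantity", "3", expected=3),
    Criterion("success: surrounding whitespace is tolerated", " 10 ", expected=10),
    Criterion("boundary: upper limit is accepted", "99", expected=99),
    Criterion("error: zero is rejected", "0", raises=ValueError),
    Criterion("error: above the limit is rejected", "100", raises=ValueError),
    Criterion("error: non-numeric input is rejected", "abc", raises=ValueError),
]


def unmet_criteria(func: Callable[[str], int], criteria: Sequence[Criterion]) -> List[str]:
    """Run every criterion against func and return descriptions of those not met."""
    unmet = []
    for c in criteria:
        try:
            result = func(c.input_text)
        except Exception as exc:
            # An exception only satisfies a criterion that expects this error type.
            if c.raises is None or not isinstance(exc, c.raises):
                unmet.append(c.description)
            continue
        # A normal return fails if an error was expected or the value differs.
        if c.raises is not None or result != c.expected:
            unmet.append(c.description)
    return unmet


if __name__ == "__main__":
    print(unmet_criteria(parse_quantity, CRITERIA))
```

The same table can serve as the basis for human review, for a request to Claude, and for automated tests, so that "whether it's done" is judged against one shared list.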
In short, I thought that to utilize Claude as a team, standardizing a foundation where quality does not waver even when using AI is more important than the introduction of AI itself.
Summary
What I strongly believe at the moment is that using AI conveniently as an individual and increasing a team's development power are two different things.
If development proceeds with ambiguous specifications, unaligned review perspectives, weak tests, and large task units, simply introducing AI does not guarantee that the whole team will function well.
If anything, the ambiguity may be amplified along with everything else.
If you are about to integrate AI into your team, I feel you first need to realign on foundations such as:
- In what units to divide tasks
- How to clarify acceptance criteria
- From what perspectives to conduct reviews
- What to pin down with tests
- How much to entrust to AI
I am still working out, through trial and error, how to use Claude and how to integrate it into the team.
However, I believe that what is important is not just using AI, but shaping it into a form that the team can reasonably leverage.