Using Takumi on OSS: Impressions and Introduction to Implemented Changes

I was selected for the Takumi Open Source Developer Support Program and have been given three months of free access to Takumi, which I am currently trying out.

Takumi is an AI that reads source code thoroughly to find security issues. Its reports go deep enough that I have to read them carefully to fully understand the issues it points out, which impressed me. On the other hand, since what I had it analyze this time was a CLI tool, a lot of its advice assumed a Web application and didn't quite hit the mark.

It is also a bit of a shame that it rarely proposes concrete fixes (though it sometimes suggests simple changes). I could understand what each finding meant, but working out exactly how to solve the problem was left to me or to another AI.

Used in the right context, it is useful enough, and I expect it would work well once integrated into day-to-day operations. Below, I will introduce the good points and the points that concerned me.

Points I liked

  • It thoroughly reads the source code and turns its observations into a report. It picks up on areas that are hard to notice without deep reading.
  • It's not just for security; even for non-security consultations, it starts by "reading" the code first, making it easy to use for various purposes.
    • I am not entirely sure if this is a primary selling point of the service.
  • The UI states explicitly that you can leave the page while a review is running, which reduces the stress of waiting.

Points of concern

  • The reports assume Web applications, so some low-priority suggestions for CLI tools get mixed in.
    • For CLI tools, it is important to "do exactly what the user asked," and excessive defense can lead to poor UX or broken compatibility.
  • It identifies issues but does not present "how to fix them."
    • Fix directions, patch examples, and explanations of side effects are missing, making the findings hard to translate into an implementation. It seems best to consult another AI for the actual implementation and then run a review through Takumi afterward.
  • You cannot cancel or re-run a review that you started by mistake.
  • It reported new Go features as critical errors, so it might not be keeping up with the latest Go specifications.
  • You can't predict in advance how many credits will be consumed.
    • Sometimes it finishes quickly without costing a single credit, while other times it consumes 10 or more credits for a deep dive.

Points of concern from an Open Source Developer Support Program user's perspective

  • Reviews generally take a long time, so there were times when I wanted to check on a running report while I was out. Currently, however, reports cannot be viewed on a smartphone.
  • Slack integration is convenient for corporate use, but individuals often do not pay for Slack, so the logs do not remain there. I want to be able to do everything within Takumi, where the logs are kept, but its UI is bare-bones and feels like it assumes you are using Slack.

PRs I Actually Implemented

From here, I will introduce some of the changes I actually implemented based on Takumi's suggestions.

kekkai

A file tampering detection tool. It stores its manifests in S3 so that the manifests themselves are protected from tampering, which makes it easy to detect tampering with an application's source code (e.g., PHP).

https://github.com/catatsuy/kekkai

Special characters could be passed into the storage path

https://github.com/catatsuy/kekkai/pull/64

  • Before: --app-name and --base-path were used to build the S3 storage path, but special characters such as / could be passed to them.
  • Action: Restricted the accepted characters and normalized the values so that the path is controlled safely.
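
As a rough illustration of this kind of restriction, here is a minimal sketch of an allow-list check in Go. It is not the code from the PR: the function name validateName and the exact set of allowed characters are assumptions made for this example.

```go
package main

import (
	"fmt"
	"regexp"
)

// safeName is a hypothetical allow-list: the value must start with a letter
// or digit and may otherwise contain only letters, digits, '.', '_' and '-'.
// Because '/' is never allowed, the value cannot add path segments.
var safeName = regexp.MustCompile(`^[A-Za-z0-9][A-Za-z0-9._-]*$`)

// validateName returns an error when value contains anything outside the
// allow-list. flagName is only used to make the error message readable.
func validateName(flagName, value string) error {
	if !safeName.MatchString(value) {
		return fmt.Errorf("%s: invalid value %q (allowed: letters, digits, '.', '_', '-')", flagName, value)
	}
	return nil
}

func main() {
	for _, v := range []string{"my-app", "prod_v1.2", "../other", "a/b", ""} {
		if err := validateName("--app-name", v); err != nil {
			fmt.Println("rejected:", err)
			continue
		}
		fmt.Println("accepted:", v)
	}
}
```

Rejecting bad input outright, rather than silently rewriting it, keeps the resulting S3 key predictable for the user, which matters for a CLI tool.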

Renaming a fixed-name tmp file can cause race conditions when multiple processes run at once

https://github.com/catatsuy/kekkai/pull/65

  • Before: When creating cache files, the implementation wrote to a fixed file named filename.tmp and then replaced it via rename. This caused conflicts when multiple processes started simultaneously and overwrote the same file.
  • Action: Changed to use os.CreateTemp to create unique temporary files that don't conflict. I've written another article about this approach, including my previous experiences.

https://zenn.dev/catatsuy/articles/20833f8b9d5b34

Self-DoS possible via the -workers flag

https://github.com/catatsuy/kekkai/pull/67

  • Before: The -workers flag allowed the number of parallel workers to be increased arbitrarily, meaning extreme values could place an excessive load on the system itself.
  • Action: Capped the maximum value to the number of CPUs in the execution environment.
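
A cap like this can be sketched in a few lines of Go. The function name clampWorkers is hypothetical, and treating values below 1 as 1 is my own assumption; the actual PR may instead return an error for out-of-range values.

```go
package main

import (
	"fmt"
	"runtime"
)

// clampWorkers limits the requested worker count to the range [1, max].
// Falling back to 1 for zero or negative input is an assumption of this sketch.
func clampWorkers(requested, max int) int {
	if max < 1 {
		max = 1
	}
	if requested < 1 {
		return 1
	}
	if requested > max {
		return max
	}
	return requested
}

func main() {
	// An extreme value, such as one a user might pass via -workers.
	requested := 100000
	n := clampWorkers(requested, runtime.NumCPU())
	fmt.Printf("requested %d workers, using %d\n", requested, n)
}
```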

A non-existent cache-dir was accepted

https://github.com/catatsuy/kekkai/pull/68

  • Before: It was possible to pass a non-existent directory to -cache-dir.
  • Action: Modified to validate early and return an error.

notify_slack

A tool for easily sending notifications to Slack via the CLI. It has two main functions: piping output from other CLI tools and posting files.

https://github.com/catatsuy/notify_slack

Size and memory limits during file upload

https://github.com/catatsuy/notify_slack/pull/240

  • Before: It read and uploaded the specified file as-is, without considering the Slack API's rate limits or size limits. If a huge file was passed, it would try to load the whole file into memory, potentially crashing the whole machine.
  • Action: Introduced size limits in line with the Slack API, and used io.LimitReader to strictly bound how much is read so that memory cannot be exhausted.

Issue with sensitive information appearing in debug logs

https://github.com/catatsuy/notify_slack/pull/241

  • Before: When -debug was enabled, sensitive information such as Slack tokens was included in the output.
  • Action: Reworked the debug output so that sensitive information is excluded or masked.
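
One simple way to mask a value before logging it is sketched below. The function name maskSecret, the number of visible characters, and the sample token are all assumptions for this example, not what the PR actually does.

```go
package main

import "fmt"

// maskSecret keeps only a short prefix of s so that a log reader can tell
// which kind of credential was used without seeing the secret itself.
// Short values are masked completely. Tokens are assumed to be ASCII.
func maskSecret(s string) string {
	const keep = 4
	if len(s) <= keep*2 {
		return "****"
	}
	return s[:keep] + "****"
}

func main() {
	token := "xoxb-0000000000-example" // dummy value, not a real token
	fmt.Printf("debug: token=%s\n", maskSecret(token))
}
```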

private-isu

A repository that can be used for in-house ISUCON events or ISUCON practice. It contains implementations in multiple languages. These were not security issues, but I found and fixed several places where the implementations in different languages were incompatible with one another (though I am not certain this is how Takumi is intended to be used). If you are interested, please take a look at the commits.

https://github.com/catatsuy/private-isu

Acknowledgments

I am grateful for the opportunity to try this out for free as part of the Open Source Developer Support Program. I will continue to make the most of it during the program period. If you have a public repository on GitHub, I highly recommend applying.

https://group.gmo/security/oss-support/

Summary

  • Takumi is an AI that reads source code carefully, and it provides many welcome insights. On the other hand, it seems weaker at making design judgments that fit CLI tools and at keeping up with the latest Go language features.
  • Since it does not provide "how to fix" instructions, it seems best to handle the implementation yourself or with another AI and then use Takumi for the final review.
  • The Open Source Developer Support Program is definitely worth applying for if you are involved in open source development.
