Collecting Git Logs and Preventing Key Leakage with git-secrets

Introduction

This article is about Git operations.
I could have written these as separate posts, but since each topic's volume isn't that large, I've combined them into one article.
These are just some small tips, so I hope you enjoy reading it casually.

A Shell Script for Collecting Git Commit Messages

Introduction

Have you ever suddenly wanted to retrieve commit messages from Git?
I have.
So, in this post, I will first introduce a command to retrieve Git commit messages, followed by a script to do it more efficiently.

Quick Result

With the following shell script, you can retrieve the commit messages of any repository from any given date onward.

#!/bin/bash
# Remember where the script was started; prompt.txt is written there.
CURRENT=$(pwd)
usage() {
    cat  1>&2 <<EOF
Usage: $(basename "$0") [OPTIONS]
Options:
    -d YYYY-mm-dd       get logs after date
    -p DIR    Path to get logs from (default: .)
    -t Prompt message title
    -m Match Regex
EOF
    exit 1
}
pass='.'
date=$(date '+%Y/%m/%d')
promptTitle='Please summarize the following commit messages.'
match="^- Jwt"
while getopts 'p:d:t:m:h' opt; do
    case "${opt}" in
        d) date="${OPTARG}" ;;
        p) pass="${OPTARG}" ;;
        t) promptTitle="${OPTARG}" ;;
        m) match="${OPTARG}" ;;
        h) usage ;;
        ?) usage ;;
    esac
done
# Paths are quoted so that directories containing spaces also work.
git -C "$pass" log --after "${date}" --pretty=format:"- %s" > "$CURRENT/prompt.txt"
sed -i  "1i${promptTitle}\n"  "$CURRENT/prompt.txt"
sed -i "/${match}/d" "$CURRENT/prompt.txt"
# This file is intended for use on Linux.
# Therefore, you need to have xsel installed beforehand via "sudo apt-get install xsel".
cat "$CURRENT/prompt.txt"
cat "$CURRENT/prompt.txt" | xsel --clipboard --input

Let's take a look at what it's doing from the top.
First, in the following code, the CURRENT variable stores the path of the directory where the shell is currently being executed.
After that, a function is created to display a message when an unexpected option or the -h option is used.

CURRENT=$(pwd)
usage() {
    cat  1>&2 <<EOF
Usage: $(basename "$0") [OPTIONS]
Options:
    -d YYYY-mm-dd       get logs after date
    -p DIR    Path to get logs from (default: .)
    -t Prompt message title
    -m Match Regex
EOF
    exit 1
}

Let's look at the usage function in a bit more detail.
cat is a command that copies its input (here, standard input) to standard output.
Simply put, it prints whatever it is given.
<< starts a here-document: the shell feeds every following line to the command as standard input until a line containing only the specified delimiter appears.
While you can use any string for the delimiter, it's conventional to use EOF.
You could print the same text without a here-document, for example with echo, but then you would have to embed the line breaks yourself, which makes the code harder to read.
For this reason, I used a here-document so that the layout in the shell file matches the actual output.
Since I'm using "EOF" as the delimiter, everything from "Usage" to "Regex" will be output to the screen.
1>&2 redirects standard output to standard error.
The numbers are file descriptors, numbers that identify a process's input and output streams.
1 stands for standard output, and 2 stands for standard error.
>& makes the descriptor on the left write to wherever the descriptor on the right points.
Note that & is needed only when the target is a file descriptor; to write to a regular file, something like > test.txt is sufficient.
Essentially, 1>&2 means the displayed message is treated as error output.
If you just want to display it on the screen, 1>&2 isn't strictly necessary, but since this function is not part of the main execution flow, I've added 1>&2 so that the message goes to the error stream instead of being mixed into normal output.
basename $0 returns the file name of the running script without its directory.
exit 1 terminates the script once the cat command completes, because I don't want the subsequent processing to run.
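
The effect of combining a here-document with 1>&2 can be checked in isolation. The following is a minimal sketch, not part of the original script; the function name show_help and its two-line help text are made up for the demonstration.

```shell
#!/bin/bash
# Hypothetical helper: prints a two-line help text to standard error.
show_help() {
    cat 1>&2 <<EOF
Usage: demo [OPTIONS]
    -h  show this help
EOF
}

# With stderr discarded, nothing is left on standard output.
captured_stdout=$(show_help 2>/dev/null)
# With stderr redirected into stdout, the help text can be captured.
captured_stderr=$(show_help 2>&1)

printf 'stdout: [%s]\n' "$captured_stdout"
printf 'stderr: [%s]\n' "$captured_stderr"
```

The first capture is empty because the help text never travels through standard output, which is what keeps it out of pipes and files that collect the script's normal output.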

Next, let's look at the following code.

pass='.'
date=$(date '+%Y/%m/%d')
promptTitle='Please summarize the following commit messages.'
match="^- Jwt"
while getopts 'p:d:t:m:h' opt; do
    case "${opt}" in
        d) date="${OPTARG}" ;;
        p) pass="${OPTARG}" ;;
        t) promptTitle="${OPTARG}" ;;
        m) match="${OPTARG}" ;;
        h) usage ;;
        ?) usage ;;
    esac
done

This part retrieves the option values when the shell file is executed and stores them in variables.
Of the default values, only date is computed at run time: date '+%Y/%m/%d' returns the date on which the script is executed.
The while loop calls getopts repeatedly, and the string next to it ('p:d:t:m:h') lists the options the script accepts.
Note that by adding a colon (":") to the right of an option name, that option is recognized as one that takes an argument.
The option retrieved by getopts is set to opt, and then checked for matches within the case statement.
If an option matches d, p, t, or m, the argument's value is stored in OPTARG, which is then assigned to the target variable.
Specifying the -h option or a non-existent option will execute the usage function defined earlier.
With this, you can now configure which project, for which period, and how you want to retrieve the commit messages.
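
To try the getopts loop on its own, here is a small sketch under stated assumptions: parse_demo, demo_date, and demo_path are hypothetical names, and the function only prints the parsed values instead of running git.

```shell
#!/bin/bash
# Hypothetical demo: parses -d and -p the same way the script does
# and prints the resulting values.
parse_demo() {
    demo_date='today'
    demo_path='.'
    OPTIND=1   # reset so the function can be called more than once
    while getopts 'p:d:h' opt; do
        case "${opt}" in
            d) demo_date="${OPTARG}" ;;
            p) demo_path="${OPTARG}" ;;
            *) echo 'usage: parse_demo [-d DATE] [-p DIR]' 1>&2; return 1 ;;
        esac
    done
    echo "date=${demo_date} path=${demo_path}"
}

parse_demo -d 2023/11/10 -p /var/www/nestjs
parse_demo -p /tmp
```

Options that are not passed keep their defaults, and an unknown option such as -x falls through to the last branch, just as the real script falls through to usage.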

Finally, let's look at the following code.

git -C "$pass" log --after "${date}" --pretty=format:"- %s" > "$CURRENT/prompt.txt"
sed -i  "1i${promptTitle}\n"  "$CURRENT/prompt.txt"
sed -i "/${match}/d" "$CURRENT/prompt.txt"
# This file is intended for use on Linux.
# Therefore, you need to have xsel installed beforehand via "sudo apt-get install xsel".
cat "$CURRENT/prompt.txt"
cat "$CURRENT/prompt.txt" | xsel --clipboard --input

First, we retrieve the Git commit content with the git log command.
By specifying the -C option before log, you can specify the path of the project from which you want to retrieve logs.
Furthermore, by using the --after option, you can retrieve logs after a specific date.
Additionally, I'm using --pretty=format:"- %s" to retrieve only the commit message titles while prefixing them with "- ".
Finally, the retrieved content is written to a prompt.txt file.
sed is a stream editor that can insert, delete, or replace lines of text.
In this case, 1i inserts the title before the first line of the prompt.txt file created earlier, and /regex/d deletes every line that matches the regular expression.
Note that without the -i option, the file isn't overwritten and the contents of prompt.txt itself wouldn't change, so I've included the -i option.
Finally, the following cat commands display the contents of prompt.txt in the terminal and save the contents to the clipboard.

cat "$CURRENT/prompt.txt"
cat "$CURRENT/prompt.txt" | xsel --clipboard --input

The xsel command allows you to save content to the clipboard.
However, since xsel is not pre-installed on Ubuntu by default, please install it separately.
Now you can retrieve commit messages from a specified project and save them to your clipboard.
By running a command like ./create-prompt.sh -d 2023/11/10 -p /var/www/nestjs, you can paste text like the following using Ctrl + v:

Please summarize the following commit messages.
- Repository creation for Prisma verification
- Changes to devcontainer
- Appending to README and changing the version of a custom module
- Re-importing JwtAuthGuard.
- Creating a guard using passport-jwt
- Implementing JWT issuance process
- Moved guard to a separate file
- Implementation of LocalStrategy and guard completion
- Implementing class and methods for passport user authentication
- Project creation for passport verification
- Deleted unnecessary items
- Project creation for passport module verification
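
To see how the title insertion and the default -m filter shape output like this, the two sed steps can be run on a throwaway file. This is a sketch that assumes GNU sed, as found on Linux; the sample lines and the temporary file are illustrative only, and the trailing \n of the original command is left out to keep the result compact.

```shell
#!/bin/bash
# Assumes GNU sed: the one-line "1i" form is a GNU extension.
tmpfile=$(mktemp)
printf '%s\n' \
    '- Jwt guard tweak' \
    '- Re-importing JwtAuthGuard.' \
    '- Deleted unnecessary items' > "$tmpfile"

title='Please summarize the following commit messages.'
match='^- Jwt'

sed -i "1i${title}" "$tmpfile"     # put the title on the first line
sed -i "/${match}/d" "$tmpfile"    # drop every line matching the regex

result=$(cat "$tmpfile")
rm -f "$tmpfile"
printf '%s\n' "$result"
```

Because the pattern is anchored with ^, only lines that begin with "- Jwt" are removed; a line that merely contains "Jwt" later on, such as "- Re-importing JwtAuthGuard.", is kept.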

Why did I want to collect commit messages?

Finally, let me explain my motivation for wanting to collect Git commit messages in the first place.
It was because I wondered if I could use commit messages to summarize my work progress.
I was imagining that this would allow me to verbalize not just the final result, but also the current status and the process itself.

So, I created the shell script to collect commit messages and fed the results into a generative AI.
The result was that it mostly just echoed back what I provided.
To be honest, it wasn't the result I was hoping for, but that's expected. Generative AI won't magically do everything for you if your own goals for what you want to derive from the commit messages are vague.

For now, I've built the mechanism to collect the messages. Moving forward, I'll think about specifically what I want to achieve and how to translate that into a prompt.
If you have any ideas like "How about this?", I'd love to hear them in the comments!

References

https://scrapbox.io/nwtgck/gitでcdせずに指定したディレクトリ内でgitコマンド実行するには-Cオプションを使える
https://scrapbox.io/nwtgck/git標準だけで、logの情報をJSONにして取得する方法
https://it-ojisan.tokyo/sed-d/
https://qiita.com/kagami_t/items/84f6ec3142a8b370d908
https://maku77.github.io/p/2fyizgw/
https://wa3.i-3-i.info/word14383.html
http://to-developer.com/blog/?p=1001

Preventing Access Key Leakage when Publishing AWS Terraform on Git

Introduction

Previously, I used Terraform to convert my AWS configurations into IaC.
Since I had put it into code, I felt like I wanted to manage it with Git.
However, seeing articles like "AWS Root Account Leakage and 150,000 Yen Charge Until Waiver" or "A Full Account of AWS Unauthorized Use Leading to a 3 Million Yen Bill Until Waiver" makes me hesitate, even if the charges were eventually waived.
If I made a mistake, uploaded keys, and ended up in the same situation, I'd be in a cold sweat.

Therefore, to prevent accidentally pushing AWS keys, I will introduce git-secrets.
This significantly reduces the possibility of pushing environment variable files containing access keys.
Note that I'm setting up git-secrets on WSL in this instance.

Installing git-secrets

To install git-secrets in a Linux environment, first clone the git-secrets repository with git clone https://github.com/awslabs/git-secrets.git.
After cloning, enter the directory.
There, running sudo make install will make the git secrets command available.

Applying git-secrets to a project

Next, apply the installed git-secrets to the project you want to push.
First, go to the project you want to push and generate a .git directory with the git init command.
After that, enable git-secrets using git secrets --install.
Once the above is complete, execute git secrets --register-aws.
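
The installation and setup steps described so far can be collected into two small functions. This is only a sketch: the function names and the project directory argument are hypothetical, and the functions are defined but not called, since they need network access and sudo.

```shell
#!/bin/bash
# Clone git-secrets and install it; run this from the directory
# where the clone should be placed.
install_git_secrets() {
    git clone https://github.com/awslabs/git-secrets.git &&
        ( cd git-secrets && sudo make install )
}

# Enable git-secrets with the AWS patterns in one project.
# $1: path of the project directory (hypothetical argument).
enable_git_secrets() {
    (
        cd "$1" &&
            git init &&
            git secrets --install &&
            git secrets --register-aws
    )
}
```

A call such as enable_git_secrets ~/my-terraform-project (a made-up path) would then run the three commands in that project. The subshells keep the cd from changing the caller's working directory.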

Now, let's verify the operation.
Create any file that is not targeted by .gitignore.

AWSAccessKeyId=AKIAXXXXXXXXXXXXXXXX
aws_secret_access_key=XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX

After that, if you try to commit that file, git-secrets reports an error like the one in the screenshot, and the commit fails.
[Screenshot: git-secrets error on commit]
This confirms that git-secrets is active.
One thing to note is that, as mentioned in this article, access keys are fixed at 20 characters and secret keys at 40 characters, so shorter test values will not be detected.
Therefore, a file like the following can be committed without any specific error:

aws_secret_access_key=test

I didn't realize this and struggled for about 30 minutes wondering why it wasn't working.

Viewing what git-secrets is inspecting

Up to this point, we have confirmed that git-secrets prevents AWS keys from being committed. We also observed that the access key must be 20 characters and the secret key must be 40 characters for the check to trigger. This is somewhat curious. So, in this section, we will look into what git-secrets is inspecting and see why it doesn't work correctly unless the keys have these specific lengths.

To check the inspection patterns configured in the project, execute git secrets --list. Then, a result like the following will be returned:

secrets.providers git secrets --aws-provider
secrets.patterns (A3T[A-Z0-9]|AKIA|AGPA|AIDA|AROA|AIPA|ANPA|ANVA|ASIA)[A-Z0-9]{16}
secrets.patterns ("|')?(AWS|aws|Aws)?_?(SECRET|secret|Secret)?_?(ACCESS|access|Access)?_?(KEY|key|Key)("|')?\s*(:|=>|=)\s*("|')?[A-Za-z0-9/\+=]{40}("|')?
secrets.patterns ("|')?(AWS|aws|Aws)?_?(ACCOUNT|account|Account)_?(ID|id|Id)?("|')?\s*(:|=>|=)\s*("|')?[0-9]{4}\-?[0-9]{4}\-?[0-9]{4}("|')?
secrets.allowed AKIAIOSFODNN7EXAMPLE
secrets.allowed wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY

The providers entry runs git secrets --aws-provider, which prohibits the access key and secret key stored in ~/.aws/credentials from appearing in commits. This is also important, but in this case, we will particularly look at patterns.

Patterns are settings to prohibit commits containing matching strings. Regular expressions are used for the settings, and looking at these regular expressions, you can understand why the character counts are fixed. First, check the following pattern:

(A3T[A-Z0-9]|AKIA|AGPA|AIDA|AROA|AIPA|ANPA|ANVA|ASIA)[A-Z0-9]{16}

It detects a string that starts with A3T followed by 'a single uppercase letter or a single digit from 0 to 9', or starts with AKIA, AGPA, AIDA, AROA, AIPA, ANPA, ANVA, or ASIA, and is followed by 16 characters consisting of uppercase letters and digits. The four-character prefix plus these 16 characters gives the 20 characters of an AWS access key. Thus, this regular expression is specifically for prohibiting access keys.

Next is the following regular expression:

("|')?(AWS|aws|Aws)?_?(SECRET|secret|Secret)?_?(ACCESS|access|Access)?_?(KEY|key|Key)("|')?\s*(:|=>|=)\s*("|')?[A-Za-z0-9/\+=]{40}("|')?

Since it's difficult to explain every part, please understand that things like the following would be blocked. Note that AWS, SECRET, and ACCESS are each optional, but the KEY part is required:

AWS_SECRET_ACCESS_KEY=40 characters consisting of letters, digits, /, +, or =
secret_access_key:40 characters consisting of letters, digits, /, +, or =
"Key" => 40 characters consisting of letters, digits, /, +, or =

By decoding the expressions configured in git-secrets, we can see why the lengths are fixed. Based on the regular expressions in the patterns, it turns out that the access key must be 20 characters and the secret key must be 40 characters.
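
The length requirement can be reproduced outside git-secrets with plain grep -E. The following is a sketch under assumptions: the patterns are the ones registered by --register-aws, with \s written as [[:space:]] for portability, and repeat_char, is_blocked, and report are hypothetical helpers. It only imitates the pattern match, not git-secrets itself.

```shell
#!/bin/bash
access_key_pattern='(A3T[A-Z0-9]|AKIA|AGPA|AIDA|AROA|AIPA|ANPA|ANVA|ASIA)[A-Z0-9]{16}'
secret_key_pattern="(\"|')?(AWS|aws|Aws)?_?(SECRET|secret|Secret)?_?(ACCESS|access|Access)?_?(KEY|key|Key)(\"|')?[[:space:]]*(:|=>|=)[[:space:]]*(\"|')?[A-Za-z0-9/\\+=]{40}(\"|')?"

# repeat_char X 16 prints "X" sixteen times.
repeat_char() { printf "%${2}s" '' | tr ' ' "$1"; }

# Succeeds when the text ($2) contains a match for the pattern ($1).
is_blocked() { printf '%s\n' "$2" | grep -Eq -e "$1"; }

report() {
    if is_blocked "$1" "$2"; then
        echo "blocked: $2"
    else
        echo "allowed: $2"
    fi
}

report "$access_key_pattern" "AWSAccessKeyId=AKIA$(repeat_char X 16)"
report "$access_key_pattern" "AWSAccessKeyId=AKIA$(repeat_char X 10)"
report "$secret_key_pattern" "aws_secret_access_key=$(repeat_char X 40)"
report "$secret_key_pattern" "aws_secret_access_key=test"
```

The first and third lines are reported as blocked and the other two as allowed, which mirrors the earlier observation that aws_secret_access_key=test can be committed without an error.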

I'll skip the remaining regular expression since it's not used this time.

Finally, I'll briefly touch upon secrets.allowed.

secrets.allowed is used to set exceptions that should not be detected by patterns. The values present in allowed in this case are the example keys that AWS uses in its documentation. Since they are not real credentials, there is no problem even if they are pushed, so they are configured not to trigger an error.

That concludes the check details registered via git secrets --register-aws. In this instance, since I used the template provided by git-secrets, the check details were automatically applied. However, if you want to use custom check items, you can do so using the --add option. Additionally, if you want to perform AWS-related checks across all projects, you can do so by running git secrets --register-aws --global with the global option. There are many other options available, so please refer to GitHub.
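
As a rough sketch of the --add and --global options mentioned above: the pattern and the allowed literal below are hypothetical example values, and the function names are made up. The functions are defined but not called, since they need git-secrets to be installed and add_custom_rules must be run inside a repository.

```shell
#!/bin/bash
# Add a project-specific prohibited pattern and an allowed literal.
# Both values are examples only.
add_custom_rules() {
    git secrets --add 'password\s*=\s*.+'
    git secrets --add --allowed --literal 'ex@mplepassword'
}

# Register the AWS patterns in the global Git configuration.
register_aws_globally() {
    git secrets --register-aws --global
}
```

After adding a rule, git secrets --list shows it alongside the AWS patterns, which is a quick way to confirm what is being inspected.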

References

https://qiita.com/kannkyo/items/465be766b5af0bc89749
https://qiita.com/michihito_t/items/ee1e7d73f11c6ede3f06
https://github.com/awslabs/git-secrets

Conclusion

In this post, I created a script to retrieve Git logs in one shot and wrote about git-secrets to prevent AWS key leaks.
Both help make daily work a bit easier and seem to have quite a bit of depth.
I hope this helps make your Git life more comfortable.
Thank you for reading this far!
