Visual Regression Testing with Ebitengine
This is a note I made while considering how to properly test screen rendering operations when creating a game using Ebitengine (formerly Ebiten), a game engine for Go ✍
Note that since I am developing on Ubuntu 22.04, there might be different pitfalls in other environments like Windows or macOS.
Please bear with me 🦵
What I Investigated
Visual regression testing is a type of automated testing in which you save the output produced before a code change (for example, the DOM structure of a web app or the PDF of a generated report) in the repository, then compare it with the output produced after the change to confirm that nothing differs.
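To make the "compare before and after" idea concrete, here is a minimal sketch that uses only the standard library. The helper names diffPercent and filled are hypothetical ones I made up for illustration; the sketch counts pixels that are not exactly equal, which is not necessarily how any particular diff library computes its score.

```go
package main

import (
	"errors"
	"fmt"
	"image"
	"image/color"
)

// diffPercent returns the percentage (0 to 100) of pixels whose color differs
// between before and after. Hypothetical helper, not part of Ebitengine.
func diffPercent(before, after image.Image) (float64, error) {
	size := before.Bounds().Size()
	if size != after.Bounds().Size() {
		return 0, errors.New("image sizes differ")
	}
	if size.X == 0 || size.Y == 0 {
		return 0, nil
	}
	bo, ao := before.Bounds().Min, after.Bounds().Min
	differing := 0
	for y := 0; y < size.Y; y++ {
		for x := 0; x < size.X; x++ {
			r1, g1, b1, a1 := before.At(bo.X+x, bo.Y+y).RGBA()
			r2, g2, b2, a2 := after.At(ao.X+x, ao.Y+y).RGBA()
			if r1 != r2 || g1 != g2 || b1 != b2 || a1 != a2 {
				differing++
			}
		}
	}
	return float64(differing) / float64(size.X*size.Y) * 100, nil
}

// filled creates a w x h image filled with a single opaque color.
func filled(w, h int, r, g, b uint8) *image.RGBA {
	img := image.NewRGBA(image.Rect(0, 0, w, h))
	for y := 0; y < h; y++ {
		for x := 0; x < w; x++ {
			img.SetRGBA(x, y, color.RGBA{R: r, G: g, B: b, A: 255})
		}
	}
	return img
}

func main() {
	before := filled(4, 4, 0, 0, 0)
	after := filled(4, 4, 0, 0, 0)
	// Simulate a rendering change that touches one of the 16 pixels.
	after.SetRGBA(1, 2, color.RGBA{R: 255, G: 255, B: 255, A: 255})

	percent, err := diffPercent(before, after)
	if err != nil {
		panic(err)
	}
	fmt.Printf("%.2f%% of pixels differ\n", percent) // prints "6.25% of pixels differ"
}
```

A threshold of 0 on a score like this means that a single changed pixel fails the test, which is strict but easy to reason about.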
To implement this test, you need to generate a PNG image from ebiten.Image, the type that represents an image or the screen in Ebitengine. However, if you simply do this from a unit test case, accessing the pixel data of the Image fails with panic: buffered: the command queue is not available yet at ~.
According to the related Ebitengine issue, the solution seems to be to call ebiten.RunGame() inside TestMain() and run the tests from within the game loop.
Also, since Ebitengine currently lacks a headless mode, the above alone is not enough in CI environments like GitHub Actions: the build fails with fatal error: X11/Xcursor/Xcursor.h: No such file or directory ~ until the X11 development packages are installed, and the game still needs a display to run.
For the display, starting Xvfb, a virtual display server that lets you run GUI applications headlessly, on CI seems to work well.
The following link provides a clear explanation of how to use Xvfb:
Test Implementation
Below is a rough implementation based on the considerations above.
First, prepare a Game implementation for testing.
package test

import (
	"errors"
	"os"
	"testing"

	"github.com/hajimehoshi/ebiten/v2"
)

// regularTermination is returned from Update to stop the game loop
// once the tests have run.
var regularTermination = errors.New("regular termination")

// game runs the whole test suite from inside Ebitengine's game loop.
type game struct {
	m    *testing.M
	code int
}

// Update runs all tests on the first tick, then ends the loop.
func (g *game) Update() error {
	g.code = g.m.Run()
	return regularTermination
}

func (*game) Draw(*ebiten.Image) {
}

func (g *game) Layout(int, int) (int, int) {
	return 1, 1
}

// RunTestGame is meant to be called from TestMain.
func RunTestGame(m *testing.M) {
	ebiten.SetWindowSize(128, 72)
	ebiten.SetInitFocused(false)
	ebiten.SetWindowTitle("Testing...")
	g := &game{
		m: m,
	}
	if err := ebiten.RunGame(g); err != nil && err != regularTermination {
		panic(err)
	}
	os.Exit(g.code)
}
Next, define a test function that takes an ebiten.Image, creates a snapshot PNG file, and checks for differences.
Regarding the generation of difference images, someone created image-diff, which is perfect for this use case, so I used it here.
package test

import (
	"errors"
	"fmt"
	"image"
	"image/png"
	"os"
	"path"
	"runtime"
	"strconv"
	"strings"
	"testing"

	"github.com/hajimehoshi/ebiten/v2"
	diff "github.com/olegfedoseev/image-diff"
)

// SnapshotErrorThreshold is the largest diff percentage that still counts as a match.
const SnapshotErrorThreshold = 0.0

// writePNG encodes img as a PNG file at filePath.
func writePNG(filePath string, img image.Image) error {
	f, err := os.Create(filePath)
	if err != nil {
		return err
	}
	if err := png.Encode(f, img); err != nil {
		_ = f.Close()
		return err
	}
	return f.Close()
}

// CheckSnapshot compares actualImage with the snapshot stored next to the
// calling test file. Failures are returned as errors instead of aborting
// the whole test process with log.Fatal.
func CheckSnapshot(t *testing.T, actualImage *ebiten.Image) error {
	// Snapshots live in a snapshot/ directory beside the caller's source file.
	_, callerSourceFileName, _, ok := runtime.Caller(1)
	if !ok {
		return fmt.Errorf("failed to read filename: %v", t.Name())
	}
	basePath := path.Join(path.Dir(callerSourceFileName), "snapshot")
	baseFileName := strings.ReplaceAll(t.Name(), "/", "_")
	expectedFilePath := path.Join(basePath, fmt.Sprintf("%v.png", baseFileName))
	actualFilePath := path.Join(basePath, fmt.Sprintf("%v_actual.png", baseFileName))
	diffFilePath := path.Join(basePath, fmt.Sprintf("%v_diff.png", baseFileName))
	if err := os.MkdirAll(basePath, os.ModePerm); err != nil {
		return err
	}

	// Load the expected image if a snapshot already exists.
	var expectedImage image.Image
	expectedFile, err := os.Open(expectedFilePath)
	if err == nil {
		expectedImage, _, err = image.Decode(expectedFile)
		_ = expectedFile.Close()
		if err != nil {
			return err
		}
	} else if !errors.Is(err, os.ErrNotExist) {
		return err
	}

	// Remove leftovers from a previous failed run.
	_ = os.Remove(diffFilePath)
	_ = os.Remove(actualFilePath)

	updateSnapshot, _ := strconv.ParseBool(os.Getenv("UPDATE_SNAPSHOT"))
	if expectedImage != nil && !updateSnapshot {
		diffImage, percent, err := diff.CompareImages(actualImage, expectedImage)
		if err != nil {
			return err
		}
		if percent > SnapshotErrorThreshold {
			// Save the diff and the actual output for visual inspection.
			if err := writePNG(diffFilePath, diffImage); err != nil {
				return err
			}
			if err := writePNG(actualFilePath, actualImage); err != nil {
				return err
			}
			return fmt.Errorf(
				"snapshot test failed: diff = %v > %v, file = %v",
				percent,
				SnapshotErrorThreshold,
				diffFilePath)
		}
	}

	// No snapshot yet, a match, or an explicit update: write the expected image.
	return writePNG(expectedFilePath, actualImage)
}
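The file naming rule is easy to get subtly wrong, so it may be worth isolating it in a pure function that can be unit tested without a display. The following is only a sketch: snapshotPaths and pathsFor are hypothetical names, and it uses path/filepath rather than path, which is my own choice and not what the code above does.

```go
package main

import (
	"fmt"
	"path/filepath"
	"strings"
)

// snapshotPaths holds the three files that belong to one test case.
type snapshotPaths struct {
	Expected string // committed to the repository
	Actual   string // written only when the comparison fails
	Diff     string // written only when the comparison fails
}

// pathsFor derives the snapshot file names from a directory and a test name.
// Subtest names contain "/", which is replaced with "_" to get a flat file name.
func pathsFor(dir, testName string) snapshotPaths {
	base := strings.ReplaceAll(testName, "/", "_")
	return snapshotPaths{
		Expected: filepath.Join(dir, base+".png"),
		Actual:   filepath.Join(dir, base+"_actual.png"),
		Diff:     filepath.Join(dir, base+"_diff.png"),
	}
}

func main() {
	p := pathsFor("snapshot", "TestExample_PrintMessage/text_1")
	fmt.Println(p.Expected)
	fmt.Println(p.Actual)
	fmt.Println(p.Diff)
}
```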
Note that tests that call RunGame currently cannot avoid a window flashing on screen for a moment. Also, because the tests themselves may be unstable depending on the environment, I put them behind a termtests build tag so that they only run when requested.
To run them:
go test -tags=termtests -v ./...
The test code looks something like this:
//go:build termtests
// +build termtests

package test_test

import (
	"fmt"
	"image/color"
	"testing"

	"github.com/hajimehoshi/ebiten/v2"
	"github.com/hajimehoshi/ebiten/v2/ebitenutil"

	// Change this to match your package
	"github.com/org/repo/test"
)

func TestExample_PrintMessage(t *testing.T) {
	const (
		Width  = 128
		Height = 72
	)
	tests := []struct {
		text string
	}{
		{text: ""},
		{text: "TestABC"},
	}
	for i, tt := range tests {
		t.Run(fmt.Sprintf("text_%v", i), func(t *testing.T) {
			// Create test image
			image := ebiten.NewImage(Width, Height)
			image.Fill(color.Black)
			// Call the process to be tested
			PrintMessage(image, tt.text)
			// Check result
			err := test.CheckSnapshot(t, image)
			if err != nil {
				t.Error(err)
			}
		})
	}
}

// TestMain hands control to the game loop so that the tests run inside RunGame.
func TestMain(m *testing.M) {
	test.RunTestGame(m)
}

func PrintMessage(image *ebiten.Image, str string) {
	ebitenutil.DebugPrint(image, str)
}
When you run the above test, a directory named snapshot/ is created in the same directory as the test code, and the image files output from ebiten.Image after the test execution are stored there.
$ ls -l snapshot/
total 24
-rw-rw-r-- 1 tkhs tkhs 229 Oct 3 15:25 TestExample_PrintMessage_text_0.png
-rw-rw-r-- 1 tkhs tkhs 330 Oct 3 15:25 TestExample_PrintMessage_text_1.png
The content looks like this:

TestExample_PrintMessage_text_1.png
After generating the images, if you modify the function content and run the test again, an error will occur because the content of the actual image differs from the expected image.
For example, let's say you change the function as follows:
func PrintMessage(image *ebiten.Image, str string) {
ebitenutil.DebugPrint(image, "Hello, "+str)
}
The test fails:
=== RUN TestExample_PrintMessage/text_0
example_test.go:40: snapshot test failed: diff = 0.8572048611111112 > 0, file = /your-path/test/snapshot/TestExample_PrintMessage_text_0_diff.png
=== RUN TestExample_PrintMessage/text_1
example_test.go:40: snapshot test failed: diff = 2.528211805555556 > 0, file = /your-path/test/snapshot/TestExample_PrintMessage_text_1_diff.png
--- FAIL: TestExample_PrintMessage (0.02s)
--- FAIL: TestExample_PrintMessage/text_0 (0.01s)
--- FAIL: TestExample_PrintMessage/text_1 (0.01s)
As a result, new images named *_actual.png and *_diff.png are output under the snapshot/ directory.
The contents look something like this:

TestExample_PrintMessage_text_1_actual.png

TestExample_PrintMessage_text_1_diff.png
Since these images are for visual verification and should not be committed to the repository, it's a good idea to ignore them.
**/snapshot/*_actual.png
**/snapshot/*_diff.png
Once you confirm that the difference is as expected (in this case, it seems fine since we added Hello, to the beginning of the string), run the test again with a truthy value set in the UPDATE_SNAPSHOT environment variable, such as UPDATE_SNAPSHOT=1 go test ~.
The expected images will be updated, so you can commit them to the repository to complete the test.
From then on, the test will fail if a change to seemingly unrelated code unintentionally alters this behavior, which makes refactoring and similar work easier 🍮
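On what counts as a "truthy value" for UPDATE_SNAPSHOT: the check relies on strconv.ParseBool, which only accepts a fixed set of spellings, and anything else is treated as false because the parse error is ignored. A small sketch of that behavior follows (shouldUpdate is a hypothetical helper name):

```go
package main

import (
	"fmt"
	"strconv"
)

// shouldUpdate reports whether value asks for a snapshot update.
// Values that strconv.ParseBool cannot parse are treated as false.
func shouldUpdate(value string) bool {
	update, err := strconv.ParseBool(value)
	return err == nil && update
}

func main() {
	for _, v := range []string{"1", "true", "TRUE", "yes", "0", ""} {
		fmt.Printf("UPDATE_SNAPSHOT=%q -> update: %v\n", v, shouldUpdate(v))
	}
}
```

So UPDATE_SNAPSHOT=1 and UPDATE_SNAPSHOT=true both work, while something like UPDATE_SNAPSHOT=yes silently does nothing.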
Running on GitHub Actions
When running tests on GitHub Actions, I was able to achieve it using the method introduced below:
I believe the configuration would look something like this:
name: Check
on: push
jobs:
  test:
    timeout-minutes: 5
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v3
      - uses: actions/setup-go@v3
        with:
          go-version: 1.19
      - run: |
          # https://ebitengine.org/ja/documents/install.html#Debian_/_Ubuntu
          sudo apt install -y libc6-dev libglu1-mesa-dev libgl1-mesa-dev libxcursor-dev libxi-dev libxinerama-dev libxrandr-dev libxxf86vm-dev libasound2-dev pkg-config
      - run: |
          # https://stackoverflow.com/questions/63125480/running-a-gui-application-on-a-ci-service-without-x11
          export DISPLAY=:99
          sudo Xvfb -ac :99 -screen 0 1280x1024x24 > /dev/null 2>&1 &
          go test -tags=termtests -v ./...
What I learned from trying this is that it can be quite a lot of work to do the "modify → launch → visual check" cycle, especially for something like a single component of a game. Being able to easily notice when existing behavior is broken provides a sense of security.
Also, as with Test-Driven Development (TDD) in general, the cycle of "tweak the implementation → a test breaks → fix it" feels like a game in itself and is fun. I recommend it partly because it makes you feel that it's okay even if the game itself never gets finished 😊