Query-based JSON Parser


Go ships a JSON parser in its standard library, encoding/json, but there are various third-party packages as well. For example, github.com/goccy/go-json is said to be a fast parser compatible with encoding/json; I introduced it in an earlier article.

https://zenn.dev/spiegel/articles/20210404-another-json-package

The encoding/json standard package maps the entire JSON document into a struct of your choosing or into an associative array of type map[string]interface{}. It would also be convenient to have a parser of the kind that lets you retrieve individual values by issuing queries, as jq does.
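To make the contrast concrete, here is a minimal sketch of reaching one nested value with encoding/json alone. The lookup and handle functions are hypothetical helpers written for this illustration, not part of any package; the point is that the whole document is decoded first and then walked with type assertions.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
	"os"
)

// lookup is a hypothetical helper (not part of encoding/json): it walks an
// already decoded JSON value along the given object keys.
func lookup(v interface{}, keys ...string) (interface{}, bool) {
	for _, k := range keys {
		m, ok := v.(map[string]interface{})
		if !ok {
			return nil, false
		}
		if v, ok = m[k]; !ok {
			return nil, false
		}
	}
	return v, true
}

// handle decodes the whole document, then extracts person.github.handle.
func handle(data []byte) (string, error) {
	var doc interface{}
	if err := json.Unmarshal(data, &doc); err != nil {
		return "", err
	}
	v, ok := lookup(doc, "person", "github", "handle")
	if !ok {
		return "", errors.New("path not found")
	}
	s, ok := v.(string)
	if !ok {
		return "", errors.New("value is not a string")
	}
	return s, nil
}

func main() {
	data := []byte(`{"person": {"github": {"handle": "buger", "followers": 109}}}`)
	s, err := handle(data)
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		return
	}
	fmt.Println(s)
}
```

Every intermediate map and string is allocated even though only one value is needed, which is the overhead a query-style parser tries to avoid.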

Two years ago I wrote gjq as a side project, using the github.com/savaki/jq package as its parser. However, that package appears not to have been updated for about five years and does not support Go modules, so I am reluctant to use it now.

Recently, I learned about the github.com/buger/jsonparser package. Although it is slightly different from jq, it also seems to parse JSON data by specifying elements. It looks like this:

sample1.go
// +build run

package main

import (
    "fmt"
    "os"

    "github.com/buger/jsonparser"
)

var jsondata = []byte(`{
  "person": {
    "name": {
      "first": "Leonid",
      "last": "Bugaev",
      "fullName": "Leonid Bugaev"
    },
    "github": {
      "handle": "buger",
      "followers": 109
    },
    "avatars": [
      { "url": "https://avatars1.githubusercontent.com/u/14009?v=3&s=460", "type": "thumbnail" }
    ]
  },
  "company": {
    "name": "Acme"
  }
}`)

func main() {
    v, err := jsonparser.GetString(jsondata, "person", "avatars", "[0]", "url")
    if err != nil {
        fmt.Fprintln(os.Stderr, err)
        return
    }
    fmt.Println(v)
    // Output:
    // https://avatars1.githubusercontent.com/u/14009?v=3&s=460
}

Furthermore, it also provides higher-order functions[1] in a for-each style.

sample2.go
// Assumes the same package clause, imports, and jsondata as sample1.go.
func main() {
    if err := jsonparser.ObjectEach(jsondata, func(key []byte, value []byte, dataType jsonparser.ValueType, offset int) error {
        fmt.Printf("Offset: %d\n\tKey: '%s'\n\tValue: '%s'\n\tType: %s\n", offset, string(key), string(value), dataType)
        return nil
    }, "person", "name"); err != nil {
        fmt.Fprintln(os.Stderr, err)
        return
    }
}

It can be written as above. Incidentally, when you run this:

$ go run sample2.go 
Offset: 53
    Key: 'first'
    Value: 'Leonid'
    Type: string
Offset: 77
    Key: 'last'
    Value: 'Bugaev'
    Type: string
Offset: 112
    Key: 'fullName'
    Value: 'Leonid Bugaev'
    Type: string

it outputs as shown above.
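For readers unfamiliar with this callback style, the same shape can be sketched with the standard library only. Note that objectEach and keysOf below are hypothetical helpers written for this illustration; they are not the jsonparser API and make no attempt to match its performance. The sketch uses json.Decoder's token stream so that the callback sees the keys in document order.

```go
package main

import (
	"bytes"
	"encoding/json"
	"errors"
	"fmt"
	"os"
)

// objectEach is a hypothetical helper (not the jsonparser API): it calls fn
// once per key/value pair of a JSON object, in document order.
func objectEach(data []byte, fn func(key string, value json.RawMessage) error) error {
	dec := json.NewDecoder(bytes.NewReader(data))
	tok, err := dec.Token()
	if err != nil {
		return err
	}
	if d, ok := tok.(json.Delim); !ok || d != '{' {
		return errors.New("objectEach: input is not a JSON object")
	}
	for dec.More() {
		keyTok, err := dec.Token()
		if err != nil {
			return err
		}
		key, ok := keyTok.(string)
		if !ok {
			return errors.New("objectEach: unexpected token for key")
		}
		// Decode the value as raw bytes without interpreting it.
		var raw json.RawMessage
		if err := dec.Decode(&raw); err != nil {
			return err
		}
		if err := fn(key, raw); err != nil {
			return err
		}
	}
	_, err = dec.Token() // consume the closing '}'
	return err
}

// keysOf collects the keys by passing a closure to objectEach.
func keysOf(data []byte) ([]string, error) {
	var keys []string
	err := objectEach(data, func(key string, _ json.RawMessage) error {
		keys = append(keys, key)
		return nil
	})
	return keys, err
}

func main() {
	name := []byte(`{"first": "Leonid", "last": "Bugaev", "fullName": "Leonid Bugaev"}`)
	err := objectEach(name, func(key string, value json.RawMessage) error {
		fmt.Printf("Key: %s\tValue: %s\n", key, value)
		return nil
	})
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
	}
}
```

Because the function argument decides what happens to each pair, the iteration logic is written once and reused, as keysOf shows.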

The github.com/buger/jsonparser package claims to be faster than the encoding/json standard package. According to the official benchmarks:

Each test processes a 24kb JSON record (based on Discourse API) It should read 2 arrays, and for each item in array get a few fields. Basically it means processing a full JSON file.

https://github.com/buger/jsonparser/blob/master/benchmark/benchmark_large_payload_test.go

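| Library                   | time/op | bytes/op | allocs/op |
|---------------------------|---------|----------|-----------|
| encoding/json struct      | 748336  | 8272     | 307       |
| encoding/json interface{} | 1224271 | 215425   | 3395      |
| a8m/djson                 | 510082  | 213682   | 2845      |
| pquerna/ffjson            | 312271  | 7792     | 298       |
| mailru/easyjson           | 154186  | 6992     | 288       |
| buger/jsonparser          | 85308   | 0        | 0         |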

jsonparser now is a winner, but do not forget that it is way more lightweight parser than ffson or easyjson, and they have to parser all the data, while jsonparser parse only what you need. All ffjson, easysjon and jsonparser have their own parsing code, and does not depend on encoding/json or interface{}, thats one of the reasons why they are so fast. easyjson also use a bit of unsafe package to reduce memory consuption (in theory it can lead to some unexpected GC issue, but i did not tested enough)

(via “buger/jsonparser: One of the fastest alternative JSON parser for Go that does not require schema”)

As these figures show, jsonparser can (under certain conditions) process JSON considerably faster than the alternatives, and without triggering any allocations.
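As a rough way to see where a figure like allocs/op comes from, the standard testing package can count allocations for a single call. The sketch below measures only encoding/json decoding into interface{}; the payload and the decodeAllocs function are made up for this illustration, and the number it prints is unrelated to the benchmark results quoted above.

```go
package main

import (
	"encoding/json"
	"fmt"
	"testing"
)

// payload is a small made-up document, far smaller than the 24kb record
// used in the quoted benchmark.
var payload = []byte(`{"person": {"name": {"first": "Leonid", "last": "Bugaev"}}}`)

// decodeAllocs reports the average number of heap allocations made by one
// json.Unmarshal call that decodes payload into an interface{}.
func decodeAllocs() float64 {
	return testing.AllocsPerRun(100, func() {
		var v interface{}
		_ = json.Unmarshal(payload, &v)
	})
}

func main() {
	fmt.Printf("encoding/json interface{}: %.0f allocs/op\n", decodeAllocs())
}
```

Decoding into interface{} has to build maps and strings for every element, so the count is always above zero; a parser that returns slices of the input buffer can avoid this.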

Footnotes
  1. Just to be clear, a "higher-order function" is, in a language with first-class functions, a function that either (1) takes a function as an argument or (2) returns a function. The concept is common in functional programming languages, but it can be used in Go as well. That said, because Go did not support generics at the time of writing, such implementations tend to feel quite clunky (lol). ↩︎
