iTranslated by AI

The content below is an AI-generated translation. This is an experimental feature, and may contain errors. View original article
💻

Handling Excel Files Like CSV

に公開

About half a year ago,

https://zenn.dev/spiegel/articles/20210302-reading-csv-data

I wrote the article above, but since then, several things came up and I wanted to handle Excel and LibreOffice Calc files in the same way. So, I decided to modify my github.com/goark/csvdata package. It's always worth keeping even trivial packages around (lol).

https://github.com/goark/csvdata/releases/tag/v0.3.0

Compared to the previous article, where I previously wrote:

rc := csvdata.New(strings.NewReader(planets), true)

I have now split the New() function into two stages:

rc := csvdata.NewRows(csvdata.New(strings.NewReader(planets)), true)

By making the argument of the outer csvdata.NewRows() function an interface type:

rows.go
//RowsReader is interface type for reading columns in a row.
type RowsReader interface {
    Read() ([]string, error)
    Close() error
}

func NewRows(rr RowsReader, headerFlag bool) *Rows { ... }

I intended to allow it to accept file formats other than CSV. This is what's known as dependency injection. In Go, you can easily design and implement dependency injection by inserting an interface type in between, without having to worry about inheritance relationships.

Well, it's a breaking change, but I don't see anyone else using it, so it should be fine (lol).

For Excel files:

exceldata/example_test.go
package exceldata_test

import (
    "fmt"

    "github.com/goark/csvdata"
    "github.com/goark/csvdata/exceldata"
)

func ExampleNew() {
    xlsx, err := exceldata.OpenFile("testdata/sample.xlsx", "")
    if err != nil {
        fmt.Println(err)
        return
    }
    r, err := exceldata.New(xlsx, "")
    if err != nil {
        fmt.Println(err)
        return
    }
    rc := csvdata.NewRows(r, true)
    defer rc.Close() //dummy

    if err := rc.Next(); err != nil {
        fmt.Println(err)
        return
    }
    fmt.Println(rc.Column("name"))
    // Output:
    // Mercury
}

You specify the path to the Excel file and its password in the arguments of the exceldata.OpenFile() function (an empty string is fine if it's not password-protected), and specify the Excel data instance and sheet name in the exceldata.New() function (if the sheet name is an empty string, the first sheet is used). After that, you can access the data using the same procedure as with CSV.

Similarly, for LibreOffice Calc files:

calcdata/example_test.go
package calcdata_test

import (
    "fmt"

    "github.com/goark/csvdata"
    "github.com/goark/csvdata/calcdata"
)

func ExampleNew() {
    ods, err := calcdata.OpenFile("testdata/sample.ods")
    if err != nil {
        fmt.Println(err)
        return
    }
    r, err := calcdata.New(ods, "")
    if err != nil {
        fmt.Println(err)
        return
    }
    rc := csvdata.NewRows(r, true)
    defer rc.Close() //dummy

    if err := rc.Next(); err != nil {
        fmt.Println(err)
        return
    }
    fmt.Println(rc.Column("name"))
    // Output:
    // Mercury
}

Something like that. In the case of Calc files, password protection and data encryption are not supported. My apologies.

Furthermore, for accessing Excel and Calc files, I'm using the following external packages:

https://github.com/qax-os/excelize
https://github.com/knieriem/odf

However, since both seem to expand the entire contents onto the heap, I don't think they can handle large files with hundreds of thousands of rows. Please keep this in mind.

With this, I can now handle them directly without having to convert to CSV every time. Mission accomplished.

GitHubで編集を提案

Discussion