
How OpenAI Structured Outputs Works

Published 2024/10/19

Structured Outputs is a feature that makes GPT produce output conforming to a JSON Schema you supply.

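OpenAI's official blog explains how it works. When sampling the next token, the model samples only from tokens that keep the output valid under the supplied JSON Schema. This technique is called constrained sampling.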

Introducing Structured Outputs in the API (August 6, 2024)

Our approach is based on a technique known as constrained sampling or constrained decoding. By default, when models are sampled to produce outputs, they are entirely unconstrained and can select any token from the vocabulary as the next output.
(...)
In order to force valid outputs, we constrain our models to only tokens that would be valid according to the supplied schema, rather than all available tokens.
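
The idea in the quote can be illustrated with a toy sketch. This is not OpenAI's implementation: the vocabulary, the `VALID_OUTPUTS` list standing in for a schema, and the `fake_logits` function standing in for a model are all hypothetical, chosen only to show how masking invalid tokens before sampling forces a valid result.

```python
import math
import random

# Toy vocabulary (hypothetical); a real model's vocabulary is far larger.
VOCAB = ['{', '}', '"ok"', ':', 'true', 'false', 'hello', ',', '<eos>']

# Hypothetical stand-in for a schema: the full set of strings it accepts.
# A real system would derive a grammar from the JSON Schema instead.
VALID_OUTPUTS = ['{"ok":true}', '{"ok":false}']


def allowed_tokens(prefix):
    """Return indices of tokens that keep the output a prefix of a valid string."""
    allowed = []
    for i, tok in enumerate(VOCAB):
        if tok == '<eos>':
            # Stopping is only allowed once the output is complete.
            if prefix in VALID_OUTPUTS:
                allowed.append(i)
        elif any(v.startswith(prefix + tok) for v in VALID_OUTPUTS):
            allowed.append(i)
    return allowed


def fake_logits(rng):
    """Stand-in for a language model: random scores over the vocabulary."""
    return [rng.uniform(-1.0, 1.0) for _ in VOCAB]


def constrained_sample(rng):
    """Sample token by token, restricted to tokens the 'schema' allows."""
    prefix = ''
    while True:
        logits = fake_logits(rng)
        allowed = allowed_tokens(prefix)
        # Softmax over allowed tokens only; every other token gets probability 0.
        weights = [math.exp(logits[i]) for i in allowed]
        idx = rng.choices(allowed, weights=weights, k=1)[0]
        if VOCAB[idx] == '<eos>':
            return prefix
        prefix += VOCAB[idx]
```

Calling `constrained_sample(random.Random(0))` returns one of the two strings in `VALID_OUTPUTS`, regardless of how the random scores fall. Tokens such as `hello` are never emitted, because they are masked out at every step.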
