👏

Pythonで再現する標準ベイズ統計学5章

stochi

2024/08/09に公開

はじめに

本記事では標準ベイズ統計学の５章で掲載されている図表やモデルをPythonで実装する方法に関して説明します。

5章：正規モデル

この章では、正規分布の特徴と、母集団平均と分散に関する事後推論の方法について説明しています。また、母集団平均のベイズ推定量と標本平均を比較し、データが正規分布に従わない場合のモデルの妥当性についても検討しています。

正規分布の密度関数

図5.1では以下の三つの正規分布がプロットされています。

N(\theta=2, \sigma^2=0.25) \\ N(\theta=5, \sigma^2=4) \\ N(\theta=7, \sigma^2=1)

Pythonでの実装は以下のとおりです。プログラムの解説は省略します。（簡単なため）

import matplotlib.pyplot as plt
import numpy as np
from scipy.stats import norm

# プロット設定
plt.figure(figsize=(7, 3.5))

# xの範囲
x = np.linspace(0, 10, 100)

# 各正規分布のパラメータ
params = [
    {'theta': 2, 'sigma': 0.5, 'color': 'gray'},
    {'theta': 5, 'sigma': 2, 'color': 'lightgray'},
    {'theta': 7, 'sigma': 1, 'color': 'darkgray'}
]

# 各正規分布のプロット
for param in params:
    plt.plot(x, norm.pdf(x, param['theta'], param['sigma']), color=param['color'], lw=2)

# 凡例
plt.legend([r'$\theta=2, \sigma^2=0.25$', r'$\theta=5, \sigma^2=4$', r'$\theta=7, \sigma^2=1$'])
plt.xlabel(r'$y$')
plt.ylabel(r'$p(y|\theta, \sigma^2)$')

plt.show()

正規分布の密度関数

身長のデータと $\theta=63.75$ かつ $\sigma=2.62$ の正規分布の密度

図5.2では1893年から1989年にかけて、イギリスの1100の家庭を対象に調査が行われ、18歳以上の女性 $n=1375$ 人の身長のデータのヒストグラムを表しています。

# CSVファイルの読み込み
heights = pd.read_csv('/content/heights.csv')

# 身長データの取得
y = heights.iloc[:, 1]

# ヒストグラムのプロット
plt.figure(figsize=(7, 3.5))
plt.hist(y, bins=17, density=True, color='gray', edgecolor='black')

# 平均と標準偏差の計算
y_av = y.mean()
y_sd = y.std()

# 正規分布のフィット
ys = np.linspace(min(y) * 0.9, max(y) * 1.1, 100)
plt.plot(ys, norm.pdf(ys, y_av, y_sd), color='black', lw=2)

# プロットの設定
plt.xlabel('Height in inches')
plt.ylabel('Probability Density')
plt.title('')
plt.show()

身長のデータと $\theta=63.75$ かつ $\sigma=2.62$ の正規分布の密度

ミッジの羽の長さの母集団平均に関する事前分布と条件付き事後分布

図5.3は、ミッジの羽の長さの母集団平均に関する事前分布と条件付き事後分布をプロットしています。この図の背景にある数学的モデルと使用されているデータについて説明します。
まず、分散 $\sigma^2$ が既知の場合の $\theta$ の条件付き事後分布のモデルを以下に示します：

{Y_1, \dots, Y_n | \theta, \sigma^2} \sim \text{i.i.d. } \mathcal{N}(\theta, \sigma^2) \\ \theta \sim \mathcal{N}(\mu_0, \tau^2_{0})

このモデルにおいて、 $p(\theta|\sigma^2, y_1, \dots, y_n)$ は以下の平均 $\mu_n$ 、分散 $\tau^2_n$ を持つ正規分布に従います：

\tau^2_n = \frac{1}{\frac{1}{\tau^2_0} + \frac{n}{\sigma^2}} \\

\mu_n = \frac{\frac{1}{\tau^2_0}\mu_0 + \frac{n}{\sigma^2}\bar{y}}{\frac{1}{\tau^2_0} + \frac{n}{\sigma^2}}

図5.3で使用されているデータは、ある種のミッジの羽の長さの測定値です。具体的には：

サンプルサイズ: $n = 9$
標本平均: $\bar{y} = 1.804$ mm
標本分散: $s^2 = 0.017$ mm²

事前分布のパラメータは以下のように設定されています：

$\mu_0 = 1.9$ mm（典型的な羽の長さ）
$\tau_0 = 0.95$ mm

図5.3では、この事前分布と、 $\sigma^2 = s^2$ と仮定した場合の事後分布がプロットされています。

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
from scipy.stats import norm

# データを読み込む
midge = pd.read_csv('./csv/midge.csv')
y = midge[midge["Species"] == 'Af']["Wing.Length"]

# 初期パラメーター
mu0 = 1.9
tau0 = 0.5 * 1.9
t02 = tau0 ** 2

# 統計計算
ybar = y.mean()
s2 = y.var()
n = len(y)

mun = (mu0 / t02 + n * ybar / s2) / (1 / t02 + n / s2)
t2n = 1 / (1 / t02 + n / s2)

# 95% 信頼区間
ci = norm.interval(0.95, loc=mun, scale=np.sqrt(t2n))
print(ci)

# グラフの描画
ys = np.linspace(0, mu0 * 2, 500)
plt.plot(ys, norm.pdf(ys, mun, np.sqrt(t2n)), color='black', lw=2)
plt.plot(ys, norm.pdf(ys, mu0, np.sqrt(t02)), color='gray', lw=2)
plt.xlabel(r'$\theta$')
plt.ylabel(r'$p(\theta|y_1,...,y_n,\sigma^2=0.017)$')
plt.show()

ミッジの羽の長さの母集団平均に関する事前分布と条件付き事後分布

( $\theta$ , $\tilde{\sigma}^2$ )、( $\theta$ , $\sigma^2$ )の同時事後密度

図5.3の説明では $\sigma^2$ が既知の場合を取り扱いました。次に、 $\sigma^2$ が未知の場合に事後分布がどうなるか説明します。
$\theta$ と $\sigma^2$ の同時事後分布は

p(\theta, \sigma^2|y_1 \dots y_n) = p(y_1 \dots y_n |\theta, \sigma^2)p(\theta, \sigma^2)/p(y_1 \dots y_n)

と書くことができます。 $p(\theta, \sigma^2)$ は以下のように表現できます。

p(\theta, \sigma^2) = p(\theta|\sigma^2)p(\sigma^2)

$p(\theta|\sigma^2)$ は $\sigma^2$ が既知の時で見たように、 $normal(\mu_0, \tau_0^2)$ と表現できます。ただし、 $\tau_0$ は $\frac{\sigma}{\sqrt{k_0}}$ とします。パラメータ $\mu_0$ は事前に観測されたデータの平均、 $k_0$ は事前分布のサンプルサイズです。
$\sigma^2$ の事前分布は、

精度 = \frac{1}{\sigma^2} \sim gamma(a,b) \\ 分散 = \sigma^2 \sim inverse\_gamma(a,b)

となります。
事前分布と標本モデルをまとめると、以下のようになります。

1/\sigma^2 \sim gamma(\nu_0/2, \nu_0\sigma^2_{0}/2) \\ \theta|\sigma^2 \sim normal(\mu_0, \sigma^2/k_0) \\ Y_1 \dots Y_n|\theta, \sigma^2 \sim i.i.d. \ normal(\theta, \sigma^2)

そこで、 $p(\theta, \sigma^2) = p(\theta|\sigma^2)p(\sigma^2)$ のように、 $p(\theta, \sigma^2|y_1 \dots y_n)$ を以下のように分解します。

p(\theta, \sigma^2|y_1 \dots y_n) = p(\theta|\sigma^2, y_1 \dots y_n)p(\sigma^2|y_1 \dots y_n)

$p(\theta|\sigma^2, y_1 \dots y_n)$ の結果は既に解説したため、結果だけ以下に示します。

\{\theta|y_1 \dots y_n, \sigma^2\} \sim normal(\mu_n, \frac{\sigma^2}{k_n}) \\ k_n = k_0 + n \\ \mu_n = \frac{(k_0/\sigma^2)\mu_0 + (n\sigma^2)\bar{y}}{(k_0/\sigma^2) + (n/\sigma^2)} = \frac{k_0\mu_0 + n\bar{y}}{k_n}

$\sigma^2$ の事後分布は以下のようになります。詳細な式展開は書籍を参考にしてください。

\{\frac{1}{\sigma^2}|y_1 \dots y_n\} \sim gamma(\nu_n/2, \nu_n\sigma^2_n/2)\\ \nu_n = \nu_0 + n\\ \sigma_n^2 = \frac{1}{\nu_n}[\nu_0\sigma^2_0+(n-1)s^2+\frac{k_0n}{k_n}(\bar{y}-\mu_0)^2]

$\nu_0$ は事前の観測値のサンプルサイズ、 $\sigma^2_0$ は事前の標本分散です。

次に、本題の図5.4について解説します。図5.4ではミッジのデータを用いて $(\theta, \tilde{\sigma}^2=\frac{1}{\sigma^2})$ 、 $(\theta, \sigma^2)$ の同時事後密度をプロットしています。
この図5.4の例ではパラメータが以下のようになっています。

\mu_0 = 1.9 \\ \sigma^2_{0} = 0.01 \\ k_0 = \nu_0 = 1 \\ \bar{y} = 1.804 \\ s^2 = 0.0169 (s=0.130)

これらの情報をもとに、 $(\theta, \tilde{\sigma}^2)$ , $(\theta, \sigma^2)$ の同時事後密度をプロットしたものが図5.4です。Pythonでの実装は以下のとおりです。

import numpy as np
import matplotlib.pyplot as plt
from scipy.stats import norm, gamma, invgamma
from scipy.special import gammaln

# 逆ガンマ密度関数の定義
def dinvgamma(x, alpha, beta):
    return np.exp(alpha * np.log(beta) - gammaln(alpha) - (alpha + 1) * np.log(x) - beta/x)

# パラメータの設定
mu0 = 1.9
k0 = 1
s20 = 0.01
nu0 = 1

# データ
y = np.array([1.64, 1.70, 1.72, 1.74, 1.82, 1.82, 1.82, 1.90, 2.08])
n = len(y)
ybar = y.mean()
s2 = y.var(ddof=1)

# 事後推論
kn = k0 + n
nun = nu0 + n
mun = (k0 * mu0 + n * ybar) / kn
s2n = (nu0 * s20 + (n - 1) * s2 + k0 * n * (ybar - mu0)**2 / kn) / nun

# グリッドの設定
gs = 100
theta = np.linspace(1.6, 2.0, gs)
is2 = np.linspace(15, 160, gs)
s2g = np.linspace(0.001, 0.045, gs)

# 密度計算
ld_th_is2 = np.zeros((gs, gs))
ld_th_s2 = np.zeros((gs, gs))

for i in range(gs):
    for j in range(gs):
        ld_th_is2[j, i] = norm.logpdf(theta[i], mun, np.sqrt(1/(is2[j] * 10))) + \
                          gamma.logpdf(is2[j], a=10/2, scale=1/(10 * s2n/2))
        ld_th_s2[j, i] = norm.logpdf(theta[i], mun, np.sqrt(s2g[j]/10)) + \
                         np.log(dinvgamma(s2g[j], 10/2, 10 * s2n/2))

# log密度から実密度への変換
density_is2 = np.exp(ld_th_is2)
density_s2 = np.exp(ld_th_s2)

# プロット
plt.figure(figsize=(14, 7))

# 1/sigma^2のプロット
plt.subplot(1, 2, 1)
plt.contourf(theta, is2, density_is2, levels=50, cmap='Greys')
plt.xlabel('theta')
plt.ylabel('1/sigma^2')

# sigma^2のプロット
plt.subplot(1, 2, 2)
plt.contourf(theta, s2g, density_s2, levels=50, cmap='Greys')
plt.xlabel('theta')
plt.ylabel('sigma^2')

plt.tight_layout()
plt.show()

( $\theta$ , $\tilde{\sigma}^2$ )、( $\theta$ , $\sigma^2$ )の同時事後密度

以下で少しPythonの解説を行います。

グリッドの設定

gs = 100
theta = np.linspace(1.6, 2.0, gs)
is2 = np.linspace(15, 160, gs)
s2g = np.linspace(0.001, 0.045, gs)

np.linspaceは指定された区間を均等に分割し、指定された数のデータを生成します。
すなわち、np.linspace(1.6, 2.0, 100)というコードは、1.6から2.0までの範囲を100個の等間隔の点に分割します。
ここで変数定義されているthetaはパラメーター $\theta$ 、is2は $\frac{1}{\sigma^2}$ 、s2gは分散 $\sigma^2$ のことを指します。

密度計算

ld_th_is2 = np.zeros((gs, gs))
ld_th_s2 = np.zeros((gs, gs))

for i in range(gs):
    for j in range(gs):
        ld_th_is2[j, i] = norm.logpdf(theta[i], mun, np.sqrt(1/(is2[j] * 10))) + \
                          gamma.logpdf(is2[j], a=10/2, scale=1/(10 * s2n/2))
        ld_th_s2[j, i] = norm.logpdf(theta[i], mun, np.sqrt(s2g[j]/10)) + \
                         np.log(dinvgamma(s2g[j], 10/2, 10 * s2n/2))

この部分では、thetaとis2、およびthetaとs2gの各組み合わせに対して事後密度を計算しています。それぞれのグリッドポイントで、以下の計算を行います。

正規分布の確率密度関数（PDF）の対数：
norm.logpdfはパラメーターθに対する正規分布の対数密度を計算します。
ガンマ分布の対数PDF：
gamma.logpdfはパラメーター1/σ^2に対するガンマ分布の対数密度を計算します。
逆ガンマ分布の対数PDF：
dinvgammaは逆ガンマ分布の密度関数を呼び出し、その結果の対数を取ります。

最終的に、ld_th_is2とld_th_s2はそれぞれのグリッドポイントにおける対数事後密度を格納します。これらの密度は次にnp.expで指数変換を行いプロットに使用します。

母集団平均と分散の同時・周辺事後分布からのモンテカルロ標本および推定値

図5.5ではミッジの羽の長さに関するデータの母集団平均と分散の同時・周辺事後分布からのモンテカルロ標本と推定値を表しています。

import numpy as np
from scipy.stats import gamma, norm
import matplotlib.pyplot as plt
from scipy.stats import gaussian_kde

# 乱数のシードを設定
np.random.seed(1)

# パラメータの設定
mu0 = 1.9
k0 = 1
s20 = 0.01
nu0 = 1

# データ
y = np.array([1.64, 1.70, 1.72, 1.74, 1.82, 1.82, 1.82, 1.90, 2.08])
n = len(y)
ybar = y.mean()
s2 = y.var(ddof=1)

# 事後推論
kn = k0 + n
nun = nu0 + n
mun = (k0 * mu0 + n * ybar) / kn
s2n = (nu0 * s20 + (n - 1) * s2 + k0 * n * (ybar - mu0)**2 / kn) / nun

# サンプルのサイズ
S = 10000

# ガンマ分布からの逆スケールサンプリング
s2_postsample = 1 / gamma.rvs((nu0 + n) / 2, scale=1 / (s2n * (nu0 + n) / 2), size=S)

# 正規分布からのサンプリング
theta_postsample = norm.rvs(mun, np.sqrt(s2_postsample / (k0 + n)), size=S)

# 分位数の計算
quantiles = np.quantile(theta_postsample, [0.025, 0.975])

# プロットの作成
fig = plt.figure(figsize=(10, 10))
gs = fig.add_gridspec(2, 2)

# サブプロット1: ヒートマップと散布図
ax1 = fig.add_subplot(gs[0, :])
theta = np.linspace(1.60, 2.0, 100)
s2g = np.linspace(0.001, 0.07, 100)
X, Y = np.meshgrid(theta, s2g)
positions = np.vstack([X.ravel(), Y.ravel()])
kernel = gaussian_kde(np.vstack([theta_postsample, s2_postsample]))
Z = np.reshape(kernel(positions).T, X.shape)
ax1.imshow(Z, extent=[1.60, 2.0, 0.001, 0.07], origin='lower', aspect='auto', cmap='Greys')
ax1.scatter(theta_postsample[:5000], s2_postsample[:5000], c='black', alpha=0.1, s=1)
ax1.set_xlabel(r'$\theta$')
ax1.set_ylabel(r'$\sigma^2$')
ax1.set_ylim(0, 0.07)

# サブプロット2: s2の密度プロット
ax2 = fig.add_subplot(gs[1, 0])
s2_density = gaussian_kde(s2_postsample)
s2_xs = np.linspace(0, 0.075, 200)
ax2.plot(s2_xs, s2_density(s2_xs))
ax2.set_xlabel(r'$\sigma^2$')
ax2.set_ylabel(r'$p(\sigma^2|y_1...y_n)$')
ax2.set_xlim(0, 0.075)

# サブプロット3: thetaの密度プロット
ax3 = fig.add_subplot(gs[1, 1])
theta_density = gaussian_kde(theta_postsample)
theta_xs = np.linspace(1.60, 2.0, 200)
ax3.plot(theta_xs, theta_density(theta_xs))
ax3.set_xlabel(r'$\theta$')
ax3.set_ylabel(r'$p(\theta|y_1...y_n)$')
ax3.set_xlim(1.60, 2.0)

# ベイズ信用区間
theta_quantiles = np.quantile(theta_postsample, [0.025, 0.975])
ax3.axvline(theta_quantiles[0], color='gray', linestyle='--', linewidth=2)
ax3.axvline(theta_quantiles[1], color='gray', linestyle='--', linewidth=2)

# t検定に基づく信頼区間
confidence_interval = ybar + t_critical * np.sqrt(s2/n)
ax3.axvline(x=confidence_interval[0], color="black", linestyle="--", linewidth=2)
ax3.axvline(x=confidence_interval[1], color="black", linestyle="--", linewidth=2)
ax3.legend()

plt.tight_layout()
plt.show()

母集団平均と分散の同時・周辺事後分布からのモンテカルロ標本および推定値

黒の点線がt統計量に基づく95%信頼区間です。灰色の垂直線が $\theta$ の事後分布の95%分位点に基づく区間です。これまでのPythonプログラムを見ていただければ何をしているかわかると思いますので、Pythonの実装について改めて解説することはしません。

IQの点数の母集団平均に関する異なる推定量の平均二乗誤差と標本分布

米国のある町でn人の個人を標本抽出し、サイズnの標本に基づいてその町のIQテストの点数の平均 $\theta$ を推定しています。IQテストの点数は平均が100、標準偏差が15になるように設計されています。町の情報について何も情報がないため $\mu_0 = 100$ としています。仮に町の真の平均が $\theta=112$ 、標準偏差が $\sigma=13$ とした際に、標本平均 $\hat{\theta}_e$ と標本モデルと事前分布を正規分布としたときの事後平均 $\hat{\theta}_b$ のMSEは

MSE[\hat{\theta}_e|\theta_0] = Var[\hat{\theta}_e] = \frac{\sigma^2}{n} = \frac{169}{n} \\ MSE[\hat{\theta}_b|\theta_0] = \omega^2\frac{169}{n}+(1-\omega)^2 144

となります。ただし、 $\omega=\frac{n}{k_0+n}$ です。図5.6の左図では $k_0 = 1,2,3$ の場合のMSEの比率 $MSE[\hat{\theta}_b|\theta_0]/MSE[\hat{\theta}_e|\theta_0]$ がプロットされています。右図では $k_0 = 1,2,3$ に対応する

import numpy as np
import matplotlib.pyplot as plt
from scipy.stats import norm

# 最初のプロットのパラメータ
b = (100 - 112) ** 2
s2 = 13 ** 2
n1 = np.arange(1, 51)

# 異なるkの値に対するMSEの比率の計算
brk1 = (n1 / (1 + n1)) ** 2 + n1 * (1 / (1 + n1)) ** 2 * b / s2
brk2 = (n1 / (2 + n1)) ** 2 + n1 * (2 / (2 + n1)) ** 2 * b / s2
brk3 = (n1 / (3 + n1)) ** 2 + n1 * (3 / (3 + n1)) ** 2 * b / s2

# 2番目のプロットのパラメータ
theta0 = 112
mu0 = 100
n2 = 10 
s2m = s2 / n2
x = np.linspace(theta0 - 4 * np.sqrt(s2m), theta0 + 4 * np.sqrt(s2m), 100)

# 2つのサブプロットを作成
fig, axs = plt.subplots(1, 2, figsize=(7, 3.5))

# Plot 1: MSEの比率
axs[0].axhline(y=1, color='black', lw=2, linestyle='-.', label=r'$\kappa_0=0$')
axs[0].plot(n1, brk1, color="grey", lw=2, label=r'$\kappa_0=1$')
axs[0].plot(n1, brk2, color="grey", lw=2, linestyle='--', label=r'$\kappa_0=2$')
axs[0].plot(n1, brk3, color="grey", lw=2, linestyle=':', label=r'$\kappa_0=3$')
axs[0].set_xlabel("sample size")
axs[0].set_ylabel("MSE ratio")
axs[0].legend()

# Plot 2: 確率密度関数
axs[1].plot(x, norm.pdf(x, theta0, np.sqrt(s2m)), color="black", lw=2, linestyle='-.', label=r'$\kappa_0=0$')
axs[1].axvline(x=theta0, color='black', label='True value')
for k in range(1, 4):
    w = n2 / (n2 + k)
    linestyle = ['-', '--', ':'][k-1]  # 異なる k_0 に対して異なる線スタイルを指定
    axs[1].plot(x, norm.pdf(x, w * theta0 + (1 - w) * mu0, np.sqrt(w ** 2 * s2m)),
                color="grey", lw=2, linestyle=linestyle, label=f'$\kappa_0={k}$')
axs[1].set_xlabel("IQ")
axs[1].set_ylabel("Probability Density")
# 凡例のフォントサイズを小さく調整
axs[1].legend(fontsize='small')

plt.tight_layout()
plt.show()

IQの点数の母集団平均に関する異なる推定量の平均二乗誤差と標本分布

非正規な分布とサイズ $n\in{5,15,45}$ の標本平均の分布

図5.7が掲載されている5.6節では母集団平均に対しては正規モデルに基づく事後分布でも良い推定ができるということが解説されています。それを解説するために、まずは図5.7の左図で正規分布ではない真の母集団分布をプロットしています。真ん中の図ではモンテカルロ近似を行い、nが大きくなれば母集団平均を近似できることを示しています。最後に、図5.7では $\bar{Y}$ と $s^2$ が相関関係にあることをプロットし、これにより正規分布を仮定した事後分布を非正規なデータに対して用いることは $\{\theta, \sigma^2\}$ の同時分布に対しては誤解を招くような結論を導きかねないことを示唆しています。

以下にPythonによる実装を記載します。

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
from scipy.stats import gaussian_kde

# データの読み込みと前処理
gss = pd.read_csv('./csv/gss.csv')
CHILDS = gss[(gss['FEMALE'] == 1) & (gss['YEAR'] == 1998) & (gss['AGE'] >= 40)]['CHILDS'].dropna().values

# シミュレーション
np.random.seed(1)
NY = []
N = [5, 15, 45]
for n in N:
    for sim in range(5000):
        y = np.random.choice(CHILDS, n)
        NY.append([n, np.mean(y), np.var(y)])
NY = np.array(NY)

# グレースケールの色を生成する関数
def gray(scale):
    return (scale, scale, scale)

# プロットの設定
plt.figure(figsize=(12, 4))
plt.subplots_adjust(wspace=0.4, bottom=0.15)

# プロット1: ヒストグラム
plt.subplot(1,3,1)
unique, counts = np.unique(CHILDS, return_counts=True)
plt.bar(unique, counts/sum(counts), width=0.2, color='black') 
plt.xlabel('number of children', fontsize=10)
plt.ylabel('p(y)', fontsize=10)
plt.ylim(0, 0.3)

# プロット2: 密度プロット
plt.subplot(1,3,2)
x = np.linspace(0, 6, 200)
for n in N:
    yb = NY[NY[:, 0] == n, 1]
    density = gaussian_kde(yb)
    plt.plot(x, density(x), color=gray(1 - (n / 5) / 9), linewidth=2)
plt.axvline(np.mean(CHILDS), color='black')
plt.xlabel('number of children', fontsize=10)
plt.ylabel('p(ȳ)', fontsize=10)
plt.ylim(0, 1.7)
plt.legend(['n=5', 'n=15', 'n=45'], loc='upper right', fontsize=8, frameon=False)

# プロット3: 2D密度プロット
plt.subplot(1,3,3)
data = NY[NY[:, 0] == 45, 1:]
x, y = data[:, 0], data[:, 1]
xmin, xmax = x.min(), x.max()
ymin, ymax = 1, 6

# カーネル密度推定
k = gaussian_kde(data.T)
xi, yi = np.mgrid[xmin:xmax:100j, ymin:ymax:100j]
zi = k(np.vstack([xi.flatten(), yi.flatten()])).reshape(xi.shape)

# 等高線プロットを使用
plt.contour(xi, yi, zi, levels=5, colors='black')
plt.xlabel('ȳ', fontsize=10)
plt.ylabel('s²', fontsize=10)
plt.ylim(1, 6)

# プロットの表示
plt.tight_layout()
plt.show()

非正規な分布とサイズ $n\in{5,15,45}$ の標本平均の分布

最後に

本記事では、標準ベイズ統計学の第5章で扱われている正規モデルに関する様々な概念をPythonで実装し、視覚化しました。第5章では正規分布の基本的な性質から始まり、母集団平均と分散の事後推論、モンテカルロ法による推定、そして非正規データに対する正規モデルの適用まで、幅広いトピックをカバーしています。次の記事では6章の「ギブスサンプラーによる事後分布の近似」を扱います。次回も記事を読んでいただけると幸いです。