
Sufficient statistics

Published on 2024/01/21

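A statistic $t = T(X)$ is sufficient for the parameter $\theta$ if the conditional distribution $p(X = x \mid T(X) = t)$ does not depend on $\theta$.

In other words, once $t$ is known, the full sample $X$ carries no additional information about $\theta$: $t$ retains all the information about $\theta$ that the sample contains.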

Let's consider an example.

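Let $X_1, \dots, X_N \overset{i.i.d.}{\sim} \mathrm{Bern}(\theta)$, and write $X = \{X_1, \dots, X_N\}$ for the random variables and $x = \{x_1, \dots, x_N\}$ for their realizations, with each $x_n \in \{0, 1\}$. The joint probability of the sample factorizes as follows.

\begin{align*} p(X=x; \theta) &= p(X_1=x_1, \dots, X_N=x_N; \theta) \\ &= \prod_{n=1}^N p(X_n = x_n; \theta) \\ &= \prod_{n=1}^N \theta^{x_n} (1 - \theta)^{1 - x_n} \\ &= \theta^{\sum_{n=1}^N x_n} (1 - \theta)^{N - \sum_{n=1}^N x_n} \end{align*}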
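As a quick numerical check of this factorization, the sketch below compares the product of per-observation Bernoulli probabilities with the closed form. The function names, the value `theta = 0.3`, and the two sample sequences are illustrative assumptions, not part of the original derivation.

```python
from math import isclose, prod


def likelihood_product(x, theta):
    """Multiply the per-observation pmfs theta^x_n * (1 - theta)^(1 - x_n)."""
    return prod(theta ** xn * (1 - theta) ** (1 - xn) for xn in x)


def likelihood_closed_form(x, theta):
    """Closed form theta^t * (1 - theta)^(N - t), where t is the number of ones."""
    t = sum(x)
    return theta ** t * (1 - theta) ** (len(x) - t)


theta = 0.3            # assumed value, chosen only for illustration
x_a = [1, 0, 0, 1, 0]  # hypothetical sample with N = 5 and sum 2
x_b = [0, 0, 1, 0, 1]  # different order, same sum as x_a

# The two ways of computing p(X = x; theta) should agree up to rounding.
print(likelihood_product(x_a, theta), likelihood_closed_form(x_a, theta))
# Samples with the same sum have the same probability.
print(likelihood_closed_form(x_a, theta), likelihood_closed_form(x_b, theta))
```

The second comparison shows that the joint probability depends on the sample only through the sum of its entries.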

In this case, let $T(X) = \sum_{n=1}^N X_n$. As a sum of $N$ independent $\mathrm{Bern}(\theta)$ variables, $T(X) \sim \mathrm{Bin}(N, \theta)$.

\begin{align*} p\left( T(X)=t; \theta \right) = \binom{N}{t} \theta^t (1 - \theta)^{N - t} \end{align*}
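The binomial form can be verified by brute force for small $N$: summing the joint probability over every binary sequence with total $t$ should give the same value, because there are $\binom{N}{t}$ such sequences. The following is a minimal sketch under that assumption; `n = 6` and `theta = 0.35` are arbitrary illustrative choices.

```python
from itertools import product
from math import comb, isclose


def pmf_of_sum_by_enumeration(n, theta):
    """Return [p(T = 0), ..., p(T = n)] by summing over all 2^n binary sequences."""
    pmf = [0.0] * (n + 1)
    for x in product((0, 1), repeat=n):
        t = sum(x)
        # Joint probability of this particular sequence.
        pmf[t] += theta ** t * (1 - theta) ** (n - t)
    return pmf


def binomial_pmf(n, theta):
    """Return the Bin(n, theta) pmf evaluated at t = 0, ..., n."""
    return [comb(n, t) * theta ** t * (1 - theta) ** (n - t) for t in range(n + 1)]


n, theta = 6, 0.35  # assumed values, chosen only for illustration
enumerated = pmf_of_sum_by_enumeration(n, theta)
closed_form = binomial_pmf(n, theta)
for t, (a, b) in enumerate(zip(enumerated, closed_form)):
    print(t, a, b)  # the two columns should match up to rounding
```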

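For any $x$ with $\sum_{n=1}^N x_n = t$, the event $X = x$ implies $T(X) = t$, so the joint probability in the numerator reduces to $p(X = x; \theta)$.

\begin{align*} p\left( X=x \mid T(X) = t \right) &= \frac{p\left( X=x, T(X) = t; \theta \right)}{p\left(T(X) = t; \theta\right)} \\ &= \frac{p\left( X=x; \theta \right)}{p\left(T(X) = t; \theta\right)} \\ &= \frac{\theta^t (1 - \theta)^{N - t}}{\binom{N}{t} \theta^t (1 - \theta)^{N - t}} \\ &= \frac{1}{\binom{N}{t}} \end{align*}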

$p\left( X=x \mid T(X) = t \right)$ does not depend on $\theta$, so $T(X) = \sum_{n=1}^N X_n$ is a sufficient statistic for $\theta$.
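The conclusion can also be checked numerically. The sketch below conditions on the sum by enumerating all sequences with a given total and normalizing their joint probabilities, then repeats this for several values of $\theta$. The function name and the values `n = 5`, `t = 3`, and the three `theta` values are assumptions made for illustration; `theta` must lie strictly between 0 and 1 so that the normalizer is nonzero.

```python
from itertools import product
from math import comb, isclose


def conditional_given_sum(n, t, theta):
    """Return {x: p(X = x | T(X) = t; theta)} for every binary sequence x with sum t."""
    joint = {
        x: theta ** t * (1 - theta) ** (n - t)
        for x in product((0, 1), repeat=n)
        if sum(x) == t
    }
    p_t = sum(joint.values())  # p(T(X) = t; theta)
    return {x: p / p_t for x, p in joint.items()}


n, t = 5, 3  # assumed values, chosen only for illustration
for theta in (0.2, 0.5, 0.9):
    cond = conditional_given_sum(n, t, theta)
    # Number of sequences, and the smallest and largest conditional probability.
    print(theta, len(cond), min(cond.values()), max(cond.values()))
```

For every `theta`, each of the `comb(5, 3)` sequences should receive a conditional probability close to `1 / comb(5, 3)`, which matches the uniform distribution derived above.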
