Explanation and Python Implementation of the Metropolis-Hastings Algorithm

MCMC (Markov Chain Monte Carlo)

When you want to obtain samples from some distribution \pi(x), direct methods are inefficient or infeasible if \pi(x) is high-dimensional or analytically intractable in the first place.

MCMC (Markov Chain Monte Carlo) is a way to handle such cases. It is an algorithm that simulates a Markov chain whose stationary distribution is \pi(x). As the chain is iterated, the values it produces become closer to samples from \pi(x).

Metropolis-Hastings Method

One way to simulate a Markov chain that has \pi(x) as its stationary distribution is the Metropolis-Hastings method.
In the Metropolis-Hastings method, the user must specify a proposal distribution. A "proposal" says where the next sample y should go when the chain is currently at x, and its distribution is written Q(y|x).

As a simple example, let's say y is chosen randomly near the position of x. For instance, using a random number following a normal distribution, we can set:

y | x \sim N(x, 1)

In other words, the proposal distribution Q(y|x) becomes the equation:

Q(y|x) = \dfrac{1}{\sqrt{2 \pi}} \exp\left( - \dfrac{1}{2}(y-x)^2 \right)
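As a small sketch (the function names here are illustrative, not from the article), drawing a proposal from this Q(y|x) and evaluating its density can be written with NumPy:

```python
import numpy as np

def propose(x, rng):
    # Draw y from the proposal N(x, 1)
    return rng.normal(loc=x, scale=1.0)

def q_density(y, x):
    # Density Q(y|x) of N(x, 1) evaluated at y
    return np.exp(-0.5 * (y - x) ** 2) / np.sqrt(2.0 * np.pi)

rng = np.random.default_rng(0)
y = propose(0.0, rng)
print(q_density(0.0, 0.0))  # 1/sqrt(2*pi) ≈ 0.3989
```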


Using this proposal distribution Q(y|x), the Metropolis-Hastings method follows this algorithm:

  1. Set an appropriate initial value X_1 = x_1
  2. Repeat the following for t \geq 1 a sufficient number of times:
    • Sample a value y from Q(y|x_t) (Proposal)
    • Calculate A = \min \left(1, \dfrac{\pi(y)Q(x_t|y)}{\pi(x_t)Q(y|x_t)} \right)
    • With probability A, accept the proposal and set x_{t+1} = y; with probability 1-A, reject it and set x_{t+1} = x_t
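The steps above can be sketched as a generic loop in Python. This is a hedged sketch, not the article's own code: the function names (`metropolis_hastings`, `q_sample`, `q_density`) and the N(0, 1) test target are illustrative choices.

```python
import numpy as np

def metropolis_hastings(pi, q_sample, q_density, x1, n_steps, rng):
    """Generic Metropolis-Hastings loop.

    pi        -- target density (up to a normalizing constant)
    q_sample  -- q_sample(x, rng) draws y from Q(y|x)
    q_density -- q_density(y, x) evaluates Q(y|x)
    """
    xs = [x1]
    for t in range(n_steps):
        x = xs[-1]
        y = q_sample(x, rng)                                       # Proposal
        ratio = (pi(y) * q_density(x, y)) / (pi(x) * q_density(y, x))
        a = min(1.0, ratio)                                        # Acceptance probability
        if rng.random() < a:                                       # Accept
            xs.append(y)
        else:                                                      # Reject
            xs.append(x)
    return xs

rng = np.random.default_rng(0)
samples = metropolis_hastings(
    pi=lambda x: np.exp(-0.5 * x ** 2),          # Unnormalized N(0, 1) target
    q_sample=lambda x, rng: rng.normal(x, 1.0),  # Gaussian random walk
    q_density=lambda y, x: np.exp(-0.5 * (y - x) ** 2),
    x1=0.0, n_steps=5000, rng=rng,
)
```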

Implementation

As an example, suppose we want to obtain samples from the distribution:

\pi(x) = \begin{cases} \exp(-x) & (x \geq 0) \\ 0 & (x < 0) \end{cases}

Of course, for such a simple distribution there is no need to resort to MCMC; it is just meant as an easy example.
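Indeed (this comparison is not in the original article), this \pi(x) is the Exp(1) distribution, which can be sampled directly by inverse-transform sampling; that gives a baseline to check the MCMC output against:

```python
import numpy as np

rng = np.random.default_rng(0)

# Inverse-transform sampling for pi(x) = exp(-x), x >= 0:
# the CDF is F(x) = 1 - exp(-x), so x = -log(1 - u) with u ~ Uniform(0, 1)
u = rng.random(100000)
direct_samples = -np.log(1.0 - u)

print(direct_samples.mean())  # Close to the true mean E[X] = 1
```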

The proposal distribution Q(y|x) is the normal-random-walk move from the example above. Because the normal density depends only on (y-x)^2, we have Q(x_t|y) = Q(y|x_t), so A can be written more simply as:

A = \min \left(1, \dfrac{\pi(y)}{\pi(x_t)}\right)

Replacing A with this simplified equation gives the Metropolis method, the predecessor of the Metropolis-Hastings method (Hastings later extended it to general, possibly asymmetric Q(y|x)).
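The symmetry that makes the Q terms cancel is easy to confirm numerically; a quick sanity check (not from the original article):

```python
import numpy as np

def q_density(y, x):
    # Q(y|x): density of N(x, 1) evaluated at y
    return np.exp(-0.5 * (y - x) ** 2) / np.sqrt(2.0 * np.pi)

# Q(y|x) depends only on (y - x)^2, so swapping the arguments changes nothing
for x, y in [(0.0, 1.3), (-2.0, 0.5), (3.7, 3.7)]:
    assert np.isclose(q_density(y, x), q_density(x, y))
```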


Implementing this Metropolis method in Python looks like the following:

import numpy as np

def pi(x):
    if x >= 0:
        return np.exp(-x)
    else:
        return 0

xs = []
xs.append(10) # Set the initial value to 10 as an example

for i in range(10000):
    y = np.random.normal(xs[i], 1) # Proposed value
    a = min(1, pi(y) / pi(xs[i]))

    u = np.random.rand() # Generate a uniform random number between 0 and 1

    if u < a: # Accept
        xs.append(y)
    else: # Reject
        xs.append(xs[i])

Now, the xs calculated this way can be considered samples of \pi(x), so its histogram should have a shape close to \pi(x). Let's verify this.

import matplotlib.pyplot as plt
%matplotlib inline

plt.hist(xs, bins=20)

You can indeed see the downward-sloping shape of the exponential distribution.
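Beyond the visual check, a summary statistic can be compared with the truth: for \pi(x) = e^{-x} the mean is 1. A sketch of such a check, re-running the sampler with a seeded generator and discarding an illustratively chosen burn-in period (the burn-in length is an assumption, not from the article):

```python
import numpy as np

rng = np.random.default_rng(0)

def pi(x):
    # Target density: exp(-x) for x >= 0, else 0
    return np.exp(-x) if x >= 0 else 0.0

xs = [10.0]  # Same initial value as in the article
for i in range(10000):
    y = rng.normal(xs[i], 1.0)           # Proposal
    a = min(1.0, pi(y) / pi(xs[i]))      # Metropolis acceptance probability
    xs.append(y if rng.random() < a else xs[i])

burn_in = 1000  # Illustrative choice
print(np.mean(xs[burn_in:]))  # Should be close to the true mean E[X] = 1
```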

Also, let's compare the first 100 samples with the next 100 samples.

plt.hist(xs[0:100], bins=20)

plt.hist(xs[100:200], bins=20)

The latter shows the downward slope more clearly. Because the chain starts at the initial value x_1 = 10, far out in the tail of \pi(x), the earliest samples are still influenced by that starting point; as the iterations proceed, the chain settles into the stationary distribution and the values look more like samples from \pi(x).
