As the title suggests, Metropolis-Hastings, Gibbs and slice sampling are variants and refinements of MCMC (Markov chain Monte Carlo sampling).

With MCMC, we draw samples from a (simple) proposal distribution so that each draw depends only on the state of the previous draw (i.e. the samples form a Markov chain).

The nice thing is that this target distribution only needs to be proportional to the posterior distribution, which means we don't need to evaluate the potentially intractable marginal likelihood, which is just a normalizing constant. We can find such a target distribution easily, since posterior $\propto$ likelihood $\times$ prior.
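For example, with the $\mathrm{Beta}(a, b)$ prior and binomial likelihood used in the code below, the unnormalized target is

$$g(\theta|X) \propto \underbrace{\binom{n}{k}\theta^k(1-\theta)^{n-k}}_{\text{likelihood}} \times \underbrace{\theta^{a-1}(1-\theta)^{b-1}}_{\text{prior}} \propto \theta^{a+k-1}(1-\theta)^{b+n-k-1},$$

which we recognize, up to normalization, as a $\mathrm{Beta}(a+k,\, b+n-k)$ density - the analytic posterior used as a reference in the code below.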

Metropolis-Hastings

To carry out the Metropolis-Hastings algorithm, we need to draw random samples from the following distributions:

  • the standard uniform distribution $U(0, 1)$

  • a proposal distribution $p(x)$ that we choose to be $\mathcal{N}(0, \sigma)$

  • the target distribution $g(x)$, which is proportional to the posterior probability

Given an initial guess for $\theta$ with positive probability of being drawn, the Metropolis-Hastings algorithm proceeds as follows:

  • step 1: Choose a new proposed value $\theta_p$ such that $\theta_p = \theta + \Delta\theta$, where $\Delta\theta \sim \mathcal{N}(0, \sigma)$

  • step 2: Calculate the ratio

    $$\rho = \frac{g(\theta_p|X)}{g(\theta|X)}$$

    where $g$ is the posterior probability.

If the proposal distribution $p(\cdot)$ is not symmetric ($p(\theta_p|\theta) \neq p(\theta|\theta_p)$), we need to weight the acceptance probability to maintain detailed balance (reversibility) with respect to the stationary distribution, and instead calculate:

$$\rho = \frac{g(\theta_p|X)\,p(\theta|\theta_p)}{g(\theta|X)\,p(\theta_p|\theta)}$$
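For instance, for a positive-valued parameter one might use a log-normal random walk, which is asymmetric; a minimal sketch (this proposal and the helper names q and hastings_ratio are illustrative, not part of the example below):

import scipy.stats as st

def q(x, center, s=0.5):
    # density of proposing x from a log-normal centered (in log space) at `center`;
    # q(x, center) != q(center, x), so this proposal is asymmetric
    return st.lognorm(s, scale=center).pdf(x)

def hastings_ratio(g, theta, theta_p):
    # g is the unnormalized posterior; the q-ratio is the Hastings correction
    return g(theta_p)*q(theta, theta_p)/(g(theta)*q(theta_p, theta))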

Since we are taking a ratio, the normalizing constant cancels, so any distribution proportional to $g$ will also work - hence we can use

$$\rho = \frac{p(X|\theta_p)\,p(\theta_p)}{p(X|\theta)\,p(\theta)}$$

If $\rho \geq 1$, then set $\theta = \theta_p$.

If $\rho < 1$, then set $\theta = \theta_p$ with probability $\rho$; otherwise keep $\theta = \theta$ (this is where we use the standard uniform distribution $U(0, 1)$).

  • step 3: Equivalently, $\rho = \min\{\frac{p(X|\theta_p)p(\theta_p)}{p(X|\theta)p(\theta)}, 1\}$; draw $u \sim U(0, 1)$, and if $u < \rho$, set $\theta = \theta_p$ (accept the jump), else keep $\theta = \theta$.

  • step 4: repeat the earlier steps.

import numpy as np
import scipy.stats as st
import matplotlib.pyplot as plt

n, k = 100, 61          # observed data: k successes in n trials
a, b = 10, 10           # Beta prior hyperparameters
like = st.binom
prior = st.beta(a, b)
# analytic posterior for reference: st.beta(a+k, b+n-k)

def target(prior, like, n, k, theta):
    # unnormalized posterior = prior * likelihood (zero outside [0, 1])
    return 0 if theta < 0 or theta > 1 else prior.pdf(theta)*like(n, theta).pmf(k)

naccept = 0
niters = 10**4
theta = 0.001
samples = np.zeros(niters+1)
samples[0] = theta
sigma = .3              # step size of the Gaussian random-walk proposal

for i in range(niters):
    theta_p = theta + st.norm(0, sigma).rvs()     # step 1: propose
    rho = min(target(prior, like, n, k, theta_p)/target(prior, like, n, k, theta), 1)  # steps 2-3
    u = st.uniform(0, 1).rvs()
    if u < rho:
        naccept += 1
        theta = theta_p                           # accept the jump
    samples[i+1] = theta

nmcmc = len(samples)//2                           # discard the first half as burn-in
print('efficiency', naccept/niters)               # 0.1833

poster = st.beta(a+k, b+n-k)
thetas = np.arange(0, 1, .001)
plt.hist(prior.rvs(nmcmc), 40, histtype='step', color='b', label='distribution of prior samples')
plt.hist(samples[nmcmc:], 40, histtype='step', color='r', label='distribution of posterior samples')
plt.plot(thetas, poster.pdf(thetas), ls='--', c='r', label='True posterior')
plt.legend()
plt.show()
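As a quick sanity check, we can compare the post-burn-in sample mean against the mean of the analytic $\mathrm{Beta}(a+k,\, b+n-k)$ posterior:

print('MCMC posterior mean:', samples[nmcmc:].mean())
print('analytic posterior mean:', (a + k)/(a + b + n))   # mean of Beta(a+k, b+n-k)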

Gibbs sampler

Suppose we have a vector of parameters $\theta = (\theta_1, \theta_2, \dots, \theta_k)$, and we want to estimate the joint posterior distribution $p(\theta|X)$. Suppose we can find and draw random samples from all the conditional distributions.

$$\begin{array}{c} p(\theta_1|\theta_2,\dots,\theta_k,X)\\ p(\theta_2|\theta_1,\dots,\theta_k,X)\\ \vdots\\ p(\theta_k|\theta_1,\theta_2,\dots,X) \end{array}$$
With Gibbs sampling, the Markov chain is constructed by sampling from the conditional distribution for each parameter $\theta_i$ in turn, treating all other parameters as observed. When we have finished iterating over all parameters, we are said to have completed one cycle of the Gibbs sampler. Where it is difficult to sample from a conditional distribution, we can sample using a Metropolis-Hastings step instead - this is known as Metropolis within Gibbs; a sketch follows below.
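A minimal Metropolis-within-Gibbs cycle might look like the following sketch, assuming a hypothetical cond_logpost(theta, i) that returns the unnormalized log conditional posterior of parameter i:

import numpy as np
import scipy.stats as st

def metropolis_within_gibbs_cycle(theta, cond_logpost, sigma=0.1):
    # one cycle: update each parameter in turn with a 1-D random-walk
    # Metropolis step, holding the other parameters fixed
    theta = np.array(theta, dtype=float)
    for i in range(len(theta)):
        prop = theta.copy()
        prop[i] += st.norm(0, sigma).rvs()   # symmetric proposal for parameter i
        log_rho = cond_logpost(prop, i) - cond_logpost(theta, i)
        if np.log(st.uniform(0, 1).rvs()) < log_rho:
            theta = prop                     # accept the move
    return theta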

Gibbs sampling is a type of random walk through parameter space, and hence can be thought of as a Metropolis-Hastings algorithm with a special proposal distribution. At each iteration in the cycle, we draw a proposal for a new value of a particular parameter, where the proposal distribution is the conditional posterior probability of that parameter. This means that the proposal move is always accepted. Hence, if we can draw samples from the conditional distributions, Gibbs sampling can be much more efficient than regular Metropolis-Hastings.
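To see why the move is always accepted, plug the full conditional in as the proposal density in the Metropolis-Hastings ratio, writing $\theta_{-i}$ for all parameters other than $\theta_i$:

$$\rho = \frac{p(\theta_i^p, \theta_{-i}|X)\; p(\theta_i|\theta_{-i},X)}{p(\theta_i, \theta_{-i}|X)\; p(\theta_i^p|\theta_{-i},X)} = \frac{p(\theta_i^p|\theta_{-i},X)\, p(\theta_{-i}|X)\; p(\theta_i|\theta_{-i},X)}{p(\theta_i|\theta_{-i},X)\, p(\theta_{-i}|X)\; p(\theta_i^p|\theta_{-i},X)} = 1$$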

Use the analytical solution, if one exists:

from functools import reduce
from operator import mul
import numpy as np
import matplotlib.pyplot as plt
import scipy.stats as st
from mpl_toolkits.mplot3d import Axes3D  # needed on older matplotlib for projection='3d'

def comb(N, k):
    # binomial coefficient C(N, k)
    return reduce(mul, range(N-k+1, N+1))//reduce(mul, range(1, k+1))

def bern(theta, N, k):
    # binomial likelihood: k successes in N Bernoulli trials
    return comb(N, k)*theta**k*(1-theta)**(N-k)

def bern2(theta1, theta2, N1, N2, k1, k2):
    # joint likelihood of two independent binomial experiments
    return bern(theta1, N1, k1)*bern(theta2, N2, k2)

def make_thetas(xmin, xmax, n):
    # midpoints of an n-point grid on [xmin, xmax]
    xs = np.linspace(xmin, xmax, n)
    widths = (xs[1:]-xs[:-1])/2
    thetas = xs[:-1]+widths
    return thetas

def make_plots(X, Y, prior, lik, poster, projection=None):
    fig, ax = plt.subplots(1, 3, subplot_kw=dict(projection=projection, aspect='equal'), figsize=(12, 9))
    if projection == '3d':
        ax[0].plot_surface(X, Y, prior, alpha=.3, cmap=plt.cm.jet)
        ax[1].plot_surface(X, Y, lik, alpha=.3, cmap=plt.cm.jet)
        ax[2].plot_surface(X, Y, poster, alpha=.3, cmap=plt.cm.jet)
    else:
        ax[0].contour(X, Y, prior)
        ax[1].contour(X, Y, lik)
        ax[2].contour(X, Y, poster)
    ax[0].set_title('Prior')
    ax[1].set_title('Lik')
    ax[2].set_title('Poster')
    plt.tight_layout()
    plt.show()

thetas1 = make_thetas(0, 1, 101)
thetas2 = make_thetas(0, 1, 101)
X, Y = np.meshgrid(thetas1, thetas2)

a, b = 2, 3
N1, k1 = 14, 11
N2, k2 = 14, 7

prior = st.beta(a, b).pdf(X)*st.beta(a, b).pdf(Y)
lik = bern2(X, Y, N1, N2, k1, k2)
poster = st.beta(a+k1, b+N1-k1).pdf(X)*st.beta(a+k2, b+N2-k2).pdf(Y)

make_plots(X, Y, prior, lik, poster)
make_plots(X, Y, prior, lik, poster, '3d')
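For comparison with the analytic grids above, the conditional posteriors of this model are conjugate Betas: $p(\theta_1|\theta_2, X) = \mathrm{Beta}(a+k_1,\, b+N_1-k_1)$ and likewise for $\theta_2$ (the two coins are independent given the data, so each conditional does not actually depend on the other parameter). A minimal Gibbs sampler sketch, reusing a, b, N1, k1, N2, k2 from the code above:

import numpy as np
import scipy.stats as st

a, b = 2, 3
N1, k1 = 14, 11
N2, k2 = 14, 7

niters = 10**4
samples = np.zeros((niters, 2))
theta1, theta2 = 0.5, 0.5             # initial values
for i in range(niters):
    # draw theta1 from its full conditional p(theta1 | theta2, X)
    theta1 = st.beta(a + k1, b + N1 - k1).rvs()
    # draw theta2 from its full conditional p(theta2 | theta1, X)
    theta2 = st.beta(a + k2, b + N2 - k2).rvs()
    samples[i] = theta1, theta2       # one completed cycle
print('posterior means:', samples.mean(axis=0))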

