We start by loading the dataset that the mcmc
package
includes. We will use the logit
data set to obtain a
posterior distribution of the model parameters using the
MCMC
function.
library(fmcmc)
data(logit, package = "mcmc")
out <- glm(y ~ x1 + x2 + x3 + x4, data = logit, family = binomial, x = TRUE)
beta.init <- as.numeric(coefficients(out))
To use the Metropolis-Hastings MCMC algorithm, the function should be
(in principle) the log unnormalized posterior. The following block of
code, extracted from the mcmc
package vignette “MCMC
Package Example,” creates the function that we will be using:
lupost_factory <- function(x, y) function(beta) {
eta <- as.numeric(x %*% beta)
logp <- ifelse(eta < 0, eta - log1p(exp(eta)), - log1p(exp(- eta)))
logq <- ifelse(eta < 0, - log1p(exp(eta)), - eta - log1p(exp(- eta)))
logl <- sum(logp[y == 1]) + sum(logq[y == 0])
return(logl - sum(beta^2) / 8)
}
lupost <- lupost_factory(out$x, out$y)
Let’s give it the first try. In this case, we will use the beta
estimates from the estimated GLM model as a starting point for the
algorithm, and we will ask it to sample 1e4 points from the posterior
distribution (nsteps
).
# to get reproducible results
set.seed(42)
out <- MCMC(
initial = beta.init,
fun = lupost,
nsteps = 1e3
)
Since the resulting object is of class mcmc
(from the
coda
R package), we can use all the functions included in
coda
for model diagnostics:
So this chain has very poor mixing, so let’s try again by using a smaller scale for the normal kernel proposal moving it from 1 (the default value) to .2:
# to get reproducible results
set.seed(42)
out <- MCMC(
initial = beta.init,
fun = lupost,
nsteps = 1e3,
kernel = kernel_normal(scale = .2)
)
The kernel_normal
–the default kernel in the
MCMC
function–returns an object of class
fmcmc_kernel
. In principle, it consists of a list of two
functions that are used by the MCMC
routine:
proposal
, the proposal kernel function, and
logratio
, the function that returns the log of the
Metropolis-Hastings ratio. We will talk more about
fmcmc_kernel
objects later. Now, let’s look at the first
three variables of our model:
Better. Now, ideally we should only be using observations from the
stationary distribution. Let’s give it another try checking for
convergence every 1,000 steps using the
convergence_geweke
:
# to get reproducible results
set.seed(42)
out <- MCMC(
initial = beta.init,
fun = lupost,
nsteps = 1e4,
kernel = kernel_normal(scale = .2),
conv_checker = convergence_geweke(200)
)
## Convergence has been reached with 200 steps. avg Geweke's Z: 1.1854. (200 final count of samples).
A bit better. As announced by MCMC
, the
convergence_geweke
checker suggests that the chain reached
a stationary state. With this in hand, we can now rerun the algorithm
such that we start from the last couple of steps of the chain, this
time, without convergence monitoring as it is no longer necessary.
We will increase the number of steps (sample size), use two chains using parallel computing, and add some thinning to reduce autocorrelation:
# Now, we change the seed, so we get a different stream of
# pseudo random numbers
set.seed(112)
out_final <- MCMC(
initial = out, # Automagically takes the last 2 points
fun = lupost,
nsteps = 5e4, # Increasing the sample size
kernel = kernel_normal(scale = .2),
thin = 10,
nchains = 2L, # Running parallel chains
multicore = TRUE # in parallel.
)
Notice that, instead of specifying the starting points for each
chain, we passed the out
object to initial
. By
default, if initial
is of class mcmc
,
MCMC
will take the last nchains
points from
the chain as starting point for the new sequence. If
initial
is of class mcmc.list
, the number of
chains in initial
must match the nchains
parameter. We now see that the output posterior distribution appears to
be stationary
##
## Iterations = 10:50000
## Thinning interval = 10
## Number of chains = 2
## Sample size per chain = 5000
##
## 1. Empirical mean and standard deviation for each variable,
## plus standard error of the mean:
##
## Mean SD Naive SE Time-series SE
## par1 0.6575 0.3005 0.003005 0.004844
## par2 0.7998 0.3696 0.003696 0.006680
## par3 1.1719 0.3629 0.003629 0.006884
##
## 2. Quantiles for each variable:
##
## 2.5% 25% 50% 75% 97.5%
## par1 0.0924 0.4509 0.6445 0.8535 1.275
## par2 0.1084 0.5457 0.7846 1.0470 1.560
## par3 0.5181 0.9193 1.1533 1.4079 1.939
fmcmc_kernel
objects are environments that are passed to
the MCMC
function. While the MCMC
function
only returns the mcmc
class object (as defined in the
coda
package), users can exploit the fact that the kernel
objects are environments to reuse them or inspect them once the
MCMC
function returns. For example,
fmcmc_kernel
objects can be useful with adaptive kernels as
users can review the covariance structure or other components.
To illustrate this, let’s re-do the MCMC chain of the previous example but using an adaptive kernel instead, in particular, Haario’s 2010 adaptive metropolis.
The MCMC function will update the kernel at every step
(freq = 1
), and the adaptation will start from step 500
(warmup = 500
). We can see that some of its components
haven’t been initialized or have a default starting value before the
call of the MCMC
function:
## [1] 0
## NULL
Let’s see how it works:
set.seed(12)
out_haario_1 <- MCMC(
initial = out,
fun = lupost,
nsteps = 1000, # We will only run the chain for 100 steps
kernel = khaario, # We passed the predefined kernel
thin = 1, # No thining here
nchains = 1L, # A single chain
multicore = FALSE # Running in serial
)
Let’s inspect the output and mark when the adaptation starts:
traceplot(out_haario_1[,1], main = "Traceplot of the first parameter")
abline(v = 500, col = "red", lwd = 2, lty=2)
If we look at the khaario
kernel, the
fmcmc_kernel
object, we can see that things changed from
the first time we ran it
## [1] 999
## [,1] [,2] [,3] [,4] [,5]
## [1,] 0.008220285 0.003821791 -0.002033491 -0.03193140 0.002977029
## [2,] 0.003821791 0.021120481 -0.002487103 -0.04318181 -0.003852918
## [3,] -0.002033491 -0.002487103 0.004459020 0.01008434 -0.002217338
## [4,] -0.031931404 -0.043181807 0.010084337 0.24419493 -0.026343668
## [5,] 0.002977029 -0.003852918 -0.002217338 -0.02634367 0.013265372
If we re-run the chain, using as starting point the last step of the first run, we can also continue using the kernel object:
out_haario_2 <- MCMC(
initial = out_haario_1,
fun = lupost,
nsteps = 2000, # We will only run the chain for 2000 steps now
kernel = khaario, # Same as before, same kernel.
thin = 1,
nchains = 1L,
multicore = FALSE,
seed = 555 # We can also specify the seed in the MCMC function
)
Let’s see again how does everything looks like:
traceplot(out_haario_2[,1], main = "Traceplot of the first parameter")
abline(v = 500, col = "red", lwd = 2, lty=2)
As shown in the plot, since the warmup period already passed for the kernel object, the adaptation process is happening at every step, so we don’t see a big break at step 500 as before. Let’s see the counts and the covariance matrix and compare it with the previous one:
# Number of iterations (absolute count, the counts equal the number of steps)
# This will have 1000 (first run) + 2000 (second run) steps
khaario$abs_iter
## [1] 2998
## [,1] [,2] [,3] [,4] [,5]
## [1,] 0.057599782 0.005541324 0.014625012 -0.005830144 0.017563415
## [2,] 0.005541324 0.091474926 -0.015691108 -0.023694701 -0.011853939
## [3,] 0.014625012 -0.015691108 0.091880184 0.026965029 -0.006407434
## [4,] -0.005830144 -0.023694701 0.026965029 0.218984142 -0.004821361
## [5,] 0.017563415 -0.011853939 -0.006407434 -0.004821361 0.089121970
## [,1] [,2] [,3] [,4] [,5]
## [1,] -0.049379498 -0.001719533 -0.016658503 -0.02610126 -0.014586386
## [2,] -0.001719533 -0.070354445 0.013204005 -0.01948711 0.008001021
## [3,] -0.016658503 0.013204005 -0.087421164 -0.01688069 0.004190096
## [4,] -0.026101260 -0.019487106 -0.016880692 0.02521078 -0.021522308
## [5,] -0.014586386 0.008001021 0.004190096 -0.02152231 -0.075856598
Things have changed since the last time we used the kernel, as
expected. Kernel objects in the fmcmc
package can also be
used with multiple chains and in parallel. The MCMC
function is smart enough to create independent copies of
fmcmc_kernel
objects when running multiple chains, and keep
the original kernel objects up-to-date even when using multiple cores to
run MCMC
. For more technical details on how
fmcmc_kernel
objects work see the manual
?fmcmc_kernel
or the vignette “User-defined kernels”
included in the package
vignette("user-defined-kernels", package = "fmcmc")
.
In some situations, you may want to access the computed unnormalized
log-posterior probabilities, the states proposed by the kernel, or other
process components. In those cases, the functions with the prefix
get_*
can help you.
Starting version 0.5-0, we replaced the family of functions
last_*
with get_*
; a re-design of this
“memory” component that gives users access to data generated during the
Markov process. After each run of the MCMC
function,
information regarding the last execution is stored in the environment
MCMC_OUTPUT
.
If you wanted to look at the logposterior of the last call and proposed states, you can do the following:
# Pretty figure showing proposed and accepted
plot(
get_draws()[,1:2], pch = 20, col = "gray",
main = "Haario's second run"
)
points(out_haario_2[,1:2], col = "red", pch = 20)
legend(
"topleft", legend = c("proposed", "accepted"),
col = c("gray", "red"),
pch = 20,
bty = "n"
)
The MCMC_OUTPUT
environment also contains the arguments
passed to MCMC()
:
## Markov Chain Monte Carlo (MCMC) output:
## Start = 1000
## End = 1000
## Thinning interval = 1
## par1 par2 par3 par4 par5
## 1000 0.8557948 0.2572012 1.083914 0.3151828 0.6530384
## function (beta)
## {
## eta <- as.numeric(x %*% beta)
## logp <- ifelse(eta < 0, eta - log1p(exp(eta)), -log1p(exp(-eta)))
## logq <- ifelse(eta < 0, -log1p(exp(eta)), -eta - log1p(exp(-eta)))
## logl <- sum(logp[y == 1]) + sum(logq[y == 0])
## return(logl - sum(beta^2)/8)
## }
## <bytecode: 0x55fa19225e10>
## <environment: 0x55fa17f0c2e8>
##
## An environment of class fmcmc_kernel:
##
## Ik : num [1:5, 1:5] 1e-04 0e+00 0e+00 0e+00 0e+00 0e+00 1e-04 0e+00 0e+00 0e+00 ...
## Mean_t_prev : num [1, 1:5] 0.675 0.812 1.11 0.345 0.798
## Sd : num 1.15
## Sigma : num [1:5, 1:5] 0.0576 0.00554 0.01463 -0.00583 0.01756 ...
## abs_iter : int 2998
## bw : int 0
## eps : num 1e-04
## fixed : logi [1:5] FALSE FALSE FALSE FALSE FALSE
## freq : num 1
## k : int 5
## lb : num [1:5] -1.8e+308 -1.8e+308 -1.8e+308 -1.8e+308 -1.8e+308
## logratio : function (env)
## mu : num [1:5] 0 0 0 0 0
## proposal : function (env)
## ub : num [1:5] 1.8e+308 1.8e+308 1.8e+308 1.8e+308 1.8e+308
## until : num Inf
## warmup : num 500
## which. : int [1:5] 1 2 3 4 5