Title: | A friendly MCMC framework |
---|---|
Description: | Provides a friendly (flexible) Markov Chain Monte Carlo (MCMC) framework for implementing the Metropolis-Hastings algorithm in a modular way, allowing users to specify automatic convergence checkers, personalized transition kernels, and out-of-the-box multiple MCMC chains using parallel computing. Most of the methods implemented in this package can be found in Brooks et al. (2011, ISBN 9781420079425). Among the methods included, we have: Haario (2001) <doi:10.2307/3318737> Adaptive Metropolis, Vihola (2012) <doi:10.1007/s11222-011-9269-5> Robust Adaptive Metropolis, and Thawornwattana et al. (2018) <doi:10.1214/17-BA1084> Mirror transition kernels. |
Authors: | George Vega Yon [aut, cre] , Paul Marjoram [ctb, ths] , National Cancer Institute (NCI) [fnd] (Grant Number 5P01CA196569-02), Fabian Scheipl [rev] (JOSS reviewer, <https://orcid.org/0000-0001-8172-3603>) |
Maintainer: | George Vega Yon <[email protected]> |
License: | MIT + file LICENSE |
Version: | 0.5-2.9000 |
Built: | 2024-11-11 03:14:05 UTC |
Source: | https://github.com/uscbiostats/fmcmc |
Combines two or more MCMC runs into a single run. If runs have multiple chains, it will check that all have the same number of chains, and it will join chains using the rbind function.
append_chains(...)
... |
A list of mcmc or mcmc.list objects to be appended. |
If the runs are of class mcmc.list, an object of class mcmc.list; otherwise, an object of class mcmc.
# Appending three chains
data("lifeexpect")
logpost <- function(p) {
  sum(with(lifeexpect, dnorm(
    age - p[1] - smoke * p[2] - female * p[3],
    sd = p[4], log = TRUE)
  ))
}

# Using the RAM kernel
kern <- kernel_ram(lb = c(-100, -100, -100, .00001))

init <- c(avg_age = 70, smoke = 0, female = 0, sd = 1)

ans0 <- MCMC(initial = init, fun = logpost, nsteps = 1000, seed = 22, kernel = kern)
ans1 <- MCMC(initial = ans0, fun = logpost, nsteps = 2000, seed = 55, kernel = kern)
ans2 <- MCMC(initial = ans1, fun = logpost, nsteps = 2000, seed = 1155, kernel = kern)
ans_tot <- append_chains(ans0, ans1, ans2)

# Looking at the posterior distributions (see ?lifeexpect for info about
# the model). Only the trace
op <- par(mfrow = c(2, 2))
for (i in 1:4)
  coda::traceplot(ans_tot[, i, drop = FALSE])
par(op)
This function is for internal use only.
check_initial(initial, nchains)
initial |
Either a vector or a matrix. |
nchains |
Integer scalar. Number of chains. |
When initial is a vector, the values are recycled to form a matrix of size nchains * length(initial).
A named matrix.
init <- c(.4, .1)
check_initial(init, 1)
check_initial(init, 2)

init <- matrix(1:9, ncol = 3)
check_initial(init, 3)
# check_initial(init, 2) # Returns an error
Built-in set of functions to be used in companion with the argument conv_checker in MCMC. These functions are not intended to be used in a context other than the MCMC function.

The object LAST_CONV_CHECK is an environment that holds information regarding the convergence checker used. This information can be updated every time that the conv_checker function is called by MCMC, using the functions convergence_data_set and convergence_msg_set. The function convergence_data_get is just a wrapper of get().

The msg member of LAST_CONV_CHECK is reset before conv_checker is called.
LAST_CONV_CHECK

convergence_data_set(x)

convergence_data_get(x)

convergence_msg_set(msg = NA_character_)

convergence_msg_get()

convergence_gelman(freq = 1000L, threshold = 1.1, check_invariant = TRUE, ...)

convergence_geweke(freq = 1000L, threshold = 0.025, check_invariant = TRUE, ...)

convergence_heildel(freq = 1000L, ..., check_invariant = TRUE)

convergence_auto(freq = 1000L)
x |
In the case of convergence_data_set, a named list of objects to store; for convergence_data_get, the name of the object to retrieve. |
msg |
Character scalar. Personalized message to print. |
freq |
Integer scalar. Frequency of checking. |
threshold |
Numeric value. A Gelman statistic below the threshold will return TRUE. |
check_invariant |
Logical. When TRUE, parameters with zero variance are excluded before computing the test. |
... |
Further arguments passed to the method. |
An object of class fmcmc_output_conv_check (inherits from environment) of length 1.
convergence_gelman is a wrapper of coda::gelman.diag().

In the case of convergence_geweke, threshold sets the p-value for the null hypothesis, i.e., equal means between the first and last chunks of the chain. See coda::geweke.diag. This implies that the higher the threshold, the lower the probability of stopping the chain.

In the case that the chain has more than one parameter, the algorithm will return TRUE if and only if the test fails to reject the null for all the parameters.

For convergence_heildel, see coda::heidel.diag for details.

The convergence_auto function is the default and is just a wrapper for convergence_gelman and convergence_geweke. This function returns a convergence checker that will be either of the other two, depending on whether nchains in MCMC is greater than one (in which case it will use the Gelman test) or not (in which case it will use the Geweke test).
A function passed to MCMC to check automatic convergence.
Convergence checkers are simply functions that receive as argument a matrix (or list of them) with sampled values, and return a logical scalar with the value TRUE if the chain converged. An example of a personalized convergence checker is provided below. The frequency with which the check is performed is retrieved from the attribute "freq" of the convergence checker function, i.e., attr(..., "freq"). If missing, convergence will be checked halfway through the number of steps in the chain, i.e., at floor(nsteps/2).
# Example 1: Personalized conv checker -----------------------------------------
# Dummy rule: the acceptance rate must be between .2 and .3.
convergence_example <- function(x) {
  arate <- 1 - coda::rejectionRate(x)
  all(abs(arate - .25) < .05)
}

# Tell fmcmc::MCMC what is the frequency
attr(convergence_example, "freq") <- 2e3

set.seed(223)
x <- rnorm(1000)
y <- x * 2 + rnorm(1000)
logpost <- function(p) {
  sum(dnorm(y, mean = x * p, log = TRUE))
}

ans <- MCMC(
  initial = 0, fun = logpost, nsteps = 5e4,
  kernel = kernel_ram(),
  conv_checker = convergence_example
)

# Example 2: Adding information ------------------------------------------------
# Here we do two things: save a value and set a message for the user
convergence_example_with_info <- structure(function(x) {

  arate <- 1 - coda::rejectionRate(x)

  # Saving a value
  if (!exists("arates", envir = LAST_CONV_CHECK, inherits = FALSE)) {
    convergence_data_set(list(arates = arate))
  } else {
    convergence_data_set(list(
      arates = rbind(convergence_data_get("arates"), arate)
    ))
  }

  # Setting up the message
  convergence_msg_set(
    sprintf("Current Avg. Accept. Rate: %.2f", mean(arate))
  )

  all(abs(arate - .25) < .05)

}, freq = 2000)

ans <- MCMC(
  initial = 0, fun = logpost, nsteps = 5e4,
  kernel = kernel_ram(),
  conv_checker = convergence_example_with_info,
  seed = 112, progress = FALSE
)
These algorithms are used in kernel_adapt() to simplify variance-covariance recalculation at every step of the algorithm.
cov_recursive(
  X_t, Cov_t, Mean_t_prev, t.,
  Mean_t = NULL, eps = 0, Sd = 1,
  Ik = diag(ncol(Cov_t))
)

mean_recursive(X_t, Mean_t_prev, t.)
X_t |
Last value of the sample. |
Cov_t |
Covariance at time t - 1. |
t. |
Sample size up to t - 1. |
Mean_t , Mean_t_prev |
Vectors of averages at times t and t - 1, respectively. |
Sd , eps , Ik |
See kernel_adapt(). |
The variance-covariance algorithm was described in Haario, Saksman and Tamminen (2001).
Haario, H., Saksman, E., & Tamminen, J. (2001). An adaptive Metropolis algorithm. Bernoulli, 7(2), 223–242. https://projecteuclid.org/euclid.bj/1080222083
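For intuition, the recursive mean that mean_recursive computes can be sketched as follows. This formula is inferred from the example below (with t. the sample size up to the previous observation), not quoted from the package source:

# Mean_t = (X_t + t. * Mean_t_prev) / (t. + 1)
mean_recursive_sketch <- function(X_t, Mean_t_prev, t.) {
  (X_t + t. * Mean_t_prev) / (t. + 1)
}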
# Generating random data (only three points to see the difference)
set.seed(1231)
n <- 3
X <- matrix(rnorm(n * 4), ncol = 4)

# These two should be equal
mean_recursive(
  X_t         = X[1, ],
  Mean_t_prev = colMeans(X[-1, ]),
  t.          = n - 1
)
colMeans(X)

# These two should be equal
cov_recursive(
  X_t         = X[1, ],
  Cov_t       = cov(X[-1, ]),
  Mean_t      = colMeans(X),
  Mean_t_prev = colMeans(X[-1, ]),
  t.          = n - 1
)
cov(X)

# Speed example ------------------------------------------------------------------
set.seed(13155511)
X <- matrix(rnorm(1e3 * 100), ncol = 100)
ans0 <- cov(X[-1, ])

t0 <- system.time({
  ans1 <- cov(X)
})
t1 <- system.time(ans2 <- cov_recursive(
  X[1, ], ans0,
  Mean_t      = colMeans(X),
  Mean_t_prev = colMeans(X[-1, ]),
  t.          = 1e3 - 1
))

# Comparing accuracy and speed
range(ans1 - ans2)
t0 / t1
The fmcmc package provides a flexible framework for implementing MCMC models and is lightweight in terms of dependencies. Among its main features, fmcmc allows:
Implementing arbitrary transition kernels.
Incorporating convergence monitors for automatic stopping.
Out-of-the-box parallel computing implementation for running multiple chains simultaneously.
For more information, see the package's vignettes:
vignette("workflow-with-fmcmc", "fmcmc") vignette("user-defined-kernels", "fmcmc")
Vega Yon, G., & Marjoram, P. (2019). fmcmc: A friendly MCMC framework. Journal of Open Source Software, 4(39), 1427. doi:10.21105/joss.01427
These functions will no longer be included starting with version 0.6-0. Instead, use the functions in mcmc-output.
last_elapsed()

LAST_MCMC

last_nsteps()

last_nchains()

last_kernel()

last_conv_checker()

last_(x)
x |
Name of the object to retrieve. |
An object of class fmcmc_last_mcmc of length 0.
The function last_elapsed returns the elapsed time of the last call to MCMC. In particular, the MCMC function records the running time of R at the beginning and end of the function using proc.time(), so this function returns the difference between the two (time_end - time_start).
Implementation of Haario et al. (2001)'s Adaptive Metropolis.
kernel_adapt(
  mu = 0, bw = 0L,
  lb = -.Machine$double.xmax, ub = .Machine$double.xmax,
  freq = 1L, warmup = 500L,
  Sigma = NULL, Sd = NULL, eps = 1e-04,
  fixed = FALSE, until = Inf
)

kernel_am(
  mu = 0, bw = 0L,
  lb = -.Machine$double.xmax, ub = .Machine$double.xmax,
  freq = 1L, warmup = 500L,
  Sigma = NULL, Sd = NULL, eps = 1e-04,
  fixed = FALSE, until = Inf
)
mu |
Either a numeric vector or a scalar. Proposal mean. If scalar, values are recycled to match the number of parameters in the objective function. |
bw |
Integer scalar. The bandwidth: the number of observations to include in the computation of the variance-covariance matrix. |
lb , ub
|
Either a numeric vector or a scalar. Lower and upper bounds for bounded kernels. When of length 1, the values are recycled to match the number of parameters in the objective function. |
freq |
Integer scalar. Frequency of updates. How often the variance-covariance matrix is updated. The implementation is different from that described in the original paper (see details). |
warmup |
Integer scalar. The number of iterations that the algorithm has to wait before starting to do the updates. |
Sigma |
The variance-covariance matrix. By default this will be an identity matrix during the warmup period. |
Sd |
Overall scale for the algorithm. By default, the variance-covariance is scaled to 2.38^2 / k, with k the number of parameters. |
eps |
Double scalar. Default size of the initial step (see details). |
fixed |
Logical scalar or vector of length k. Indicates which parameters are treated as fixed (not updated). Single values are recycled. |
until |
Integer scalar. Last step at which adaptation takes place (see details). |
While it has been shown that under regular conditions this transition kernel generates ergodic chains even when the adaptation does not stop, some practitioners may want to stop adaptation at some point.
kernel_adapt implements the adaptive Metropolis (AM) algorithm of Haario et al. (2001). If the value of bw is greater than zero, then the algorithm folds back to AP, a previous version of the algorithm which is known to have ergodicity problems.

The parameter eps has two functions. The first one is to set the initial scale for the multivariate normal kernel, which is replaced after warmup steps with the actual variance-covariance computed by the main algorithm. The second usage is in the equation that ensures that the variance-covariance is greater than zero; this is the parameter epsilon in the original paper.

The update of the covariance matrix is done using the cov_recursive() function, which makes the updates faster. The freq parameter, besides indicating the frequency with which the updates are done, specifies which samples are included in each update; in other words, like a thinning parameter, only every freq-th sample is used to compute the covariance matrix. Since this implementation uses the recursive formula for updating the covariance, there is no practical need to set freq != 1.
kernel_am is just an alias for kernel_adapt.
An object of class fmcmc_kernel. fmcmc_kernel objects are intended to be used with the MCMC() function.
Haario, H., Saksman, E., & Tamminen, J. (2001). An adaptive Metropolis algorithm. Bernoulli, 7(2), 223–242. https://projecteuclid.org/euclid.bj/1080222083
Other kernels: kernel_mirror, kernel_new(), kernel_normal(), kernel_ram(), kernel_unif()
# Update every step and wait 1,000 steps before starting to adapt
kern <- kernel_adapt(freq = 1, warmup = 1000)

# Two-parameter model, the second parameter with a restricted range, i.e.,
# a lower bound of 0
kern <- kernel_adapt(lb = c(-.Machine$double.xmax, 0))
NMirror and UMirror transition kernels described in Thawornwattana et al. (2018).
kernel_nmirror(
  mu = 0, scale = 1, warmup = 500L, nadapt = 4L, arate = 0.4,
  lb = -.Machine$double.xmax, ub = .Machine$double.xmax,
  fixed = FALSE, scheme = "joint"
)

kernel_umirror(
  mu = 0, scale = 1, warmup = 500L, nadapt = 4L, arate = 0.4,
  lb = -.Machine$double.xmax, ub = .Machine$double.xmax,
  fixed = FALSE, scheme = "joint"
)
mu , scale
|
Either a numeric vector or a scalar. Proposal mean and scale. If scalar, values are recycled to match the number of parameters in the objective function. |
warmup |
Integer. Number of steps required before starting adapting the chains. |
nadapt |
Integer. Number of times the scale is adjusted for adaptation during the warmup (burn-in) period. |
arate |
Double. Target acceptance rate used as a reference during the adaptation process. |
lb , ub
|
Either a numeric vector or a scalar. Lower and upper bounds for bounded kernels. When of length 1, the values are recycled to match the number of parameters in the objective function. |
fixed , scheme
|
For multivariate functions, sets the update plan. See plan_update_sequence(). |
The kernel_nmirror and kernel_umirror functions implement simple symmetric transition kernels that pivot around an approximation of the asymptotic mean.

In the multidimensional case, this implementation just generates a vector of independent draws from the proposal kernel, instead of using, for example, a multivariate distribution of some kind. This will be implemented in the next update of the package.

During the warmup period (or burn-in as described in the paper), the algorithm adapts both the scale and the reference mean of the proposal distribution. While the mean is adapted continuously, the scale is updated only a handful of times, in particular, nadapt times during the warmup period. The adaptation is done as proposed by Yang and Rodriguez (2013), in which the scale is adapted four times.
An object of class fmcmc_kernel. fmcmc_kernel objects are intended to be used with the MCMC() function.
Thawornwattana, Y., Dalquen, D., & Yang, Z. (2018). Designing Simple and Efficient Markov Chain Monte Carlo Proposal Kernels. Bayesian Analysis, 13(4), 1037–1063. doi:10.1214/17-BA1084
Yang, Z., & Rodriguez, C. E. (2013). Searching for efficient Markov chain Monte Carlo proposal kernels. Proceedings of the National Academy of Sciences, 110(48), 19307–19312. doi:10.1073/pnas.1311790110
Other kernels: kernel_adapt(), kernel_new(), kernel_normal(), kernel_ram(), kernel_unif()
# Normal mirror kernel with 5 adaptations and 1,000 steps of warmup (burn-in)
kern <- kernel_nmirror(nadapt = 5, warmup = 1000)

# Same as before but using a uniform mirror and choosing a target acceptance
# rate of 24%
kern <- kernel_umirror(nadapt = 5, warmup = 1000, arate = .24)
The function kernel_new is a helper function that allows creating fmcmc_kernel objects, which are passed to the MCMC() function.
kernel_new(proposal, ..., logratio = NULL, kernel_env = new.env(hash = TRUE))
proposal , logratio
|
Functions. Both receive a single argument, an environment. These functions are called later within MCMC (see details). |
... |
In the case of kernel_new, further objects to be stored as part of the kernel's environment (see details). |
kernel_env |
Environment. This will be used as the main container of the kernel's components. It is returned as an object of class fmcmc_kernel. |
fmcmc_kernel objects are environments that in general contain the following objects:

proposal: The function used to propose changes in the chain based on the current state. The function must return a vector of length equal to the number of parameters in the model.

logratio: This function is called after a new state has been proposed, and is used to compute the log of the Hastings ratio. In the case that the logratio function is not specified, it is assumed that the transition kernel is symmetric; the log-ratio is then implemented as function(env) {env$f1 - env$f0}.

...: Further objects that are used within those functions.

Both functions, proposal and logratio, receive a single argument, an environment, which is passed by the MCMC() function during each step using the function environment().

The passed environment is actually the environment in which the MCMC function is running; in particular, this environment contains the following objects:
Object | Description |
i | Integer. The current iteration. |
theta1 | Numeric vector. The last proposed state. |
theta0 | Numeric vector. The current state. |
f | The log-unnormalized posterior function (a wrapper of fun passed to MCMC). |
f1 | The last value of f(theta1). |
f0 | The last value of f(theta0). |
kernel | The actual fmcmc_kernel object. |
ans | The matrix of samples defined up to i - 1. |
These are the core components of the MCMC function. The following block of code is how this is actually implemented in the package:
for (i in 1L:nsteps) {
  # Step 1. Propose
  theta1[] <- kernel$proposal(environment())
  f1 <- f(theta1)

  # Checking f(theta1) (it must be a number, can be Inf)
  if (is.nan(f1) | is.na(f1) | is.null(f1))
    stop(
      "fun(par) is undefined (", f1, ")",
      "Check either -fun- or the -lb- and -ub- parameters.",
      call. = FALSE
    )

  # Step 2. Hastings ratio
  if (R[i] < kernel$logratio(environment())) {
    theta0 <- theta1
    f0     <- f1
  }

  # Step 3. Saving the state
  ans[i, ] <- theta0
}
For an extensive example on how to create new kernel objects, see the vignette vignette("user-defined-kernels", "fmcmc").
An environment of class fmcmc_kernel which contains the following:

proposal: A function that receives a single argument, an environment. This is the proposal function used within MCMC().

logratio: A function to compute log ratios of the current vs the proposed step of the chain. Also used within MCMC().

...: Further arguments passed to kernel_new.
In some cases, calls to the proposal() and logratio() functions in fmcmc_kernels can trigger changes or updates of variables stored within them. A concrete example is with adaptive kernels.

The Adaptive Metropolis and Robust Adaptive Metropolis algorithms implemented in the functions kernel_adapt() and kernel_ram() both update a covariance matrix used during the proposal stage, and furthermore have a warmup stage that sets the point at which both will start adapting. Because of this, both kernels keep internal counters of the absolute step count, which allows them to activate, scale, etc. the proposals correctly.
When running multiple chains, MCMC will create independent copies of the baseline fmcmc_kernel object passed to it. These are managed together in an fmcmc_kernel_list object.

Even if the chains are run in parallel, if a predefined kernel object is passed, it will be updated to reflect the last state of the kernels before the MCMC call returns.
Brooks, S., Gelman, A., Jones, G. L., & Meng, X. L. (2011). Handbook of Markov Chain Monte Carlo. Chapman & Hall/CRC.
Other kernels: kernel_adapt(), kernel_mirror, kernel_normal(), kernel_ram(), kernel_unif()
# Example creating a multivariate normal kernel using the mvtnorm R package
# for a bivariate normal distribution
library(mvtnorm)

# Define your Sigma
sigma <- matrix(c(1, .2, .2, 1), ncol = 2)

# What does it look like?
sigma
#      [,1] [,2]
# [1,]  1.0  0.2
# [2,]  0.2  1.0

# Create the kernel
kernel_mvn <- kernel_new(
  proposal = function(env) {
    env$theta0 + as.vector(mvtnorm::rmvnorm(1, mean = c(0, 0), sigma = sigma.))
  },
  sigma. = sigma
)

# As you can see, in the previous call we passed sigma as it will be used by
# the proposal function.
# The logratio function did not need to be passed since this kernel is
# symmetric.
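By contrast, a non-symmetric kernel needs an explicit logratio. The following is a hypothetical sketch (the names kernel_lognorm and sd. are made up for illustration): a multiplicative log-normal random walk on positive parameters, whose Hastings correction reduces to the log-Jacobian term sum(log(theta1)) - sum(log(theta0)):

kernel_lognorm <- kernel_new(
  proposal = function(env) {
    # Multiplicative random walk: theta1 = theta0 * exp(Z), Z ~ N(0, sd.^2)
    env$theta0 * exp(rnorm(length(env$theta0), sd = sd.))
  },
  logratio = function(env) {
    # f1 - f0 plus log q(theta0 | theta1) - log q(theta1 | theta0),
    # which for this proposal is sum(log(theta1)) - sum(log(theta0))
    env$f1 - env$f0 + sum(log(env$theta1)) - sum(log(env$theta0))
  },
  sd. = 0.1
)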
Gaussian Transition Kernel
kernel_normal(mu = 0, scale = 1, fixed = FALSE, scheme = "joint")

kernel_normal_reflective(
  mu = 0, scale = 1,
  lb = -.Machine$double.xmax, ub = .Machine$double.xmax,
  fixed = FALSE, scheme = "joint"
)
mu , scale
|
Either a numeric vector or a scalar. Proposal mean and scale. If scalar, values are recycled to match the number of parameters in the objective function. |
fixed , scheme
|
For multivariate functions, sets the update plan. See plan_update_sequence(). |
lb , ub
|
Either a numeric vector or a scalar. Lower and upper bounds for bounded kernels. When of length 1, the values are recycled to match the number of parameters in the objective function. |
The kernel_normal function provides the canonical normal kernel with symmetric transition probabilities.

The kernel_normal_reflective function implements the normal kernel with reflective boundaries. Lower and upper bounds are treated using reflecting boundaries; this is, if the proposed value theta* is greater than ub, then the excess theta* - ub is subtracted from ub, and if it is less than lb, then the deficit lb - theta* is added to lb, iterating until the proposal is within [lb, ub]. In this case, the transition probability is symmetric (just like the normal kernel).
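A small numeric sketch of the reflection rule just described (reflect_once is a made-up helper, assuming a single reflection suffices):

# With ub = 1, a proposal of 1.3 exceeds the bound by 0.3 and reflects to
# 1 - 0.3 = 0.7; with lb = 0, a proposal of -0.2 reflects to 0 + 0.2 = 0.2.
reflect_once <- function(theta, lb, ub) {
  theta <- ifelse(theta > ub, ub - (theta - ub), theta)
  ifelse(theta < lb, lb + (lb - theta), theta)
}
reflect_once(c(1.3, -0.2), lb = 0, ub = 1) # 0.7 0.2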
An object of class fmcmc_kernel. fmcmc_kernel objects are intended to be used with the MCMC() function.
Other kernels: kernel_adapt(), kernel_mirror, kernel_new(), kernel_ram(), kernel_unif()
# Normal kernel with a small scale (sd) of 0.05
kern <- kernel_normal(scale = 0.05)

# Using boundaries for the second parameter of a two-parameter chain
# to have values in [0, 100]
kern <- kernel_normal_reflective(
  ub = c(.Machine$double.xmax, 100),
  lb = c(-.Machine$double.xmax, 0)
)
Implementation of Vihola (2012)'s Robust Adaptive Metropolis.
kernel_ram(
  mu = 0,
  eta = function(i, k) min(c(1, i^(-2/3) * k)),
  qfun = function(k) stats::rt(k, k),
  arate = 0.234, freq = 1L, warmup = 0L,
  Sigma = NULL, eps = 1e-04,
  lb = -.Machine$double.xmax, ub = .Machine$double.xmax,
  fixed = FALSE, until = Inf, constr = NULL
)
mu |
Either a numeric vector or a scalar. Proposal mean. If scalar, values are recycled to match the number of parameters in the objective function. |
eta |
A function that receives the MCMC environment. This is to calculate the scaling factor for the adaptation. |
qfun |
Function. As described in Vihola (2012), the function that generates the (symmetric) random vector used to build the proposal; by default, it draws from Student's t. |
arate |
Numeric scalar. Objective acceptance rate. |
freq |
Integer scalar. Frequency of updates. How often the variance-covariance matrix is updated. |
warmup |
Integer scalar. The number of iterations that the algorithm has to wait before starting to do the updates. |
Sigma |
The variance-covariance matrix. By default this will be an identity matrix during the warmup period. |
eps |
Double scalar. Default size of the initial step (see details). |
lb , ub
|
Either a numeric vector or a scalar. Lower and upper bounds for bounded kernels. When of length 1, the values are recycled to match the number of parameters in the objective function. |
fixed |
Logical scalar or vector of length k. Indicates which parameters are treated as fixed (not updated). Single values are recycled. |
until |
Integer scalar. Last step at which adaptation takes place (see details). |
constr |
Logical lower-diagonal square matrix of size k, setting structural constraints on the variance-covariance matrix. |
While it has been shown that under regular conditions this transition kernel generates ergodic chains even when the adaptation does not stop, some practitioners may want to stop adaptation at some point.
The idea is similar to that of the Adaptive Metropolis algorithm (AM, implemented as kernel_adapt() here) with the difference that it takes into account a target acceptance rate.

The eta function regulates the rate of adaptation. The default implementation will decrease the rate of adaptation exponentially as a function of the iteration number.

The Cholesky factor S of the proposal's variance-covariance is updated according to the following equation (from Vihola, 2012, with alpha_n the acceptance probability at step n, arate the target acceptance rate, and U_n the random vector generated by qfun):

S_n S_n' = S_{n-1} ( I + eta_n (alpha_n - arate) U_n U_n' / ||U_n||^2 ) S_{n-1}'
An object of class fmcmc_kernel.
Vihola, M. (2012). Robust adaptive Metropolis algorithm with coerced acceptance rate. Statistics and Computing, 22(5), 997–1008. doi:10.1007/s11222-011-9269-5
Other kernels: kernel_adapt(), kernel_mirror, kernel_new(), kernel_normal(), kernel_unif()
# Setting the acceptance rate to 30% and deferring the updates until
# after 1,000 steps
kern <- kernel_ram(arate = .3, warmup = 1000)
Uniform Transition Kernel
kernel_unif(min. = -1, max. = 1, fixed = FALSE, scheme = "joint")

kernel_unif_reflective(
  min. = -1, max. = 1, lb = min., ub = max.,
  fixed = FALSE, scheme = "joint"
)
min. , max.
|
Passed to runif. |
fixed , scheme
|
For multivariate functions, sets the update plan. See plan_update_sequence(). |
lb , ub
|
Either a numeric vector or a scalar. Lower and upper bounds for bounded kernels. When of length 1, the values are recycled to match the number of parameters in the objective function. |
The kernel_unif function provides a uniform transition kernel. This (symmetric) kernel adds to the current state a random value drawn uniformly from [min., max.] (by default, [-1, 1]).

The kernel_unif_reflective function is similar to kernel_unif, with the main difference that proposals are bounded to be within [lb, ub].
An object of class fmcmc_kernel. fmcmc_kernel objects are intended to be used with the MCMC() function.
Other kernels: kernel_adapt(), kernel_mirror, kernel_new(), kernel_normal(), kernel_ram()
# Multivariate setting with 4 parameters in which we set the kernel to make
# proposals one parameter at a time in a random ordering
kern <- kernel_unif(scheme = "random")
A simulated data set based on statistics of life expectancy in the US at birth.
lifeexpect
A data frame with 1,000 rows and three columns:
age
: Years lived.
smoke
: 0/1 variable equal to 1 if the individual smokes.
female
: 0/1 variable equal to 1 if the individual is a female at birth.
The data was generated using official statistics from the CDC and a study of life expectancy of smokers in the US published in the New England Journal of Medicine (see references).
According to the CDC, data from 2020 indicates that the average life expectancy of females in the US is 80.5 years vs. 75.1 years for males (which declined with respect to 2019 after COVID hit the US). In Jha et al. (2013), evidence is presented indicating that individuals who smoke lose at least ten years of life expectancy compared to non-smokers.
The parameter estimates for the data-generating process were:
An average life expectancy of 80.1 years.
Smokers live 10 years less than non-smokers.
Females live 5.4 years longer than males.
Jha, P., Ramasundarahettige, C., Landsman, V., Rostron, B., Thun, M., Anderson, R. N., ... Peto, R. (2013). 21st-Century Hazards of Smoking and Benefits of Cessation in the United States. New England Journal of Medicine, 368(4), 341–350. doi:10.1056/NEJMsa1211128
Arias, E., Tejada-Vera, B., & Ahmad, F. (2021). Provisional life expectancy estimates for January through June, 2020. https://www.cdc.gov/nchs/data/vsrr/VSRR10-508.pdf
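A quick way to inspect the simulated data (a minimal sketch):

data("lifeexpect", package = "fmcmc")
str(lifeexpect)

# Average simulated age by smoking status and sex
with(lifeexpect, tapply(age, list(smoke = smoke, female = female), mean))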
A flexible implementation of the Metropolis-Hastings MCMC algorithm. Users can employ arbitrary transition kernels as well as set up an automatic stopping criterion using a convergence check test.
MCMC(
  initial, fun, nsteps, ...,
  seed = NULL, nchains = 1L, burnin = 0L, thin = 1L,
  kernel = kernel_normal(),
  multicore = FALSE, conv_checker = NULL, cl = NULL,
  progress = interactive() && !multicore,
  chain_id = 1L
)

MCMC_without_conv_checker(
  initial, fun, nsteps, ...,
  nchains = 1L, burnin = 0L, thin = 1L,
  kernel = kernel_normal(),
  multicore = FALSE, conv_checker = NULL, cl = NULL,
  progress = interactive() && !multicore,
  chain_id = 1L
)

MCMC_OUTPUT
initial |
Either a numeric matrix or vector, or an object of class coda::mcmc or coda::mcmc.list. Initial values of the parameters for each chain (see details). |
fun |
A function. Returns the log-likelihood. |
nsteps |
Integer scalar. Length of each chain. |
... |
Further arguments passed to |
seed |
If not null, passed to set.seed. |
nchains |
Integer scalar. Number of chains to run (in parallel). |
burnin |
Integer scalar. Length of burn-in. Passed to coda::mcmc as start. |
thin |
Integer scalar. Passed to coda::mcmc. |
kernel |
An object of class fmcmc_kernel. |
multicore |
Logical. If FALSE, then chains are executed in serial. |
conv_checker |
A function that receives an object of class coda::mcmc.list and returns a logical value with TRUE indicating convergence (see convergence-checker). |
cl |
A cluster object, e.g., as created by parallel::makePSOCKcluster (see details). |
progress |
Logical scalar. When set to TRUE, shows a progress bar. |
chain_id |
Integer scalar (internal use only). This is an argument passed to the kernel function that allows it to identify in which of the chains the process is taking place. This could be relevant for some kernels (see kernel_new). |
This function implements Markov Chain Monte Carlo (MCMC) using the Metropolis-Hastings ratio with flexible transition kernels. Users can specify either one of the available transition kernels or define one of their own (see kernels). Furthermore, it allows easy parallel implementation by running multiple chains in parallel. The function also allows using convergence diagnostics tests to set up a criterion for automatically stopping the algorithm (see convergence-checker).
The canonical form of the Metropolis-Hastings algorithm consists of accepting a move from state theta to state theta' based on the Hastings ratio r(theta, theta'):

r(theta, theta') = [f(theta') q(theta | theta')] / [f(theta) q(theta' | theta)]

where f is the unnormalized density of the specified distribution (the posterior probability), and q(theta' | theta) is the conditional probability of moving from state theta to state theta' (the proposal density). The move is then accepted with probability min(1, r(theta, theta')).

Observe that, in the case that q is symmetric, meaning q(theta' | theta) = q(theta | theta'), the Hastings ratio reduces to f(theta') / f(theta). Starting with version 0.5-0, the values of the log unnormalized density and the proposed states theta' can be accessed using the functions get_logpost() and get_draws().
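As a minimal sketch of the acceptance rule in log scale (the log-posterior values below are made up for illustration):

logf0 <- -3.2 # log f(theta), current state
logf1 <- -2.9 # log f(theta'), proposed state

# With a symmetric kernel, accept with probability min(1, f(theta')/f(theta))
accept <- log(runif(1)) < (logf1 - logf0)
accept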
We now give details of the various options included in the function. The function MCMC_without_conv_checker is for internal use only.
MCMC returns an object of class coda::mcmc from the coda package. The mcmc object is a matrix with one column per parameter and nsteps rows. If nchains > 1, then it returns a coda::mcmc.list.

While the main output of MCMC is the mcmc (.list) object, other information and intermediate outputs of the process are stored in MCMC_OUTPUT. For information about how to retrieve/set data, see mcmc-output.
By default, if initial is of class mcmc, MCMC will take the last nchains points from the chain as the starting point for the new sequence. If initial is of class mcmc.list, the number of chains in initial must match the nchains parameter.

If initial is a vector, then it must be of length equal to the number of parameters used in the model. When using multiple chains, if initial is not an object of class mcmc or mcmc.list, then it must be a numeric matrix with as many rows as chains, and as many columns as parameters in the model.
When nchains > 1, the function will run multiple chains. Furthermore, if cl is not passed, MCMC will create a PSOCK cluster using makePSOCKcluster with up to parallel::detectCores cores and attempt to execute using multiple cores. Internally, the function does the following:
# Creating the cluster
ncores <- parallel::detectCores()
ncores <- ifelse(nchains < ncores, nchains, ncores)
cl     <- parallel::makePSOCKcluster(ncores)

# Loading the package and setting the seed using clusterRNGStream
invisible(parallel::clusterEvalQ(cl, library(fmcmc)))
parallel::clusterSetRNGStream(cl, .Random.seed)
When running in parallel, objects that are used within fun must be passed through ..., otherwise the cluster will return with an error.
The user controls the initial values of the parameters of the MCMC algorithm using the argument initial. When using multiple chains, i.e., nchains > 1, the user can specify multiple starting points, which is recommended. In such a case, each row of initial is used as the starting point for each of the chains. If initial is a vector and nchains > 1, the value is recycled, so all chains start from the same point (not recommended; the function throws a warning message).
By default, no automatic stop is implemented. If one of the functions in convergence-checker is used, then the MCMC is run in bulks as specified by the convergence checker function, and the algorithm stops if the conv_checker returns TRUE. For more information, see convergence-checker.
Brooks, S., Gelman, A., Jones, G. L., & Meng, X. L. (2011). Handbook of Markov Chain Monte Carlo. Chapman & Hall/CRC.
Vega Yon, G., & Marjoram, P. (2019). fmcmc: A friendly MCMC framework. Journal of Open Source Software, 4(39), 1427. doi:10.21105/joss.01427
get_logpost() and get_draws() (mcmc-output) for post-execution of MCMC, and ith_step() for accessing objects within an MCMC call.
# Univariate distributed data with multiple parameters ---------------------------
# Parameters
set.seed(1231)
n    <- 1e3
pars <- c(mean = 2.6, sd = 3)

# Generating data and writing the log likelihood function
D <- rnorm(n, pars[1], pars[2])
fun <- function(x) {
  x <- log(dnorm(D, x[1], x[2]))
  sum(x)
}

# Calling MCMC, but first, loading the coda R package for diagnostics
library(coda)
ans <- MCMC(
  fun, initial = c(mu = 1, sigma = 1), nsteps = 2e3,
  kernel = kernel_normal_reflective(scale = .1, ub = 10, lb = 0)
)

# Plotting the output
oldpar <- par(no.readonly = TRUE)
par(mfrow = c(1, 2))
boxplot(as.matrix(ans),
  main  = expression("Posterior distribution of"~mu~and~sigma),
  names = expression(mu, sigma), horizontal = TRUE,
  col   = blues9[c(4, 9)],
  sub   = bquote(mu == .(pars[1])~", and"~sigma == .(pars[2]))
)
abline(v = pars, col = blues9[c(4, 9)], lwd = 2, lty = 2)

plot(apply(as.matrix(ans), 1, fun), type = "l",
  main = "LogLikelihood",
  ylab = expression(L("{"~mu,sigma~"}"~"|"~D))
)
par(oldpar)

# In this example we estimate the parameters of a dataset ------------------------
# with 5,000 draws from a multivariate normal with parameters M and S.

# Loading the required packages
library(mvtnorm)
library(coda)

# Parameters and data simulation
S <- cbind(c(.8, .2), c(.2, 1))
M <- c(0, 1)

set.seed(123)
D <- rmvnorm(5e3, mean = M, sigma = S)

# Function to pass to MCMC
fun <- function(pars) {
  # Putting the parameters in a sensible way
  m <- pars[1:2]
  s <- cbind(
    c(pars[3], pars[4]),
    c(pars[4], pars[5])
  )

  # Computing the unnormalized log likelihood
  sum(log(dmvnorm(D, m, s)))
}

# Calling MCMC
ans <- MCMC(
  initial = c(mu0 = 5, mu1 = 5, s0 = 5, s01 = 0, s2 = 5),
  fun,
  kernel  = kernel_normal_reflective(
    lb    = c(-10, -10, .01, -5, .01),
    ub    = 5,
    scale = 0.01
  ),
  nsteps  = 1e3,
  thin    = 10,
  burnin  = 5e2
)

# Checking out the outcomes
plot(ans)
summary(ans)

# Multiple chains -----------------------------------------------------------------
# As we want to run -fun- in multiple cores, we have to
# pass -D- explicitly (unless using Fork Clusters),
# just like specifying that we are calling a function from the
# -mvtnorm- package.
fun <- function(pars, D) {
  # Putting the parameters in a sensible way
  m <- pars[1:2]
  s <- cbind(
    c(pars[3], pars[4]),
    c(pars[4], pars[5])
  )

  # Computing the unnormalized log likelihood
  sum(log(mvtnorm::dmvnorm(D, m, s)))
}

# Two chains
ans <- MCMC(
  initial = c(mu0 = 5, mu1 = 5, s0 = 5, s01 = 0, s2 = 5),
  fun,
  nchains = 2,
  kernel  = kernel_normal_reflective(
    lb    = c(-10, -10, .01, -5, .01),
    ub    = 5,
    scale = 0.01
  ),
  nsteps  = 1e3,
  thin    = 10,
  burnin  = 5e2,
  D       = D
)

summary(ans)

# Example using a user-defined cl object --------------------------------------------
# A silly function that gets the first two parameters of the vector.
# Using the default multicore example will cause an error.
get_m <- function(pars) {
  pars[1:2]
}

fun <- function(pars, D) {
  # Putting the parameters in a sensible way
  m <- get_m(pars)
  s <- cbind(
    c(pars[3], pars[4]),
    c(pars[4], pars[5])
  )

  # Computing the unnormalized log likelihood
  sum(log(mvtnorm::dmvnorm(D, m, s)))
}

if (FALSE) {
  # This will fail with the error
  # Error in checkForRemoteErrors(val) :
  #   4 nodes produced errors; first error: could not find function "get_m"
  ans <- MCMC(
    initial = c(mu0 = 5, mu1 = 5, s0 = 5, s01 = 0, s2 = 5),
    fun,
    kernel  = kernel_normal_reflective(
      lb    = c(-10, -10, .01, -5, .01),
      ub    = 5,
      scale = 0.01
    ),
    nsteps  = 1e3,
    thin    = 10,
    burnin  = 5e2,
    D       = D,
    # Multiple chains using parallel computing
    multicore = TRUE,
    nchains   = 4
  )
  summary(ans)
}

# To solve the error, we need to build the cluster object
library(parallel)
cl <- makePSOCKcluster(4)

# Export the function `get_m` to the cluster. The function `fun` and data `D`
# are automatically exported by the `MCMC` function.
clusterExport(cl, "get_m")

# Run the MCMC
ans <- MCMC(
  initial = c(mu0 = 5, mu1 = 5, s0 = 5, s01 = 0, s2 = 5),
  fun,
  kernel  = kernel_normal_reflective(
    lb    = c(-10, -10, .01, -5, .01),
    ub    = 5,
    scale = 0.01
  ),
  nsteps  = 1e3,
  thin    = 10,
  burnin  = 5e2,
  D       = D,
  # Multiple chains using parallel computing
  multicore = TRUE,
  nchains   = 4,
  # Use the cluster object
  cl        = cl
)

summary(ans)
You can use these functions to read variables, store, and retrieve data during the MCMC process.
ith_step(x)

set_userdata(...)

get_userdata()
x |
Name of the element to retrieve. If missing, it will return the entire environment in which the main MCMC loop is running. |
... |
Named values to be appended to the user data. |
The function ith_step() provides access to the following elements:

i : (int) Step (iteration) number.
nsteps : (int) Number of steps.
chain_id : (int) Id of the chain (goes from 1 to -nchains-).
theta0 : (double vector) Current state of the chain.
theta1 : (double vector) Proposed state of the chain.
ans : (double matrix) Set of accepted states (it will be NA for rows >= i).
draws : (double matrix) Set of proposed states (it will be NA for rows >= i).
logpost : (double vector) Value of -fun- (it will be NA for elements >= i).
R : (double vector) Random values from U(0,1). This is used with the Hastings ratio.
thin : (int) Thinning (applied after the last step).
burnin : (int) Burn-in (applied after the last step).
conv_checker : (function) Convergence checker function.
kernel : (fmcmc_kernel) Kernel object.
fun : (function) Function passed to MCMC.
f : (function) Wrapper of -fun-.
initial : (double vector) Starting point of the chain.

The following objects always have fixed values (see ?ith_step): nchains, cl, multicore.

Other available objects: cnames, funargs, MCMC_OUTPUT, passedargs, progress.
The function set_userdata() returns invisible(). The only side effect is appending the information by row.
The function ith_step() is a convenience function that provides access to the environment within which the main loop of the MCMC call is being evaluated. It is a wrapper of MCMC_OUTPUT$loop_envir that will either return the value x or, if missing, the entire environment. If ith_step() is called outside of an MCMC call, it returns with an error.
For example, if you wanted to print information if the current value of logpost is greater than the previous value of logpost, you could define the objective function as follows:
f <- function(p) {
  i            <- ith_step("i")
  logpost_prev <- ith_step("logpost")[i - 1L]
  logpost_curr <- sum(dnorm(y - x*p, log = TRUE))

  if (logpost_prev < logpost_curr)
    cat("At a higher point!\n")

  return(logpost_curr)
}
In the case of the objects nchains, cl, and multicore, the function will always return the default values 1, NULL, and FALSE, respectively. Thus, the user shouldn't rely on these objects to provide information regarding runs using multiple chains. More examples below.
# Getting the logpost -------------------------------------------------------------
set.seed(23133)
x <- rnorm(200)
y <- x*2 + rnorm(200)
f <- function(p) {
  sum(dnorm(y - x*p, log = TRUE))
}

ans <- MCMC(fun = f, initial = c(0), nsteps = 2000)
plot(get_logpost(), type = "l") # Plotting the logpost from the last run

# Printing information every 500 steps --------------------------------------------
# For this we use ith_step()
f <- function(p) {

  # Capturing info from within the loop
  i      <- ith_step("i")
  nsteps <- ith_step("nsteps")

  if (!(i %% 500)) {
    cat(
      "////////////////////////////////////////////////////\n",
      "Step ", i, " of ", nsteps, ". Values in the loop:\n",
      "theta0: ", ith_step("theta0"), "\n",
      "theta1: ", ith_step()$theta1, "\n",
      sep = ""
    )
  }

  sum(dnorm(y - x*p, log = TRUE))
}

ans0 <- MCMC(fun = f, initial = c(0), nsteps = 2000, progress = FALSE, seed = 22)
# ////////////////////////////////////////////////////
# Step 500 of 2000. Values in the loop:
# theta0: 2.025379
# theta1: 1.04524
# ////////////////////////////////////////////////////
# Step 1000 of 2000. Values in the loop:
# theta0: 2.145967
# theta1: 0.2054037
# ////////////////////////////////////////////////////
# Step 1500 of 2000. Values in the loop:
# theta0: 2.211691
# theta1: 2.515361
# ////////////////////////////////////////////////////
# Step 2000 of 2000. Values in the loop:
# theta0: 1.998789
# theta1: 1.33034

# Printing information if the current logpost is greater than max -----------------
f <- function(p) {
  i            <- ith_step("i")
  logpost_prev <- max(ith_step("logpost")[1:(i - 1)])
  logpost_curr <- sum(dnorm(y - x*p, log = TRUE))

  # Only worthwhile after the first step
  if ((i > 1L) && logpost_prev < logpost_curr)
    cat("At a higher point!:", logpost_curr, ", step:", i, "\n")

  return(logpost_curr)
}

ans1 <- MCMC(fun = f, initial = c(0), nsteps = 1000, progress = FALSE, seed = 22)
# At a higher point!: -357.3584 , step: 2
# At a higher point!: -272.6816 , step: 6
# At a higher point!: -270.9969 , step: 7
# At a higher point!: -269.8128 , step: 24
# At a higher point!: -269.7435 , step: 46
# At a higher point!: -269.7422 , step: 543
# At a higher point!: -269.7421 , step: 788

# Saving extra information ---------------------------------------------------------
data("lifeexpect")

# Defining the logposterior
logpost <- function(p) {

  # Recording the sum of the parameters (just because)
  # and the number of the step
  set_userdata(i = ith_step("i"), sum_of_p = sum(p))

  with(lifeexpect, {
    sum(dnorm(age - (p[1] + p[2]*smoke + p[3]*female), sd = p[4], log = TRUE))
  })
}

# A kernel where sd is positive; the first parameter is average age, so we
# make it positive too
kern <- kernel_ram(lb = c(10, -20, -20, .0001), eps = .01)
ans <- MCMC(
  initial = c(70, -2, 2, 1),
  fun     = logpost,
  kernel  = kern,
  nsteps  = 1000,
  seed    = 1
)

# Retrieving the data
head(get_userdata())

# It also works using multiple chains
ans_two <- MCMC(
  initial = c(70, -2, 2, 1),
  fun     = logpost,
  kernel  = kern,
  nsteps  = 1000,
  seed    = 1,
  nchains = 2
)

user_dat <- get_userdata()
lapply(user_dat, head)
Information about the last MCMC call

This environment holds a copy of the last call to MCMC, including the start and end time (to compute total elapsed time) of the call. Since the resulting object of MCMC is an object of class coda::mcmc, this is a way to capture more information in case the user needs it.
get_(x)
get_logpost()
get_draws()
get_elapsed()
get_initial()
get_fun()
get_nsteps()
get_seed()
get_nchains()
get_burnin()
get_thin()
get_kernel()
get_multicore()
get_conv_checker()
get_cl()
get_progress()
get_chain_id()
x |
Character scalar. Name of an argument to retrieve. |
The function get_logpost returns the logposterior value at each iteration. The values correspond to a named numeric vector. If nchains > 1, then it will return a list of length nchains with the corresponding logpost values for each chain.

The function get_draws() retrieves the proposed states from the kernel function.
# Getting the logpost --------------------------------------------------------------
set.seed(23133)
x <- rnorm(200)
y <- -4 + x*2 + rnorm(200)
f <- function(p) {
  sum(dnorm(y - p[1] - x*p[2], log = TRUE))
}

# Setting an AM kernel
kern <- kernel_am(eps = 1e-2)

ans <- MCMC(fun = f, initial = c(0, 1), nsteps = 2000, kernel = kern)

plot(
  # Plotting the logpost from the last run
  -get_logpost(),
  # Getting the number of chains
  main = paste0("nchains: ", get_nchains()),
  # And the elapsed time
  sub  = sprintf("Run time: %.4f(s)", get_elapsed()[3]),
  type = "l",
  log  = "y"
)

# This also works using multiple chains
ans <- MCMC(fun = f, initial = c(0, 0), nsteps = 2000, nchains = 2, kernel = kern)

# In this case, just like -ans-, -draws- has one element per chain
draws <- get_draws()

# Plotting proposed points vs accepted
plot(
  draws[[1]], pch = 20,
  col  = adjustcolor("gray", alpha = .5),
  main = "Accepted vs proposed states\n(chain 1)"
)
lines(ans[[1]], pch = 20, col = "tomato", lwd = 2)
legend(
  "topleft", legend = c("Accepted", "Proposed"),
  pch = c(NA, 20), col = c("tomato", "black"),
  lty = c(1, NA), lwd = c(2, NA)
)
A simple progress bar. This function is used in MCMC when the function has been called with a single processor.
new_progress_bar(
  n,
  probs  = c(0, 0.25, 0.5, 0.75, 1),
  width  = getOption("width", 80),
  symbol = "/",
  ...
)
n |
Integer. Number of steps. |
probs |
Double vector. Quantiles at which to place marks. |
width |
Integer. Width of the bar in characters. |
symbol |
Character. Single character symbol to print each bar. |
... |
Further arguments passed to |
A function that can be included in the interior of a loop to mark the progress of the loop. It receives a single argument, i, which is the number of the current step.
x <- new_progress_bar(20)
for (i in 1:20) {
  Sys.sleep(2/20)
  x(i)
}
Parameters' update sequence
plan_update_sequence(k, nsteps, fixed, scheme)
k |
Integer. Number of parameters |
nsteps |
Integer. Number of steps. |
fixed |
Logical scalar or vector of length k. Indicates which parameters are fixed (not updated). |
scheme |
Scheme in which the proposals are made (see details). |
The parameter scheme, present in the currently available kernels, sets the way in which proposals are made. By default, scheme = "joint": proposals are done jointly, i.e., at each step of the chain we propose new states for each parameter of the model. When scheme = "ordered", a sequential update scheme is followed, in which, at each step of the chain, proposals are made one variable at a time. If scheme = "random", proposals are also made one variable at a time, but in a random order.

Finally, users can specify their own sequence of proposals for the variables by passing a numeric vector to scheme. For example, if the user wants to make sequential proposals following the order 2, 1, 3, then scheme must be set to scheme = c(2, 1, 3).
A logical vector of size nsteps x k.
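A minimal sketch of what the returned plan looks like when calling the function directly (assuming each of the nsteps rows flags the parameters to be updated at that step):

# Three parameters, six steps, sequential ("ordered") updates
plan <- plan_update_sequence(k = 3, nsteps = 6, fixed = FALSE, scheme = "ordered")
plan # step 1 updates parameter 1 only, step 2 updates parameter 2, and so on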
Adjust a proposal according to its support by reflecting it. This is the workhorse of kernel_normal_reflective and kernel_unif_reflective. It is intended for internal use only.
reflect_on_boundaries(x, lb, ub, which)
x |
A numeric vector. The proposal |
lb , ub
|
Numeric vectors of length |
which |
Integer vector. Index of variables to be updated. |
An adjusted proposal vector.
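A small sketch of the expected behavior under the reflection rule described for kernel_normal_reflective (the values in the comment are implied by the rule, not verified output):

# With bounds [0, 1]: 1.3 exceeds ub by 0.3 and should reflect to 0.7;
# -0.2 falls below lb by 0.2 and should reflect to 0.2.
reflect_on_boundaries(x = c(1.3, -0.2), lb = c(0, 0), ub = c(1, 1), which = 1:2)
# Expected: 0.7 0.2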