Package 'methcon5'

Title: Identify and Rank CpG DNA Methylation Conservation Along the Human Genome
Description: Identify and rank CpG DNA methylation conservation along the human genome. The package includes bootstrapping methods that adjust the ranking for differences in region length, because without such an adjustment short regions tend to receive higher conservation scores.
Authors: Emil Hvitfeldt [aut, cre]
Maintainer: Emil Hvitfeldt <[email protected]>
License: MIT + file LICENSE
Version: 0.1.0.9000
Built: 2024-09-23 02:57:42 UTC
Source: https://github.com/uscbiostats/methcon5

Simple simulated methylation dataset

Description

Simple simulated methylation dataset

Usage

fake_methylation

Format

A data frame with 2771 rows and 3 variables: gene, cons_level and meth.

Details

This dataset is for example use only. It contains 500 genes, identified by gene, each with one of 3 conservation levels: "low", "medium" or "high". The methylation values are drawn independently at random within each gene, so no spatial correlation is assumed.
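
The structure described above can be imitated with a short base R sketch. This is not how fake_methylation was generated. The object name sim_methylation, the number of sites per gene, the mean of 0.5 and the per-level standard deviations are all assumptions chosen for illustration.

```r
# Hypothetical re-creation of a dataset shaped like fake_methylation.
# All sizes and distribution parameters below are illustrative assumptions.
set.seed(1)

n_genes <- 500
# Assumed: each gene has between 1 and 10 CpG sites
sites_per_gene <- sample(1:10, n_genes, replace = TRUE)
# One conservation level per gene
gene_level <- sample(c("low", "medium", "high"), n_genes, replace = TRUE)

# Assumed: more conserved genes vary less around a common mean
level_sd <- c(low = 0.30, medium = 0.15, high = 0.05)

sim_methylation <- data.frame(
  gene = rep(seq_len(n_genes), times = sites_per_gene),
  cons_level = rep(gene_level, times = sites_per_gene),
  stringsAsFactors = FALSE
)

# Independent draws per row: neighbouring sites are not correlated
sim_methylation$meth <- rnorm(
  nrow(sim_methylation),
  mean = 0.5,
  sd = level_sd[sim_methylation$cons_level]
)
# Keep values inside the [0, 1] range expected of methylation levels
sim_methylation$meth <- pmin(pmax(sim_methylation$meth, 0), 1)

str(sim_methylation)
```

Running str(fake_methylation) on the packaged dataset gives the same kind of overview for the real example data.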


Calculate region wise summary statistics

Description

Takes a data.frame and applies a function ('fun') to 'value' within the groups defined by the 'id' column.

Usage

meth_aggregate(data, id, value, fun = mean, ...)

Arguments

data

a data.frame.

id

variable name, the column that defines the groups (regions) to aggregate over.

value

variable name, the column containing the values to be summarised by 'fun'. Must be a single column.

fun

function, summary statistic function to be calculated. Defaults to 'mean'.

...

Additional arguments for the function given to the argument fun.

Details

Note that the row order of the data matters for aggregation functions whose result depends on the order of the values within a group.
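
The effect of row order can be seen on a toy example in base R, without calling the package. The data frame region and its values are made up for illustration. mean_diff is the same custom function used in the examples below.

```r
# Toy data: one region with three values, in two different row orders
region <- data.frame(id = "a", value = c(0.1, 0.5, 0.2))
reordered <- region[c(3, 1, 2), ]  # values become 0.2, 0.1, 0.5

mean_diff <- function(x) mean(diff(x))

# mean() does not depend on the order of its input
mean_original <- mean(region$value)
mean_reordered <- mean(reordered$value)

# mean(diff()) uses differences between neighbouring rows, so order matters
diff_original <- mean_diff(region$value)
diff_reordered <- mean_diff(reordered$value)

isTRUE(all.equal(mean_original, mean_reordered))  # TRUE
isTRUE(all.equal(diff_original, diff_reordered))  # FALSE
```

For order-sensitive functions it is therefore worth making sure the rows are in the intended order, for example by genomic position, before aggregating.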

Value

A methcon object. Contains the aggregated data along with the original data.frame and the variable selections.

Examples

meth_aggregate(fake_methylation, id = gene, value = meth, fun = mean)

meth_aggregate(fake_methylation, id = gene, value = meth, fun = var)

# custom functions can be used as well
mean_diff <- function(x) {
  mean(diff(x))
}

meth_aggregate(fake_methylation, id = gene, value = meth, fun = mean_diff)

Bootstrap randomly sampled values

Description

"perm_v1" (the default method) will sample the rows independently. "perm_v2" will sample regions of the same size while allowing overlap between different regions. "perm_v3" will sample regions under the constraint that all sampled regions are contained in the region they are sampled in.

Usage

meth_bootstrap(data, reps, method = c("perm_v1", "perm_v2", "perm_v3"))

Arguments

data

a methcon data.frame output from 'meth_aggregate', or from a previous call to 'meth_bootstrap'.

reps

Number of bootstrap repetitions. Note that the usage signature above does not set a default, so pass a value explicitly.

method

Character, determining which method to use. See the description for information about the methods. Defaults to "perm_v1".

Details

Note that you can apply 'meth_bootstrap' multiple times to get values for different methods.
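
The difference between sampling rows independently and sampling a run of neighbouring rows can be sketched in base R. This is one possible reading of "perm_v1" and "perm_v2", not the package's implementation. The toy data and the helper names sample_rows, sample_block and null_stats are hypothetical. "perm_v3" is left out to keep the sketch short.

```r
set.seed(42)

# Toy data: 6 regions of varying size, rows assumed to be in genomic order
sizes <- c(2, 5, 3, 8, 4, 6)
toy <- data.frame(
  id = rep(seq_along(sizes), times = sizes),
  value = runif(sum(sizes))
)

# Reading of "perm_v1": draw n rows independently (without replacement)
# from the whole dataset
sample_rows <- function(values, n) {
  values[sample.int(length(values), n)]
}

# Reading of "perm_v2": draw one run of n consecutive rows,
# which may straddle the boundary between two regions
sample_block <- function(values, n) {
  start <- sample.int(length(values) - n + 1, 1)
  values[start:(start + n - 1)]
}

# Resampled statistics for a region of n rows
null_stats <- function(values, n, reps, sampler, fun = mean) {
  vapply(seq_len(reps), function(i) fun(sampler(values, n)), numeric(1))
}

# Compare region 3 (3 rows) against resamples of the same length
n <- sum(toy$id == 3)
observed <- mean(toy$value[toy$id == 3])
null_v1 <- null_stats(toy$value, n, reps = 200, sampler = sample_rows)
null_v2 <- null_stats(toy$value, n, reps = 200, sampler = sample_block)

# Share of resampled statistics at or below the observed one
mean(null_v1 <= observed)
mean(null_v2 <= observed)
```

Because each resample has the same number of rows as the region it is compared with, the comparison accounts for region length, which is the adjustment the package description refers to.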

Value

A methcon object. Contains the aggregated data along with the original data.frame, the variable selections and the bootstrapped values.

Examples

# Note that you likely want to do more than 10 repetitions.
# reps = 10 was chosen to have the examples run fast.

fake_methylation %>%
  meth_aggregate(id = gene, value = meth, fun = mean) %>%
  meth_bootstrap(10)

fake_methylation %>%
  meth_aggregate(id = gene, value = meth, fun = mean) %>%
  meth_bootstrap(10, method = "perm_v2")

# Get multiple bootstraps
fake_methylation %>%
  meth_aggregate(id = gene, value = meth, fun = mean) %>%
  meth_bootstrap(10, method = "perm_v1") %>%
  meth_bootstrap(10, method = "perm_v2") %>%
  meth_bootstrap(10, method = "perm_v3")