| Title: | Tools for Analyzing Mixed Effect Regression Models |
|---|---|
| Description: | Provides methods for extracting results from mixed-effect model objects fit with the 'lme4' package. Allows construction of prediction intervals efficiently from large scale linear and generalized linear mixed-effects models. This method draws from the simulation framework used in the Gelman and Hill (2007) textbook: Data Analysis Using Regression and Multilevel/Hierarchical Models. |
| Authors: | Jared E. Knowles [aut, cre], Carl Frederick [aut], Alex Whitworth [ctb] |
| Maintainer: | Jared E. Knowles <[email protected]> |
| License: | GPL (>=2) |
| Version: | 1.0.0 |
| Built: | 2026-05-30 20:56:17 UTC |
| Source: | https://github.com/jknowles/mertools |
Extract a data frame of a single row that represents the average observation in a merMod object. This function also allows the user to pass a series of conditioning argument to calculate the average observation conditional on other characteristics.
averageObs(merMod, varList = NULL, origData = NULL, ...)averageObs(merMod, varList = NULL, origData = NULL, ...)
merMod |
a merMod object |
varList |
optional, a named list of conditions to subset the data on |
origData |
(default=NULL) a data frame containing the original, untransformed data used to call the model. This MUST be specified if the original variables used in formula function calls are NOT present as 'main effects'. |
... |
not used currently |
Each character and factor variable in the data.frame is assigned to the modal category and each numeric variable is collapsed to the mean. Currently if mode is a tie, returns a "." Uses the collapseFrame function.
For models with a scalar left-hand side (e.g. lmer(y ~ ...)), the
response column is included in the output and is set to the mean of the
observed response. For models with a matrix-valued left-hand side –
most commonly two-column cbind() specifications in binomial GLMMs
such as glmer(cbind(successes, failures) ~ ..., family = binomial)
– the response column is omitted from the output. A matrix response
cannot be meaningfully collapsed to a single "average" value, and
averageObs() is primarily intended to produce newdata for
predict / predictInterval, both of which
ignore the response column in newdata. Callers that iterate over
names(averageObs(merMod)) or compare against merMod@frame
should not assume column parity for matrix-LHS models.
a data frame with a single row for the average observation, but with full factor levels. See details for more.
Take an entire dataframe and summarize it in one row by using the mean and mode.
collapseFrame(data)collapseFrame(data)
data |
a data.frame |
Each character and factor variable in the data.frame is assigned to the modal category and each numeric variable is collapsed to the mean. Currently if mode is a tie, returns a "."
a data frame with a single row
Draw is used to select a single observation out of an R object. Additional parameters allow the user to control how that observation is chosen in order to manipulate that observation later. This is a generic function with methods for a number of objects.
draw(object, type = c("random", "average"), varList = NULL, seed = NULL, ...) ## S3 method for class 'merMod' draw(object, type = c("random", "average"), varList = NULL, seed = NULL, ...)draw(object, type = c("random", "average"), varList = NULL, seed = NULL, ...) ## S3 method for class 'merMod' draw(object, type = c("random", "average"), varList = NULL, seed = NULL, ...)
object |
the object to draw from |
type |
what kind of draw to make. Options include random or average |
varList |
a list specifying filters to subset the data by when making the draw |
seed |
numeric, optional argument to set seed for simulations, ignored if type="average" |
... |
additional arguments required by certain methods |
In cases of tie, ".", may be substituted for factors.
a data.frame with a single row representing the desired observation
fm1 <- lmer(Reaction ~ Days + (Days | Subject), sleepstudy) # Random case draw(fm1, type = "random") # Average draw(fm1, type = "average") # Subset draw(fm1, type = "average", varList = list("Subject" = "308"))fm1 <- lmer(Reaction ~ Days + (Days | Subject), sleepstudy) # Random case draw(fm1, type = "random") # Average draw(fm1, type = "average") # Subset draw(fm1, type = "average", varList = list("Subject" = "308"))
expectedRank calculates the expected rank and the percentile expected
rank of any random term in a merMod object. A simple ranking of the estimated
random effects (as produced by ranef) is not satisfactory
because it ignores any amount of uncertainty.
expectedRank(merMod, groupFctr = NULL, term = NULL)expectedRank(merMod, groupFctr = NULL, term = NULL)
merMod |
An object of class merMod |
groupFctr |
An optional character vector specifying the name(s) the grouping factor(s)
over which the random coefficient of interest varies. This is the
variable to the right of the pipe, |
term |
An optional character vector specifying the name(s) of the random coefficient of interest. This is the
variable to the left of the pipe, |
Inspired by Lingsma et al. (2010, see also Laird and Louis 1989), expectedRank sums the probability that each level of the grouping factor is greater than every other level of the grouping factor, similar to a two-sample t-test.
The formula for the expected rank is:
where is the standard normal distribution function,
is the estimated random effect and is the posterior
variance of the estimated random effect. We add one to the sum so that the
minimum rank is one instead of zero so that in the case where there is no
overlap between the variances of the random effects (or if the variances are
zero), the expected rank equals the actual rank. The ranks are ordered such
that the winners have ranks that are greater than the losers.
The formula for the percentile expected rank is:
where is the number of grouping factor levels. The percentile
expected rank can be interpreted as the fraction of levels that score at or
below the given level.
NOTE: expectedRank will only work under conditions that lme4::ranef
will work. One current example of when this is not the case is for
models when there are multiple terms specified per factor (e.g. uncorrelated random
coefficients for the same term, e.g.
lmer(Reaction ~ Days + (1 | Subject) + (0 + Days | Subject), data = sleepstudy))
A data.frame with the following five columns:
a character representing name of the grouping factor
a character representing the level of the grouping factor
a character representing the formula term for the group
effect estimate from lme4::ranef(, condVar=TRUE)).
the posterior variance of the estimate random effect
(from lme4::ranef(, condVar=TRUE)); named "term"_var.
The expected rank.
The percentile expected rank.
Laird NM and Louis TA. Empirical Bayes Ranking Methods. Journal of Education Statistics. 1989;14(1)29-46. Available at doi:10.2307/1164724.
Lingsma HF, Steyerberg EW, Eijkemans MJC, et al. Comparing and ranking hospitals based on outcome: results from The Netherlands Stroke Survey. QJM: An International Journal of Medicine. 2010;103(2):99-108. doi:10.1093/qjmed/hcp169
#For a one-level random intercept model m1 <- lmer(Reaction ~ Days + (1 | Subject), sleepstudy) (m1.er <- expectedRank(m1)) #For a one-level random intercept model with multiple random terms m2 <- lmer(Reaction ~ Days + (Days | Subject), sleepstudy) #ranked by the random slope on Days (m2.er1 <- expectedRank(m2, term="Days")) #ranked by the random intercept (m2.er2 <- expectedRank(m2, term="int")) #For a two-level model with random intercepts m3 <- lmer(y ~ service * dept + (1|s) + (1|d), InstEval) #Ranked by the random intercept on 's' (m3.er1 <- expectedRank(m3, groupFctr="s", term="Intercept"))#For a one-level random intercept model m1 <- lmer(Reaction ~ Days + (1 | Subject), sleepstudy) (m1.er <- expectedRank(m1)) #For a one-level random intercept model with multiple random terms m2 <- lmer(Reaction ~ Days + (Days | Subject), sleepstudy) #ranked by the random slope on Days (m2.er1 <- expectedRank(m2, term="Days")) #ranked by the random intercept (m2.er2 <- expectedRank(m2, term="int")) #For a two-level model with random intercepts m3 <- lmer(y ~ service * dept + (1|s) + (1|d), InstEval) #Ranked by the random intercept on 's' (m3.er1 <- expectedRank(m3, groupFctr="s", term="Intercept"))
Find link function family
famlink(object, resp = object@resp)famlink(object, resp = object@resp)
object |
a merMod object |
resp |
the response vector |
the link function and family
Display model fit summary of x or x like objects, fast
fastdisp(x, ...) ## S3 method for class 'merMod' fastdisp(x, ...) ## S3 method for class 'merModList' fastdisp(x, ...)fastdisp(x, ...) ## S3 method for class 'merMod' fastdisp(x, ...) ## S3 method for class 'merModList' fastdisp(x, ...)
x |
a model object |
... |
additional arguments to pass to |
Faster than the implementation in the arm package because it avoids refitting
The time saving is only noticeable for large, time-consuming (g)lmer fits.
A list with model summary components (call, coef,
se, ngrps, AIC, n, and fit statistics),
returned invisibly. The summary is also printed to the console.
#Compare the time for displaying this modest model require(arm) m1 <- lmer(y ~ lectage + studage + (1|d) + (1|s), data=InstEval) system.time(display(m1)) system.time(fastdisp(m1))#Compare the time for displaying this modest model require(arm) m1 <- lmer(y ~ lectage + studage + (1|d) + (1|s), data=InstEval) system.time(display(m1)) system.time(fastdisp(m1))
FEsim simulates fixed effects from merMod object posterior distributionsSimulate fixed effects from merMod
FEsim simulates fixed effects from merMod object posterior distributions
FEsim(merMod, n.sims = 200, oddsRatio = FALSE, seed = NULL)FEsim(merMod, n.sims = 200, oddsRatio = FALSE, seed = NULL)
merMod |
a merMod object from the lme4 package |
n.sims |
number of simulations to use |
oddsRatio |
logical, should parameters be converted to odds ratios? |
seed |
numeric, optional argument to set seed for simulations |
Use the Gelman sim technique to build fixed effect estimates and confidence intervals. Uses the sim function in the arm package
a data frame with the following columns
termName of fixed term (intercept/coefficient)
meanMean of the simulations
medianMedian of the simulations
sdStandard deviation of the simulations, NA if oddsRatio=TRUE
require(lme4) m2 <- lmer(Reaction ~ Days + (Days | Subject), sleepstudy) fe2 <- FEsim(m2, 25) head(fe2)require(lme4) m2 <- lmer(Reaction ~ Days + (Days | Subject), sleepstudy) fe2 <- FEsim(m2, 25) head(fe2)
Extract all warning msgs from a merMod object
fetch.merMod.msgs(x)fetch.merMod.msgs(x)
x |
a merMod object |
findFormFuns used by averageObs to calculate proper
averagesThe purpose is to properly derive data for the average observation in the
data by being 'aware' of formulas that contain interactions and/or function
calls. For example, in the old behavior, if the formula contained a square
term specified as I(x^2), we were returning the mean of x(^2) not the
square of mean(x).
findFormFuns(merMod, origData = NULL)findFormFuns(merMod, origData = NULL)
merMod |
the merMod object from which to draw the average observation |
origData |
(default=NULL) a data frame containing the original, untransformed data used to call the model. This MUST be specified if the original variables used in formula function calls are NOT present as 'main effects'. |
Matrix-valued response columns (e.g. the cbind(successes, failures)
left-hand side of a binomial GLMM) are detected and dropped from the
working frame before averaging, since they cannot be collapsed to a
single scalar. The returned frame therefore has no response column for
matrix-LHS models; see averageObs for rationale.
a data frame with a single row for the average observation, but with full factor levels. See details for more.
Extract fixed-effects estimates for a merModList
## S3 method for class 'merModList' fixef(object, add.dropped = FALSE, ...)## S3 method for class 'merModList' fixef(object, add.dropped = FALSE, ...)
object |
any fitted model object from which fixed effects estimates can be extracted. |
add.dropped |
for models with rank-deficient design
matrix, reconstitute the full-length parameter vector by
adding |
... |
optional additional arguments. Currently none are used in any methods. |
Extract the estimates of the fixed-effects parameters from a list of
fitted merMod models. Takes the mean of the individual fixef
objects for each of the component models in the merModList.
a named, numeric vector of fixed-effects estimates.
sim_list <- replicate(n = 10, expr = sleepstudy[sample(row.names(sleepstudy), 180),], simplify=FALSE) fml <- "Reaction ~ Days + (Days | Subject)" mod <- lmerModList(fml, data = sim_list) fixef(mod)sim_list <- replicate(n = 10, expr = sleepstudy[sample(row.names(sleepstudy), 180),], simplify=FALSE) fml <- "Reaction ~ Days + (Days | Subject)" mod <- lmerModList(fml, data = sim_list) fixef(mod)
Identify if a merMod has weights
hasWeights(merMod)hasWeights(merMod)
merMod |
the merMod object to test for weights |
TRUE or FALSE for whether the model has weights
A key example dataset used for examples in the HLM software manual. Included here for use in replicating HLM analyses in R.
hsbhsb
A data frame with 7,185 observations on the following 8 variables.
schida numeric vector, 160 unique values
mathacha numeric vector for the performance on a standardized math assessment
femalea numeric vector coded 0 for male and 1 for female
sesa numeric measure of student socio-economic status
minoritya numeric vector coded 0 for white and 1 for non-white students
schtypea numeric vector coded 0 for public and 1 for private schools
meansesa numeric, the average SES for each school in the data set
sizea numeric for the number of students in the school
The data file used for this presentation is a subsample from the 1982 High School and Beyond Survey and is used extensively in Hierarchical Linear Models by Raudenbush and Bryk. It consists of 7,185 students nested in 160 schools.
Data made available by UCLA Institute for Digital Research and Education (IDRE) online: https://stats.oarc.ucla.edu/other/hlm/hlm-mlm/introduction-to-multilevel-modeling-using-hlm/
Stephen W. Raudenbush and Anthony S. Bryk (2002). Hierarchical Linear Models: Applications and Data Analysis Methods (2nd ed.). SAGE.
data(hsb) head(hsb)data(hsb) head(hsb)
Calculate the intraclass correlation using mixed effect models
ICC(outcome, group, data, subset = NULL)ICC(outcome, group, data, subset = NULL)
outcome |
a character representing the variable of the outcome |
group |
a character representing the name of the grouping term |
data |
a data.frame |
subset |
an optional subset |
a numeric for the intraclass correlation
data(sleepstudy) ICC(outcome = "Reaction", group = "Subject", data = sleepstudy)data(sleepstudy) ICC(outcome = "Reaction", group = "Subject", data = sleepstudy)
Apply a multilevel model to a list of data frames
Apply a Bayesian multilevel model to a list of data frames
Apply a generalized linear multilevel model to a list of data frames
Apply a Bayesian generalized linear multilevel model to a list of data frames
lmerModList(formula, data, parallel = FALSE, ...) blmerModList(formula, data, parallel = FALSE, ...) glmerModList(formula, data, parallel = FALSE, ...) bglmerModList(formula, data, parallel = FALSE, ...)lmerModList(formula, data, parallel = FALSE, ...) blmerModList(formula, data, parallel = FALSE, ...) glmerModList(formula, data, parallel = FALSE, ...) bglmerModList(formula, data, parallel = FALSE, ...)
formula |
a formula to pass through compatible with merMod |
data |
a list object with each element being a data.frame |
parallel |
logical, should the models be run in parallel? Default FALSE. If so, the 'future_lapply' function from the 'future.apply' package is used. See details. |
... |
additional arguments to pass to the estimating function |
Parallel computing is provided by the 'futures' package, and its extension the 'future.apply' package to provide the 'future_lapply' function for easy parallel computations on lists. To use this package, simply register a parallel backend using the 'plan()' function from 'futures' - an example is to use 'plan(multisession)'
a list of fitted merMod objects of class merModList
a merModList
a merModList
a merModList
sim_list <- replicate(n = 10, expr = sleepstudy[sample(row.names(sleepstudy), 180),], simplify=FALSE) fml <- "Reaction ~ Days + (Days | Subject)" mod <- lmerModList(fml, data = sim_list) summary(mod)sim_list <- replicate(n = 10, expr = sleepstudy[sample(row.names(sleepstudy), 180),], simplify=FALSE) fml <- "Reaction ~ Days + (Days | Subject)" mod <- lmerModList(fml, data = sim_list) summary(mod)
Extract averaged fixed effect parameters across a list of merMod objects
modelFixedEff(modList, ...)modelFixedEff(modList, ...)
modList |
an object of class merModList |
... |
additional arguments to pass to |
The Rubin correction for combining estimates and standard errors from Rubin (1987) is applied to adjust for the within and between imputation variances.
a data.frame of the averaged fixed effect parameters
sim_list <- replicate(n = 10, expr = sleepstudy[sample(row.names(sleepstudy), 180),], simplify=FALSE) fml <- "Reaction ~ Days + (Days | Subject)" mod <- lmerModList(fml, data = sim_list) modelFixedEff(mod)sim_list <- replicate(n = 10, expr = sleepstudy[sample(row.names(sleepstudy), 180),], simplify=FALSE) fml <- "Reaction ~ Days + (Days | Subject)" mod <- lmerModList(fml, data = sim_list) modelFixedEff(mod)
Extract model information from a merMod
modelInfo(object)modelInfo(object)
object |
a merMod object |
Simple summary information about the object, number of observations, number of grouping terms, AIC, and residual standard deviation
sim_list <- replicate(n = 10, expr = sleepstudy[sample(row.names(sleepstudy), 180),], simplify=FALSE) fml <- "Reaction ~ Days + (Days | Subject)" mod <- lmerModList(fml, data = sim_list) modelInfo(mod[[1]]) lapply(mod, modelInfo)sim_list <- replicate(n = 10, expr = sleepstudy[sample(row.names(sleepstudy), 180),], simplify=FALSE) fml <- "Reaction ~ Days + (Days | Subject)" mod <- lmerModList(fml, data = sim_list) modelInfo(mod[[1]]) lapply(mod, modelInfo)
Extract data.frame of random effect statistics from merMod List
modelRandEffStats(modList)modelRandEffStats(modList)
modList |
a list of multilevel models |
a data.frame
sim_list <- replicate(n = 10, expr = sleepstudy[sample(row.names(sleepstudy), 180),], simplify=FALSE) fml <- "Reaction ~ Days + (Days | Subject)" mod <- lmerModList(fml, data = sim_list) modelRandEffStats(mod)sim_list <- replicate(n = 10, expr = sleepstudy[sample(row.names(sleepstudy), 180),], simplify=FALSE) fml <- "Reaction ~ Days + (Days | Subject)" mod <- lmerModList(fml, data = sim_list) modelRandEffStats(mod)
Extract all warning msgs from a merMod object
plot_sim_error_chks( type = c("FE", "RE"), level = 0.95, stat = c("mean", "median"), sd = TRUE, sigmaScale = NULL, oddsRatio = FALSE, labs = FALSE, facet = TRUE )plot_sim_error_chks( type = c("FE", "RE"), level = 0.95, stat = c("mean", "median"), sd = TRUE, sigmaScale = NULL, oddsRatio = FALSE, labs = FALSE, facet = TRUE )
type |
check a fixed or random effect |
level |
the width of the confidence interval |
stat |
a character value indicating the variable name in data of the midpoint of the estimated interval, e.g. "mean" or "median" |
sd |
a logical indicating whether or not to plot error bars around
the estimates (default is TRUE). Calculates the width of the error bars
based on |
sigmaScale |
a numeric value to divide the estimate and the standard deviation by in the case of doing an effect size calculation |
oddsRatio |
logical, should the parameters be converted to odds ratios before plotting |
labs |
logical, include the labels of the groups on the x-axis |
facet |
Accepts either logical ( |
Plot the simulated fixed effects on a ggplot2 chart
plotFEsim( data, level = 0.95, stat = "median", sd = TRUE, intercept = FALSE, sigmaScale = NULL, oddsRatio = FALSE )plotFEsim( data, level = 0.95, stat = "median", sd = TRUE, intercept = FALSE, sigmaScale = NULL, oddsRatio = FALSE )
data |
a data.frame generated by |
level |
the width of the confidence interval |
stat |
a character value indicating the variable name in data of the midpoint of the estimated interval, e.g. "mean" or "median" |
sd |
logical, indicating whether or not to plot error bars around
the estimates (default is TRUE). Calculates the width of the error bars
based on |
intercept |
logical, should the intercept be included, default is FALSE |
sigmaScale |
a numeric value to divide the estimate and the standard deviation by in the case of doing an effect size calculation |
oddsRatio |
logical, should the parameters be converted to odds ratios before plotting |
a ggplot2 plot of the coefficient effects
fm1 <- lmer(Reaction ~ Days + (Days | Subject), sleepstudy) (p1 <- plotFEsim(FEsim(fm1)))fm1 <- lmer(Reaction ~ Days + (Days | Subject), sleepstudy) (p1 <- plotFEsim(FEsim(fm1)))
Plot the output of one or more REimpact calls on a
single chart. Each point is the weighted-average fitted value for a bin of
the expected-rank distribution of a grouping factor, with a confidence
interval derived from the weighted standard error. Supplying a named list of
REimpact results overlays them on the same axes – for example to
compare the influence of two different grouping factors, or the same factor
across two models – which previously required hand-assembling the data
frames (#84).
plotREimpact(data, level = 0.95, facet = TRUE, point_size = 2.5)plotREimpact(data, level = 0.95, facet = TRUE, point_size = 2.5)
data |
either a single data.frame produced by |
level |
the width of the confidence interval (default 0.95). |
facet |
logical, facet the plot by |
point_size |
numeric size of the plotted central points. |
a ggplot2 object.
m1 <- lmer(Reaction ~ Days + (1 | Subject), sleepstudy) imp <- REimpact(m1, newdata = sleepstudy[1, ], groupFctr = "Subject", breaks = 4) plotREimpact(imp) # Compare two grouping factors on the same chart g1 <- lmer(y ~ lectage + studage + (1 | d) + (1 | s), data = InstEval) d_eff <- REimpact(g1, newdata = InstEval[1, ], groupFctr = "d", breaks = 4) s_eff <- REimpact(g1, newdata = InstEval[1, ], groupFctr = "s", breaks = 4) plotREimpact(list("Instructor (d)" = d_eff, "Student (s)" = s_eff))m1 <- lmer(Reaction ~ Days + (1 | Subject), sleepstudy) imp <- REimpact(m1, newdata = sleepstudy[1, ], groupFctr = "Subject", breaks = 4) plotREimpact(imp) # Compare two grouping factors on the same chart g1 <- lmer(y ~ lectage + studage + (1 | d) + (1 | s), data = InstEval) d_eff <- REimpact(g1, newdata = InstEval[1, ], groupFctr = "d", breaks = 4) s_eff <- REimpact(g1, newdata = InstEval[1, ], groupFctr = "s", breaks = 4) plotREimpact(list("Instructor (d)" = d_eff, "Student (s)" = s_eff))
Plot the simulated random effects on a ggplot2 chart. Points that
are distinguishable from zero (i.e. the confidence band based on level
does not cross the red line) are highlighted. Currently, the plots are ordered
according to the grouping factor.
plotREsim( data, level = 0.95, stat = "median", sd = TRUE, sigmaScale = NULL, oddsRatio = FALSE, labs = FALSE, facet = TRUE )plotREsim( data, level = 0.95, stat = "median", sd = TRUE, sigmaScale = NULL, oddsRatio = FALSE, labs = FALSE, facet = TRUE )
data |
a data.frame generated by |
level |
the width of the confidence interval |
stat |
a character value indicating the variable name in data of the midpoint of the estimated interval, e.g. "mean" or "median" |
sd |
a logical indicating whether or not to plot error bars around
the estimates (default is TRUE). Calculates the width of the error bars
based on |
sigmaScale |
a numeric value to divide the estimate and the standard deviation by in the case of doing an effect size calculation |
oddsRatio |
logical, should the parameters be converted to odds ratios before plotting |
labs |
logical, include the labels of the groups on the x-axis |
facet |
Accepts either logical ( |
a ggplot2 plot of the coefficient effects
fm1 <- lmer(Reaction ~ Days + (Days | Subject), sleepstudy) (p1 <- plotREsim(REsim(fm1))) #Plot just the random effects for the Days slope (p2 <- plotREsim(REsim(fm1), facet= list(groupFctr= "Subject", term= "Days")))fm1 <- lmer(Reaction ~ Days + (Days | Subject), sleepstudy) (p1 <- plotREsim(REsim(fm1))) #Plot just the random effects for the Days slope (p2 <- plotREsim(REsim(fm1), facet= list(groupFctr= "Subject", term= "Days")))
This function provides a way to capture model uncertainty in
predictions from multi-level models fit with lme4. By drawing a sampling
distribution for the random and the fixed effects and then estimating the fitted
value across that distribution, it is possible to generate a prediction interval
for fitted values that includes all variation in the model except for variation
in the covariance parameters, theta. This is a much faster alternative than
bootstrapping for models fit to medium to large datasets.
predictInterval( merMod, newdata = NULL, which = c("full", "fixed", "random", "all"), level = 0.8, n.sims = 1000, stat = c("median", "mean"), type = c("linear.prediction", "probability"), include.resid.var = TRUE, returnSims = FALSE, seed = NULL, .parallel = FALSE, fix.intercept.variance = FALSE, ignore.fixed.terms = NULL, new.levels = c("zero", "draw") )predictInterval( merMod, newdata = NULL, which = c("full", "fixed", "random", "all"), level = 0.8, n.sims = 1000, stat = c("median", "mean"), type = c("linear.prediction", "probability"), include.resid.var = TRUE, returnSims = FALSE, seed = NULL, .parallel = FALSE, fix.intercept.variance = FALSE, ignore.fixed.terms = NULL, new.levels = c("zero", "draw") )
merMod |
a merMod object from lme4 |
newdata |
a data.frame of new data to predict |
which |
a character specifying what to return, by default it returns the
full interval, but you can also select to return only the fixed variation or
the random component variation. If full is selected the resulting data.frame
will be |
level |
the width of the prediction interval |
n.sims |
number of simulation samples to construct |
stat |
take the median or mean of simulated intervals |
type |
type of prediction to develop |
include.resid.var |
logical, include or exclude the residual variance for linear models |
returnSims |
logical, should all n.sims simulations be returned? |
seed |
numeric, optional argument to set seed for simulations |
.parallel |
logical, use parallel computation (default FALSE) |
fix.intercept.variance |
logical; should the variance of the intercept term be adjusted downwards to roughly correct for its covariance with the random effects, as if all the random effects are intercept effects? |
ignore.fixed.terms |
a numeric or string vector of indexes or names of fixed effects which should be considered as fully known (zero variance). |
new.levels |
character, how to treat grouping levels in |
a data.frame with three columns: fit, lwr and upr. If 'returnSims' is TRUE the attribute 'sim.results' contains the full simulation array.
Summarize a merMod list
## S3 method for class 'merModList' print(x, ...)## S3 method for class 'merModList' print(x, ...)
x |
a modList of class merModList |
... |
additional arguments |
a summary object of model information
sim_list <- replicate(n = 10, expr = sleepstudy[sample(row.names(sleepstudy), 180),], simplify=FALSE) fml <- "Reaction ~ Days + (Days | Subject)" mod <- lmerModList(fml, data = sim_list) summary(mod)sim_list <- replicate(n = 10, expr = sleepstudy[sample(row.names(sleepstudy), 180),], simplify=FALSE) fml <- "Reaction ~ Days + (Days | Subject)" mod <- lmerModList(fml, data = sim_list) summary(mod)
Print the summary of a merMod list
## S3 method for class 'summary.merModList' print(x, ...)## S3 method for class 'summary.merModList' print(x, ...)
x |
a summary of amerModList object |
... |
additional arguments |
summary content printed to console
Select a random observation from the model frame of a merMod
randomObs(merMod, varList, seed = NULL)randomObs(merMod, varList, seed = NULL)
merMod |
an object of class merMod |
varList |
optional, a named list of conditions to subset the data on |
seed |
numeric, optional argument to set seed for simulations |
Each factor variable in the data frame has all factor levels from the full model.frame stored so that the new data is compatible with predict.merMod
a data frame with a single row for a random observation, but with full factor levels. See details for more.
Extract random-effects estimates for a merModList
## S3 method for class 'merModList' ranef(object, ...)## S3 method for class 'merModList' ranef(object, ...)
object |
a |
... |
some methods for these generic functions require additional arguments. |
Extract the estimates of the random-effects parameters from a list of
fitted merMod models. Takes the mean of the individual ranef
objects for each of the component models in the merModList.
a named, numeric vector of random-effects estimates.
sim_list <- replicate(n = 10, expr = sleepstudy[sample(row.names(sleepstudy), 180),], simplify=FALSE) fml <- "Reaction ~ Days + (Days | Subject)" mod <- lmerModList(fml, data = sim_list) ranef(mod)sim_list <- replicate(n = 10, expr = sleepstudy[sample(row.names(sleepstudy), 180),], simplify=FALSE) fml <- "Reaction ~ Days + (Days | Subject)" mod <- lmerModList(fml, data = sim_list) ranef(mod)
Extract the correlations between the slopes and the intercepts from a model
REcorrExtract(model)REcorrExtract(model)
model |
an object that inherits from class merMod |
a numeric vector of the correlations among the effects
fm1 <- lmer(Reaction ~ Days + (Days | Subject), sleepstudy) REcorrExtract(fm1)fm1 <- lmer(Reaction ~ Days + (Days | Subject), sleepstudy) REcorrExtract(fm1)
Extracts random effect terms from an lme4 model
REextract(merMod)REextract(merMod)
merMod |
a merMod object from the lme4 package |
a data frame with the following columns
The name of the grouping factor associated with the random effects
The level of the grouping factor associated with the random effects
One column per random effect, the name is derived from the merMod
One column per random effect, the name is derived from the merMod
m2 <- lmer(Reaction ~ Days + (Days | Subject), sleepstudy) rfx <- REextract(m2) #Note the column names head(rfx)m2 <- lmer(Reaction ~ Days + (Days | Subject), sleepstudy) rfx <- REextract(m2) #Note the column names head(rfx)
REimpact calculates the average predicted value for each row of a
new data frame across the distribution of expectedRank for a
merMod object. This allows the user to make meaningful comparisons about the
influence of random effect terms on the scale of the response variable,
for user-defined inputs, and accounting for the variability in grouping terms.
REimpact(merMod, newdata, groupFctr = NULL, term = NULL, breaks = 3, ...)REimpact(merMod, newdata, groupFctr = NULL, term = NULL, breaks = 3, ...)
merMod |
An object of class merMod |
newdata |
a data frame of observations to calculate group-level differences for |
groupFctr |
The name of the grouping factor over which the random
coefficient of interest varies. This is the variable to the right of the
pipe, |
term |
The name of the random coefficient of interest. This is the
variable to the left of the pipe, |
breaks |
an integer representing the number of bins to divide the group effects into, the default is 3; alternatively it can specify breaks from 0-100 for how to cut the expected rank distribution |
... |
additional arguments to pass to |
The function predicts the response at every level in the random effect term specified by the user. Then, the expected rank of each group level is binned to the number of bins specified by the user. Finally, a weighted mean of the fitted value for all observations in each bin of the expected ranks is calculated using the inverse of the variance as the weight – so that less precise estimates are downweighted in the calculation of the mean for the bin. Finally, a standard error for the bin mean is calculated.
This function uses the formula for variance of a weighted mean recommended by Cochran (1977).
A data.frame with all unique combinations of the number of cases, rows in the newdata element, and number of bins:
The row number of the observation from newdata.
The ranking bin for the expected rank, the higher the bin number, the greater the expected rank of the groups in that bin.
The weighted mean of the fitted values for case i in bin k
The standard deviation of the mean of the fitted values for case i in bin k.
The number of group effects contained in that bin.
Gatz, DF and Smith, L. The Standard Error of a Weighted Mean Concentration. I. Bootstrapping vs other methods. Atmospheric Environment. 1995;11(2)1185-1193. doi:10.1016/1352-2310(94)00210-C
Cochran, WG. 1977. Sampling Techniques (3rd Edition). Wiley, New York.
#For a one-level random intercept model m1 <- lmer(Reaction ~ Days + (1 | Subject), sleepstudy) m1.er <- REimpact(m1, newdata = sleepstudy[1, ], breaks = 2) #For a one-level random intercept model with multiple random terms m2 <- lmer(Reaction ~ Days + (Days | Subject), sleepstudy) #ranked by the random slope on Days m2.er1 <- REimpact(m2, newdata = sleepstudy[1, ], groupFctr = "Subject", term="Days") #ranked by the random intercept m2.er2 <- REimpact(m2, newdata = sleepstudy[1, ], groupFctr = "Subject", term="int") # You can also pass additional arguments to predictInterval through REimpact g1 <- lmer(y ~ lectage + studage + (1|d) + (1|s), data=InstEval) zed <- REimpact(g1, newdata = InstEval[9:12, ], groupFctr = "d", n.sims = 50, include.resid.var = TRUE) zed2 <- REimpact(g1, newdata = InstEval[9:12, ], groupFctr = "s", n.sims = 50, include.resid.var = TRUE) zed3 <- REimpact(g1, newdata = InstEval[9:12, ], groupFctr = "d", breaks = 5, n.sims = 50, include.resid.var = TRUE)#For a one-level random intercept model m1 <- lmer(Reaction ~ Days + (1 | Subject), sleepstudy) m1.er <- REimpact(m1, newdata = sleepstudy[1, ], breaks = 2) #For a one-level random intercept model with multiple random terms m2 <- lmer(Reaction ~ Days + (Days | Subject), sleepstudy) #ranked by the random slope on Days m2.er1 <- REimpact(m2, newdata = sleepstudy[1, ], groupFctr = "Subject", term="Days") #ranked by the random intercept m2.er2 <- REimpact(m2, newdata = sleepstudy[1, ], groupFctr = "Subject", term="int") # You can also pass additional arguments to predictInterval through REimpact g1 <- lmer(y ~ lectage + studage + (1|d) + (1|s), data=InstEval) zed <- REimpact(g1, newdata = InstEval[9:12, ], groupFctr = "d", n.sims = 50, include.resid.var = TRUE) zed2 <- REimpact(g1, newdata = InstEval[9:12, ], groupFctr = "s", n.sims = 50, include.resid.var = TRUE) zed3 <- REimpact(g1, newdata = InstEval[9:12, ], groupFctr = "d", breaks = 5, n.sims = 50, include.resid.var = TRUE)
REmargins calculates the average predicted value for each row of a
new data frame across the distribution of expectedRank for a
merMod object. This allows the user to make meaningful comparisons about the
influence of random effect terms on the scale of the response variable,
for user-defined inputs, and accounting for the variability in grouping terms.
REmargins( merMod, newdata = NULL, groupFctr = NULL, term = NULL, breaks = 4, .parallel = FALSE, ... )REmargins( merMod, newdata = NULL, groupFctr = NULL, term = NULL, breaks = 4, .parallel = FALSE, ... )
merMod |
An object of class merMod |
newdata |
a data frame of observations to calculate group-level differences for |
groupFctr |
The name of the grouping factor over which the random
coefficient of interest varies. This is the variable to the right of the
pipe, |
term |
The name of the random coefficient of interest. This is the
variable to the left of the pipe, |
breaks |
an integer representing the number of bins to divide the group effects into, the default is 3. |
.parallel |
logical should parallel computation be used, default is TRUE |
... |
additional arguments to pass to |
The function simulates the
The function predicts the response at every level in the random effect term specified by the user. Then, the expected rank of each group level is binned to the number of bins specified by the user. Finally, a weighted mean of the fitted value for all observations in each bin of the expected ranks is calculated using the inverse of the variance as the weight – so that less precise estimates are downweighted in the calculation of the mean for the bin. Finally, a standard error for the bin mean is calculated.
A data.frame with all unique combinations of the number of cases, rows in the newdata element:
The columns of the original data taken from newdata
The row number of the observation from newdata. Each row in newdata will be repeated for all unique levels of the grouping_var, term, and breaks.
The grouping variable the random effect is being marginalized over.
The term for the grouping variable the random effect is being marginalized over.
The ntile of the effect size for grouping_var and term
The original grouping value for this case
The predicted value from predictInterval for this case simulated
at the Nth ntile of the expected rank distribution of grouping_var and term
The upper bound of the predicted value.
The lower bound of the predicted value.
For each grouping term in newdata the predicted value is decomposed into its fit components via predictInterval and these are all returned here
The upper bound for the effect of each grouping term
The lower bound for the effect of each grouping term
The predicted fit with all the grouping terms set to 0 (average)
The upper bound fit with all the grouping terms set to 0 (average)
The lower bound fit with all the grouping terms set to 0 (average)
Gatz, DF and Smith, L. The Standard Error of a Weighted Mean Concentration. I. Bootstrapping vs other methods. Atmospheric Environment. 1995;11(2)1185-1193. doi:10.1016/1352-2310(94)00210-C
Cochran, WG. 1977. Sampling Techniques (3rd Edition). Wiley, New York.
fm1 <- lmer(Reaction ~ Days + (Days | Subject), sleepstudy) mfx <- REmargins(merMod = fm1, newdata = sleepstudy[1:10,]) # You can also pass additional arguments to predictInterval through REimpact g1 <- lmer(y ~ lectage + studage + (1|d) + (1|s), data=InstEval) margin_df <- REmargins(g1, newdata = InstEval[20:25, ], groupFctr = c("s"), breaks = 4) margin_df <- REmargins(g1, newdata = InstEval[20:25, ], groupFctr = c("d"), breaks = 3)fm1 <- lmer(Reaction ~ Days + (Days | Subject), sleepstudy) mfx <- REmargins(merMod = fm1, newdata = sleepstudy[1:10,]) # You can also pass additional arguments to predictInterval through REimpact g1 <- lmer(y ~ lectage + studage + (1|d) + (1|s), data=InstEval) margin_df <- REmargins(g1, newdata = InstEval[20:25, ], groupFctr = c("s"), breaks = 4) margin_df <- REmargins(g1, newdata = InstEval[20:25, ], groupFctr = c("d"), breaks = 3)
For a user specified quantile (or quantiles) of the random effect terms in a merMod object. This allows the user to easily identify the observation associated with the nth percentile effect.
REquantile(merMod, quantile, groupFctr, term = "(Intercept)")REquantile(merMod, quantile, groupFctr, term = "(Intercept)")
merMod |
a merMod object with one or more random effect levels |
quantile |
a numeric vector with values between 0 and 100 for quantiles |
groupFctr |
a character of the name of the random effect grouping factor to extract quantiles from |
term |
a character of the random effect to extract for the grouping factor specified. Default is the intercept. |
a vector of the level of the random effect grouping term that corresponds to each quantile
fm1 <- lmer(Reaction ~ Days + (Days | Subject), sleepstudy) REquantile(fm1, quantile = 0.25, groupFctr = "Subject") REquantile(fm1, quantile = 0.25, groupFctr = "Subject", term = "Days")fm1 <- lmer(Reaction ~ Days + (Days | Subject), sleepstudy) REquantile(fm1, quantile = 0.25, groupFctr = "Subject") REquantile(fm1, quantile = 0.25, groupFctr = "Subject", term = "Days")
Extract the standard deviation of the random effects from a merMod object
REsdExtract(model)REsdExtract(model)
model |
an object that inherits from class merMod |
a numeric vector for standard deviations of the random effects
fm1 <- lmer(Reaction ~ Days + (Days | Subject), sleepstudy) REsdExtract(fm1)fm1 <- lmer(Reaction ~ Days + (Days | Subject), sleepstudy) REsdExtract(fm1)
REsim simulates random effects from merMod object posterior distributionsSimulate random effects from merMod
REsim simulates random effects from merMod object posterior distributions
REsim(merMod, n.sims = 200, oddsRatio = FALSE, seed = NULL)REsim(merMod, n.sims = 200, oddsRatio = FALSE, seed = NULL)
merMod |
a merMod object from the lme4 package |
n.sims |
number of simulations to use |
oddsRatio |
logical, should parameters be converted to odds ratios? |
seed |
numeric, optional argument to set seed for simulations |
Use the Gelman sim technique to build empirical Bayes estimates. Uses the sim function in the arm package
a data frame with the following columns
groupFctrName of the grouping factor
groupIDLevel of the grouping factor
termName of random term (intercept/coefficient)
meanMean of the simulations
medianMedian of the simulations
sdStandard deviation of the simulations, NA if oddsRatio=TRUE
require(lme4) m2 <- lmer(Reaction ~ Days + (Days | Subject), sleepstudy) re2 <- REsim(m2, 25) head(re2)require(lme4) m2 <- lmer(Reaction ~ Days + (Days | Subject), sleepstudy) re2 <- REsim(m2, 25) head(re2)
Extract the Root Mean Squared Error for a lmerMod object
RMSE.merMod(merMod, scale = FALSE)RMSE.merMod(merMod, scale = FALSE)
merMod |
a lmerMod object from the lme4 package |
scale |
logical, should the result be returned on the scale of response variable standard deviations? |
a numeric which represents the RMSE
require(lme4) m2 <- lmer(Reaction ~ Days + (Days | Subject), sleepstudy) RMSE.merMod(m2)require(lme4) m2 <- lmer(Reaction ~ Days + (Days | Subject), sleepstudy) RMSE.merMod(m2)
Strips out transformations from variable names in data frames
sanitizeNames(data)sanitizeNames(data)
data |
a data.frame |
a data frame with variable names cleaned to remove factor() construction
Set up parallel environment
setup_parallel()setup_parallel()
Nothing
shinyMer launches a shiny app that allows you to interactively
explore an estimated merMod using functions from merTools.
shinyMer(merMod, simData = NULL, pos = 1)shinyMer(merMod, simData = NULL, pos = 1)
merMod |
An object of class "merMod". |
simData |
A data.frame to make predictions from (optional). If NULL, then the user can only make predictions using the data in the frame slot of the merMod object. |
pos |
The position of the environment to export function arguments to. Defaults to 1, the global environment, to allow shiny to run. |
A shiny app
Randomly reorder a dataframe by row
shuffle(data)shuffle(data)
data |
a data frame |
a data frame of the same dimensions with the rows reordered randomly
For each random effect term the function draws 'n.sims' samples from the conditional multivariate normal distribution (BLUPs + post‑var). The result is a named list where each element is an array of dimensions 'nrow(newdata)' × 'n.sims'.
simulate_random_effects( merMod, newdata, n.sims, .parallel = FALSE, seed = NULL, new.levels = c("zero", "draw") )simulate_random_effects( merMod, newdata, n.sims, .parallel = FALSE, seed = NULL, new.levels = c("zero", "draw") )
merMod |
A merMod object. |
newdata |
Data frame containing the prediction covariates. |
n.sims |
Number of simulations. |
.parallel |
Logical flag to enable parallel execution via foreach. |
seed |
Optional random seed for reproducibility. |
new.levels |
Character; how to treat grouping levels present in 'newdata' but absent from the fitted model. '"zero"' drops the random effect for such levels; ‘"draw"' samples each unobserved level’s effect from the estimated random-effect covariance ('VarCorr'). |
List of matrices (rows = observations, cols = simulations).
Strips attributes off of a data frame that come with a merMod model.frame
stripAttributes(data)stripAttributes(data)
data |
a data.frame |
a data frame with variable names cleaned to remove all attributes except for names, row.names, and class
Bootstrap a subset of an lme4 model
subBoot(merMod, n = NULL, FUN, R = 100, seed = NULL, warn = FALSE)subBoot(merMod, n = NULL, FUN, R = 100, seed = NULL, warn = FALSE)
merMod |
a valid merMod object |
n |
the number of rows to sample from the original data in the merMod object, by default will resample the entire model frame |
FUN |
the function to apply to each bootstrapped model |
R |
the number of bootstrap replicates, default is 100 |
seed |
numeric, optional argument to set seed for simulations |
warn |
logical, if TRUE, warnings from lmer will be issued, otherwise they will be suppressed default is FALSE |
This function allows users to estimate parameters of a large merMod object using bootstraps on a subset of the data.
a data.frame of parameters extracted from each of the R replications. The original values are appended to the top of the matrix.
(fm1 <- lmer(Reaction ~ Days + (Days | Subject), sleepstudy)) resultMatrix <- subBoot(fm1, n = 160, FUN = thetaExtract, R = 20)(fm1 <- lmer(Reaction ~ Days + (Days | Subject), sleepstudy)) resultMatrix <- subBoot(fm1, n = 160, FUN = thetaExtract, R = 20)
Split a data.frame by elements in a list
subsetList(data, list)subsetList(data, list)
data |
a data.frame |
list |
a named list of splitting conditions |
a data frame with values that match the conditions in the list
Print the results of a merMod list
## S3 method for class 'merModList' summary(object, ...)## S3 method for class 'merModList' summary(object, ...)
object |
a modList of class merModList |
... |
additional arguments |
summary content printed to console
sim_list <- replicate(n = 10, expr = sleepstudy[sample(row.names(sleepstudy), 180),], simplify=FALSE) fml <- "Reaction ~ Days + (Days | Subject)" mod <- lmerModList(fml, data = sim_list) print(mod)sim_list <- replicate(n = 10, expr = sleepstudy[sample(row.names(sleepstudy), 180),], simplify=FALSE) fml <- "Reaction ~ Days + (Days | Subject)" mod <- lmerModList(fml, data = sim_list) print(mod)
Create a factor variable and include unobserved levels for compatibility with model prediction functions
superFactor(x, fullLev)superFactor(x, fullLev)
x |
a vector to be converted to a factor |
fullLev |
a vector of factor levels to be assigned to x |
a factor variable with all observed levels of x and all levels of x in fullLev
regularFactor <- c("A", "B", "C") regularFactor <- factor(regularFactor) levels(regularFactor) # Now make it super newLevs <- c("D", "E", "F") regularFactor <- superFactor(regularFactor, fullLev = newLevs) levels(regularFactor) # now superregularFactor <- c("A", "B", "C") regularFactor <- factor(regularFactor) levels(regularFactor) # Now make it super newLevs <- c("D", "E", "F") regularFactor <- superFactor(regularFactor, fullLev = newLevs) levels(regularFactor) # now super
A convenience function that returns the theta parameters for a
merMod object.
thetaExtract(merMod)thetaExtract(merMod)
merMod |
a valid merMod object |
a vector of the covariance, theta, parameters from a merMod
merMod
(fm1 <- lmer(Reaction ~ Days + (Days | Subject), sleepstudy)) thetaExtract(fm1) #(a numeric vector of the covariance parameters)(fm1 <- lmer(Reaction ~ Days + (Days | Subject), sleepstudy)) thetaExtract(fm1) #(a numeric vector of the covariance parameters)
Extract the variances and correlations for random effects from a merMod list
## S3 method for class 'merModList' VarCorr(x, sigma = 1, rdig = 3L, ...)## S3 method for class 'merModList' VarCorr(x, sigma = 1, rdig = 3L, ...)
x |
a |
sigma |
an optional numeric value used as a multiplier for the standard deviations. |
rdig |
the number of digits to round to, integer |
... |
Ignored for the |
a list with two elements "stddev" and "correlation" for the standard deviations and correlations averaged across models in the list
sim_list <- replicate(n = 10, expr = sleepstudy[sample(row.names(sleepstudy), 180),], simplify=FALSE) fml <- "Reaction ~ Days + (Days | Subject)" mod <- lmerModList(fml, data = sim_list) VarCorr(mod)sim_list <- replicate(n = 10, expr = sleepstudy[sample(row.names(sleepstudy), 180),], simplify=FALSE) fml <- "Reaction ~ Days + (Days | Subject)" mod <- lmerModList(fml, data = sim_list) VarCorr(mod)
Creates a new data.frame with copies of the original observation, each assigned to a different user-specified value of a variable. Allows the user to look at the effect on predicted values of changing either a single variable or multiple variables.
wiggle(data, varlist, valueslist)wiggle(data, varlist, valueslist)
data |
a data frame with one or more observations to be reassigned |
varlist |
a character vector specifying the name(s) of the variable to adjust |
valueslist |
a list of vectors with the values to assign to var |
If the variable specified is a factor, then wiggle will return it as a character.
a data.frame with each row assigned to the one of the new variable combinations.
All variable combinations are returned, eg wiggling two variables with 3 and 4 variables
respectively will return a new dataset with 3 * 4 = 12 observations.
data(iris) wiggle(iris[3,], varlist = "Sepal.Width", valueslist = list(c(1, 2, 3, 5))) wiggle(iris[3:5,], "Sepal.Width", valueslist = list(c(1, 2, 3, 5))) wiggle(iris[3,], c("Sepal.Width", "Petal.Length"), list(c(1,2,3,5), c(3,5,6))) wiggle(iris[3:5,], c("Sepal.Width", "Petal.Length"), list(c(1,2,3,5), c(3,5,6)))data(iris) wiggle(iris[3,], varlist = "Sepal.Width", valueslist = list(c(1, 2, 3, 5))) wiggle(iris[3:5,], "Sepal.Width", valueslist = list(c(1, 2, 3, 5))) wiggle(iris[3,], c("Sepal.Width", "Petal.Length"), list(c(1,2,3,5), c(3,5,6))) wiggle(iris[3:5,], c("Sepal.Width", "Petal.Length"), list(c(1,2,3,5), c(3,5,6)))