The function ggstatsplot::ggcoefstats
generates dot-and-whisker plots for regression models saved in a tidy data frame. The tidy dataframes are prepared using the following packages: broom
, broom.mixed
, and parameters
. Additionally, if available, the model summary indices are also extracted from the following packages: broom
, broom.mixed
, and performance
.
In this vignette, we will see examples of how to use this function. We will try to cover as many classes of objects as possible. Unfortunately, there is no single dataset that will be helpful for carrying out all types of regression analyses and, therefore, we will use various datasets to explore data-specific hypotheses using regression models.
Note before: The following demo uses the pipe operator (%>%
), so in case you are not familiar with this operator, here is a good explanation: http://r4ds.had.co.nz/pipes.html
Although the statistical models displayed in the plot may differ based on the class of models being investigated, there are few aspects of the plot that will be invariant across models:
The dot-whisker plot contains a dot representing the estimate and their confidence intervals (95%
is the default). The estimate can either be effect sizes (for tests that depend on the F
statistic) or regression coefficients (for tests with t
and z
statistic), etc. The function will, by default, display a helpful x
-axis label that should clear up what estimates are being displayed. The confidence intervals can sometimes be asymmetric if bootstrapping was used.
The caption will always contain diagnostic information, if available, about models that can be useful for model selection: The smaller the Akaike’s Information Criterion (AIC) and the Bayesian Information Criterion (BIC) values, the “better” the model is.
The output of this function will be a ggplot2
object and, thus, it can be further modified (e.g., change themes, etc.) with ggplot2
functions.
Most of the regression models that are supported in the underlying packages are also supported by ggcoefstats
. For example-
aareg
, anova
, aov
, aovlist
, Arima
, bayesx
, bayesGARCH
, BBmm
, BBreg
, bcplm
, bglmerMod
, bife
, bigglm
, biglm
, blavaan
, bmlm
, blmerMod
, bracl
, brglm2
, brmsfit
, brmultinom
, btergm
, cch
, cgam
, cgamm
, cglm
, clm
, clm2
, clmm
, clmm2
, coeftest
, complmrob
, confusionMatrix
, coxme
, coxph
, cpglm
, cpglmm
, crch
, DirichReg
, drc
, emmGrid
, epi.2by2
, ergm
, feis
, felm
, fitdistr
, flexsurvreg
, glmc
, glmerMod
, glmmTMB
, gls
, gam
, Gam
, gamlss
, garch
, glm
, glmmadmb
, glmmPQL
, glmRob
, glmrob
, glmx
, gmm
, hurdle
, ivreg
, iv_robust
, lavaan
, lm
, lm.beta
, lmerMod
, lmerModLmerTest
, lmodel2
, lmRob
, lmrob
, LORgee
, lrm
, mcmc
, mcmc.list
, MCMCglmm
, mclogit
, mmclogit
, mediate
, mixor
, mjoint
, mle2
, mlm
, multinom
, negbin
, nlmerMod
, nlrq
, nlreg
, nls
, orcutt
, plm
, polr
, ridgelm
, rjags
, rlm
, rlmerMod
, robmixglm
, rq
, rqss
, slm
, speedglm
, speedlm
, stanfit
, stanreg
, survreg
, svyglm
, svyolr
, svyglm
, tobit
, truncreg
, vgam
, wbgee
, wblm
, zcpglm
, zeroinfl
, etc.
In the following examples, we will try out a number of regression models and, additionally, we will also see how we can change different aspects of the plot itself.
aov
)# loading needed libraries library(ggstatsplot) library(ggplot2) # for reproducibility set.seed(123) # to speed up the calculation, let's use only 10% of the data movies_10 <- dplyr::sample_frac(tbl = ggstatsplot::movies_long, size = 0.1) # model mod <- stats::aov(formula = rating ~ mpaa * genre, data = movies_10) # plot ggstatsplot::ggcoefstats( x = mod, effsize = "eta", # changing the effect size estimate being displayed partial = FALSE, # just eta-squared point.args = list(color = "red", size = 4, shape = 15), # changing the point geom package = "dutchmasters", # package from which color paletter is to be taken palette = "milkmaid", # color palette for labels title = "omnibus ANOVA" # title for the plot ) + # further modification with the ggplot2 commands # note the order in which the labels are entered ggplot2::scale_y_discrete(labels = c("MPAA", "Genre", "Interaction term")) + ggplot2::labs(x = "effect size estimate (eta-squared)", y = NULL)
Note that we can also use this function for model selection. You can try out different models with the code below and see how the AIC and BIC values change.
# setup set.seed(123) library(ggstatsplot) # to speed up the calculation, let's use only 10% of the data movies_10 <- dplyr::sample_frac(tbl = ggstatsplot::movies_long, size = 0.1) # plot ggstatsplot::combine_plots( # model 1 ggstatsplot::ggcoefstats( x = stats::aov(formula = rating ~ mpaa, data = movies_10), title = "1. Only MPAA ratings" ), # model 2 ggstatsplot::ggcoefstats( x = stats::aov(formula = rating ~ genre, data = movies_10), title = "2. Only genre" ), # model 3 ggstatsplot::ggcoefstats( x = stats::aov(formula = rating ~ mpaa + genre, data = movies_10), title = "3. Additive effect of MPAA and genre" ), # model 4 ggstatsplot::ggcoefstats( x = stats::aov(formula = rating ~ mpaa * genre, data = movies_10), title = "4. Multiplicative effect of MPAA and genre" ), title.text = "Model selection using ggcoefstats" )
car
package (anova
)# setup set.seed(123) library(car) library(ggstatsplot) # model mod <- stats::lm( formula = conformity ~ fcategory * partner.status, data = Moore, contrasts = list(fcategory = contr.sum, partner.status = contr.sum) ) # plot ggstatsplot::ggcoefstats( x = car::Anova(mod), title = "Anova with `car`" ) #> Note: AIC and BIC values not available, so skipping caption.
ez
packageset.seed(123) library(ez) data(ANT) # run an ANOVA on the mean correct RT data. rt_anova <- suppressWarnings(ez::ezANOVA( data = ANT[ANT$error == 0, ], dv = rt, wid = subnum, within = cue, detailed = TRUE, return_aov = TRUE )) # plot ggstatsplot::ggcoefstats( x = rt_anova$aov, title = "Anova with `ez` package" ) #> Note: AIC and BIC values not available, so skipping caption.
lm
)# plot ggstatsplot::ggcoefstats( x = stats::lm( formula = rating ~ genre, data = dplyr::filter( .data = ggstatsplot::movies_long, genre %in% c( "Action", "Action Comedy", "Action Drama", "Comedy", "Drama", "Comedy Drama" ) ) ), conf.level = 0.99, # changing the confidence levels for confidence intervals sort = "ascending", # sorting the terms of the model based on estimate values ggtheme = ggplot2::theme_gray(), # changing the default theme stats.label.color = c("#CC79A7", "darkgreen", "#0072B2", "black", "red"), title = "Movie ratings by their genre", subtitle = "Source: www.imdb.com" ) + # further modification with the ggplot2 commands # note the order in which the labels are entered ggplot2::scale_y_discrete(labels = c( "Comedy", "Action Comedy", "Action Drama", "Comedy Drama", "Drama" )) + ggplot2::labs(y = "Genre compared to Action movies)") + ggplot2::theme(axis.title.y = ggplot2::element_text(size = 14, face = "bold"))
lmer
/lmerMod
)# set up library(lme4) library(ggstatsplot) set.seed(123) # to speed up the calculation, let's use only 10% of the data movies_10 <- dplyr::sample_frac(tbl = ggstatsplot::movies_long, size = 0.1) # combining the two different plots ggstatsplot::combine_plots( # model 1: simple linear model ggstatsplot::ggcoefstats( x = stats::lm( formula = scale(rating) ~ scale(budget), data = movies_10 ), title = "linear model", stats.label.color = "black", exclude.intercept = FALSE # show the intercept ) + ggplot2::labs(x = parse(text = "'standardized regression coefficient' ~italic(beta)")), # model 2: linear mixed-effects model ggstatsplot::ggcoefstats( x = lme4::lmer( formula = scale(rating) ~ scale(budget) + (budget | genre), data = movies_10, control = lme4::lmerControl(calc.derivs = FALSE) ), title = "linear mixed-effects model", stats.label.color = "black", exclude.intercept = FALSE ) + ggplot2::labs( x = parse(text = "'standardized regression coefficient' ~italic(beta)"), y = "fixed effects" ), nrow = 2, ncol = 1, title.text = "Relationship between movie budget and its IMDB rating" ) #> Random effect variances not available. Returned R2 does not account for random effects.
Note that for mixed-effects models, only the fixed effects are shown because there are no confidence intervals for random effects terms. In case, you would like to see these terms, you can enter the same object you entered as x
argument to ggcoefstats
:
# setup set.seed(123) library(lme4) # to speed up the calculation, let's use only 10% of the data movies_10 <- dplyr::sample_frac(tbl = ggstatsplot::movies_long, size = 0.1) # tidy output broomExtra::tidy_parameters( lme4::lmer( formula = scale(rating) ~ scale(budget) + (budget | genre), data = movies_10, control = lme4::lmerControl(calc.derivs = FALSE) ) ) #> # A tibble: 2 x 8 #> term estimate std.error conf.low conf.high statistic df.error p.value #> <chr> <dbl> <dbl> <dbl> <dbl> <dbl> <int> <dbl> #> 1 (Intercept) -0.0498 0.147 -0.338 0.238 -0.339 152 0.735 #> 2 scale(budget) 0.0428 0.0806 -0.115 0.201 0.531 152 0.595
rlmer
)Robust version of lmer
(as implemented in robustlmm
package) is also supported-
# setups set.seed(123) library(robustlmm) # model roblmm.mod <- robustlmm::rlmer( formula = scale(Reaction) ~ scale(Days) + (Days | Subject), data = sleepstudy, rho.sigma.e = psi2propII(smoothPsi, k = 2.28), rho.sigma.b = chgDefaults(smoothPsi, k = 5.11, s = 10) ) # plot ggstatsplot::ggcoefstats( x = roblmm.mod, title = "robust estimation of linear mixed-effects model", conf.level = 0.90 ) #> Note: AIC and BIC values not available, so skipping caption.
lmerTest
(lmerModLmerTest
)# setup set.seed(123) library(lmerTest) # fit linear mixed model to the ham data: fm <- lmerTest::lmer( formula = Informed.liking ~ Gender + Information * Product + (1 | Consumer) + (1 | Consumer:Product), data = ham ) # plot ggstatsplot::ggcoefstats( x = fm, title = "linear mixed-effects models with `lmerTest`" )
nlmer
/nlmerMod
)# data library(lme4) set.seed(123) startvec <- c(Asym = 200, xmid = 725, scal = 350) # model nm1 <- lme4::nlmer( formula = circumference ~ SSlogis(age, Asym, xmid, scal) ~ Asym | Tree, data = Orange, start = startvec ) # plot ggstatsplot::ggcoefstats( x = nm1, title = "non-linear mixed-effects model" )
nls
)So far we have been assuming a linear relationship between movie budget and rating. But what if we want to also explore the possibility of a non-linear relationship? In that case, we can run a non-linear least squares regression. Note that you need to choose some non-linear function, which will be based on prior exploratory data analysis (y ~ k/x + c
implemented here, but you can try out other non-linear functions, e.g. Y ~ k * exp(-b*c)
).
library(ggstatsplot) # to speed up the calculation, let's use only 10% of the data movies_10 <- dplyr::sample_frac(tbl = ggstatsplot::movies_long, size = 0.1) # model mod <- stats::nls( formula = rating ~ k / budget + c, # try toying around with the form of non-linear function data = movies_10, start = list(k = 1, c = 0) ) # plot ggstatsplot::ggcoefstats( x = mod, title = "non-linear least squares regression", subtitle = "Non-linear relationship between budget and rating" )
This analysis shows that there is indeed a possible non-linear association between rating and budget (non-linear regression term k
is significant), at least with the particular non-linear function we used.
slm
)# setup set.seed(123) library(slm) data("shan") # model mod <- slm( myformula = shan$PM_Xuhui ~ ., data = shan, method_cov_st = "fitAR", model_selec = -1 ) # plot ggstatsplot::ggcoefstats( x = mod, conf.level = 0.90, title = "stationary linear models", package = "rcartocolor", palette = "Vivid" ) #> Note: AIC and BIC values not available, so skipping caption.
glm
)In all the analyses carried out thus far, the outcome variable (y
in y ~ x
) has been continuous. In case the outcome variable is nominal/categorical/factor, we can use the generalized form of linear model that works even if the response is a numeric vector or a factor vector, etc.
# setup library(ggstatsplot) set.seed(123) # having a look at the Titanic dataset df <- as.data.frame(Titanic) # model mod <- stats::glm( formula = Survived ~ Sex + Age, data = df, weights = df$Freq, family = stats::binomial(link = "logit") ) # plot ggstatsplot::ggcoefstats( x = mod, ggtheme = ggthemes::theme_economist_white(), ggstatsplot.layer = FALSE, title = "generalized linear model (glm)", vline.args = list(color = "red", linetype = "solid"), stats.label.color = c("orangered", "dodgerblue") ) #> Can't calculate log-loss.
Note: Few things to keep in mind for glm
models,
The exact statistic will depend on the family used. Below we will see a host of different function calls to glm
with a variety of different families.
Some families will have a t
statistic associated with them, while others a z
statistic. The function will figure this out for you.
# creating dataframes to use for regression analyses library(ggstatsplot) # dataframe #1 df.counts <- data.frame( treatment = gl(n = 3, k = 3, length = 9), outcome = gl(n = 3, k = 1, length = 9), counts = c(18, 17, 15, 20, 10, 20, 25, 13, 12) ) %>% tibble::as_tibble(.) # dataframe #2 df.clotting <- data.frame( u = c(5, 10, 15, 20, 30, 40, 60, 80, 100), lot1 = c(118, 58, 42, 35, 27, 25, 21, 19, 18), lot2 = c(69, 35, 26, 21, 18, 16, 13, 12, 12) ) %>% tibble::as_tibble(.) # dataframe #3 x1 <- stats::rnorm(50) y1 <- stats::rpois(n = 50, lambda = exp(1 + x1)) df.3 <- data.frame(x = x1, y = y1) %>% tibble::as_tibble(.) # dataframe #4 x2 <- stats::rnorm(50) y2 <- rbinom( n = 50, size = 1, prob = stats::plogis(x2) ) df.4 <- data.frame(x = x2, y = y2) %>% tibble::as_tibble(.) # combining all plots in a single plot ggstatsplot::combine_plots( # Family: Poisson ggstatsplot::ggcoefstats( x = stats::glm( formula = counts ~ outcome + treatment, data = df.counts, family = stats::poisson(link = "log") ), title = "Family: Poisson", stats.label.color = "black" ), # Family: Gamma ggstatsplot::ggcoefstats( x = stats::glm( formula = lot1 ~ log(u), data = df.clotting, family = stats::Gamma(link = "inverse") ), title = "Family: Gamma", stats.label.color = "black" ), # Family: Quasi ggstatsplot::ggcoefstats( x = stats::glm( formula = y ~ x, family = quasi(variance = "mu", link = "log"), data = df.3 ), title = "Family: Quasi", stats.label.color = "black" ), # Family: Quasibinomial ggstatsplot::ggcoefstats( x = stats::glm( formula = y ~ x, family = stats::quasibinomial(link = "logit"), data = df.4 ), title = "Family: Quasibinomial", stats.label.color = "black" ), # Family: Quasipoisson ggstatsplot::ggcoefstats( x = stats::glm( formula = y ~ x, family = stats::quasipoisson(link = "log"), data = df.4 ), title = "Family: Quasipoisson", stats.label.color = "black" ), # Family: Gaussian ggstatsplot::ggcoefstats( x = stats::glm( formula = Sepal.Length ~ Species, family = stats::gaussian(link = "identity"), data = iris ), title = "Family: Gaussian", stats.label.color = "black" ), ncol = 2, title.text = "Exploring models with different `glm` families", title.color = "blue" )
glm2
)# setup library(glm2) y <- c(1, 1, 1, 0) # model fit1 <- glm2::glm2( formula = y ~ 1, family = binomial(link = "logit"), control = glm.control(trace = FALSE) ) # plot ggstatsplot::ggcoefstats( x = fit1, title = "greater stability for fitting generalized linear models", exclude.intercept = FALSE )
lrm
)# setup library(rms) set.seed(123) # data n <- 500 x1 <- runif(n, -1, 1) x2 <- runif(n, -1, 1) x3 <- sample(0:1, n, TRUE) y <- x1 + 0.5 * x2 + x3 + rnorm(n) y <- as.integer(cut2(y, g = 10)) dd <- datadist(x1, x2, x3) options(datadist = "dd") # model f <- rms::lrm(y ~ x1 + pol(x2, 2) + x3, eps = 1e-7) # eps to check against rstan # plot ggstatsplot::ggcoefstats( x = f, title = "Logistic Regression Model", package = "ggsci", palette = "category20c_d3" )
iv_robust
)# setup set.seed(123) library(fabricatr) library(estimatr) # data dat <- fabricate( N = 40, Y = rpois(N, lambda = 4), Z = rbinom(N, 1, prob = 0.4), D = Z * rbinom(N, 1, prob = 0.8), X = rnorm(N), G = sample(letters[1:4], N, replace = TRUE) ) # instrument for treatment `D` with encouragement `Z` mod <- estimatr::iv_robust( formula = Y ~ D + X | Z + X, data = dat ) # plot ggstatsplot::ggcoefstats( x = mod, exclude.intercept = FALSE, title = "Two-Stage Least Squares Instrumental Variables Regression" )
negbin
)Just to demonstrate that this can be done, let’s also flip the axes:
# setup library(MASS) library(lme4) set.seed(101) # data dd <- expand.grid( f1 = factor(1:3), f2 = LETTERS[1:2], g = 1:9, rep = 1:15, KEEP.OUT.ATTRS = FALSE ) mu <- 5 * (-4 + with(dd, as.integer(f1) + 4 * as.numeric(f2))) dd$y <- rnbinom(nrow(dd), mu = mu, size = 0.5) # model m.glm <- MASS::glm.nb(formula = y ~ f1 * f2, data = dd) # plot ggstatsplot::ggcoefstats( x = m.glm, title = "generalized linear model (GLM) for the negative binomial family", exclude.intercept = FALSE, only.significant = TRUE, stats.label.args = list(size = 2.5, direction = "both") ) + ggplot2::coord_flip()
glmer
/glmerMod
)# setup set.seed(123) library(lme4) # model mod <- lme4::glmer( formula = Survived ~ Sex + Age + (Sex + Age | Class), data = ggstatsplot::Titanic_full, family = stats::binomial(link = "logit"), control = lme4::glmerControl( optimizer = "Nelder_Mead", calc.derivs = FALSE, boundary.tol = 1e-7 ) ) # plot ggstatsplot::ggcoefstats( x = mod, title = "generalized linear mixed-effects model", exponentiate = TRUE ) #> Can't calculate log-loss.
glmer.nb
)# setup library(MASS) library(lme4) set.seed(101) # data dd <- expand.grid( f1 = factor(1:3), f2 = LETTERS[1:2], g = 1:9, rep = 1:15, KEEP.OUT.ATTRS = FALSE ) mu <- 5 * (-4 + with(dd, as.integer(f1) + 4 * as.numeric(f2))) dd$y <- rnbinom(nrow(dd), mu = mu, size = 0.5) # model m.nb <- lme4::glmer.nb(formula = y ~ f1 * f2 + (1 | g), data = dd) # plot ggstatsplot::ggcoefstats( x = m.nb, title = "generalized linear mixed-effects model (GLMM) for the negative binomial family", exclude.intercept = FALSE ) #> Random effect variances not available. Returned R2 does not account for random effects.
cpglm
)# set up set.seed(123) library(cplm) # model mod <- cplm::cpglm( formula = RLD ~ factor(Zone) * factor(Stock), data = FineRoot ) # plot ggstatsplot::ggcoefstats( x = mod, title = "Compound Poisson Generalized Linear Models" ) #> Note: AIC and BIC values not available, so skipping caption.
cpglmm
)# set up set.seed(123) library(cplm) # model mod <- cpglmm( formula = RLD ~ Stock + Spacing + (1 | Plant), data = FineRoot ) # plot ggstatsplot::ggcoefstats( x = mod, title = "Compound Poisson Generalized Linear Mixed Models" ) #> Note: AIC and BIC values not available, so skipping caption.
bcplm
)# set up set.seed(123) library(cplm) # model mod <- bcplm( formula = RLD ~ factor(Zone) * factor(Stock), data = FineRoot, tune.iter = 2000, n.iter = 6000, n.burnin = 1000, n.thin = 5 ) #> Tuning phase... #> Acceptance rate: min(45.00%), mean(50.38%), max(60.00%) #> ----------------------------------------- #> Start Markov chain 1 #> Iteration: 3000 #> Acceptance rate: min(44.90%), mean(48.88%), max(52.50%) #> Iteration: 6000 #> Acceptance rate: min(45.05%), mean(48.90%), max(52.22%) #> ----------------------------------------- #> Start Markov chain 2 #> Iteration: 3000 #> Acceptance rate: min(45.40%), mean(49.54%), max(54.03%) #> Iteration: 6000 #> Acceptance rate: min(46.25%), mean(49.22%), max(53.90%) #> ----------------------------------------- #> Start Markov chain 3 #> Iteration: 3000 #> Acceptance rate: min(45.80%), mean(49.35%), max(53.43%) #> Iteration: 6000 #> Acceptance rate: min(46.05%), mean(49.18%), max(53.33%) #> ----------------------------------------- #> Markov Chain Monte Carlo ends! # plot ggstatsplot::ggcoefstats( x = mod, title = "Bayesian Compound Poisson Linear Models" ) #> Note: AIC and BIC values not available, so skipping caption. #> Note: No p-values and/or statistic available for the model object; #> skipping labels with statistical details.
zcpglm
)# set up set.seed(123) library(cplm) da <- subset(AutoClaim, IN_YY == 1) # use data in the Yip and Yau paper da <- transform(da, CLM_AMT5 = CLM_AMT5 / 1000, INCOME = INCOME / 10000) # model mod <- zcpglm( formula = CLM_AMT5 ~ CAR_USE + MARRIED + AREA + MVR_PTS || MVR_PTS + INCOME, data = da ) # plot ggstatsplot::ggcoefstats( x = mod, title = "Zero-inflated Compound Poisson Generalized Linear Models", standardize = TRUE ) #> Note: AIC and BIC values not available, so skipping caption.
zeroinfl
)# setup set.seed(123) library(pscl) # data data("bioChemists", package = "pscl") # model mod <- pscl::zeroinfl( formula = art ~ . | 1, data = bioChemists, dist = "negbin" ) # plot ggstatsplot::ggcoefstats( x = mod, exclude.intercept = FALSE, title = "Zero-Inflated Count Data Regression" ) #> Note: AIC and BIC values not available, so skipping caption.
vgam
)# setup set.seed(123) library(VGAM) pneumo <- transform(pneumo, let = log(exposure.time)) # model mod <- vglm( formula = cbind(normal, mild, severe) ~ let, family = multinomial, data = pneumo ) # plot ggstatsplot::ggcoefstats( x = mod, title = "Vector Generalized Linear Models", exclude.intercept = FALSE ) #> Note: AIC and BIC values not available, so skipping caption.
cgam
)# setup set.seed(123) library(cgam) data(cubic) # model m <- cgam(formula = y ~ incr.conv(x), data = cubic) # plot ggstatsplot::ggcoefstats( x = m, title = "Constrained Generalized Additive Model Fitting", exclude.intercept = FALSE ) #> Note: AIC and BIC values not available, so skipping caption.
cgamm
)# setup set.seed(123) library(cgam) # simulate a balanced data set with 30 clusters # each cluster has 30 data points n <- 30 m <- 30 # the standard deviation of between cluster error terms is 1 # the standard deviation of within cluster error terms is 2 sige <- 1 siga <- 2 # generate a continuous predictor x <- 1:(m * n) for (i in 1:m) { x[(n * (i - 1) + 1):(n * i)] <- round(runif(n), 3) } # generate a group factor group <- trunc(0:((m * n) - 1) / n) + 1 # generate the fixed-effect term mu <- 10 * exp(10 * x - 5) / (1 + exp(10 * x - 5)) # generate the random-intercept term asscosiated with each group avals <- rnorm(m, 0, siga) # generate the response y <- 1:(m * n) for (i in 1:m) { y[group == i] <- mu[group == i] + avals[i] + rnorm(n, 0, sige) } # use REML method to fit the model ans <- cgamm(formula = y ~ s.incr(x) + (1 | group), reml = TRUE) # plot ggstatsplot::ggcoefstats( x = ans, title = "Constrained Generalized Additive Mixed-Effects Model Fitting", exclude.intercept = FALSE ) #> Note: AIC and BIC values not available, so skipping caption.
hurdle
)# setup set.seed(123) library(pscl) data("bioChemists", package = "pscl") # geometric-poisson fm_hp2 <- pscl::hurdle( formula = art ~ ., data = bioChemists, zero = "geometric" ) # plot ggstatsplot::ggcoefstats( x = fm_hp2, only.significant = TRUE, exclude.intercept = FALSE, conf.level = 0.99, title = "Hurdle Models for Count Data Regression" ) #> Warning: Number of labels is greater than default palette color count. #> Try using another color `palette` (and/or `package`).
BBmm
)# setup library(PROreg) set.seed(123) # defining the parameters k <- 100 m <- 10 phi <- 0.5 beta <- c(1.5, -1.1) sigma <- 0.5 # simulating the covariate and random effects x <- runif(k, 0, 10) X <- model.matrix(~x) z <- as.factor(rBI(k, 4, 0.5, 2)) Z <- model.matrix(~ z - 1) u <- rnorm(5, 0, sigma) # the linear predictor and simulated response variable eta <- beta[1] + beta[2] * x + crossprod(t(Z), u) p <- 1 / (1 + exp(-eta)) y <- rBB(k, m, p, phi) dat <- data.frame(cbind(y, x, z)) dat$z <- as.factor(dat$z) # apply the model mod <- PROreg::BBmm( fixed.formula = y ~ x, random.formula = ~z, m = m, data = dat ) #> Iteration number: 1 #> Iteration number: 2 #> Iteration number: 3 #> Iteration number: 4 #> Iteration number: 5 #> Iteration number: 6 #> Iteration number: 7 #> Iteration number: 8 #> Iteration number: 9 # plot ggstatsplot::ggcoefstats( x = mod, exclude.intercept = FALSE, title = "beta-binomial mixed-effects model" ) #> Note: AIC and BIC values not available, so skipping caption.
BBreg
)# setup set.seed(18) library(PROreg) # we simulate a covariate, fix the paramters of the beta-binomial # distribution and simulate a response variable. # Then we apply the model, and try to get the same values. k <- 1000 m <- 10 x <- rnorm(k, 5, 3) beta <- c(-10, 2) p <- 1 / (1 + exp(-(beta[1] + beta[2] * x))) phi <- 1.2 y <- rBB(k, m, p, phi) # model model <- BBreg(y ~ x, m) # plot ggstatsplot::ggcoefstats( x = model, exclude.intercept = FALSE, title = "beta-binomial logistic regression model" ) #> Note: AIC and BIC values not available, so skipping caption.
bife
)# setup set.seed(123) library(bife) dataset <- psid # binary choice models with fixed effects mod <- bife( LFP ~ I(AGE^2) + log(INCH) + KID1 + KID2 + KID3 + factor(TIME) | ID, dataset ) # plot ggstatsplot::ggcoefstats( x = mod, title = "binary choice models with fixed effects" ) #> Note: AIC and BIC values not available, so skipping caption. #> Warning: Number of labels is greater than default palette color count. #> Try using another color `palette` (and/or `package`).
DirichReg
)# setup set.seed(123) library(DirichletReg) # data ALake <- ArcticLake ALake$Y <- DR_data(ALake[, 1:3]) # fit a quadratic Dirichlet regression models ("common") mod <- DirichReg(Y ~ depth + I(depth^2), ALake) # plot ggstatsplot::ggcoefstats( x = mod, title = "Dirichlet Regression" )
robmixglm
)# setup set.seed(123) library(robmixglm) library(MASS) data(forbes) # model forbes.robustmix <- robmixglm(100 * log10(pres) ~ bp, data = forbes) # plot ggstatsplot::ggcoefstats( x = forbes.robustmix, title = "robust generalized linear models" ) #> Note: AIC and BIC values not available, so skipping caption.
glmx
)# setup library(glmx) library(MASS) set.seed(1) d <- data.frame(x = runif(200, -1, 1)) d$y <- rnbinom(200, mu = exp(0 + 3 * d$x), size = 1) # model m_nb1 <- glmx( formula = y ~ x, data = d, family = negative.binomial, xlink = "log", xstart = 0 ) ggstatsplot::ggcoefstats( x = m_nb1, title = "Generalized Linear Models with Extra Parameters" )
glmertree
)# setup set.seed(123) library(glmertree) data("DepressionDemo", package = "glmertree") # fit normal linear regression LMM tree for continuous outcome lt <- glmertree::lmertree( formula = depression ~ treatment | cluster | age + anxiety + duration, data = DepressionDemo ) # fit logistic regression GLMM tree for binary outcome gt <- glmertree::glmertree( formula = depression_bin ~ treatment | cluster | age + anxiety + duration, data = DepressionDemo ) # plot ggstatsplot::combine_plots( ggstatsplot::ggcoefstats( x = lt$lmer, title = "normal linear regression LMM tree for continuous outcome" ), ggstatsplot::ggcoefstats( x = lt$lmer, title = "logistic regression GLMM tree for binary outcome" ) )
glmmPQL
)# setup set.seed(123) library(MASS) library(nlme) # model mod <- MASS::glmmPQL( fixed = y ~ trt + I(week > 2), random = ~ 1 | ID, family = binomial, data = bacteria, verbose = FALSE ) # plot ggstatsplot::ggcoefstats( x = mod, title = "generalized linear mixed models using Penalized Quasi-Likelihood", exclude.intercept = FALSE ) #> Can't calculate log-loss.
glmmTMB
)glmmTMB
package allows for flexibly fitting generalized linear mixed models (GLMMs) and extensions. Model objects from this package are also supported.
# set up library(glmmTMB) library(lme4) set.seed(123) # model mod <- glmmTMB::glmmTMB( formula = Reaction ~ Days + (Days | Subject), data = sleepstudy, family = glmmTMB::truncated_poisson() ) # plotting the model ggstatsplot::ggcoefstats( x = mod, conf.method = "uniroot", title = "generalized linear mixed models using Template Model Builder" )
Another example (given the number of terms, let’s only display labels for significant effects):
# setup set.seed(123) library(glmmTMB) data(Salamanders) # model zipm3 <- glmmTMB(count ~ spp * mined + (1 | site), zi = ~ spp * mined, data = Salamanders, family = "poisson" ) # plot ggstatsplot::ggcoefstats( x = zipm3, package = "palettesForR", palette = "Inkscape", only.significant = TRUE )
glmmadmb
)Another option is to use glmmadmb
package.
# setup library(glmmADMB) # simulate values set.seed(101) d <- data.frame(f = factor(rep(LETTERS[1:10], each = 10)), x = runif(100)) u <- rnorm(10, sd = 2) d$eta <- with(d, u[f] + 1 + 4 * x) pz <- 0.3 zi <- rbinom(100, size = 1, prob = pz) d$y <- ifelse(zi, 0, rpois(100, lambda = exp(d$eta))) # fit zipmodel <- glmmADMB::glmmadmb( formula = y ~ x + (1 | f), data = d, family = "poisson", zeroInflation = TRUE ) # plotting the model ggstatsplot::ggcoefstats( x = zipmodel, title = "generalized linear mixed models using AD Model Builder" )
clm
)So far we have dealt either with continuous or nominal/factor responses (or output variables), but sometimes we will encounter ordinal data (e.g., Likert scale measurement in behavioral sciences). In these cases, ordinal regression models are more suitable.
# for reproducibility set.seed(123) library(ordinal) # model (to speed up calculations, we will use just 10% of the dataset) mod <- ordinal::clm(formula = rating ~ temp * contact, data = wine) # plot ggstatsplot::ggcoefstats( x = mod, stats.label.color = "black", title = "cumulative link model (clm)", subtitle = "(using `ordinal` package)" ) + ggplot2::labs(x = "logit regression coefficient", y = NULL) #> Can't extract residuals from model. #> Can't calculate log-loss. #> Can't calculate proper scoring rules for ordinal, multinomial or cumulative link models.
As can be seen from this plot, both factors (intentionality and consequences) were significant, and so was their interaction.
clm2
)# for reproducibility set.seed(123) library(ordinal) library(MASS) data(housing, package = "MASS") # data tab26 <- with(soup, table("Product" = PROD, "Response" = SURENESS)) dimnames(tab26)[[2]] <- c("Sure", "Not Sure", "Guess", "Guess", "Not Sure", "Sure") dat26 <- expand.grid(sureness = as.factor(1:6), prod = c("Ref", "Test")) dat26$wghts <- c(t(tab26)) # model mod <- ordinal::clm2( location = sureness ~ prod, scale = ~prod, data = dat26, weights = wghts, link = "logistic" ) # plot ggstatsplot::ggcoefstats( x = mod, title = "older version of `clm`" ) #> Can't extract residuals from model. #> Can't calculate log-loss. #> Can't calculate proper scoring rules for ordinal, multinomial or cumulative link models.
clmm
)# for reproducibility set.seed(123) library(ordinal) # model mod <- clmm(formula = rating ~ temp + contact + (1 | judge), data = wine) # to speed up calculations, we will use just 10% of the dataset ggstatsplot::ggcoefstats( x = mod, title = "cumulative link mixed model (clmm)", subtitle = "(using `ordinal` package)" ) + ggplot2::labs( x = "coefficient from ordinal mixed-effects regression", y = "fixed effects" )
clmm2
)# for reproducibility set.seed(123) library(ordinal) # data dat <- subset(soup, as.numeric(as.character(RESP)) <= 24) dat$RESP <- dat$RESP[drop = TRUE] # model m1 <- ordinal::clmm2( SURENESS ~ PROD, random = RESP, data = dat, link = "probit", Hess = TRUE, method = "ucminf", threshold = "symmetric" ) # plot ggstatsplot::ggcoefstats( x = m1, title = "older version of cumulative link mixed models" ) #> Can't extract residuals from model. #> Can't calculate log-loss. #> Can't calculate proper scoring rules for ordinal, multinomial or cumulative link models.
mixor
)# setup set.seed(123) library(mixor) data("SmokingPrevention") # data frame must be sorted by id variable SmokingPrevention <- SmokingPrevention[order(SmokingPrevention$class), ] # school model mod <- mixor( formula = thksord ~ thkspre + cc + tv + cctv, data = SmokingPrevention, id = school, link = "logit" ) # plot ggstatsplot::ggcoefstats( x = mod, title = "Mixed-Effects Ordinal Regression Analysis", exclude.intercept = FALSE )
brglm2
)Note that we can also choose to display labels only for the significant effects. This can be helpful when a large number of regression coefficients are to be displayed in a single plot and we would like to focus only on the significant ones.
# setup set.seed(123) library(brglm2) data("lizards") # fit the model using maximum likelihood mean bias-reduced fit: lizardsBR_mean <- stats::glm( formula = cbind(grahami, opalinus) ~ height + diameter + light + time, family = binomial(logit), data = lizards, method = "brglmFit" ) # plot ggstatsplot::ggcoefstats( x = lizardsBR_mean, only.significant = TRUE, title = "bias reduction in generalized linear models" ) #> Can't calculate log-loss. #> Can't calculate proper scoring rules for models without integer response values. #> `performance_pcp()` only works for models with binary response values.
brmultinom
)# setup set.seed(123) library(MASS) library(brglm2) data("housing", package = "MASS") # Maximum likelihood using brmultinom with baseline category 'Low' houseML1 <- brglm2::brmultinom( formula = Sat ~ Infl + Type + Cont, weights = Freq, data = housing, type = "ML", ref = 1 ) # plot ggstatsplot::ggcoefstats( x = houseML1, title = "Bias Reduction For Multinomial Response Models Using The Poisson Trick" ) #> Can't calculate proper scoring rules for ordinal, multinomial or cumulative link models. #> Warning: Number of labels is greater than default palette color count. #> Try using another color `palette` (and/or `package`).
bracl
)# setup set.seed(123) library(brglm2) data("stemcell") # bias reduction for adjacent category logit models # for ordinal responses using the Poisson trick fit_bracl <- brglm2::bracl( formula = research ~ as.numeric(religion) + gender, weights = frequency, data = stemcell, type = "ML" ) # plot ggstatsplot::ggcoefstats( x = fit_bracl, conf.int = FALSE, title = "bias reduction for adjacent category logit models" ) #> Can't calculate proper scoring rules for ordinal, multinomial or cumulative link models.
# setup set.seed(123) library(glmc) # data n <- rbind(c(5903, 230), c(5157, 350)) mat <- matrix(0, nrow = sum(n), ncol = 2) mat <- rbind( matrix(1, nrow = n[1, 1], ncol = 1) %*% c(0, 0), matrix(1, nrow = n[1, 2], ncol = 1) %*% c(0, 1), matrix(1, nrow = n[2, 1], ncol = 1) %*% c(1, 0), matrix(1, nrow = n[2, 2], ncol = 1) %*% c(1, 1) ) # specifying the population constraints gfr <- .06179 * matrix(1, nrow = nrow(mat), ncol = 1) g <- matrix(1, nrow = nrow(mat), ncol = 1) amat <- matrix(mat[, 2] * g - gfr, ncol = 1) # defining constraints in the data frame. hrh <- data.frame(birth = mat[, 2], child = mat[, 1], constraints = amat) # model gfit <- glmc::glmc( formula = birth ~ child, data = hrh, family = "binomial", emplik.method = "Owen", control = glmc::glmc.control( trace.optim = 0, trace.glm = FALSE, maxit.glm = 10, maxit.weights = 200, itertrace.weights = TRUE, gradtol.weights = 10^(-6) ) ) #> [1] -0.02656434 -14.43904679 1320.26413390 1.00000000 #> [1] -0.0220031 -15.3239529 375.8086391 1.0000000 #> [1] -0.0217611 -15.3261430 18.0697997 1.0000000 #> [1] -0.02176048 -15.32614305 0.04572550 1.00000000 #> [1] -2.176048e-02 -1.532614e+01 2.958267e-07 9.000000e+00 # plot ggstatsplot::ggcoefstats( x = gfit, title = "generalized linear models subject to population constraints", exclude.intercept = FALSE )
blmerMod
)# for reproducibility set.seed(123) library(blme) # data data(sleepstudy) sleepstudy$mygrp <- sample(1:5, size = 180, replace = TRUE) sleepstudy$mysubgrp <- NA for (i in 1:5) { filter_group <- sleepstudy$mygrp == i sleepstudy$mysubgrp[filter_group] <- sample(1:30, size = sum(filter_group), replace = TRUE) } # model mod <- blme::blmer( formula = scale(Reaction) ~ scale(Days) + (1 + Days | Subject), data = sleepstudy, cov.prior = NULL, REML = FALSE ) # plot ggstatsplot::ggcoefstats( x = mod, title = "Bayesian linear mixed-effects models", exclude.intercept = FALSE ) #> Note: No p-values and/or statistic available for the model object; #> skipping labels with statistical details.
bglmerMod
)# for reproducibility set.seed(123) library(blme) # model mod <- blme::bglmer( formula = Survived ~ Sex + Age + (Sex + Age | Class), # select 20% of the sample to reduce the time of execution data = dplyr::sample_frac(tbl = ggstatsplot::Titanic_full, size = 0.2), family = stats::binomial(link = "logit"), control = lme4::glmerControl( optimizer = "Nelder_Mead", calc.derivs = FALSE, boundary.tol = 1e-7 ) ) # plot ggstatsplot::ggcoefstats( x = mod, title = "Bayesian generalized linear mixed-effects models" ) #> Can't calculate log-loss.
polr
)# polr model set.seed(123) library(MASS) polr.mod <- MASS::polr( formula = Sat ~ Infl + Type + Cont, weights = Freq, data = housing ) # plot ggstatsplot::ggcoefstats( x = polr.mod, coefficient.type = "both", title = "ordered logistic or probit regression", subtitle = "using `MASS` package" ) #> Can't extract residuals from model. #> Can't calculate log-loss. #> Can't calculate proper scoring rules for ordinal, multinomial or cumulative link models.
mlm
)# model (converting all numeric columns in data to z-scores) mod <- stats::lm( formula = cbind(mpg, disp) ~ wt, data = purrr::modify_if(.x = mtcars, .p = is.numeric, ~ { as.numeric(scale(.x)) }) ) # plot ggstatsplot::ggcoefstats( x = mod, title = "multiple linear regression models", exclude.intercept = FALSE ) #> Note: AIC and BIC values not available, so skipping caption.
multinom
)# setup set.seed(123) library(nnet) library(MASS) utils::example(topic = birthwt, echo = FALSE) # model bwt.mu <- nnet::multinom( formula = low ~ ., data = bwt, trace = FALSE ) # plot ggstatsplot::ggcoefstats( x = bwt.mu, title = "multinomial logistic regression models", package = "ggsci", palette = "default_ucscgb" ) #> Can't calculate proper scoring rules for ordinal, multinomial or cumulative link models.
bmlm
)# setup set.seed(123) library(bmlm) # model fit <- bmlm::mlm(BLch9, verbose = FALSE) # exctrating summary df_summary <- bmlm::mlm_summary(fit) %>% dplyr::rename( .data = ., estimate = Mean, term = Parameter, std.error = SE, conf.low = `2.5%`, conf.high = `97.5%` ) # plot ggstatsplot::ggcoefstats( x = df_summary, title = "Bayesian multilevel mediation models with Stan" ) #> Note: The argument `statistic` must be specified. #> Skipping labels with statistical details. #> Note: No p-values and/or statistic available for the model object; #> skipping labels with statistical details.
svyglm
)# data library(survey) set.seed(123) data(api) dstrat <- survey::svydesign( id = ~1, strata = ~stype, weights = ~pw, data = apistrat, fpc = ~fpc ) # model mod <- survey::svyglm( formula = sch.wide ~ ell + meals + mobility, design = dstrat, family = quasibinomial() ) # plot ggstatsplot::ggcoefstats( x = mod, title = "survey-weighted generalized linear model" ) #> Can't calculate log-loss.
aovlist
)Let’s now consider an example of a repeated measures design.
# for reproducibility set.seed(123) library(ggstatsplot) # specifying the model (note the error structure) # let's use 20% of the sample to speed up the analysis mod <- stats::aov( formula = value ~ attribute * measure + Error(id / (attribute * measure)), data = iris_long ) # plot ggstatsplot::ggcoefstats( x = mod, effsize = "eta", ggtheme = ggthemes::theme_fivethirtyeight(), ggstatsplot.layer = FALSE, stats.label.color = c("#0072B2", "#D55E00", "darkgreen"), title = "Variation in measurements for Iris species", subtitle = "Source: Iris data set (by Fisher or Anderson)", caption = "Results from 2 by 2 RM ANOVA" ) + ggplot2::theme(plot.subtitle = ggplot2::element_text(size = 11, face = "plain")) #> Note: AIC and BIC values not available, so skipping caption.
As revealed by this analysis, all effects of this model are significant. But most of the variance is explained by the attribute
, with the next important explanatory factor being the measure
used. A very little amount of variation in measurement is accounted for by the interaction between these two factors.
robust
package (lmRob
, glmRob
)The robust regression models, as implemented in the robust
package are also supported.
ggstatsplot::combine_plots( # plot 1: glmRob ggstatsplot::ggcoefstats( x = robust::glmRob( formula = Survived ~ Sex, data = dplyr::sample_frac(tbl = ggstatsplot::Titanic_full, size = 0.20), family = stats::binomial(link = "logit") ), title = "generalized robust linear model", package = "dichromat", palette = "BrowntoBlue.10", ggtheme = ggthemes::theme_fivethirtyeight(), ggstatsplot.layer = FALSE ), # plot 2: lmRob ggstatsplot::ggcoefstats( x = robust::lmRob( formula = Sepal.Length ~ Sepal.Width * Species, data = iris ), title = "robust linear model", package = "awtools", palette = "a_palette", ggtheme = ggthemes::theme_tufte(), ggstatsplot.layer = FALSE ), # arguments relevant for `combine_plots` function title.text = "Robust variants of `lmRob` and `glmRob` (from`robust` package)", nrow = 2 ) #> Note: AIC and BIC values not available, so skipping caption. #> Note: AIC and BIC values not available, so skipping caption.
robustbase
package (lmrob
, glmrob
)Another alternative is to use robust models, as implemented in the robustbase
package.
# setup set.seed(123) library(robustbase) # dataframe data(coleman) clotting <- data.frame( u = c(5, 10, 15, 20, 30, 40, 60, 80, 100), lot1 = c(118, 58, 42, 35, 27, 25, 21, 19, 18), lot2 = c(69, 35, 26, 21, 18, 16, 13, 12, 12) ) # combined plot for both generalized and simple robust models ggstatsplot::combine_plots( # plot 1: glmrob ggstatsplot::ggcoefstats( x = glmrob( formula = lot1 ~ log(u), data = clotting, family = Gamma ), title = "generalized robust linear model" ), # plot 2: lmrob ggstatsplot::ggcoefstats( x = lmrob(formula = Y ~ ., data = coleman), title = "robust linear model" ), # arguments relevant for `combine_plots` function title.text = "Robust variants of `lmRob` and `glmRob` (from`robustbase` package)", nrow = 2 )
complmrob
)# setup set.seed(123) library(complmrob) # data crimes <- data.frame( lifeExp = state.x77[, "Life Exp"], USArrests[, c("Murder", "Assault", "Rape")] ) # model mUSArr <- complmrob(formula = lifeExp ~ ., data = crimes) # plot ggstatsplot::ggcoefstats( x = mUSArr, title = "MM-type estimators for linear regression on compositional data" ) #> Note: AIC and BIC values not available, so skipping caption.
nlreg
)set.seed(123) library(nlreg) library(boot) data(calcium) # homoscedastic model fit calcium.nl <- nlreg::nlreg( formula = cal ~ b0 * (1 - exp(-b1 * time)), start = c(b0 = 4, b1 = 0.1), data = calcium ) # plot ggstatsplot::ggcoefstats( x = calcium.nl, conf.int = FALSE, title = "fit a nonlinear heteroscedastic model via maximum likelihood" ) #> #> differentiating mean function -- may take a while #> differentiating variance function -- may take a while #> #> differentiating mean function -- may take a while #> differentiating variance function -- may take a while #> Note: AIC and BIC values not available, so skipping caption. #> #> differentiating mean function -- may take a while #> differentiating variance function -- may take a while #> #> differentiating mean function -- may take a while #> differentiating variance function -- may take a while #> #> differentiating mean function -- may take a while #> differentiating variance function -- may take a while #> #> differentiating mean function -- may take a while #> differentiating variance function -- may take a while
felm
)Models of class felm
from lfe
package are also supported. This method is used to fit linear models with multiple group fixed effects, similarly to lm
. It uses the Method of Alternating projections to sweep out multiple group effects from the normal equations before estimating the remaining coefficients with OLS.
# setup set.seed(123) library(lfe) # create covariates x <- rnorm(1000) x2 <- rnorm(length(x)) # individual and firm id <- factor(sample(20, length(x), replace = TRUE)) firm <- factor(sample(13, length(x), replace = TRUE)) # effects for them id.eff <- rnorm(nlevels(id)) firm.eff <- rnorm(nlevels(firm)) # left hand side u <- rnorm(length(x)) y <- x + 0.5 * x2 + id.eff[id] + firm.eff[firm] + u # estimate and print result est <- lfe::felm(formula = y ~ x + x2 | id + firm) # plot ggstatsplot::ggcoefstats( x = est, title = "linear model with multiple group fixed effects" )
plm
)# data set.seed(123) library(plm) data("Produc", package = "plm") # model plm.mod <- plm::plm( formula = log(gsp) ~ log(pcap) + log(pc) + log(emp) + unemp, data = Produc, index = c("state", "year") ) # plot ggstatsplot::ggcoefstats( x = plm.mod, title = "linear models for panel data" )
mixed
)# setup set.seed(123) library(afex) data(sleepstudy) # data sleepstudy$mygrp <- sample(1:5, size = 180, replace = TRUE) sleepstudy$mysubgrp <- NA for (i in 1:5) { filter_group <- sleepstudy$mygrp == i sleepstudy$mysubgrp[filter_group] <- sample(1:30, size = sum(filter_group), replace = TRUE) } # linear model m1 <- afex::mixed( formula = Reaction ~ Days + (1 + Days | Subject), data = sleepstudy ) #> Fitting one lmer() model. [DONE] #> Calculating p-values. [DONE] # linear mixed-effects model m2 <- afex::mixed( formula = Reaction ~ Days + (1 | mygrp / mysubgrp) + (1 | Subject), data = sleepstudy ) #> Fitting one lmer() model. [DONE] #> Calculating p-values. [DONE] # plot ggstatsplot::combine_plots( ggstatsplot::ggcoefstats(m1, title = "linear model (`afex` package)"), ggstatsplot::ggcoefstats(m2, title = "linear mixed-effects model (`afex` package)") ) #> Can't extract residuals from model. #> Note: AIC and BIC values not available, so skipping caption. #> Random effect variances not available. Returned R2 does not account for random effects. #> Can't extract residuals from model. #> Note: AIC and BIC values not available, so skipping caption.
mlogit
)# setup set.seed(123) library(mlogit) # data data("Fishing", package = "mlogit") Fish <- mlogit.data(Fishing, varying = c(2:9), shape = "wide", choice = "mode") # a "mixed" model m <- mlogit(mode ~ price + catch | income, data = Fish) # plot ggstatsplot::ggcoefstats( x = m, title = "multinomial logit model" ) #> Can't calculate proper scoring rules for ordinal, multinomial or cumulative link models. #> Note: AIC and BIC values not available, so skipping caption.
mclogit
)# setup set.seed(123) library(mclogit) data(Transport) # model mod <- mclogit::mclogit( formula = cbind(resp, suburb) ~ distance + cost, data = Transport, control = mclogit.control(trace = FALSE) ) # plot ggstatsplot::ggcoefstats( x = mod, title = "Conditional Logit Models" ) #> Note: AIC and BIC values not available, so skipping caption.
mmclogit
)# setup set.seed(123) library(mclogit) data(electors) # model mod <- mclogit::mclogit( formula = cbind(Freq, interaction(time, class)) ~ econ.left / class + welfare / class + auth / class, random = ~ 1 | party.time, control = mclogit.control(trace = FALSE), data = within(electors, party.time <- interaction(party, time)) ) # plot ggstatsplot::ggcoefstats( x = mod, title = "Mixed Conditional Logit Models" ) #> Warning: Number of labels is greater than default palette color count. #> Try using another color `palette` (and/or `package`).
cglm
)# setup set.seed(123) library(cglm) data(teenpov) # model fit.ide <- cglm::cglm( method = "ts", formula = hours ~ nonpov + inschool + spouse + age + mother, data = teenpov, id = "ID", link = "identity" ) # plot ggstatsplot::ggcoefstats( x = fit.ide, title = "conditional generalized linear models for clustered data" ) #> Note: AIC and BIC values not available, so skipping caption.
coxph
)Fitted proportional hazards regression model - as implemented in the survival
package - can also be displayed in a dot-whisker plot.
# for reproducibility set.seed(123) library(survival) # create the simplest test data set test1 <- list( time = c(4, 3, 1, 1, 2, 2, 3), status = c(1, 1, 1, 0, 1, 1, 0), x = c(0, 2, 1, 1, 1, 0, 0), sex = c(0, 0, 0, 0, 1, 1, 1) ) # fit a stratified model mod <- survival::coxph( formula = Surv(time, status) ~ x + strata(sex), data = test1 ) # plot ggstatsplot::ggcoefstats( x = mod, exponentiate = TRUE, title = "Cox proportional hazards regression model" )
coxme
)# setup set.seed(123) library(survival) library(coxme) # model fit <- coxme(Surv(y, uncens) ~ trt + (1 | center), eortc) # plot ggstatsplot::ggcoefstats( x = fit, exclude.intercept = FALSE, title = "mixed effects Cox model" ) #> Note: AIC and BIC values not available, so skipping caption.
truncreg
)# setup set.seed(123) library(truncreg) library(survival) # data data("tobin", package = "survival") # model cragg_trunc <- truncreg( formula = durable ~ age + quant, data = tobin, subset = durable > 0 ) # plot ggstatsplot::ggcoefstats( x = cragg_trunc, title = "Truncated Gaussian Regression Models" ) #> Note: AIC and BIC values not available, so skipping caption.
Arima
)# for reproducibility set.seed(123) # model fit <- stats::arima(x = lh, order = c(1, 0, 0)) # plot ggstatsplot::ggcoefstats( x = fit, title = "autoregressive integrated moving average" ) #> #> Call: #> stats::arima(x = lh, order = c(1, 0, 0)) #> #> Coefficients: #> ar1 intercept #> 0.5739 2.4133 #> s.e. 0.1161 0.1466 #> #> sigma^2 estimated as 0.1975: log likelihood = -29.38, aic = 64.76 #> #> Training set error measures: #> ME RMSE MAE MPE MAPE MASE #> Training set 0.0002080786 0.4443979 0.3490217 -3.56676 15.24388 0.9706522 #> ACF1 #> Training set 0.1355949 #> #> Call: #> stats::arima(x = lh, order = c(1, 0, 0)) #> #> Coefficients: #> ar1 intercept #> 0.5739 2.4133 #> s.e. 0.1161 0.1466 #> #> sigma^2 estimated as 0.1975: log likelihood = -29.38, aic = 64.76 #> #> Training set error measures: #> ME RMSE MAE MPE MAPE MASE #> Training set 0.0002080786 0.4443979 0.3490217 -3.56676 15.24388 0.9706522 #> ACF1 #> Training set 0.1355949 #> #> Call: #> stats::arima(x = lh, order = c(1, 0, 0)) #> #> Coefficients: #> ar1 intercept #> 0.5739 2.4133 #> s.e. 0.1161 0.1466 #> #> sigma^2 estimated as 0.1975: log likelihood = -29.38, aic = 64.76 #> #> Training set error measures: #> ME RMSE MAE MPE MAPE MASE #> Training set 0.0002080786 0.4443979 0.3490217 -3.56676 15.24388 0.9706522 #> ACF1 #> Training set 0.1355949
speedlm
/speedglm
)Example of high performance linear model-
# setup library(speedglm) set.seed(123) # model mod <- speedglm::speedlm( formula = mpg ~ wt + qsec, data = mtcars, fitted = TRUE ) # plot ggstatsplot::ggcoefstats( x = mod, title = "high performance linear model", exclude.intercept = FALSE )
Example of high performance generalized linear model-
# setup set.seed(123) library(speedglm) # data n <- 50000 k <- 5 y <- rgamma(n, 1.5, 1) x <- round(matrix(rnorm(n * k), n, k), digits = 3) colnames(x) <- paste("s", 1:k, sep = "") da <- data.frame(y, x) fo <- as.formula(paste("y~", paste(paste("s", 1:k, sep = ""), collapse = "+"))) # model mod <- speedglm::speedglm( formula = fo, data = da, family = stats::Gamma(log) ) # plot ggstatsplot::ggcoefstats( x = mod, title = "high performance generalized linear model" )
biglm
)# setup set.seed(123) library(biglm) # model bfit1 <- biglm( formula = scale(mpg) ~ scale(wt) + scale(disp), data = mtcars ) # plot ggstatsplot::ggcoefstats( x = bfit1, title = "bounded memory simple linear regression", exclude.intercept = FALSE ) #> Note: AIC and BIC values not available, so skipping caption.
bigglm
)# setup set.seed(123) library(biglm) data(trees) # model mod <- biglm::bigglm( formula = log(Volume) ~ log(Girth) + log(Height), data = trees, chunksize = 10, sandwich = TRUE ) # plot ggstatsplot::ggcoefstats( x = mod, title = "bounded memory general linear regression" ) #> Note: AIC and BIC values not available, so skipping caption.
survreg
)# setup set.seed(123) library(survival) # model mod <- survival::survreg( formula = Surv(futime, fustat) ~ ecog.ps + rx, data = ovarian, dist = "logistic" ) # plot ggstatsplot::ggcoefstats( x = mod, exclude.intercept = FALSE, ggtheme = hrbrthemes::theme_ipsum_rc(), package = "ggsci", palette = "legacy_tron", title = "parametric survival regression model" ) #> Can't calculate log-loss. #> Can't calculate proper scoring rules for models without integer response values. #> `performance_pcp()` only works for models with binary response values.
feis
)# setup set.seed(123) library(feisr) data("mwp", package = "feisr") # model feis.mod <- feis( formula = lnw ~ marry + enrol + as.factor(yeargr) | exp + I(exp^2), data = mwp, id = "id", robust = TRUE ) # plot ggstatsplot::ggcoefstats( x = feis.mod, title = "Fixed Effects Individual Slope Estimator" ) #> Note: AIC and BIC values not available, so skipping caption.
tobit
)# setup set.seed(123) library(AER) data("Affairs", package = "AER") # model m1 <- AER::tobit( formula = affairs ~ age + yearsmarried + religiousness + occupation + rating, data = Affairs ) # plot ggstatsplot::ggcoefstats( x = m1, title = "tobit regression" )
aareg
)# setup library(survival) set.seed(123) # model afit <- survival::aareg( formula = Surv(time, status) ~ age + sex + ph.ecog, data = lung, dfbeta = TRUE ) # plot ggstatsplot::ggcoefstats( x = afit, title = "Aalen's additive regression model", subtitle = "(for censored data)", k = 3 ) #> Note: AIC and BIC values not available, so skipping caption.
cch
)# setup set.seed(123) library(survival) # examples come from cch documentation subcoh <- nwtco$in.subcohort selccoh <- with(nwtco, rel == 1 | subcoh == 1) ccoh.data <- nwtco[selccoh, ] ccoh.data$subcohort <- subcoh[selccoh] ## central-lab histology ccoh.data$histol <- factor(ccoh.data$histol, labels = c("FH", "UH")) ## tumour stage ccoh.data$stage <- factor(ccoh.data$stage, labels = c("I", "II", "III", "IV")) ccoh.data$age <- ccoh.data$age / 12 # Age in years # model fit.ccP <- survival::cch( formula = Surv(edrel, rel) ~ stage + histol + age, data = ccoh.data, subcoh = ~subcohort, id = ~seqno, cohort.size = 4028 ) # plot ggstatsplot::ggcoefstats( x = fit.ccP, title = "relative risk regression model", subtitle = "(for case-cohort studies)", conf.level = 0.99 ) #> Note: AIC and BIC values not available, so skipping caption.
ridgelm
)For ridge regression, neither statistic values nor confidence intervals for estimates are available, so only estimates will be displayed.
# setup set.seed(123) library(MASS) # model names(longley)[1] <- "y" mod <- MASS::lm.ridge(formula = y ~ ., data = longley) # plot ggstatsplot::ggcoefstats( x = mod, title = "ridge regression" ) #> Note: AIC and BIC values not available, so skipping caption. #> Note: No p-values and/or statistic available for the model object; #> skipping labels with statistical details. #> Note: No confidence intervals available for regression coefficientsobject, so skipping whiskers in the plot.
gam
)Important: These model outputs contains both parametric and smooth terms. ggcoefstats
only displays the parametric terms.
# setup set.seed(123) library(mgcv) # model g <- mgcv::gam( formula = mpg ~ s(hp) + am + qsec, family = stats::quasi(), data = mtcars ) # plot ggstatsplot::ggcoefstats( x = g, title = "generalized additive models with integrated smoothness estimation", subtitle = "using `mgcv` package" ) #> Note: No p-values and/or statistic available for the model object; #> skipping labels with statistical details.
Gam
)# setup set.seed(123) library(gam) # model g <- gam::gam( formula = mpg ~ s(hp, 4) + am + qsec, data = mtcars ) # plot ggstatsplot::ggcoefstats( x = g, title = "generalized additive model", subtite = "(using `gam` package)" )
lme
)# for reproducibility set.seed(123) library(lme4) library(nlme) data("sleepstudy") # model mod <- nlme::lme( fixed = Reaction ~ Days, random = ~ 1 + Days | Subject, data = sleepstudy ) # plot ggstatsplot::ggcoefstats( x = mod, title = "linear mixed-effects models (`lme`)" )
gls
)The nlme
package provides a function to fit a linear model using generalized least squares. The errors are allowed to be correlated and/or have unequal variances.
# for reproducibility set.seed(123) library(nlme) # model mod <- nlme::gls( model = follicles ~ sin(2 * pi * Time) + cos(2 * pi * Time), data = Ovary, correlation = corAR1(form = ~ 1 | Mare) ) # plot ggstatsplot::ggcoefstats( x = mod, stats.label.color = "black", ggtheme = hrbrthemes::theme_ipsum_ps(), ggstatsplot.layer = FALSE, exclude.intercept = FALSE, title = "generalized least squares model" )
lm_robust
)# for reproducibility set.seed(123) library(estimatr) # model mod <- lm_robust( formula = mpg ~ gear + wt + cyl, data = mtcars ) # plot ggstatsplot::ggcoefstats( x = mod, title = "ordinary least squares with robust standard errors" ) #> Note: AIC and BIC values not available, so skipping caption.
coeftest
)# setup set.seed(123) library(lmtest) # load data and fit model data("Mandible", package = "lmtest") fm <- stats::lm(formula = length ~ age, data = Mandible, subset = (age <= 28)) # the following commands lead to the same tests: ct <- lmtest::coeftest(fm) # plot ggstatsplot::ggcoefstats( x = ct, plot = "inference for estimated coefficients", conf.level = 0.99 ) #> Note: AIC and BIC values not available, so skipping caption. #> Parameters can't be retrieved for objects of class 'simpleError'. #> Parameters can't be retrieved for objects of class 'simpleError'.
rlm
)# for reproducibility set.seed(123) # model mod <- MASS::rlm(formula = mpg ~ am * cyl, data = mtcars) # plot ggstatsplot::ggcoefstats( x = mod, point.args = list(color = "red", shape = 15), vline.args = list(size = 1, color = "#CC79A7", linetype = "dotdash"), stats.label.color = c("#0072B2", "#D55E00", "darkgreen"), title = "robust regression using an M estimator", ggtheme = ggthemes::theme_stata(), ggstatsplot.layer = FALSE )
rq
)# for reproducibility set.seed(123) library(quantreg) # loading dataframe needed for the analyses below data(stackloss) # plot ggstatsplot::ggcoefstats( x = quantreg::rq( formula = stack.loss ~ ., data = stackloss, method = "br" ), se.type = "iid", title = "quantile regression" )
nlrq
)# setup set.seed(123) library(quantreg) # preparing data Dat <- NULL Dat$x <- rep(1:25, 20) Dat$y <- stats::SSlogis(Dat$x, 10, 12, 2) * rnorm(500, 1, 0.1) # then fit the median using nlrq Dat.nlrq <- quantreg::nlrq( formula = y ~ SSlogis(x, Asym, mid, scal), data = Dat, tau = 0.5, trace = FALSE ) # plot ggstatsplot::ggcoefstats( x = Dat.nlrq, title = "non-linear quantile regression", se.type = "nid" )
rqss
)# setup set.seed(123) library(quantreg) data(CobarOre) # model fCO <- rqss( formula = z ~ qss(cbind(x, y), lambda = .08), data = CobarOre ) # model info ggstatsplot::ggcoefstats( x = fCO, title = "Additive Quantile Regression", exclude.intercept = FALSE ) #> Note: AIC and BIC values not available, so skipping caption.
ivreg
)# setup suppressPackageStartupMessages(library(AER)) set.seed(123) data("CigarettesSW", package = "AER") # model ivr <- AER::ivreg( formula = log(packs) ~ income | population, data = CigarettesSW, subset = year == "1995" ) # plot ggstatsplot::ggcoefstats( x = ivr, title = "instrumental-variable regression" )
mediate
)# setup set.seed(123) library(mediation) data(jobs) # base models b <- stats::lm( formula = job_seek ~ treat + econ_hard + sex + age, data = jobs ) c <- stats::lm( formula = depress2 ~ treat + job_seek + econ_hard + sex + age, data = jobs ) # mediation model mod <- mediation::mediate( model.m = b, model.y = c, sims = 50, treat = "treat", mediator = "job_seek" ) # plot ggstatsplot::ggcoefstats( x = mod, title = "causal mediation analysis" ) #> Note: AIC and BIC values not available, so skipping caption. #> Note: No p-values and/or statistic available for the model object; #> skipping labels with statistical details.
lmodel2
)# setup set.seed(123) library(lmodel2) data(mod2ex2) # model Ex2.res <- lmodel2::lmodel2( formula = Prey ~ Predators, data = mod2ex2, range.y = "relative", range.x = "relative", nperm = 99 ) # plot ggstatsplot::ggcoefstats( x = Ex2.res, exclude.intercept = FALSE, title = "Model II regression" ) #> Note: AIC and BIC values not available, so skipping caption. #> Note: No p-values and/or statistic available for the model object; #> skipping labels with statistical details.
gamlss
)# setup set.seed(123) library(gamlss) # model g <- gamlss::gamlss( formula = y ~ pb(x), sigma.fo = ~ pb(x), family = BCT, data = abdom, method = mixed(1, 20) ) #> GAMLSS-RS iteration 1: Global Deviance = 4771.925 #> GAMLSS-CG iteration 1: Global Deviance = 4771.013 #> GAMLSS-CG iteration 2: Global Deviance = 4770.994 #> GAMLSS-CG iteration 3: Global Deviance = 4770.994 # plot ggstatsplot::ggcoefstats( x = g, title = "generalized additive models for location scale and shape" ) #> ****************************************************************** #> Family: c("BCT", "Box-Cox t") #> #> Call: gamlss::gamlss(formula = y ~ pb(x), sigma.formula = ~pb(x), #> family = BCT, data = abdom, method = mixed(1, 20)) #> #> Fitting method: mixed(1, 20) #> #> ------------------------------------------------------------------ #> Mu link function: identity #> Mu Coefficients: #> Estimate Std. Error t value Pr(>|t|) #> (Intercept) -64.44299 1.33397 -48.31 <2e-16 *** #> pb(x) 10.69464 0.05787 184.80 <2e-16 *** #> --- #> Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1 #> #> ------------------------------------------------------------------ #> Sigma link function: log #> Sigma Coefficients: #> Estimate Std. Error t value Pr(>|t|) #> (Intercept) -2.65041 0.10859 -24.407 < 2e-16 *** #> pb(x) -0.01002 0.00380 -2.638 0.00855 ** #> --- #> Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1 #> #> ------------------------------------------------------------------ #> Nu link function: identity #> Nu Coefficients: #> Estimate Std. Error t value Pr(>|t|) #> (Intercept) -0.1072 0.6296 -0.17 0.865 #> #> ------------------------------------------------------------------ #> Tau link function: log #> Tau Coefficients: #> Estimate Std. Error t value Pr(>|t|) #> (Intercept) 2.4948 0.4261 5.855 7.86e-09 *** #> --- #> Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1 #> #> ------------------------------------------------------------------ #> NOTE: Additive smoothing terms exist in the formulas: #> i) Std. Error for smoothers are for the linear effect only. #> ii) Std. Error for the linear terms maybe are not accurate. #> ------------------------------------------------------------------ #> No. of observations in the fit: 610 #> Degrees of Freedom for the fit: 11.7603 #> Residual Deg. of Freedom: 598.2397 #> at cycle: 3 #> #> Global Deviance: 4770.994 #> AIC: 4794.515 #> SBC: 4846.419 #> ******************************************************************
gmm
)# setup set.seed(123) library(gmm) # examples come from the "gmm" package ## CAPM test with GMM data(Finance) r <- Finance[1:300, 1:10] rm <- Finance[1:300, "rm"] rf <- Finance[1:300, "rf"] z <- as.matrix(r - rf) t <- nrow(z) zm <- rm - rf h <- matrix(zm, t, 1) res <- gmm::gmm(z ~ zm, x = h) # plot ggstatsplot::ggcoefstats( x = res, package = "palettetown", palette = "victreebel", title = "generalized method of moment estimation" ) #> Note: AIC and BIC values not available, so skipping caption.
glmnet
)Although these models are not directly supported in ggcoefstats
because of the sheer number of terms that are typically present. But this function can still be used to selectively show few of the terms of interest:
# setup library(glmnet) set.seed(2014) # creating a dataframe x <- matrix(rnorm(100 * 20), 100, 20) y <- rnorm(100) fit1 <- glmnet::glmnet(x, y) (df <- broom::tidy(fit1)) #> # A tibble: 1,086 x 5 #> term step estimate lambda dev.ratio #> <chr> <dbl> <dbl> <dbl> <dbl> #> 1 (Intercept) 1 -0.207 0.152 0 #> 2 (Intercept) 2 -0.208 0.139 0.00464 #> 3 (Intercept) 3 -0.209 0.127 0.0111 #> 4 (Intercept) 4 -0.210 0.115 0.0165 #> 5 (Intercept) 5 -0.210 0.105 0.0240 #> 6 (Intercept) 6 -0.210 0.0957 0.0321 #> 7 (Intercept) 7 -0.210 0.0872 0.0412 #> 8 (Intercept) 8 -0.210 0.0795 0.0497 #> 9 (Intercept) 9 -0.209 0.0724 0.0593 #> 10 (Intercept) 10 -0.208 0.0660 0.0682 #> # ... with 1,076 more rows # displaying only a certain step ggstatsplot::ggcoefstats(x = dplyr::filter(df, step == 4)) #> Note: The argument `statistic` must be specified. #> Skipping labels with statistical details. #> Note: No p-values and/or statistic available for the model object; #> skipping labels with statistical details. #> Note: No confidence intervals available for regression coefficientsobject, so skipping whiskers in the plot.
mjoint
)# setup set.seed(123) library(joineRML) data(heart.valve) # data hvd <- heart.valve[!is.na(heart.valve$log.grad) & !is.na(heart.valve$log.lvmi) & heart.valve$num <= 50, ] # model fit <- joineRML::mjoint( formLongFixed = list( "grad" = log.grad ~ time + sex + hs, "lvmi" = log.lvmi ~ time + sex ), formLongRandom = list( "grad" = ~ 1 | num, "lvmi" = ~ time | num ), formSurv = Surv(fuyrs, status) ~ age, data = hvd, inits = list("gamma" = c(0.11, 1.51, 0.80)), timeVar = "time" ) # extract the survival fixed effects and plot them ggstatsplot::ggcoefstats( x = fit, conf.level = 0.99, exclude.intercept = FALSE, component = "longitudinal", package = "yarrr", palette = "basel", title = "joint model to time-to-event data and multivariate longitudinal data" ) #> Parameters can't be retrieved for objects of class 'simpleError'. #> Parameters can't be retrieved for objects of class 'simpleError'.
ergm
)# load the Florentine marriage network data set.seed(123) suppressPackageStartupMessages(library(ergm)) data(florentine) # fit a model where the propensity to form ties between # families depends on the absolute difference in wealth gest <- ergm(flomarriage ~ edges + absdiff("wealth")) # plot ggstatsplot::ggcoefstats( x = gest, conf.level = 0.99, title = "exponential-family random graph models" )
btergm
)# setup library(network) library(btergm) set.seed(123) # create 10 random networks with 10 actors networks <- list() for (i in 1:10) { mat <- matrix(rbinom(100, 1, .25), nrow = 10, ncol = 10) diag(mat) <- 0 # loops are excluded nw <- network(mat) # create network object networks[[i]] <- nw # add network to the list } # create 10 matrices as covariate covariates <- list() for (i in 1:10) { mat <- matrix(rnorm(100), nrow = 10, ncol = 10) covariates[[i]] <- mat # add matrix to the list } # model fit <- btergm::btergm( formula = networks ~ edges + istar(2) + edgecov(covariates), parallel = "multicore", ncpus = 4, R = 100, verbose = FALSE ) # plot ggstatsplot::ggcoefstats( x = fit, title = "Terms used in Exponential Family Random Graph Models", subtitle = "by bootstrapped pseudolikelihood or MCMC MLE" ) #> Note: AIC and BIC values not available, so skipping caption. #> Parameters can't be retrieved for objects of class 'simpleError'. #> Parameters can't be retrieved for objects of class 'simpleError'. #> Note: No p-values and/or statistic available for the model object; #> skipping labels with statistical details.
garch
)# setup set.seed(123) library(tseries) data(EuStockMarkets) # model dax <- diff(log(EuStockMarkets))[, "DAX"] dax.garch <- tseries::garch(x = dax, trace = FALSE) # plot ggstatsplot::ggcoefstats( x = dax.garch, title = "generalized autoregressive conditional heteroscedastic (GARCH)" ) #> #> Could not extract p-values from model object. #> #> Could not extract p-values from model object.
bayesGARCH
)# setup set.seed(123) library(bayesGARCH) # data data(dem2gbp) y <- dem2gbp[1:750] # run the sampler (2 chains) mod <- bayesGARCH(y, control = list(n.chain = 4, l.chain = 200)) #> chain: 1 iteration: 10 parameters: 0.0466 0.2211 0.6633 95.2307 #> chain: 1 iteration: 20 parameters: 0.0453 0.1734 0.6749 100.9931 #> chain: 1 iteration: 30 parameters: 0.0437 0.2983 0.6326 122.0091 #> chain: 1 iteration: 40 parameters: 0.0373 0.2468 0.6516 117.6464 #> chain: 1 iteration: 50 parameters: 0.0334 0.2195 0.6856 132.9895 #> chain: 1 iteration: 60 parameters: 0.0264 0.2395 0.7011 77.8092 #> chain: 1 iteration: 70 parameters: 0.039 0.1546 0.7287 87.2207 #> chain: 1 iteration: 80 parameters: 0.0392 0.1785 0.7007 59.2417 #> chain: 1 iteration: 90 parameters: 0.0263 0.17 0.7616 45.3776 #> chain: 1 iteration: 100 parameters: 0.0404 0.2261 0.653 48.1779 #> chain: 1 iteration: 110 parameters: 0.0413 0.2076 0.6692 32.9934 #> chain: 1 iteration: 120 parameters: 0.038 0.1571 0.7163 42.1684 #> chain: 1 iteration: 130 parameters: 0.0382 0.1816 0.6917 40.3974 #> chain: 1 iteration: 140 parameters: 0.0373 0.2183 0.6454 49.3131 #> chain: 1 iteration: 150 parameters: 0.028 0.2353 0.7217 38.7436 #> chain: 1 iteration: 160 parameters: 0.0329 0.193 0.688 31.2683 #> chain: 1 iteration: 170 parameters: 0.0303 0.1572 0.752 35.0719 #> chain: 1 iteration: 180 parameters: 0.0312 0.1609 0.7358 25.6605 #> chain: 1 iteration: 190 parameters: 0.0324 0.1729 0.7255 27.5541 #> chain: 1 iteration: 200 parameters: 0.0325 0.2029 0.6629 14.5005 #> chain: 2 iteration: 10 parameters: 0.0304 0.194 0.7233 99.5183 #> chain: 2 iteration: 20 parameters: 0.0269 0.1915 0.7047 92.2118 #> chain: 2 iteration: 30 parameters: 0.0376 0.1563 0.7085 120.5545 #> chain: 2 iteration: 40 parameters: 0.0514 0.1677 0.6606 105.1761 #> chain: 2 iteration: 50 parameters: 0.049 0.2186 0.6222 86.5734 #> chain: 2 iteration: 60 parameters: 0.0399 0.2268 0.6536 121.7429 #> chain: 2 iteration: 70 parameters: 0.0453 0.2064 0.6692 125.9502 #> chain: 2 iteration: 80 parameters: 0.0444 0.1968 0.6784 129.3226 #> chain: 2 iteration: 90 parameters: 0.0401 0.3088 0.6002 78.7295 #> chain: 2 iteration: 100 parameters: 0.0827 0.2624 0.4892 86.8581 #> chain: 2 iteration: 110 parameters: 0.0873 0.3117 0.4449 106.5393 #> chain: 2 iteration: 120 parameters: 0.0555 0.2505 0.5971 69.3027 #> chain: 2 iteration: 130 parameters: 0.0419 0.2059 0.6605 86.7515 #> chain: 2 iteration: 140 parameters: 0.0378 0.2346 0.6585 60.4076 #> chain: 2 iteration: 150 parameters: 0.0416 0.1783 0.6736 53.2965 #> chain: 2 iteration: 160 parameters: 0.0303 0.1838 0.7273 46.5358 #> chain: 2 iteration: 170 parameters: 0.0268 0.294 0.635 32.8013 #> chain: 2 iteration: 180 parameters: 0.0231 0.1882 0.7561 31.1195 #> chain: 2 iteration: 190 parameters: 0.0349 0.1794 0.7253 31.5426 #> chain: 2 iteration: 200 parameters: 0.0103 0.194 0.7847 22.3614 #> chain: 3 iteration: 10 parameters: 0.041 0.1999 0.6922 88.6284 #> chain: 3 iteration: 20 parameters: 0.0344 0.2347 0.6883 70.6206 #> chain: 3 iteration: 30 parameters: 0.0561 0.3057 0.5408 66.4712 #> chain: 3 iteration: 40 parameters: 0.0488 0.2034 0.6447 76.0521 #> chain: 3 iteration: 50 parameters: 0.0315 0.2061 0.6692 82.5148 #> chain: 3 iteration: 60 parameters: 0.0363 0.2137 0.6556 62.328 #> chain: 3 iteration: 70 parameters: 0.0326 0.2292 0.6803 59.3234 #> chain: 3 iteration: 80 parameters: 0.0293 0.2246 0.6979 50.8155 #> chain: 3 iteration: 90 parameters: 0.0325 0.2034 0.6891 59.9616 #> chain: 3 iteration: 100 parameters: 0.0328 0.2121 0.6887 56.6966 #> chain: 3 iteration: 110 parameters: 0.0335 0.1786 0.7005 52.6272 #> chain: 3 iteration: 120 parameters: 0.0406 0.2197 0.6569 55.0879 #> chain: 3 iteration: 130 parameters: 0.042 0.2599 0.6224 42.6172 #> chain: 3 iteration: 140 parameters: 0.0309 0.2087 0.7024 30.2424 #> chain: 3 iteration: 150 parameters: 0.0571 0.1536 0.6589 26.7549 #> chain: 3 iteration: 160 parameters: 0.0423 0.229 0.6301 22.5458 #> chain: 3 iteration: 170 parameters: 0.0395 0.1837 0.6636 29.8957 #> chain: 3 iteration: 180 parameters: 0.0518 0.1538 0.6749 23.9366 #> chain: 3 iteration: 190 parameters: 0.0217 0.1947 0.7335 15.7109 #> chain: 3 iteration: 200 parameters: 0.0268 0.1615 0.7403 13.4429 #> chain: 4 iteration: 10 parameters: 0.0246 0.2093 0.7329 76.6507 #> chain: 4 iteration: 20 parameters: 0.0271 0.2488 0.6836 65.9045 #> chain: 4 iteration: 30 parameters: 0.0505 0.2757 0.6133 51.7088 #> chain: 4 iteration: 40 parameters: 0.0381 0.2015 0.7246 42.1259 #> chain: 4 iteration: 50 parameters: 0.0342 0.2354 0.6431 22.8195 #> chain: 4 iteration: 60 parameters: 0.0328 0.2087 0.7055 21.4401 #> chain: 4 iteration: 70 parameters: 0.0224 0.1887 0.7345 22.8402 #> chain: 4 iteration: 80 parameters: 0.0259 0.1626 0.7686 15.2682 #> chain: 4 iteration: 90 parameters: 0.0342 0.1973 0.7063 14.6136 #> chain: 4 iteration: 100 parameters: 0.0225 0.2295 0.6934 13.9877 #> chain: 4 iteration: 110 parameters: 0.0384 0.2655 0.6402 10.9089 #> chain: 4 iteration: 120 parameters: 0.0376 0.1749 0.7103 11.2682 #> chain: 4 iteration: 130 parameters: 0.0251 0.1722 0.7571 9.5268 #> chain: 4 iteration: 140 parameters: 0.0175 0.2019 0.7386 9.5059 #> chain: 4 iteration: 150 parameters: 0.0163 0.166 0.7742 10.7162 #> chain: 4 iteration: 160 parameters: 0.0254 0.1739 0.7633 7.8228 #> chain: 4 iteration: 170 parameters: 0.0152 0.1743 0.788 7.7128 #> chain: 4 iteration: 180 parameters: 0.036 0.2483 0.6861 7.8708 #> chain: 4 iteration: 190 parameters: 0.0513 0.2198 0.6061 9.5468 #> chain: 4 iteration: 200 parameters: 0.0276 0.2341 0.6831 8.5831 # plot ggstatsplot::ggcoefstats( x = mod, title = "Bayesian Estimation of the GARCH(1,1) Model with Student-t Innovations" ) #> Note: AIC and BIC values not available, so skipping caption. #> Note: No p-values and/or statistic available for the model object; #> skipping labels with statistical details.
fitdistr
)# model set.seed(123) library(MASS) x <- rnorm(100, 5, 2) fit <- MASS::fitdistr(x, dnorm, list(mean = 3, sd = 1)) # plot ggstatsplot::ggcoefstats( x = fit, title = "maximum-likelihood fitting of univariate distributions", ggtheme = ggthemes::theme_pander() ) #> Note: No p-values and/or statistic available for the model object; #> skipping labels with statistical details.
mle2
)# setup set.seed(123) library(bbmle) # data x <- 0:10 y <- c(26, 17, 13, 12, 20, 5, 9, 8, 5, 4, 8) d <- data.frame(x, y) # custom function LL <- function(ymax = 15, xhalf = 6) { -sum(stats::dpois(y, lambda = ymax / (1 + x / xhalf), log = TRUE)) } # use default parameters of LL fit <- bbmle::mle2(LL, fixed = list(xhalf = 6)) # plot ggstatsplot::ggcoefstats( x = fit, title = "maximum likelihood estimation", ggtheme = ggthemes::theme_excel_new() ) #> Note: AIC and BIC values not available, so skipping caption. #> Parameters can't be retrieved for objects of class 'simpleError'. #> Parameters can't be retrieved for objects of class 'simpleError'.
emmeans
package (emmGrid
)# setup set.seed(123) library(emmeans) # linear model for sales of oranges per day oranges_lm1 <- stats::lm( formula = sales1 ~ price1 + price2 + day + store, data = oranges ) # reference grid; see vignette("basics", package = "emmeans") oranges_rg1 <- emmeans::ref_grid(oranges_lm1) marginal <- emmeans::emmeans(oranges_rg1, "day") # plot ggstatsplot::ggcoefstats( x = marginal, point.args = list(color = "darkgreen", shape = 9), title = "summary grid from `emmeans` package" ) #> Note: AIC and BIC values not available, so skipping caption.
orcutt
)# setup library(orcutt) set.seed(123) # model reg <- stats::lm(formula = mpg ~ wt + qsec + disp, data = mtcars) co <- orcutt::cochrane.orcutt(reg) # plot ggstatsplot::ggcoefstats( x = co, title = "Cochrane-Orcutt estimation" ) #> Note: AIC and BIC values not available, so skipping caption.
confusionMatrix
)# setup library(caret) set.seed(123) # setting up confusion matrix two_class_sample1 <- as.factor(sample(letters[1:2], 100, TRUE)) two_class_sample2 <- as.factor(sample(letters[1:2], 100, TRUE)) two_class_cm <- caret::confusionMatrix(two_class_sample1, two_class_sample2) # plot ggstatsplot::ggcoefstats( x = two_class_cm, by.class = TRUE, title = "confusion matrix" ) #> Note: AIC and BIC values not available, so skipping caption. #> Note: No p-values and/or statistic available for the model object; #> skipping labels with statistical details.
drc
)# setup set.seed(123) library(drc) # model mod <- drc::drm( formula = dead / total ~ conc, curveid = type, weights = total, data = selenium, fct = LL.2(), type = "binomial" ) # plot ggstatsplot::ggcoefstats( x = mod, conf.level = 0.99, title = "Dose-Response Curves" )
flexsurvreg
)# setup set.seed(123) library(flexsurv) data(ovarian) # compare generalized gamma fit with Weibull fitg <- flexsurvreg( formula = Surv(futime, fustat) ~ 1, data = ovarian, dist = "gengamma" ) ggstatsplot::ggcoefstats( x = fitg, title = "Flexible parametric regression for time-to-event data" ) #> 'r2()' does not support models of class 'flexsurvreg'. #> Can't extract residuals from model.
wblm
)# setup set.seed(123) library(panelr) # data data("WageData") wages <- panelr::panel_data(data = WageData, id = id, wave = t) # model mod <- wbm( formula = lwage ~ lag(union) + wks | blk + fem | blk * lag(union), data = wages ) # plot ggstatsplot::ggcoefstats( x = mod, title = "panel regression models fit via multilevel modeling" )
wbgee
)# setup set.seed(123) library(panelr) data("WageData") # data wages <- panelr::panel_data(data = WageData, id = id, wave = t) # model model <- panelr::wbgee( formula = lwage ~ lag(union) + wks | blk + fem | blk * lag(union), data = wages ) # plot ggstatsplot::ggcoefstats( x = model, exclude.intercept = FALSE, conf.level = 0.99, title = "panel regression models fit via generalized estimating equations" )
crch
)# setup set.seed(123) library(crch) data("RainIbk") # mean and standard deviation of square root transformed ensemble forecasts RainIbk$sqrtensmean <- apply(sqrt(RainIbk[, grep("^rainfc", names(RainIbk))]), 1, mean) RainIbk$sqrtenssd <- apply(sqrt(RainIbk[, grep("^rainfc", names(RainIbk))]), 1, sd) # fit linear regression model with Gaussian distribution CRCH <- crch( formula = sqrt(rain) ~ sqrtensmean, data = RainIbk, dist = "gaussian" ) # plot ggstatsplot::ggcoefstats( x = CRCH, title = "Censored Regression with Conditional Heteroscedasticy" ) #> Note: AIC and BIC values not available, so skipping caption.
epi.2by2
)# setup set.seed(123) library(epiR) # data dat <- matrix(c(13, 2163, 5, 3349), nrow = 2, byrow = TRUE) rownames(dat) <- c("DF+", "DF-") colnames(dat) <- c("FUS+", "FUS-") # model fit <- epiR::epi.2by2( dat = as.table(dat), method = "cross.sectional", conf.level = 0.95, units = 100, outcome = "as.columns" ) # tidy dataframe ggstatsplot::ggcoefstats( x = fit, parameters = "moa", title = "Summary measures for count data presented in a 2 by 2 table" ) #> Note: AIC and BIC values not available, so skipping caption. #> Note: No p-values and/or statistic available for the model object; #> skipping labels with statistical details.
brmsfit
)# setup set.seed(123) library(brms) # prior bprior1 <- prior(student_t(5, 0, 10), class = b) + prior(cauchy(0, 2), class = sd) # model fit1 <- brms::brm( formula = count ~ Age + Base * Trt + (1 | patient), data = epilepsy, family = poisson(), prior = bprior1, silent = TRUE ) # plot ggstatsplot::ggcoefstats( x = fit1, exclude.intercept = FALSE, conf.method = "HPDinterval", title = "Bayesian generalized (non-)linear multivariate multilevel models", subtitle = "using `brms` package" ) #> Note: AIC and BIC values not available, so skipping caption. #> Note: No p-values and/or statistic available for the model object; #> skipping labels with statistical details.
Let’s see another example where we use brms
to run the same model on multiple datasets-
# setup set.seed(123) library(brms) library(mice) # data imp <- mice::mice(data = nhanes2, print = FALSE) # fit the model using brms fit <- brms::brm_multiple( formula = bmi ~ age + hyp + chl, data = imp, chains = 1 ) #> #> SAMPLING FOR MODEL '06638bf6dad6b0252e01ce5a7504d5bc' NOW (CHAIN 1). #> Chain 1: #> Chain 1: Gradient evaluation took 0 seconds #> Chain 1: 1000 transitions using 10 leapfrog steps per transition would take 0 seconds. #> Chain 1: Adjust your expectations accordingly! #> Chain 1: #> Chain 1: #> Chain 1: Iteration: 1 / 2000 [ 0%] (Warmup) #> Chain 1: Iteration: 200 / 2000 [ 10%] (Warmup) #> Chain 1: Iteration: 400 / 2000 [ 20%] (Warmup) #> Chain 1: Iteration: 600 / 2000 [ 30%] (Warmup) #> Chain 1: Iteration: 800 / 2000 [ 40%] (Warmup) #> Chain 1: Iteration: 1000 / 2000 [ 50%] (Warmup) #> Chain 1: Iteration: 1001 / 2000 [ 50%] (Sampling) #> Chain 1: Iteration: 1200 / 2000 [ 60%] (Sampling) #> Chain 1: Iteration: 1400 / 2000 [ 70%] (Sampling) #> Chain 1: Iteration: 1600 / 2000 [ 80%] (Sampling) #> Chain 1: Iteration: 1800 / 2000 [ 90%] (Sampling) #> Chain 1: Iteration: 2000 / 2000 [100%] (Sampling) #> Chain 1: #> Chain 1: Elapsed Time: 0.165 seconds (Warm-up) #> Chain 1: 0.034 seconds (Sampling) #> Chain 1: 0.199 seconds (Total) #> Chain 1: #> #> SAMPLING FOR MODEL '06638bf6dad6b0252e01ce5a7504d5bc' NOW (CHAIN 1). #> Chain 1: #> Chain 1: Gradient evaluation took 0 seconds #> Chain 1: 1000 transitions using 10 leapfrog steps per transition would take 0 seconds. #> Chain 1: Adjust your expectations accordingly! #> Chain 1: #> Chain 1: #> Chain 1: Iteration: 1 / 2000 [ 0%] (Warmup) #> Chain 1: Iteration: 200 / 2000 [ 10%] (Warmup) #> Chain 1: Iteration: 400 / 2000 [ 20%] (Warmup) #> Chain 1: Iteration: 600 / 2000 [ 30%] (Warmup) #> Chain 1: Iteration: 800 / 2000 [ 40%] (Warmup) #> Chain 1: Iteration: 1000 / 2000 [ 50%] (Warmup) #> Chain 1: Iteration: 1001 / 2000 [ 50%] (Sampling) #> Chain 1: Iteration: 1200 / 2000 [ 60%] (Sampling) #> Chain 1: Iteration: 1400 / 2000 [ 70%] (Sampling) #> Chain 1: Iteration: 1600 / 2000 [ 80%] (Sampling) #> Chain 1: Iteration: 1800 / 2000 [ 90%] (Sampling) #> Chain 1: Iteration: 2000 / 2000 [100%] (Sampling) #> Chain 1: #> Chain 1: Elapsed Time: 0.136 seconds (Warm-up) #> Chain 1: 0.037 seconds (Sampling) #> Chain 1: 0.173 seconds (Total) #> Chain 1: #> #> SAMPLING FOR MODEL '06638bf6dad6b0252e01ce5a7504d5bc' NOW (CHAIN 1). #> Chain 1: #> Chain 1: Gradient evaluation took 0 seconds #> Chain 1: 1000 transitions using 10 leapfrog steps per transition would take 0 seconds. #> Chain 1: Adjust your expectations accordingly! #> Chain 1: #> Chain 1: #> Chain 1: Iteration: 1 / 2000 [ 0%] (Warmup) #> Chain 1: Iteration: 200 / 2000 [ 10%] (Warmup) #> Chain 1: Iteration: 400 / 2000 [ 20%] (Warmup) #> Chain 1: Iteration: 600 / 2000 [ 30%] (Warmup) #> Chain 1: Iteration: 800 / 2000 [ 40%] (Warmup) #> Chain 1: Iteration: 1000 / 2000 [ 50%] (Warmup) #> Chain 1: Iteration: 1001 / 2000 [ 50%] (Sampling) #> Chain 1: Iteration: 1200 / 2000 [ 60%] (Sampling) #> Chain 1: Iteration: 1400 / 2000 [ 70%] (Sampling) #> Chain 1: Iteration: 1600 / 2000 [ 80%] (Sampling) #> Chain 1: Iteration: 1800 / 2000 [ 90%] (Sampling) #> Chain 1: Iteration: 2000 / 2000 [100%] (Sampling) #> Chain 1: #> Chain 1: Elapsed Time: 0.129 seconds (Warm-up) #> Chain 1: 0.036 seconds (Sampling) #> Chain 1: 0.165 seconds (Total) #> Chain 1: #> #> SAMPLING FOR MODEL '06638bf6dad6b0252e01ce5a7504d5bc' NOW (CHAIN 1). #> Chain 1: #> Chain 1: Gradient evaluation took 0 seconds #> Chain 1: 1000 transitions using 10 leapfrog steps per transition would take 0 seconds. #> Chain 1: Adjust your expectations accordingly! #> Chain 1: #> Chain 1: #> Chain 1: Iteration: 1 / 2000 [ 0%] (Warmup) #> Chain 1: Iteration: 200 / 2000 [ 10%] (Warmup) #> Chain 1: Iteration: 400 / 2000 [ 20%] (Warmup) #> Chain 1: Iteration: 600 / 2000 [ 30%] (Warmup) #> Chain 1: Iteration: 800 / 2000 [ 40%] (Warmup) #> Chain 1: Iteration: 1000 / 2000 [ 50%] (Warmup) #> Chain 1: Iteration: 1001 / 2000 [ 50%] (Sampling) #> Chain 1: Iteration: 1200 / 2000 [ 60%] (Sampling) #> Chain 1: Iteration: 1400 / 2000 [ 70%] (Sampling) #> Chain 1: Iteration: 1600 / 2000 [ 80%] (Sampling) #> Chain 1: Iteration: 1800 / 2000 [ 90%] (Sampling) #> Chain 1: Iteration: 2000 / 2000 [100%] (Sampling) #> Chain 1: #> Chain 1: Elapsed Time: 0.141 seconds (Warm-up) #> Chain 1: 0.038 seconds (Sampling) #> Chain 1: 0.179 seconds (Total) #> Chain 1: #> #> SAMPLING FOR MODEL '06638bf6dad6b0252e01ce5a7504d5bc' NOW (CHAIN 1). #> Chain 1: #> Chain 1: Gradient evaluation took 0 seconds #> Chain 1: 1000 transitions using 10 leapfrog steps per transition would take 0 seconds. #> Chain 1: Adjust your expectations accordingly! #> Chain 1: #> Chain 1: #> Chain 1: Iteration: 1 / 2000 [ 0%] (Warmup) #> Chain 1: Iteration: 200 / 2000 [ 10%] (Warmup) #> Chain 1: Iteration: 400 / 2000 [ 20%] (Warmup) #> Chain 1: Iteration: 600 / 2000 [ 30%] (Warmup) #> Chain 1: Iteration: 800 / 2000 [ 40%] (Warmup) #> Chain 1: Iteration: 1000 / 2000 [ 50%] (Warmup) #> Chain 1: Iteration: 1001 / 2000 [ 50%] (Sampling) #> Chain 1: Iteration: 1200 / 2000 [ 60%] (Sampling) #> Chain 1: Iteration: 1400 / 2000 [ 70%] (Sampling) #> Chain 1: Iteration: 1600 / 2000 [ 80%] (Sampling) #> Chain 1: Iteration: 1800 / 2000 [ 90%] (Sampling) #> Chain 1: Iteration: 2000 / 2000 [100%] (Sampling) #> Chain 1: #> Chain 1: Elapsed Time: 0.176 seconds (Warm-up) #> Chain 1: 0.036 seconds (Sampling) #> Chain 1: 0.212 seconds (Total) #> Chain 1: # plot ggstatsplot::ggcoefstats( x = fit, title = "Same `brms` model with multiple datasets", conf.level = 0.99 ) #> Note: AIC and BIC values not available, so skipping caption. #> Note: No p-values and/or statistic available for the model object; #> skipping labels with statistical details.
stanreg
)# set up set.seed(123) library(rstanarm) # model fit <- rstanarm::stan_glm( formula = mpg ~ wt + am, data = mtcars, chains = 1 ) #> #> SAMPLING FOR MODEL 'continuous' NOW (CHAIN 1). #> Chain 1: #> Chain 1: Gradient evaluation took 0.002 seconds #> Chain 1: 1000 transitions using 10 leapfrog steps per transition would take 20 seconds. #> Chain 1: Adjust your expectations accordingly! #> Chain 1: #> Chain 1: #> Chain 1: Iteration: 1 / 2000 [ 0%] (Warmup) #> Chain 1: Iteration: 200 / 2000 [ 10%] (Warmup) #> Chain 1: Iteration: 400 / 2000 [ 20%] (Warmup) #> Chain 1: Iteration: 600 / 2000 [ 30%] (Warmup) #> Chain 1: Iteration: 800 / 2000 [ 40%] (Warmup) #> Chain 1: Iteration: 1000 / 2000 [ 50%] (Warmup) #> Chain 1: Iteration: 1001 / 2000 [ 50%] (Sampling) #> Chain 1: Iteration: 1200 / 2000 [ 60%] (Sampling) #> Chain 1: Iteration: 1400 / 2000 [ 70%] (Sampling) #> Chain 1: Iteration: 1600 / 2000 [ 80%] (Sampling) #> Chain 1: Iteration: 1800 / 2000 [ 90%] (Sampling) #> Chain 1: Iteration: 2000 / 2000 [100%] (Sampling) #> Chain 1: #> Chain 1: Elapsed Time: 0.066 seconds (Warm-up) #> Chain 1: 0.071 seconds (Sampling) #> Chain 1: 0.137 seconds (Total) #> Chain 1: # plot ggstatsplot::ggcoefstats( x = fit, title = "Bayesian generalized linear models via Stan" ) #> object 'draws' not found #> Note: AIC and BIC values not available, so skipping caption. #> Note: No p-values and/or statistic available for the model object; #> skipping labels with statistical details.
stanmvreg
)# setup set.seed(123) library(rstanarm) pbcLong$ybern <- as.integer(pbcLong$logBili >= mean(pbcLong$logBili)) # model mod <- rstanarm::stan_mvmer( formula = list( ybern ~ year + (1 | id), albumin ~ sex + year + (year | id) ), data = pbcLong, cores = 4, seed = 12345, iter = 1000 ) #> Fitting a multivariate glmer model. #> #> Please note the warmup may be much slower than later iterations! # plot ggstatsplot::ggcoefstats( x = mod, title = "Bayesian multivariate generalized linear models via Stan" ) #> argument is of length zero #> Note: AIC and BIC values not available, so skipping caption. #> Note: No p-values and/or statistic available for the model object; #> skipping labels with statistical details.
nlstanreg
)# setup set.seed(123) library(rstanarm) data("Orange", package = "datasets") Orange$circumference <- Orange$circumference / 100 Orange$age <- Orange$age / 100 # model fit <- rstanarm::stan_nlmer( formula = circumference ~ SSlogis(age, Asym, xmid, scal) ~ Asym | Tree, data = Orange, # for speed only chains = 1, iter = 1000 ) #> #> SAMPLING FOR MODEL 'continuous' NOW (CHAIN 1). #> Chain 1: #> Chain 1: Gradient evaluation took 0 seconds #> Chain 1: 1000 transitions using 10 leapfrog steps per transition would take 0 seconds. #> Chain 1: Adjust your expectations accordingly! #> Chain 1: #> Chain 1: #> Chain 1: Iteration: 1 / 1000 [ 0%] (Warmup) #> Chain 1: Iteration: 100 / 1000 [ 10%] (Warmup) #> Chain 1: Iteration: 200 / 1000 [ 20%] (Warmup) #> Chain 1: Iteration: 300 / 1000 [ 30%] (Warmup) #> Chain 1: Iteration: 400 / 1000 [ 40%] (Warmup) #> Chain 1: Iteration: 500 / 1000 [ 50%] (Warmup) #> Chain 1: Iteration: 501 / 1000 [ 50%] (Sampling) #> Chain 1: Iteration: 600 / 1000 [ 60%] (Sampling) #> Chain 1: Iteration: 700 / 1000 [ 70%] (Sampling) #> Chain 1: Iteration: 800 / 1000 [ 80%] (Sampling) #> Chain 1: Iteration: 900 / 1000 [ 90%] (Sampling) #> Chain 1: Iteration: 1000 / 1000 [100%] (Sampling) #> Chain 1: #> Chain 1: Elapsed Time: 0.904 seconds (Warm-up) #> Chain 1: 0.523 seconds (Sampling) #> Chain 1: 1.427 seconds (Total) #> Chain 1: # plot ggstatsplot::ggcoefstats( x = fit, title = "Bayesian nonlinear models via Stan" ) #> Note: AIC and BIC values not available, so skipping caption. #> Note: No p-values and/or statistic available for the model object; #> skipping labels with statistical details.
bayesx
)# setup set.seed(111) library(R2BayesX) ## generate some data n <- 200 ## regressor dat <- data.frame(x = runif(n, -3, 3)) ## response dat$y <- with(dat, 1.5 + sin(x) + rnorm(n, sd = 0.6)) ## estimate models with ## bayesx REML and MCMC b1 <- R2BayesX::bayesx(y ~ sx(x), method = "REML", data = dat) # plot ggstatsplot::ggcoefstats( x = b1, title = "STAR Models with BayesX", exclude.intercept = FALSE )
MCMCglmm
)# setup set.seed(123) library(lme4) library(MCMCglmm) data(sleepstudy) # model mm0 <- MCMCglmm::MCMCglmm( fixed = scale(Reaction) ~ scale(Days), random = ~Subject, data = sleepstudy, nitt = 4000, pr = TRUE, verbose = FALSE ) # plot ggstatsplot::ggcoefstats( x = mm0, title = "multivariate generalized linear mixed model", conf.method = "HPDinterval", exclude.intercept = FALSE, robust = TRUE ) #> Note: AIC and BIC values not available, so skipping caption. #> Note: No p-values and/or statistic available for the model object; #> skipping labels with statistical details.
mcmc
)# loading the data set.seed(123) library(coda) data(line) # select first chain x1 <- line[[1]] # plot ggstatsplot::ggcoefstats( x = x1, title = "Markov Chain Monte Carlo objects", robust = TRUE, ess = TRUE ) #> Note: AIC and BIC values not available, so skipping caption. #> Note: No p-values and/or statistic available for the model object; #> skipping labels with statistical details.
rjags
)# setup set.seed(123) library(R2jags) # An example model file is given in: model.file <- system.file(package = "R2jags", "model", "schools.txt") # data J <- 8.0 y <- c(28.4, 7.9, -2.8, 6.8, -0.6, 0.6, 18.0, 12.2) sd <- c(14.9, 10.2, 16.3, 11.0, 9.4, 11.4, 10.4, 17.6) # setting up model jags.data <- list("y", "sd", "J") jags.params <- c("mu", "sigma", "theta") jags.inits <- function() { list("mu" = rnorm(1), "sigma" = runif(1), "theta" = rnorm(J)) } # model fitting jagsfit <- R2jags::jags( data = list("y", "sd", "J"), inits = jags.inits, jags.params, n.iter = 10, model.file = model.file ) #> Compiling model graph #> Resolving undeclared variables #> Allocating nodes #> Graph information: #> Observed stochastic nodes: 8 #> Unobserved stochastic nodes: 10 #> Total graph size: 41 #> #> Initializing model # plot ggstatsplot::ggcoefstats( x = jagsfit, title = "Markov Chain Monte Carlo with Just Another Gibbs Sampler (JAGS)" ) #> Note: AIC and BIC values not available, so skipping caption. #> Note: No p-values and/or statistic available for the model object; #> skipping labels with statistical details.
lavaan
)# setup set.seed(123) library(lavaan) # The Holzinger and Swineford (1939) example HS.model <- " visual =~ x1 + x2 + x3 textual =~ x4 + x5 + x6 speed =~ x7 + x8 + x9 " # model fit <- lavaan::lavaan( HS.model, data = HolzingerSwineford1939, auto.var = TRUE, auto.fix.first = TRUE, auto.cov.lv.x = TRUE ) # tidy ggstatsplot::ggcoefstats( x = fit, title = "latent variable model", conf.level = 0.99 ) #> Note: No p-values and/or statistic available for the model object; #> skipping labels with statistical details.
blavaan
)# setup set.seed(123) library(blavaan) future::plan("multiprocess") # The Holzinger and Swineford (1939) example HS.model <- " visual =~ x1 + x2 + x3 textual =~ x4 + x5 + x6 speed =~ x7 + x8 + x9 " # model fit <- blavaan::blavaan( HS.model, data = HolzingerSwineford1939, auto.var = TRUE, auto.fix.first = TRUE, auto.cov.lv.x = TRUE ) #> Computing posterior predictives... # tidy ggstatsplot::ggcoefstats( x = fit, title = "Bayesian latent variable model" ) #> Note: AIC and BIC values not available, so skipping caption. #> Note: No p-values and/or statistic available for the model object; #> skipping labels with statistical details.
list
outputsNote that a number of regression models will return an object of class list
, in which case this function will fail. But often you can extract the object of interest from this list and use it to plot the regression coefficients.
# setup library(gamm4) set.seed(123) # data dat <- gamSim(1, n = 400, scale = 2) #> Gu & Wahba 4 term additive model # now add 20 level random effect `fac'... dat$fac <- fac <- as.factor(sample(1:20, 400, replace = TRUE)) dat$y <- dat$y + model.matrix(~ fac - 1) %*% rnorm(20) * .5 # model object br <- gamm4::gamm4( formula = y ~ s(x0) + x1 + s(x2), data = dat, random = ~ (1 | fac) ) # looking at the classes of the objects contained in the list purrr::map(br, class) #> $mer #> [1] "lmerMod" #> attr(,"package") #> [1] "lme4" #> #> $gam #> [1] "gam" # plotting ggstatsplot::combine_plots( # first object plot (only parametric terms are shown) ggstatsplot::ggcoefstats( x = br$gam, exclude.intercept = FALSE, title = "generalized additive model (parametric terms)", k = 3 ), # second object plot ggstatsplot::ggcoefstats( x = br$mer, exclude.intercept = FALSE, title = "linear mixed-effects model", k = 3 ), nrow = 1 ) #> Note: AIC and BIC values not available, so skipping caption. #> Note: No p-values and/or statistic available for the model object; #> skipping labels with statistical details.
In case the estimates you are displaying come from multiple studies, you can also use this function to carry out random-effects meta-analysis. The dataframe you enter must contain at the minimum the following three columns-
term
: a column with names/identifiers to annotate each study/effectestimate
: a column with the observed effect sizes or outcomesstd.error
: a column the corresponding standard errors# setup set.seed(123) library(metaplus) # renaming to what the function expects df <- dplyr::rename(mag, estimate = yi, std.error = sei, term = study) # plot ggstatsplot::ggcoefstats( x = df, meta.analytic.effect = TRUE, meta.type = "parametric", title = "parametric random-effects meta-analysis" ) #> Note: The argument `statistic` must be specified. #> Skipping labels with statistical details. #> Note: No p-values and/or statistic available for the model object; #> skipping labels with statistical details.
# plot ggstatsplot::ggcoefstats( x = df, meta.analytic.effect = TRUE, meta.type = "robust", title = "robust random-effects meta-analysis" ) #> Note: The argument `statistic` must be specified. #> Skipping labels with statistical details. #> Note: No p-values and/or statistic available for the model object; #> skipping labels with statistical details.
# plot ggstatsplot::ggcoefstats( x = df, meta.analytic.effect = TRUE, meta.type = "bayes", title = "Bayesian random-effects meta-analysis" ) #> Note: The argument `statistic` must be specified. #> Skipping labels with statistical details. #> Note: No p-values and/or statistic available for the model object; #> skipping labels with statistical details.
You can also provide a dataframe containing all the other relevant information for additionally displaying labels with statistical information.
# let's make up a dataframe (with all available details) df_full <- tibble::tribble( ~term, ~statistic, ~estimate, ~std.error, ~p.value, ~df.residual, "study1", 0.158, 0.0665, 0.778, 0.875, 5L, "study2", 1.33, 0.542, 0.280, 0.191, 10L, "study3", 1.24, 0.045, 0.030, 0.001, 12L, "study4", 0.156, 0.500, 0.708, 0.885, 8L, "study5", 0.33, 0.032, 0.280, 0.101, 2L, "study6", 1.04, 0.085, 0.030, 0.001, 3L ) # plot ggstatsplot::ggcoefstats( x = df_full, meta.analytic.effect = TRUE, statistic = "t", package = "LaCroixColoR", palette = "paired" )
Sometimes you don’t have a model object but a custom dataframe that you want display using this function. If a data frame is to be plotted, it must contain columns named term
(names of predictors), and estimate
(corresponding estimates of coefficients or other quantities of interest). Other optional columns are conf.low
and conf.high
(for confidence intervals), and p.value
. You will also have to specify the type of statistic relevant for regression models ("t"
, "z"
, "f"
) in case you want to display statistical labels.
# set up set.seed(123) library(ggstatsplot) # data df <- structure( list( term = structure( 1:4, .Label = c("Africa", "Americas", "Asia", "Europe", "Oceania"), class = "factor" ), estimate = c( 1.67094092445114, 2.04302016067304, 7.7776936709062, 1.58877855949562 ), std.error = c( 0.424193563374225, 0.317894853532806, 1.54035329785677, 0.376364844792269 ), statistic = c( 3.93910013900195, 6.42671668939808, 5.04929205639251, 4.22137875383269 ), conf.low = c( 0.839536817763953, 1.41995769687811, 4.75865668363943, 0.851117018655769 ), conf.high = c( 2.50234503113833, 2.66608262446798, 10.796730658173, 2.32644010033548 ), p.value = c( 8.17877903974459e-05, 1.3038950646035e-10, 4.43450343577918e-07, 2.42812549594511e-05 ) ), row.names = c(NA, -4L), class = c("tbl_df", "tbl", "data.frame") ) # plot ggstatsplot::ggcoefstats( x = df, statistic = "t", sort = "ascending", title = "Relationship between life expectancy and GDP", subtitle = "Source: Gapminder foundation", caption = "Data from Oceania continent not included" )
This function can also be used to extract outputs other than a plot, although it is much more preferable to use the underlying functions instead (e.g., broom::tidy
or parameters::model_parameters
).
# setup set.seed(123) library(ggstatsplot) # data DNase1 <- subset(DNase, Run == 1) # using a selfStart model nlmod <- stats::nls(density ~ SSlogis(log(conc), Asym, xmid, scal), DNase1) # tidy dataframe ggcoefstats(nlmod, output = "tidy") #> # A tibble: 3 x 11 #> term estimate conf.low conf.high std.error statistic df.error p.value #> <fct> <dbl> <dbl> <dbl> <dbl> <dbl> <int> <dbl> #> 1 Asym 2.35 2.19 2.50 0.0782 30.0 13 2.17e-13 #> 2 xmid 1.48 1.32 1.64 0.0814 18.2 13 1.22e-10 #> 3 scal 1.04 0.978 1.10 0.0323 32.3 13 8.51e-14 #> significance p.value.formatted #> <chr> <chr> #> 1 *** <= 0.001 #> 2 *** <= 0.001 #> 3 *** <= 0.001 #> label #> <chr> #> 1 list(~widehat(italic(beta))==2.35, ~italic(t)(13)==30.01, ~italic(p)<= 0.001) #> 2 list(~widehat(italic(beta))==1.48, ~italic(t)(13)==18.23, ~italic(p)<= 0.001) #> 3 list(~widehat(italic(beta))==1.04, ~italic(t)(13)==32.27, ~italic(p)<= 0.001) # glance summary ggcoefstats(nlmod, output = "glance") #> # A tibble: 1 x 9 #> sigma isconv fintol loglik aic bic deviance df.residual nobs #> <dbl> <lgl> <dbl> <dbl> <dbl> <dbl> <dbl> <int> <int> #> 1 0.0192 TRUE 0.00000828 42.2 -76.4 -73.3 0.00479 13 16
This vignette was supposed to give a comprehensive account of regression models supported by ggcoefstats
. The list of supported models will keep expanding as additional tidiers are added to the broom
, broom.mixed
, parameters
, and performance
packages.
Note that not all models supported in these packages will be supported by ggcoefstats
. In particular, classes of objects for which there is no column for estimate
(e.g., kmeans
, optim
, muhaz
, survdiff
, zoo
, etc.) are not supported.
If you find any bugs or have any suggestions/remarks, please file an issue on GitHub
: https://github.com/IndrajeetPatil/ggstatsplot/issues
For details, see- https://indrajeetpatil.github.io/ggstatsplot/articles/web_only/session_info.html