| Title: | Tools for Creating Publication-Ready Regression Tables |
|---|---|
| Description: | Simplifies regression modeling in R by integrating multiple modeling and summarization tools into a cohesive, user-friendly interface. Designed to be accessible for researchers, particularly those in Low- and Middle-Income Countries (LMIC). Built upon widely accepted statistical methods, including logistic regression (Hosmer et al. 2013, ISBN:9781118548429), log-binomial regression (Spiegelman and Hertzmark 2005 <doi:10.1093/aje/kwi188>), Poisson and robust Poisson regression (Zou 2004 <doi:10.1093/aje/kwh090>), negative binomial regression (Hilbe 2011, ISBN:9780521179515), and linear regression (Kutner et al. 2005, ISBN:9780071122214). Leverages multiple dependencies to ensure high-quality output and generate reproducible, publication-ready tables in alignment with best practices in epidemiology and applied statistics. |
| Authors: | Rubeshkumar Polani [aut, cre] (ORCID: <https://orcid.org/0000-0002-0418-7592>), Salin K Eliyas [aut] (ORCID: <https://orcid.org/0000-0002-8020-5860>), Manikandanesan Sakthivel [aut] (ORCID: <https://orcid.org/0000-0002-5438-3970>), Yuvaraj Krishnamoorthy [aut] (ORCID: <https://orcid.org/0000-0003-4688-510X>), Marie Gilbert Majella [aut] (ORCID: <https://orcid.org/0000-0003-4036-5162>) |
| Maintainer: | Rubeshkumar Polani <[email protected]> |
| License: | MIT + file LICENSE |
| Version: | 1.1.0 |
| Built: | 2026-03-27 08:00:01 UTC |
| Source: | https://github.com/thinkdenominator/gtregression |
Works for any object from this package, since they all carry class "gtregression". Returns NULL (quietly) if the field isn't present.
    ## S3 method for class 'gtregression'
    x$name
Common fields:
- table, table_display, table_body
- models, model_summaries, reg_check
- approach, format (or engine), source
- parts, spanners (for merged tables)
- by, levels (for descriptive tables)
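The quiet-NULL behaviour described above can be sketched with a toy S3 class. This is only an illustration under stated assumptions: `toy_result` and `new_toy()` are hypothetical names, and the package's real `$` method may be implemented differently.

```r
# Hypothetical constructor: a plain list tagged with a class,
# standing in for an object returned by the package.
new_toy <- function(...) structure(list(...), class = "toy_result")

# `$` method: exact-name lookup that returns NULL instead of erroring.
`$.toy_result` <- function(x, name) {
  fields <- unclass(x)  # drop the class so the lookup below uses plain list access
  if (name %in% names(fields)) fields[[name]] else NULL
}

obj <- new_toy(approach = "logit", format = "gt")
obj$approach  # "logit"
obj$spanners  # NULL, because the field is absent
```

Unlike the default list accessor, this sketch does not do partial name matching, so a mistyped field name yields NULL rather than a near match.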
Fits a model under the hood (like 'uni_reg()'/'multi_reg()') and runs assumption checks appropriate to the approach. Returns a layered object with a tidy summary table and plots for quick visual diagnosis.
    check_assumptions(
      data,
      outcome,
      exposures = NULL,
      approach = c("auto", "linear", "logit", "log-binomial", "poisson", "robpoisson", "negbin"),
      multivariate = TRUE,
      confounders = NULL,
      weights = NULL,
      cluster = NULL,
      groups = 10,
      top_n = 5,
      explain = TRUE,
      output = c("both", "summary", "plots"),
      quiet = TRUE,
      ...
    )
data |
A data frame. |
outcome |
Character. Outcome variable name. |
exposures |
Character vector of predictors. If 'NULL', uses all columns except 'outcome'. |
approach |
One of '"auto","linear","logit","log-binomial","poisson", "robpoisson","negbin"'. Default '"auto"'. |
multivariate |
Logical. If 'TRUE', fits one adjusted model with all 'exposures' (and 'confounders' if supplied). If 'FALSE', screens each exposure with 'outcome ~ exposure' and stacks the diagnostics. |
confounders |
Optional character vector (used when 'multivariate = TRUE'). |
weights |
Optional weights vector name (character). |
cluster |
Optional cluster id variable name for robust notes (placeholder). |
groups |
Integer. Number of bins for calibration curve (binary models). |
top_n |
Integer. How many influential points to list/mark. |
explain |
Logical. If 'TRUE', adds plain-English suggestions to notes. |
output |
One of '"both","summary","plots"'. |
quiet |
Logical. Suppress messages. |
... |
Reserved for future options. |
An object of class 'gt_assumption_check' with:
- '$summary': tibble of assumption results
- '$plots': named list of ggplot objects (may be empty)
- '$details': raw test objects to aid reproducibility (always includes '$fit')
- '$meta': list with 'approach', 'formula', 'multivariate', 'n', 'weights_used'
    # Logistic example
    df <- mtcars; df$am <- as.integer(df$am)
    ac <- check_assumptions(
      data = df, outcome = "am", exposures = c("wt","hp"),
      approach = "auto", multivariate = TRUE, explain = TRUE
    )
    ac$summary
    if (interactive()) plot(ac)

    # Poisson example
    ac2 <- check_assumptions(
      data = warpbreaks, outcome = "breaks", exposures = c("wool","tension"),
      approach = "auto", multivariate = TRUE
    )
    ac2$summary
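The 'groups' argument sets the number of bins for the calibration curve of binary models. As a rough base-R sketch of what such binning involves (an assumption about the general technique, not the package's internal code), predicted probabilities can be ranked, split into equal-sized bins, and compared with the observed event rate in each bin:

```r
# Logistic model on a built-in dataset.
fit <- glm(am ~ wt, family = binomial, data = mtcars)
p <- fitted(fit)

# Fewer bins than the documented default of 10, because mtcars has only 32 rows.
groups <- 4

# Rank-based binning avoids duplicate cut points when predictions are tied.
bin <- cut(rank(p, ties.method = "first"), breaks = groups, labels = FALSE)

# Mean predicted probability vs observed event rate per bin.
d <- data.frame(bin = bin, predicted = p, observed = mtcars$am)
calib <- aggregate(d[c("predicted", "observed")], by = list(bin = d$bin), FUN = mean)
calib
```

In a well-calibrated model the 'predicted' and 'observed' columns track each other closely across bins.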
Computes Variance Inflation Factors (VIF) for fitted models returned by uni_reg(), multi_reg(), uni_reg_nbin(), or multi_reg_nbin(). Returns one VIF table per model. Applicable to multivariable models only.
    check_collinearity(model)
model |
A fitted model object with class "uni_reg", "multi_reg", "uni_reg_nbin", or "multi_reg_nbin". |
A tibble containing VIF values and interpretation. For multivariable models, returns one tibble. For univariate models, an error is raised indicating VIF is not applicable.
    if (requireNamespace("gtregression", quietly = TRUE) &&
        requireNamespace("mlbench", quietly = TRUE) &&
        getRversion() >= "4.1.0") {
      data(PimaIndiansDiabetes2, package = "mlbench")
      pima <- PimaIndiansDiabetes2 |> dplyr::filter(!is.na(diabetes))
      pima$diabetes <- ifelse(pima$diabetes == "pos", 1, 0)
      fit <- multi_reg(pima,
        outcome = "diabetes",
        exposures = c("age", "mass", "glucose"),
        approach = "logit"
      )
      check_collinearity(fit)
    }
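For readers unfamiliar with VIF, the quantity can be computed by hand in base R. The sketch below uses the textbook definition for numeric predictors, VIF_j = 1 / (1 - R²_j), where R²_j comes from regressing predictor j on the others. `vif_by_hand()` is a hypothetical helper, and the package may compute VIF differently (for example, for factors or for GLMs).

```r
# Hypothetical helper: VIF for each numeric predictor.
vif_by_hand <- function(data, predictors) {
  vapply(predictors, function(p) {
    others <- setdiff(predictors, p)
    # R-squared from regressing predictor p on the remaining predictors
    r2 <- summary(lm(reformulate(others, response = p), data = data))$r.squared
    1 / (1 - r2)
  }, numeric(1))
}

vifs <- vif_by_hand(mtcars, c("wt", "hp", "disp"))
round(vifs, 2)
```

A VIF of 1 means a predictor is uncorrelated with the others; larger values indicate that its coefficient variance is inflated by collinearity.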
Assesses model convergence and provides diagnostics for each exposure (in univariate mode) or for the full model (in multivariable mode), depending on the regression approach used.
    check_convergence(
      data,
      exposures,
      outcome,
      approach = "logit",
      multivariate = FALSE
    )
data |
A data frame containing the dataset. |
exposures |
A character vector of predictor variable names.
If |
outcome |
A character string specifying the outcome variable. |
approach |
A character string specifying the regression approach.
One of:
|
multivariate |
Logical. If |
For robpoisson, predicted probabilities (fitted values) may exceed 1,
which is acceptable when estimating risk ratios but should not be interpreted
as actual probabilities.
This function is useful for identifying convergence issues, especially for
"log-binomial" models, which often fail to converge.
A data frame summarizing convergence diagnostics, including:
Exposure: Name of the exposure variable.
Model: The regression approach used.
Converged: TRUE if the model converged successfully; FALSE otherwise.
Max.prob: Maximum predicted probability or fitted value in the dataset.
[identify_confounder()], [interaction_models()]
    if (requireNamespace("gtregression", quietly = TRUE)) {
      data(data_PimaIndiansDiabetes, package = "gtregression")
      check_convergence(
        data = data_PimaIndiansDiabetes,
        exposures = c("age", "bmi"),
        outcome = "diabetes",
        approach = "logit"
      )
      check_convergence(
        data = data_PimaIndiansDiabetes,
        exposures = c("age", "bmi"),
        outcome = "diabetes",
        approach = "logit",
        multivariate = TRUE
      )
    }
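The diagnostics above can be approximated with plain glm() calls. The following is a minimal sketch, assuming one model per exposure; `convergence_sketch()` is a hypothetical helper, not the package's implementation. It does not wrap the fit in tryCatch(), which real log-binomial fits would need because they can stop with an error when no valid starting values are found.

```r
# Hypothetical helper: fit one GLM per exposure and report convergence.
convergence_sketch <- function(data, outcome, exposures, family) {
  rows <- lapply(exposures, function(x) {
    fit <- suppressWarnings(
      glm(reformulate(x, response = outcome), family = family, data = data)
    )
    data.frame(
      Exposure  = x,
      Converged = fit$converged,   # TRUE if the IRLS iterations met tolerance
      Max.prob  = max(fitted(fit)) # largest fitted value in the data
    )
  })
  do.call(rbind, rows)
}

# Logistic fits keep fitted values within [0, 1].
res_logit <- convergence_sketch(mtcars, "am", c("wt", "hp"), binomial())

# A Poisson working model on a binary outcome (the basis of robust Poisson)
# places no upper bound on fitted values, so Max.prob may exceed 1.
res_pois <- convergence_sketch(mtcars, "am", c("wt", "hp"), poisson())

res_logit
res_pois
```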
A dataset from the MASS package containing risk factors associated with low birth weight (LBW) in newborns. Originally collected at Baystate Medical Center, Springfield, Massachusetts, USA.
    data_birthwt
A data frame with 189 observations and 10 variables:
Indicator for birth weight < 2500g (binary): 0 = normal, 1 = low birth weight
Mother's age in years (numeric)
Mother's weight in pounds at last menstrual period (numeric)
Mother's race (factor): 1 = White, 2 = Black, 3 = Other
Smoking status during pregnancy (binary): 0 = No, 1 = Yes
Number of previous premature labors (integer)
History of hypertension (binary): 0 = No, 1 = Yes
Presence of uterine irritability (binary): 0 = No, 1 = Yes
Number of physician visits during the first trimester (integer, 0–6)
Birth weight in grams (numeric)
The outcome variable is binary ('low'): birth weight < 2500g (yes = 1) or not (no = 0).
Hosmer, D.W., Lemeshow, S. (1989). *Applied Logistic Regression.* New York: Wiley. Also available in MASS and described in detail in its documentation.
Data from a randomized controlled trial (RCT) of the effect of a drug on seizures in patients with epilepsy. Contains repeated-measures data with treatment groups, baseline seizure counts, and follow-up counts.
    data_epilepsy
A data frame with 236 observations and 9 variables:
Number of seizures in a 2-week period (count)
Treatment group (factor): placebo or progabide
Seizure count during baseline period (numeric)
Age of patient (numeric)
Indicator for 4th visit (binary)
Patient ID (factor)
Follow-up period number (integer)
Log of baseline seizures (numeric)
Log of age (numeric)
MASS package. Original data from Thall and Vail (1990)
This dataset contains observations on the number of days absent from school for children in rural Australia, along with student characteristics. It's commonly used to demonstrate count models such as Poisson and Negative Binomial regression.
    data_gt_quin
A data frame with 146 observations and 5 variables:
Ethnicity ("A" = Aboriginal, "N" = Non-Aboriginal)
Sex ("F" or "M")
Age group ("F0", "F1", "F2", "F3")
Learner status ("AL" = average learner, "SL" = slow learner)
Number of days absent from school (count outcome)
MASS package. See also Venables and Ripley (2002), *Modern Applied Statistics with S*.
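To illustrate why these data are a common teaching example for count models, the sketch below fits Poisson and negative binomial models to the original `quine` data from MASS. It assumes MASS is installed and uses the MASS variable names (Days, Eth, Sex, Age, Lrn), which may differ from those in data_gt_quin.

```r
library(MASS)  # recommended package; provides quine and glm.nb()

# Same linear predictor under two count distributions.
pois <- glm(Days ~ Eth + Sex + Age + Lrn, family = poisson, data = quine)
nb   <- glm.nb(Days ~ Eth + Sex + Age + Lrn, data = quine)

# Pearson dispersion statistic: values well above 1 indicate overdispersion,
# which the Poisson model cannot accommodate.
dispersion <- sum(residuals(pois, type = "pearson")^2) / df.residual(pois)

c(dispersion = dispersion, AIC_poisson = AIC(pois), AIC_negbin = AIC(nb))
```

A lower AIC for the negative binomial fit, together with a dispersion statistic well above 1, is the usual argument for preferring it over Poisson on these data.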
Data from a matched case-control study investigating the relationship between infertility and abortions.
    data_infertility
A data frame with 248 observations and 8 variables:
Education level (0 = 0–5 years, 1 = 6–11 years, 2 = 12+ years)
Age in years
Number of prior pregnancies
Number of induced abortions
Infertility case status (1 = case, 0 = control)
Number of spontaneous abortions
Matched set ID
Pooled stratum ID used for conditional regression
https://stat.ethz.ch/R-manual/R-devel/library/datasets/html/infert.html
Survival data from a clinical trial of lung cancer patients conducted by the Veteran's Administration.
    data_lungcancer
A data frame with 137 observations and 8 variables:
Treatment group (1 = standard, 2 = test)
Cell type (squamous, smallcell, adeno, large)
Survival time (in days)
Censoring status (1 = died, 0 = censored)
Karnofsky performance score (higher = better)
Months from diagnosis to randomization
Age in years
Prior therapy (0 = no, 10 = yes)
https://CRAN.R-project.org/package=survival
Kalbfleisch JD and Prentice RL (1980). The Statistical Analysis of Failure Time Data.
A cleaned version of the original Pima Indians Diabetes dataset from the 'mlbench' package. Useful for demonstrating regression approaches for binary outcomes.
    data_PimaIndiansDiabetes
A data frame with 768 observations and 9 variables:
Number of times pregnant
Plasma glucose concentration (glucose tolerance test)
Diastolic blood pressure (mm Hg)
Triceps skin fold thickness (mm)
2-Hour serum insulin (mu U/ml)
Body mass index (BMI)
Diabetes pedigree function
Age in years
Factor indicating diabetes status (pos/neg)
    descriptive_table(
      data,
      exposures,
      by = NULL,
      percent = c("column", "row"),
      digits = 1,
      show_missing = c("ifany", "no"),
      show_dichotomous = c("all_levels", "single_row"),
      show_overall = c("no", "first", "last"),
      statistic = NULL,
      value = NULL,
      format = c("gt", "flextable"),
      theme = c("minimal")
    )
data |
data.frame |
exposures |
character; variables to summarise |
by |
optional single grouping variable |
percent |
"column" (default) or "row"; aliases like "col"/"rows" accepted |
digits |
integer; decimals for |
show_missing |
"ifany" (default) or "no" |
show_dichotomous |
"all_levels" (default) or "single_row" |
show_overall |
"no" (default), "first", or "last" |
statistic |
optional named vector per continuous var: values in "mean","median","mode","count" (default is "median" = Median (IQR)) |
value |
optional named list for single-row binaries (e.g., list(sex="Female")) |
format |
"gt" (default) or "flextable" |
theme |
preset or primitives |
list with class c("gtregression","descriptive_table", <engine>):
$table: gt_tbl or flextable
$table_display: display-ready data
$table_body: long audit data (var/level/type)
metadata fields
Publication-ready summary of categorical and continuous variables (optionally stratified). Mimics the original gtsummary style:
* column headers include N, e.g. "Overall, N=200"
* categorical rows shown as n (%)
* continuous rows default to Median (IQR) (footnote reflects summary)
Returns a tidy summary of each variable's structure, missingness, uniqueness, and suitability for use in regression models.
    dissect(data)
data |
A data frame. |
A tibble with columns: Variable, Type, Missing ( and Regression Hint.
    dissect(data_birthwt)
Wrapper around 'forestploter::forest()' that works directly with 'forest_df()' output or with raw regression objects.
    forest_reg(
      df = NULL,
      uni = NULL,
      multi = NULL,
      desc = NULL,
      theme = NULL,
      ci_col_width = 0.25,
      side = c("right", "left"),
      quiet = TRUE,
      ...
    )
df |
Output of 'forest_df()'. If 'NULL', will be built from (uni, multi, desc). |
uni, multi, desc
|
Optional gtregression objects to pass through to 'forest_df()'. |
theme |
Optional 'forestploter::forest_theme()'. If 'NULL', a sensible default is used. You may pass colors and styling either here (e.g., 'ci_col', 'refline_gp') or via '...'. |
ci_col_width |
Numeric or length-2 numeric. Relative width of the CI column(s). A vector like 'c(0.22, 0.26)' lets you tune uni vs adjusted columns separately. |
side |
Character. For each effect, position of the plot relative to the effect-size text: '"left"' = plot first then text; '"right"' = text first then plot. **Note:** The 'Characteristic' column (and any descriptive/summary columns) always remains on the left. |
quiet |
Logical. Suppress forestploter warnings. Default = 'TRUE'. |
... |
Passed to 'forestploter::forest()'. Common options include: 'ci_col', 'point_col', 'point_shape', 'rowheight', 'ticks_at', 'title', 'footnote'. |
bold_headers |
Logical. Bold the exposure headers (non-indented rows) in the first column. Default 'TRUE'. |
A 'gtregression_forest' object with elements:
- 'plot': the forest plot
- 'data': the input data frame (post-processed order, no 'se_*' columns)
- 'meta': model metadata
Identifies whether one or more variables are confounders by comparing the crude and adjusted effect estimates of a primary exposure on an outcome. A variable is flagged as a confounder if its inclusion changes the estimate by more than a specified threshold (default = 10%).
    identify_confounder(
      data,
      outcome,
      exposure,
      potential_confounder,
      approach = "logit",
      threshold = 10
    )
data |
A data frame containing the variables. |
outcome |
The name of the outcome variable (character string). |
exposure |
The primary exposure variable (character string). |
potential_confounder |
One or more variables to test as potential confounders. |
approach |
The regression modeling approach. One of:
|
threshold |
Numeric. Percent change threshold to define confounding (default = 10). If the absolute percent change exceeds this, the variable is flagged as a confounder. |
Supports logistic, log-binomial, Poisson, robust Poisson, negative binomial, and linear regression approaches.
This method does not evaluate effect modification. Use causal diagrams (e.g., DAGs) and subject-matter knowledge to supplement decisions.
If one confounder is provided, prints crude and adjusted estimates with a confounding flag. If multiple are given, returns a tibble with:
Name of potential confounder.
Crude effect estimate.
Adjusted estimate including the confounder.
Percent change from crude to adjusted.
Logical: whether confounding threshold is exceeded.
[check_convergence()], [interaction_models()]
    data <- data_PimaIndiansDiabetes
    identify_confounder(
      data = data,
      outcome = "glucose",
      exposure = "insulin",
      potential_confounder = "age_cat",
      approach = "linear"
    )
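The change-in-estimate rule can be worked through by hand in base R. This is a minimal sketch using a linear model on mtcars and one common definition of percent change; the package may compute the change on a different scale for ratio measures such as odds ratios.

```r
# Crude model: exposure only. Adjusted model: exposure + candidate confounder.
crude    <- lm(mpg ~ wt, data = mtcars)
adjusted <- lm(mpg ~ wt + hp, data = mtcars)

b_crude <- coef(crude)[["wt"]]
b_adj   <- coef(adjusted)[["wt"]]

# Percent change in the exposure estimate after adjustment.
pct_change <- 100 * (b_adj - b_crude) / b_crude

# Flag confounding when the absolute change exceeds the threshold (10 percent).
is_confounder <- abs(pct_change) > 10

c(crude = b_crude, adjusted = b_adj, pct_change = pct_change)
is_confounder
```

Here adjusting for hp shifts the wt coefficient by well over 10 percent, so hp would be flagged under this rule.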
This function fits two models—one with and one without an interaction term between an exposure and a potential effect modifier— and compares them using either a likelihood ratio test (LRT) or Wald test. It is useful for assessing whether there is statistical evidence of interaction (effect modification).
    interaction_models(
      data,
      outcome,
      exposure,
      covariates = NULL,
      effect_modifier,
      approach = "logit",
      test = c("LRT", "Wald"),
      verbose = TRUE
    )
data |
A data frame containing all required variables. |
outcome |
The name of the outcome variable |
exposure |
The name of the main exposure variable. |
covariates |
character vector of additional covariates to adjust for |
effect_modifier |
The name of the variable to test for interaction |
approach |
The regression modeling approach to use. One of:
|
test |
Type of statistical test for model comparison. Either:
|
verbose |
Logical; if |
A list with the following elements:
model_no_interaction: The model without the interaction term.
model_with_interaction: The model with the interaction term.
p_value: The p-value for interaction (based on selected test).
interpretation: A brief text interpretation if verbose = TRUE.
    data <- data_PimaIndiansDiabetes
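The two comparison tests can be reproduced with base R on the built-in `infert` data. This sketch shows the general technique only; the variable choice is illustrative, and it is not a transcript of what interaction_models() does internally.

```r
# Models without and with the interaction term.
m0 <- glm(case ~ spontaneous + induced, family = binomial, data = infert)
m1 <- glm(case ~ spontaneous * induced, family = binomial, data = infert)

# Likelihood ratio test: the drop in deviance is compared with a chi-squared
# distribution whose df equals the number of added interaction parameters.
lr_stat <- deviance(m0) - deviance(m1)
lr_df   <- df.residual(m0) - df.residual(m1)
p_lrt   <- pchisq(lr_stat, df = lr_df, lower.tail = FALSE)

# Wald test: z-test on the interaction coefficient itself.
p_wald <- coef(summary(m1))["spontaneous:induced", "Pr(>|z|)"]

c(LRT = p_lrt, Wald = p_wald)
```

The two p-values usually agree closely in large samples; the LRT is often preferred when they diverge.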
Merge tables (descriptive / uni / multi) and preserve look & notes
    merge_tables(..., spanners = NULL, theme = "minimal")
... |
package tables with $table_display (same engine) |
spanners |
labels over each panel |
theme |
merge theme (preset or primitives) |
    modify_table(
      gt_table,
      variable_labels = NULL,
      level_labels = NULL,
      header_labels = NULL,
      caption = NULL,
      bold_labels = FALSE,
      bold_levels = FALSE,
      remove_N = FALSE,
      remove_N_obs = FALSE,
      remove_abbreviations = FALSE,
      caveat = NULL
    )
gt_table |
Table object produced by this package (must contain '$table_display'). |
variable_labels |
Named character vector: 'c(old_var = "New label", ...)'. |
level_labels |
Named list for factor levels: 'list(var1 = c(old = "New", ...), var2 = c(...))'. |
header_labels |
Named vector to rename visible headers, e.g. 'c("OR (95 |
caption |
Optional caption/title. |
bold_labels |
Logical; bold variable (header) rows in the body. |
bold_levels |
Logical; bold factor level rows in the body. |
remove_N |
Logical; if 'TRUE', drops the 'N' column for univariate package tables. |
remove_N_obs |
Logical; if 'TRUE', suppresses multivariable complete-case footnote. |
remove_abbreviations |
Logical; if 'TRUE', removes the Abbreviations footnote line. |
caveat |
Optional extra footnote. |
The modified table object (same class as input). Works with objects created by this package (class '"gtregression"': 'uni_reg()', 'multi_reg()', 'descriptive_table()', 'merge_tables()'). No 'gtsummary' dependency or fallback.
Create a publication-ready multivariable regression table using either gt or flextable, without a gtsummary dependency.
    multi_reg(
      data,
      outcome,
      exposures,
      approach = "logit",
      format = c("gt", "flextable"),
      theme = c("minimal")
    )
data |
data.frame |
outcome |
character scalar; outcome column name |
exposures |
character vector; exposure column names (all included in one model) |
approach |
one of |
format |
one of |
theme |
preset name (e.g. |
A list of class c("gtregression","multi_reg", ...) with elements:
A gt_tbl (when format="gt") or flextable (when format="flextable").
Data frame of adjusted estimates and CIs (per level).
Data frame for display (headers + levels) without N column.
List with the single multivariable model.
summary() of the fitted model.
Diagnostics for linear model; message otherwise.
Metadata fields.
    d <- mtcars
    if (requireNamespace("gt", quietly = TRUE)) {
      multi_reg(d, "am", c("mpg","cyl","gear"), approach = "logit", format = "gt")$table
    }
    if (requireNamespace("flextable", quietly = TRUE)) {
      multi_reg(d, "am", c("mpg","cyl","gear"), approach = "logit", format = "flextable")$table
    }
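The adjusted estimates in such a table come from a single model containing all exposures. The base-R sketch below shows how adjusted odds ratios and Wald confidence intervals can be derived from one logistic fit on the built-in `infert` data. It is an assumption about the general approach; the package's interval method and column names may differ.

```r
# One multivariable logistic model with all exposures.
fit <- glm(case ~ age + parity + spontaneous + induced,
           family = binomial, data = infert)

est <- coef(fit)
ci  <- confint.default(fit)  # Wald intervals on the log-odds scale

# Exponentiate to obtain adjusted odds ratios and their 95% CIs.
adj <- data.frame(
  term      = names(est),
  OR        = exp(est),
  conf.low  = exp(ci[, 1]),
  conf.high = exp(ci[, 2]),
  row.names = NULL
)
adj <- adj[adj$term != "(Intercept)", ]
adj
```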
Creates a forest plot from a 'gtsummary'-style object produced by 'gtregression' functions (e.g., 'uni_reg()', 'multi_reg()'). The function supports both univariate and multivariable models, renders hierarchical labels (variable headers vs. levels), and computes significance highlighting using either *p*-values (linear models) or CI-vs-reference rules (non-linear models).
    plot_reg(
      tbl,
      title = NULL,
      order_y = NULL,
      log_x = FALSE,
      xlim = NULL,
      breaks = NULL,
      point_color = "#1F77B4",
      errorbar_color = "#4C4C4C",
      base_size = 14,
      show_ref = TRUE,
      sig_color = NULL,
      sig_errorbar_color = NULL,
      alpha = 0.05
    )
tbl |
A 'gtsummary'-like object returned by 'gtregression' (must contain 'table_body' and attributes 'source' and 'approach'). |
title |
Optional plot title (character). |
order_y |
Optional character vector to customize the y-axis header ordering. |
log_x |
Logical. If 'TRUE', log x-axis (ignored for linear models). |
xlim |
Optional numeric vector of length 2 for x-axis limits. |
breaks |
Optional numeric vector for x-axis tick breaks (ignored if 'log_x = TRUE'). |
point_color |
Fill color for points (default '"#1F77B4"'). |
errorbar_color |
Color for all error bars (default '"#4C4C4C"'). |
base_size |
Base font size for 'theme_minimal()' (default '14'). |
show_ref |
Logical. If 'TRUE', includes the reference level on the plot and labels it '(Ref.)'. |
sig_color |
Optional fill color for **significant** points; if 'NULL', significant points reuse 'point_color'. |
sig_errorbar_color |
Optional color for **significant** error bars; if 'NULL', significant bars reuse 'errorbar_color'. |
alpha |
Significance level for linear models when 'p.value' is available (default '0.05'). |
**Reference line**: The vertical reference is fixed at '0' for linear models and '1' for all other approaches, inferred from 'attr(tbl, "approach")'.
**Header / data detection**: Variable headers are recognized via 'row_type == "label"' together with 'header_row' or missing CI; categorical levels use 'row_type == "level"'; continuous predictors appear as 'row_type == "label"' **with** CIs and are treated as data rows.
**Significance highlighting**:
- For 'approach == "linear"' with available 'p.value', rows are significant when 'p.value < alpha'.
- Otherwise, rows are significant when the CI does not cross the reference ('0' or '1' as above).
Use 'sig_color' / 'sig_errorbar_color' to customize the appearance.
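The highlighting rule can be written as a small function. `is_significant()` below is a hypothetical scalar helper that restates the documented rule; it is not a function exported by the package.

```r
# Hypothetical helper restating the documented rule for one row.
is_significant <- function(approach, conf.low, conf.high,
                           p.value = NA_real_, alpha = 0.05) {
  ref <- if (approach == "linear") 0 else 1  # reference line per approach
  if (approach == "linear" && !is.na(p.value)) {
    return(p.value < alpha)                  # p-value rule for linear models
  }
  conf.low > ref || conf.high < ref          # CI lies entirely on one side of ref
}

is_significant("logit", 1.2, 2.5)                    # TRUE: CI excludes 1
is_significant("logit", 0.8, 1.6)                    # FALSE: CI crosses 1
is_significant("linear", -0.5, 0.3, p.value = 0.62)  # FALSE: p >= alpha
```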
A 'ggplot2' object representing the forest plot.
uni_reg, multi_reg, plot_reg_combine
    if (requireNamespace("mlbench", quietly = TRUE) &&
        requireNamespace("gtregression", quietly = TRUE)) {
      data("PimaIndiansDiabetes2", package = "mlbench")
      pima <- PimaIndiansDiabetes2
      pima$diabetes <- ifelse(pima$diabetes == "pos", 1, 0)
      pima$bmi_cat <- cut(
        pima$mass,
        breaks = c(-Inf, 18.5, 24.9, 29.9, Inf),
        labels = c("Underweight", "Normal", "Overweight", "Obese")
      )
      # Univariate logistic regression table via gtregression
      tbl_uni <- gtregression::uni_reg(
        data = pima,
        outcome = "diabetes",
        exposures = c("age", "bmi_cat"),
        approach = "logit"
      )
      p <- plot_reg(tbl_uni, title = "Univariate (logit)", sig_color = "#D55E00")
      print(p)
    }
Creates two aligned forest plots (univariate and multivariable) from 'gtsummary'-style objects returned by 'gtregression' functions (e.g., 'uni_reg()', 'multi_reg()').
    plot_reg_combine(
      tbl_uni,
      tbl_multi,
      title_uni = NULL,
      title_multi = NULL,
      ref_line = NULL,
      order_y = NULL,
      log_x = FALSE,
      point_color = "#1F77B4",
      errorbar_color = "#4C4C4C",
      base_size = 14,
      show_ref = TRUE,
      sig_color = NULL,
      sig_errorbar_color = NULL,
      xlim_uni = NULL,
      breaks_uni = NULL,
      xlim_multi = NULL,
      breaks_multi = NULL,
      alpha = 0.05
    )
tbl_uni |
Univariate 'gtsummary'-like table. |
tbl_multi |
Multivariable 'gtsummary'-like table. |
title_uni, title_multi
|
Optional panel titles. |
ref_line |
Optional numeric reference line (defaults to 0 for linear, 1 otherwise, inferred per panel). |
order_y |
Optional character vector to customize header ordering. |
log_x |
Logical. If 'TRUE', use log x-axis (ignored for linear models). |
point_color, errorbar_color
|
Base colors for non-significant rows. |
base_size |
Base font size for 'theme_minimal()'. |
show_ref |
Logical; if 'TRUE', include and tag reference levels '(Ref.)'. |
sig_color, sig_errorbar_color
|
Optional colors for significant rows; if 'NULL', they reuse the base colors. |
xlim_uni, breaks_uni
|
Optional x-limits and breaks for the univariate panel. |
xlim_multi, breaks_multi
|
Optional x-limits and breaks for the multivariable panel. |
alpha |
Significance level for linear models when 'p.value' is available. |
The y-axis rows are aligned by a unique '(variable, level)' key so each estimate appears exactly once per panel. Label styling is plain text by default (CRAN-safe). To render bold headers / grey refs in vignettes, pair
A 'patchwork' object with two 'ggplot2' panels.
    if (requireNamespace("mlbench", quietly = TRUE) &&
        requireNamespace("gtregression", quietly = TRUE)) {
      data("PimaIndiansDiabetes2", package = "mlbench")
      d <- PimaIndiansDiabetes2
      d$diabetes <- ifelse(d$diabetes == "pos", 1, 0)
      tbl_u <- gtregression::uni_reg(d, outcome = "diabetes",
        exposures = c("age","glucose"), approach = "logit")
      tbl_m <- gtregression::multi_reg(d, outcome = "diabetes",
        exposures = c("age","glucose"), approach = "logit")
      plot_reg_combine(tbl_u, tbl_m, title_uni = "Univariate", title_multi = "Adjusted")
    }
Prints the rendered table for any object produced by this package
(objects that include class "gtregression"), regardless of subtype
(uni_reg, multi_reg, stratified_*, merged_table,
descriptive_table, ...). If no rendered table is found, a compact
structure of the object (or its display data) is shown.
    ## S3 method for class 'gtregression'
    print(x, ...)
x |
An object with class |
... |
Ignored. Present for compatibility with the generic. |
Saves a collection of gtsummary tables and ggplot2 plots into a .docx file.
    save_docx(tables = NULL, plots = NULL, filename = "report.docx", titles = NULL)
tables |
A list of gtsummary tables. |
plots |
A list of ggplot2 plot objects. |
filename |
File name for the output (with or without .docx extension). |
titles |
Optional. A character vector of titles. |
A Word document saved to a temporary directory (if no path is given). No object is returned.
    library(gtsummary)
    library(ggplot2)
    tbl <- tbl_regression(glm(mpg ~ hp + wt, data = mtcars))
    p <- ggplot(mtcars, aes(x = wt, y = mpg)) + geom_point()
    save_docx(
      tables = list(tbl),
      plots = list(p),
      filename = file.path(tempdir(), "report.docx"),
      titles = c("Table 1: Regression", "Figure 1: Scatterplot")
    )
Saves a ggplot2 plot to a file in PNG, PDF, or JPG format.
    save_plot(
      plot,
      filename = "plot",
      format = c("png", "pdf", "jpg"),
      width = 8,
      height = 6,
      dpi = 300
    )
plot |
A ggplot2 object. |
filename |
Name of the file to save, with or without extension. |
format |
Output format. One of "png", "pdf", or "jpg". |
width |
Width of the saved plot in inches. |
height |
Height of the saved plot in inches. |
dpi |
Resolution of the plot in dots per inch (default is 300). |
Saves the file to a temporary directory (if no path is given).
    library(ggplot2)
    p <- ggplot(mtcars, aes(x = wt, y = mpg)) + geom_point()
    save_plot(p, filename = file.path(tempdir(), "scatterplot"), format = "png")
Saves a gtsummary table as a Word, PDF, or HTML file.
save_table(tbl, filename = "table", format = c("docx", "pdf", "html"))
tbl |
A gtsummary object (e.g., tbl_regression(), tbl_summary()). |
filename |
File name to save the output. Extension is optional. |
format |
Output format. One of "docx", "pdf", or "html". |
Saves the file to a temporary directory (if no path is given). Does not return an object.
model <- glm(mpg ~ hp + wt, data = mtcars)
tbl <- gtsummary::tbl_regression(model)
save_table(tbl, filename = file.path(tempdir(), "regression_table"), format = "docx")
Performs stepwise model selection using forward, backward, or both directions across different regression approaches. Returns a summary table with evaluation metrics (AIC, BIC, log-likelihood, deviance) and the best model.
select_models(
  data,
  outcome,
  exposures,
  approach = "logit",
  direction = "forward"
)
data |
A data frame containing the outcome and predictor variables. |
outcome |
A character string indicating the outcome variable. |
exposures |
A character vector of predictor variables to consider in the model. |
approach |
Regression method. One of:
|
direction |
Stepwise selection direction. One of:
|
A list with the following components:
results_table: A tibble summarising each tested model's metrics (AIC, BIC, deviance, log-likelihood, and adjusted R² if applicable).
best_model: The best-fitting model object, chosen by lowest AIC.
all_models: A named list of all fitted models.
data <- data_PimaIndiansDiabetes
stepwise <- select_models(
  data = data,
  outcome = "glucose",
  exposures = c("age", "pregnant", "mass"),
  approach = "linear",
  direction = "forward"
)
summary(stepwise)
stepwise$results_table
stepwise$best_model
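To make the idea of forward selection by AIC concrete, here is a minimal base R sketch for the linear case. It is an illustration under stated assumptions, not the implementation of select_models(): the function forward_select() is hypothetical, it uses mtcars instead of the package data, and it only mirrors the shape of the documented return value (results_table, best_model, all_models).

```r
# Illustrative forward selection by AIC using base R only
# (hypothetical helper; not the package's code).
forward_select <- function(data, outcome, exposures) {
  selected  <- character(0)
  remaining <- exposures
  current   <- lm(reformulate("1", response = outcome), data = data)  # intercept only
  models    <- list(null = current)

  while (length(remaining) > 0) {
    # Fit one candidate per remaining exposure added to the current set.
    candidates <- lapply(remaining, function(v) {
      lm(reformulate(c(selected, v), response = outcome), data = data)
    })
    names(candidates) <- vapply(
      remaining,
      function(v) paste(c(selected, v), collapse = " + "),
      character(1)
    )
    models <- c(models, candidates)

    aics <- vapply(candidates, AIC, numeric(1))
    best <- which.min(aics)
    if (aics[best] >= AIC(current)) break  # stop when no candidate lowers AIC

    selected  <- c(selected, remaining[best])
    remaining <- remaining[-best]
    current   <- candidates[[best]]
  }

  results_table <- data.frame(
    model    = names(models),
    AIC      = vapply(models, AIC, numeric(1)),
    BIC      = vapply(models, BIC, numeric(1)),
    logLik   = vapply(models, function(m) as.numeric(logLik(m)), numeric(1)),
    deviance = vapply(models, deviance, numeric(1)),
    row.names = NULL
  )
  list(results_table = results_table, best_model = current, all_models = models)
}

fit <- forward_select(mtcars, "mpg", c("hp", "wt", "qsec"))
fit$results_table
formula(fit$best_model)
```

Backward selection would start from the full model and drop one term per step; the stopping rule (no further reduction in AIC) stays the same.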
Fits one multivariable model per stratum and returns a unified wide table: a single "Characteristic" column and, under bold spanners for each stratum, two columns: "Adjusted <effect>" and "p-value".
stratified_multi_reg(
  data,
  outcome,
  exposures,
  stratifier,
  approach = "logit",
  format = c("gt", "flextable"),
  theme = c("minimal")
)
data |
data.frame |
outcome |
character scalar; outcome column name |
exposures |
character vector; predictors included in each model |
stratifier |
character scalar; stratifying variable |
approach |
"logit","log-binomial","poisson","linear","robpoisson","negbin" |
format |
"gt" (default) or "flextable" |
theme |
preset (e.g. "minimal","striped","clinical","shaded","jama") or primitives c("plain","zebra","lines","labels_bold","compact","header_shaded") |
The footer shows two lines: 1) Abbreviations (from '.abbrev_note()'), 2) Per-stratum complete-case N used in the multivariable model.
A list of class c("gtregression","stratified_multi_reg", ...) with:
A gt_tbl (format="gt") or flextable (format="flextable").
Wide data.frame used to build the table.
Named list of per-stratum results (models/summaries/diagnostics).
Named lists by stratum.
Metadata fields.
if (requireNamespace("mlbench", quietly = TRUE) &&
  requireNamespace("dplyr", quietly = TRUE)) {
  data(PimaIndiansDiabetes2, package = "mlbench")
  pima <- dplyr::mutate(
    PimaIndiansDiabetes2,
    diabetes = ifelse(diabetes == "pos", 1, 0),
    glucose_cat = dplyr::case_when(
      glucose < 140 ~ "Normal",
      glucose >= 140 ~ "High"
    )
  )
  stratified_multi <- stratified_multi_reg(
    data = pima,
    outcome = "diabetes",
    exposures = c("age", "mass"),
    stratifier = "glucose_cat",
    approach = "logit"
  )
  stratified_multi$table
}
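The per-stratum logic described for stratified_multi_reg() (one multivariable model per stratum, a wide table with an estimate and a p-value column under each stratum, and a complete-case N per stratum) can be sketched in base R as follows. This is an assumption-laden illustration using a linear model on mtcars; the column labels and object names are made up for the example and do not reproduce the package's table.

```r
# One multivariable model per stratum, base R only
# (illustrative; not the package's code).
outcome    <- "mpg"
exposures  <- c("hp", "wt")
stratifier <- "am"

vars   <- c(outcome, exposures, stratifier)
cc     <- mtcars[complete.cases(mtcars[, vars]), vars]  # complete cases only
strata <- split(cc, cc[[stratifier]])

fits <- lapply(strata, function(d) {
  lm(reformulate(exposures, response = outcome), data = d)
})

# Two columns per stratum: adjusted estimate and p-value for each exposure.
per_stratum <- lapply(names(fits), function(s) {
  co  <- summary(fits[[s]])$coefficients[exposures, , drop = FALSE]
  out <- data.frame(round(co[, "Estimate"], 3), signif(co[, "Pr(>|t|)"], 2))
  names(out) <- paste0(stratifier, "=", s, c(": Adjusted Beta", ": p-value"))
  out
})

wide <- cbind(Characteristic = exposures, do.call(cbind, per_stratum))
rownames(wide) <- NULL

# Complete-case N actually used in each stratum's model.
n_used <- vapply(fits, function(m) as.integer(nobs(m)), integer(1))

wide
n_used
```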
Performs univariate regression for each exposure on a binary, count, or continuous outcome, stratified by a specified variable. Produces a stacked 'gtsummary' table with one column per stratum, along with underlying models and diagnostics.
stratified_uni_reg(
  data,
  outcome,
  exposures,
  stratifier,
  approach = "logit",
  format = c("gt", "flextable"),
  theme = c("minimal")
)
data |
A data frame containing the variables. |
outcome |
A character string giving the name of the outcome variable. |
exposures |
A vector specifying the predictor (exposure) variables. |
stratifier |
A character string specifying the stratifying variable. |
approach |
Modeling approach to use. One of: '"logit"' (Odds Ratios), '"log-binomial"' (Risk Ratios), '"poisson"' (Incidence Rate Ratios), '"robpoisson"' (Robust RR), '"linear"' (Beta coefficients), '"negbin"' (Incidence Rate Ratios). |
An object of class 'stratified_uni_reg', which includes: - 'table': A 'gtsummary::tbl_stack' object with stratified results, - 'models': A list of fitted models for each stratum, - 'model_summaries': A tidy list of model summaries, - 'reg_check': A tibble of regression diagnostics (when available).
$table: Stacked stratified regression table.
$models: List of fitted model objects for each stratum.
$model_summaries: List of tidy model summaries.
$reg_check: Diagnostic check results (when applicable).
[multi_reg()], [plot_reg()], [identify_confounder()]
if (requireNamespace("mlbench", quietly = TRUE) &&
  requireNamespace("dplyr", quietly = TRUE)) {
  data(PimaIndiansDiabetes2, package = "mlbench")
  pima <- dplyr::mutate(
    PimaIndiansDiabetes2,
    diabetes = ifelse(diabetes == "pos", 1, 0),
    glucose_cat = dplyr::case_when(
      glucose < 140 ~ "Normal",
      glucose >= 140 ~ "High"
    )
  )
  stratified_uni <- stratified_uni_reg(
    data = pima,
    outcome = "diabetes",
    exposures = c("age", "mass"),
    stratifier = "glucose_cat",
    approach = "logit"
  )
  stratified_uni$table
}
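The approach labels documented for stratified_uni_reg() each imply a model family and an effect measure. As a rough guide, the sketch below maps four of them to stats::glm() families. The function approach_family() is hypothetical and the mapping is an assumption about how such labels are commonly implemented, not a description of the package internals. "robpoisson" and "negbin" are left out: robust Poisson additionally needs a sandwich variance estimator, and negative binomial models are usually fitted with MASS::glm.nb().

```r
# Hypothetical mapping from approach labels to stats::glm() families
# (an assumption for illustration; not gtregression code).
approach_family <- function(approach) {
  switch(approach,
    "logit"        = binomial(link = "logit"),     # exp(coef): odds ratios
    "log-binomial" = binomial(link = "log"),       # exp(coef): risk ratios
    "poisson"      = poisson(link = "log"),        # exp(coef): incidence rate ratios
    "linear"       = gaussian(link = "identity"),  # coef: beta coefficients
    stop("approach '", approach, "' is not covered by this sketch")
  )
}

fit <- glm(am ~ mpg, data = mtcars, family = approach_family("logit"))

# Odds ratios with Wald confidence intervals on the exponentiated scale.
exp(cbind(estimate = coef(fit), confint.default(fit)))
```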
Creates a publication-ready univariate regression table using either gt or flextable.
uni_reg(
  data,
  outcome,
  exposures,
  approach = "logit",
  format = c("gt", "flextable"),
  theme = c("minimal")
)
data |
data.frame |
outcome |
character scalar; outcome column name |
exposures |
character vector; exposure column names |
approach |
one of "logit", "log-binomial", "poisson", "robpoisson", "linear", "negbin" |
format |
one of "gt" (default) or "flextable" |
theme |
preset name (e.g. "minimal") |
A list of class c("gtregression","uni_reg", ...) with elements:
table: A gt_tbl (when format="gt") or flextable (when format="flextable").
table_body: Data frame of numeric estimates and CIs.
table_display: Data frame for display (headers + levels).
models: List of fitted univariate models.
model_summaries: Per-model summary() results.
reg_check: Diagnostics for linear models; message otherwise.
approach, format, source: Metadata fields.
d <- mtcars
if (requireNamespace("gt", quietly = TRUE)) {
  uni_reg(d, "am", c("mpg","cyl"), approach = "logit", format = "gt")$table
}
if (requireNamespace("flextable", quietly = TRUE)) {
  uni_reg(d, "am", c("mpg","cyl"), approach = "logit", format = "flextable")$table
}
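For readers who want to see what "one univariate model per exposure" amounts to, the following base R sketch fits the same two logistic models as the example above and collects odds ratios with Wald confidence intervals. It is only an approximation of the idea: the object names are illustrative, and uni_reg() itself may compute intervals differently and returns a formatted table rather than a plain data frame.

```r
# One logistic model per exposure, base R only
# (illustrative; not the internals of uni_reg()).
exposures <- c("mpg", "cyl")

models <- lapply(exposures, function(v) {
  glm(reformulate(v, response = "am"), data = mtcars, family = binomial())
})
names(models) <- exposures

estimates <- do.call(rbind, lapply(exposures, function(v) {
  fit <- models[[v]]
  ci  <- confint.default(fit)[v, ]  # Wald interval on the log-odds scale
  data.frame(
    exposure  = v,
    OR        = exp(unname(coef(fit)[v])),
    conf.low  = exp(ci[[1]]),
    conf.high = exp(ci[[2]]),
    p.value   = summary(fit)$coefficients[v, "Pr(>|z|)"]
  )
}))

estimates
```

Each row comes from its own model, so the estimates are unadjusted; a multivariable fit would instead put all exposures into a single formula.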