Quick reference for the full bossanova API surface. See the model API docs for complete details.
The R tabs use: lme4 + lmerTest (mixed models), emmeans (marginal means & contrasts), sandwich + lmtest (robust SEs), car (Type III tests, VIF), boot (bootstrap), broom (tidy output), simr (power analysis).
```r
install.packages(c("lme4", "lmerTest", "emmeans", "sandwich",
                   "lmtest", "car", "boot", "broom", "simr"))
```

## Model types at a glance

| Model | Formula | `family` | `method` |
|---|---|---|---|
| Linear regression (LM) | `y ~ x1 + x2` | `"gaussian"` (default) | `"ols"` (default) |
| Logistic regression | `y ~ x1 + x2` | `"binomial"` | `"ml"` |
| Poisson regression | `y ~ x1 + x2` | `"poisson"` | `"ml"` |
| Gamma regression | `y ~ x1 + x2` | `"gamma"` | `"ml"` |
| Linear mixed model (LMER) | `y ~ x + (1 \| group)` | `"gaussian"` (default) | `"reml"` (default) |
| GLMM — logistic | `y ~ x + (1 \| group)` | `"binomial"` | `"ml"` |
| GLMM — Poisson | `y ~ x + (1 \| group)` | `"poisson"` | `"ml"` |
## Operators

| Operator | Example | Meaning |
|---|---|---|
| `+` | `a + b` | Main effects (additive) |
| `*` | `a * b` | Main effects + interaction (`a + b + a:b`) |
| `:` | `a:b` | Interaction only |
| `**` / `^` | `x**2` | Polynomial term |
| `(1 \| g)` | `y ~ x + (1 \| subj)` | Random intercept |
| `(x \| g)` | `y ~ x + (x \| subj)` | Random intercept + slope (correlated) |
| `(1 \| g1/g2)` | `y ~ x + (1 \| school/class)` | Nested random effects |
## Transforms

| Transform | Effect | Example |
|---|---|---|
| `center(x)` | `x - mean(x)` | `y ~ center(age)` |
| `zscore(x)` | `(x - mean) / sd` | `y ~ zscore(x)` |
| `scale(x)` | Gelman scaling: `(x - mean) / (2 * sd)` | `y ~ scale(x)` |
| `log(x)` | Natural log | `y ~ log(income)` |
| `poly(x, d)` | Orthogonal polynomial of degree `d` | `y ~ poly(x, 3)` |
| `factor(x)` | Force categorical | `y ~ factor(year)` |
Transforms are stateful — parameters learned at `.fit()` are reapplied at `.predict()`. Nesting supported: `zscore(rank(x))`.
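Statefulness matters for out-of-sample prediction: the mean and sd used by `zscore` must come from the training data, not from the new data. A minimal numpy sketch of the idea (illustrative only, not bossanova internals):

```python
import numpy as np

# Statistics learned from the training column at fit time
train = np.array([2.0, 4.0, 6.0, 8.0])
mu, sd = train.mean(), train.std(ddof=1)

# At predict time, new data is standardized with the *stored* statistics,
# never with the new data's own mean/sd
new = np.array([5.0, 10.0])
z_new = (new - mu) / sd
```

A new value equal to the training mean maps to 0 regardless of what the new batch looks like, which is what keeps predictions on the same scale the model was fit on.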
## Contrast coding

| Function | Intercept = | Coefficients = |
|---|---|---|
| `treatment(x)` | Reference level mean | Difference from reference |
| `sum(x)` | Grand mean | Deviation from grand mean |
| `helmert(x)` | Grand mean | Level vs. mean of prior levels |
| `sequential(x)` | First level mean | Successive differences |
Set reference: `treatment(group, ref=B)`, `sum(group, omit=A)`, `helmert(group, [low, med, high])`.
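To see what the intercept and coefficients mean under the first two schemes, here is a hand-rolled OLS on a three-level factor with group means A = 10, B = 14, C = 18 (plain numpy, independent of bossanova):

```python
import numpy as np

y = np.array([10.0, 10.0, 14.0, 14.0, 18.0, 18.0])
grp = np.array([0, 0, 1, 1, 2, 2])  # levels A, B, C

# Treatment coding (reference = A): indicator columns for B and C
Xt = np.column_stack([np.ones(6), grp == 1, grp == 2]).astype(float)
bt, *_ = np.linalg.lstsq(Xt, y, rcond=None)
# Intercept = mean(A); coefficients = B - A and C - A

# Sum coding (omit C): +1 for own level, -1 for the omitted level
Xs = np.column_stack([np.ones(6),
                      np.where(grp == 2, -1, (grp == 0).astype(int)),
                      np.where(grp == 2, -1, (grp == 1).astype(int))]).astype(float)
bs, *_ = np.linalg.lstsq(Xs, y, rcond=None)
# Intercept = grand mean; coefficients = deviations of A and B from it
```

Here `bt` recovers `[10, 4, 8]` (reference mean, then differences from it) while `bs` recovers `[14, -4, 0]` (grand mean, then deviations), matching the table above.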
## Initialize

```python
from bossanova import model

m = model("y ~ x1 + x2", data)
m = model("y ~ x1 + x2", data, family="binomial")
m = model("y ~ x1 + x2", data, family="poisson")
m = model("y ~ x + (1 | subject)", data)
m = model("y ~ x + (x | subject)", data)  # random slope
m = model("y ~ x + (1 | subject)", data, family="binomial")
m = model("y ~ x + (1 | subject)", data, family="poisson")
```

```r
library(lme4)
library(lmerTest)

m <- lm(y ~ x1 + x2, data)
m <- glm(y ~ x1 + x2, data, family = binomial)
m <- glm(y ~ x1 + x2, data, family = poisson)
m <- lmer(y ~ x + (1 | subject), data)
m <- lmer(y ~ x + (x | subject), data)  # random slope
m <- glmer(y ~ x + (1 | subject), data, family = binomial)
m <- glmer(y ~ x + (1 | subject), data, family = poisson)
```

### Constructor parameters

| Parameter | Type | Default | Notes |
|---|---|---|---|
| `formula` | `str` | required | R-style formula |
| `data` | `DataFrame \| str \| None` | `None` | Polars DataFrame, file path, or `None` for simulation |
| `family` | `str` | `"gaussian"` | `"gaussian"`, `"binomial"`, `"poisson"`, `"gamma"`, `"tdist"` |
| `link` | `str \| None` | `None` | `None` = canonical link for `family` |
| `method` | `str \| None` | `None` | `None` = auto (`"ols"` for LM, `"reml"` for LMER, `"ml"` for GLM/GLMER) |
| `missing` | `str` | `"drop"` | `"drop"` or `"fail"` |
| `contrasts` | `dict \| None` | `None` | `{col: ndarray}` — prefer in-formula syntax instead |
## Fit

Estimate model parameters given some data.

→ API: `.fit()`

```python
m = model("y ~ x", data).fit()
```

```r
m <- lm(y ~ x, data)  # fitting is implicit in R
summary(m)            # coefficients, R², F-test
coef(m)               # coefficients only
```

### Key parameters

| Parameter | Type | Default | Notes |
|---|---|---|---|
| `weights` | `str \| None` | `None` | Column name for observation weights |
| `offset` | `str \| None` | `None` | Column name for offset term |
| `nAGQ` | `int \| None` | `None` | GLMER only: GHQ points (0 = fast, 1 = Laplace, >1 = adaptive) |

### What `.fit()` populates

Augmented data columns (accessible via `m.data`): `fitted`, `resid`, `hat`, `std_resid`, `cooksd`.
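For OLS these diagnostic columns have standard closed forms; a numpy sketch of how they are conventionally computed (illustrative, not bossanova's implementation):

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=30)
y = 2.0 + 0.5 * x + rng.normal(size=30)

X = np.column_stack([np.ones_like(x), x])
beta, *_ = np.linalg.lstsq(X, y, rcond=None)

fitted = X @ beta
resid = y - fitted
# Leverage h_ii: diagonal of the hat matrix X (X'X)^-1 X'
hat = np.einsum("ij,jk,ik->i", X, np.linalg.inv(X.T @ X), X)
p = X.shape[1]
sigma2 = resid @ resid / (len(y) - p)
std_resid = resid / np.sqrt(sigma2 * (1 - hat))   # internally studentized residual
cooksd = std_resid**2 * hat / (p * (1 - hat))     # Cook's distance
```

A useful sanity check: the leverages always sum to the number of model parameters (the trace of a projection matrix equals its rank).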
## Explore

Use a fitted model to estimate marginal effects.

→ API: `.explore()`

```python
m.fit().explore("pairwise(treatment)")
m.fit().explore("treatment")  # estimated marginal means
m.fit().explore("x1")         # marginal slopes
```

```r
library(emmeans)
emmeans(m, ~ treatment)         # estimated marginal means
pairs(emmeans(m, ~ treatment))  # pairwise contrasts
emtrends(m, ~ 1, var = "x1")    # marginal slopes
```

The focal variable (left of `~`) is the variable whose effect is estimated. Conditions (right of `~`) control the reference grid.

| Formula | Computes |
|---|---|
| `"x"` | EMMs for categorical `x`, marginal slope for continuous `x` |
| `"pairwise(x)"` | All pairwise contrasts |
| `"sequential(x)"` | Successive differences (B-A, C-B, ...) |
| `"x ~ z"` | Cross focal `x` with levels of `z` |
| `"x ~ z@[v1, v2]"` | Cross focal `x` with specific `z` values |
| `"x ~ z@range(5)"` | Cross focal `x` with 5 evenly-spaced `z` values |
| `"pairwise(x) ~ z"` | Pairwise contrasts at each level of `z` |
| `"x[A - B]"` | Contrast: A minus B |
| `"x[* - B]"` | Each other level vs B |

| emmeans | Computes |
|---|---|
| `emmeans(m, ~ x)` | EMMs for categorical `x` |
| `emtrends(m, ~ 1, var = "x")` | Marginal slope for continuous `x` |
| `pairs(emmeans(m, ~ x))` | All pairwise contrasts |
| `contrast(emmeans(m, ~ x), method = "consec")` | Successive differences |
| `emmeans(m, ~ x * z)` | Cross `x` with levels of `z` |
| `emmeans(m, ~ x, at = list(z = c(1, 2)))` | Cross `x` with specific `z` values |
| `emtrends(m, ~ 1, var = "x", at = list(z = seq(...)))` | Cross `x` with evenly-spaced `z` values |
| `pairs(emmeans(m, ~ x \| z))` | Pairwise contrasts at each level of `z` |
| `contrast(emmeans(m, ~ x), method = list(c(-1, 1, 0)))` | Custom contrast |

### Key parameters

| Parameter | Type | Default | Notes |
|---|---|---|---|
| `formula` | `str` | required | Focal variable `~` conditions (see table above) |
| `effect_scale` | `str` | `"link"` | `"link"` or `"response"` (GLMs) |
| `how` | `str` | `"auto"` | `"auto"`, `"mem"` (at-mean), or `"ame"` (g-computation) |
| `varying` | `str` | `"exclude"` | `"exclude"` or `"include"` random effects (mixed) |
| `by` | `str \| None` | `None` | Subgroup analysis column |

### What `.explore()` populates

| Property | Description |
|---|---|
| `effects` | Grid columns + estimate (+ SE, CI, p after `.infer()`) |
## Predict

Use a fitted model to make predictions.

→ API: `.predict()`

```python
m.fit().predict()                # in-sample
m.fit().predict(newdata=new_df)  # out-of-sample
m.fit().predict(type="link")     # linear predictor (GLMs)
```

```r
predict(m)                                  # in-sample
predict(m, newdata = new_df)                # out-of-sample
predict(m, type = "link")                   # linear predictor (GLMs)
predict(m, newdata = new_df, re.form = NA)  # exclude random effects (mixed)
```

### Key parameters

| Parameter | Type | Default | Notes |
|---|---|---|---|
| `newdata` | `DataFrame \| None` | `None` | `None` = in-sample predictions |
| `type` | `str` | `"response"` | `"response"` or `"link"` |
| `varying` | `str` | `"exclude"` | `"include"` to use BLUPs in mixed models |
| `allow_new_levels` | `bool` | `False` | Allow unseen group levels (mixed models) |

### What `.predict()` populates

| Property | Description |
|---|---|
| `predictions` | `fitted` (+ `link` for GLMs; + SE, CI after `.infer()`) |
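For a logistic model the two prediction scales are connected by the inverse link; a numpy sketch (with made-up coefficients, not bossanova code) of what `type="link"` vs `type="response"` correspond to:

```python
import numpy as np

beta = np.array([-1.0, 2.0])            # illustrative intercept and slope
X = np.array([[1.0, 0.0], [1.0, 1.0]])  # two rows of a design matrix

eta = X @ beta                   # type="link": linear predictor
p = 1.0 / (1.0 + np.exp(-eta))   # type="response": inverse-logit, a probability
```

`eta` lives on the unbounded log-odds scale, while `p` is squeezed into (0, 1), which is why SEs and CIs differ between the two scales.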
## Infer

Compute uncertainty for parameter estimates, marginal effects, and model predictions.

→ API: `.infer()`

```python
m.fit().infer()                         # asymptotic (default)
m.fit().infer(how="boot", n_boot=5000)  # bootstrap
m.fit().infer(errors="HC3")             # robust SEs
m.fit().infer(how="cv", k=10)           # cross-validation
m.fit().explore("pairwise(x)").infer()  # infer on effects
m.fit().infer(how="profile")            # profile CIs for variance components
```

```r
# Asymptotic (default with lmerTest for mixed models)
summary(m)   # Satterthwaite p-values
confint(m)   # Wald CIs

# Bootstrap
confint(m, method = "boot", nsim = 5000)  # bootstrap CIs (mixed)
library(boot)
boot(data, function(d, i) coef(lm(y ~ x, d[i, ])), R = 5000)  # general

# Robust SEs
library(sandwich); library(lmtest)
coeftest(m, vcov = vcovHC(m, type = "HC3"))

# Profile CIs (mixed models)
confint(m, method = "profile")

# Infer on effects
pairs(emmeans(m, ~ x), infer = c(TRUE, TRUE))  # CIs + p-values
```

### `how` × operation compatibility

| | asymp | boot | perm | cv | profile | joint |
|---|---|---|---|---|---|---|
| `.fit()` → `params` | ✓ | ✓ | ✓ | | | ✓ |
| `.explore()` → `effects` | ✓ | ✓ | ✓ | | | ✓ |
| `.predict()` → `predictions` | ✓ | ✓ | | ✓ | | |
| `.fit()` → `varying_spread` | | | | | ✓ | |
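Conceptually, `how="boot"` refits the model on resampled rows and summarizes the resampled estimates. A minimal nonparametric percentile bootstrap for an OLS slope in plain numpy (illustrative only; bossanova's default interval is BCa via `ci_type="bca"`):

```python
import numpy as np

rng = np.random.default_rng(42)
n = 100
x = rng.normal(size=n)
y = 1.0 + 0.5 * x + rng.normal(scale=0.5, size=n)
X = np.column_stack([np.ones(n), x])

def slope(Xb, yb):
    # OLS slope on one bootstrap sample
    return np.linalg.lstsq(Xb, yb, rcond=None)[0][1]

boots = []
for _ in range(1000):
    idx = rng.integers(0, n, size=n)  # resample rows with replacement
    boots.append(slope(X[idx], y[idx]))
boots = np.array(boots)

lo, hi = np.percentile(boots, [2.5, 97.5])  # 95% percentile CI
```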
### Key parameters

| Parameter | Type | Default | Notes |
|---|---|---|---|
| `how` | `str` | `"asymp"` | Inference method (see matrix above) |
| `conf_level` | `float \| int \| str` | `0.95` | Accepts `0.95`, `95`, or `"95%"` |
| `errors` | `str` | `"iid"` | `"iid"`, `"HC0"`–`"HC3"`, `"hetero"`, `"unequal_var"` |
| `null` | `float` | `0.0` | Null hypothesis value |
| `alternative` | `str` | `"two-sided"` | `"two-sided"`, `"greater"`, `"less"` |
| `n_boot` | `int` | `1000` | Bootstrap resamples |
| `n_perm` | `int` | `1000` | Permutation resamples |
| `ci_type` | `str` | `"bca"` | `"bca"` or `"percentile"` |
| `k` | `int` | `10` | CV folds |
| `seed` | `int \| None` | `None` | Reproducibility |
| `n_jobs` | `int` | `1` | Parallel workers for resampling |
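The `errors="HC3"` option corresponds to the standard HC3 sandwich estimator (as in R's `vcovHC(m, type = "HC3")`); a numpy sketch of the textbook formula, not bossanova's implementation:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 200
x = rng.normal(size=n)
y = 1.0 + 0.5 * x + rng.normal(size=n) * (1.0 + np.abs(x))  # heteroskedastic errors

X = np.column_stack([np.ones(n), x])
XtX_inv = np.linalg.inv(X.T @ X)
beta = XtX_inv @ X.T @ y
e = y - X @ beta
h = np.einsum("ij,jk,ik->i", X, XtX_inv, X)  # leverages h_ii

# HC3 "meat": X' diag(e_i^2 / (1 - h_i)^2) X
meat = X.T @ (X * (e / (1.0 - h))[:, None] ** 2)
vcov_hc3 = XtX_inv @ meat @ XtX_inv           # sandwich: bread-meat-bread
se_hc3 = np.sqrt(np.diag(vcov_hc3))
```

Dividing each squared residual by `(1 - h_i)^2` inflates the contribution of high-leverage points, which is what distinguishes HC3 from HC0–HC2.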
## Summary & export

Print a nicely formatted R-style summary.

→ API: `.summary()` · `.show_math()`

| Method | Returns | Notes |
|---|---|---|
| `.summary()` | `FormattedText` | R-style summary; `style="compact"` for minimal |
| `.summary(digits=5)` | `FormattedText` | Control decimal places |
| `.show_math()` | `MathDisplay` | LaTeX equation (renders in notebooks) |
| `.to_odds_ratio()` | `DataFrame` | Exponentiated estimates (binomial only) |
| `.to_response_scale()` | `DataFrame` | Inverse-link on effects |
| `.to_effect_size()` | `DataFrame` | Cohen's d, η², r (semi-partial) |
| `.vif()` | `DataFrame` | Variance inflation factors |
| `.filter_params(terms)` | `DataFrame` | Subset params by term name(s) |
| `.filter_significant(alpha)` | `DataFrame` | Params where p < alpha |
| `.filter_effects(...)` | `DataFrame` | Subset effects by terms/levels/contrasts |
| `.jointtest()` | `DataFrame` | Type III F / χ² tests |
| `.to_markdown()` | `str` | Pipe-delimited markdown of last result |
| `.to_csv()` | `str` | CSV of last result |

| Function | Returns | Notes |
|---|---|---|
| `summary(m)` | Console output | Full model summary |
| `broom::tidy(m)` | tibble | Tidy coefficient table |
| `broom::glance(m)` | tibble | Model diagnostics (AIC, BIC, R², ...) |
| `exp(cbind(coef(m), confint(m)))` | matrix | Odds ratios (binomial) |
| `car::vif(m)` | vector | Variance inflation factors |
| `car::Anova(m, type = "III")` | data.frame | Type III F / χ² tests |
| `broom::tidy(m) \|> knitr::kable()` | string | Markdown table |
| `write.csv(broom::tidy(m), ...)` | file | CSV export |
## Simulate

Simulate data from a model with or without fitting.

→ API: `.simulate()`

```python
from bossanova.distributions import normal, categorical, varying

# Simulate a dataset from scratch (no data supplied)
m = model("y ~ x + group")
m.simulate(n=200, coef={"x": 0.5, "groupB": -1.0},
           x=normal(0, 1), group=categorical(["A", "B"]))
m.fit().infer()

# Simulate responses from a fitted model
m = model("y ~ x", data).fit()
m.simulate(nsim=1000)  # 1000 response vectors from fitted model

# Power analysis
m = model("y ~ x + group")
m.simulate(n=100, coef={"x": 0.5}, x=normal(0, 1),
           group=categorical(["A", "B"]),
           power=500)  # 500 simulations
m.power_results  # power, coverage, bias
```

```r
m <- lm(y ~ x, data)
simulate(m, nsim = 1000)  # 1000 response vectors from fitted model

library(simr)
m <- lmer(y ~ x + (1 | group), data)
powerSim(m, nsim = 500)                     # power via simulation
powerCurve(m, along = "group", nsim = 100)  # power across sample sizes
```

## Plotting

→ API: plot methods

All plot methods return a matplotlib Figure or seaborn FacetGrid.
## Model comparison

Compare nested and non-nested models using a variety of approaches.

→ API: `compare()`

```python
from bossanova import compare

compare(m1, m2, m3)                        # Auto: F (LM), deviance (GLM), LRT (mixed)
compare(m1, m2, method="lrt")              # Explicit likelihood ratio test
compare(m1, m2, method="aic")              # AIC comparison with delta-AIC and weights
compare(m1, m2, method="bic")              # BIC comparison with delta-BIC and weights
compare(m1, m2, method="cv", cv=10)        # Cross-validation comparison
compare(m1, m2, method="lrt", refit=True)  # Refit REML → ML for valid LRT
```

```r
anova(m1, m2, m3)            # F-test (LM) or LRT (mixed)
anova(m1, m2, test = "LRT")  # explicit likelihood ratio test
AIC(m1, m2, m3)              # AIC comparison
BIC(m1, m2, m3)              # BIC comparison
anova(m1, m2, refit = TRUE)  # refit REML → ML for valid LRT (lmerTest)
```

## Properties quick lookup

| Property | Returns | Requires |
|---|---|---|
| `params` | Coefficients (+ SE, CI, p after `.infer()`) | `.fit()` |
| `diagnostics` | AIC, BIC, loglik, R², deviance, etc. | `.fit()` |
| `effects` | EMMs / marginal slopes (+ SE, CI, p after `.infer()`) | `.explore()` |
| `predictions` | Fitted values (+ SE, CI after `.infer()`) | `.predict()` |
| `simulations` | Generated data or simulated responses | `.simulate()` |
| `resamples` | Raw bootstrap/permutation values | `.infer(how="boot"\|"perm")` |
| `power_results` | Power, coverage, bias, RMSE | `.simulate(power=...)` |
| `varying_offsets` | BLUPs per group | `.fit()` + mixed |
| `varying_params` | Population + BLUP coefficients | `.fit()` + mixed |
| `varying_spread` | Variance components (σ², τ², ICC) | `.fit()` + mixed |
| `varying_corr` | Random effect correlations | `.fit()` + mixed |
| `designmat` | Design matrix | data |
| `metadata` | nobs, nparams, ngroups | data |
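The delta-AIC and weights reported by `compare(method="aic")` follow the standard Akaike-weight formulas; a numpy sketch with made-up AIC values:

```python
import numpy as np

# Illustrative AIC values for three candidate models
aic = np.array([230.1, 232.4, 228.7])

delta = aic - aic.min()        # delta-AIC: 0 for the best model
w = np.exp(-0.5 * delta)
weights = w / w.sum()          # Akaike weights sum to 1
best = int(np.argmin(aic))     # lowest AIC wins
```

The weight of a model can be read as its relative support within the candidate set, which is why the weights are normalized to sum to one.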
## Utilities

| Function | Purpose |
|---|---|
| `compare(*models)` | Model comparison: auto-selects F / deviance / LRT by type; or `method="aic"`, `"bic"`, `"cv"` |
| `load_dataset(name)` | Load a built-in dataset |
| `show_datasets()` | List all available datasets |
| `set_backend("jax")` | Switch to JAX backend (default: `"numpy"`) |
| `get_backend()` | Current backend name |
| `set_display_digits(n)` | Set decimal places for display |
| `to_markdown(df)` | Export any DataFrame as markdown |
## Common patterns

```python
# Full workflow — one chain
m = model("y ~ x * group", data).fit().infer()
m.params     # estimates with CIs and p-values
m.summary()  # R-style summary table

# Explore + infer
m.explore("pairwise(group)").infer()
m.effects  # pairwise contrasts with CIs

# Robust standard errors
m.fit().infer(errors="HC3")

# Bootstrap with parallel workers
m.fit().infer(how="boot", n_boot=5000, n_jobs=4)

# Mixed model with profile CIs on variance components
m = model("y ~ x + (1 | subj)", data).fit().infer(how="profile")
m.varying_spread  # σ², τ² with profile CIs
```

```r
# Full workflow
library(lme4); library(lmerTest); library(emmeans)
m <- lm(y ~ x * group, data)
summary(m)  # estimates with p-values
confint(m)  # CIs

# Explore + infer
pairs(emmeans(m, ~ group), infer = c(TRUE, TRUE))

# Robust standard errors
library(sandwich); library(lmtest)
coeftest(m, vcov = vcovHC(m, type = "HC3"))

# Bootstrap CIs
library(boot)
b <- boot(data, function(d, i) coef(lm(y ~ x, d[i, ])), R = 5000)
boot.ci(b, type = "bca")

# Mixed model with profile CIs
m <- lmer(y ~ x + (1 | subj), data)
summary(m)
confint(m, method = "profile")  # profile CIs on all parameters
VarCorr(m)                      # variance components
```