Quick reference for the full bossanova API surface. See the model API docs for complete details.

Model types at a glance

| Model | Formula | family | method |
|---|---|---|---|
| Linear regression (LM) | `y ~ x1 + x2` | `"gaussian"` (default) | `"ols"` (default) |
| Logistic regression | `y ~ x1 + x2` | `"binomial"` | `"ml"` |
| Poisson regression | `y ~ x1 + x2` | `"poisson"` | `"ml"` |
| Gamma regression | `y ~ x1 + x2` | `"gamma"` | `"ml"` |
| Linear mixed model (LMER) | `y ~ x + (1 \| group)` | `"gaussian"` (default) | `"reml"` (default) |
| GLMM (logistic) | `y ~ x + (1 \| group)` | `"binomial"` | `"ml"` |
| GLMM (Poisson) | `y ~ x + (1 \| group)` | `"poisson"` | `"ml"` |

Formula syntax

Operators

| Operator | Example | Meaning |
|---|---|---|
| `+` | `a + b` | Main effects (additive) |
| `*` | `a * b` | Main effects + interaction (`a + b + a:b`) |
| `:` | `a:b` | Interaction only |
| `**` / `^` | `x**2` | Polynomial term |
| `(1 \| g)` | `y ~ x + (1 \| subj)` | Random intercept |
| `(x \| g)` | `y ~ x + (x \| subj)` | Random intercept + slope (correlated) |
| `(1 \| g1/g2)` | `y ~ x + (1 \| school/class)` | Nested random effects |
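
The `*` row can be checked by hand. The sketch below expands a crossed term into main effects plus every interaction. It is an illustration in plain Python, not bossanova's parser, and `expand_star` is a hypothetical helper name.

```python
from itertools import combinations


def expand_star(factors):
    """Expand a * b * ... into main effects plus all interactions.

    Hypothetical helper for illustration; not part of bossanova.
    """
    terms = []
    # Order 1 gives main effects, order 2 two-way interactions, and so on.
    for order in range(1, len(factors) + 1):
        for combo in combinations(factors, order):
            terms.append(":".join(combo))
    return terms


print(" + ".join(expand_star(["a", "b"])))  # a + b + a:b
```

With three factors the same function returns seven terms: three main effects, three two-way interactions, and one three-way interaction.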

Transforms (in-formula)

| Transform | Effect | Example |
|---|---|---|
| `center(x)` | `x - mean(x)` | `y ~ center(age)` |
| `zscore(x)` | `(x - mean) / sd` | `y ~ zscore(x)` |
| `scale(x)` | Gelman scaling: `(x - mean) / (2 * sd)` | `y ~ scale(x)` |
| `log(x)` | Natural log | `y ~ log(income)` |
| `poly(x, d)` | Orthogonal polynomial of degree `d` | `y ~ poly(x, 3)` |
| `factor(x)` | Force categorical | `y ~ factor(year)` |

Transforms are stateful — parameters learned at .fit() are reapplied at .predict(). Nesting supported: zscore(rank(x)).
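
To make "stateful" concrete, here is a minimal sketch of a transform that learns its mean and standard deviation once and reuses them on new data. The class name `ZScore`, its `divisor` argument, and the use of the sample standard deviation (n - 1) are assumptions for illustration; bossanova's internals may differ.

```python
from statistics import mean, stdev


class ZScore:
    """Illustrative stateful transform; not bossanova's implementation."""

    def __init__(self, divisor=1.0):
        # divisor=1.0 mimics zscore(x); divisor=2.0 mimics Gelman-style scale(x).
        self.divisor = divisor
        self.mean_ = None
        self.sd_ = None

    def fit(self, values):
        # Parameters are learned once, from the training data only.
        self.mean_ = mean(values)
        self.sd_ = stdev(values)  # assumption: sample SD (n - 1)
        return self

    def apply(self, values):
        # New data is transformed with the stored training parameters.
        return [(v - self.mean_) / (self.divisor * self.sd_) for v in values]


train = [1.0, 2.0, 3.0]
z = ZScore().fit(train)
gelman = ZScore(divisor=2.0).fit(train)
print(z.apply([4.0]))       # [2.0]
print(gelman.apply([4.0]))  # [1.0]
```

The new value 4.0 is standardized against the training mean (2.0) and SD (1.0), not against statistics recomputed from the new data.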

Contrast coding (in-formula)

| Function | Intercept = | Coefficients = |
|---|---|---|
| `treatment(x)` | Reference level mean | Difference from reference |
| `sum(x)` | Grand mean | Deviation from grand mean |
| `helmert(x)` | Grand mean | Level vs. mean of prior levels |
| `sequential(x)` | First level mean | Successive differences |

Set the reference level, the omitted level, or the level order with extra arguments: treatment(group, ref=B), sum(group, omit=A), helmert(group, [low, med, high]).
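
The interpretations in the contrast table can be reproduced from level means alone. The sketch below does this for a one-factor, balanced design; the function names are hypothetical, and the Helmert coefficients are left unscaled (implementations differ in how they scale them).

```python
def treatment_coding(means, ref):
    """Intercept = reference mean; coefficients = difference from reference."""
    intercept = means[ref]
    return intercept, {k: m - intercept for k, m in means.items() if k != ref}


def sum_coding(means, omit):
    """Intercept = grand mean; coefficients = deviation from grand mean."""
    grand = sum(means.values()) / len(means)
    return grand, {k: m - grand for k, m in means.items() if k != omit}


def helmert_coding(means):
    """Intercept = grand mean; each level vs. the mean of the prior levels."""
    levels = list(means)
    grand = sum(means.values()) / len(means)
    coefs = {}
    for i in range(1, len(levels)):
        prior = [means[k] for k in levels[:i]]
        coefs[levels[i]] = means[levels[i]] - sum(prior) / len(prior)
    return grand, coefs


def sequential_coding(means):
    """Intercept = first level mean; coefficients = successive differences."""
    levels = list(means)
    coefs = {f"{b}-{a}": means[b] - means[a] for a, b in zip(levels, levels[1:])}
    return means[levels[0]], coefs


# Made-up level means for illustration.
level_means = {"A": 1.0, "B": 3.0, "C": 8.0}
print(treatment_coding(level_means, ref="A"))  # (1.0, {'B': 2.0, 'C': 7.0})
print(sum_coding(level_means, omit="C"))       # (4.0, {'A': -3.0, 'B': -1.0})
print(helmert_coding(level_means))             # (4.0, {'B': 2.0, 'C': 6.0})
print(sequential_coding(level_means))          # (1.0, {'B-A': 2.0, 'C-B': 5.0})
```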


Initialize

from bossanova import model
m = model("y ~ x1 + x2", data)

Constructor parameters

| Parameter | Type | Default | Notes |
|---|---|---|---|
| `formula` | `str` | required | R-style formula |
| `data` | `DataFrame \| str \| None` | `None` | Polars DataFrame, file path, or `None` for simulation |
| `family` | `str` | `"gaussian"` | `"gaussian"`, `"binomial"`, `"poisson"`, `"gamma"`, `"tdist"` |
| `link` | `str \| None` | `None` | `None` = canonical link for family |
| `method` | `str \| None` | `None` | `None` = auto (`"ols"` for LM, `"reml"` for LMER, `"ml"` for GLM/GLMER) |
| `missing` | `str` | `"drop"` | `"drop"` or `"fail"` |
| `contrasts` | `dict \| None` | `None` | `{col: ndarray}`; prefer in-formula syntax instead |
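
The automatic `method` choice follows from the formula and the family. The sketch below mirrors the rule stated in the table. `resolve_method` is a hypothetical helper, and detecting random effects by looking for `|` in the formula is a simplification, not how bossanova necessarily does it.

```python
def resolve_method(formula, family="gaussian", method=None):
    """Pick an estimation method the way the table above describes.

    Hypothetical helper for illustration only.
    """
    if method is not None:
        return method  # an explicit choice always wins
    has_random_effects = "|" in formula  # simplification: (1 | g) style terms
    if family != "gaussian":
        return "ml"  # GLM and GLMER
    return "reml" if has_random_effects else "ols"  # LMER vs. LM


print(resolve_method("y ~ x1 + x2"))                          # ols
print(resolve_method("y ~ x + (1 | group)"))                  # reml
print(resolve_method("y ~ x + (1 | group)", "binomial"))      # ml
print(resolve_method("y ~ x + (1 | group)", method="ml"))     # ml
```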

Fit

Estimate model parameters given some data

API: .fit()

m = model("y ~ x", data).fit()

Key parameters

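| Parameter | Type | Default | Notes |
|---|---|---|---|
| `weights` | `str \| None` | `None` | Column name for observation weights |
| `offset` | `str \| None` | `None` | Column name for offset term |
| `nAGQ` | `int \| None` | `None` | GLMER only: number of Gauss-Hermite quadrature points (0=fast, 1=Laplace, >1=adaptive) |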

What .fit() populates

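| Property | LM | GLM | LMER | GLMER | Description |
|---|---|---|---|---|---|
| `params` | ✓ | ✓ | ✓ | ✓ | Coefficient estimates |
| `diagnostics` | ✓ | ✓ | ✓ | ✓ | AIC, BIC, loglik, R², deviance, etc. |
| `designmat` | ✓ | ✓ | ✓ | ✓ | Fixed-effects design matrix |
| `metadata` | ✓ | ✓ | ✓ | ✓ | nobs, nparams, ngroups |
| `varying_offsets` |  |  | ✓ | ✓ | BLUPs (random effect offsets) |
| `varying_params` |  |  | ✓ | ✓ | Population + BLUP coefficients |
| `varying_spread` |  |  | ✓ | ✓ | Variance components (σ², τ², ICC) |
| `varying_corr` |  |  | ✓ | ✓ | Random effect correlations |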

Augmented data columns (accessible via m.data): fitted, resid, hat, std_resid, cooksd.
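
For intuition about the augmented columns, the sketch below computes them by hand for a one-predictor least-squares fit using the textbook formulas. It assumes `std_resid` means internally studentized residuals and that `cooksd` is Cook's distance with p = 2 parameters; bossanova's exact definitions may differ, and the data are made up.

```python
from math import sqrt


def simple_ols_diagnostics(x, y):
    """Textbook diagnostics for y = b0 + b1 * x (illustrative, not bossanova)."""
    n, p = len(x), 2  # p counts intercept + slope
    xbar, ybar = sum(x) / n, sum(y) / n
    sxx = sum((xi - xbar) ** 2 for xi in x)
    slope = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y)) / sxx
    intercept = ybar - slope * xbar

    fitted = [intercept + slope * xi for xi in x]
    resid = [yi - fi for yi, fi in zip(y, fitted)]
    # Leverage: diagonal of the hat matrix for simple regression.
    hat = [1 / n + (xi - xbar) ** 2 / sxx for xi in x]
    sigma = sqrt(sum(e ** 2 for e in resid) / (n - p))
    # Internally studentized residuals.
    std_resid = [e / (sigma * sqrt(1 - h)) for e, h in zip(resid, hat)]
    # Cook's distance from studentized residuals and leverage.
    cooksd = [r ** 2 * h / (p * (1 - h)) for r, h in zip(std_resid, hat)]
    return {"fitted": fitted, "resid": resid, "hat": hat,
            "std_resid": std_resid, "cooksd": cooksd}


diag = simple_ols_diagnostics([1, 2, 3, 4, 5], [2.1, 3.9, 6.2, 7.8, 10.1])
for name, column in diag.items():
    print(name, [round(v, 3) for v in column])
```

Two quick sanity checks apply to any such fit: the leverages sum to the number of parameters, and the residuals sum to zero.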


Explore

Use a fitted model to estimate marginal effects

API: .explore()

m.fit().explore("pairwise(treatment)")
m.fit().explore("treatment")            # estimated marginal means
m.fit().explore("x1")                   # marginal slopes

Explore formulas


The focal variable (left of ~) is the variable whose effect is estimated. Conditions (right of ~) control the reference grid.

| Formula | Computes |
|---|---|
| `"x"` | EMMs for categorical x, marginal slope for continuous x |
| `"pairwise(x)"` | All pairwise contrasts |
| `"sequential(x)"` | Successive differences (B-A, C-B, ...) |
| `"x ~ z"` | Cross focal x with levels of z |
| `"x ~ z@[v1, v2]"` | Cross focal x with specific z values |
| `"x ~ z@range(5)"` | Cross focal x with 5 evenly-spaced z values |
| `"pairwise(x) ~ z"` | Pairwise contrasts at each level of z |
| `"x[A - B]"` | Contrast: A minus B |
| `"x[* - B]"` | Each other level vs B |

Key parameters

| Parameter | Type | Default | Notes |
|---|---|---|---|
| `formula` | `str` | required | Focal variable ~ conditions (see table above) |
| `effect_scale` | `str` | `"link"` | `"link"` or `"response"` (GLMs) |
| `how` | `str` | `"auto"` | `"auto"`, `"mem"` (at-mean), or `"ame"` (g-computation) |
| `varying` | `str` | `"exclude"` | `"exclude"` or `"include"` random effects (mixed) |
| `by` | `str \| None` | `None` | Subgroup analysis column |
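
The difference between at-mean and averaged estimates only shows up on the response scale of a nonlinear model. The sketch below uses a logistic model with made-up coefficients to compare the two; it illustrates the general idea behind `"mem"` and `"ame"`, not bossanova's exact computation.

```python
from math import exp


def inv_logit(eta):
    """Inverse of the logit link: maps the linear predictor to a probability."""
    return 1.0 / (1.0 + exp(-eta))


# Made-up coefficients and covariate values for illustration.
b0, b1 = 0.0, 1.0
x = [0.0, 0.0, 3.0]

# At-mean: plug the covariate mean into the model, then back-transform.
x_mean = sum(x) / len(x)
link_at_mean = b0 + b1 * x_mean
response_at_mean = inv_logit(link_at_mean)

# Averaged (g-computation style): back-transform each row, then average.
response_averaged = sum(inv_logit(b0 + b1 * xi) for xi in x) / len(x)

print(f"link scale, at mean:      {link_at_mean:.4f}")
print(f"response scale, at mean:  {response_at_mean:.4f}")
print(f"response scale, averaged: {response_averaged:.4f}")
```

On the link scale the two approaches agree because the predictor is linear; on the response scale they differ because the inverse link is curved.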

What .explore() populates

| Property | Description |
|---|---|
| `effects` | Grid columns + estimate (+ SE, CI, p after `.infer()`) |
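
The grid columns come from crossing the focal variable with the condition values, as in `"x ~ z@range(5)"`. The sketch below builds such a grid in plain Python. Spacing the five values between the observed minimum and maximum of z is an assumption, and both helper names are hypothetical.

```python
from itertools import product


def evenly_spaced(values, n):
    """Return n evenly spaced points from min(values) to max(values)."""
    lo, hi = min(values), max(values)
    if n == 1:
        return [(lo + hi) / 2]
    step = (hi - lo) / (n - 1)
    return [lo + i * step for i in range(n)]


def reference_grid(focal_levels, condition_values):
    """Cross every focal level with every condition value."""
    return [{"x": f, "z": z} for f, z in product(focal_levels, condition_values)]


z_observed = [0.0, 10.0, 4.0]  # made-up data
z_grid = evenly_spaced(z_observed, 5)
grid = reference_grid(["A", "B", "C"], z_grid)
print(z_grid)     # [0.0, 2.5, 5.0, 7.5, 10.0]
print(len(grid))  # 15
```

Each of the 15 rows would receive one estimate, so three focal levels crossed with five z values yields a 15-row effects table.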

Predict

Use a fitted model to make predictions

API: .predict()

m.fit().predict()                            # in-sample
m.fit().predict(newdata=new_df)              # out-of-sample
m.fit().predict(type="link")                 # linear predictor (GLMs)

Key parameters

| Parameter | Type | Default | Notes |
|---|---|---|---|
| `newdata` | `DataFrame \| None` | `None` | `None` = in-sample predictions |
| `type` | `str` | `"response"` | `"response"` or `"link"` |
| `varying` | `str` | `"exclude"` | `"include"` to use BLUPs in mixed models |
| `allow_new_levels` | `bool` | `False` | Allow unseen group levels (mixed models) |
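
The interaction between `varying` and `allow_new_levels` can be sketched for a random-intercept model. The code below assumes that an unseen group, when allowed, falls back to the population-level prediction (offset of zero), which is a common convention but not confirmed for bossanova. All names and numbers are hypothetical.

```python
def predict_group(x, group, fixed, offsets,
                  varying="exclude", allow_new_levels=False):
    """Linear-predictor prediction for a random-intercept model (illustrative)."""
    eta = fixed["intercept"] + fixed["x"] * x  # population-level part
    if varying == "include":
        if group in offsets:
            eta += offsets[group]  # add the group's BLUP offset
        elif not allow_new_levels:
            raise ValueError(f"unseen group level: {group!r}")
        # Assumption: an allowed unseen level gets offset 0 (population-level).
    return eta


fixed = {"intercept": 1.0, "x": 2.0}  # made-up fixed effects
offsets = {"s1": 0.5, "s2": -0.5}     # made-up BLUPs per subject

print(predict_group(1.0, "s1", fixed, offsets))                     # 3.0
print(predict_group(1.0, "s1", fixed, offsets, varying="include"))  # 3.5
print(predict_group(1.0, "s9", fixed, offsets, varying="include",
                    allow_new_levels=True))                         # 3.0
```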

What .predict() populates

| Property | Description |
|---|---|
| `predictions` | fitted (+ link for GLMs; + SE, CI after `.infer()`) |

Infer

Compute uncertainty for parameter estimates, marginal effects, and model predictions

API: .infer()

m.fit().infer()                                      # asymptotic (default)
m.fit().infer(how="boot", n_boot=5000)               # bootstrap
m.fit().infer(errors="HC3")                           # robust SEs
m.fit().infer(how="cv", k=10)                         # cross-validation
m.fit().explore("pairwise(x)").infer()                # infer on effects
m.fit().infer(how="profile")                          # profile CIs for variance components
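
As background for `how="boot"`, the sketch below computes a percentile bootstrap interval for a sample mean with a fixed seed. It is a generic illustration in plain Python: `bootstrap_ci` is a hypothetical helper, and it uses the percentile method rather than BCa.

```python
import random
from statistics import mean


def bootstrap_ci(values, n_boot=1000, conf_level=0.95, seed=None):
    """Percentile bootstrap CI for the mean (illustrative, not bossanova)."""
    rng = random.Random(seed)  # a fixed seed makes the resamples reproducible
    n = len(values)
    # Resample with replacement and recompute the statistic each time.
    stats = sorted(mean(rng.choices(values, k=n)) for _ in range(n_boot))
    alpha = 1 - conf_level
    lower = stats[int((alpha / 2) * (n_boot - 1))]
    upper = stats[int((1 - alpha / 2) * (n_boot - 1))]
    return lower, upper


sample = [2.1, 3.9, 6.2, 7.8, 10.1, 4.4, 5.0, 8.3]  # made-up data
print(bootstrap_ci(sample, n_boot=1000, seed=42))
```

Calling the function twice with the same seed returns the same interval, which is the role `seed` plays for resampling-based inference.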

how × operation compatibility

Columns (values of how): asymp, boot, perm, cv, profile, joint
Rows (operation → result):
.fit() → params
.explore() → effects
.predict() → predictions
.fit() → varying_spread

Key parameters

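| Parameter | Type | Default | Notes |
|---|---|---|---|
| `how` | `str` | `"asymp"` | Inference method (see matrix above) |
| `conf_level` | `float \| int \| str` | `0.95` | Accepts `0.95`, `95`, or `"95%"` |
| `errors` | `str` | `"iid"` | `"iid"`, `"HC0"` through `"HC3"`, `"hetero"`, `"unequal_var"` |
| `null` | `float` | `0.0` | Null hypothesis value |
| `alternative` | `str` | `"two-sided"` | `"two-sided"`, `"greater"`, `"less"` |
| `n_boot` | `int` | `1000` | Bootstrap resamples |
| `n_perm` | `int` | `1000` | Permutation resamples |
| `ci_type` | `str` | `"bca"` | `"bca"` or `"percentile"` |
| `k` | `int` | `10` | CV folds |
| `seed` | `int \| None` | `None` | Reproducibility |
| `n_jobs` | `int` | `1` | Parallel workers for resampling |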
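
The three accepted spellings of `conf_level` all describe the same proportion. One way such normalization could work is sketched below; `parse_conf_level` is a hypothetical helper, not a bossanova function.

```python
def parse_conf_level(value):
    """Normalize 0.95, 95, or "95%" to a proportion in (0, 1). Illustrative only."""
    if isinstance(value, str):
        value = value.strip().rstrip("%")  # "95%" -> "95"
    level = float(value)
    if level > 1:
        level /= 100  # treat values above 1 as percentages
    if not 0 < level < 1:
        raise ValueError(f"confidence level out of range: {value!r}")
    return level


for raw in (0.95, 95, "95%"):
    print(repr(raw), "->", parse_conf_level(raw))
```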

Summary & export

Print a nicely formatted R-style summary

API: .summary() · .show_math()

| Method | Returns | Notes |
|---|---|---|
| `.summary()` | FormattedText | R-style summary; `style="compact"` for minimal output |
| `.summary(digits=5)` | FormattedText | Control decimal places |
| `.show_math()` | MathDisplay | LaTeX equation (renders in notebooks) |
| `.to_odds_ratio()` | DataFrame | Exponentiated estimates (binomial only) |
| `.to_response_scale()` | DataFrame | Inverse-link on effects |
| `.to_effect_size()` | DataFrame | Cohen’s d, η², r (semi-partial) |
| `.vif()` | DataFrame | Variance inflation factors |
| `.filter_params(terms)` | DataFrame | Subset params by term name(s) |
| `.filter_significant(alpha)` | DataFrame | Params where p < alpha |
| `.filter_effects(...)` | DataFrame | Subset effects by terms/levels/contrasts |
| `.jointtest()` | DataFrame | Type III F / χ² tests |
| `.to_markdown()` | str | Pipe-delimited markdown of last result |
| `.to_csv()` | str | CSV of last result |

Simulate

Simulate data from a model with or without fitting

API: .simulate()

from bossanova.distributions import normal, categorical, varying

m = model("y ~ x + group")
m.simulate(n=200, coef={"x": 0.5, "groupB": -1.0},
           x=normal(0, 1), group=categorical(["A", "B"]))
m.fit().infer()
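
To show what generating data from coefficients involves, the sketch below draws predictors and then builds the response from the linear predictor plus noise. It is plain Python, not bossanova's simulator: the zero intercept, unit noise SD, and the name `simulate_rows` are assumptions.

```python
import random


def simulate_rows(n, coef, sigma=1.0, seed=None):
    """Generate rows for y ~ x + group from given coefficients (illustrative)."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        x = rng.gauss(0, 1)             # x ~ normal(0, 1)
        group = rng.choice(["A", "B"])  # group ~ categorical(["A", "B"])
        # Assumption: intercept of 0; "groupB" is the dummy for level B.
        mu = coef["x"] * x + coef["groupB"] * (group == "B")
        rows.append({"x": x, "group": group, "y": mu + rng.gauss(0, sigma)})
    return rows


rows = simulate_rows(200, {"x": 0.5, "groupB": -1.0}, seed=0)
print(len(rows), rows[0])
```

Fitting the same formula to data generated this way should recover coefficients near 0.5 and -1.0, which is the basis of simulation-based power analysis.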

Plotting

API: plot methods

All plot methods return a matplotlib Figure or seaborn FacetGrid.

| Method | Shows | Requires |
|---|---|---|
| `plot_design()` | Design matrix heatmap | data |
| `plot_vif()` | VIF bar chart | data |
| `plot_relationships()` | Predictor scatterplot matrix | data |
| `plot_params()` | Coefficient forest plot | `.fit()` |
| `plot_resid()` | Residual diagnostics (4-panel) | `.fit()` |
| `plot_predict()` | Prediction curves with CI bands | `.fit()` + `term=` |
| `plot_explore()` | EMM / contrast plot | `.explore()` + `specs=` |
| `plot_resamples()` | Bootstrap/permutation distributions | `.infer(how="boot"\|"perm")` |
| `plot_ranef()` | Random effects caterpillar plot | `.fit()` + mixed model |
| `plot_profile()` | Profile likelihood curves | `.infer(how="profile")` |

Model comparison

Compare nested and non-nested models using a variety of approaches

API: compare()

from bossanova import compare

compare(m1, m2, m3)                    # Auto: F (LM), deviance (GLM), LRT (mixed)
compare(m1, m2, method="lrt")          # Explicit likelihood ratio test
compare(m1, m2, method="aic")          # AIC comparison with delta-AIC and weights
compare(m1, m2, method="bic")          # BIC comparison with delta-BIC and weights
compare(m1, m2, method="cv", cv=10)    # Cross-validation comparison
compare(m1, m2, method="lrt", refit=True)  # Refit REML → ML for valid LRT
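
For reference, delta-AIC and Akaike weights follow a standard formula: each model's weight is proportional to exp(-delta/2). The sketch below computes them for made-up AIC values. Whether `compare` reports weights in exactly this form is an assumption.

```python
from math import exp


def akaike_weights(aics):
    """Return (deltas, weights) using the standard Akaike weight formula."""
    best = min(aics)
    deltas = [a - best for a in aics]               # delta-AIC relative to the best model
    relative = [exp(-0.5 * d) for d in deltas]      # relative likelihoods
    total = sum(relative)
    return deltas, [r / total for r in relative]    # normalize to sum to 1


deltas, weights = akaike_weights([100.0, 102.0, 110.0])  # made-up AIC values
print(deltas)                           # [0.0, 2.0, 10.0]
print([round(w, 3) for w in weights])
```

The same calculation applies to BIC by passing BIC values instead.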

Properties quick lookup

| Property | Returns | Requires |
|---|---|---|
| `params` | Coefficients (+ SE, CI, p after `.infer()`) | `.fit()` |
| `diagnostics` | AIC, BIC, loglik, R², deviance, etc. | `.fit()` |
| `effects` | EMMs / marginal slopes (+ SE, CI, p after `.infer()`) | `.explore()` |
| `predictions` | Fitted values (+ SE, CI after `.infer()`) | `.predict()` |
| `simulations` | Generated data or simulated responses | `.simulate()` |
| `resamples` | Raw bootstrap/permutation values | `.infer(how="boot"\|"perm")` |
| `power_results` | Power, coverage, bias, RMSE | `.simulate(power=...)` |
| `varying_offsets` | BLUPs per group | `.fit()` + mixed |
| `varying_params` | Population + BLUP coefficients | `.fit()` + mixed |
| `varying_spread` | Variance components (σ², τ², ICC) | `.fit()` + mixed |
| `varying_corr` | Random effect correlations | `.fit()` + mixed |
| `designmat` | Design matrix | data |
| `metadata` | nobs, nparams, ngroups | data |

Utilities

| Function | Purpose |
|---|---|
| `compare(*models)` | Model comparison: auto-selects F / deviance / LRT by type; or `method="aic"`, `"bic"`, `"cv"` |
| `load_dataset(name)` | Load a built-in dataset |
| `show_datasets()` | List all available datasets |
| `set_backend("jax")` | Switch to JAX backend (default: `"numpy"`) |
| `get_backend()` | Current backend name |
| `set_display_digits(n)` | Set decimal places for display |
| `to_markdown(df)` | Export any DataFrame as markdown |

Common patterns

# Full workflow — one chain
m = model("y ~ x * group", data).fit().infer()
m.params       # estimates with CIs and p-values
m.summary()    # R-style summary table

# Explore + infer
m.explore("pairwise(group)").infer()
m.effects      # pairwise contrasts with CIs

# Robust standard errors
m.fit().infer(errors="HC3")

# Bootstrap with parallel workers
m.fit().infer(how="boot", n_boot=5000, n_jobs=4)

# Mixed model with profile CIs on variance components
m = model("y ~ x + (1 | subj)", data).fit().infer(how="profile")
m.varying_spread   # σ², τ² with profile CIs