model - bossanova

Unified statistical model with simulation-first support.

Provides a single interface for lm, glm, lmer, and glmer. Model type is inferred from formula structure and family parameter.

Parameters:

Name	Type	Description	Default
`formula`	`str`	R/Patsy-style formula (e.g., `"y ~ x"`, `"y ~ x + (1 \| group)"`).	required
`data`	`DataFrame \| str \| None`	Input data. Accepts a Polars DataFrame, a file path (str to CSV/TSV/Parquet/JSON/NDJSON), or None for simulation-first workflows.	`None`
`family`	`str`	Response distribution: `"gaussian"` (default), `"binomial"`, `"poisson"`, `"gamma"`, or `"tdist"`.	`'gaussian'`
`link`	`str \| None`	Link function (`"identity"`, `"logit"`, `"log"`, etc.). None uses the canonical link for the family.	`None`
`method`	`str \| None`	Estimation method: `"ols"`, `"ml"`, or `"reml"`. None auto-selects.	`None`
`missing`	`str`	Missing value handling: `"drop"` (default) or `"fail"`.	`'drop'`
`contrasts`	`dict \| None`	Custom contrast matrices for categorical predictors. Dict mapping column names to ndarray matrices of shape `(n_levels, n_levels - 1)`. For named contrasts, use formula syntax instead: `sum(x)`, `treatment(x, ref=B)`, etc.	`None`

Notes: Formula Operators

Operator	Example	Meaning
`+`	`a + b`	Main effects (additive)
`*`	`a * b`	Main effects + interaction (`a + b + a:b`)
`:`	`a:b`	Interaction term only
`**` / `^`	`x**2`	Power term (quadratic, cubic, etc.)
`\|`	`(1 \| group)`	Random effect (mixed models)

In-Formula Transforms

Transform	Effect	Example
`center(x)`	`x - mean(x)`	`y ~ center(age)`
`norm(x)`	`x / sd(x)`	`y ~ norm(income)`
`zscore(x)`	`(x - mean(x)) / sd(x)`	`y ~ zscore(x)`
`scale(x)`	`(x - mean(x)) / (2 * sd(x))` — Gelman scaling	`y ~ scale(x)`
`rank(x)`	Average-method rank	`y ~ rank(x)`
`signed_rank(x)`	`sign(x) * rank(\|x\|)`	`y ~ signed_rank(x)`
`log(x)`	Natural log	`y ~ log(x)`
`log10(x)`	Base-10 log	`y ~ log10(x)`
`sqrt(x)`	Square root	`y ~ sqrt(x)`
`poly(x, d)`	Orthogonal polynomial basis of degree `d`	`y ~ poly(x, 2)`
`factor(x)`	Treat as categorical	`y ~ factor(year)`

All stateful transforms (center, norm, zscore, scale, rank, signed_rank) learn parameters from training data and reapply them on new data during .predict(). Nesting is supported: zscore(rank(x)).
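
The "learn on training data, reapply on new data" behaviour can be sketched in plain Python. This is a minimal illustration, not bossanova's implementation: the `ZScore` class and its `fit`/`transform` methods are hypothetical names, and the use of the sample SD (n - 1 denominator) is an assumption.

```python
from statistics import mean, stdev


class ZScore:
    """Hypothetical stateful transform: learns mean/SD once, reuses them later."""

    def fit(self, values):
        self.mean_ = mean(values)
        self.sd_ = stdev(values)  # sample SD (n - 1 denominator); an assumption
        return self

    def transform(self, values):
        return [(v - self.mean_) / self.sd_ for v in values]


train = [10.0, 20.0, 30.0]
z = ZScore().fit(train)
train_z = z.transform(train)        # standardized with the training mean/SD
new_z = z.transform([20.0, 40.0])   # new data reuses the *training* mean/SD
```

The key point is that `new_z` is computed with the mean and SD of `train`, not of the new values. Dividing by `2 * sd_` instead would correspond to the Gelman-style `scale(x)` row in the table above.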

Contrast Coding (In-Formula)

Function	Example	Intercept means	Coefficients mean
`treatment(x)`	`y ~ treatment(group)`	Reference level mean	Difference from reference
`sum(x)`	`y ~ sum(group)`	Grand mean	Deviation from grand mean
`helmert(x)`	`y ~ helmert(group)`	Grand mean	Level vs. mean of previous levels
`poly(x)`	`y ~ poly(group)`	Grand mean	Linear, quadratic, cubic trends
`sequential(x)`	`y ~ sequential(group)`	First level mean	Successive differences (B-A, C-B)

Set a custom reference level, omitted level, or level ordering with extra arguments: treatment(group, ref=B), sum(group, omit=A), helmert(group, [low, med, high]). ::

model("y ~ sum(group) + x", data)                    # sum coding
model("y ~ treatment(group, ref=B) + x", data)       # treatment with ref="B"
model("y ~ group + x", data, contrasts={"group": M}) # custom ndarray matrix
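
To make the `(n_levels, n_levels - 1)` shape of a custom contrast matrix concrete, here is a small sketch that builds treatment and sum coding by hand. The helper names are hypothetical, and the default omitted level for sum coding (the last one) is an assumption borrowed from common R practice; the result would still need converting to an ndarray (for example with `numpy.asarray`) before being passed through `contrasts=`.

```python
def treatment_contrasts(n_levels, ref=0):
    """Dummy coding: rows are levels, columns are the non-reference levels."""
    cols = [j for j in range(n_levels) if j != ref]
    return [[1.0 if i == j else 0.0 for j in cols] for i in range(n_levels)]


def sum_contrasts(n_levels, omit=None):
    """Deviation coding: the omitted level's row is all -1."""
    omit = n_levels - 1 if omit is None else omit  # assumed default: last level
    cols = [j for j in range(n_levels) if j != omit]
    return [
        [-1.0] * len(cols) if i == omit else [1.0 if i == j else 0.0 for j in cols]
        for i in range(n_levels)
    ]


T = treatment_contrasts(3)        # reference level is row 0 (all zeros)
S = sum_contrasts(3)              # each column sums to zero
S_omit_first = sum_contrasts(3, omit=0)
```

Each matrix has three rows (levels) and two columns (coefficients), which is the shape the `contrasts` parameter describes.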

Attributes:

Name	Type	Description
`contrasts`	`dict \| None`	Custom contrast matrices mapping column names to ndarray matrices.
`data`	`DataFrame \| str \| None`	Input data as a Polars DataFrame, file path, or None for simulation.
`designmat`	`DataFrame`	Fixed-effects design matrix as a named DataFrame.
`diagnostics`	`DataFrame`	Model-level goodness-of-fit diagnostics.
`effects`	`DataFrame`	Marginal effects or estimated marginal means (EMMs) table.
`family`	`str`	Response distribution (`"gaussian"`, `"binomial"`, `"poisson"`, `"gamma"`, `"tdist"`).
`formula`	`str`	R-style model formula string (e.g., `"y ~ x + (1\|group)"`).
`link`	`str \| None`	Link function (e.g., `"identity"`, `"logit"`, `"log"`). None uses canonical link.
`metadata`	`DataFrame`	Model metadata: observation counts, parameter count, group counts.
`method`	`str \| None`	Estimation method (`"ols"`, `"ml"`, `"reml"`). None auto-selects.
`missing`	`str`	Missing value handling: `"drop"` (default) or `"fail"`.
`params`	`DataFrame`	Coefficient estimates table from the fitted model.
`power_results`	`DataFrame`	Power analysis results from `simulate(power=...)`.
`predictions`	`DataFrame`	Predictions table from the fitted model.
`resamples`	`DataFrame \| None`	Raw resampled values from bootstrap or permutation inference.
`simulations`	`DataFrame`	Simulations table from data generation or response simulation.
`varying_corr`	`DataFrame`	Random effect correlations for mixed models.
`varying_offsets`	`DataFrame`	Random effects (BLUPs) -- group-level deviations from population parameters.
`varying_params`	`DataFrame`	Group-specific coefficients (population params + varying offsets).
`varying_spread`	`DataFrame`	Variance components for mixed models in long form.

Methods:

Name	Description
`explore`	Compute marginal effects or estimated marginal means (EMMs).
`filter_effects`	Filter effects DataFrame by term, level, or contrast.
`filter_params`	Filter params to specific coefficient terms.
`filter_significant`	Filter params to statistically significant results.
`fit`	Fit the model to data.
`infer`	Augment current results with statistical inference.
`jointtest`	Compute ANOVA-style joint hypothesis tests for model terms.
`plot_design`	Plot design matrix as an annotated heatmap.
`plot_explore`	Plot marginal effects or estimated marginal means.
`plot_params`	Plot parameter estimates as a forest plot.
`plot_predict`	Plot marginal predictions across a predictor range.
`plot_profile`	Plot profile likelihood curves.
`plot_ranef`	Plot random effect estimates.
`plot_relationships`	Plot pairwise relationships between response and predictors.
`plot_resamples`	Plot distribution of resampled statistics.
`plot_resid`	Plot residual diagnostics (4-panel grid).
`plot_vif`	Plot VIF diagnostics as correlation heatmap.
`predict`	Generate predictions from the fitted model.
`reset_contrasts`	Reset contrast coding to defaults (treatment/dummy coding).
`set_contrasts`	Set custom contrast coding for categorical predictors.
`set_display`	Toggle automatic result display in the REPL and notebooks.
`show_math`	Display structural LaTeX equation with term explanations.
`simulate`	Generate data from scratch or simulate responses from a fitted model.
`summary`	Generate a formatted model summary (R-style or compact).
`to_effect_size`	Compute standardized effect sizes from params.
`to_odds_ratio`	Transform params to odds ratio scale (binomial GLM only).
`to_response_scale`	Transform effects from link scale to response scale.
`vif`	Compute variance inflation factors for model predictors.

Attributes

contrasts

contrasts: dict | None = field(default=None, repr=False)

Custom contrast matrices mapping column names to ndarray matrices.

data

data: pl.DataFrame | str | None = field(default=None, repr=False, converter=_coerce_data, validator=(validators.optional(validators.instance_of(pl.DataFrame))), on_setattr=(setters.frozen))

Input data as a Polars DataFrame, file path, or None for simulation.

designmat

designmat: pl.DataFrame

Fixed-effects design matrix as a named DataFrame.

Returns:

Type	Description
`DataFrame`	pl.DataFrame: n x p design matrix with columns named after model terms.

diagnostics

diagnostics: pl.DataFrame

Model-level goodness-of-fit diagnostics.

Returns a single-row DataFrame with metrics that vary by model type: df, AIC, BIC, loglik, R-squared (OLS), deviance/dispersion (GLM), sigma/ICC (mixed). After .infer(how='cv'), adds cv_error.

Returns:

Type	Description
`DataFrame`	pl.DataFrame: Single-row DataFrame with model diagnostics.

effects

effects: pl.DataFrame

Marginal effects or estimated marginal means (EMMs) table.

Returns:

Type	Description
`DataFrame`	pl.DataFrame: Grid columns (focal variable levels or contrast labels) plus estimate. After `.infer()`, adds se, statistic, df, p_value, ci_lower, ci_upper.

family

family: str = field(default='gaussian', validator=(validators.in_(FAMILIES)))

Response distribution ("gaussian", "binomial", "poisson", "gamma", "tdist").

formula

formula: str = field(on_setattr=(setters.frozen))

R-style model formula string (e.g., "y ~ x + (1|group)").

link

link: str | None = field(default=None)

Link function (e.g., "identity", "logit", "log"). None uses canonical link.

metadata

metadata: pl.DataFrame

Model metadata: observation counts, parameter count, group counts.

Returns a single-row DataFrame with structural info about the model and data. For mixed models, includes ngroups.

Returns:

Type	Description
`DataFrame`	pl.DataFrame: Single-row DataFrame with model metadata.

method

method: str | None = field(default=None)

Estimation method ("ols", "ml", "reml"). None auto-selects.

missing

missing: str = field(default='drop', validator=(validators.in_(['drop', 'fail'])))

Missing value handling: "drop" (default) or "fail".

params

params: pl.DataFrame

Coefficient estimates table from the fitted model.

Returns:

Type	Description
`DataFrame`	pl.DataFrame: Columns: term, estimate. After `.infer()`, adds se, statistic, df, p_value, ci_lower, ci_upper.

power_results

power_results: pl.DataFrame

Power analysis results from simulate(power=...).

Returns:

Type	Description
`DataFrame`	DataFrame with columns: n, sigma, term, true_value, power, power_ci_lower, power_ci_upper, coverage, bias, rmse, mean_se, empirical_se, n_sims, n_failed.

predictions

predictions: pl.DataFrame

Predictions table from the fitted model.

Returns:

Type	Description
`DataFrame`	pl.DataFrame: Columns: fitted (and link for GLMs). After `.infer()`, adds se, ci_lower, ci_upper.

resamples

resamples: pl.DataFrame | None

Raw resampled values from bootstrap or permutation inference.

Returns a long-format DataFrame with one row per (resample, term) combination:

resample (int): Resample index (0 to n_resamples - 1).
term (str): Parameter or effect name.
value (float): Resampled estimate (bootstrap) or null test statistic (permutation).

Total rows = n_resamples × n_terms.

Returns None if .infer() hasn’t been called with how="boot"|"perm", or if save_resamples=False was passed.

Returns:

Type	Description
`DataFrame \| None`	pl.DataFrame
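
As a rough picture of the long format described above, the following sketch builds one row per (resample, term) pair and reads a percentile interval off it. It is only an illustration: the `value` here is a bootstrap mean of toy data standing in for a resampled coefficient, and `percentile_ci` is a hypothetical helper, not part of the library.

```python
import random
from statistics import mean

rng = random.Random(0)
data = {"x": [1.2, 0.8, 1.9, 1.1, 1.5], "z": [0.3, -0.1, 0.4, 0.0, 0.2]}
n_resamples = 200

# Long format: one row per (resample, term) combination.
rows = []
for r in range(n_resamples):
    idx = [rng.randrange(5) for _ in range(5)]  # resample rows with replacement
    for term, values in data.items():
        rows.append(
            {"resample": r, "term": term, "value": mean(values[i] for i in idx)}
        )


def percentile_ci(rows, term, conf_level=0.95):
    """Percentile interval from the long-format rows of a single term."""
    vals = sorted(row["value"] for row in rows if row["term"] == term)
    lo = int((1 - conf_level) / 2 * (len(vals) - 1))
    hi = int((1 + conf_level) / 2 * (len(vals) - 1))
    return vals[lo], vals[hi]


ci_x = percentile_ci(rows, "x")
```

The row count equals `n_resamples * n_terms`, matching the total stated above.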

simulations

simulations: pl.DataFrame

Simulations table from data generation or response simulation.

Pre-fit mode (n=): full generated dataset with predictors and response. Post-fit mode (nsim=): sim_1, sim_2, ... columns with simulated responses.

Returns:

Type	Description
`DataFrame`	pl.DataFrame: Simulated data (contents vary by mode).

varying_corr

varying_corr: pl.DataFrame

Random effect correlations for mixed models.

Returns a tidy DataFrame of pairwise correlations between random effects. Empty (zero rows) for models with only random intercepts or uncorrelated (diagonal) RE structures.

Returns:

Type	Description
`DataFrame`	pl.DataFrame: Columns: group, effect1, effect2, corr.

varying_offsets

varying_offsets: pl.DataFrame

Random effects (BLUPs) -- group-level deviations from population parameters.

Returns:

Type	Description
`DataFrame`	pl.DataFrame: Columns: group, level, plus one column per random effect. After `.infer()`, adds pi_lower/pi_upper prediction intervals.

varying_params

varying_params: pl.DataFrame

Group-specific coefficients (population params + varying offsets).

Returns:

Type	Description
`DataFrame`	pl.DataFrame: Columns: group, level, plus one column per random effect with the conditional coefficient (population + BLUP).
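
The relationship "conditional coefficient = population parameter + BLUP" amounts to an element-wise sum. The sketch below uses made-up numbers and plain dictionaries rather than the DataFrames the attributes actually return; the term and level names are hypothetical.

```python
population = {"Intercept": 2.0, "x": 0.5}

# Hypothetical BLUPs: per-group deviations from the population values.
offsets = {
    "g1": {"Intercept": 0.3, "x": -0.1},
    "g2": {"Intercept": -0.3, "x": 0.1},
}

# Group-specific coefficients: population value plus that group's offset.
varying_params = {
    level: {term: population[term] + dev for term, dev in devs.items()}
    for level, devs in offsets.items()
}
```

Here group `g1` ends up with an intercept near 2.3 and a slope near 0.4, while `g2` mirrors it on the other side of the population values.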

varying_spread

varying_spread: pl.DataFrame

Variance components for mixed models in long form.

Each variance term (residual, random effects) gets both a variance and std_dev row. Correlations get a corr row and ICC gets an icc row.

Returns:

Type	Description
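`DataFrame`	pl.DataFrame: Columns: `term`, `metric`, `estimate`. After `.infer(how="profile")`, or `.infer(how="boot")` / `.infer(how="perm")` with `profile=True` on a mixed model, adds profile-likelihood confidence intervals.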
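
For a random-intercept model, the long-form layout and the ICC can be sketched as follows. The variances are invented for illustration, the `term` labels are hypothetical (the library's actual labels may differ), and the ICC formula shown is the standard one for a single grouping factor.

```python
import math

var_group, var_resid = 4.0, 12.0  # illustrative variances, not real output

rows = []
for term, var in [("group:Intercept", var_group), ("Residual", var_resid)]:
    # Each variance term contributes a variance row and a std_dev row.
    rows.append({"term": term, "metric": "variance", "estimate": var})
    rows.append({"term": term, "metric": "std_dev", "estimate": math.sqrt(var)})

# ICC: share of total variance attributable to the grouping factor.
icc = var_group / (var_group + var_resid)
rows.append({"term": "group", "metric": "icc", "estimate": icc})
```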

Methods

explore

explore(formula: str, *, inverse_transforms: bool = True, effect_scale: Literal['auto', 'link', 'response'] = 'auto', varying: Literal['exclude', 'include'] = 'exclude', how: Literal['auto', 'mem', 'ame'] = 'auto', by: str | None = None) -> ModelResult

Compute marginal effects or estimated marginal means (EMMs).

Parameters:

Name	Type	Description	Default
`formula`	`str`	Explore formula specifying the focal variable (left of `~`) and optional conditions (right of `~`). Grammar: `lhs [ '~' rhs ]`, where `lhs := focal_term \| contrast_fn '(' focal_term ')'`.	required
`inverse_transforms`	`bool`	When True (default), raw variable names and values in the explore formula are auto-resolved through learned formula transforms (e.g. `center`, `zscore`, `scale`). Set to False to use transformed-scale names/values directly.	`True`
`effect_scale`	`Literal['auto', 'link', 'response']`	`"auto"` (default), `"link"`, or `"response"` (inverse-link / data scale). `"auto"` resolves to `"response"` when `how="ame"` with a non-identity link (GLMs), `"link"` otherwise.	`'auto'`
`varying`	`Literal['exclude', 'include']`	`"exclude"` (default) or `"include"` random effects.	`'exclude'`
`how`	`Literal['auto', 'mem', 'ame']`	`"auto"` (default), `"mem"` for emmeans-style balanced reference grid (Marginal Estimated Mean), or `"ame"` for g-computation / Average Marginal Effect. `"auto"` selects `"ame"` for GLMs (non-identity link), `"mem"` for linear models.	`'auto'`
`by`	`str \| None`	Column name for subgroup analysis. Computes separate effects within each level of this variable.	`None`

Returns:

Name	Type	Description
`self`	`ModelResult`	For method chaining. Results in `.effects`.

Examples:

EMMs and contrasts::

m = model("y ~ treatment", data).fit()
m.explore("treatment").infer()
m.explore("pairwise(treatment)").infer()

With transforms (raw variable names resolve automatically)::

m = model("y ~ center(x) + group", data).fit()
m.explore("group ~ x@[50]")  # auto-centers 50

Notes: Terminology: The focal variable (LHS) is the variable whose effect is estimated. Conditions (RHS, after ~) control the reference grid — pin covariates at values or cross the focal with levels of another variable. The @ operator sets specific values; [A - B] brackets specify contrasts.
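
Bracket contrasts such as `[A - B]` or `[(A + B) - C]` can be read as weighted sums of estimated marginal means. The sketch below works that arithmetic out on invented EMMs; the `contrast` helper is hypothetical and only mirrors the meaning of the syntax, not how the library parses or computes it.

```python
emms = {"A": 10.0, "B": 12.0, "C": 17.0}  # illustrative EMMs per level


def contrast(weights, means):
    """Linear contrast: sum of weight * mean over the listed levels."""
    return sum(w * means[level] for level, w in weights.items())


a_minus_b = contrast({"A": 1, "B": -1}, emms)              # "X[A - B]"
ab_vs_c = contrast({"A": 0.5, "B": 0.5, "C": -1}, emms)    # "X[(A + B) - C]"
each_vs_b = {lvl: emms[lvl] - emms["B"] for lvl in emms if lvl != "B"}  # "X[* - B]"
```

In every case the weights sum to zero, which is what makes the result a difference between (averages of) levels rather than a mean.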

Formula Patterns

Formula	Computes
`"X"`	EMMs (categorical) or marginal slope (continuous)
`"X@v"`	EMM at a single value
`"X@[a, b, c]"`	EMMs at specific levels or values
`"X@range(n)"`	EMMs at n evenly-spaced values
`"X@quantile(n)"`	EMMs at n quantile values
`"X ~ Z"`	Cross focal X with levels of Z
`"X ~ Z@v"`	Pin condition Z at value v
`"X ~ Z@[v1, v2]"`	Cross focal X with specific Z values
`"X ~ Z@range(n)"`	Cross focal X with n evenly-spaced Z values
`"X ~ Z@quantile(n)"`	Cross focal X with n quantile Z values
`"X ~ A + B"`	Multiple conditions
`"X[A - B]"`	Contrast: A minus B
`"X[A - B, C - B]"`	Multiple contrasts
`"X[* - B]"`	Each other level vs B
`"X[(A + B) - C]"`	mean(A, B) minus C
`"X[A - B] ~ Z[C - D]"`	Interaction contrast: (A-B at C) minus (A-B at D)
`"x ~ Z[A - B]"`	Slope of x at A minus slope at B
`"X:Z[A:C - B:D]"`	Cell contrast: A:C minus cell B:D

Contrast Functions

Formula	Computes	Order-dependent
`treatment(X, ref=A)`	Each level vs reference A	No
`dummy(X, ref=A)`	Same as `treatment`	No
`sum(X)`	Each level vs grand mean	No
`deviation(X, ref=A)`	Same as `sum`	No
`poly(X)` / `poly(X, d)`	Orthogonal polynomials up to degree d	Yes
`sequential(X)`	Adjacent: B-A, C-B	Yes
`helmert(X)`	Each level vs mean of previous	Yes
`pairwise(X)`	All pairs: B-A, C-A, C-B	No
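
The "Order-dependent" column can be illustrated by computing a few of these contrasts by hand from level means. This is a sketch on invented EMMs; it reproduces the comparisons named in the table (pairwise, sequential, Helmert) without claiming to match the library's labels or scaling.

```python
from itertools import combinations
from statistics import mean

levels = ["A", "B", "C"]
emms = {"A": 10.0, "B": 12.0, "C": 17.0}  # illustrative EMMs per level

# All pairs: B-A, C-A, C-B.
pairwise = {f"{b} - {a}": emms[b] - emms[a] for a, b in combinations(levels, 2)}

# Adjacent levels only: B-A, C-B. Depends on the level ordering.
sequential = {f"{b} - {a}": emms[b] - emms[a] for a, b in zip(levels, levels[1:])}

# Each level vs the mean of all previous levels. Also depends on ordering.
helmert = {
    lvl: emms[lvl] - mean(emms[p] for p in levels[:i])
    for i, lvl in enumerate(levels)
    if i > 0
}
```

Reordering `levels` changes which comparisons `sequential` and `helmert` produce, whereas `pairwise` always covers the same set of pairs (up to sign).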

Keyword Arguments for Scaling

Kwarg	Values	Effect
`effect_scale`	`"auto"` (default), `"link"`, `"response"`	`"auto"` = response for AME/GLM, else link
`varying`	`"exclude"` (default), `"include"`	Population vs group-specific (mixed models)
`inverse_transforms`	`True` (default), `False`	Auto-resolve `center(x)` / `zscore(x)`

filter_effects

filter_effects(*, terms: list[str] | str | None = None, levels: list[str] | str | None = None, contrasts: list[str] | str | None = None) -> pl.DataFrame

Filter effects DataFrame by term, level, or contrast.

Parameters:

Name	Type	Description	Default
`terms`	`list[str] \| str \| None`	Term name(s) to include (filters the `term` column).	`None`
`levels`	`list[str] \| str \| None`	Level name(s) to include (filters the first grid column).	`None`
`contrasts`	`list[str] \| str \| None`	Contrast name(s) to include (filters the `contrast` column).	`None`

Returns:

Type	Description
`DataFrame`	Filtered effects DataFrame.

Examples:

>>> m.explore("treatment").infer()
>>> m.filter_effects(levels=["A", "B"])
>>> m.explore("pairwise(treatment)").infer()
>>> m.filter_effects(contrasts="A - B")

filter_params

filter_params(terms: list[str] | str) -> pl.DataFrame

Filter params to specific coefficient terms.

Parameters:

Name	Type	Description	Default
`terms`	`list[str] \| str`	Term name(s) to include.	required

Returns:

Type	Description
`DataFrame`	Filtered DataFrame with only the specified terms.

Examples:

>>> m.filter_params("x")
>>> m.filter_params(["x", "z"])

filter_significant

filter_significant(alpha: float = 0.05) -> pl.DataFrame

Filter params to statistically significant results.

Parameters:

Name	Type	Description	Default
`alpha`	`float`	Significance threshold. Default 0.05.	`0.05`

Returns:

Type	Description
`DataFrame`	DataFrame with only rows where p_value < alpha.

Examples:

>>> m.fit().infer().filter_significant()
>>> m.filter_significant(0.01)

fit

fit(*, weights: str | None = None, offset: str | None = None, nAGQ: int | None = None, **kwargs: object) -> ModelResult

Fit the model to data.

Parameters:

Name	Type	Description	Default
`weights`	`str \| None`	Column name for observation weights (non-negative).	`None`
`offset`	`str \| None`	Column name for an offset term (added to the linear predictor).	`None`
`nAGQ`	`int \| None`	Gauss-Hermite quadrature points for GLMM (0=fast, 1=Laplace, >1=adaptive GHQ for scalar random intercept models).	`None`
`**kwargs`	`object`	Fitting options. Recognized keys: `method` (`"ols"`, `"ml"`, `"reml"`), `solver` (`"qr"`, `"irls"`, `"pls"`, `"pirls"`), `tol`, `max_iter`, `max_outer_iter`, `verbose`, `use_hessian`.	`{}`

Returns:

Name	Type	Description
`self`	`ModelResult`	For method chaining.

Examples:

Fit and inspect::

m = model("y ~ x", data).fit()
m.params

Chain with inference::

m = model("y ~ x", data).fit().infer()

infer

infer(how: Literal['asymp', 'boot', 'perm', 'cv', 'profile', 'joint'] = 'asymp', conf_level: float | int | str = 0.95, errors: Literal['iid', 'HC0', 'HC1', 'HC2', 'HC3', 'hetero', 'unequal_var'] = 'iid', null: float = 0.0, alternative: Literal['two-sided', 'greater', 'less'] = 'two-sided', *, n_boot: int = 1000, n_perm: int = 1000, ci_type: Literal['bca', 'percentile'] = 'bca', seed: int | None = None, n_jobs: int = 1, save_resamples: bool = True, k: int = 10, n_steps: int = 20, verbose: bool = False, threshold: float | None = None, profile: bool = True, holdout_group: str | None = None) -> ModelResult

Augment current results with statistical inference.

Operates on the last operation: .fit() -> params, .explore() -> effects, .predict() -> predictions, .simulate() -> simulations.

Use how="joint" for ANOVA-style joint hypothesis tests (F or chi-squared per term).

Which how methods apply to which operations::

Operation     asymp   boot   perm   cv   profile   joint
─────────     ─────   ────   ────   ──   ───────   ─────
.fit()          ✓      ✓      ✓     ✓
.explore()      ✓      ✓      ✓                     ✓
.predict()      ✓      ✓            ✓
.simulate()     ✓

profile is a special case: it operates on mixed-model variance components (varying_spread), not on the last operation’s results.

Parameters:

Name	Type	Description	Default
`how`	`Literal['asymp', 'boot', 'perm', 'cv', 'profile', 'joint']`	`"asymp"` (default), `"boot"`, `"perm"`, `"cv"`, `"profile"` (mixed model variance components), or `"joint"` (ANOVA-style per-term tests).	`'asymp'`
`conf_level`	`float \| int \| str`	Confidence level (0.95, 95, or `"95%"`). Default 0.95.	`0.95`
`errors`	`Literal['iid', 'HC0', 'HC1', 'HC2', 'HC3', 'hetero', 'unequal_var']`	Error structure. `"iid"` (default) for standard OLS/MLE errors, `"HC0"`–`"HC3"` for sandwich estimators, `"hetero"` (alias for HC3/HC0), or `"unequal_var"` for Welch-style per-cell SEs with Satterthwaite df.	`'iid'`
`null`	`float`	Null hypothesis value (default 0.0).	`0.0`
`alternative`	`Literal['two-sided', 'greater', 'less']`	`"two-sided"` (default), `"greater"`, or `"less"`.	`'two-sided'`
`n_boot`	`int`	Number of bootstrap resamples (`how="boot"`).	`1000`
`n_perm`	`int`	Number of permutations (`how="perm"`).	`1000`
`ci_type`	`Literal['bca', 'percentile']`	Bootstrap CI method: `"bca"` (default) or `"percentile"`.	`'bca'`
`seed`	`int \| None`	Random seed for reproducibility.	`None`
`n_jobs`	`int`	Parallel workers. Default 1 (serial).	`1`
`save_resamples`	`bool`	Store individual resample results. Default True.	`True`
`k`	`int`	Number of CV folds (`how="cv"`). Default 10.	`10`
`n_steps`	`int`	Profile steps (`how="profile"`). Default 20.	`20`
`verbose`	`bool`	Print progress. Default False.	`False`
`threshold`	`float \| None`	Significance threshold (`how="profile"`).	`None`
`profile`	`bool`	Auto-compute profile likelihood CIs for variance components when `how="boot"` or `how="perm"` on a mixed model. Default True. Set False to skip.	`True`

Returns:

Name	Type	Description
`self`	`ModelResult`	For method chaining.
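
Since `conf_level` accepts `0.95`, `95`, or `"95%"`, all three spellings presumably resolve to the same proportion. One way such normalization could work is sketched below; `normalize_conf_level` is a hypothetical helper, and the rule "values above 1 are percentages" is an assumption rather than documented behaviour.

```python
def normalize_conf_level(conf_level):
    """Map 0.95, 95, or "95%" to a proportion strictly between 0 and 1."""
    if isinstance(conf_level, str):
        conf_level = float(conf_level.strip().rstrip("%"))
    value = float(conf_level)
    if value > 1:  # assumption: values above 1 are percentages
        value /= 100
    if not 0 < value < 1:
        raise ValueError(f"conf_level must be in (0, 1), got {conf_level!r}")
    return value


levels = [normalize_conf_level(v) for v in (0.95, 95, "95%")]
```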

jointtest

jointtest(terms: list[str] | None = None, *, errors: Literal['iid', 'HC0', 'HC1', 'HC2', 'HC3', 'hetero', 'unequal_var'] = 'iid') -> pl.DataFrame

Compute ANOVA-style joint hypothesis tests for model terms.

Convenience method that auto-fits if needed, then computes joint (Type III) F-tests (gaussian) or chi-square tests (non-gaussian) for each model term.

Parameters:

Name	Type	Description	Default
`terms`	`list[str] \| None`	Specific terms to test, or None for all terms.	`None`
`errors`	`Literal['iid', 'HC0', 'HC1', 'HC2', 'HC3', 'hetero', 'unequal_var']`	Error structure. `"iid"` (default) for standard OLS/MLE errors, `"HC0"`–`"HC3"` or `"hetero"` for sandwich SEs, or `"unequal_var"` for Welch ANOVA with per-term Satterthwaite df.	`'iid'`

Returns:

Type	Description
`DataFrame`	pl.DataFrame: Columns: term, df1, df2 (F-test only), statistic, p_value.

Examples:

Auto-fits and returns ANOVA table::

m = model("y ~ x1 + x2", data)
m.jointtest()

Welch ANOVA (per-term Satterthwaite df)::

m = model("y ~ factor(group)", data)
m.jointtest(errors="unequal_var")

plot_design

plot_design(**kwargs: object) -> object

Plot design matrix as an annotated heatmap.

Parameters:

Name	Type	Description	Default
`**kwargs`	`object`	Forwarded to `plot_design`. Key options: `max_rows`, `annotate_terms`, `show_contrast_info`, `height`, `aspect`.	`{}`

Returns:

Name	Type	Description
`object`	`object`	Matplotlib figure.

plot_explore

plot_explore(specs: str, **kwargs: object) -> object

Plot marginal effects or estimated marginal means.

Parameters:

Name	Type	Description	Default
`specs`	`str`	Specification string for the marginal effects/means.	required
`**kwargs`	`object`	Forwarded to `plot_explore`. Key options: `hue`, `col`, `row`, `show_pvalue`, `ref_line`, `height`, `aspect`.	`{}`

Returns:

Name	Type	Description
`object`	`object`	Matplotlib figure.

plot_params

plot_params(**kwargs: object) -> object

Plot parameter estimates as a forest plot.

Parameters:

Name	Type	Description	Default
`**kwargs`	`object`	Forwarded to `plot_params`. Key options: `include_intercept`, `sort`, `show_values`, `show_pvalue`, `height`, `aspect`.	`{}`

Returns:

Name	Type	Description
`object`	`object`	Matplotlib figure.

plot_predict

plot_predict(term: str, **kwargs: object) -> object

Plot marginal predictions across a predictor range.

Accepts either a bare column name or an explore-style formula::

m.plot_predict("age")              # bare column
m.plot_predict("age ~ sex")        # hue by sex
m.plot_predict("age ~ sex@Female") # pin sex=Female
m.plot_predict("age@range(5)")     # 5-point grid
m.plot_predict("age@[25,50,75]")   # explicit grid values

Parameters:

Name	Type	Description	Default
`term`	`str`	Predictor variable or explore-style formula string.	required
`**kwargs`	`object`	Forwarded to `plot_predict`. Key options: `hue`, `col`, `at`, `interval`, `show_data`, `show_rug`, `height`, `aspect`.	`{}`

Returns:

Name	Type	Description
`object`	`object`	Matplotlib figure.

plot_profile

plot_profile(**kwargs: object) -> object

Plot profile likelihood curves.

Requires .infer(how="profile") to have been called first.

Parameters:

Name	Type	Description	Default
`**kwargs`	`object`	Forwarded to `plot_profile`. Key options: `height`, `aspect`.	`{}`

Returns:

Name	Type	Description
`object`	`object`	Matplotlib figure.

plot_ranef

plot_ranef(**kwargs: object) -> object

Plot random effect estimates.

Parameters:

Name	Type	Description	Default
`**kwargs`	`object`	Forwarded to `plot_ranef`. Key options: `group`, `term`, `show`, `sort`, `height`, `aspect`.	`{}`

Returns:

Name	Type	Description
`object`	`object`	Matplotlib figure.

plot_relationships

plot_relationships(**kwargs: object) -> object

Plot pairwise relationships between response and predictors.

Parameters:

Name	Type	Description	Default
`**kwargs`	`object`	Forwarded to `plot_relationships`. Key options: `show_vif`, `height`, `aspect`.	`{}`

Returns:

Name	Type	Description
`object`	`object`	Matplotlib figure.

plot_resamples

plot_resamples(**kwargs: object) -> object

Plot distribution of resampled statistics.

Parameters:

Name	Type	Description	Default
`**kwargs`	`object`	Forwarded to `plot_resamples`. Key options: `which`, `include_intercept`, `terms`, `show_ci`, `show_pvalue`.	`{}`

Returns:

Name	Type	Description
`object`	`object`	Matplotlib figure.

plot_resid

plot_resid(**kwargs: object) -> object

Plot residual diagnostics (4-panel grid).

Parameters:

Name	Type	Description	Default
`**kwargs`	`object`	Forwarded to `plot_resid`. Key options: `which`, `residual_type`, `lowess`, `label_outliers`, `height`.	`{}`

Returns:

Name	Type	Description
`object`	`object`	Matplotlib figure.

plot_vif

plot_vif(**kwargs: object) -> object

Plot VIF diagnostics as correlation heatmap.

Parameters:

Name	Type	Description	Default
`**kwargs`	`object`	Forwarded to `plot_vif`. Key options: `cmap`, `height`, `aspect`.	`{}`

Returns:

Name	Type	Description
`object`	`object`	Matplotlib figure.

predict

predict(newdata: str | pl.DataFrame | None = None, type: Literal['response', 'link'] = 'response', *, varying: Literal['exclude', 'include'] = 'exclude', allow_new_levels: bool = False, n_points: int | Literal['data'] = 50, **kwargs: object) -> ModelResult

Generate predictions from the fitted model.

Accepts a formula string, a DataFrame, or None:

Formula mode (str): Builds a prediction grid from an explore-style formula and returns grid columns prepended to fitted values — like R’s ggpredict() / effects package.
DataFrame mode (pl.DataFrame): Predicts on the given data.
None: Predicts on the training data.

Parameters:

Name	Type	Description	Default
`newdata`	`str \| DataFrame \| None`	Formula string (e.g. `"wt ~ cyl"`), a Polars DataFrame, or None for training-data predictions.	`None`
`type`	`Literal['response', 'link']`	Scale: `"response"` (default) or `"link"`.	`'response'`
`varying`	`Literal['exclude', 'include']`	Random effects: `"exclude"` (default, population-level) or `"include"` (conditional with BLUPs).	`'exclude'`
`allow_new_levels`	`bool`	If True, unseen groups predict at population level. Default False.	`False`
`n_points`	`int \| Literal['data']`	Number of grid points for continuous focal variables in formula mode. Use `"data"` for observed unique values. Default 50. Ignored in DataFrame/None mode.	`50`
`**kwargs`	`object`	Reserved for future use.	`{}`

Returns:

Name	Type	Description
`self`	`ModelResult`	For method chaining. Results in `.predictions`.

Examples:

Formula-mode predictions (like R ggpredict)::

m = model("mpg ~ wt + hp + cyl", mtcars).fit()
m.predict("wt ~ cyl").predictions
# Returns: wt | cyl | fitted

m.predict("wt ~ cyl").infer().predictions
# Returns: wt | cyl | fitted | se | ci_lower | ci_upper

m.predict("wt@range(10) ~ cyl@[4, 6]").predictions
m.predict("wt", n_points="data").predictions

DataFrame predictions (unchanged)::

m.predict(new_df).predictions

Training-data predictions (unchanged)::

m.predict().predictions
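
The interaction between `varying` and `allow_new_levels` can be pictured with a toy random-intercept predictor. Everything here is hypothetical (the coefficients, the `blups` dictionary, the `predict_one` helper), and raising `ValueError` for an unseen group when `allow_new_levels=False` is an assumption: the documentation above only states what happens when it is True.

```python
population_intercept, slope = 2.0, 0.5
blups = {"g1": 0.3, "g2": -0.3}  # hypothetical random-intercept offsets


def predict_one(x, group, varying="exclude", allow_new_levels=False):
    """Population-level prediction, optionally shifted by the group's BLUP."""
    fitted = population_intercept + slope * x
    if varying == "include":
        if group in blups:
            fitted += blups[group]
        elif not allow_new_levels:
            # Assumed behaviour; the library's actual error may differ.
            raise ValueError(f"unseen group level: {group!r}")
        # Unseen group with allow_new_levels=True: stay at population level.
    return fitted


pop = predict_one(2.0, "g1")                                   # population-level
cond = predict_one(2.0, "g1", varying="include")               # adds g1's BLUP
new = predict_one(2.0, "g9", varying="include", allow_new_levels=True)
```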

reset_contrasts

reset_contrasts() -> Self

Reset contrast coding to defaults (treatment/dummy coding).

.. deprecated:: Create a new model instead of resetting contrasts in place.

Clears any custom contrasts set via set_contrasts(). The model must be re-fitted after resetting.

Returns:

Name	Type	Description
`self`	`Self`	The model instance for method chaining.

set_contrasts

set_contrasts(**contrasts: object) -> Self

Set custom contrast coding for categorical predictors.

.. deprecated:: Use formula syntax (sum(x), treatment(x, ref=B)) or the contrasts= constructor kwarg for ndarray matrices instead.

Parameters:

Name	Type	Description	Default
`**contrasts`	`object`	Column name → contrast spec. Values: a string (`'treatment'`, `'sum'`, `'helmert'`, `'poly'`, `'sequential'`), a tuple `('treatment', 'B')` for custom reference, or an ndarray contrast matrix.	`{}`

Returns:

Name	Type	Description
`self`	`Self`	For method chaining. Model must be re-fitted after.

set_display

set_display(enabled: bool) -> None

Toggle automatic result display in the REPL and notebooks.

When enabled (the default), evaluating a model object after calling .fit(), .explore(), .predict(), or .simulate() automatically shows the relevant result table below the model header. Disable for a compact one-line repr.

Parameters:

Name	Type	Description	Default
`enabled`	`bool`	True to auto-display results, False for compact repr.	required

show_math

show_math(*, explanations: bool = True) -> 'MathDisplay'

Display structural LaTeX equation with term explanations.

Works before or after .fit(). Without data, shows a generic symbolic form. With data, shows specific factor levels and transform parameters.

Parameters:

Name	Type	Description	Default
`explanations`	`bool`	If True (default), include term explanations with contrast types, reference levels, and transformation parameters.	`True`

Returns:

Type	Description
`'MathDisplay'`	MathDisplay object with rich display support (repr_latex, repr_html, to_latex).

Examples:

>>> m = model("y ~ x + group", data)
>>> m.show_math()           # renders in Jupyter
>>> m.show_math().to_latex() # raw LaTeX string

simulate

simulate(n: int | None = None, nsim: int | None = None, seed: int | None = None, coef: dict[str, float] | None = None, sigma: float = 1.0, varying: str = 'fitted', power: int | dict[str, Any] | None = None, **var_specs: object) -> ModelResult

Generate data from scratch or simulate responses from a fitted model.

Pre-fit mode (n=): generate a dataset from formula and distributions. Post-fit mode (nsim=): simulate response vectors from the fitted model. Power mode (power=): run simulation-based power analysis.

Parameters:

Name	Type	Description	Default
`n`	`int \| None`	Number of observations to generate (pre-fit mode).	`None`
`nsim`	`int \| None`	Number of response vectors to simulate (post-fit mode).	`None`
`seed`	`int \| None`	Random seed for reproducibility.	`None`
`coef`	`dict[str, float] \| None`	True coefficient values for pre-fit mode.	`None`
`sigma`	`float`	Residual SD for gaussian pre-fit mode (default 1.0).	`1.0`
`varying`	`str`	Random effects handling in post-fit: `"fitted"` or `"sample"`.	`'fitted'`
`power`	`int \| dict[str, Any] \| None`	Power analysis configuration. `int` for simple (number of sims), `dict` for sweeps (e.g., `{"n_sims": 500, "n": [50, 100, 200]}`). When set, overrides `nsim` and runs power analysis mode.	`None`
`**var_specs`	`object`	Distribution specs for predictors (e.g., `x=normal(0, 1)`).	`{}`

Returns:

Name	Type	Description
`self`	`ModelResult`	For method chaining.
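
Conceptually, pre-fit simulation for a gaussian model draws predictors from their specified distributions and then builds the response from the true coefficients plus noise with SD `sigma`. The sketch below shows that idea for one standard-normal predictor; `simulate_gaussian` and the coefficient keys `"Intercept"` and `"x"` are hypothetical, and the library's own generator and naming may differ.

```python
import random


def simulate_gaussian(n, coef, sigma=1.0, seed=None):
    """Draw x ~ Normal(0, 1) and y = Intercept + slope * x + Normal(0, sigma)."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        x = rng.gauss(0.0, 1.0)
        y = coef["Intercept"] + coef["x"] * x + rng.gauss(0.0, sigma)
        rows.append({"x": x, "y": y})
    return rows


true_coef = {"Intercept": 1.0, "x": 0.5}
sim = simulate_gaussian(n=100, coef=true_coef, sigma=1.0, seed=42)
```

Fixing `seed` makes the generated dataset reproducible, which is the role the `seed` parameter plays above.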

summary

summary(style: Literal['r', 'compact'] = 'r', digits: int = 3) -> FormattedText

Generate a formatted model summary (R-style or compact).

Parameters:

Name	Type	Description	Default
`style`	`Literal['r', 'compact']`	`"r"` (default) for full R-style output, `"compact"` for brief.	`'r'`
`digits`	`int`	Decimal places for numeric values. Default 3.	`3`

Returns:

Name	Type	Description
`FormattedText`	`FormattedText`	Rich display object that renders automatically in REPL and Jupyter notebooks without needing `print()`.

to_effect_size

to_effect_size(*, include_intercept: bool = False) -> pl.DataFrame

Compute standardized effect sizes from params.

Computes Cohen’s d, semi-partial r, eta-squared, and odds ratio (for binomial models).

Parameters:

Name	Type	Description	Default
`include_intercept`	`bool`	Whether to include the intercept row.	`False`

Returns:

Type	Description
`DataFrame`	DataFrame with added effect size columns (d, r_semi, eta_sq, and odds_ratio for binomial).

Examples:

>>> m = model("y ~ x1 + x2", data).fit().infer()
>>> m.to_effect_size()

to_odds_ratio

to_odds_ratio() -> pl.DataFrame

Transform params to odds ratio scale (binomial GLM only).

Exponentiates estimate, ci_lower, and ci_upper columns.

Returns:

Type	Description
`DataFrame`	DataFrame with exponentiated values.

Examples:

>>> m = model("y ~ x", data, family="binomial").fit().infer()
>>> m.to_odds_ratio()
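
The transformation itself is just exponentiation of the log-odds columns. A minimal sketch on a single invented row, using a plain dictionary in place of the params DataFrame:

```python
import math

# Illustrative log-odds row, shaped like a params row after .infer().
row = {"term": "x", "estimate": 0.405, "ci_lower": 0.100, "ci_upper": 0.710}

# Exponentiate only the numeric log-odds columns; leave the label alone.
odds_ratio = {
    key: math.exp(val) if key in ("estimate", "ci_lower", "ci_upper") else val
    for key, val in row.items()
}
```

Because `exp` is monotone, the interval bounds keep their order, and an interval that excluded 0 on the log-odds scale excludes 1 on the odds-ratio scale.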

to_response_scale

to_response_scale() -> pl.DataFrame

Transform effects from link scale to response scale.

Applies the inverse link function to estimate and CI columns on the effects DataFrame. For example, converts log-odds to probabilities for logistic models.

Returns:

Type	Description
`DataFrame`	DataFrame with values on response scale.

Examples:

>>> m = model("y ~ x", data, family="binomial").fit()
>>> m.explore("x").infer()
>>> m.to_response_scale()
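
For a logistic model, the inverse link is the logistic function, so the conversion from log-odds to probabilities looks like the sketch below. The numbers are invented and the dictionary stands in for one row of the effects table.

```python
import math


def inv_logit(eta):
    """Inverse of the logit link: log-odds -> probability."""
    return 1.0 / (1.0 + math.exp(-eta))


link_scale = {"estimate": 0.0, "ci_lower": -1.0, "ci_upper": 1.0}
response_scale = {key: inv_logit(val) for key, val in link_scale.items()}
```

A log-odds of 0 maps to a probability of 0.5, and the interval endpoints are transformed individually, so the interval is generally not symmetric on the response scale even when it is on the link scale.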

vif

vif() -> pl.DataFrame

Compute variance inflation factors for model predictors.

VIF measures multicollinearity — how much the variance of each coefficient is inflated due to correlation with other predictors. Only requires data (not fitting), since VIF is a property of the design matrix.

Returns:

Type	Description
`DataFrame`	DataFrame with columns: term, vif, ci_increase_factor.

Examples:

>>> m = model("y ~ x1 + x2 + x3", data)
>>> m.vif()