Unified statistical model with simulation-first support.
Provides a single interface for lm, glm, lmer, and glmer. Model type is inferred from formula structure and family parameter.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
formula | str | R/Patsy-style formula (e.g., "y ~ x", "y ~ x + (1 | group)"). | required |
data | DataFrame | str | None | Input data. Accepts a Polars DataFrame, a file path (str to CSV/TSV/Parquet/JSON/NDJSON), or None for simulation-first workflows. | None |
family | str | Response distribution: "gaussian" (default), "binomial", "poisson", "gamma", or "tdist". | 'gaussian' |
link | str | None | Link function ("identity", "logit", "log", etc.). None uses the canonical link for the family. | None |
method | str | None | Estimation method: "ols", "ml", or "reml". None auto-selects. | None |
missing | str | Missing value handling: "drop" (default) or "fail". | 'drop' |
contrasts | dict | None | Custom contrast matrices for categorical predictors. Dict mapping column names to ndarray matrices of shape (n_levels, n_levels - 1). For named contrasts, use formula syntax instead: sum(x), treatment(x, ref=B), etc. | None |
Notes:
Formula Operators
| Operator | Example | Meaning |
|---|---|---|
+ | a + b | Main effects (additive) |
* | a * b | Main effects + interaction (a + b + a:b) |
: | a:b | Interaction term only |
** / ^ | x**2 | Power term (quadratic, cubic, etc.) |
\| | (1 \| group) | Random effect (mixed models) |
In-Formula Transforms
| Transform | Effect | Example |
|---|---|---|
center(x) | x - mean(x) | y ~ center(age) |
norm(x) | x / sd(x) | y ~ norm(income) |
zscore(x) | (x - mean(x)) / sd(x) | y ~ zscore(x) |
scale(x) | (x - mean(x)) / (2 * sd(x)) — Gelman scaling | y ~ scale(x) |
rank(x) | Average-method rank | y ~ rank(x) |
signed_rank(x) | sign(x) * rank(|x|) | y ~ signed_rank(x) |
log(x) | Natural log | y ~ log(x) |
log10(x) | Base-10 log | y ~ log10(x) |
sqrt(x) | Square root | y ~ sqrt(x) |
poly(x, d) | Orthogonal polynomial basis of degree d | y ~ poly(x, 2) |
factor(x) | Treat as categorical | y ~ factor(year) |
All stateful transforms (center, norm, zscore, scale,
rank, signed_rank) learn parameters from training data and
reapply them on new data during .predict(). Nesting is supported:
zscore(rank(x)).
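The "learn on training data, reapply in .predict()" behaviour can be pictured with a small stand-alone sketch. The `ZScore` class below is hypothetical and is not this library's implementation; it also assumes the sample standard deviation (n - 1 denominator), which the table above does not specify.

```python
from statistics import mean, stdev


class ZScore:
    """Hypothetical stateful transform: learns mean/sd once, reuses them later."""

    def fit(self, values):
        # Parameters are learned from the training data only.
        self.mean_ = mean(values)
        self.sd_ = stdev(values)  # assumption: sample SD (n - 1 denominator)
        return self

    def transform(self, values):
        # New data is scaled with the stored training parameters,
        # not with its own mean and SD.
        return [(v - self.mean_) / self.sd_ for v in values]


train = [2.0, 4.0, 6.0, 8.0]
z = ZScore().fit(train)
train_z = z.transform(train)
new_z = z.transform([5.0])  # 5.0 equals the training mean
```

Because the stored mean is reused, a new observation equal to the training mean maps to zero regardless of what other rows accompany it at prediction time.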
Contrast Coding (In-Formula)
| Function | Example | Intercept means | Coefficients mean |
|---|---|---|---|
treatment(x) | y ~ treatment(group) | Reference level mean | Difference from reference |
sum(x) | y ~ sum(group) | Grand mean | Deviation from grand mean |
helmert(x) | y ~ helmert(group) | Grand mean | Level vs. mean of previous levels |
poly(x) | y ~ poly(group) | Grand mean | Linear, quadratic, cubic trends |
sequential(x) | y ~ sequential(group) | First level mean | Successive differences (B-A, C-B) |
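To make the treatment and sum rows of the table concrete, here is a minimal sketch of the two contrast matrices in the (n_levels, n_levels - 1) shape that the contrasts parameter expects. The helper names are hypothetical, plain lists stand in for ndarrays, and omitting the last level by default for sum coding is an assumption (it mirrors R's contr.sum), not something this page states.

```python
def treatment_contrasts(n_levels, ref=0):
    """Rows = levels, columns = non-reference levels (0/1 indicators)."""
    cols = [i for i in range(n_levels) if i != ref]
    return [[1.0 if row == c else 0.0 for c in cols] for row in range(n_levels)]


def sum_contrasts(n_levels, omit=None):
    """Like treatment coding, but the omitted level's row is all -1."""
    omit = n_levels - 1 if omit is None else omit  # assumption: last level omitted
    cols = [i for i in range(n_levels) if i != omit]
    return [
        [-1.0] * len(cols) if row == omit else [1.0 if row == c else 0.0 for c in cols]
        for row in range(n_levels)
    ]


T = treatment_contrasts(3)  # reference = first level
S = sum_contrasts(3)        # omitted = last level
```

Each column of the sum-coded matrix adds up to zero, which is why the intercept becomes the grand mean rather than the reference-level mean.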
Set a custom reference level or level ordering with extra arguments:
treatment(group, ref=B), sum(group, omit=A),
helmert(group, [low, med, high]). For example::
model("y ~ sum(group) + x", data) # sum coding
model("y ~ treatment(group, ref=B) + x", data) # treatment with ref="B"
model("y ~ group + x", data, contrasts={"group": M}) # custom ndarray matrix
Attributes:
| Name | Type | Description |
|---|---|---|
contrasts | dict | None | Custom contrast matrices mapping column names to ndarray matrices. |
data | DataFrame | str | None | Input data as a Polars DataFrame, file path, or None for simulation. |
designmat | DataFrame | Fixed-effects design matrix as a named DataFrame. |
diagnostics | DataFrame | Model-level goodness-of-fit diagnostics. |
effects | DataFrame | Marginal effects or estimated marginal means (EMMs) table. |
family | str | Response distribution ("gaussian", "binomial", "poisson", "gamma", "tdist"). |
formula | str | R-style model formula string (e.g., "y ~ x + (1|group)"). |
link | str | None | Link function (e.g., "identity", "logit", "log"). None uses canonical link. |
metadata | DataFrame | Model metadata: observation counts, parameter count, group counts. |
method | str | None | Estimation method ("ols", "ml", "reml"). None auto-selects. |
missing | str | Missing value handling: "drop" (default) or "fail". |
params | DataFrame | Coefficient estimates table from the fitted model. |
power_results | DataFrame | Power analysis results from simulate(power=...). |
predictions | DataFrame | Predictions table from the fitted model. |
resamples | DataFrame | None | Raw resampled values from bootstrap or permutation inference. |
simulations | DataFrame | Simulations table from data generation or response simulation. |
varying_corr | DataFrame | Random effect correlations for mixed models. |
varying_offsets | DataFrame | Random effects (BLUPs) -- group-level deviations from population parameters. |
varying_params | DataFrame | Group-specific coefficients (population params + varying offsets). |
varying_spread | DataFrame | Variance components for mixed models (sigma2, tau2, rho, ICC). |
Methods:
| Name | Description |
|---|---|
explore | Compute marginal effects or estimated marginal means (EMMs). |
filter_effects | Filter effects DataFrame by term, level, or contrast. |
filter_params | Filter params to specific coefficient terms. |
filter_significant | Filter params to statistically significant results. |
fit | Fit the model to data. |
infer | Augment current results with statistical inference. |
jointtest | Compute ANOVA-style joint hypothesis tests for model terms. |
plot_design | Plot design matrix as an annotated heatmap. |
plot_explore | Plot marginal effects or estimated marginal means. |
plot_params | Plot parameter estimates as a forest plot. |
plot_predict | Plot marginal predictions across a predictor range. |
plot_profile | Plot profile likelihood curves. |
plot_ranef | Plot random effect estimates. |
plot_relationships | Plot pairwise relationships between response and predictors. |
plot_resamples | Plot distribution of resampled statistics. |
plot_resid | Plot residual diagnostics (4-panel grid). |
plot_vif | Plot VIF diagnostics as correlation heatmap. |
predict | Generate predictions from the fitted model. |
reset_contrasts | Reset contrast coding to defaults (treatment/dummy coding). |
set_contrasts | Set custom contrast coding for categorical predictors. |
set_display | Toggle automatic result display in the REPL and notebooks. |
show_math | Display structural LaTeX equation with term explanations. |
simulate | Generate data from scratch or simulate responses from a fitted model. |
summary | Generate a formatted model summary (R-style or compact). |
to_effect_size | Compute standardized effect sizes from params. |
to_odds_ratio | Transform params to odds ratio scale (binomial GLM only). |
to_response_scale | Transform effects from link scale to response scale. |
vif | Compute variance inflation factors for model predictors. |
Attributes
contrasts
contrasts: dict | None = field(default=None, repr=False)
Custom contrast matrices mapping column names to ndarray matrices.
data
data: pl.DataFrame | str | None = field(default=None, repr=False, converter=_coerce_data, validator=(validators.optional(validators.instance_of(pl.DataFrame))), on_setattr=(setters.frozen))
Input data as a Polars DataFrame, file path, or None for simulation.
designmat
designmat: pl.DataFrame
Fixed-effects design matrix as a named DataFrame.
Returns:
| Type | Description |
|---|---|
DataFrame | pl.DataFrame: n x p design matrix with columns named after model terms. |
diagnostics
diagnostics: pl.DataFrame
Model-level goodness-of-fit diagnostics.
Returns a single-row DataFrame with metrics that vary by model type:
df, AIC, BIC, loglik, R-squared (OLS), deviance/dispersion (GLM),
sigma/ICC (mixed). After .infer(how='cv'), adds cv_error.
Returns:
| Type | Description |
|---|---|
DataFrame | pl.DataFrame: Single-row DataFrame with model diagnostics. |
effects
effects: pl.DataFrame
Marginal effects or estimated marginal means (EMMs) table.
Returns:
| Type | Description |
|---|---|
DataFrame | pl.DataFrame: Grid columns (focal variable levels or contrast labels) plus estimate. After .infer(), adds se, statistic, df, p_value, ci_lower, ci_upper. |
family
family: str = field(default='gaussian', validator=(validators.in_(FAMILIES)))
Response distribution ("gaussian", "binomial", "poisson", "gamma", "tdist").
formula
formula: str = field(on_setattr=(setters.frozen))
R-style model formula string (e.g., "y ~ x + (1|group)").
link
link: str | None = field(default=None)
Link function (e.g., "identity", "logit", "log"). None uses canonical link.
metadata
metadata: pl.DataFrame
Model metadata: observation counts, parameter count, group counts.
Returns a single-row DataFrame with structural info about the
model and data. For mixed models, includes ngroups.
Returns:
| Type | Description |
|---|---|
DataFrame | pl.DataFrame: Single-row DataFrame with model metadata. |
method
method: str | None = field(default=None)
Estimation method ("ols", "ml", "reml"). None auto-selects.
missing
missing: str = field(default='drop', validator=(validators.in_(['drop', 'fail'])))
Missing value handling: "drop" (default) or "fail".
params
params: pl.DataFrame
Coefficient estimates table from the fitted model.
Returns:
| Type | Description |
|---|---|
DataFrame | pl.DataFrame: Columns: term, estimate. After .infer(), adds se, statistic, df, p_value, ci_lower, ci_upper. |
power_results
power_results: pl.DataFrame
Power analysis results from simulate(power=...).
Returns:
| Type | Description |
|---|---|
DataFrame | DataFrame with columns: n, sigma, term, true_value, power, power_ci_lower, power_ci_upper, coverage, bias, rmse, mean_se, empirical_se, n_sims, n_failed. |
predictions
predictions: pl.DataFrame
Predictions table from the fitted model.
Returns:
| Type | Description |
|---|---|
DataFrame | pl.DataFrame: Columns: fitted (and link for GLMs). After .infer(), adds se, ci_lower, ci_upper. |
resamples
resamples: pl.DataFrame | None
Raw resampled values from bootstrap or permutation inference.
Returns a long-format DataFrame with one row per (resample, term) combination:
resample (int): Resample index (0 to n_resamples - 1).
term (str): Parameter or effect name.
value (float): Resampled estimate (bootstrap) or null test statistic (permutation).
Total rows = n_resamples × n_terms.
Returns None if .infer() hasn’t been called with
how="boot"|"perm", or if save_resamples=False was passed.
Returns:
| Type | Description |
|---|---|
DataFrame | None | pl.DataFrame |
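The long layout described above (one row per resample and term) is convenient for computing your own summaries. The sketch below builds such a table by hand with the standard library and derives a percentile interval from it; the data, the two "terms", and the `percentile_ci` helper are all made up for illustration, and none of this is the library's own bootstrap code.

```python
import random
from statistics import mean, stdev

rng = random.Random(0)
data = [rng.gauss(10.0, 2.0) for _ in range(40)]  # made-up sample
n_resamples = 200
estimators = {"mean": mean, "sd": stdev}          # stand-ins for model terms

# One row per (resample, term), mirroring the resample/term/value columns.
rows = []
for b in range(n_resamples):
    sample = rng.choices(data, k=len(data))       # resample with replacement
    for term, fn in estimators.items():
        rows.append({"resample": b, "term": term, "value": fn(sample)})


def percentile_ci(rows, term, conf_level=0.95):
    """Hypothetical helper: percentile interval for one term."""
    values = sorted(r["value"] for r in rows if r["term"] == term)
    alpha = 1.0 - conf_level
    lo = values[int(alpha / 2 * (len(values) - 1))]
    hi = values[int((1 - alpha / 2) * (len(values) - 1))]
    return lo, hi


ci_mean = percentile_ci(rows, "mean")
```

The row count follows the rule stated above: 200 resamples times 2 terms gives 400 rows.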
simulations
simulations: pl.DataFrame
Simulations table from data generation or response simulation.
Pre-fit mode (n=): full generated dataset with predictors and response.
Post-fit mode (nsim=): sim_1, sim_2, ... columns with simulated responses.
Returns:
| Type | Description |
|---|---|
DataFrame | pl.DataFrame: Simulated data (contents vary by mode). |
varying_corr
varying_corr: pl.DataFrame
Random effect correlations for mixed models.
Returns a tidy DataFrame of pairwise correlations between random effects. Empty (zero rows) for models with only random intercepts or uncorrelated (diagonal) RE structures.
Returns:
| Type | Description |
|---|---|
DataFrame | pl.DataFrame: Columns: group, effect1, effect2, corr. |
varying_offsets
varying_offsets: pl.DataFrame
Random effects (BLUPs) -- group-level deviations from population parameters.
Returns:
| Type | Description |
|---|---|
DataFrame | pl.DataFrame: Columns: group, level, plus one column per random effect. After .infer(), adds pi_lower/pi_upper prediction intervals. |
varying_params
varying_params: pl.DataFrame
Group-specific coefficients (population params + varying offsets).
Returns:
| Type | Description |
|---|---|
DataFrame | pl.DataFrame: Columns: group, level, plus one column per random effect with the conditional coefficient (population + BLUP). |
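The relationship between the three tables (population params, varying offsets, and group-specific coefficients) is simple addition. A tiny sketch with made-up numbers and plain dicts standing in for the DataFrames:

```python
# Made-up population-level estimates (as in params)
population = {"Intercept": 250.0, "days": 10.0}

# Made-up BLUPs per group level (as in varying_offsets)
offsets = {
    "s1": {"Intercept": -5.0, "days": 2.5},
    "s2": {"Intercept": 12.0, "days": -1.0},
}

# Conditional coefficient = population estimate + group-level offset
varying_params = {
    level: {term: population[term] + blup[term] for term in blup}
    for level, blup in offsets.items()
}
```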
varying_spread
varying_spread: pl.DataFrame
Variance components for mixed models (sigma2, tau2, rho, ICC).
Returns:
| Type | Description |
|---|---|
DataFrame | pl.DataFrame: Columns: component, estimate. After .infer(how="profile"), or ``.infer(how=“boot” |
Methods
explore
explore(formula: str, *, inverse_transforms: bool = True, effect_scale: Literal['link', 'response'] = 'link', varying: Literal['exclude', 'include'] = 'exclude', how: Literal['auto', 'mem', 'ame'] = 'auto', by: str | None = None) -> ModelResult
Compute marginal effects or estimated marginal means (EMMs).
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
formula | str | Explore formula specifying the focal variable (left of ~) and optional conditions (right of ~). Grammar: lhs [ '~' rhs ], where lhs := focal_term | contrast_fn '(' focal_term ')'. | required |
inverse_transforms | bool | When True (default), raw variable names and values in the explore formula are auto-resolved through learned formula transforms (e.g. center, zscore, scale). Set to False to use transformed-scale names/values directly. | True |
effect_scale | Literal['link', 'response'] | "link" (default) or "response" (inverse-link / data scale). | 'link' |
varying | Literal['exclude', 'include'] | "exclude" (default) or "include" random effects. | 'exclude' |
how | Literal['auto', 'mem', 'ame'] | "auto" (default), "mem" for emmeans-style balanced reference grid (Marginal Estimated Mean), or "ame" for g-computation / Average Marginal Effect. "auto" selects "ame" for GLMs (non-identity link), "mem" for linear models. | 'auto' |
by | str | None | Column name for subgroup analysis. Computes separate effects within each level of this variable. | None |
Returns:
| Name | Type | Description |
|---|---|---|
self | ModelResult | For method chaining. Results in .effects. |
Examples:
EMMs and contrasts::
m = model("y ~ treatment", data).fit()
m.explore("treatment").infer()
m.explore("pairwise(treatment)").infer()
With transforms (raw variable names resolve automatically)::
m = model("y ~ center(x) + group", data).fit()
m.explore("group ~ x@[50]") # auto-centers 50
Notes:
Terminology: The focal variable (LHS) is the variable whose
effect is estimated. Conditions (RHS, after ~) control the
reference grid — pin covariates at values or cross the focal with
levels of another variable. The @ operator sets specific values;
[A - B] brackets specify contrasts.
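A bracket contrast is a weighted sum of estimated marginal means. The sketch below works through three bracket forms by hand using made-up EMMs; the `contrast` helper is hypothetical and only illustrates the arithmetic, not how the library parses or evaluates these formulas.

```python
emms = {"A": 10.0, "B": 14.0, "C": 9.0}  # made-up estimated marginal means


def contrast(weights, emms):
    """Linear contrast: sum of weight * EMM over the listed levels."""
    return sum(w * emms[level] for level, w in weights.items())


# "X[A - B]": A minus B
a_minus_b = contrast({"A": 1.0, "B": -1.0}, emms)

# "X[(A + B) - C]": mean of A and B, minus C
ab_vs_c = contrast({"A": 0.5, "B": 0.5, "C": -1.0}, emms)

# "X[* - B]": every other level minus B
each_vs_b = {
    level: contrast({level: 1.0, "B": -1.0}, emms)
    for level in emms
    if level != "B"
}
```

In every case the weights sum to zero, which is what makes the result a difference between means rather than a mean itself.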
Formula Patterns
| Formula | Computes |
|---|---|
"X" | EMMs (categorical) or marginal slope (continuous) |
"X@v" | EMM at a single value |
"X@[a, b, c]" | EMMs at specific levels or values |
"X@range(n)" | EMMs at n evenly-spaced values |
"X@quantile(n)" | EMMs at n quantile values |
"X ~ Z" | Cross focal X with levels of Z |
"X ~ Z@v" | Pin condition Z at value v |
"X ~ Z@[v1, v2]" | Cross focal X with specific Z values |
"X ~ Z@range(n)" | Cross focal X with n evenly-spaced Z values |
"X ~ Z@quantile(n)" | Cross focal X with n quantile Z values |
"X ~ A + B" | Multiple conditions |
"X[A - B]" | Contrast: A minus B |
"X[A - B, C - B]" | Multiple contrasts |
"X[* - B]" | Each other level vs B |
"X[(A + B) - C]" | mean(A, B) minus C |
"X[A - B] ~ Z[C - D]" | Interaction contrast: (A-B at C) minus (A-B at D) |
"x ~ Z[A - B]" | Slope of x at A minus slope at B |
"X:Z[A:C - B:D]" | Cell contrast: A:C minus cell B:D |
Contrast Functions
| Formula | Computes | Order-dependent |
|---|---|---|
treatment(X, ref=A) | Each level vs reference A | No |
dummy(X, ref=A) | Same as treatment | No |
sum(X) | Each level vs grand mean | No |
deviation(X, ref=A) | Same as sum | No |
poly(X) / poly(X, d) | Orthogonal polynomials up to degree d | Yes |
sequential(X) | Adjacent: B-A, C-B | Yes |
helmert(X) | Each level vs mean of previous | Yes |
pairwise(X) | All pairs: B-A, C-A, C-B | No |
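The "Order-dependent" column can be seen by enumerating the comparisons directly. This stand-alone sketch generates the labels for pairwise and sequential contrasts from an ordered list of levels; it illustrates the definitions in the table only and makes no claim about the labels the library itself emits.

```python
from itertools import combinations

levels = ["A", "B", "C"]

# pairwise: later level minus earlier level, for every pair of levels
pairwise = [f"{hi} - {lo}" for lo, hi in combinations(levels, 2)]

# sequential: each level minus the one immediately before it
sequential = [f"{b} - {a}" for a, b in zip(levels, levels[1:])]

# Reordering the levels changes which neighbours are compared
reordered = ["A", "C", "B"]
sequential_reordered = [f"{b} - {a}" for a, b in zip(reordered, reordered[1:])]
```

Reordering leaves the set of pairwise comparisons unchanged up to sign, but it changes which levels sequential coding treats as adjacent.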
Keyword Arguments for Scaling
| Kwarg | Values | Effect |
|---|---|---|
effect_scale | "link" (default), "response" | Linear predictor vs response scale (GLMs) |
varying | "exclude" (default), "include" | Population vs group-specific (mixed models) |
inverse_transforms | True (default), False | Auto-resolve center(x) / zscore(x) |
filter_effects
filter_effects(*, terms: list[str] | str | None = None, levels: list[str] | str | None = None, contrasts: list[str] | str | None = None) -> pl.DataFrame
Filter effects DataFrame by term, level, or contrast.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
terms | list[str] | str | None | Term name(s) to include (filters 'term' column). | None |
levels | list[str] | str | None | Level name(s) to include (filters first grid column). | None |
contrasts | list[str] | str | None | Contrast name(s) to include (filters 'contrast' column). | None |
Returns:
| Type | Description |
|---|---|
DataFrame | Filtered effects DataFrame. |
Examples:
>>> m.explore("treatment").infer()
>>> m.filter_effects(levels=["A", "B"])
>>> m.explore("pairwise(treatment)").infer()
>>> m.filter_effects(contrasts="A - B")
filter_params
filter_params(terms: list[str] | str) -> pl.DataFrame
Filter params to specific coefficient terms.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
terms | list[str] | str | Term name(s) to include. | required |
Returns:
| Type | Description |
|---|---|
DataFrame | Filtered DataFrame with only the specified terms. |
Examples:
>>> m.filter_params("x")
>>> m.filter_params(["x", "z"])
filter_significant
filter_significant(alpha: float = 0.05) -> pl.DataFrame
Filter params to statistically significant results.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
alpha | float | Significance threshold. Default 0.05. | 0.05 |
Returns:
| Type | Description |
|---|---|
DataFrame | DataFrame with only rows where p_value < alpha. |
Examples:
>>> m.fit().infer().filter_significant()
>>> m.filter_significant(0.01)
fit
fit(*, weights: str | None = None, offset: str | None = None, nAGQ: int | None = None, **kwargs: object) -> ModelResult
Fit the model to data.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
weights | str | None | Column name for observation weights (non-negative). | None |
offset | str | None | Column name for an offset term (added to the linear predictor). | None |
nAGQ | int | None | Gauss-Hermite quadrature points for GLMM (0=fast, 1=Laplace, >1=adaptive GHQ for scalar random intercept models). | None |
**kwargs | object | Fitting options. Recognized keys: method ("ols", "ml", "reml"), solver ("qr", "irls", "pls", "pirls"), tol, max_iter, max_outer_iter, verbose, use_hessian. | {} |
Returns:
| Name | Type | Description |
|---|---|---|
self | ModelResult | For method chaining. |
Examples:
Fit and inspect::
m = model("y ~ x", data).fit()
m.params
Chain with inference::
m = model("y ~ x", data).fit().infer()
infer
infer(how: Literal['asymp', 'boot', 'perm', 'cv', 'profile', 'joint'] = 'asymp', conf_level: float | int | str = 0.95, errors: Literal['iid', 'HC0', 'HC1', 'HC2', 'HC3', 'hetero', 'unequal_var'] = 'iid', null: float = 0.0, alternative: Literal['two-sided', 'greater', 'less'] = 'two-sided', *, n_boot: int = 1000, n_perm: int = 1000, ci_type: Literal['bca', 'percentile'] = 'bca', seed: int | None = None, n_jobs: int = 1, save_resamples: bool = True, k: int = 10, n_steps: int = 20, verbose: bool = False, threshold: float | None = None, profile: bool = True, holdout_group: str | None = None) -> ModelResult
Augment current results with statistical inference.
Operates on the last operation: .fit() -> params, .explore() ->
effects, .predict() -> predictions, .simulate() -> simulations.
Use how="joint" for ANOVA-style joint hypothesis tests (F or
chi-squared per term).
Which how methods apply to which operations::
Operation asymp boot perm cv profile joint
───────── ───── ──── ──── ── ─────── ─────
.fit() ✓ ✓ ✓ ✓
.explore() ✓ ✓ ✓ ✓
.predict() ✓ ✓ ✓
.simulate() ✓
profile is a special case: it operates on mixed-model variance
components (varying_spread), not on the last operation’s results.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
how | Literal['asymp', 'boot', 'perm', 'cv', 'profile', 'joint'] | "asymp" (default), "boot", "perm", "cv", "profile" (mixed model variance components), or "joint" (ANOVA-style per-term tests). | 'asymp' |
conf_level | float | int | str | Confidence level (0.95, 95, or "95%"). Default 0.95. | 0.95 |
errors | Literal['iid', 'HC0', 'HC1', 'HC2', 'HC3', 'hetero', 'unequal_var'] | Error structure. "iid" (default) for standard OLS/MLE errors, "HC0"–"HC3" for sandwich estimators, "hetero" (alias for HC3/HC0), or "unequal_var" for Welch-style per-cell SEs with Satterthwaite df. | 'iid' |
null | float | Null hypothesis value (default 0.0). | 0.0 |
alternative | Literal['two-sided', 'greater', 'less'] | "two-sided" (default), "greater", or "less". | 'two-sided' |
n_boot | int | Number of bootstrap resamples (how="boot"). | 1000 |
n_perm | int | Number of permutations (how="perm"). | 1000 |
ci_type | Literal['bca', 'percentile'] | Bootstrap CI method: "bca" (default) or "percentile". | 'bca' |
seed | int | None | Random seed for reproducibility. | None |
n_jobs | int | Parallel workers. Default 1 (serial). | 1 |
save_resamples | bool | Store individual resample results. Default True. | True |
k | int | Number of CV folds (how="cv"). Default 10. | 10 |
n_steps | int | Profile steps (how="profile"). Default 20. | 20 |
verbose | bool | Print progress. Default False. | False |
threshold | float | None | Significance threshold (how="profile"). | None |
profile | bool | Auto-compute profile likelihood CIs for variance components when how="boot" or how="perm" on a mixed model. Default True. Set False to skip. | True |
Returns:
| Name | Type | Description |
|---|---|---|
self | ModelResult | For method chaining. |
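The conf_level parameter accepts three spellings of the same quantity (0.95, 95, or "95%"). One way such inputs could be normalised to a proportion is sketched below. The `normalize_conf_level` helper is hypothetical, and treating any value above 1 as a percentage is an assumption, not documented behaviour of infer().

```python
def normalize_conf_level(conf_level):
    """Hypothetical helper: map 0.95, 95, or "95%" to the proportion 0.95."""
    if isinstance(conf_level, str):
        # Drop surrounding whitespace and a trailing percent sign.
        conf_level = float(conf_level.strip().rstrip("%"))
    value = float(conf_level)
    if value > 1.0:  # assumption: values above 1 are percentages
        value /= 100.0
    if not 0.0 < value < 1.0:
        raise ValueError(f"conf_level must be in (0, 1), got {conf_level!r}")
    return value


levels = [normalize_conf_level(v) for v in (0.95, 95, "95%")]
```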
jointtest
jointtest(terms: list[str] | None = None, *, errors: Literal['iid', 'HC0', 'HC1', 'HC2', 'HC3', 'hetero', 'unequal_var'] = 'iid') -> pl.DataFrame
Compute ANOVA-style joint hypothesis tests for model terms.
Convenience method that auto-fits if needed, then computes joint (Type III) F-tests (gaussian) or chi-square tests (non-gaussian) for each model term.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
terms | list[str] | None | Specific terms to test, or None for all terms. | None |
errors | Literal['iid', 'HC0', 'HC1', 'HC2', 'HC3', 'hetero', 'unequal_var'] | Error structure. "iid" (default) for standard OLS/MLE errors, "HC0"–"HC3" or "hetero" for sandwich SEs, or "unequal_var" for Welch ANOVA with per-term Satterthwaite df. | 'iid' |
Returns:
| Type | Description |
|---|---|
DataFrame | pl.DataFrame: Columns: term, df1, df2 (F-test only), statistic, p_value. |
Examples:
Auto-fits and returns ANOVA table::
m = model("y ~ x1 + x2", data)
m.jointtest()
Welch ANOVA (per-term Satterthwaite df)::
m = model("y ~ factor(group)", data)
m.jointtest(errors="unequal_var")
plot_design
plot_design(**kwargs: object) -> object
Plot design matrix as an annotated heatmap.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
**kwargs | object | Forwarded to plot_design. Key options: max_rows, annotate_terms, show_contrast_info, height, aspect. | {} |
Returns:
| Name | Type | Description |
|---|---|---|
object | object | Matplotlib figure. |
plot_explore
plot_explore(specs: str, **kwargs: object) -> object
Plot marginal effects or estimated marginal means.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
specs | str | Specification string for the marginal effects/means. | required |
**kwargs | object | Forwarded to plot_explore. Key options: hue, col, row, show_pvalue, ref_line, height, aspect. | {} |
Returns:
| Name | Type | Description |
|---|---|---|
object | object | Matplotlib figure. |
plot_params
plot_params(**kwargs: object) -> object
Plot parameter estimates as a forest plot.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
**kwargs | object | Forwarded to plot_params. Key options: include_intercept, sort, show_values, show_pvalue, height, aspect. | {} |
Returns:
| Name | Type | Description |
|---|---|---|
object | object | Matplotlib figure. |
plot_predict
plot_predict(term: str, **kwargs: object) -> object
Plot marginal predictions across a predictor range.
Accepts either a bare column name or an explore-style formula::
m.plot_predict("age") # bare column
m.plot_predict("age ~ sex") # hue by sex
m.plot_predict("age ~ sex@Female") # pin sex=Female
m.plot_predict("age@range(5)") # 5-point grid
m.plot_predict("age@[25,50,75]") # explicit grid values
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
term | str | Predictor variable or explore-style formula string. | required |
**kwargs | object | Forwarded to plot_predict. Key options: hue, col, at, interval, show_data, show_rug, height, aspect. | {} |
Returns:
| Name | Type | Description |
|---|---|---|
object | object | Matplotlib figure. |
plot_profile
plot_profile(**kwargs: object) -> object
Plot profile likelihood curves.
Requires .infer(how="profile") to have been called first.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
**kwargs | object | Forwarded to plot_profile. Key options: height, aspect. | {} |
Returns:
| Name | Type | Description |
|---|---|---|
object | object | Matplotlib figure. |
plot_ranef
plot_ranef(**kwargs: object) -> object
Plot random effect estimates.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
**kwargs | object | Forwarded to plot_ranef. Key options: group, term, show, sort, height, aspect. | {} |
Returns:
| Name | Type | Description |
|---|---|---|
object | object | Matplotlib figure. |
plot_relationships
plot_relationships(**kwargs: object) -> object
Plot pairwise relationships between response and predictors.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
**kwargs | object | Forwarded to plot_relationships. Key options: show_vif, height, aspect. | {} |
Returns:
| Name | Type | Description |
|---|---|---|
object | object | Matplotlib figure. |
plot_resamples
plot_resamples(**kwargs: object) -> object
Plot distribution of resampled statistics.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
**kwargs | object | Forwarded to plot_resamples. Key options: which, include_intercept, terms, show_ci, show_pvalue. | {} |
Returns:
| Name | Type | Description |
|---|---|---|
object | object | Matplotlib figure. |
plot_resid
plot_resid(**kwargs: object) -> object
Plot residual diagnostics (4-panel grid).
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
**kwargs | object | Forwarded to plot_resid. Key options: which, residual_type, lowess, label_outliers, height. | {} |
Returns:
| Name | Type | Description |
|---|---|---|
object | object | Matplotlib figure. |
plot_vif
plot_vif(**kwargs: object) -> object
Plot VIF diagnostics as correlation heatmap.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
**kwargs | object | Forwarded to plot_vif. Key options: cmap, height, aspect. | {} |
Returns:
| Name | Type | Description |
|---|---|---|
object | object | Matplotlib figure. |
predict
predict(newdata: str | pl.DataFrame | None = None, type: Literal['response', 'link'] = 'response', *, varying: Literal['exclude', 'include'] = 'exclude', allow_new_levels: bool = False, n_points: int | Literal['data'] = 50, **kwargs: object) -> ModelResult
Generate predictions from the fitted model.
Accepts a formula string, a DataFrame, or None:
Formula mode (str): Builds a prediction grid from an explore-style formula and returns grid columns prepended to fitted values, like R's ggpredict() / effects package.
DataFrame mode (pl.DataFrame): Predicts on the given data.
None: Predicts on the training data.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
newdata | str | DataFrame | None | Formula string (e.g. "wt ~ cyl"), a Polars DataFrame, or None for training-data predictions. | None |
type | Literal['response', 'link'] | Scale: "response" (default) or "link". | 'response' |
varying | Literal['exclude', 'include'] | Random effects: "exclude" (default, population-level) or "include" (conditional with BLUPs). | 'exclude' |
allow_new_levels | bool | If True, unseen groups predict at population level. Default False. | False |
n_points | int | Literal['data'] | Number of grid points for continuous focal variables in formula mode. Use "data" for observed unique values. Default 50. Ignored in DataFrame/None mode. | 50 |
**kwargs | object | Reserved for future use. | {} |
Returns:
| Name | Type | Description |
|---|---|---|
self | ModelResult | For method chaining. Results in .predictions. |
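Formula mode amounts to building a grid of predictor values and predicting on each row. The sketch below shows what a grid for a formula such as "wt@range(5) ~ cyl@[4, 6]" might look like. The `even_grid` helper is hypothetical, the wt range is made up, and spanning the observed minimum to maximum is an assumption about range(n) that this page does not confirm.

```python
from itertools import product


def even_grid(lo, hi, n):
    """n evenly spaced values from lo to hi inclusive (requires n >= 2)."""
    step = (hi - lo) / (n - 1)
    return [lo + i * step for i in range(n)]


# 5 evenly spaced wt values crossed with two pinned cyl values
wt_values = even_grid(1.5, 5.5, 5)  # made-up observed range of wt
cyl_values = [4, 6]
grid = [{"wt": wt, "cyl": cyl} for wt, cyl in product(wt_values, cyl_values)]
```

The grid has one row per combination (5 x 2 = 10 here), which matches the layout of grid columns followed by fitted values described above.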
Examples:
Formula-mode predictions (like R ggpredict)::
m = model("mpg ~ wt + hp + cyl", mtcars).fit()
m.predict("wt ~ cyl").predictions
# Returns: wt | cyl | fitted
m.predict("wt ~ cyl").infer().predictions
# Returns: wt | cyl | fitted | se | ci_lower | ci_upper
m.predict("wt@range(10) ~ cyl@[4, 6]").predictions
m.predict("wt", n_points="data").predictions
DataFrame predictions (unchanged)::
m.predict(new_df).predictions
Training-data predictions (unchanged)::
m.predict().predictions
reset_contrasts
reset_contrasts() -> Self
Reset contrast coding to defaults (treatment/dummy coding).
.. deprecated:: Create a new model instead of resetting contrasts in place.
Clears any custom contrasts set via set_contrasts().
The model must be re-fitted after resetting.
Returns:
| Name | Type | Description |
|---|---|---|
self | Self | The model instance for method chaining. |
set_contrasts
set_contrasts(**contrasts: object) -> Self
Set custom contrast coding for categorical predictors.
.. deprecated::
Use formula syntax (sum(x), treatment(x, ref=B)) or
the contrasts= constructor kwarg for ndarray matrices instead.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
**contrasts | object | Column name → contrast spec. Values: a string ('treatment', 'sum', 'helmert', 'poly', 'sequential'), a tuple ('treatment', 'B') for custom reference, or an ndarray contrast matrix. | {} |
Returns:
| Name | Type | Description |
|---|---|---|
self | Self | For method chaining. Model must be re-fitted after. |
set_display
set_display(enabled: bool) -> None
Toggle automatic result display in the REPL and notebooks.
When enabled (the default), evaluating a model object after calling
.fit(), .explore(), .predict(), or .simulate()
automatically shows the relevant result table below the model header.
Disable for a compact one-line repr.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
enabled | bool | True to auto-display results, False for compact repr. | required |
show_math
show_math(*, explanations: bool = True) -> 'MathDisplay'
Display structural LaTeX equation with term explanations.
Works before or after .fit(). Before data, shows generic symbolic
form. With data, shows specific factor levels and transform parameters.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
explanations | bool | If True (default), include term explanations with contrast types, reference levels, and transformation parameters. | True |
Returns:
| Type | Description |
|---|---|
'MathDisplay' | MathDisplay object with rich display support (_repr_latex_, _repr_html_, to_latex). |
Examples:
>>> m = model("y ~ x + group", data)
>>> m.show_math() # renders in Jupyter
>>> m.show_math().to_latex() # raw LaTeX string
simulate
simulate(n: int | None = None, nsim: int | None = None, seed: int | None = None, coef: dict[str, float] | None = None, sigma: float = 1.0, varying: str = 'fitted', power: int | dict[str, Any] | None = None, **var_specs: object) -> ModelResult
Generate data from scratch or simulate responses from a fitted model.
Pre-fit mode (n=): generate a dataset from formula and distributions.
Post-fit mode (nsim=): simulate response vectors from the fitted model.
Power mode (power=): run simulation-based power analysis.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
n | int | None | Number of observations to generate (pre-fit mode). | None |
nsim | int | None | Number of response vectors to simulate (post-fit mode). | None |
seed | int | None | Random seed for reproducibility. | None |
coef | dict[str, float] | None | True coefficient values for pre-fit mode. | None |
sigma | float | Residual SD for gaussian pre-fit mode (default 1.0). | 1.0 |
varying | str | Random effects handling in post-fit: "fitted" or "sample". | 'fitted' |
power | int | dict[str, Any] | None | Power analysis configuration. int for simple (number of sims), dict for sweeps (e.g., {"n_sims": 500, "n": [50, 100, 200]}). When set, overrides nsim and runs power analysis mode. | None |
**var_specs | object | Distribution specs for predictors (e.g., x=normal(0, 1)). | {} |
Returns:
| Name | Type | Description |
|---|---|---|
self | ModelResult | For method chaining. |
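Simulation-based power analysis repeats the generate, fit, and test cycle many times and reports the share of runs in which the effect is detected. The stand-alone sketch below does this for a single-predictor linear model using only the standard library. The function names are hypothetical, the slope test uses a normal approximation (critical value 1.96) for brevity, and nothing here reflects how simulate(power=...) is implemented.

```python
import random
from statistics import mean


def slope_is_significant(n, beta, sigma, rng, z_crit=1.96):
    """Simulate y = beta * x + noise, fit OLS, test the slope against zero."""
    x = [rng.gauss(0.0, 1.0) for _ in range(n)]
    y = [beta * xi + rng.gauss(0.0, sigma) for xi in x]
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    b = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / sxx
    a = my - b * mx
    rss = sum((yi - a - b * xi) ** 2 for xi, yi in zip(x, y))
    se = (rss / (n - 2) / sxx) ** 0.5
    # Normal approximation to the t critical value (an illustrative shortcut)
    return abs(b / se) > z_crit


def estimate_power(n, beta, sigma=1.0, n_sims=300, seed=0):
    """Share of simulated datasets in which the slope test rejects."""
    rng = random.Random(seed)
    hits = sum(slope_is_significant(n, beta, sigma, rng) for _ in range(n_sims))
    return hits / n_sims


power_effect = estimate_power(n=50, beta=0.8)
power_null = estimate_power(n=50, beta=0.0)  # rejection rate when no effect exists
```

Running the same loop with a true coefficient of zero gives the false-positive rate, a useful sanity check before trusting the power estimate.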
summary
summary(style: Literal['r', 'compact'] = 'r', digits: int = 3) -> FormattedText
Generate a formatted model summary (R-style or compact).
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
style | Literal['r', 'compact'] | "r" (default) for full R-style output, "compact" for brief. | 'r' |
digits | int | Decimal places for numeric values. Default 3. | 3 |
Returns:
| Name | Type | Description |
|---|---|---|
FormattedText | FormattedText | Rich display object that renders automatically in REPL and Jupyter notebooks without needing print(). |
to_effect_size
to_effect_size(*, include_intercept: bool = False) -> pl.DataFrame
Compute standardized effect sizes from params.
Computes Cohen’s d, semi-partial r, eta-squared, and odds ratio (for binomial models).
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
include_intercept | bool | Whether to include the intercept row. | False |
Returns:
| Type | Description |
|---|---|
DataFrame | DataFrame with added effect size columns (d, r_semi, eta_sq, and odds_ratio for binomial). |
Examples:
>>> m = model("y ~ x1 + x2", data).fit().infer()
>>> m.to_effect_size()
to_odds_ratio
to_odds_ratio() -> pl.DataFrame
Transform params to odds ratio scale (binomial GLM only).
Exponentiates estimate, ci_lower, and ci_upper columns.
Returns:
| Type | Description |
|---|---|
DataFrame | DataFrame with exponentiated values. |
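The transformation itself is a plain exponentiation of the log-odds columns. A minimal sketch on one made-up row, with a dict standing in for a DataFrame row:

```python
import math

# Made-up logistic-regression coefficient on the log-odds scale
row = {"term": "x", "estimate": math.log(2.0), "ci_lower": 0.2, "ci_upper": 1.2}

# Exponentiate the numeric columns; leave the label untouched
odds = {k: (v if k == "term" else math.exp(v)) for k, v in row.items()}
```

Because exp() is monotone, the exponentiated limits remain a valid interval around the odds ratio, although the interval is no longer symmetric about the estimate.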
Examples:
>>> m = model("y ~ x", data, family="binomial").fit().infer()
>>> m.to_odds_ratio()
to_response_scale
to_response_scale() -> pl.DataFrame
Transform effects from link scale to response scale.
Applies the inverse link function to estimate and CI columns on the effects DataFrame. For example, converts log-odds to probabilities for logistic models.
Returns:
| Type | Description |
|---|---|
DataFrame | DataFrame with values on response scale. |
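For a logistic model the inverse link is the logistic function. The sketch below applies it to a made-up estimate and its confidence limits; `inv_logit` is an illustrative helper, not a function exported by this library.

```python
import math


def inv_logit(eta):
    """Inverse of the logit link: log-odds -> probability."""
    return 1.0 / (1.0 + math.exp(-eta))


# Made-up effect on the link (log-odds) scale
effect = {"estimate": 0.0, "ci_lower": -1.0, "ci_upper": 1.0}

# Apply the inverse link to the estimate and to each interval endpoint
on_response = {k: inv_logit(v) for k, v in effect.items()}
```

Transforming the endpoints works because the inverse link is monotone, so the interval keeps its coverage interpretation on the probability scale.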
Examples:
>>> m = model("y ~ x", data, family="binomial").fit()
>>> m.explore("x").infer()
>>> m.to_response_scale()
vif
vif() -> pl.DataFrame
Compute variance inflation factors for model predictors.
VIF measures multicollinearity — how much the variance of each coefficient is inflated due to correlation with other predictors. Only requires data (not fitting), since VIF is a property of the design matrix.
Returns:
| Type | Description |
|---|---|
DataFrame | DataFrame with columns: term, vif, ci_increase_factor. |
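The standard definition is VIF_j = 1 / (1 - R_j^2), where R_j^2 comes from regressing predictor j on the remaining predictors. With exactly two predictors this reduces to the squared correlation between them, which keeps a hand-rolled sketch short. The helpers and data below are illustrative only and are not the library's computation.

```python
from statistics import mean


def r_squared_simple(x, z):
    """R^2 from regressing x on a single predictor z (squared correlation)."""
    mx, mz = mean(x), mean(z)
    sxz = sum((a - mx) * (b - mz) for a, b in zip(x, z))
    sxx = sum((a - mx) ** 2 for a in x)
    szz = sum((b - mz) ** 2 for b in z)
    return sxz ** 2 / (sxx * szz)


def vif_two_predictors(x1, x2):
    """With two predictors, both share the same VIF = 1 / (1 - r^2)."""
    return 1.0 / (1.0 - r_squared_simple(x1, x2))


x1 = [1.0, 2.0, 3.0, 4.0, 5.0]  # made-up predictor values
x2 = [2.0, 1.0, 4.0, 3.0, 5.0]
vif = vif_two_predictors(x1, x2)

# Uncorrelated predictors: no inflation
vif_orthogonal = vif_two_predictors([-1.0, 0.0, 1.0], [1.0, -2.0, 1.0])
```

Only the predictor columns enter the calculation, which is why VIF can be computed before fitting.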
Examples:
>>> m = model("y ~ x1 + x2 + x3", data)
>>> m.vif()