fit - bossanova

Model fitting, diagnostics, convergence, varying parameters, and prediction.

Call chain:

model.fit() -> fit_model() -> resolve_solver() -> fit_ols_qr / fit_glm_irls / fit_lmer_pls / fit_glmer_pirls

Attributes:

Name	Type	Description
`VALID_SOLVERS`	`frozenset[str]`	Solver names accepted by the `solver` override of `fit_model()`.

Classes:

Name	Description
`FitResult`	Immutable result of the fit lifecycle.

Functions:

Name	Description
`augment_data_with_diagnostics`	Augment raw data with diagnostic columns after fit.
`build_mixed_post_fit_state`	Compute BLUPs, variance components, and emit convergence warnings.
`build_predict_grid`	Build a Cartesian-product prediction grid.
`check_convergence`	Run convergence diagnostics on a fitted mixed model.
`compute_diagnostics`	Compute model-level diagnostics as a single-row DataFrame.
`compute_metadata`	Compute model metadata as a single-row DataFrame.
`compute_optimizer_diagnostics`	Compute optimizer convergence diagnostics as a single-row DataFrame.
`compute_predictions_from_formula`	Parse a predict formula, build the grid, compute predictions, and attach grid columns.
`compute_r_squared`	Compute R-squared and adjusted R-squared from raw arrays.
`compute_varying_spread_state`	Compute VaryingSpreadState (variance components) from theta parameters.
`compute_varying_state`	Compute VaryingState (BLUPs) from fitted random effects parameters.
`execute_fit`	Execute the full fit lifecycle: bundle rebuild → fit → post-fit state → diagnostics.
`fit_glm_irls`	Fit generalized linear model using Iteratively Reweighted Least Squares.
`fit_glmer_pirls`	Fit generalized linear mixed model using Penalized IRLS.
`fit_lmer_pls`	Fit linear mixed-effects model using Penalized Least Squares.
`fit_model`	Dispatch to appropriate fitter based on model specification.
`fit_ols_qr`	Fit ordinary or weighted least squares using QR decomposition.
`get_theta_lower_bounds`	Get lower bounds for theta parameters.
`parse_fit_kwargs`	Validate and extract fitting parameters from `**kwargs`.
`parse_predict_formula`	Parse an explore-style formula and build a prediction grid.
`per_factor_re_info`	Split global RE metadata into per-factor structures and names.
`resolve_condition_values`	Resolve a `Condition` to concrete values or `None`.
`resolve_solver`	Select the appropriate solver for a model configuration.
`validate_fit_method`	Validate and apply a user-specified fitting method to a ModelSpec.

Modules:

Name	Description
`convergence`	Convergence diagnostics for fitted mixed-effects models.
`diagnostics`	Model-level diagnostics computation.
`dispatch`	Solver dispatch for model fitting.
`glm`	GLM fitting via Iteratively Reweighted Least Squares (IRLS).
`glmer`	GLMM fitting via Penalized IRLS (PIRLS).
`grid`	Prediction grid construction for formula-mode predictions.
`lifecycle`	Fit lifecycle orchestration.
`lmer`	LMM fitting via Penalized Least Squares (PLS).
`ols`	OLS fitting via QR decomposition.
`predict`	Prediction operations on containers.
`varying`	Varying parameter extraction for mixed-effects models.

Attributes¶

VALID_SOLVERS¶

VALID_SOLVERS: frozenset[str] = frozenset({'qr', 'irls', 'pls', 'pirls'})

Classes¶

FitResult¶

Immutable result of the fit lifecycle.

Attributes:

Name	Type	Description
`fit`	`FitState`	Fitted model state (coefficients, residuals, etc.).
`bundle`	`DataBundle`	Data bundle used for fitting (may be rebuilt).
`formula_spec`	`object`	Learned formula spec for newdata evaluation.
`raw_data`	`DataFrame \| None`	Original data snapshot (pre-augmentation).
`augmented_data`	`DataFrame \| None`	Data with diagnostic columns, or None.
`varying_offsets`	`VaryingState \| None`	BLUPs for mixed models, or None.
`varying_spread`	`VaryingSpreadState \| None`	Variance components for mixed models, or None.

Attributes¶

augmented_data¶

augmented_data: pl.DataFrame | None

bundle¶

bundle: DataBundle

fit¶

fit: FitState

formula_spec¶

formula_spec: object

raw_data¶

raw_data: pl.DataFrame | None

varying_offsets¶

varying_offsets: VaryingState | None = None

varying_spread¶

varying_spread: VaryingSpreadState | None = None

Functions¶

augment_data_with_diagnostics¶

augment_data_with_diagnostics(*, raw_data: pl.DataFrame, fit: FitState, bundle: DataBundle) -> pl.DataFrame

Augment raw data with diagnostic columns after fit.

Adds fitted, resid, hat, std_resid, cooksd columns (names from AugmentedDataCols schema). Values are NaN for rows dropped due to missing data.

Parameters:

Name	Type	Description	Default
`raw_data`	`DataFrame`	Original data DataFrame (pre-NA-drop).	required
`fit`	`FitState`	Fitted state with residuals, fitted values, leverage.	required
`bundle`	`DataBundle`	Data bundle with valid_mask, n_total, p.	required

Returns:

Type	Description
`DataFrame`	DataFrame with diagnostic columns appended.
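
The NaN padding for dropped rows described above can be illustrated with a small NumPy sketch. `scatter_to_full` is a hypothetical helper, not part of this module; it assumes the per-row diagnostics are ordered like the rows where `valid_mask` is True.

```python
import numpy as np


def scatter_to_full(values, valid_mask):
    """Hypothetical helper: place fitted-row values into an n_total-length array."""
    full = np.full(valid_mask.shape[0], np.nan)  # rows dropped for missing data stay NaN
    full[valid_mask] = values                    # fitted rows receive their diagnostics
    return full


valid_mask = np.array([True, False, True, True])  # second row was dropped before fitting
fitted = np.array([1.5, 2.5, 3.5])                # one value per retained row
fitted_full = scatter_to_full(fitted, valid_mask)
print(fitted_full)
```

The same scatter would apply to each diagnostic column (resid, hat, std_resid, cooksd) before it is appended to the raw data.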

build_mixed_post_fit_state¶

build_mixed_post_fit_state(fit: FitState, bundle: DataBundle, data: pl.DataFrame, *, stacklevel: int = 3) -> tuple[VaryingState | None, VaryingSpreadState | None]

Compute BLUPs, variance components, and emit convergence warnings.

Orchestrates the post-fit assembly for mixed-effects models: computes VaryingState (BLUPs) and VaryingSpreadState (variance components) from the fitted parameters, then checks for convergence issues.

Parameters:

Name	Type	Description	Default
`fit`	`FitState`	Fitted model state containing theta, u, sigma.	required
`bundle`	`DataBundle`	Data bundle with RE metadata and valid mask.	required
`data`	`DataFrame`	Original training data (used for group level labels).	required
`stacklevel`	`int`	Warning stacklevel for convergence warnings. Default 3 accounts for: user → `model.fit()` → `build_mixed_post_fit_state()`.	`3`

Returns:

Type	Description
`tuple[VaryingState \| None, VaryingSpreadState \| None]`	A tuple `(varying_offsets, varying_spread)` where either may be None if the required fitted parameters are missing.

build_predict_grid¶

build_predict_grid(data: pl.DataFrame, focal_var: str, response_col: str, grouping_factors: tuple[str, ...], *, focal_values: list[float | str] | None = None, n_points: int | Literal['data'] = 50, varying_vars: list[str] | None = None, at: dict[str, Any] | None = None) -> pl.DataFrame

Build a Cartesian-product prediction grid.

Creates a grid where the focal variable is varied, condition variables are expanded, and all other predictors are held at reference values (mean for continuous, first sorted level for categorical). Grouping factors and the response column are excluded.

Parameters:

Name	Type	Description	Default
`data`	`DataFrame`	Training data (Polars DataFrame).	required
`focal_var`	`str`	The predictor to vary across the grid.	required
`response_col`	`str`	Response column name (excluded from grid).	required
`grouping_factors`	`tuple[str, ...]`	Random-effect grouping variables (excluded).	required
`focal_values`	`list[float \| str] \| None`	Explicit values for the focal variable. Overrides default linspace/unique-levels logic.	`None`
`n_points`	`int \| Literal['data']`	Number of grid points for continuous focal variables. Use `"data"` to use actual observed unique values.	`50`
`varying_vars`	`list[str] \| None`	Condition variables to expand (all unique levels).	`None`
`at`	`dict[str, Any] \| None`	Dict of pinned values. Scalar = single constant, list = expand.	`None`

Returns:

Type	Description
`DataFrame`	Polars DataFrame with the Cartesian-product prediction grid.
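
As a rough illustration of the grid rules above (focal variable varied, condition variables expanded, everything else held at a reference value), here is a standard-library sketch. `sketch_grid` and the toy data are hypothetical; the sketch works on plain dicts of lists instead of Polars, ignores `focal_values` and `at`, and assumes `n_points >= 2`.

```python
from itertools import product
from statistics import mean


def sketch_grid(columns, focal_var, exclude, n_points=5, varying_vars=()):
    """Hypothetical stand-in for the grid logic, using plain dicts of lists."""
    def is_numeric(values):
        return all(isinstance(v, (int, float)) for v in values)

    axes = {}
    for name, values in columns.items():
        if name in exclude:
            continue  # response and grouping factors never enter the grid
        if name == focal_var and is_numeric(values):
            lo, hi = min(values), max(values)
            step = (hi - lo) / (n_points - 1)
            axes[name] = [lo + i * step for i in range(n_points)]  # evenly spaced
        elif name == focal_var or name in varying_vars:
            axes[name] = sorted(set(values))       # expand all unique levels
        elif is_numeric(values):
            axes[name] = [mean(values)]            # continuous reference value
        else:
            axes[name] = [sorted(set(values))[0]]  # first sorted level
    names = list(axes)
    return [dict(zip(names, combo)) for combo in product(*(axes[n] for n in names))]


data = {
    "mpg": [21.0, 22.8, 18.7, 14.3],
    "wt": [2.0, 3.0, 4.0, 5.0],
    "hp": [100, 110, 150, 240],
    "cyl": ["6", "4", "8", "8"],
    "car_id": ["a", "b", "c", "d"],
}
grid = sketch_grid(data, "wt", exclude={"mpg", "car_id"}, n_points=3, varying_vars=("cyl",))
print(len(grid), grid[0])
```

Three focal points crossed with three `cyl` levels give nine rows, with `hp` pinned at its mean in every row.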

check_convergence¶

check_convergence(fit: FitState, re_meta: REInfo) -> list[ConvergenceMessage]

Run convergence diagnostics on a fitted mixed model.

Extracts theta and sigma from FitState, computes theta lower bounds and per-factor RE info from REInfo, and delegates to diagnose_convergence() for the actual diagnostic checks.

Parameters:

Name	Type	Description	Default
`fit`	`FitState`	Fitted model state (must have theta, sigma, converged).	required
`re_meta`	`REInfo`	Random effects metadata from the DataBundle.	required

Returns:

Type	Description
`list[ConvergenceMessage]`	List of ConvergenceMessage objects. Empty if theta is None.

compute_diagnostics¶

compute_diagnostics(*, model_type: str, spec: ModelSpec, bundle: DataBundle, fit: FitState, coef_for_predict: np.ndarray, varying_spread: VaryingSpreadState | None, cv: CVState | None, has_intercept: bool = True) -> pl.DataFrame

Compute model-level diagnostics as a single-row DataFrame.

Builds goodness-of-fit diagnostics from fitted model state, with columns varying by model type (lm, glm, lmer, glmer).

Parameters:

Name	Type	Description	Default
`model_type`	`str`	One of “lm”, “glm”, “lmer”, “glmer”.	required
`spec`	`ModelSpec`	Model specification (for family).	required
`bundle`	`DataBundle`	Data bundle (for n, rank, X, y, re_metadata).	required
`fit`	`FitState`	Fitted state (for coefficients, residuals, loglik, etc.).	required
`coef_for_predict`	`ndarray`	Coefficients safe for matrix multiplication (NaN replaced by 0 for rank-deficient models).	required
`varying_spread`	`VaryingSpreadState \| None`	Random effects variance components (mixed models).	required
`cv`	`CVState \| None`	Cross-validation state, or None.	required
`has_intercept`	`bool`	Whether the model includes an intercept. Affects R² computation (centered vs uncentered SS_tot).	`True`

Returns:

Type	Description
`DataFrame`	Single-row Polars DataFrame with model diagnostics. See `model.diagnostics` for full column documentation.

compute_metadata¶

compute_metadata(*, bundle: DataBundle) -> pl.DataFrame

Compute model metadata as a single-row DataFrame.

Returns sample/structural info about the model: observation counts, parameter count, and group counts (for mixed models).

Parameters:

Name	Type	Description	Default
`bundle`	`DataBundle`	Data bundle (for n, n_total, p, re_metadata).	required

Returns:

Type	Description
`DataFrame`	Single-row Polars DataFrame with model metadata.

compute_optimizer_diagnostics¶

compute_optimizer_diagnostics(*, model_type: str, fit: FitState) -> pl.DataFrame

Compute optimizer convergence diagnostics as a single-row DataFrame.

Parameters:

Name	Type	Description	Default
`model_type`	`str`	One of “lm”, “glm”, “lmer”, “glmer”.	required
`fit`	`FitState`	Fitted state with convergence info, theta, dispersion.	required

Returns:

Type	Description
`DataFrame`	Single-row Polars DataFrame with optimizer diagnostics.

compute_predictions_from_formula¶

compute_predictions_from_formula(formula: str, data: pl.DataFrame, spec: object, bundle: object, fit: object, formula_spec: object, pred_type: str, varying: str, allow_new_levels: bool, n_points: int | Literal['data']) -> 'PredictionState'

Parse a predict formula, build the grid, compute predictions, and attach grid columns.

Combines parse_predict_formula, compute_predictions, and grid-column attachment into a single call for model.predict() formula mode.

Parameters:

Name	Type	Description	Default
`formula`	`str`	Explore-style formula (e.g. `"wt ~ cyl"`).	required
`data`	`DataFrame`	Training data.	required
`spec`	`object`	Model specification.	required
`bundle`	`object`	Data bundle.	required
`fit`	`object`	Fitted model state.	required
`formula_spec`	`object`	Learned formula spec for newdata evaluation.	required
`pred_type`	`str`	Prediction scale (`"response"` or `"link"`).	required
`varying`	`str`	RE handling (`"exclude"` or `"include"`).	required
`allow_new_levels`	`bool`	If True, new groups predict at population level.	required
`n_points`	`int \| Literal['data']`	Number of grid points for continuous focal variables.	required

Returns:

Type	Description
`PredictionState`	PredictionState with grid columns attached.

compute_r_squared¶

compute_r_squared(y: np.ndarray, residuals: np.ndarray, n: int, p: int, has_intercept: bool = True) -> tuple[float, float]

Compute R-squared and adjusted R-squared from raw arrays.

For models with an intercept, uses centered SS_tot = sum((y - mean(y))^2). For no-intercept models, uses uncentered SS_tot = sum(y^2), matching R’s summary.lm() behavior.

Parameters:

Name	Type	Description	Default
`y`	`ndarray`	Response vector of shape (n,).	required
`residuals`	`ndarray`	Residual vector of shape (n,).	required
`n`	`int`	Number of observations.	required
`p`	`int`	Number of parameters (including intercept if present).	required
`has_intercept`	`bool`	Whether the model includes an intercept.	`True`

Returns:

Type	Description
`tuple[float, float]`	Tuple of (R-squared, adjusted R-squared).
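
The centered and uncentered variants can be sketched in plain Python. `r_squared` below is an illustrative re-implementation, not the library function. The adjusted R-squared line follows the convention of R's `summary.lm()` (numerator degrees of freedom `n - 1` with an intercept, `n` without), which is an assumption about what this function does.

```python
def r_squared(y, residuals, n, p, has_intercept=True):
    """Illustrative R-squared and adjusted R-squared from raw values."""
    rss = sum(r * r for r in residuals)
    if has_intercept:
        y_bar = sum(y) / n
        ss_tot = sum((v - y_bar) ** 2 for v in y)  # centered total sum of squares
    else:
        ss_tot = sum(v * v for v in y)             # uncentered total sum of squares
    r2 = 1.0 - rss / ss_tot
    df_int = 1 if has_intercept else 0             # assumed: R's summary.lm() convention
    adj = 1.0 - (1.0 - r2) * (n - df_int) / (n - p)
    return r2, adj


y = [1.0, 2.0, 3.0, 4.0]
residuals = [0.1, -0.1, -0.1, 0.1]
with_intercept = r_squared(y, residuals, n=4, p=2)
no_intercept = r_squared(y, residuals, n=4, p=1, has_intercept=False)
print(with_intercept, no_intercept)
```

With the same residuals, the uncentered version reports a higher R-squared because sum(y^2) is never smaller than the centered sum of squares.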

compute_varying_spread_state¶

compute_varying_spread_state(theta: NDArray[np.floating], sigma: float, re_meta: REInfo, *, X: NDArray[np.floating] | None = None, X_names: tuple[str, ...] | None = None) -> VaryingSpreadState

Compute VaryingSpreadState (variance components) from theta parameters.

Extracts residual variance (sigma²), random effect variances (tau²), correlations (rho), and intraclass correlation (ICC) from the fitted theta vector using the random effects structure.

ICC is computed using Johnson (2014, PLoS ONE, eq. 10) which accounts for predictor distributions when random slopes are present. When X and X_names are provided, the formula uses E[X] and E[X²] from the data; otherwise falls back to summing intercept variances only.

Parameters:

Name	Type	Description	Default
`theta`	`NDArray[floating]`	Variance component parameters from the fitted model.	required
`sigma`	`float`	Residual standard deviation from the fitted model.	required
`re_meta`	`REInfo`	Random effects metadata (grouping vars, structure, etc.).	required
`X`	`NDArray[floating] \| None`	Fixed-effects design matrix (n × p), used for ICC computation.	`None`
`X_names`	`tuple[str, ...] \| None`	Column names for `X`, used to look up slope predictors.	`None`

Returns:

Type	Description
`VaryingSpreadState`	VaryingSpreadState container with components DataFrame and decomposed variance quantities.
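
A minimal NumPy sketch of the theta-to-variance-component step for a single grouping factor follows. It assumes the lme4 convention that theta fills the lower triangle of a relative Cholesky factor T in column-major order, so that the random-effect covariance is sigma² · T Tᵀ. `variance_components` is a hypothetical name, and the ICC shown is only the intercept-only fallback, not the Johnson (2014) formula.

```python
import numpy as np


def variance_components(theta, sigma, k):
    """Illustrative variance decomposition for one factor with k random effects."""
    T = np.zeros((k, k))
    rows, cols = np.tril_indices(k)
    order = np.lexsort((rows, cols))      # reorder lower triangle to column-major
    T[rows[order], cols[order]] = theta
    cov = sigma**2 * (T @ T.T)            # covariance of the random effects
    tau2 = np.diag(cov)                   # random-effect variances
    sd = np.sqrt(tau2)
    corr = cov / np.outer(sd, sd)         # undefined when a variance is exactly zero
    icc = tau2[0] / (tau2[0] + sigma**2)  # intercept-only fallback ICC
    return tau2, corr, icc


# Random intercept plus one random slope: theta = [t11, t21, t22]
tau2, corr, icc = variance_components([1.0, 0.5, 2.0], sigma=2.0, k=2)
print(tau2, corr[1, 0], icc)
```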

compute_varying_state¶

compute_varying_state(theta: NDArray[np.floating], u: NDArray[np.floating], re_meta: REInfo, data: pl.DataFrame | None = None) -> VaryingState

Compute VaryingState (BLUPs) from fitted random effects parameters.

Converts spherical random effects u to BLUPs b = Lambda @ u using the relative covariance factor Lambda built from theta. Constructs a grid of group/level combinations and maps BLUP values to named effects.

Parameters:

Name	Type	Description	Default
`theta`	`NDArray[floating]`	Variance component parameters from the fitted model.	required
`u`	`NDArray[floating]`	Spherical random effects vector from the fitted model.	required
`re_meta`	`REInfo`	Random effects metadata (grouping vars, structure, etc.).	required
`data`	`DataFrame \| None`	Original training data, used to extract unique group levels. If None, levels are labeled `"0"`, `"1"`, etc.	`None`

Returns:

Type	Description
`VaryingState`	VaryingState container with grid, effects dict, and group info.
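
The transformation b = Lambda @ u can be sketched for a single grouping factor with NumPy. This is an illustration under stated assumptions, not the library's implementation: theta is taken to fill a k × k lower-triangular template in column-major order (the lme4 convention), `u` is assumed to be ordered group by group, and `blups_from_spherical` is a hypothetical name.

```python
import numpy as np


def blups_from_spherical(theta, u, n_groups, k):
    """Hypothetical sketch for one grouping factor with k correlated effects."""
    T = np.zeros((k, k))
    rows, cols = np.tril_indices(k)
    order = np.lexsort((rows, cols))       # reorder lower triangle to column-major
    T[rows[order], cols[order]] = theta
    Lambda = np.kron(np.eye(n_groups), T)  # block-diagonal: one copy of T per group
    b = Lambda @ u
    return b.reshape(n_groups, k)          # row g holds the effects for group g


theta = [1.0, 0.5, 2.0]                    # [t11, t21, t22]
u = np.array([1.0, 1.0, 2.0, 0.0])         # two groups, (intercept, slope) each
b = blups_from_spherical(theta, u, n_groups=2, k=2)
print(b)
```

A production implementation would keep Lambda sparse; the dense Kronecker product is used here only to make the block structure visible.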

execute_fit¶

execute_fit(spec: ModelSpec, bundle: DataBundle | None, data: pl.DataFrame, raw_data: pl.DataFrame | None, formula: str, custom_contrasts: dict | None, weights_col: str | None, offset_col: str | None, missing: str, is_mixed: bool, solver_override: str | None, fit_kwargs: dict) -> FitResult

Execute the full fit lifecycle: bundle rebuild → fit → post-fit state → diagnostics.

Parameters:

Name	Type	Description	Default
`spec`	`ModelSpec`	Model specification.	required
`bundle`	`DataBundle \| None`	Existing data bundle, or None to force rebuild.	required
`data`	`DataFrame`	Current data (raw_data-restored by caller).	required
`raw_data`	`DataFrame \| None`	Original pre-augmentation snapshot, or None.	required
`formula`	`str`	Formula string for bundle building.	required
`custom_contrasts`	`dict \| None`	User contrast matrices, or None.	required
`weights_col`	`str \| None`	Weights column name, or None.	required
`offset_col`	`str \| None`	Offset column name, or None.	required
`missing`	`str`	Missing value handling (`"drop"` or `"fail"`).	required
`is_mixed`	`bool`	Whether this is a mixed-effects model.	required
`solver_override`	`str \| None`	Explicit solver, or None for auto.	required
`fit_kwargs`	`dict`	Additional kwargs for `fit_model()`.	required

Returns:

Type	Description
`FitResult`	FitResult with all state the model needs to assign.

fit_glm_irls¶

fit_glm_irls(spec: ModelSpec, bundle: DataBundle, *, max_iter: int = 25, tol: float = 1e-08) -> FitState

Fit generalized linear model using Iteratively Reweighted Least Squares.

This adapter wraps the IRLS implementation in the `glm` module. IRLS solves GLMs by iterating between computing working weights and solving a weighted least squares problem.

1. Initialize mu from y (or link function default)
2. For each iteration:
    a. Compute working weights: W = 1 / (V(mu) * g'(mu)^2)
    b. Compute working response: z = eta + (y - mu) * g'(mu)
    c. Solve weighted least squares: beta = (X'WX)^{-1} X'Wz
    d. Update eta = X @ beta, mu = g^{-1}(eta)
3. Continue until convergence (change in deviance < tol)

gaussian: Identity variance, identity link
binomial: mu(1-mu) variance, logit/probit/cloglog link
poisson: mu variance, log link
gamma: mu^2 variance, inverse/log link
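
The IRLS iteration above can be sketched for one concrete case, a Poisson family with log link. This is a simplified illustration, not the `glm` module's code: it ignores prior weights and offsets, `irls_poisson` is a hypothetical name, and starting from `y + 0.1` is just one way to keep the initial `mu` positive.

```python
import numpy as np


def irls_poisson(X, y, max_iter=25, tol=1e-8):
    """Illustrative IRLS for a Poisson GLM with log link."""
    mu = y + 0.1                      # start from y, nudged away from zero
    eta = np.log(mu)
    dev_old = np.inf
    for n_iter in range(1, max_iter + 1):
        g_prime = 1.0 / mu            # derivative of the log link
        w = 1.0 / (mu * g_prime**2)   # V(mu) = mu for Poisson, so w equals mu
        z = eta + (y - mu) * g_prime  # working response
        sw = np.sqrt(w)
        beta, *_ = np.linalg.lstsq(sw[:, None] * X, sw * z, rcond=None)
        eta = X @ beta
        mu = np.exp(eta)
        # Poisson deviance; the y * log(y / mu) term is taken as 0 where y == 0
        ratio = np.where(y > 0, y / mu, 1.0)
        dev = 2.0 * np.sum(y * np.log(ratio) - (y - mu))
        if abs(dev - dev_old) < tol:
            return beta, True, n_iter
        dev_old = dev
    return beta, False, max_iter


X = np.ones((5, 1))                   # intercept-only model
y = np.array([1.0, 0.0, 3.0, 2.0, 4.0])
beta, converged, n_iter = irls_poisson(X, y)
print(beta, converged, n_iter)
```

For an intercept-only Poisson model the maximum-likelihood estimate is log(mean(y)), which gives a quick sanity check on the loop.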

Parameters:

Name	Type	Description	Default
`spec`	`ModelSpec`	Model specification containing: - family: Distribution family (determines variance function) - link: Link function (determines g and g’)	required
`bundle`	`DataBundle`	Data bundle containing X, y, and optional weights.	required
`max_iter`	`int`	Maximum IRLS iterations (default: 25).	`25`
`tol`	`float`	Convergence tolerance on deviance (default: 1e-8).	`1e-08`

Returns:

Type	Description
`FitState`	FitState containing: - coef: Coefficient estimates - vcov: Variance-covariance (observed Fisher information) - fitted: Predicted values on response scale - residuals: Response residuals (y - mu) - leverage: Hat matrix diagonal - df_resid: Residual degrees of freedom - loglik: Log-likelihood - dispersion: Estimated dispersion parameter - converged: Whether IRLS converged - n_iter: Number of IRLS iterations

See Also:

glm: Underlying IRLS implementation

fit_glmer_pirls¶

fit_glmer_pirls(spec: ModelSpec, bundle: DataBundle, *, max_iter: int = 25, max_outer_iter: int = 10000, tol: float = 1e-07, verbose: bool = False, nAGQ: int = 1, use_hessian: bool = False) -> FitState

Fit generalized linear mixed model using Penalized IRLS.

This adapter wraps the PIRLS implementation from the `glmer` module. PIRLS combines IRLS (for the GLM part) with PLS (for random effects), using a Laplace approximation to integrate out the random effects.

Outer loop (BOBYQA optimization over theta):

For each theta:

1. Build Lambda from theta

Inner loop (PIRLS iterations):
    a. Compute working weights from current eta/mu
    b. Compute working response
    c. Solve weighted PLS for beta and u
    d. Update eta = X @ beta + Z @ Lambda @ u
    e. Update mu = g^{-1}(eta)
    f. Step-halving if deviance increased
    g. Check convergence

2. Return Laplace deviance

Select theta minimizing Laplace deviance

Parameters:

Name	Type	Description	Default
`spec`	`ModelSpec`	Model specification containing: - family: Distribution family - link: Link function - random_terms: Parsed random effect specifications	required
`bundle`	`DataBundle`	Data bundle containing: - X: Fixed effects design matrix - Z: Random effects design matrix (sparse) - y: Response vector - re_metadata: Grouping structure	required
`max_iter`	`int`	Maximum PIRLS iterations per theta (default: 25).	`25`
`max_outer_iter`	`int`	Maximum BOBYQA iterations (default: 10000).	`10000`
`tol`	`float`	PIRLS convergence tolerance (default: 1e-7).	`1e-07`
`verbose`	`bool`	Print optimization progress (default: False).	`False`
`nAGQ`	`int`	Quadrature points (0 or 1, default: 1).	`1`
`use_hessian`	`bool`	Use Hessian-based vcov (default: False). The default Schur complement approach matches lme4’s `vcov()` with `use.hessian=FALSE` and avoids expensive numerical differentiation. Set to True for observed-information vcov.	`False`

Returns:

Type	Description
`FitState`	FitState containing: - coef: Fixed effect coefficient estimates - vcov: Variance-covariance (observed information or Schur complement) - fitted: Predicted values on response scale (mu) - residuals: Response residuals (y - mu) - leverage: Approximate leverage values - df_resid: Residual degrees of freedom - loglik: Laplace-approximated log-likelihood - dispersion: Dispersion (1.0 for binomial/poisson) - theta: Optimized relative covariance parameters - u: Spherical random effects - converged: Whether both PIRLS and BOBYQA converged - n_iter: Number of optimizer evaluations

See Also:

glmer: Underlying PIRLS implementation

fit_lmer_pls¶

fit_lmer_pls(spec: ModelSpec, bundle: DataBundle, *, max_iter: int = 10000, verbose: bool = False) -> FitState

Fit linear mixed-effects model using Penalized Least Squares.

This adapter wraps the PLS implementation from the `lmer` module. PLS is the algorithm from Bates et al. (2015) used in R’s lme4 package.

Outer loop (BOBYQA optimization over theta):

For each theta (relative covariance parameters):

1. Build Lambda (block-diagonal Cholesky factor from theta)
2. Form S_22 = Lambda' Z' Z Lambda + I
3. Sparse Cholesky factorization of S_22
4. Compute Schur complement for fixed effects
5. Solve for beta (fixed effects) and u (spherical RE)
6. Compute REML or ML deviance
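
One evaluation of the objective inside the outer loop can be sketched densely with NumPy. The sketch below is an illustration under simplifying assumptions, not the `lmer` implementation: it handles only scalar random intercepts (Lambda = theta · I), uses dense matrices instead of a sparse Cholesky, solves the joint normal equations for (u, beta) in one step instead of forming the Schur complement, and returns the ML (not REML) profiled deviance. `ml_deviance` is a hypothetical name.

```python
import numpy as np


def ml_deviance(theta, X, Z, y):
    """Dense ML profiled deviance for scalar random intercepts (illustrative only)."""
    n, p = X.shape
    q = Z.shape[1]
    Lambda = theta * np.eye(q)             # scalar random effects: Lambda = theta * I
    ZL = Z @ Lambda
    S22 = ZL.T @ ZL + np.eye(q)            # Lambda' Z'Z Lambda + I
    L = np.linalg.cholesky(S22)
    # Joint normal equations for the spherical effects u and fixed effects beta
    A = np.block([[S22, ZL.T @ X], [X.T @ ZL, X.T @ X]])
    rhs = np.concatenate([ZL.T @ y, X.T @ y])
    sol = np.linalg.solve(A, rhs)
    u, beta = sol[:q], sol[q:]
    resid = y - X @ beta - ZL @ u
    pwrss = resid @ resid + u @ u          # penalized residual sum of squares
    logdet = 2.0 * np.sum(np.log(np.diag(L)))
    return logdet + n * (1.0 + np.log(2.0 * np.pi * pwrss / n))


x = np.arange(6.0)
X = np.column_stack([np.ones(6), x])
Z = np.eye(3)[[0, 0, 1, 1, 2, 2]]          # three groups, two observations each
y = np.array([1.0, 2.1, 2.9, 4.2, 6.1, 6.8])
dev_at_zero = ml_deviance(0.0, X, Z, y)
dev_at_half = ml_deviance(0.5, X, Z, y)
print(dev_at_zero, dev_at_half)
```

At theta = 0 the random effects vanish and the value reduces to minus twice the OLS maximum log-likelihood, which is a convenient check. An optimizer such as BOBYQA would then minimize this function over theta, subject to theta >= 0.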

Parameters:

Name	Type	Description	Default
`spec`	`ModelSpec`	Model specification containing: - method: “reml” or “ml” (determines objective function) - random_terms: Parsed random effect specifications	required
`bundle`	`DataBundle`	Data bundle containing: - X: Fixed effects design matrix (n x p) - Z: Random effects design matrix (n x q, sparse CSC) - y: Response vector - re_metadata: Grouping structure information	required
`max_iter`	`int`	Maximum BOBYQA iterations (default: 10000).	`10000`
`verbose`	`bool`	Print optimization progress (default: False).	`False`

Returns:

Type	Description
`FitState`	FitState containing: - coef: Fixed effect coefficient estimates - vcov: Variance-covariance of fixed effects - fitted: Predicted values (fixed + random) - residuals: Response residuals (y - fitted) - leverage: Approximate leverage values - df_resid: Residual degrees of freedom - loglik: REML or ML log-likelihood - sigma: Residual standard deviation - theta: Optimized relative covariance parameters - u: Spherical random effects (unit variance) - converged: Whether optimizer converged - n_iter: Number of optimizer iterations

See Also:

lmer: Underlying PLS implementation

fit_model¶

fit_model(spec: ModelSpec, bundle: DataBundle, *, solver: str | None = None, max_iter: int | None = None, max_outer_iter: int = 10000, tol: float | None = None, verbose: bool = False, nAGQ: int = 1, use_hessian: bool = False) -> FitState

Dispatch to appropriate fitter based on model specification.

This is the main entry point for fitting models. It examines the ModelSpec to determine the appropriate solver and delegates to the corresponding fitter function.

If the design matrix is rank-deficient (detected during bundle construction), the X matrix is reduced to estimable columns before fitting. After fitting, coefficients and vcov are expanded back to full size with NaN for dropped columns (matching R’s lm() behavior).
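
The reduce-then-expand handling of rank deficiency can be illustrated as follows. `expand_to_full` is a hypothetical helper, the indices of the estimable columns (`keep`) are taken as given instead of being detected, and the final line shows the kind of NaN-to-zero replacement that `coef_for_predict` is described as using elsewhere on this page.

```python
import numpy as np


def expand_to_full(coef_reduced, vcov_reduced, keep, p_full):
    """Hypothetical helper: pad reduced estimates with NaN for dropped columns."""
    coef = np.full(p_full, np.nan)
    vcov = np.full((p_full, p_full), np.nan)
    coef[keep] = coef_reduced
    vcov[np.ix_(keep, keep)] = vcov_reduced
    return coef, vcov


x = np.array([1.0, 2.0, 3.0, 4.0])
X_full = np.column_stack([np.ones(4), x, 2.0 * x])  # third column is collinear with x
y = np.array([2.1, 3.9, 6.2, 7.8])
keep = np.array([0, 1])                             # estimable columns, taken as given
X_red = X_full[:, keep]
coef_red, *_ = np.linalg.lstsq(X_red, y, rcond=None)
resid = y - X_red @ coef_red
vcov_red = resid @ resid / (len(y) - len(keep)) * np.linalg.inv(X_red.T @ X_red)
coef, vcov = expand_to_full(coef_red, vcov_red, keep, X_full.shape[1])
coef_for_predict = np.nan_to_num(coef, nan=0.0)     # safe for X @ coef
print(coef, coef_for_predict)
```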

The solver selection follows the estimation method matrix:

Family	Random Effects	Method	Solver	Description
gaussian	No	ols	qr	QR decomposition
gaussian	No	ml	irls	Maximum likelihood
non-gauss	No	ml	irls	GLM via IRLS
gaussian	Yes	reml/ml	pls	Penalized least squares
non-gauss	Yes	ml	pirls	Penalized IRLS
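
The table can be read as a small decision function. The sketch below is a hypothetical re-statement of the table only (`pick_solver` is not the library's `resolve_solver()`, whose real signature takes a model configuration).

```python
VALID_SOLVERS = frozenset({"qr", "irls", "pls", "pirls"})


def pick_solver(family: str, has_random_effects: bool, method: str) -> str:
    """Hypothetical helper mirroring the estimation method matrix."""
    gaussian = family == "gaussian"
    if has_random_effects:
        # Mixed models: PLS for Gaussian responses, PIRLS otherwise.
        return "pls" if gaussian else "pirls"
    if gaussian and method == "ols":
        return "qr"
    # Gaussian with method="ml" and every non-Gaussian family use IRLS.
    return "irls"


rows = [
    ("gaussian", False, "ols"),
    ("gaussian", False, "ml"),
    ("poisson", False, "ml"),
    ("gaussian", True, "reml"),
    ("binomial", True, "ml"),
]
for row in rows:
    print(row, "->", pick_solver(*row))
```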

Parameters:

Name	Type	Description	Default
`spec`	`ModelSpec`	Model specification containing formula, family, link, method, and parsed formula components.	required
`bundle`	`DataBundle`	Prepared data bundle containing design matrices (X, y, Z), column names, valid observation mask, and optional weights/offset.	required
`solver`	`str \| None`	Override solver selection. If None, auto-selected via `resolve_solver()`. Must be one of `"qr"`, `"irls"`, `"pls"`, `"pirls"`.	`None`
`max_iter`	`int \| None`	Maximum iterations (solver-specific defaults if None).	`None`
`max_outer_iter`	`int`	Maximum outer (BOBYQA) iterations for GLMER (default: 10000).	`10000`
`tol`	`float \| None`	Convergence tolerance (solver-specific defaults if None).	`None`
`verbose`	`bool`	Print optimization progress (default: False).	`False`
`nAGQ`	`int`	Quadrature points for GLMER (default: 1).	`1`
`use_hessian`	`bool`	Use Hessian-based vcov for GLMER (default: False).	`False`

Returns:

Type	Description
`FitState`	FitState containing all fitting results.

Examples:

>>> import numpy as np
>>> from containers import build_model_spec, DataBundle
>>> spec = build_model_spec(
...     formula="y ~ x",
...     response_var="y",
...     fixed_terms=["Intercept", "x"],
... )
>>> bundle = DataBundle(
...     X=np.array([[1.0, 1.0], [1.0, 2.0], [1.0, 3.0]]),
...     y=np.array([2.0, 4.0, 6.0]),
...     X_names=["Intercept", "x"],
...     y_name="y",
...     valid_mask=np.array([True, True, True]),
...     n_total=3,
... )
>>> state = fit_model(spec, bundle)
>>> state.converged
True
>>> state.coef  # [Intercept, x] = [0, 2]
array([0., 2.])

fit_ols_qr¶

fit_ols_qr(spec: ModelSpec, bundle: DataBundle) -> FitState

Fit ordinary or weighted least squares using QR decomposition.

Supports observation weights (WLS) and offset terms. When weights are present, solves the transformed system sqrt(W)*X, sqrt(W)*y via QR decomposition, which yields WLS coefficients and vcov directly. Offsets are subtracted from y before fitting and added back to fitted values.

1. Subtract offset from y (if present): y_adj = y - offset
2. Apply weights (if present): X_w = sqrt(w)*X, y_w = sqrt(w)*y_adj
3. QR decompose X_w with column pivoting for stability
4. Solve R * beta = Q.T @ y_w via back-substitution
5. Recompute original-scale: fitted = X @ beta + offset, resid = y - fitted
6. vcov = sigma_w^2 * (X'WX)^{-1}
7. Leverage from (possibly weighted) hat matrix

Matches R’s logLik.lm formula:

L = 0.5*sum(log(w)) - n/2 * (log(2*pi) + log(RSS_w/n) + 1)

The 0.5*sum(log(w)) term is the Jacobian from the weight transformation.
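
The procedure and log-likelihood formula above can be sketched with NumPy. This is a simplified illustration, not the `ols` module's code: `wls_qr` is a hypothetical name, `np.linalg.qr` performs no column pivoting (so full column rank is assumed), and a general solve stands in for dedicated back-substitution.

```python
import numpy as np


def wls_qr(X, y, weights=None, offset=None):
    """Illustrative WLS via unpivoted QR; assumes X has full column rank."""
    n, p = X.shape
    w = np.ones(n) if weights is None else np.asarray(weights, dtype=float)
    off = np.zeros(n) if offset is None else np.asarray(offset, dtype=float)
    sw = np.sqrt(w)
    Xw, yw = sw[:, None] * X, sw * (y - off)     # transformed system
    Q, R = np.linalg.qr(Xw)                      # reduced QR, no column pivoting
    beta = np.linalg.solve(R, Q.T @ yw)          # R is upper triangular
    fitted = X @ beta + off                      # back on the original scale
    resid = y - fitted
    rss_w = np.sum(w * resid**2)
    R_inv = np.linalg.inv(R)
    vcov = rss_w / (n - p) * (R_inv @ R_inv.T)   # sigma_w^2 * (X'WX)^{-1}
    leverage = np.sum(Q**2, axis=1)              # diagonal of Q Q'
    loglik = 0.5 * np.sum(np.log(w)) - n / 2 * (
        np.log(2 * np.pi) + np.log(rss_w / n) + 1
    )
    return beta, vcov, fitted, resid, leverage, loglik


X = np.column_stack([np.ones(4), [1.0, 2.0, 3.0, 4.0]])
y = np.array([2.1, 3.9, 6.2, 7.8])
w = np.array([1.0, 2.0, 1.0, 2.0])
off = np.array([0.0, 0.1, 0.2, 0.3])
beta, vcov, fitted, resid, leverage, loglik = wls_qr(X, y, weights=w, offset=off)
print(beta, leverage.sum(), loglik)
```

The leverages sum to the number of columns of X, and the coefficients agree with the weighted normal equations, which gives two easy checks on the sketch.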

Parameters:

Name	Type	Description	Default
`spec`	`ModelSpec`	Model specification (unused for OLS, included for interface consistency with other fitters).	required
`bundle`	`DataBundle`	Data bundle containing: - X: Design matrix (n x p) - y: Response vector (n,) - weights: Observation weights (n,) or None for OLS - offset: Offset vector (n,) or None	required

Returns:

Type	Description
`FitState`	FitState containing: - coef: Coefficient estimates, shape (p,) - vcov: Variance-covariance matrix, shape (p, p) - fitted: Fitted values X @ coef + offset, shape (n,) - residuals: y - fitted, shape (n,) - leverage: Hat matrix diagonal, shape (n,) - df_resid: Residual degrees of freedom (n - rank) - loglik: Gaussian log-likelihood (weighted if applicable) - sigma: Residual standard deviation - converged: Always True (closed-form solution) - n_iter: Always 1 (single step)

Examples:

>>> import numpy as np
>>> from containers import build_model_spec, DataBundle
>>> spec = build_model_spec(
...     formula="y ~ x",
...     response_var="y",
...     fixed_terms=["Intercept", "x"],
... )
>>> bundle = DataBundle(
...     X=np.array([[1.0, 1.0], [1.0, 2.0], [1.0, 3.0]]),
...     y=np.array([2.0, 4.0, 6.0]),
...     X_names=["Intercept", "x"],
...     y_name="y",
...     valid_mask=np.array([True, True, True]),
...     n_total=3,
... )
>>> state = fit_ols_qr(spec, bundle)
>>> np.allclose(state.fitted + state.residuals, bundle.y)
True
>>> np.allclose(state.coef, [0.0, 2.0])  # Perfect fit: y = 2x
True

get_theta_lower_bounds¶

get_theta_lower_bounds(n_theta: int, re_structure: str, metadata: dict | None = None) -> list[float]

Get lower bounds for theta parameters.

Diagonal elements of Cholesky factor must be non-negative. Off-diagonal elements are unbounded.

Parameters:

Name	Type	Description	Default
`n_theta`	`int`	Number of theta parameters	required
`re_structure`	`str`	Random effects structure type	required
`metadata`	`dict \| None`	Optional metadata dict with ‘re_structures_list’ for crossed/nested/mixed structures	`None`

Returns:

Type	Description
`list[float]`	List of lower bounds
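
For a single k × k Cholesky block the rule above can be sketched as below. The helper name is hypothetical, and the column-major ordering of the lower triangle is an assumption borrowed from lme4; how the library lays out theta across crossed or nested structures is not shown here.

```python
import math


def block_lower_bounds(k):
    """Hypothetical helper: bounds for one k x k lower-triangular Cholesky block."""
    bounds = []
    for col in range(k):
        for row in range(col, k):
            # Diagonal entries must stay non-negative; off-diagonals are unbounded.
            bounds.append(0.0 if row == col else -math.inf)
    return bounds


print(block_lower_bounds(1))  # random intercept only
print(block_lower_bounds(2))  # correlated intercept and slope
```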

parse_fit_kwargs¶

parse_fit_kwargs(spec: ModelSpec, kwargs: dict[str, object], nAGQ: int | None) -> tuple[ModelSpec, str | None, dict[str, object]]

Validate and extract fitting parameters from **kwargs.

Pops solver, method, and nAGQ from kwargs, validates each, and assembles the remaining fit-specific keyword arguments into a dict suitable for fit_model().

Parameters:

Name	Type	Description	Default
`spec`	`ModelSpec`	Current model specification (may be evolved if `method` is set).	required
`kwargs`	`dict[str, object]`	Mutable dict of user-supplied keyword arguments. Recognized keys are popped: `solver`, `method`, `max_iter`, `max_outer_iter`, `tol`, `verbose`, `nAGQ`, `use_hessian`.	required
`nAGQ`	`int \| None`	Explicit `nAGQ` parameter from the `fit()` signature (takes precedence over any value in kwargs).	required

Returns:

Type	Description
`tuple[ModelSpec, str \| None, dict[str, object]]`	A tuple `(updated_spec, solver_override, fit_kwargs)` where: updated_spec has the validated method applied (if `method` was set); solver_override is the validated solver string, or None; fit_kwargs is a dict ready to splat into `fit_model()`.
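
The pop-and-validate pattern described above might look roughly like this. The sketch is hypothetical: `split_fit_kwargs` returns the raw `method` string instead of an updated ModelSpec, and rejecting leftover keys with a `TypeError` is an assumption, not documented behavior.

```python
VALID_SOLVERS = frozenset({"qr", "irls", "pls", "pirls"})
FIT_KEYS = ("max_iter", "max_outer_iter", "tol", "verbose", "use_hessian")


def split_fit_kwargs(kwargs, nAGQ=None):
    """Hypothetical sketch of the pop-and-validate pattern."""
    solver = kwargs.pop("solver", None)
    if solver is not None and solver not in VALID_SOLVERS:
        raise ValueError(f"Unknown solver {solver!r}; expected one of {sorted(VALID_SOLVERS)}")
    method = kwargs.pop("method", None)
    kw_nagq = kwargs.pop("nAGQ", None)
    fit_kwargs = {k: kwargs.pop(k) for k in FIT_KEYS if k in kwargs}
    # The explicit argument wins over a value passed through **kwargs.
    chosen = nAGQ if nAGQ is not None else kw_nagq
    if chosen is not None:
        fit_kwargs["nAGQ"] = chosen
    if kwargs:  # assumed behavior: anything unrecognized is an error
        raise TypeError(f"Unexpected keyword arguments: {sorted(kwargs)}")
    return method, solver, fit_kwargs


user_kwargs = {"solver": "qr", "tol": 1e-6, "nAGQ": 0}
method, solver, fit_kwargs = split_fit_kwargs(user_kwargs, nAGQ=1)
print(method, solver, fit_kwargs)
```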

parse_predict_formula¶

parse_predict_formula(formula: str, data: pl.DataFrame, response_col: str, grouping_factors: tuple[str, ...], *, n_points: int | Literal['data'] = 50) -> tuple[pl.DataFrame, list[str]]

Parse an explore-style formula and build a prediction grid.

Translates the formula via `parse_explore_formula`, rejects contrast formulas, and delegates to `build_predict_grid`.

Parameters:

Name	Type	Description	Default
`formula`	`str`	Explore-style formula (e.g. `"wt ~ cyl"`).	required
`data`	`DataFrame`	Training data.	required
`response_col`	`str`	Response column name.	required
`grouping_factors`	`tuple[str, ...]`	Random-effect grouping variables.	required
`n_points`	`int \| Literal['data']`	Number of grid points for continuous focal variables.	`50`

Returns:

Type	Description
`tuple[DataFrame, list[str]]`	Tuple of (grid DataFrame, list of grid column names for output). The grid column names are the focal variable plus any condition variables (the columns that vary across the grid, excluding reference-value columns).

per_factor_re_info¶

per_factor_re_info(re_meta: REInfo, group_names: list[str]) -> tuple[str | list[str], list[str] | dict[str, list[str]]]

Split global RE metadata into per-factor structures and names.

For crossed/nested/mixed models, the global re_structure is a single string (e.g. “crossed”) and random_names is a concatenated list across all factors. This function splits them into per-factor structures and per-factor name dicts suitable for BLUP decomposition and convergence diagnostics.

For single-factor models, returns the originals unchanged.

Parameters:

Name	Type	Description	Default
`re_meta`	`REInfo`	Random effects metadata from the fitted model’s DataBundle.	required
`group_names`	`list[str]`	Ordered list of grouping variable names (e.g. `["subject"]` or `["subject", "item"]`).	required

Returns:

Type	Description
`tuple[str \| list[str], list[str] \| dict[str, list[str]]]`	A tuple `(re_structure, random_names)`. For single-factor models: `(str, list[str])`, the originals. For multi-factor models: `(list[str], dict[str, list[str]])`, a per-factor structure list and a dict mapping each group name to its random effect names.

resolve_condition_values¶

resolve_condition_values(cond: Condition, data: pl.DataFrame) -> list | None

Resolve a `Condition` to concrete values or `None`.

Parameters:

Name	Type	Description	Default
`cond`	`Condition`	A `Condition` from `parse_explore_formula()`.	required
`data`	`DataFrame`	The model’s training data.	required

Returns:

Type	Description
`list \| None`	A list of concrete values if the condition specifies explicit values (`at_values`, `at_range`, `at_quantile`), or `None` for bare conditions (use all unique levels).

resolve_solver¶

resolve_solver(spec: ModelSpec) -> str

Select the appropriate solver for a model configuration.

Parameters:

Name	Type	Description	Default
`spec`	`ModelSpec`	Model specification.	required

Returns:

Type	Description
`str`	Solver name: “qr”, “irls”, “pls”, or “pirls”.

validate_fit_method¶

validate_fit_method(spec: ModelSpec, method_str: str) -> ModelSpec

Validate and apply a user-specified fitting method to a ModelSpec.

Checks that the method is compatible with the model’s family and random-effects structure, then returns an evolved spec with the new method.

Parameters:

Name	Type	Description	Default
`spec`	`ModelSpec`	Current model specification.	required
`method_str`	`str`	User-supplied method string (e.g. `"ols"`, `"ml"`, `"reml"`). Will be lowercased.	required

Returns:

Type	Description
`ModelSpec`	Evolved ModelSpec with the validated method applied.

Modules¶

convergence¶

Convergence diagnostics for fitted mixed-effects models.

Encapsulates the repeated pattern of extracting theta, computing bounds, assembling per-factor RE info, and running diagnose_convergence().

Functions:

Name	Description
`check_convergence`	Run convergence diagnostics on a fitted mixed model.

Classes¶

Functions¶

check_convergence¶

check_convergence(fit: FitState, re_meta: REInfo) -> list[ConvergenceMessage]

Run convergence diagnostics on a fitted mixed model.

Extracts theta and sigma from FitState, computes theta lower bounds and per-factor RE info from REInfo, and delegates to diagnose_convergence() for the actual diagnostic checks.

Parameters:

Name	Type	Description	Default
`fit`	`FitState`	Fitted model state (must have theta, sigma, converged).	required
`re_meta`	`REInfo`	Random effects metadata from the DataBundle.	required

Returns:

Type	Description
`list[ConvergenceMessage]`	List of ConvergenceMessage objects. Empty if theta is None.

diagnostics¶

Model-level diagnostics computation.

Pure functions that compute model diagnostics from containers. These were extracted from model/core.py to keep the model class as thin glue.

Attributes¶

Classes¶

Functions¶

augment_data_with_diagnostics¶

augment_data_with_diagnostics(*, raw_data: pl.DataFrame, fit: FitState, bundle: DataBundle) -> pl.DataFrame

Augment raw data with diagnostic columns after fit.

Adds fitted, resid, hat, std_resid, cooksd columns (names from AugmentedDataCols schema). Values are NaN for rows dropped due to missing data.

Parameters:

Name	Type	Description	Default
`raw_data`	`DataFrame`	Original data DataFrame (pre-NA-drop).	required
`fit`	`FitState`	Fitted state with residuals, fitted values, leverage.	required
`bundle`	`DataBundle`	Data bundle with valid_mask, n_total, p.	required

Returns:

Type	Description
`DataFrame`	DataFrame with diagnostic columns appended.
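
The NaN padding for dropped rows can be pictured with a small NumPy sketch. `expand_to_full` is a hypothetical helper, and the assumption is that diagnostics are computed only for the rows where a boolean `valid_mask` is true.

```python
import numpy as np


def expand_to_full(values, valid_mask):
    """Scatter per-valid-row values into a full-length array, NaN elsewhere."""
    valid_mask = np.asarray(valid_mask, dtype=bool)
    out = np.full(valid_mask.shape[0], np.nan)
    out[valid_mask] = values  # rows dropped for missing data stay NaN
    return out


# Three raw rows, the middle one dropped before fitting.
fitted_full = expand_to_full([1.0, 2.0], [True, False, True])
print(fitted_full)
```

The resulting array has one entry per raw row, so it can be attached to the original data as a column without re-aligning indices.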

compute_diagnostics¶

compute_diagnostics(*, model_type: str, spec: ModelSpec, bundle: DataBundle, fit: FitState, coef_for_predict: np.ndarray, varying_spread: VaryingSpreadState | None, cv: CVState | None, has_intercept: bool = True) -> pl.DataFrame

Compute model-level diagnostics as a single-row DataFrame.

Builds goodness-of-fit diagnostics from fitted model state, with columns varying by model type (lm, glm, lmer, glmer).

Parameters:

Name	Type	Description	Default
`model_type`	`str`	One of “lm”, “glm”, “lmer”, “glmer”.	required
`spec`	`ModelSpec`	Model specification (for family).	required
`bundle`	`DataBundle`	Data bundle (for n, rank, X, y, re_metadata).	required
`fit`	`FitState`	Fitted state (for coefficients, residuals, loglik, etc.).	required
`coef_for_predict`	`ndarray`	Coefficients safe for matrix multiplication (NaN replaced by 0 for rank-deficient models).	required
`varying_spread`	`VaryingSpreadState \| None`	Random effects variance components (mixed models).	required
`cv`	`CVState \| None`	Cross-validation state, or None.	required
`has_intercept`	`bool`	Whether the model includes an intercept. Affects R² computation (centered vs uncentered SS_tot).	`True`

Returns:

Type	Description
`DataFrame`	Single-row Polars DataFrame with model diagnostics. See `model.diagnostics` for full column documentation.

compute_metadata¶

compute_metadata(*, bundle: DataBundle) -> pl.DataFrame

Compute model metadata as a single-row DataFrame.

Returns sample/structural info about the model: observation counts, parameter count, and group counts (for mixed models).

Parameters:

Name	Type	Description	Default
`bundle`	`DataBundle`	Data bundle (for n, n_total, p, re_metadata).	required

Returns:

Type	Description
`DataFrame`	Single-row Polars DataFrame with model metadata.

compute_optimizer_diagnostics¶

compute_optimizer_diagnostics(*, model_type: str, fit: FitState) -> pl.DataFrame

Compute optimizer convergence diagnostics as a single-row DataFrame.

Parameters:

Name	Type	Description	Default
`model_type`	`str`	One of “lm”, “glm”, “lmer”, “glmer”.	required
`fit`	`FitState`	Fitted state with convergence info, theta, dispersion.	required

Returns:

Type	Description
`DataFrame`	Single-row Polars DataFrame with optimizer diagnostics.

compute_r_squared¶

compute_r_squared(y: np.ndarray, residuals: np.ndarray, n: int, p: int, has_intercept: bool = True) -> tuple[float, float]

Compute R-squared and adjusted R-squared from raw arrays.

For models with an intercept, uses centered SS_tot = sum((y - mean(y))^2). For no-intercept models, uses uncentered SS_tot = sum(y^2), matching R’s summary.lm() behavior.

Parameters:

Name	Type	Description	Default
`y`	`ndarray`	Response vector of shape (n,).	required
`residuals`	`ndarray`	Residual vector of shape (n,).	required
`n`	`int`	Number of observations.	required
`p`	`int`	Number of parameters (including intercept if present).	required
`has_intercept`	`bool`	Whether the model includes an intercept.	`True`

Returns:

Type	Description
`tuple[float, float]`	Tuple of (R-squared, adjusted R-squared).
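
A minimal sketch of the centered versus uncentered computation follows. The function name `r_squared` is hypothetical, and the adjusted formula (total degrees of freedom `n - 1` with an intercept, `n` without) follows R's `summary.lm()`; that bossanova uses exactly this adjustment is an assumption.

```python
import numpy as np


def r_squared(y, residuals, n, p, has_intercept=True):
    """Return (R-squared, adjusted R-squared) from raw arrays."""
    y = np.asarray(y, dtype=float)
    residuals = np.asarray(residuals, dtype=float)
    ss_res = float(np.sum(residuals**2))
    if has_intercept:
        ss_tot = float(np.sum((y - y.mean()) ** 2))  # centered SS_tot
        df_tot = n - 1
    else:
        ss_tot = float(np.sum(y**2))  # uncentered SS_tot
        df_tot = n
    r2 = 1.0 - ss_res / ss_tot
    adj_r2 = 1.0 - (1.0 - r2) * df_tot / (n - p)
    return r2, adj_r2


y = [1.0, 2.0, 3.0, 4.0]
resid = [0.1, -0.1, 0.1, -0.1]
print(r_squared(y, resid, n=4, p=2))
print(r_squared(y, resid, n=4, p=1, has_intercept=False))
```

The uncentered total sum of squares is never smaller than the centered one, which is why R-squared values from no-intercept models are not directly comparable with those from models that include an intercept.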

dispatch¶

Solver dispatch for model fitting.

Provides fit_model() which dispatches to the appropriate fitter based on model specification, and resolve_solver() which determines the solver type.

Handles rank-deficient design matrices by reducing X before fitting and expanding coefficients/vcov after, inserting NaN for dropped columns.
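
The reduce-then-expand idea can be sketched as follows. `fit_with_dropped_columns` is a hypothetical helper that takes the indices of the columns to keep as given (how bossanova detects the dropped columns is not shown here) and uses `numpy.linalg.lstsq` in place of the real fitters.

```python
import numpy as np


def fit_with_dropped_columns(X, y, keep):
    """Fit on the kept columns, then expand coefficients with NaN."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    beta_reduced, *_ = np.linalg.lstsq(X[:, keep], y, rcond=None)
    beta = np.full(X.shape[1], np.nan)  # NaN marks dropped columns
    beta[keep] = beta_reduced
    return beta


# The third column duplicates the second, so it is dropped before fitting.
X = np.array([[1.0, 1.0, 1.0], [1.0, 2.0, 2.0], [1.0, 3.0, 3.0]])
y = np.array([2.0, 4.0, 6.0])
beta = fit_with_dropped_columns(X, y, keep=[0, 1])

# For matrix multiplication, NaN entries can be replaced by 0.
coef_for_predict = np.nan_to_num(beta, nan=0.0)
print(beta, X @ coef_for_predict)
```

Keeping NaN in the reported coefficients makes the dropped columns visible to the user, while the zero-filled copy stays safe to use in `X @ coef`.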

Functions:

Name	Description
`fit_model`	Dispatch to appropriate fitter based on model specification.
`parse_fit_kwargs`	Validate and extract fitting parameters from `**kwargs`.
`resolve_solver`	Select the appropriate solver for a model configuration.
`validate_fit_method`	Validate and apply a user-specified fitting method to a ModelSpec.

Attributes:

Name	Type	Description
`VALID_SOLVERS`	`frozenset[str]`

Attributes¶

VALID_SOLVERS¶

VALID_SOLVERS: frozenset[str] = frozenset({'qr', 'irls', 'pls', 'pirls'})

Classes¶

Functions¶

fit_model¶

fit_model(spec: ModelSpec, bundle: DataBundle, *, solver: str | None = None, max_iter: int | None = None, max_outer_iter: int = 10000, tol: float | None = None, verbose: bool = False, nAGQ: int = 1, use_hessian: bool = False) -> FitState

Dispatch to appropriate fitter based on model specification.

This is the main entry point for fitting models. It examines the ModelSpec to determine the appropriate solver and delegates to the corresponding fitter function.

The solver selection follows the estimation method matrix:

Family	Random Effects	Method	Solver	Description
gaussian	No	ols	qr	QR decomposition
gaussian	No	ml	irls	Maximum likelihood
non-gauss	No	ml	irls	GLM via IRLS
gaussian	Yes	reml/ml	pls	Penalized least squares
non-gauss	Yes	ml	pirls	Penalized IRLS
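
Read as code, the matrix above reduces to a few branches. The sketch below is a hypothetical restatement that takes plain arguments rather than a `ModelSpec`; the real `resolve_solver()` may check more than this.

```python
def pick_solver(family: str, has_random_effects: bool, method: str) -> str:
    """Map (family, random effects, method) to a solver name per the table."""
    is_gaussian = family == "gaussian"
    if has_random_effects:
        # Mixed models: PLS for gaussian, penalized IRLS otherwise.
        return "pls" if is_gaussian else "pirls"
    if is_gaussian and method == "ols":
        return "qr"
    # Gaussian with ML, and every non-gaussian fixed-effects model.
    return "irls"


for row in [
    ("gaussian", False, "ols"),
    ("gaussian", False, "ml"),
    ("poisson", False, "ml"),
    ("gaussian", True, "reml"),
    ("binomial", True, "ml"),
]:
    print(row, "->", pick_solver(*row))
```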

Parameters:

Name	Type	Description	Default
`spec`	`ModelSpec`	Model specification containing formula, family, link, method, and parsed formula components.	required
`bundle`	`DataBundle`	Prepared data bundle containing design matrices (X, y, Z), column names, valid observation mask, and optional weights/offset.	required
`solver`	`str \| None`	Override solver selection. If None, auto-selected via `resolve_solver()`. Must be one of `"qr"`, `"irls"`, `"pls"`, `"pirls"`.	`None`
`max_iter`	`int \| None`	Maximum iterations (solver-specific defaults if None).	`None`
`max_outer_iter`	`int`	Maximum outer (BOBYQA) iterations for GLMER (default: 10000).	`10000`
`tol`	`float \| None`	Convergence tolerance (solver-specific defaults if None).	`None`
`verbose`	`bool`	Print optimization progress (default: False).	`False`
`nAGQ`	`int`	Quadrature points for GLMER (default: 1).	`1`
`use_hessian`	`bool`	Use Hessian-based vcov for GLMER (default: False).	`False`

Returns:

Type	Description
`FitState`	FitState containing all fitting results.

Examples:

>>> import numpy as np
>>> from containers import build_model_spec, DataBundle
>>> spec = build_model_spec(
...     formula="y ~ x",
...     response_var="y",
...     fixed_terms=["Intercept", "x"],
... )
>>> bundle = DataBundle(
...     X=np.array([[1.0, 1.0], [1.0, 2.0], [1.0, 3.0]]),
...     y=np.array([2.0, 4.0, 6.0]),
...     X_names=["Intercept", "x"],
...     y_name="y",
...     valid_mask=np.array([True, True, True]),
...     n_total=3,
... )
>>> state = fit_model(spec, bundle)
>>> state.converged
True
>>> state.coef  # [Intercept, x] = [0, 2]
array([0., 2.])

parse_fit_kwargs¶

parse_fit_kwargs(spec: ModelSpec, kwargs: dict[str, object], nAGQ: int | None) -> tuple[ModelSpec, str | None, dict[str, object]]

Validate and extract fitting parameters from `**kwargs`.

Pops solver, method, and nAGQ from kwargs, validates each, and assembles the remaining fit-specific keyword arguments into a dict suitable for fit_model().

Parameters:

Name	Type	Description	Default
`spec`	`ModelSpec`	Current model specification (may be evolved if `method` is set).	required
`kwargs`	`dict[str, object]`	Mutable dict of user-supplied keyword arguments. Recognized keys are popped: `solver`, `method`, `max_iter`, `max_outer_iter`, `tol`, `verbose`, `nAGQ`, `use_hessian`.	required
`nAGQ`	`int \| None`	Explicit `nAGQ` parameter from the `fit()` signature (takes precedence over any value in kwargs).	required

Returns:

Type	Description
`tuple[ModelSpec, str \| None, dict[str, object]]`	A tuple `(updated_spec, solver_override, fit_kwargs)` where `updated_spec` has the validated method applied (if `method` was set), `solver_override` is the validated solver string or `None`, and `fit_kwargs` is a dict ready to splat into `fit_model()`.

resolve_solver¶

resolve_solver(spec: ModelSpec) -> str

Select the appropriate solver for a model configuration.

Parameters:

Name	Type	Description	Default
`spec`	`ModelSpec`	Model specification.	required

Returns:

Type	Description
`str`	Solver name: “qr”, “irls”, “pls”, or “pirls”.

validate_fit_method¶

validate_fit_method(spec: ModelSpec, method_str: str) -> ModelSpec

Validate and apply a user-specified fitting method to a ModelSpec.

Checks that the method is compatible with the model’s family and random-effects structure, then returns an evolved spec with the new method.

Parameters:

Name	Type	Description	Default
`spec`	`ModelSpec`	Current model specification.	required
`method_str`	`str`	User-supplied method string (e.g. `"ols"`, `"ml"`, `"reml"`). Will be lowercased.	required

Returns:

Type	Description
`ModelSpec`	Evolved ModelSpec with the validated method applied.

glm¶

GLM fitting via Iteratively Reweighted Least Squares (IRLS).

Functions:

Name	Description
`fit_glm_irls`	Fit generalized linear model using Iteratively Reweighted Least Squares.

Classes¶

Functions¶

fit_glm_irls¶

fit_glm_irls(spec: ModelSpec, bundle: DataBundle, *, max_iter: int = 25, tol: float = 1e-08) -> FitState

Fit generalized linear model using Iteratively Reweighted Least Squares.

This adapter wraps the underlying IRLS implementation. IRLS solves GLMs by iterating between computing working weights and solving a weighted least squares problem.

1. Initialize mu from y (or link function default)
2. For each iteration:
    a. Compute working weights: W = 1 / (V(mu) * g'(mu)^2)
    b. Compute working response: z = eta + (y - mu) * g'(mu)
    c. Solve weighted least squares: beta = (X'WX)^{-1} X'Wz
    d. Update eta = X @ beta, mu = g^{-1}(eta)
3. Continue until convergence (change in deviance < tol)

gaussian: constant variance (V(mu) = 1), identity link
binomial: mu(1-mu) variance, logit/probit/cloglog link
poisson: mu variance, log link
gamma: mu^2 variance, inverse/log link

Parameters:

Name	Type	Description	Default
`spec`	`ModelSpec`	Model specification containing `family` (distribution family; determines the variance function) and `link` (link function; determines g and g').	required
`bundle`	`DataBundle`	Data bundle containing X, y, and optional weights.	required
`max_iter`	`int`	Maximum IRLS iterations (default: 25).	`25`
`tol`	`float`	Convergence tolerance on deviance (default: 1e-8).	`1e-08`

Returns:

Type	Description
`FitState`	Fitted model state.

See Also:

glm: Underlying IRLS implementation
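
The IRLS steps listed for `fit_glm_irls` can be sketched for one concrete case, a Poisson model with log link. This is an illustration under stated assumptions, not the bossanova implementation: `irls_poisson` is a hypothetical name, the starting value `mu = y + 0.1` is one common choice, and the normal equations are solved directly rather than through a QR or Cholesky factorization.

```python
import numpy as np


def irls_poisson(X, y, max_iter=25, tol=1e-8):
    """IRLS for a Poisson GLM with log link (V(mu) = mu, g(mu) = log(mu))."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    mu = y + 0.1  # initialize mu from y, shifted so log(mu) is finite
    eta = np.log(mu)
    beta = np.zeros(X.shape[1])
    dev_old = np.inf
    for _ in range(max_iter):
        g_prime = 1.0 / mu  # derivative of the log link
        w = 1.0 / (mu * g_prime**2)  # working weights W = 1 / (V(mu) g'(mu)^2)
        z = eta + (y - mu) * g_prime  # working response
        XtW = X.T * w  # X'W via broadcasting over observations
        beta = np.linalg.solve(XtW @ X, XtW @ z)
        eta = X @ beta
        mu = np.exp(eta)
        # Poisson deviance; the y * log(y / mu) term is 0 when y == 0.
        with np.errstate(divide="ignore", invalid="ignore"):
            term = np.where(y > 0, y * np.log(y / mu), 0.0)
        dev = 2.0 * np.sum(term - (y - mu))
        if abs(dev - dev_old) < tol:
            break
        dev_old = dev
    return beta


x = np.arange(10) / 10.0
X = np.column_stack([np.ones_like(x), x])
y = np.exp(0.5 + 1.0 * x)  # noise-free means, so the fit recovers the inputs
print(irls_poisson(X, y))
```

Other families only change the variance function `V(mu)`, the link derivative `g'(mu)`, and the deviance expression; the weighted least squares step stays the same.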

Modules¶

glmer¶

GLMM fitting via Penalized IRLS (PIRLS).

Functions:

Name	Description
`fit_glmer_pirls`	Fit generalized linear mixed model using Penalized IRLS.

Classes¶

Functions¶

fit_glmer_pirls¶

fit_glmer_pirls(spec: ModelSpec, bundle: DataBundle, *, max_iter: int = 25, max_outer_iter: int = 10000, tol: float = 1e-07, verbose: bool = False, nAGQ: int = 1, use_hessian: bool = False) -> FitState

Fit generalized linear mixed model using Penalized IRLS.

This adapter wraps the underlying PIRLS implementation. PIRLS combines IRLS (for the GLM part) with PLS (for random effects), using the Laplace approximation to integrate out the random effects.

Outer loop (BOBYQA optimization over theta):

For each theta:

1. Build Lambda from theta

Inner loop (PIRLS iterations):
    a. Compute working weights from current eta/mu
    b. Compute working response
    c. Solve weighted PLS for beta and u
    d. Update eta = X @ beta + Z @ Lambda @ u
    e. Update mu = g^{-1}(eta)
    f. Step-halving if deviance increased
    g. Check convergence

2. Return Laplace deviance

Select theta minimizing Laplace deviance

Parameters:

Name	Type	Description	Default
`spec`	`ModelSpec`	Model specification containing `family` (distribution family), `link` (link function), and `random_terms` (parsed random effect specifications).	required
`bundle`	`DataBundle`	Data bundle containing `X` (fixed effects design matrix), `Z` (random effects design matrix, sparse), `y` (response vector), and `re_metadata` (grouping structure).	required
`max_iter`	`int`	Maximum PIRLS iterations per theta (default: 25).	`25`
`max_outer_iter`	`int`	Maximum BOBYQA iterations (default: 10000).	`10000`
`tol`	`float`	PIRLS convergence tolerance (default: 1e-7).	`1e-07`
`verbose`	`bool`	Print optimization progress (default: False).	`False`
`nAGQ`	`int`	Quadrature points (0 or 1, default: 1).	`1`
`use_hessian`	`bool`	Use Hessian-based vcov (default: False). The default Schur complement approach matches lme4’s `vcov()` with `use.hessian=FALSE` and avoids expensive numerical differentiation. Set to True for observed-information vcov.	`False`

Returns:

Type	Description
`FitState`	Fitted model state.

See Also:

glmer: Underlying PIRLS implementation

grid¶

Prediction grid construction for formula-mode predictions.

Provides parse_predict_formula() which translates an explore-style formula into a prediction grid (Polars DataFrame), and build_predict_grid() which assembles the Cartesian-product grid from column specifications.

Shared by model.predict() (formula mode) and viz/predict.py (plot_predict).

Functions:

Name	Description
`build_predict_grid`	Build a Cartesian-product prediction grid.
`compute_predictions_from_formula`	Parse a predict formula, build the grid, compute predictions, and attach grid columns.
`parse_predict_formula`	Parse an explore-style formula and build a prediction grid.
`resolve_condition_values`	Resolve a `Condition` to concrete values or `None`.

Classes¶

Functions¶

build_predict_grid¶

build_predict_grid(data: pl.DataFrame, focal_var: str, response_col: str, grouping_factors: tuple[str, ...], *, focal_values: list[float | str] | None = None, n_points: int | Literal['data'] = 50, varying_vars: list[str] | None = None, at: dict[str, Any] | None = None) -> pl.DataFrame

Build a Cartesian-product prediction grid.

Parameters:

Name	Type	Description	Default
`data`	`DataFrame`	Training data (Polars DataFrame).	required
`focal_var`	`str`	The predictor to vary across the grid.	required
`response_col`	`str`	Response column name (excluded from grid).	required
`grouping_factors`	`tuple[str, ...]`	Random-effect grouping variables (excluded).	required
`focal_values`	`list[float \| str] \| None`	Explicit values for the focal variable. Overrides default linspace/unique-levels logic.	`None`
`n_points`	`int \| Literal['data']`	Number of grid points for continuous focal variables. Use `"data"` to use actual observed unique values.	`50`
`varying_vars`	`list[str] \| None`	Condition variables to expand (all unique levels).	`None`
`at`	`dict[str, Any] \| None`	Dict of pinned values. Scalar = single constant, list = expand.	`None`

Returns:

Type	Description
`DataFrame`	Polars DataFrame with the Cartesian-product prediction grid.
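
To make the Cartesian-product idea concrete, the sketch below crosses an evenly spaced focal variable with the levels of one condition variable. It uses plain Python dicts rather than a Polars DataFrame to stay dependency-light; `cartesian_grid` is a hypothetical helper, and the column names `wt` and `cyl` simply echo the formula example used elsewhere on this page.

```python
import itertools

import numpy as np


def cartesian_grid(columns: dict) -> list:
    """Return one row (as a dict) per combination of the column values."""
    names = list(columns)
    return [
        dict(zip(names, combo))
        for combo in itertools.product(*columns.values())
    ]


# Continuous focal variable: evenly spaced points over an assumed range.
focal_values = np.linspace(1.5, 5.5, 5).tolist()
grid = cartesian_grid({"wt": focal_values, "cyl": [4, 6, 8]})
print(len(grid), grid[0], grid[-1])
```

The grid size is the product of the number of values per column, so a large `n_points` combined with several expanded condition variables grows quickly.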

compute_predictions_from_formula¶

compute_predictions_from_formula(formula: str, data: pl.DataFrame, spec: object, bundle: object, fit: object, formula_spec: object, pred_type: str, varying: str, allow_new_levels: bool, n_points: int | Literal['data']) -> 'PredictionState'

Parse a predict formula, build the grid, compute predictions, and attach grid columns.

Combines parse_predict_formula, compute_predictions, and grid-column attachment into a single call for model.predict() formula mode.

Parameters:

Name	Type	Description	Default
`formula`	`str`	Explore-style formula (e.g. `"wt ~ cyl"`).	required
`data`	`DataFrame`	Training data.	required
`spec`	`object`	Model specification.	required
`bundle`	`object`	Data bundle.	required
`fit`	`object`	Fitted model state.	required
`formula_spec`	`object`	Learned formula spec for newdata evaluation.	required
`pred_type`	`str`	Prediction scale (`"response"` or `"link"`).	required
`varying`	`str`	RE handling (`"exclude"` or `"include"`).	required
`allow_new_levels`	`bool`	If True, new groups predict at population level.	required
`n_points`	`int \| Literal['data']`	Number of grid points for continuous focal variables.	required

Returns:

Type	Description
`'PredictionState'`	PredictionState with grid columns attached.

parse_predict_formula¶

parse_predict_formula(formula: str, data: pl.DataFrame, response_col: str, grouping_factors: tuple[str, ...], *, n_points: int | Literal['data'] = 50) -> tuple[pl.DataFrame, list[str]]

Parse an explore-style formula and build a prediction grid.

Translates the formula via `parse_explore_formula()`, rejects contrast formulas, and delegates to `build_predict_grid()`.

Parameters:

Name	Type	Description	Default
`formula`	`str`	Explore-style formula (e.g. `"wt ~ cyl"`).	required
`data`	`DataFrame`	Training data.	required
`response_col`	`str`	Response column name.	required
`grouping_factors`	`tuple[str, ...]`	Random-effect grouping variables.	required
`n_points`	`int \| Literal['data']`	Number of grid points for continuous focal variables.	`50`

Returns:

Type	Description
`tuple[DataFrame, list[str]]`	Tuple of (grid DataFrame, list of grid column names for output). The grid column names are the focal variable plus any condition variables (the columns that vary across the grid, excluding reference-value columns).

resolve_condition_values¶

resolve_condition_values(cond: Condition, data: pl.DataFrame) -> list | None

Resolve a `Condition` to concrete values or `None`.

Parameters:

Name	Type	Description	Default
`cond`	`Condition`	A `Condition` from `parse_explore_formula()`.	required
`data`	`DataFrame`	The model’s training data.	required

Returns:

Type	Description
`list \| None`	A list of concrete values if the condition specifies explicit values (`at_values`, `at_range`, `at_quantile`), or `None` for bare conditions (use all unique levels).

lifecycle¶

Fit lifecycle orchestration.

Owns the multi-step fit sequence: bundle rebuild → fit → post-fit state → diagnostics augmentation. Called by model.fit() so the model class stays a thin facade.

Classes:

Name	Description
`FitResult`	Immutable result of the fit lifecycle.

Functions:

Name	Description
`execute_fit`	Execute the full fit lifecycle: bundle rebuild → fit → post-fit state → diagnostics.

Classes¶

FitResult¶

Immutable result of the fit lifecycle.

Attributes:

Name	Type	Description
`fit`	`FitState`	Fitted model state (coefficients, residuals, etc.).
`bundle`	`DataBundle`	Data bundle used for fitting (may be rebuilt).
`formula_spec`	`object`	Learned formula spec for newdata evaluation.
`raw_data`	`DataFrame \| None`	Original data snapshot (pre-augmentation).
`augmented_data`	`DataFrame \| None`	Data with diagnostic columns, or None.
`varying_offsets`	`VaryingState \| None`	BLUPs for mixed models, or None.
`varying_spread`	`VaryingSpreadState \| None`	Variance components for mixed models, or None.

Attributes¶

augmented_data¶

augmented_data: pl.DataFrame | None

bundle¶

bundle: DataBundle

fit¶

fit: FitState

formula_spec¶

formula_spec: object

raw_data¶

raw_data: pl.DataFrame | None

varying_offsets¶

varying_offsets: VaryingState | None = None

varying_spread¶

varying_spread: VaryingSpreadState | None = None

Functions¶

execute_fit¶

execute_fit(spec: ModelSpec, bundle: DataBundle | None, data: pl.DataFrame, raw_data: pl.DataFrame | None, formula: str, custom_contrasts: dict | None, weights_col: str | None, offset_col: str | None, missing: str, is_mixed: bool, solver_override: str | None, fit_kwargs: dict) -> FitResult

Execute the full fit lifecycle: bundle rebuild → fit → post-fit state → diagnostics.

Parameters:

Name	Type	Description	Default
`spec`	`ModelSpec`	Model specification.	required
`bundle`	`DataBundle \| None`	Existing data bundle, or None to force rebuild.	required
`data`	`DataFrame`	Current data (raw_data-restored by caller).	required
`raw_data`	`DataFrame \| None`	Original pre-augmentation snapshot, or None.	required
`formula`	`str`	Formula string for bundle building.	required
`custom_contrasts`	`dict \| None`	User contrast matrices, or None.	required
`weights_col`	`str \| None`	Weights column name, or None.	required
`offset_col`	`str \| None`	Offset column name, or None.	required
`missing`	`str`	Missing value handling (`"drop"` or `"fail"`).	required
`is_mixed`	`bool`	Whether this is a mixed-effects model.	required
`solver_override`	`str \| None`	Explicit solver, or None for auto.	required
`fit_kwargs`	`dict`	Additional kwargs for `fit_model()`.	required

Returns:

Type	Description
`FitResult`	FitResult with all state the model needs to assign.

lmer¶

LMM fitting via Penalized Least Squares (PLS).

Functions:

Name	Description
`fit_lmer_pls`	Fit linear mixed-effects model using Penalized Least Squares.

Classes¶

Functions¶

fit_lmer_pls¶

fit_lmer_pls(spec: ModelSpec, bundle: DataBundle, *, max_iter: int = 10000, verbose: bool = False) -> FitState

Fit linear mixed-effects model using Penalized Least Squares.

This adapter wraps the underlying PLS implementation. PLS is the algorithm from Bates et al. (2015) used in R’s lme4 package.

Parameters:

Name	Type	Description	Default
`spec`	`ModelSpec`	Model specification containing `method` (“reml” or “ml”; determines the objective function) and `random_terms` (parsed random effect specifications).	required
`bundle`	`DataBundle`	Data bundle containing `X` (fixed effects design matrix, n x p), `Z` (random effects design matrix, n x q, sparse CSC), `y` (response vector), and `re_metadata` (grouping structure information).	required
`max_iter`	`int`	Maximum BOBYQA iterations (default: 10000).	`10000`
`verbose`	`bool`	Print optimization progress (default: False).	`False`

Returns:

Type	Description
`FitState`	Fitted model state.

See Also:

lmer: Underlying PLS implementation

ols¶

OLS fitting via QR decomposition.

Functions:

Name	Description
`fit_ols_qr`	Fit ordinary or weighted least squares using QR decomposition.

Classes¶

Functions¶

fit_ols_qr¶

fit_ols_qr(spec: ModelSpec, bundle: DataBundle) -> FitState

Fit ordinary or weighted least squares using QR decomposition.

1. Subtract offset from y (if present): y_adj = y - offset
2. Apply weights (if present): X_w = sqrt(w)*X, y_w = sqrt(w)*y_adj
3. QR decompose X_w with column pivoting for stability
4. Solve R * beta = Q.T @ y_w via back-substitution
5. Recompute original-scale: fitted = X @ beta + offset, resid = y - fitted
6. vcov = sigma_w^2 * (X'WX)^{-1}
7. Leverage from (possibly weighted) hat matrix

Matches R’s `logLik.lm` formula:

L = 0.5*sum(log(w)) - n/2 * (log(2*pi) + log(RSS_w/n) + 1)

The 0.5*sum(log(w)) term is the Jacobian from the weight transformation.
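
The steps and log-likelihood formula above can be sketched with NumPy alone. This is an illustration rather than the bossanova code: `wls_qr` is a hypothetical name, `numpy.linalg.qr` performs an unpivoted QR (so the column-pivoting step is omitted and full column rank is assumed), and taking `sigma_w^2 = RSS_w / (n - p)` is an assumption.

```python
import numpy as np


def wls_qr(X, y, weights=None, offset=None):
    """Weighted least squares via unpivoted QR; returns a dict of results."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    n, p = X.shape
    w = np.ones(n) if weights is None else np.asarray(weights, dtype=float)
    off = np.zeros(n) if offset is None else np.asarray(offset, dtype=float)

    sw = np.sqrt(w)
    X_w = X * sw[:, None]  # X_w = sqrt(w) * X
    y_w = sw * (y - off)  # y_w = sqrt(w) * (y - offset)

    Q, R = np.linalg.qr(X_w)  # reduced QR: Q is (n, p), R is (p, p)
    beta = np.linalg.solve(R, Q.T @ y_w)  # solves R beta = Q'y_w

    fitted = X @ beta + off  # back on the original scale
    resid = y - fitted
    rss_w = float(np.sum(w * resid**2))
    sigma2 = rss_w / (n - p)  # assumed residual variance estimator

    R_inv = np.linalg.inv(R)
    vcov = sigma2 * (R_inv @ R_inv.T)  # (X'WX)^{-1} = R^{-1} R^{-T}
    leverage = np.sum(Q**2, axis=1)  # diagonal of the weighted hat matrix

    loglik = 0.5 * np.sum(np.log(w)) - n / 2 * (
        np.log(2 * np.pi) + np.log(rss_w / n) + 1
    )
    return {"coef": beta, "fitted": fitted, "resid": resid,
            "vcov": vcov, "leverage": leverage, "loglik": float(loglik)}


X = np.array([[1.0, 1.0], [1.0, 2.0], [1.0, 3.0], [1.0, 4.0]])
y = np.array([2.1, 3.9, 6.2, 7.8])
print(wls_qr(X, y)["coef"])
```

With unit weights the `0.5*sum(log(w))` term vanishes and the expression reduces to the ordinary Gaussian log-likelihood evaluated at the ML variance estimate `RSS/n`. A perfectly fitting model has `RSS_w = 0`, in which case the log-likelihood is not finite.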

Parameters:

Name	Type	Description	Default
`spec`	`ModelSpec`	Model specification (unused for OLS, included for interface consistency with other fitters).	required
`bundle`	`DataBundle`	Data bundle containing `X` (design matrix, n x p), `y` (response vector, shape (n,)), `weights` (observation weights of shape (n,), or None for OLS), and `offset` (offset vector of shape (n,), or None).	required

Returns:

Type	Description
`FitState`	Fitted model state.

Examples:

>>> import numpy as np
>>> from containers import build_model_spec, DataBundle
>>> spec = build_model_spec(
...     formula="y ~ x",
...     response_var="y",
...     fixed_terms=["Intercept", "x"],
... )
>>> bundle = DataBundle(
...     X=np.array([[1.0, 1.0], [1.0, 2.0], [1.0, 3.0]]),
...     y=np.array([2.0, 4.0, 6.0]),
...     X_names=["Intercept", "x"],
...     y_name="y",
...     valid_mask=np.array([True, True, True]),
...     n_total=3,
... )
>>> state = fit_ols_qr(spec, bundle)
>>> np.allclose(state.fitted + state.residuals, bundle.y)
True
>>> np.allclose(state.coef, [0.0, 2.0])  # Perfect fit: y = 2x
True

predict¶

Prediction operations on containers.

Pure functions for computing predictions on new data, including random effects contribution for mixed models. Extracted from model/core.py.

Classes¶

Functions¶

build_X_for_newdata¶

build_X_for_newdata(formula_spec: FormulaSpec | None, X_names: tuple[str, ...], newdata: pl.DataFrame) -> NDArray[np.float64]

Build design matrix X for new data.

Uses the stored FormulaSpec to properly handle factors, transformations (log, poly, center), and interactions. This ensures new data is encoded consistently with the training data.

When no FormulaSpec is available (e.g. simulation-only workflows), falls back to manual column stacking.

Parameters:

Name	Type	Description	Default
`formula_spec`	`FormulaSpec \| None`	Encoding state from `build_design_matrices()`, or `None` for the fallback path.	required
`X_names`	`tuple[str, ...]`	Column names of the training design matrix.	required
`newdata`	`DataFrame`	New data for prediction as a Polars DataFrame.	required

Returns:

Type	Description
`NDArray[float64]`	Design matrix with same columns as training X, shape `(n_new, p)`.

build_re_covariates¶

build_re_covariates(newdata: pl.DataFrame, factor_names: list[str], valid_indices: NDArray[np.intp], formula_spec: FormulaSpec | None, X_names: tuple[str, ...]) -> NDArray[np.float64]

Build random effects covariate matrix for valid newdata rows.

For each random effect term (e.g. Intercept, slope variable), extracts the appropriate covariate values from newdata for the valid rows.

Parameters:

Name	Type	Description	Default
`newdata`	`DataFrame`	New data DataFrame.	required
`factor_names`	`list[str]`	Names of random effect terms for this grouping factor (e.g. `["Intercept", "x"]`).	required
`valid_indices`	`NDArray[intp]`	Integer indices of valid (non-NA) rows in newdata.	required
`formula_spec`	`FormulaSpec \| None`	FormulaSpec for proper design matrix encoding, or `None` for the fallback path.	required
`X_names`	`tuple[str, ...]`	Column names from the training design matrix.	required

Returns:

Type	Description
`NDArray[float64]`	Array of shape `(n_valid, n_re)` with covariate values.

compute_predictions¶

compute_predictions(spec: ModelSpec, bundle: DataBundle, fit: FitState, formula_spec: FormulaSpec | None, training_data: pl.DataFrame | None, newdata: pl.DataFrame | None, pred_type: Literal['response', 'link'], *, varying: Literal['exclude', 'include'] = 'exclude', allow_new_levels: bool = False) -> PredictionState

Compute predictions for given data.

For training data (newdata=None), returns fitted values directly. For new data, builds the design matrix, computes the linear predictor, optionally adds random effects, and applies the inverse link function.

Parameters:

Name	Type	Description	Default
`spec`	`ModelSpec`	Model specification (family, link, etc.).	required
`bundle`	`DataBundle`	Training data bundle (X, X_names, rank_info, re_metadata).	required
`fit`	`FitState`	Fitted state (coefficients, theta, u, fitted values).	required
`formula_spec`	`FormulaSpec \| None`	FormulaSpec for encoding new data, or `None`.	required
`training_data`	`DataFrame \| None`	Original training DataFrame for group-level mapping, or `None`.	required
`newdata`	`DataFrame \| None`	Data for prediction. If `None`, uses training data fitted values.	required
`pred_type`	`Literal['response', 'link']`	Prediction scale (`"response"` or `"link"`).	required
`varying`	`Literal['exclude', 'include']`	How to handle random effects for mixed models. `"exclude"` for population-level, `"include"` for conditional predictions with BLUPs.	`'exclude'`
`allow_new_levels`	`bool`	If `True`, new groups predict at population level. If `False`, raises ValueError for unseen groups.	`False`

Returns:

Type	Description
`PredictionState`	PredictionState with fitted values and optional link-scale values.
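
The new-data path described above (design matrix, linear predictor, optional random effects, inverse link) can be illustrated with a minimal NumPy sketch. The helper `sketch_predict` and the `logistic` inverse link are hypothetical names for illustration only; they are not part of bossanova, and the real function works on `ModelSpec`, `DataBundle`, and `FitState` containers rather than raw arrays.

```python
import numpy as np


def sketch_predict(X_new, coef, re_contribution=None, inverse_link=None,
                   pred_type="response"):
    """Hypothetical stand-in for the new-data prediction steps."""
    # Linear predictor on the link scale.
    eta = X_new @ coef
    # Conditional predictions add the per-row random effects contribution.
    if re_contribution is not None:
        eta = eta + re_contribution
    # "link" returns eta as-is; "response" applies the inverse link.
    if pred_type == "link" or inverse_link is None:
        return eta
    return inverse_link(eta)


def logistic(eta):
    """Inverse of the logit link (assumed here as an example family)."""
    return 1.0 / (1.0 + np.exp(-eta))


X_new = np.array([[1.0, 0.0], [1.0, 2.0]])  # intercept + one predictor
coef = np.array([0.0, 1.0])

link_pred = sketch_predict(X_new, coef, pred_type="link")
resp_pred = sketch_predict(X_new, coef, inverse_link=logistic)
```

Here `link_pred` holds the linear predictor and `resp_pred` the same values mapped through the inverse link, which mirrors the `pred_type` switch.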

compute_re_contribution¶

compute_re_contribution(re_meta: REInfo, theta: NDArray[np.floating], u: NDArray[np.floating], training_data: pl.DataFrame | None, newdata: pl.DataFrame, valid_mask: NDArray[np.bool_], allow_new_levels: bool, formula_spec: FormulaSpec | None, X_names: tuple[str, ...]) -> NDArray[np.float64]

Compute random effects contribution for new data predictions.

For each valid observation in newdata, maps group labels to trained group indices and computes the BLUP contribution as the dot product of the random effects design row and the group’s estimated BLUPs.

Parameters:

Name	Type	Description	Default
`re_meta`	`REInfo`	Random effects metadata from the DataBundle.	required
`theta`	`NDArray[floating]`	Variance parameters (relative scale) from FitState.	required
`u`	`NDArray[floating]`	Spherical random effects from FitState.	required
`training_data`	`DataFrame \| None`	Original training data (Polars DataFrame) for extracting known group levels, or `None`.	required
`newdata`	`DataFrame`	New data with grouping columns.	required
`valid_mask`	`NDArray[bool_]`	Boolean mask of valid (non-NA) rows in the design matrix, shape `(n,)`.	required
`allow_new_levels`	`bool`	If `True`, new groups get 0 RE contribution (population-level prediction). If `False`, raises ValueError.	required
`formula_spec`	`FormulaSpec \| None`	FormulaSpec for proper design matrix encoding (passed through to `build_re_covariates`).	required
`X_names`	`tuple[str, ...]`	Column names of the training design matrix (passed through to `build_re_covariates`).	required

Returns:

Type	Description
`NDArray[float64]`	Array of RE contributions for valid rows only, shape `(n_valid,)`.
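
As a rough illustration of the dot-product step, the sketch below maps each new row's group label to that group's BLUP vector and multiplies by the row of random effects covariates. `sketch_re_contribution` and the plain dict of BLUPs are assumptions made for this example; the library derives BLUPs from `theta` and `u` and reads group labels from Polars DataFrames.

```python
import numpy as np


def sketch_re_contribution(groups_new, Z_rows, blups_by_group, allow_new_levels):
    """Hypothetical helper: per-row RE contribution for one grouping factor."""
    out = np.zeros(len(groups_new))
    for i, group in enumerate(groups_new):
        if group in blups_by_group:
            # Dot product of the RE design row and the group's BLUPs.
            out[i] = Z_rows[i] @ blups_by_group[group]
        elif not allow_new_levels:
            raise ValueError(f"Unseen group level: {group!r}")
        # Otherwise the unseen group keeps a contribution of 0,
        # i.e. a population-level prediction.
    return out


# Two trained groups with (Intercept, x) BLUPs; "s9" was never seen.
blups = {"s1": np.array([1.0, 0.5]), "s2": np.array([-1.0, 0.0])}
Z_rows = np.array([[1.0, 2.0], [1.0, 3.0], [1.0, 4.0]])

contrib = sketch_re_contribution(["s1", "s2", "s9"], Z_rows, blups,
                                 allow_new_levels=True)
```

With `allow_new_levels=False`, the same call would raise `ValueError` on the `"s9"` row instead of returning 0 for it.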

resolve_coef_for_predict¶

resolve_coef_for_predict(coef: NDArray[np.floating], rank_info: RankInfo | None) -> NDArray[np.floating]

Coefficients safe for matrix multiplication (NaN -> 0).

When rank-deficient columns produce NaN coefficients, those NaN values must be zeroed out for X @ coef to work correctly. The dropped columns contribute nothing to predictions.

Parameters:

Name	Type	Description	Default
`coef`	`NDArray[floating]`	Coefficient array, shape `(p,)`.	required
`rank_info`	`RankInfo \| None`	Rank deficiency info from the DataBundle, or `None` if the design is full rank.	required

Returns:

Type	Description
`NDArray[floating]`	Coefficient array with NaN replaced by 0 when the design is rank-deficient, or the original array unchanged.
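
The reason for zeroing becomes clear with a small example: a single NaN coefficient poisons every prediction, because NaN propagates through the matrix product. This is only a sketch of the idea using `np.where`; how the function consults `rank_info` is not shown here.

```python
import numpy as np

# Second column was dropped as rank-deficient, so its coefficient is NaN.
coef = np.array([1.5, np.nan, -0.5])
X = np.array([[1.0, 2.0, 3.0],
              [1.0, 0.0, 1.0]])

# Without the fix every row is NaN, even where X[:, 1] == 0 (0 * NaN is NaN).
naive = X @ coef

# Zero out NaN so the dropped column contributes nothing.
safe_coef = np.where(np.isnan(coef), 0.0, coef)
pred = X @ safe_coef
```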

validate_newdata_groups¶

validate_newdata_groups(re_meta: REInfo, training_data: pl.DataFrame | None, newdata: pl.DataFrame, allow_new_levels: bool) -> None

Validate grouping columns and levels in new data for mixed models.

Ensures that all grouping variables exist in newdata and that group levels are a subset of those seen during training (unless allow_new_levels is True).

Called from compute_predictions when varying="include" for mixed models, ensuring group structure is valid before attempting to compute BLUP contributions.

Parameters:

Name	Type	Description	Default
`re_meta`	`REInfo`	Random effects metadata from the DataBundle.	required
`training_data`	`DataFrame \| None`	Original training DataFrame for extracting known group levels, or `None`.	required
`newdata`	`DataFrame`	New data for prediction as a Polars DataFrame.	required
`allow_new_levels`	`bool`	If `True`, skip the unseen-level check.	required
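
The two checks (grouping column present, levels seen during training) can be sketched with plain Python sets. In this illustration, dicts of lists stand in for the Polars DataFrames, and `sketch_validate_groups` is a hypothetical name; the actual error messages are not documented here and are made up.

```python
def sketch_validate_groups(group_vars, training_levels, newdata_columns,
                           allow_new_levels):
    """Hypothetical validation of grouping columns and levels."""
    for var in group_vars:
        # 1. Every grouping variable must exist in the new data.
        if var not in newdata_columns:
            raise ValueError(f"Grouping variable {var!r} is missing from newdata")
        # 2. Unless new levels are allowed, levels must be a subset of training.
        if allow_new_levels:
            continue
        unseen = set(newdata_columns[var]) - set(training_levels[var])
        if unseen:
            raise ValueError(f"Unseen levels for {var!r}: {sorted(unseen)}")


training_levels = {"subject": ["s1", "s2", "s3"]}
newdata_columns = {"subject": ["s1", "s9"], "x": [0.1, 0.2]}

# Passes: the unseen-level check is skipped.
sketch_validate_groups(["subject"], training_levels, newdata_columns,
                       allow_new_levels=True)
```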

varying¶

Varying parameter extraction for mixed-effects models.

Extracts BLUP (Best Linear Unbiased Predictor) computation and variance component decomposition from the model class into pure functions on containers. These operations convert fitted spherical random effects into interpretable group-level parameters.

per_factor_re_info: Split global RE metadata into per-factor structures.

compute_varying_state: Compute BLUPs from theta and u via the Lambda matrix.

compute_varying_spread_state: Extract variance components (tau², rho, ICC).

Functions:

Name	Description
`build_mixed_post_fit_state`	Compute BLUPs, variance components, and emit convergence warnings.
`compute_varying_spread_state`	Compute VaryingSpreadState (variance components) from theta parameters.
`compute_varying_state`	Compute VaryingState (BLUPs) from fitted random effects parameters.
`per_factor_re_info`	Split global RE metadata into per-factor structures and names.

Functions¶

build_mixed_post_fit_state¶

build_mixed_post_fit_state(fit: FitState, bundle: DataBundle, data: pl.DataFrame, *, stacklevel: int = 3) -> tuple[VaryingState | None, VaryingSpreadState | None]

Compute BLUPs, variance components, and emit convergence warnings.

Orchestrates the post-fit assembly for mixed-effects models: computes VaryingState (BLUPs) and VaryingSpreadState (variance components) from the fitted parameters, then checks for convergence issues.

Parameters:

Name	Type	Description	Default
`fit`	`FitState`	Fitted model state containing theta, u, sigma.	required
`bundle`	`DataBundle`	Data bundle with RE metadata and valid mask.	required
`data`	`DataFrame`	Original training data (used for group level labels).	required
`stacklevel`	`int`	Warning stacklevel for convergence warnings. Default 3 accounts for: user → `model.fit()` → `build_mixed_post_fit_state()`.	`3`

Returns:

Type	Description
`tuple[VaryingState \| None, VaryingSpreadState \| None]`	A tuple `(varying_offsets, varying_spread)` where either may be `None` if the required fitted parameters are missing.

compute_varying_spread_state¶

compute_varying_spread_state(theta: NDArray[np.floating], sigma: float, re_meta: REInfo, *, X: NDArray[np.floating] | None = None, X_names: tuple[str, ...] | None = None) -> VaryingSpreadState

Compute VaryingSpreadState (variance components) from theta parameters.

Extracts residual variance (sigma²), random effect variances (tau²), correlations (rho), and intraclass correlation (ICC) from the fitted theta vector using the random effects structure.

Parameters:

Name	Type	Description	Default
`theta`	`NDArray[floating]`	Variance component parameters from the fitted model.	required
`sigma`	`float`	Residual standard deviation from the fitted model.	required
`re_meta`	`REInfo`	Random effects metadata (grouping vars, structure, etc.).	required
`X`	`NDArray[floating] \| None`	Fixed-effects design matrix (n × p), used for ICC computation.	`None`
`X_names`	`tuple[str, ...] \| None`	Column names for `X`, used to look up slope predictors.	`None`

Returns:

Type	Description
`VaryingSpreadState`	VaryingSpreadState container with components DataFrame and decomposed variance quantities.
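
For intuition, here is a minimal sketch of how variance components can be recovered from `theta` for a single grouping factor. It assumes an lme4-style parameterization, in which `theta` fills the lower triangle of a relative Cholesky factor Lambda in column-major order and the random effects covariance is sigma² · Lambda · Lambdaᵀ. Whether bossanova uses exactly this layout is an assumption, and `sketch_variance_components` is a hypothetical name. The ICC shown is the textbook random-intercept definition, not necessarily the library's formula (which also takes `X` into account).

```python
import numpy as np


def sketch_variance_components(theta, sigma, n_re):
    """Hypothetical helper: (tau2, rho) for one grouping factor."""
    # Fill the lower triangle of Lambda column by column (assumed layout).
    lam = np.zeros((n_re, n_re))
    rows, cols = np.tril_indices(n_re)
    order = np.lexsort((rows, cols))  # sort by column, then by row
    lam[rows[order], cols[order]] = theta
    # Covariance of the random effects on the response scale.
    cov = sigma**2 * (lam @ lam.T)
    tau2 = np.diag(cov)
    # Correlations; this sketch does not guard against zero variances
    # (singular fits), which would need special handling.
    sd = np.sqrt(tau2)
    rho = cov / np.outer(sd, sd)
    return tau2, rho


# Random intercept and slope: theta has 3 entries for a 2 x 2 Lambda.
tau2, rho = sketch_variance_components(np.array([1.0, 0.5, 0.5]), 2.0, 2)

# Random-intercept-only model: ICC = tau2 / (tau2 + sigma2).
tau2_int, _ = sketch_variance_components(np.array([1.0]), 2.0, 1)
icc = tau2_int[0] / (tau2_int[0] + 2.0**2)
```

In this example `tau2` is `[4.0, 2.0]` and `icc` is `0.5`, since the intercept variance equals the residual variance.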

compute_varying_state¶

compute_varying_state(theta: NDArray[np.floating], u: NDArray[np.floating], re_meta: REInfo, data: pl.DataFrame | None = None) -> VaryingState

Compute VaryingState (BLUPs) from fitted random effects parameters.

Parameters:

Name	Type	Description	Default
`theta`	`NDArray[floating]`	Variance component parameters from the fitted model.	required
`u`	`NDArray[floating]`	Spherical random effects vector from the fitted model.	required
`re_meta`	`REInfo`	Random effects metadata (grouping vars, structure, etc.).	required
`data`	`DataFrame \| None`	Original training data, used to extract unique group levels. If None, levels are labeled `"0"`, `"1"`, etc.	`None`

Returns:

Type	Description
`VaryingState`	VaryingState container with grid, effects dict, and group info.
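
The conversion from spherical random effects to BLUPs can be sketched as b = Lambda · u, applied per group. The example below assumes a single grouping factor, an lme4-style column-major fill of Lambda from `theta`, and a group-major ordering of `u` (all effects for the first group, then the second, and so on). These layout choices and the name `sketch_blups` are assumptions for illustration, not documented behavior.

```python
import numpy as np


def sketch_blups(theta, u, n_groups, n_re):
    """Hypothetical helper: BLUPs with shape (n_groups, n_re)."""
    # Build the relative Cholesky factor Lambda from theta (assumed layout).
    lam = np.zeros((n_re, n_re))
    rows, cols = np.tril_indices(n_re)
    order = np.lexsort((rows, cols))  # column-major order of the lower triangle
    lam[rows[order], cols[order]] = theta
    # One row of spherical effects per group (assumed group-major ordering).
    U = np.asarray(u, dtype=float).reshape(n_groups, n_re)
    # Row-wise version of b_g = Lambda @ u_g.
    return U @ lam.T


theta = np.array([1.0, 0.5, 0.5])
u = np.array([1.0, 0.0, 0.0, 2.0])  # two groups, (Intercept, x) each
blups = sketch_blups(theta, u, n_groups=2, n_re=2)
```

Each row of `blups` is one group's offsets from the population-level coefficients, which is the kind of per-group information a VaryingState exposes through its effects.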

per_factor_re_info¶

per_factor_re_info(re_meta: REInfo, group_names: list[str]) -> tuple[str | list[str], list[str] | dict[str, list[str]]]

Split global RE metadata into per-factor structures and names.

For single-factor models, returns the originals unchanged.

Parameters:

Name	Type	Description	Default
`re_meta`	`REInfo`	Random effects metadata from the fitted model’s DataBundle.	required
`group_names`	`list[str]`	Ordered list of grouping variable names (e.g. `["subject"]` or `["subject", "item"]`).	required

Returns:

Type	Description
`tuple[str \| list[str], list[str] \| dict[str, list[str]]]`	A tuple `(re_structure, random_names)`. For single-factor models: `(str, list[str])`, the originals. For multi-factor models: `(list[str], dict[str, list[str]])`, a per-factor structure list and a dict mapping each group name to its random effect names.