Model comparison: information criteria, likelihood ratio tests, and composite tables.
Call chain:
bossanova.compare(m1, m2, ...) -> compare() (composite AIC/BIC/loglik table)
bossanova.lrt(m1, m2) -> lrt() (nested model likelihood ratio test)
Functions:
| Name | Description |
|---|---|
compare_aic | Compare models by AIC with delta-AIC and Akaike weights. |
compare_bic | Compare models by BIC with delta-BIC and Schwarz weights. |
Modules:
| Name | Description |
|---|---|
compare | Model comparison utilities for nested model testing. |
cv | Cross-validation comparison with Nadeau-Bengio corrected t-test. |
deviance | Deviance test implementation for comparing nested GLM models. |
f_test | F-test implementation for comparing nested lm models. |
helpers | Shared helpers for model comparison. |
ic | Information criterion comparison for model selection. |
lrt | Likelihood ratio test for comparing nested mixed models. |
lrt_compare | Likelihood ratio test implementation for comparing nested mixed models. |
refit | REML-to-ML refit helper for mixed model comparison. |
Functions¶
compare_aic¶
compare_aic(models: list[Any]) -> pl.DataFrame
Compare models by AIC with delta-AIC and Akaike weights.
Models are sorted by AIC (best first). Delta-AIC is the difference from the best model. Akaike weights represent the relative likelihood of each model being the best, following Burnham & Anderson (2002):
w_i = exp(-0.5 * delta_i) / sum(exp(-0.5 * delta_j))
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
models | list[Any] | List of fitted model objects (at least 2). | required |
Returns:
| Type | Description |
|---|---|
DataFrame | DataFrame with columns: model, npar, loglik, deviance, AIC, BIC, delta_AIC, weight. Sorted by AIC ascending (best model first). |
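The weight formula above can be checked by hand. The following is a minimal sketch in plain Python, not the library's implementation; the helper name `akaike_weights` and the AIC values are hypothetical and chosen only for illustration.

```python
import math

def akaike_weights(aic_values):
    """Return (delta, weight) pairs for a list of AIC values (hypothetical helper)."""
    best = min(aic_values)
    deltas = [a - best for a in aic_values]          # delta_i = AIC_i - AIC_min
    rel_lik = [math.exp(-0.5 * d) for d in deltas]   # unnormalized relative likelihoods
    total = sum(rel_lik)
    return [(d, r / total) for d, r in zip(deltas, rel_lik)]

# Illustrative AIC values, not output from a real fit
aics = [100.0, 102.0, 110.0]
for aic, (delta, weight) in zip(aics, akaike_weights(aics)):
    print(f"AIC={aic:.1f}  delta={delta:.1f}  weight={weight:.3f}")
```

The weights always sum to 1, and the best model has a delta of 0. Passing BIC values instead of AIC values to the same function would give the Schwarz weights described for compare_bic.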
compare_bic¶
compare_bic(models: list[Any]) -> pl.DataFrame
Compare models by BIC with delta-BIC and Schwarz weights.
Models are sorted by BIC (best first). Delta-BIC is the difference from the best model. Schwarz weights use the same formula as Akaike weights but applied to BIC differences.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
models | list[Any] | List of fitted model objects (at least 2). | required |
Returns:
| Type | Description |
|---|---|
DataFrame | DataFrame with columns: model, npar, loglik, deviance, AIC, BIC, delta_BIC, weight. Sorted by BIC ascending (best model first). |
Modules¶
compare¶
Model comparison utilities for nested model testing.
This module provides the main compare() entry point that dispatches to
the appropriate comparison strategy based on model type:
- lm: F-test (f_test.py)
- glm: Deviance test (deviance.py)
- lmer/glmer: Likelihood ratio test (lrt_compare.py)
- Any: Cross-validation with Nadeau-Bengio correction (cv.py)
Examples:
from bossanova import model, compare, load_dataset
mtcars = load_dataset("mtcars")
compact = model("mpg ~ 1", mtcars).fit()
augmented = model("mpg ~ wt", mtcars).fit()
compare(compact, augmented)
# +───────────+──────────+─────────+────+─────────+─────────+──────────+───────+
# | model | df_resid | rss | df | ss | F | p_value | PRE |
# +───────────+──────────+─────────+────+─────────+─────────+──────────+───────+
# | mpg ~ 1 | 31 | 1126.05 | | | | | |
# | mpg ~ wt | 30 | 278.32 | 1 | 847.73 | 91.375 | 1.29e-10 | 0.753 |
# +───────────+──────────+─────────+────+─────────+─────────+──────────+───────+
Notes: For single-parameter comparisons, F = t^2, where t is the t-statistic for the added parameter. This identity is verified in the parity tests.
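The F and PRE values in the table above can be reproduced with simple arithmetic. This sketch assumes the usual definitions PRE = ss / rss_compact and F = (ss / df) / (rss_augmented / df_resid_augmented); the variable names are illustrative, and the inputs are the rounded values printed in the table.

```python
# Rounded values copied from the example table above
rss_compact, df_resid_compact = 1126.05, 31      # mpg ~ 1
rss_augmented, df_resid_augmented = 278.32, 30   # mpg ~ wt

ss = rss_compact - rss_augmented                 # sum of squares explained
df = df_resid_compact - df_resid_augmented       # number of parameters added
pre = ss / rss_compact                           # proportional reduction in error
f_stat = (ss / df) / (rss_augmented / df_resid_augmented)

print(f"ss={ss:.2f} df={df} PRE={pre:.3f} F={f_stat:.2f}")
```

Because the inputs are rounded, the results should agree with the table only up to rounding.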
Functions:
| Name | Description |
|---|---|
compare | Compare nested statistical models. |
Functions¶
compare¶
compare(*models: Any, method: str = 'auto', sort: bool = True, test: str = 'chisq', refit: bool = False, cv: int | str = 5, seed: int | None = None, metric: str = 'mse', digits: int | None = None, holdout_group: str | None = None) -> pl.DataFrame
Compare nested statistical models.
Performs sequential hypothesis tests comparing nested models. The appropriate test method is inferred from model type:
- lm: F-test (equivalent to R’s anova())
- glm: Deviance test (chi-squared)
- lmer/glmer: Likelihood ratio test
- cv: Cross-validation with Nadeau-Bengio corrected t-test
Cross-type comparisons (e.g., glm vs glmer) are supported when models share the same family. The fixed-only model is treated as nested within the mixed model via zero variance components.
| Model Pair | Hypothesis Test | AIC/BIC | CV |
|---|---|---|---|
| lm vs lm | F-test | Yes | Yes |
| glm vs glm | Deviance | Yes | Yes |
| lmer vs lmer | LRT | Yes | Yes |
| glmer vs glmer | LRT | Yes | Yes |
| lm vs lmer | LRT | Yes | Yes |
| glm vs glmer | LRT | Yes | Yes |
| Different families | — | Yes | Yes |
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
*models | Any | Two or more fitted model objects. | () |
method | str | Comparison method. Options: - “auto”: Infer from model type (default) - “f”: F-test (for lm) - “lrt”: Likelihood ratio test (for mixed models) - “deviance”: Deviance test (for glm) - “cv”: Cross-validation comparison (any model type) - “aic”: AIC comparison with delta-AIC and Akaike weights - “bic”: BIC comparison with delta-BIC and Schwarz weights | ‘auto’ |
sort | bool | If True, sort models by complexity before comparing. This ensures proper nesting order. Default True. | True |
refit | bool | If True and models are lmer/glmer with REML estimation, automatically refit with ML for valid LRT comparison. Original models are not mutated. Default False. Note: REML models are accepted without refitting when all models share the same fixed effects (valid for comparing random effects structures only). | False |
test | str | For GLM deviance comparisons only. Options: - “chisq”: Chi-squared test (default) - “f”: F-test (for quasi-families with estimated dispersion) | ‘chisq’ |
cv | int \| str | For CV comparison only. Number of folds or "loo" for leave-one-out. | 5 |
seed | int \| None | For CV comparison only. Random seed for reproducible splits. | None |
metric | str | For CV comparison only. Error metric ("mse", "rmse", "mae"). | 'mse' |
digits | int \| None | Number of significant figures for float columns. Default None uses the global setting from set_display_digits() (4 by default). Pass 0 to disable rounding entirely. | None |
holdout_group | str \| None | For CV comparison only. Column name to use for group-aware fold splitting; required for crossed random effects (see compare_cv). | None |
Returns:
| Type | Description |
|---|---|
DataFrame | DataFrame with comparison results. Columns depend on method: |
DataFrame | For F-test (lm): - model: Formula string - PRE: Proportional reduction in error - F: F-statistic - rss: Residual sum of squares - ss: Sum of squares explained - df: Degrees of freedom for comparison - df_resid: Residual degrees of freedom - p_value: p-value |
DataFrame | For deviance test (glm): - model: Formula string - chi2 (or F): Test statistic - dev_diff: Deviance reduction - deviance: Residual deviance - df: Degrees of freedom for comparison - df_resid: Residual degrees of freedom - p_value: p-value |
DataFrame | For LRT (lmer/glmer): - model: Formula string - chi2: Likelihood ratio chi-squared statistic - npar: Number of parameters - AIC: Akaike Information Criterion - BIC: Bayesian Information Criterion - loglik: Log-likelihood - deviance: Deviance (-2 * loglik) - df: Degrees of freedom for comparison - p_value: p-value |
DataFrame | For CV comparison: - model: Formula string - PRE: Proportional reduction in error - t_stat: Corrected t-statistic - cv_score: Mean CV error - cv_se: Standard error of CV error - diff: Difference from reference (first model) - diff_se: Nadeau-Bengio corrected standard error - p_value: Two-sided p-value |
DataFrame | For AIC/BIC comparison: - model: Formula string - npar: Number of estimated parameters - loglik: Log-likelihood - deviance: Deviance (-2 * loglik) - AIC/BIC: Information criteria - delta_AIC/delta_BIC: Difference from best model - weight: Akaike/Schwarz weight (model probability) |
Examples:
from bossanova import model, compare, load_dataset
mtcars = load_dataset("mtcars")
sleepstudy = load_dataset("sleepstudy")
# Models are auto-fitted if needed (calls .fit() with defaults)
compare(model("mpg ~ 1", mtcars), model("mpg ~ wt", mtcars))
# Equivalent to explicit .fit() calls:
compare(model("mpg ~ 1", mtcars).fit(), model("mpg ~ wt", mtcars).fit())
# Model objects are fitted in-place, so they're usable after compare()
compact = model("mpg ~ 1", mtcars)
full = model("mpg ~ wt", mtcars)
compare(compact, full)
full.params # Works - model was auto-fitted
# glm example (deviance test)
compare(
model("am ~ 1", mtcars, family="binomial"),
model("am ~ wt", mtcars, family="binomial"),
)
# lmer example (likelihood ratio test)
compare(
model("Reaction ~ Days + (1|Subject)", sleepstudy).fit(method="ML"),
model("Reaction ~ Days + (Days|Subject)", sleepstudy).fit(method="ML"),
)
# CV example (Nadeau-Bengio corrected)
compare(
model("mpg ~ 1", mtcars),
model("mpg ~ wt", mtcars),
method="cv", cv=5, seed=42,
)
Notes: For single-parameter comparisons in lm models, F = t^2, where t is the t-statistic for the added parameter. This identity is verified in the parity tests.
For GLM, the deviance difference follows a chi-squared distribution:
chi2 = deviance_compact - deviance_augmented
with df = df_compact - df_augmented degrees of freedom, where df_compact and df_augmented are the residual degrees of freedom of each model.
For mixed models (lmer/glmer), the LRT statistic is:
chi2 = 2 * (loglik_augmented - loglik_compact)
with df = npar_augmented - npar_compact degrees of freedom.
REML models are accepted when all models share the same fixed
effects (comparing random effects structures only). Otherwise,
ML estimation is required — use refit=True to auto-refit.
A warning is emitted when any model has singular (boundary) variance components, as the chi-squared p-values may be conservative. See Self & Liang (1987).
For CV comparison, the Nadeau-Bengio correction accounts for overlapping training sets in k-fold CV:
var_corrected = var(diff) * (1/k + n_test/n_train)
This prevents the underestimation of variance that occurs with naive paired t-tests on CV folds.
See Also:
lrt: Likelihood ratio test for mixed models
cv¶
Cross-validation comparison with Nadeau-Bengio corrected t-test.
Compares models using k-fold (or LOO) cross-validation with the corrected variance estimator from Nadeau & Bengio (2003) to account for overlapping training sets.
Supports group-aware splitting via holdout_group to prevent data leakage
from random-effects structure.
Functions:
| Name | Description |
|---|---|
compare_cv | Compare models using cross-validation with Nadeau-Bengio corrected t-test. |
compute_cv_fold_score | Compute CV score for a single fold. |
Functions¶
compare_cv¶
compare_cv(models: list[Any], cv: int | str = 5, seed: int | None = None, metric: str = 'mse', holdout_group: str | None = None) -> pl.DataFrame
Compare models using cross-validation with Nadeau-Bengio corrected t-test.
Uses the corrected variance estimator from Nadeau & Bengio (2003) to account for the non-independence of CV folds (overlapping training sets).
When models have random effects, uses group-aware fold splitting to prevent
data leakage. For crossed random effects, holdout_group must be specified.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
models | list[Any] | List of fitted models to compare. | required |
cv | int \| str | Number of folds, or "loo" for leave-one-out. | 5 |
seed | int \| None | Random seed for reproducible fold splits. | None |
metric | str | Error metric ("mse", "rmse", "mae"). | 'mse' |
holdout_group | str \| None | Column name to use for group-aware fold splitting. Required for crossed random effects. Can reference any column in the model’s data, even if not in the formula. | None |
Returns:
| Type | Description |
|---|---|
DataFrame | DataFrame with columns: model, PRE, t_stat, cv_score, cv_se, diff, diff_se, p_value. |
Notes: The Nadeau-Bengio correction accounts for the fact that k-fold CV training sets overlap by approximately (k-2)/(k-1) fraction.
With group-aware splits, fold sizes may be unequal. The correction uses the mean n_test/n_train ratio across folds.
Nadeau & Bengio (2003) “Inference for the Generalization Error”
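To make the size of the correction concrete, here is a minimal sketch in plain Python. It is not the library's code: `corrected_t` is a hypothetical helper, and the per-fold differences are made-up numbers. It applies var_corrected = var(diff) * (1/k + n_test/n_train) and compares the result with a naive paired t-statistic that treats folds as independent.

```python
import math
import statistics

def corrected_t(diffs, n_train, n_test):
    """Return (naive t, corrected t) for per-fold score differences (hypothetical helper)."""
    k = len(diffs)
    var_diff = statistics.variance(diffs)                  # sample variance across folds
    var_naive = var_diff / k                               # assumes independent folds
    var_corrected = var_diff * (1 / k + n_test / n_train)  # Nadeau-Bengio correction
    mean_diff = statistics.mean(diffs)
    return mean_diff / math.sqrt(var_naive), mean_diff / math.sqrt(var_corrected)

# Illustrative per-fold MSE differences between two models (5 folds, n = 100)
diffs = [0.8, 1.1, 0.6, 1.3, 0.9]
t_naive, t_corrected = corrected_t(diffs, n_train=80, n_test=20)
print(f"naive t = {t_naive:.3f}, corrected t = {t_corrected:.3f}")
```

With equal-sized folds, n_test/n_train = 1/(k-1), so for k = 5 the corrected standard error is 1.5 times the naive one and the t-statistic shrinks by the same factor.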
compute_cv_fold_score¶
compute_cv_fold_score(model: Any, train_idx: np.ndarray, test_idx: np.ndarray, metric: str) -> float
Compute CV score for a single fold.
Refits the model on training data and evaluates on test data.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
model | Any | A fitted model (lm or glm) to clone and refit. | required |
train_idx | ndarray | Training set indices. | required |
test_idx | ndarray | Test set indices. | required |
metric | str | Error metric (“mse”, “rmse”, “mae”). | required |
Returns:
| Type | Description |
|---|---|
float | Test set error for this fold. |
deviance¶
Deviance test implementation for comparing nested GLM models.
Performs sequential deviance tests matching R’s
anova(glm1, glm2, test="Chisq") behavior.
Functions:
| Name | Description |
|---|---|
compare_deviance | Perform deviance tests for nested glm models. |
Functions¶
compare_deviance¶
compare_deviance(models: list[Any], test: str = 'chisq') -> pl.DataFrame
Perform deviance tests for nested glm models.
Computes deviance differences comparing each model to the previous (simpler) model in the sequence. This matches R’s anova(glm1, glm2, test="Chisq").
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
models | list[Any] | List of fitted glm models, sorted by complexity (simplest first). | required |
test | str | Test type: “chisq” for chi-squared test (default), “f” for F-test. | ‘chisq’ |
Returns:
| Type | Description |
|---|---|
DataFrame | With test="chisq", DataFrame with columns: - model: Formula string - chi2: Chi-squared statistic (deviance difference) - dev_diff: Deviance difference (reduction) - deviance: Residual deviance - df: Degrees of freedom for this comparison - df_resid: Residual degrees of freedom - p_value: p-value from chi-squared distribution |
DataFrame | With test="f", DataFrame with columns: - model: Formula string - F: F-statistic - dev_diff: Deviance difference (reduction) - deviance: Residual deviance - df: Degrees of freedom for this comparison - df_resid: Residual degrees of freedom - p_value: p-value from F distribution |
Notes: The deviance difference follows a chi-squared distribution with df degrees of freedom under the null hypothesis that the compact model is adequate.
For quasi-families (where dispersion is estimated), the F-test is more appropriate and accounts for overdispersion.
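As a worked illustration of the chi-squared variant, the sketch below computes the test statistic and p-value for a single added parameter. The deviances are illustrative inputs rather than output from the library, and the closed form used for the p-value holds only for df = 1.

```python
import math

# Illustrative residual deviances and residual df for two nested binomial GLMs
deviance_compact, df_resid_compact = 43.86, 31
deviance_augmented, df_resid_augmented = 19.18, 30

dev_diff = deviance_compact - deviance_augmented   # chi2 statistic
df = df_resid_compact - df_resid_augmented         # parameters added

# For df = 1 the chi-squared upper tail has the closed form erfc(sqrt(x / 2))
assert df == 1
p_value = math.erfc(math.sqrt(dev_diff / 2))
print(f"chi2={dev_diff:.2f} df={df} p={p_value:.3g}")
```

For comparisons that add more than one parameter, a general chi-squared survival function (for example from a statistics library) would be needed in place of the erfc shortcut.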
f_test¶
F-test implementation for comparing nested lm models.
Performs sequential F-tests matching R’s anova(m1, m2, ...) behavior.
Functions:
| Name | Description |
|---|---|
compare_f_test | Perform sequential F-tests for nested lm models. |
compute_rss | Compute residual sum of squares for a model. |
Functions¶
compare_f_test¶
compare_f_test(models: list[Any]) -> pl.DataFrame
Perform sequential F-tests for nested lm models.
Computes F-statistics comparing each model to the previous (simpler) model in the sequence. This matches R’s anova(m1, m2, ...) behavior.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
models | list[Any] | List of fitted lm models, sorted by complexity (simplest first). | required |
Returns:
| Type | Description |
|---|---|
DataFrame | DataFrame with columns: - model: Formula string - PRE: Proportional reduction in error - F: F-statistic - rss: Residual sum of squares - ss: Sum of squares explained (RSS_compact - RSS_augmented) - df: Degrees of freedom for this comparison (df_compact - df_augmented) - df_resid: Residual degrees of freedom - p_value: p-value from F distribution |
Notes: The F-statistic is computed as:
F = (SS / df) / (RSS_augmented / df_resid_augmented)
For a single-parameter comparison, F = t^2 where t is the t-statistic for the added parameter in the augmented model.
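The F = t^2 identity can be verified numerically without any library. The sketch below fits a simple regression by closed-form least squares on a small made-up data set, then computes the sequential F-statistic and the slope's t-statistic independently. All names and data are illustrative.

```python
import math

# Small made-up data set (illustrative only)
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1, 11.7]
n = len(x)

x_bar, y_bar = sum(x) / n, sum(y) / n
sxx = sum((xi - x_bar) ** 2 for xi in x)
sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))

# Augmented model y ~ x: least-squares slope and intercept
slope = sxy / sxx
intercept = y_bar - slope * x_bar
rss_augmented = sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))
df_resid_augmented = n - 2

# Compact model y ~ 1: the prediction is the mean of y
rss_compact = sum((yi - y_bar) ** 2 for yi in y)

# Sequential F-test for the single added parameter (df = 1)
ss = rss_compact - rss_augmented
f_stat = (ss / 1) / (rss_augmented / df_resid_augmented)

# t-statistic for the slope in the augmented model
sigma2 = rss_augmented / df_resid_augmented
t_stat = slope / math.sqrt(sigma2 / sxx)

print(f"F = {f_stat:.6f}, t^2 = {t_stat ** 2:.6f}")
```

The two printed values agree up to floating-point error, because ss equals slope^2 * sxx for a simple regression.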
compute_rss¶
compute_rss(model: Any) -> float
Compute residual sum of squares for a model.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
model | Any | Fitted lm model. | required |
Returns:
| Type | Description |
|---|---|
float | RSS = sum(residuals^2). |
helpers¶
Shared helpers for model comparison.
Provides validation, sorting, nesting checks, and method inference used across all comparison strategies (F-test, deviance, LRT, CV).
Functions:
| Name | Description |
|---|---|
check_nested | Check that models appear to be nested and raise error if not. |
get_model_family_group | Get the family group for cross-type compatibility checking. |
get_model_type | Get the model type string for a fitted model. |
resolve_method | Infer the appropriate comparison method for a model type. |
sort_models_by_complexity | Sort models from simplest (fewest parameters) to most complex. |
validate_models | Validate that models can be compared, auto-fitting if needed. |
Functions¶
check_nested¶
check_nested(df_diffs: list[float | int | None], stat_diffs: list[float | None], model_formulas: list[str], method: str) -> None
Check that models appear to be nested and raise an error if not.
Non-nested models are detected by:
- df <= 0: No additional parameters in augmented model
- stat_diff < 0: Augmented model fits worse (negative SS, deviance reduction, or chi2)
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
df_diffs | list[float \| int \| None] | List of df differences for each comparison (None for first model). | required |
stat_diffs | list[float \| None] | List of statistic differences (SS, dev_diff, or chi2). | required |
model_formulas | list[str] | List of model formula strings. | required |
method | str | Comparison method (“f”, “deviance”, or “lrt”). | required |
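The two detection rules can be sketched as follows. This is an illustration under assumptions, not the library's implementation: the function name `check_nested_sketch`, the choice of ValueError, and the message wording are all hypothetical.

```python
def check_nested_sketch(df_diffs, stat_diffs, model_formulas, method):
    """Raise ValueError when a comparison looks non-nested (illustrative sketch)."""
    for i, (df, stat) in enumerate(zip(df_diffs, stat_diffs)):
        if df is None or stat is None:
            continue  # the first model has nothing to compare against
        if df <= 0:
            raise ValueError(
                f"{method}: '{model_formulas[i]}' adds no parameters over "
                f"'{model_formulas[i - 1]}' (df={df})"
            )
        if stat < 0:
            raise ValueError(
                f"{method}: '{model_formulas[i]}' fits worse than "
                f"'{model_formulas[i - 1]}' (statistic={stat})"
            )

# Passes silently: one added parameter and a positive sum of squares
check_nested_sketch([None, 1], [None, 847.73], ["mpg ~ 1", "mpg ~ wt"], "f")
```

Both rules are heuristics: they catch comparisons that cannot be nested, but passing them does not prove that one model is a special case of the other.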
get_model_family_group¶
get_model_family_group(model: Any) -> str
Get the family group for cross-type compatibility checking.
Models in the same family group can be compared via LRT because the fixed-only model is nested within the mixed model (zero variance components). Models in different family groups have incompatible likelihoods and cannot be compared via hypothesis tests.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
model | Any | A fitted model instance. | required |
Returns:
| Type | Description |
|---|---|
str | Family group string: 'gaussian' for lm/lmer, or the family name (e.g. 'binomial', 'poisson', 'gamma') for glm/glmer. |
get_model_type¶
get_model_type(model: Any) -> str
Get the model type string for a fitted model.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
model | Any | A fitted model instance. | required |
Returns:
| Type | Description |
|---|---|
str | Model type: ‘lm’, ‘glm’, ‘lmer’, or ‘glmer’. |
resolve_method¶
resolve_method(model: Any, *, model_types: set[str] | None = None) -> str
Infer the appropriate comparison method for a model type.
For same-type comparisons, reads model._model_type to determine
the method. For cross-type comparisons (lm+lmer or glm+glmer),
returns "lrt" since the fixed-only model is nested within the
mixed model via zero variance components.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
model | Any | A fitted model object with a _model_type attribute (one of "lm", "glm", "lmer", "glmer"). | required |
model_types | set[str] \| None | Set of all model types being compared. When provided, enables cross-type method resolution. | None |
Returns:
| Type | Description |
|---|---|
str | Comparison method: "f" for lm, "lrt" for lmer/glmer (and cross-type lm+lmer or glm+glmer), "deviance" for glm. |
sort_models_by_complexity¶
sort_models_by_complexity(models: tuple[Any, ...]) -> list[Any]
Sort models from simplest (fewest parameters) to most complex.
This matches R’s anova() behavior which orders models by df.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
models | tuple[Any, ...] | Tuple of fitted model objects. | required |
Returns:
| Type | Description |
|---|---|
list[Any] | List of models sorted by number of parameters (ascending). |
validate_models¶
validate_models(models: tuple[Any, ...], method: str = 'auto', refit: bool = False) -> tuple[Any, ...]
Validate that models can be compared, auto-fitting if needed.
Checks:
- At least 2 models provided
- All models are fitted (auto-fits unfitted models)
- All models are the same type
- All models have the same response variable
- All models have the same number of observations
- For LRT: All models use ML estimation (not REML), unless refit=True or all models share the same fixed effects (REML LRT is valid for comparing random effects structures with identical fixed effects)
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
models | tuple[Any, ...] | Tuple of model objects (fitted or unfitted). | required |
method | str | Comparison method (used to check REML for LRT). | ‘auto’ |
refit | bool | If True, skip REML validation (models will be refitted). | False |
Returns:
| Type | Description |
|---|---|
tuple[Any, ...] | Tuple of fitted model objects. |
ic¶
Information criterion comparison for model selection.
Computes AIC/BIC tables with delta-IC and Akaike/Schwarz weights, following Burnham & Anderson (2002). Unlike hypothesis-test comparisons (F-test, LRT, deviance), IC comparison does not require model nesting — any set of models fitted to the same data can be compared.
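The quantities in an IC table can all be derived from the log-likelihood. The sketch below assumes the standard definitions deviance = -2 * loglik, AIC = deviance + 2 * npar, and BIC = deviance + npar * ln(n); `information_criteria` is a hypothetical helper, not part of the library. The inputs are the rounded log-likelihood and npar shown for the random-intercept model in the lrt example later in this document, with n = 180 observations in sleepstudy.

```python
import math

def information_criteria(loglik, npar, n_obs):
    """Return (deviance, AIC, BIC) from a log-likelihood (hypothetical helper)."""
    deviance = -2.0 * loglik
    aic = deviance + 2 * npar
    bic = deviance + npar * math.log(n_obs)
    return deviance, aic, bic

# Reaction ~ Days + (1|Subject), ML fit: loglik and npar taken from the lrt example
dev, aic, bic = information_criteria(loglik=-897.04, npar=4, n_obs=180)
print(f"deviance={dev:.2f} AIC={aic:.2f} BIC={bic:.2f}")
```

Because the log-likelihood is rounded to two decimals, the results should match the lrt example table only up to rounding in the last digit.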
Functions¶
compare_aic¶
compare_aic(models: list[Any]) -> pl.DataFrame
Compare models by AIC with delta-AIC and Akaike weights.
Models are sorted by AIC (best first). Delta-AIC is the difference from the best model. Akaike weights represent the relative likelihood of each model being the best, following Burnham & Anderson (2002):
w_i = exp(-0.5 * delta_i) / sum(exp(-0.5 * delta_j))
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
models | list[Any] | List of fitted model objects (at least 2). | required |
Returns:
| Type | Description |
|---|---|
DataFrame | DataFrame with columns: model, npar, loglik, deviance, AIC, BIC, delta_AIC, weight. Sorted by AIC ascending (best model first). |
compare_bic¶
compare_bic(models: list[Any]) -> pl.DataFrame
Compare models by BIC with delta-BIC and Schwarz weights.
Models are sorted by BIC (best first). Delta-BIC is the difference from the best model. Schwarz weights use the same formula as Akaike weights but applied to BIC differences.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
models | list[Any] | List of fitted model objects (at least 2). | required |
Returns:
| Type | Description |
|---|---|
DataFrame | DataFrame with columns: model, npar, loglik, deviance, AIC, BIC, delta_BIC, weight. Sorted by BIC ascending (best model first). |
lrt¶
Likelihood ratio test for comparing nested mixed models.
Functions:
| Name | Description |
|---|---|
lrt | Likelihood ratio test for comparing nested mixed models. |
Functions¶
lrt¶
lrt(*models: Any, sort: bool = True) -> pl.DataFrame
Likelihood ratio test for comparing nested mixed models.
Convenience wrapper for compare(*models, method="lrt") specifically
designed for comparing lmer and glmer models.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
*models | Any | Two or more fitted mixed models to compare. Models should be nested (simpler model is a subset of complex model). | () |
sort | bool | If True (default), sort models by complexity (fewest parameters first), matching R’s anova() behavior. Set to False to preserve input order. | True |
Returns:
| Type | Description |
|---|---|
DataFrame | DataFrame with columns: - model: Model formula string - chi2: Chi-squared statistic (deviance difference) - npar: Number of parameters - AIC: Akaike Information Criterion - BIC: Bayesian Information Criterion - loglik: Log-likelihood - deviance: -2 * log-likelihood - df: Degrees of freedom (parameter difference) - p_value: P-value from chi-squared distribution |
Notes:
For valid likelihood ratio tests with different fixed effects, models
must be fit with ML estimation (not REML). REML is accepted when all
models share the same fixed effects (comparing random effects structures
only). Otherwise a ValueError is raised; refit models with
method="ML" or use compare(..., refit=True) to auto-refit.
Examples:
from bossanova import model, lrt, load_dataset
sleepstudy = load_dataset("sleepstudy")
m1 = model("Reaction ~ Days + (1|Subject)", data=sleepstudy).fit(method="ML")
m2 = model("Reaction ~ Days + (Days|Subject)", data=sleepstudy).fit(method="ML")
lrt(m1, m2)
# ┌───────────────────────────────┬───────┬──────┬────────┬────────┬─────────┬──────────┬────┬──────────┐
# │ model ┆ chi2 ┆ npar ┆ AIC ┆ BIC ┆ loglik ┆ deviance ┆ df ┆ p_value │
# ├───────────────────────────────┼───────┼──────┼────────┼────────┼─────────┼──────────┼────┼──────────┤
# │ Reaction ~ Days + (1|Subject) ┆ ┆ 4 ┆ 1802.1 ┆ 1814.8 ┆ -897.04 ┆ 1794.1 ┆ ┆ │
# │ Reaction ~ Days + (Days|Subj) ┆ 42.14 ┆ 6 ┆ 1763.9 ┆ 1783.1 ┆ -875.97 ┆ 1751.9 ┆ 2 ┆ 7.07e-10 │
# └───────────────────────────────┴───────┴──────┴────────┴────────┴─────────┴──────────┴────┴──────────┘
lrt_compare¶
Likelihood ratio test implementation for comparing nested mixed models.
Performs sequential LRT comparisons matching lme4’s anova() behavior.
Functions:
| Name | Description |
|---|---|
compare_lrt | Perform likelihood ratio tests for nested lmer/glmer models. |
get_n_params | Get total number of estimated parameters for LRT comparison. |
Functions¶
compare_lrt¶
compare_lrt(models: list[Any]) -> pl.DataFrame
Perform likelihood ratio tests for nested lmer/glmer models.
Computes likelihood ratio chi-squared statistics comparing each model to the previous (simpler) model. This matches lme4’s anova() behavior.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
models | list[Any] | List of fitted lmer/glmer models, sorted by complexity. All models must use ML estimation (validated by compare()). | required |
Returns:
| Type | Description |
|---|---|
DataFrame | DataFrame with columns: - model: Formula string - chi2: Likelihood ratio chi-squared statistic - npar: Number of parameters - AIC: Akaike Information Criterion - BIC: Bayesian Information Criterion - loglik: Log-likelihood - deviance: Deviance (-2 * loglik) - df: Degrees of freedom for comparison - p_value: p-value from chi-squared distribution |
Notes: The LRT statistic is:
chi2 = 2 * (loglik_aug - loglik_compact)
which follows a chi-squared distribution with df = npar_aug - npar_compact under the null hypothesis that the simpler model is adequate.
ML estimation is required for valid LRT when models have different fixed effects. REML is accepted when all models share the same fixed effects (comparing random effects structures only).
A warning is emitted when any model has singular (boundary) variance components, as the standard chi-squared null distribution may be conservative. See Self & Liang (1987).
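The LRT arithmetic can be reproduced from the log-likelihoods alone. The sketch below uses the rounded loglik and npar values from the sleepstudy example in the lrt section; `chi2_sf` is a hypothetical pure-Python helper for the chi-squared upper tail with integer degrees of freedom, included only so the example has no dependencies.

```python
import math

def chi2_sf(x, df):
    """Upper-tail chi-squared probability for integer df (hypothetical helper)."""
    z = x / 2.0
    # Closed-form starting point: df = 1 (odd case) or df = 2 (even case)
    if df % 2:
        a, q = 0.5, math.erfc(math.sqrt(z))
    else:
        a, q = 1.0, math.exp(-z)
    # Recurrence Q(a + 1, z) = Q(a, z) + z**a * exp(-z) / Gamma(a + 1)
    while a < df / 2:
        q += z**a * math.exp(-z) / math.gamma(a + 1)
        a += 1
    return q

# Rounded values copied from the sleepstudy example (ML fits)
loglik_compact, npar_compact = -897.04, 4   # Reaction ~ Days + (1|Subject)
loglik_aug, npar_aug = -875.97, 6           # Reaction ~ Days + (Days|Subject)

chi2 = 2 * (loglik_aug - loglik_compact)
df = npar_aug - npar_compact
p_value = chi2_sf(chi2, df)
print(f"chi2={chi2:.2f} df={df} p={p_value:.3g}")
```

The result should agree with the chi2, df, and p_value columns of that example up to rounding. As the note on singular fits indicates, this naive chi-squared p-value may be conservative when a variance component sits on the boundary.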
get_n_params¶
get_n_params(model: Any) -> int
Get total number of estimated parameters for LRT comparison.
Handles all model types (lm, glm, lmer, glmer) for cross-type LRT comparisons where a fixed-only model is nested within a mixed model via zero variance components.
Parameter counting follows R’s logLik() convention:
- Fixed effects (p)
- Variance parameters (len(theta), mixed models only)
- Dispersion/sigma (+1 for Gaussian and other estimated-dispersion families)
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
model | Any | A fitted model instance. | required |
Returns:
| Type | Description |
|---|---|
int | Total parameter count. |
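The counting rule above amounts to a three-term sum. The sketch below is illustrative only: `count_params` and its arguments are hypothetical and take the counts directly rather than inspecting a fitted model.

```python
def count_params(n_fixed, n_theta=0, estimated_dispersion=True):
    """Count parameters as fixed effects + variance parameters + dispersion (sketch)."""
    return n_fixed + n_theta + (1 if estimated_dispersion else 0)

# Reaction ~ Days + (1|Subject): intercept and slope, one variance parameter, sigma
print(count_params(n_fixed=2, n_theta=1))   # → 4
# Reaction ~ Days + (Days|Subject): two variances and one covariance, plus sigma
print(count_params(n_fixed=2, n_theta=3))   # → 6
# Binomial GLM with two coefficients: dispersion is fixed, so nothing is added
print(count_params(n_fixed=2, estimated_dispersion=False))   # → 2
```

The first two counts match the npar column (4 and 6) in the sleepstudy lrt example.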
refit¶
REML-to-ML refit helper for mixed model comparison.
Provides refit_with_ml() which creates a new model fitted with ML
estimation when the original uses REML, enabling valid likelihood ratio tests.
Functions:
| Name | Description |
|---|---|
refit_with_ml | Refit a mixed model with ML estimation if it uses REML. |
Functions¶
refit_with_ml¶
refit_with_ml(model: Any) -> Any
Refit a mixed model with ML estimation if it uses REML.
Creates a new model instance and fits with method="ML". The original model is not mutated.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
model | Any | A fitted lmer or glmer model. | required |
Returns:
| Type | Description |
|---|---|
Any | If model uses REML: a new model fitted with ML. |
Any | If model uses ML: the original model unchanged. |