Statistical inference utilities for bossanova.
Classes:
| Name | Description |
|---|---|
CellInfo | Information about factor cells for Welch-style inference. |
Chi2TestResult | Result container for chi-square test. |
FTestResult | Result container for F-test. |
InferenceResult | Results from coefficient inference computation. |
MixedModelProtocol | Protocol for mixed-effects models compatible with profile likelihood. |
TTestResult | Result container for t-test. |
Functions:
| Name | Description |
|---|---|
adjust_pvalues | Adjust p-values for multiple comparisons. |
build_cholesky_with_derivs | Build Cholesky factor L and its derivatives w.r.t. theta. |
compute_aic | Compute Akaike Information Criterion. |
compute_akaike_weights | Compute Akaike weights from information criterion values. |
compute_bic | Compute Bayesian Information Criterion. |
compute_cell_info | Compute cell-based variance information for Welch inference. |
compute_chi2_test | Compute Wald chi-square test for L @ β = 0. |
compute_ci | Compute confidence interval bounds. |
compute_coefficient_inference | Compute inference statistics for regression coefficients. |
compute_contrast_variance | Compute variance-covariance of linear contrasts L @ β. |
compute_cooks_distance | Compute Cook’s distance for influence. |
compute_cr_vcov | Compute cluster-robust covariance matrix for Gaussian mixed models. |
compute_deviance | Compute deviance from log-likelihood. |
compute_f_pvalue | Compute p-value from F-statistic. |
compute_f_test | Compute F-test for linear hypothesis L @ β = 0. |
compute_glm_cr_vcov | Compute cluster-robust covariance matrix for non-Gaussian mixed models. |
compute_glm_hc_vcov | Compute heteroscedasticity-consistent covariance matrix for GLM. |
compute_hc_vcov | Compute heteroscedasticity-consistent covariance matrix. |
compute_leverage | Compute diagonal of hat matrix (leverage values). |
compute_mvt_critical | Compute multivariate-t critical value for simultaneous inference. |
compute_pvalue | Compute p-values from test statistics. |
compute_satterthwaite_df | Compute Satterthwaite degrees of freedom for each fixed effect. |
compute_satterthwaite_summary_table | Compute full coefficient table with Satterthwaite df and p-values. |
compute_satterthwaite_t_test | Compute t-statistics, p-values, and confidence intervals. |
compute_sd_jacobian | Compute SDs and Jacobian of SDs w.r.t. varpar = [theta, sigma]. |
compute_se_from_vcov | Compute standard errors from variance-covariance matrix. |
compute_sigma_se_wald | Compute Wald standard error for sigma. |
compute_studentized_residuals | Compute internally studentized (standardized) residuals. |
compute_t_critical | Compute t-distribution critical value for confidence interval. |
compute_t_test | Compute t-test for a single contrast L @ β = 0. |
compute_tukey_critical | Compute Tukey HSD critical value for pairwise comparisons. |
compute_vif | Compute variance inflation factors. |
compute_wald_ci_varying | Compute Wald CIs for variance components on SD scale. |
compute_wald_statistic | Compute Wald statistic for testing L @ β = 0. |
compute_welch_satterthwaite_df_per_coef | Compute per-coefficient Welch-Satterthwaite degrees of freedom. |
compute_z_critical | Compute z-distribution critical value for confidence interval. |
convert_theta_ci_to_sd | Convert theta-scale CIs to SD-scale CIs. |
delta_method_se | Compute standard errors for predictions via delta method. |
extract_ci_bound | Extract CI bound by finding where spline equals target zeta. |
extract_factors_from_formula | Extract factor (categorical) column names from a model formula. |
format_pvalue_with_stars | Format p-value with R-style significance codes. |
parse_conf_int | Parse flexible confidence interval input to float. |
profile_likelihood | Compute profile likelihood confidence intervals for variance components. |
profile_theta_parameter | Profile a single theta parameter bidirectionally from its MLE. |
satterthwaite_df_for_contrasts | Compute Satterthwaite degrees of freedom for arbitrary contrasts. |
Modules:
| Name | Description |
|---|---|
diagnostics | Residuals, influence measures, and leverage statistics. |
estimation | Standard errors, confidence intervals, and coefficient inference. |
hypothesis | Statistical hypothesis tests (F, chi-square, t) for linear contrasts. |
information_criteria | Deviance, AIC, BIC, and Akaike weights. |
multiplicity | Multiplicity adjustment for contrast confidence intervals. |
profile | Profile likelihood for variance components in mixed-effects models. |
sandwich | Sandwich covariance matrix estimators for robust standard errors. |
satterthwaite | Satterthwaite degrees of freedom approximation for linear mixed models. |
wald_variance | Wald confidence intervals for variance components using delta method. |
welch | Welch-Satterthwaite degrees of freedom for unequal variance inference. |
Classes¶
CellInfo¶
CellInfo(cell_labels: list[tuple[str, ...]], cell_variances: NDArray[np.float64], cell_counts: NDArray[np.int64], cell_indices: NDArray[np.int64], factor_columns: list[str]) -> None
Information about factor cells for Welch-style inference.
When using errors='unequal_var', residual variances are computed
within each cell defined by the crossing of all factors in the model.
This implements the Welch-James approach for factorial designs.
Attributes:
| Name | Type | Description |
|---|---|---|
cell_labels | list[tuple[str, ...]] | Labels for each cell as tuples of factor levels. For a single factor: [("A",), ("B",), ...]. For multiple factors: [("A", "X"), ("A", "Y"), ("B", "X"), ...]. |
cell_variances | NDArray[float64] | Sample variance of residuals within each cell. |
cell_counts | NDArray[int64] | Number of observations in each cell. |
cell_indices | NDArray[int64] | Cell membership for each observation (0-indexed). |
factor_columns | list[str] | Names of factor columns used to define cells. |
Attributes¶
cell_counts¶
cell_counts: NDArray[np.int64]
cell_indices¶
cell_indices: NDArray[np.int64]
cell_labels¶
cell_labels: list[tuple[str, ...]]
cell_variances¶
cell_variances: NDArray[np.float64]
factor_columns¶
factor_columns: list[str]
Chi2TestResult¶
Chi2TestResult(num_df: int, chi2: float, p_value: float) -> None
Result container for chi-square test.
Attributes:
| Name | Type | Description |
|---|---|---|
num_df | int | Degrees of freedom (q, number of contrasts). |
chi2 | float | Wald chi-square statistic. |
p_value | float | P-value from chi-square distribution. |
Attributes¶
chi2¶
chi2: float
num_df¶
num_df: int
p_value¶
p_value: float
FTestResult¶
FTestResult(num_df: int, den_df: float, F_value: float, p_value: float) -> None
Result container for F-test.
Attributes:
| Name | Type | Description |
|---|---|---|
num_df | int | Numerator degrees of freedom (q, number of contrasts). |
den_df | float | Denominator degrees of freedom (residual df). |
F_value | float | F-statistic value. |
p_value | float | P-value from F-distribution. |
Attributes¶
F_value¶
F_value: float
den_df¶
den_df: float
num_df¶
num_df: int
p_value¶
p_value: float
InferenceResult¶
InferenceResult(se: np.ndarray, statistic: np.ndarray, p_values: np.ndarray, ci_lower: np.ndarray, ci_upper: np.ndarray, df_values: list[float]) -> None
Results from coefficient inference computation.
Attributes:
| Name | Type | Description |
|---|---|---|
se | ndarray | Standard errors for each coefficient. |
statistic | ndarray | Test statistics (t or z values). |
p_values | ndarray | Two-tailed p-values. |
ci_lower | ndarray | Lower confidence interval bounds. |
ci_upper | ndarray | Upper confidence interval bounds. |
df_values | list[float] | Degrees of freedom for each coefficient (inf for z-distribution). |
Attributes¶
ci_lower¶
ci_lower: np.ndarray
ci_upper¶
ci_upper: np.ndarray
df_values¶
df_values: list[float]
p_values¶
p_values: np.ndarray
se¶
se: np.ndarray
statistic¶
statistic: np.ndarray
MixedModelProtocol¶
Bases: Protocol
Protocol for mixed-effects models compatible with profile likelihood.
This protocol defines the minimal interface required for profile_likelihood(), supporting both legacy lmer/glmer classes and the new unified model() class.
The protocol is runtime checkable, allowing isinstance() checks if needed, but the function primarily uses duck typing for flexibility.
Functions:
| Name | Description |
|---|---|
get_theta_lower_bounds | Get lower bounds for theta parameters. |
Functions¶
get_theta_lower_bounds¶
get_theta_lower_bounds(n_theta: int) -> list[float]
Get lower bounds for theta parameters.
TTestResult¶
TTestResult(estimate: float, se: float, t_value: float, df: float, p_value: float) -> None
Result container for t-test.
Attributes:
| Name | Type | Description |
|---|---|---|
estimate | float | Contrast estimate. |
se | float | Standard error. |
t_value | float | t-statistic. |
df | float | Degrees of freedom. |
p_value | float | Two-tailed p-value. |
Attributes¶
df¶
df: float
estimate¶
estimate: float
p_value¶
p_value: float
se¶
se: float
t_value¶
t_value: float
Functions¶
adjust_pvalues¶
adjust_pvalues(pvalues: np.ndarray, method: str = 'none') -> np.ndarray
Adjust p-values for multiple comparisons.
Implements standard p-value adjustment methods matching R’s p.adjust().
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
pvalues | ndarray | Raw p-values, shape (n,). | required |
method | str | Adjustment method. One of: - “none”: No adjustment (return raw p-values) - “bonferroni”: Bonferroni correction (p * n) - “holm”: Holm step-down (default in many R packages) - “hochberg”: Hochberg step-up - “bh” or “fdr”: Benjamini-Hochberg FDR control - “by”: Benjamini-Yekutieli FDR (conservative, handles dependence) | ‘none’ |
Returns:
| Type | Description |
|---|---|
ndarray | Adjusted p-values, clipped to [0, 1]. |
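The step-down logic behind, e.g., Holm's method can be sketched in a few lines. This is a minimal sketch matching R's p.adjust() semantics; `holm_sketch` is a hypothetical name, not part of this module:

```python
import numpy as np

# Illustrative Holm step-down adjustment (one of the methods listed
# above). Not bossanova's implementation.
def holm_sketch(pvalues: np.ndarray) -> np.ndarray:
    n = len(pvalues)
    order = np.argsort(pvalues)
    # Multiply the i-th smallest p-value by (n - i), enforce
    # monotonicity with a running maximum, and clip to 1.
    scaled = pvalues[order] * (n - np.arange(n))
    adjusted = np.minimum(np.maximum.accumulate(scaled), 1.0)
    out = np.empty(n)
    out[order] = adjusted
    return out
```

The running maximum is what makes Holm "step-down": an adjusted p-value can never be smaller than that of a smaller raw p-value.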
Examples:
>>> p = np.array([0.01, 0.04, 0.03, 0.005])
>>> adjust_pvalues(p, method="bonferroni")
array([0.04, 0.16, 0.12, 0.02])
>>> adjust_pvalues(p, method="holm")
array([0.03, 0.06, 0.06, 0.02])
build_cholesky_with_derivs¶
build_cholesky_with_derivs(theta_group: np.ndarray, n_re: int, structure: str) -> tuple[np.ndarray, list[np.ndarray]]
Build Cholesky factor L and its derivatives w.r.t. theta.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
theta_group | ndarray | Theta values for this group. | required |
n_re | int | Number of random effects. | required |
structure | str | “intercept”, “diagonal”, or “slope”. | required |
Returns:
| Type | Description |
|---|---|
ndarray | Tuple of (L, dL_dtheta) where: |
list[ndarray] | - L: Cholesky factor, shape (n_re, n_re) |
tuple[ndarray, list[ndarray]] | - dL_dtheta: List of derivative matrices, one per theta element |
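For the two simpler structures the construction can be sketched directly. This is an illustrative sketch under the assumption that "intercept" means a single scalar theta and "diagonal" means one independent standard deviation per random effect; `cholesky_with_derivs_sketch` is a hypothetical helper, not the library's code:

```python
import numpy as np

def cholesky_with_derivs_sketch(theta_group, n_re, structure):
    """Illustrative L and dL/dtheta for the simpler structures."""
    if structure == "intercept":
        # Single random intercept: L = [[theta_0]], derivative [[1]].
        L = np.array([[theta_group[0]]])
        dL = [np.array([[1.0]])]
    elif structure == "diagonal":
        # Independent random effects: L = diag(theta); the derivative
        # w.r.t. theta_k is the single-entry basis matrix E_kk.
        L = np.diag(theta_group[:n_re])
        dL = []
        for k in range(n_re):
            E = np.zeros((n_re, n_re))
            E[k, k] = 1.0
            dL.append(E)
    else:
        # "slope" (full lower-triangular Cholesky) is omitted here.
        raise NotImplementedError(structure)
    return L, dL
```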
compute_aic¶
compute_aic(loglik: float, k: int) -> float
Compute Akaike Information Criterion.
AIC = -2 * loglik + 2 * k. Trades off model fit against complexity, penalizing by 2 per parameter.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
loglik | float | Log-likelihood value. | required |
k | int | Number of estimated parameters (including intercept and any scale/variance parameters). | required |
Returns:
| Type | Description |
|---|---|
float | AIC value (lower is better). |
Examples:
>>> compute_aic(-100.0, k=3)
206.0
>>> compute_aic(-50.0, k=5)
110.0
compute_akaike_weights¶
compute_akaike_weights(ic_values: np.ndarray) -> np.ndarray
Compute Akaike weights from information criterion values.
Weights quantify relative probability that each model is the best model in the set. Works with AIC or BIC values.
w_i = exp(-0.5 * delta_i) / sum(exp(-0.5 * delta_j))
where delta_i = IC_i - min(IC).
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
ic_values | ndarray | Array of information criterion values (AIC or BIC) for each model, shape (m,) where m is the number of models. | required |
Returns:
| Type | Description |
|---|---|
ndarray | Array of weights summing to 1.0, shape (m,). |
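The weight formula above is a one-liner in numpy. A minimal sketch (`akaike_weights_sketch` is an illustrative name, not the API):

```python
import numpy as np

# Direct transcription of w_i = exp(-0.5 * delta_i) / sum_j exp(-0.5 * delta_j)
def akaike_weights_sketch(ic: np.ndarray) -> np.ndarray:
    delta = ic - ic.min()       # delta_i = IC_i - min(IC)
    w = np.exp(-0.5 * delta)
    return w / w.sum()
```

Subtracting the minimum first keeps the exponentials numerically stable.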
Examples:
>>> weights = compute_akaike_weights(np.array([100.0, 102.0, 110.0]))
>>> np.isclose(weights.sum(), 1.0)
True
>>> weights[0] > weights[1] > weights[2]
True
compute_bic¶
compute_bic(loglik: float, k: int, n: int) -> float
Compute Bayesian Information Criterion.
BIC = -2 * loglik + k * log(n). Penalizes complexity more heavily than AIC for n >= 8, preferring simpler models.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
loglik | float | Log-likelihood value. | required |
k | int | Number of estimated parameters. | required |
n | int | Number of observations. | required |
Returns:
| Type | Description |
|---|---|
float | BIC value (lower is better). |
Examples:
>>> compute_bic(-100.0, k=3, n=100)
213.81...
>>> compute_bic(-100.0, k=3, n=50)
211.73...
compute_cell_info¶
compute_cell_info(residuals: NDArray[np.float64], data: pl.DataFrame, factor_columns: list[str]) -> CellInfo
Compute cell-based variance information for Welch inference.
For multi-factor models, cells are defined by the crossing of all factors. This implements the Welch-James approach for factorial designs.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
residuals | NDArray[float64] | OLS residuals, shape (n,). | required |
data | DataFrame | DataFrame with factor columns. | required |
factor_columns | list[str] | List of factor column names. | required |
Returns:
| Type | Description |
|---|---|
CellInfo | CellInfo containing variance information for each cell. |
Examples:
import numpy as np
import polars as pl
df = pl.DataFrame({
"group": ["A", "A", "B", "B", "B"],
"x": [1.0, 2.0, 3.0, 4.0, 5.0],
})
residuals = np.array([0.1, -0.1, 0.2, -0.1, -0.1])
info = compute_cell_info(residuals, df, ["group"])
info.cell_labels
# [('A',), ('B',)]
info.cell_counts
# array([2, 3])
compute_chi2_test¶
compute_chi2_test(L: np.ndarray, coef: np.ndarray, vcov: np.ndarray) -> Chi2TestResult
Compute Wald chi-square test for L @ β = 0.
Used for GLMs and GLMMs where we don’t have residual df. The Wald statistic is asymptotically chi-squared distributed.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
L | ndarray | Contrast matrix of shape (q, p). | required |
coef | ndarray | Coefficient estimates of shape (p,). | required |
vcov | ndarray | Variance-covariance matrix of shape (p, p). | required |
Returns:
| Type | Description |
|---|---|
Chi2TestResult | Chi2TestResult containing num_df, chi2, p_value. |
Examples:
>>> L = np.array([[-1, 1, 0], [-1, 0, 1]])
>>> coef = np.array([0.5, 1.2, 0.8])
>>> vcov = np.diag([0.1, 0.15, 0.12])
>>> result = compute_chi2_test(L, coef, vcov)
>>> result.num_df
2
compute_ci¶
compute_ci(estimate: np.ndarray, se: np.ndarray, critical: float | np.ndarray, alternative: str = 'two-sided') -> tuple[np.ndarray, np.ndarray]
Compute confidence interval bounds.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
estimate | ndarray | Point estimates. | required |
se | ndarray | Standard errors. | required |
critical | float | ndarray | Critical value(s) from distribution (t or z). Scalar or per-coefficient array that broadcasts with estimate and se. | required |
alternative | str | “two-sided” (default), “greater”, or “less”. - “greater”: lower bound only, upper = +inf - “less”: upper bound only, lower = -inf | ‘two-sided’ |
Returns:
| Type | Description |
|---|---|
tuple[ndarray, ndarray] | Tuple of (ci_lower, ci_upper) arrays with confidence bounds. |
Examples:
>>> estimate = np.array([1.5, 2.0])
>>> se = np.array([0.2, 0.3])
>>> critical = 1.96
>>> ci_lower, ci_upper = compute_ci(estimate, se, critical)
compute_coefficient_inference¶
compute_coefficient_inference(coef: np.ndarray, vcov: np.ndarray, df: float | np.ndarray | None, conf_level: float = 0.95, null: float = 0.0, alternative: str = 'two-sided') -> InferenceResult
Compute inference statistics for regression coefficients.
Orchestrates computation of standard errors, test statistics, p-values, and confidence intervals for coefficient estimates.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
coef | ndarray | Coefficient estimates, shape (p,). | required |
vcov | ndarray | Variance-covariance matrix, shape (p, p). | required |
df | float | ndarray | None | Degrees of freedom for t-distribution. Scalar (same df for all coefficients), array of shape (p,) for per-coefficient df, or None for z-distribution. | required |
conf_level | float | Confidence level for intervals (default 0.95). | 0.95 |
null | float | Null hypothesis value (default 0.0). Statistic = (coef - null) / se. | 0.0 |
alternative | str | Direction of the alternative hypothesis. - “two-sided”: H₁: β ≠ null (default) - “greater”: H₁: β > null - “less”: H₁: β < null | ‘two-sided’ |
Returns:
| Type | Description |
|---|---|
InferenceResult | InferenceResult with se, statistic, p_values, ci_lower, ci_upper, df_values. |
Examples:
>>> coef = np.array([1.5, 2.0])
>>> vcov = np.array([[0.04, 0.01], [0.01, 0.09]])
>>> result = compute_coefficient_inference(coef, vcov, df=20)
>>> result.se.shape
(2,)
compute_contrast_variance¶
compute_contrast_variance(L: np.ndarray, vcov: np.ndarray) -> np.ndarray
Compute variance-covariance of linear contrasts L @ β.
For a contrast matrix L and coefficient vcov V, computes: Var(L @ β) = L @ V @ L'
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
L | ndarray | Contrast matrix of shape (q, p). | required |
vcov | ndarray | Variance-covariance matrix of shape (p, p). | required |
Returns:
| Type | Description |
|---|---|
ndarray | Variance-covariance of contrasts, shape (q, q). |
Examples:
>>> L = np.array([[-1, 1, 0], [-1, 0, 1]])
>>> vcov = np.diag([1.0, 1.5, 2.0])
>>> var_L = compute_contrast_variance(L, vcov)
>>> var_L.shape
(2, 2)
compute_cooks_distance¶
compute_cooks_distance(residuals: np.ndarray, hat: np.ndarray, sigma: float, p: int) -> np.ndarray
Compute Cook's distance for influence.
Cook’s distance measures the influence of each observation on the fitted coefficients.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
residuals | ndarray | Raw residuals of shape (n,). | required |
hat | ndarray | Leverage values (diagonal of hat matrix) of shape (n,). | required |
sigma | float | Residual standard error. | required |
p | int | Number of parameters (including intercept). | required |
Returns:
| Type | Description |
|---|---|
ndarray | Cook’s distances of shape (n,). |
Note: Cook’s distance formula (matching R’s cooks.distance()): D_i = (e_i² / (p * σ²)) * (h_i / (1 - h_i)²) where e_i is the raw residual, σ is sigma, h_i is leverage.
Values > 1 are traditionally considered influential. More conservatively, D_i > 4/(n-p) is used.
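The formula in the Note transcribes directly to numpy. An illustrative sketch (not the library's implementation; `cooks_distance_sketch` is a hypothetical name):

```python
import numpy as np

# D_i = (e_i^2 / (p * sigma^2)) * (h_i / (1 - h_i)^2),
# matching the formula stated in the Note above.
def cooks_distance_sketch(residuals, hat, sigma, p):
    return (residuals**2 / (p * sigma**2)) * (hat / (1.0 - hat) ** 2)
```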
Examples:
>>> D = compute_cooks_distance(residuals, hat, sigma=2.5, p=3)
>>> influential = np.where(D > 1)[0]
compute_cr_vcov¶
compute_cr_vcov(X: NDArray[np.float64], residuals: NDArray[np.float64], cluster_ids: NDArray[np.intp], XtX_inv: NDArray[np.float64], cr_type: str = 'CR0') -> NDArray[np.float64]
Compute cluster-robust covariance matrix for Gaussian mixed models.
Implements the sandwich estimator clustered by random-effects grouping
variable: (X'X)^{-1} M (X'X)^{-1} where M sums outer products of
per-cluster score vectors.
Matches sandwich::vcovCL() in R.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
X | NDArray[float64] | Design matrix, shape (n, p). | required |
residuals | NDArray[float64] | Marginal residuals y - X @ beta, shape (n,). | required |
cluster_ids | NDArray[intp] | Integer cluster labels, shape (n,). Values in [0, G). | required |
XtX_inv | NDArray[float64] | (X'X)^{-1} from OLS, shape (p, p). | required |
cr_type | str | Type of CR estimator: - “CR0”: No small-sample adjustment. - “CR1”: G/(G-1) * (n-1)/(n-p) correction. - “CR2”: Bell-McCaffrey bias-reduced (square-root leverage). - “CR3”: Jackknife (inverse leverage, most conservative). | ‘CR0’ |
Returns:
| Type | Description |
|---|---|
NDArray[float64] | Cluster-robust covariance matrix, shape (p, p). |
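The CR0 case of the estimator described above can be sketched in a few lines: the "meat" sums outer products of per-cluster score vectors s_g = X_g' e_g. A minimal sketch without the small-sample corrections (`cr0_vcov_sketch` is an illustrative name, not the API):

```python
import numpy as np

def cr0_vcov_sketch(X, residuals, cluster_ids, XtX_inv):
    # Meat M = sum_g s_g s_g', with s_g = X_g' e_g per cluster.
    p = X.shape[1]
    meat = np.zeros((p, p))
    for g in np.unique(cluster_ids):
        idx = cluster_ids == g
        s_g = X[idx].T @ residuals[idx]
        meat += np.outer(s_g, s_g)
    # Sandwich: (X'X)^{-1} M (X'X)^{-1}
    return XtX_inv @ meat @ XtX_inv
```

CR1 would multiply the result by G/(G-1) * (n-1)/(n-p); CR2/CR3 additionally rescale residuals by cluster leverage.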
compute_deviance¶
compute_deviance(loglik: float) -> float
Compute deviance from log-likelihood.
Deviance = -2 * loglik. Measures lack of fit relative to a saturated model.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
loglik | float | Log-likelihood value. | required |
Returns:
| Type | Description |
|---|---|
float | Deviance value (always >= 0 for valid log-likelihoods). |
Examples:
>>> compute_deviance(-100.0)
200.0
>>> compute_deviance(0.0)
0.0
compute_f_pvalue¶
compute_f_pvalue(f_stat: float, df1: int, df2: float) -> float
Compute p-value from F-statistic.
Uses the upper tail of the F-distribution: P(F > f_stat | df1, df2).
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
f_stat | float | F-statistic value. | required |
df1 | int | Numerator degrees of freedom (number of restrictions). | required |
df2 | float | Denominator degrees of freedom (residual df). | required |
Returns:
| Type | Description |
|---|---|
float | Upper-tail p-value from F-distribution. |
Examples:
>>> p = compute_f_pvalue(4.0, df1=2, df2=97)
>>> 0 < p < 0.05
True
>>> compute_f_pvalue(0.0, df1=1, df2=100)
1.0
compute_f_test¶
compute_f_test(L: np.ndarray, coef: np.ndarray, vcov: np.ndarray, df_resid: float) -> FTestResult
Compute F-test for linear hypothesis L @ β = 0.
Uses the Wald statistic divided by the number of constraints to produce an F-statistic with appropriate degrees of freedom.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
L | ndarray | Contrast matrix of shape (q, p) where q is the number of constraints (contrasts) and p is the number of coefficients. | required |
coef | ndarray | Coefficient estimates of shape (p,). | required |
vcov | ndarray | Variance-covariance matrix of coefficients, shape (p, p). | required |
df_resid | float | Residual degrees of freedom (denominator df for F-test). | required |
Returns:
| Type | Description |
|---|---|
FTestResult | FTestResult containing num_df, den_df, F_value, p_value. |
Note: The Wald statistic is: W = (L @ β)' @ (L @ V @ L')⁻¹ @ (L @ β)
The F-statistic is F = W / q, distributed as F(q, df_resid).
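The Note's two equations can be sketched end to end. This is an illustrative sketch of the Wald-based F-test, returning the raw statistic and p-value rather than an FTestResult (`f_test_sketch` is a hypothetical name):

```python
import numpy as np
from scipy import stats

def f_test_sketch(L, coef, vcov, df_resid):
    q = L.shape[0]
    Lb = L @ coef
    # Wald statistic W = (Lb)' (L V L')^{-1} (Lb), via solve not inverse
    W = Lb @ np.linalg.solve(L @ vcov @ L.T, Lb)
    F = W / q
    p = stats.f.sf(F, q, df_resid)   # upper-tail p-value
    return F, p
```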
Examples:
>>> L = np.array([[-1, 1, 0], [-1, 0, 1]]) # 2 contrasts
>>> coef = np.array([10.0, 12.0, 15.0])
>>> vcov = np.diag([1.0, 1.5, 2.0])
>>> result = compute_f_test(L, coef, vcov, df_resid=97)
>>> result.num_df
2
compute_glm_cr_vcov¶
compute_glm_cr_vcov(X: NDArray[np.float64], residuals: NDArray[np.float64], irls_weights: NDArray[np.float64], cluster_ids: NDArray[np.intp], XtWX_inv: NDArray[np.float64], cr_type: str = 'CR0') -> NDArray[np.float64]
Compute cluster-robust covariance matrix for non-Gaussian mixed models.
Implements the weighted sandwich estimator clustered by random-effects grouping variable. Only CR0 and CR1 are supported (CR2/CR3 require leverage adjustments that are not well-defined for GLM).
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
X | NDArray[float64] | Design matrix, shape (n, p). | required |
residuals | NDArray[float64] | Response residuals y - mu, shape (n,). | required |
irls_weights | NDArray[float64] | IRLS weights from GLM fit, shape (n,). | required |
cluster_ids | NDArray[intp] | Integer cluster labels, shape (n,). Values in [0, G). | required |
XtWX_inv | NDArray[float64] | (X'WX)^{-1} from GLM fit, shape (p, p). | required |
cr_type | str | Type of CR estimator: - “CR0”: No small-sample adjustment (default). - “CR1”: G/(G-1) * (n-1)/(n-p) correction. | ‘CR0’ |
Returns:
| Type | Description |
|---|---|
NDArray[float64] | Cluster-robust covariance matrix, shape (p, p). |
compute_glm_hc_vcov¶
compute_glm_hc_vcov(X: NDArray[np.float64], residuals: NDArray[np.float64], irls_weights: NDArray[np.float64], XtWX_inv: NDArray[np.float64], hc_type: str = 'HC0') -> NDArray[np.float64]
Compute heteroscedasticity-consistent covariance matrix for GLM.
Implements the sandwich estimator: (X'WX)^{-1} X'W Omega W X (X'WX)^{-1}
where W is the diagonal matrix of IRLS weights and Omega accounts for variance misspecification.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
X | NDArray[float64] | Design matrix, shape (n, p). | required |
residuals | NDArray[float64] | Response residuals (y - mu), shape (n,). | required |
irls_weights | NDArray[float64] | IRLS weights from GLM fit, shape (n,). | required |
XtWX_inv | NDArray[float64] | (X’WX)^{-1} from GLM fit, shape (p, p). | required |
hc_type | str | Type of HC estimator: - “HC0”: No small-sample adjustment (default for GLM) - “HC1”: df correction | ‘HC0’ |
Returns:
| Type | Description |
|---|---|
NDArray[float64] | Heteroscedasticity-consistent covariance matrix, shape (p, p). |
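Reading the sandwich formula above with Omega = diag(e_i²), the HC0 case amounts to summing outer products of the weighted scores w_i e_i x_i. A minimal sketch under that assumption (`glm_hc0_sketch` is a hypothetical name, not the API):

```python
import numpy as np

def glm_hc0_sketch(X, residuals, irls_weights, XtWX_inv):
    # Rows of U are w_i * e_i * x_i, so U'U = X' W Omega W X
    # with Omega = diag(e_i^2).
    U = X * (irls_weights * residuals)[:, None]
    meat = U.T @ U
    return XtWX_inv @ meat @ XtWX_inv
```

With unit weights this reduces to the ordinary HC0 sandwich.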
compute_hc_vcov¶
compute_hc_vcov(X: NDArray[np.float64], residuals: NDArray[np.float64], XtX_inv: NDArray[np.float64], hc_type: str = 'HC3') -> NDArray[np.float64]
Compute heteroscedasticity-consistent covariance matrix.
Implements the sandwich estimator: (X'X)^{-1} X' Omega X (X'X)^{-1} where Omega is a diagonal matrix of squared (possibly adjusted) residuals.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
X | NDArray[float64] | Design matrix, shape (n, p). | required |
residuals | NDArray[float64] | OLS residuals, shape (n,). | required |
XtX_inv | NDArray[float64] | (X’X)^{-1} from OLS, shape (p, p). | required |
hc_type | str | Type of HC estimator: - “HC0”: White’s original (no adjustment) - “HC1”: df correction - “HC2”: Leverage adjustment - “HC3”: Squared leverage (default, most conservative) | ‘HC3’ |
Returns:
| Type | Description |
|---|---|
NDArray[float64] | Heteroscedasticity-consistent covariance matrix, shape (p, p). |
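The four HC variants differ only in how the squared residuals are adjusted by leverage. An illustrative sketch of the family (not the library's implementation; `hc_vcov_sketch` is a hypothetical name):

```python
import numpy as np

def hc_vcov_sketch(X, residuals, XtX_inv, hc_type="HC3"):
    n, p = X.shape
    # Leverages h_i = x_i' (X'X)^{-1} x_i  (diagonal of the hat matrix)
    h = np.einsum("ij,jk,ik->i", X, XtX_inv, X)
    e2 = residuals**2
    if hc_type == "HC0":
        omega = e2                       # White's original
    elif hc_type == "HC1":
        omega = e2 * n / (n - p)         # df correction
    elif hc_type == "HC2":
        omega = e2 / (1.0 - h)           # leverage adjustment
    elif hc_type == "HC3":
        omega = e2 / (1.0 - h) ** 2      # squared leverage
    else:
        raise ValueError(hc_type)
    meat = (X * omega[:, None]).T @ X    # X' Omega X
    return XtX_inv @ meat @ XtX_inv
```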
compute_leverage¶
compute_leverage(X: np.ndarray, weights: np.ndarray | None = None, XtWX_inv: np.ndarray | None = None) -> np.ndarray
Compute diagonal of hat matrix (leverage values).
For the unweighted case (OLS), uses efficient QR decomposition. For the weighted case (GLM), can use pre-computed (X’WX)^{-1} to avoid redundant QR decomposition when this matrix is already available from IRLS.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
X | ndarray | Design matrix of shape (n, p). | required |
weights | ndarray | None | Optional observation weights for weighted hat matrix (GLM). | None |
XtWX_inv | ndarray | None | Optional pre-computed (X’WX)^{-1} of shape (p, p). When provided with weights, avoids QR decomposition by computing leverage directly as h_i = (W^{1/2} x_i)’ (X’WX)^{-1} (W^{1/2} x_i). This is more efficient when (X’WX)^{-1} is already available (e.g., from IRLS). | None |
Returns:
| Type | Description |
|---|---|
ndarray | Diagonal of hat matrix (leverage values), shape (n,). |
Note: Unweighted case: H = X(X’X)^{-1} X’, computed via QR as diag(H) = ||Q[i, :]||². Weighted case: H = W^{1/2} X (X’WX)^{-1} X’ W^{1/2}. The sum of leverage values equals p (number of predictors).
Examples:
>>> X = np.array([[1, 2], [1, 3], [1, 4]])
>>> h = compute_leverage(X)
>>> np.sum(h) # Should equal p (number of predictors)
2.0
>>> # GLM with pre-computed XtWX_inv (avoids redundant QR)
>>> weights = np.array([1.0, 0.5, 0.8])
>>> XtWX = X.T @ np.diag(weights) @ X
>>> XtWX_inv = np.linalg.inv(XtWX)
>>> h = compute_leverage(X, weights=weights, XtWX_inv=XtWX_inv)
compute_mvt_critical¶
compute_mvt_critical(conf_level: float, corr: np.ndarray, df: float, *, tol: float = 1e-06) -> float
Compute multivariate-t critical value for simultaneous inference.
Finds c such that P(-c < T_i < c for all i) = conf_level where
T ~ MVT(df, corr). Uses scipy’s Genz-Bretz CDF integration with
Brent root-finding, matching R’s mvtnorm::qmvt().
For df = inf (asymptotic), uses multivariate normal instead.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
conf_level | float | Confidence level (e.g. 0.95). | required |
corr | ndarray | Correlation matrix of contrast estimates, shape (m, m). | required |
df | float | Degrees of freedom. Use np.inf for asymptotic. | required |
tol | float | Root-finding tolerance (default 1e-6). | 1e-06 |
Returns:
| Type | Description |
|---|---|
float | Critical value for MVT-adjusted confidence intervals. |
Examples:
>>> corr = np.array([[1.0, 0.5], [0.5, 1.0]])
>>> crit = compute_mvt_critical(0.95, corr, df=50)
>>> crit > 1.96 # wider than unadjusted
True
compute_pvalue¶
compute_pvalue(statistic: np.ndarray, df: float | np.ndarray | None = None, alternative: str = 'two-sided') -> np.ndarray
Compute p-values from test statistics.
Uses t-distribution if df is provided, otherwise uses normal distribution. Supports two-sided, greater, and less alternatives.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
statistic | ndarray | Test statistics (t or z values). | required |
df | float | ndarray | None | Degrees of freedom for t-distribution (scalar or per-coefficient array). If None, uses normal distribution. | None |
alternative | str | Direction of the alternative hypothesis. - “two-sided”: H₁: β ≠ null (default) - “greater”: H₁: β > null (upper tail) - “less”: H₁: β < null (lower tail) | ‘two-sided’ |
Returns:
| Type | Description |
|---|---|
ndarray | P-values for the specified alternative. |
Examples:
>>> statistic = np.array([2.5, -1.8])
>>> p = compute_pvalue(statistic, df=20)
>>> all((p >= 0) & (p <= 1))
True
compute_satterthwaite_df¶
compute_satterthwaite_df(vcov_beta: np.ndarray, jacobian_vcov: np.ndarray, hessian_deviance: np.ndarray, min_df: float = 1.0, max_df: float = 1000000.0, tol: float = 1e-08) -> np.ndarray
Compute Satterthwaite degrees of freedom for each fixed effect.
Implements the Satterthwaite approximation for denominator degrees of freedom:
df_j = 2 * (SE(β̂_j)²)² / Var(SE(β̂_j)²)
where Var(SE(β̂_j)²) is computed via the delta method using the Jacobian of Vcov(beta) w.r.t. variance parameters and the inverse Hessian of the deviance function.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
vcov_beta | ndarray | Variance-covariance matrix of fixed effects, shape (p, p). | required |
jacobian_vcov | ndarray | Jacobian of Vcov(beta) w.r.t. variance parameters, shape (p, p, k) where k = number of variance parameters. | required |
hessian_deviance | ndarray | Hessian of deviance w.r.t. variance parameters, shape (k, k). | required |
min_df | float | Minimum allowable df (default 1.0). | 1.0 |
max_df | float | Maximum allowable df (default 1e6). | 1000000.0 |
tol | float | Tolerance for determining positive eigenvalues in Hessian. | 1e-08 |
Returns:
| Type | Description |
|---|---|
ndarray | Array of degrees of freedom for each fixed effect, shape (p,). |
Notes:
Uses Moore-Penrose pseudo-inverse for the Hessian to handle negative or near-zero eigenvalues (convergence issues)
Warns when negative eigenvalues are detected
Clips df to [min_df, max_df] range for numerical stability
Returns max_df when variance of variance is near zero
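Stripped of the clipping and eigenvalue safeguards listed in the Notes, the core computation can be sketched as below. This assumes the covariance of the variance parameters is 2·H⁻¹ (the usual asymptotic result for a (RE)ML deviance Hessian); `satterthwaite_df_sketch` is a hypothetical name, not the library's code:

```python
import numpy as np

def satterthwaite_df_sketch(vcov_beta, jacobian_vcov, hessian_deviance):
    # Asymptotic covariance of varpar; pinv guards near-singular Hessians.
    A = 2.0 * np.linalg.pinv(hessian_deviance)
    p = vcov_beta.shape[0]
    df = np.empty(p)
    for j in range(p):
        g = jacobian_vcov[j, j, :]          # d Var(beta_j) / d varpar
        var_of_var = g @ A @ g              # delta method
        df[j] = 2.0 * vcov_beta[j, j] ** 2 / var_of_var
    return df
```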
Examples:
# After fitting lmer model and computing Hessian/Jacobian
df = compute_satterthwaite_df(vcov_beta, jac, hess)
# df[0] is the denominator df for testing beta[0]
compute_satterthwaite_summary_table¶
compute_satterthwaite_summary_table(beta: np.ndarray, beta_names: list[str], vcov_beta: np.ndarray, vcov_varpar: np.ndarray, jac_list: list[np.ndarray]) -> dict[str, list]
Compute full coefficient table with Satterthwaite df and p-values.
Computes t-statistics, degrees of freedom, and p-values for all fixed effects using Satterthwaite’s approximation.
This is a convenience function that combines compute_satterthwaite_df()
and compute_satterthwaite_t_test() into a single summary table.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
beta | ndarray | Fixed effects estimates, shape (p,). | required |
beta_names | list[str] | Names of fixed effects. | required |
vcov_beta | ndarray | Variance-covariance matrix of beta, shape (p, p). | required |
vcov_varpar | ndarray | Variance-covariance matrix of variance parameters. | required |
jac_list | list[ndarray] | List of Jacobian matrices, each shape (p, p). | required |
Returns:
| Type | Description |
|---|---|
dict[str, list] | Dictionary with keys: |
dict[str, list] | - ‘Parameter’: List of parameter names |
dict[str, list] | - ‘Estimate’: Coefficient estimates |
dict[str, list] | - ‘Std. Error’: Standard errors |
dict[str, list] | - ‘df’: Satterthwaite degrees of freedom |
dict[str, list] | - ‘t value’: t-statistics |
dict[str, list] | - ‘Pr(>\|t\|)’: Two-tailed p-values |
Notes:
Computes one t-test per fixed effect
Uses standard t-distribution for p-values
All computations vectorized for efficiency
Examples:
# After fitting lmer and computing Hessian/Jacobian
table = compute_satterthwaite_summary_table(
beta, beta_names, vcov_beta, vcov_varpar, jac_list
)
import pandas as pd
df = pd.DataFrame(table)
print(df)
compute_satterthwaite_t_test¶
compute_satterthwaite_t_test(beta: np.ndarray, se: np.ndarray, df: np.ndarray, conf_level: float = 0.95) -> dict[str, np.ndarray]
Compute t-statistics, p-values, and confidence intervals.
Uses Satterthwaite degrees of freedom for each coefficient to compute t-test statistics and p-values.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
beta | ndarray | Coefficient estimates, shape (p,). | required |
se | ndarray | Standard errors, shape (p,). | required |
df | ndarray | Satterthwaite degrees of freedom for each coefficient, shape (p,). | required |
conf_level | float | Confidence level for intervals (default 0.95 for 95% CI). | 0.95 |
Returns:
| Type | Description |
|---|---|
dict[str, ndarray] | Dictionary with: |
dict[str, ndarray] | - ‘statistic’: t-statistics, shape (p,) |
dict[str, ndarray] | - ‘p_value’: Two-tailed p-values, shape (p,) |
dict[str, ndarray] | - ‘ci_lower’: Lower confidence bounds, shape (p,) |
dict[str, ndarray] | - ‘ci_upper’: Upper confidence bounds, shape (p,) |
Examples:
# After computing df from compute_satterthwaite_df()
se = np.sqrt(np.diag(vcov_beta))
results = compute_satterthwaite_t_test(beta, se, df)
print(results['p_value'])  # Two-tailed p-values
compute_sd_jacobian¶
compute_sd_jacobian(theta: np.ndarray, sigma: float, group_names: list[str], random_names: list[str] | dict[str, list[str]], re_structure: str | list[str] | dict[str, str]) -> tuple[np.ndarray, np.ndarray]
Compute SDs and Jacobian of SDs w.r.t. varpar = [theta, sigma].
Sigma = sigma^2 * L @ L.T, so SD[i] = sqrt(Sigma[i,i]) = sigma * sqrt((L @ L.T)[i,i]).
The Jacobian has shape (n_sds, n_theta + 1) where the last column corresponds to sigma.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
theta | ndarray | Theta parameters (relative Cholesky factors). | required |
sigma | float | Residual standard deviation. | required |
group_names | list[str] | Grouping factor names. | required |
random_names | list[str] | dict[str, list[str]] | Random effect names per group. | required |
re_structure | str | list[str] | dict[str, str] | RE structure per group. | required |
Returns:
| Type | Description |
|---|---|
tuple[ndarray, ndarray] | Tuple of (sds, jacobian) where: |
tuple[ndarray, ndarray] | - sds: Array of SD values, length = n_variance_components + 1 |
tuple[ndarray, ndarray] | - jacobian: Shape (n_sds, n_theta + 1) |
compute_se_from_vcov¶
compute_se_from_vcov(vcov: np.ndarray) -> np.ndarray
Compute standard errors from variance-covariance matrix.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
vcov | ndarray | Variance-covariance matrix of parameter estimates, shape (p, p). | required |
Returns:
| Type | Description |
|---|---|
ndarray | Standard errors for each parameter, shape (p,). |
Examples:
>>> vcov = np.array([[0.04, 0.01], [0.01, 0.09]])
>>> se = compute_se_from_vcov(vcov)
>>> np.allclose(se, [0.2, 0.3])
True
compute_sigma_se_wald¶
compute_sigma_se_wald(model: MixedModelProtocol) -> float
Compute Wald standard error for sigma.
Uses the delta method approximation from the Hessian.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
model | MixedModelProtocol | Fitted mixed-effects model with Satterthwaite inference. | required |
Returns:
| Type | Description |
|---|---|
float | Standard error of sigma. |
compute_studentized_residuals¶
compute_studentized_residuals(residuals: np.ndarray, hat: np.ndarray, sigma: float) -> np.ndarray
Compute internally studentized (standardized) residuals.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
residuals | ndarray | Raw residuals of shape (n,). | required |
hat | ndarray | Leverage values (diagonal of hat matrix) of shape (n,). | required |
sigma | float | Residual standard error. | required |
Returns:
| Type | Description |
|---|---|
ndarray | Studentized residuals of shape (n,). |
Note: Formula: r_i / (σ * sqrt(1 - h_i)). Also called “internally studentized” or “standardized” residuals. They have approximately unit variance under model assumptions.
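The formula in the note is a one-liner in numpy; this is an illustrative sketch, not the bossanova source:

```python
import numpy as np

def studentized_residuals_sketch(residuals, hat, sigma):
    # r_i / (sigma * sqrt(1 - h_i)); assumes all leverage values h_i < 1
    return residuals / (sigma * np.sqrt(1.0 - hat))
```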
Examples:
>>> residuals = np.array([0.5, -0.3, 0.8])
>>> hat = np.array([0.2, 0.3, 0.5])
>>> sigma = 1.2
>>> r_std = compute_studentized_residuals(residuals, hat, sigma)
compute_t_critical¶
compute_t_critical(conf_int: float, df: float | np.ndarray, alternative: str = 'two-sided') -> float | np.ndarray
Compute t-distribution critical value for confidence interval.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
conf_int | float | Confidence level (0 < conf_int < 1). | required |
df | float | ndarray | Degrees of freedom (scalar or per-coefficient array). | required |
alternative | str | “two-sided” (default), “greater”, or “less”. | ‘two-sided’ |
Returns:
| Type | Description |
|---|---|
float | ndarray | Critical value(s) from t-distribution. Scalar if df is scalar, |
float | ndarray | array if df is array. |
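A minimal scipy sketch of this computation (illustrative, not the bossanova source), showing how the alternative determines the tail quantile:

```python
from scipy import stats

def t_critical_sketch(conf_int, df, alternative="two-sided"):
    # Two-sided: split alpha across both tails; one-sided: all in one tail.
    alpha = 1.0 - conf_int
    q = 1.0 - alpha / 2.0 if alternative == "two-sided" else 1.0 - alpha
    return stats.t.ppf(q, df)  # vectorizes over array-valued df
```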
Examples:
>>> crit = compute_t_critical(0.95, df=10)
>>> abs(crit - 2.228) < 0.01
True
compute_t_test¶
compute_t_test(L: np.ndarray, coef: np.ndarray, vcov: np.ndarray, df: float) -> TTestResult
Compute t-test for a single contrast L @ β = 0.
For a single contrast (q=1), returns t-statistic instead of F.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
L | ndarray | Contrast vector of shape (1, p) or (p,). | required |
coef | ndarray | Coefficient estimates of shape (p,). | required |
vcov | ndarray | Variance-covariance matrix of shape (p, p). | required |
df | float | Degrees of freedom for t-distribution. | required |
Returns:
| Type | Description |
|---|---|
TTestResult | TTestResult containing estimate, se, t_value, df, p_value. |
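The single-contrast computation reduces to a few lines; this sketch (returning a plain tuple rather than a `TTestResult`) shows the quadratic-form standard error and the two-tailed p-value:

```python
import numpy as np
from scipy import stats

def t_test_sketch(L, coef, vcov, df):
    """Illustrative single-contrast t-test for H0: L @ beta = 0."""
    L = np.atleast_1d(L).ravel()
    estimate = L @ coef
    se = np.sqrt(L @ vcov @ L)                       # sqrt(L V L')
    t_value = estimate / se
    p_value = 2.0 * stats.t.sf(np.abs(t_value), df)  # two-tailed
    return estimate, se, t_value, p_value
```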
Examples:
>>> L = np.array([-1, 1, 0]) # Compare coef[1] to coef[0]
>>> coef = np.array([10.0, 12.0, 15.0])
>>> vcov = np.diag([1.0, 1.5, 2.0])
>>> result = compute_t_test(L, coef, vcov, df=97)
>>> result.estimate
2.0
compute_tukey_critical¶
compute_tukey_critical(conf_level: float, k: int, df: float) -> float
Compute Tukey HSD critical value for pairwise comparisons.
Uses the studentized range distribution to compute the critical value
for simultaneous pairwise comparisons of k group means. Matches
R’s qtukey(conf_level, k, df) / sqrt(2).
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
conf_level | float | Confidence level (e.g. 0.95). | required |
k | int | Number of group means being compared (family size). | required |
df | float | Degrees of freedom. Use np.inf for asymptotic (z-based). | required |
Returns:
| Type | Description |
|---|---|
float | Critical value for Tukey-adjusted confidence intervals. |
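The `qtukey(conf_level, k, df) / sqrt(2)` relation maps directly onto scipy's studentized range distribution; a sketch (illustrative, not the bossanova source):

```python
import numpy as np
from scipy.stats import studentized_range

def tukey_critical_sketch(conf_level, k, df):
    # qtukey(conf_level, k, df) / sqrt(2), as in R's Tukey HSD intervals
    return studentized_range.ppf(conf_level, k, df) / np.sqrt(2.0)
```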
Examples:
>>> crit = compute_tukey_critical(0.95, k=3, df=28)
>>> abs(crit - 2.474) < 0.001
True
compute_vif¶
compute_vif(X: np.ndarray, X_names: list[str]) -> pl.DataFrame
Compute variance inflation factors.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
X | ndarray | Design matrix of shape (n, p), including intercept if present. | required |
X_names | list[str] | Column names for design matrix. | required |
Returns:
| Type | Description |
|---|---|
DataFrame | DataFrame with columns: [term, vif, ci_increase_factor] |
DataFrame | where ci_increase_factor = sqrt(vif). |
Note: VIF measures how much the variance of a coefficient is inflated due to multicollinearity. VIF_j = 1 / (1 - R²_j) where R²_j is from regressing predictor j on all other predictors.
Excludes intercept column. Rules of thumb: VIF > 10: severe multicollinearity, VIF > 5: moderate multicollinearity, VIF < 5: acceptable.
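The VIF_j = 1 / (1 - R²_j) definition can be sketched with plain numpy regressions. This is an illustrative stand-in (returning a dict rather than a polars DataFrame); constant columns are skipped as a stand-in for intercept exclusion:

```python
import numpy as np

def vif_sketch(X, X_names):
    """VIF_j = 1 / (1 - R^2_j), regressing column j on all the others."""
    out = {}
    keep = [i for i, c in enumerate(X.T) if not np.allclose(c, c[0])]
    for j in keep:                                 # skip constant (intercept) columns
        others = np.delete(X, j, axis=1)
        y = X[:, j]
        beta, *_ = np.linalg.lstsq(others, y, rcond=None)
        resid = y - others @ beta
        tss = np.sum((y - y.mean()) ** 2)
        r2 = 1.0 - resid @ resid / tss
        out[X_names[j]] = 1.0 / (1.0 - r2)
    return out
```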
Examples:
>>> X = np.array([[1, 2, 3], [1, 3, 4], [1, 4, 5]])
>>> names = ['Intercept', 'x1', 'x2']
>>> vif_df = compute_vif(X, names)
compute_wald_ci_varying¶
compute_wald_ci_varying(theta: np.ndarray, sigma: float, vcov_varpar: np.ndarray, group_names: list[str], random_names: list[str] | dict[str, list[str]], re_structure: str | list[str] | dict[str, str], conf_level: float = 0.95) -> tuple[np.ndarray, np.ndarray]
Compute Wald CIs for variance components on SD scale.
Uses the delta method to transform asymptotic variance from varpar (theta, sigma) scale to the standard deviation scale.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
theta | ndarray | Optimized theta parameters (relative scale). | required |
sigma | float | Residual standard deviation. | required |
vcov_varpar | ndarray | Asymptotic covariance of [theta, sigma], shape (k, k) where k = len(theta) + 1. This is 2 * H^{-1} from Satterthwaite. | required |
group_names | list[str] | Names of grouping factors. | required |
random_names | list[str] | dict[str, list[str]] | Names of random effects per group. | required |
re_structure | str | list[str] | dict[str, str] | RE structure type(s) per group. | required |
conf_level | float | Confidence level (default 0.95). | 0.95 |
Returns:
| Type | Description |
|---|---|
tuple[ndarray, ndarray] | Tuple of (ci_lower, ci_upper) arrays on SD scale. |
tuple[ndarray, ndarray] | Length = number of variance components + 1 (for residual). |
Notes: The delta method approximation assumes:
Asymptotic normality of theta and sigma estimates
Sample size large enough for the approximation to hold
For variance components near zero (boundary), profile likelihood CIs are more accurate as they respect the parameter bounds.
compute_wald_statistic¶
compute_wald_statistic(contrast_values: np.ndarray, contrast_vcov: np.ndarray) -> float
Compute Wald statistic for testing L @ β = 0.
Computes: W = (L @ β)’ @ (L @ V @ L’)⁻¹ @ (L @ β)
Uses solve for efficiency, with pinv fallback for singular matrices.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
contrast_values | ndarray | Contrast estimates L @ β, shape (q,). | required |
contrast_vcov | ndarray | Variance of contrasts L @ V @ L’, shape (q, q). | required |
Returns:
| Type | Description |
|---|---|
float | Wald statistic value. |
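The solve-with-pinv-fallback strategy described above can be sketched as follows (illustrative, not the bossanova source):

```python
import numpy as np

def wald_statistic_sketch(contrast_values, contrast_vcov):
    """W = c' (L V L')^{-1} c via solve, falling back to pinv if singular."""
    try:
        x = np.linalg.solve(contrast_vcov, contrast_values)
    except np.linalg.LinAlgError:
        # Singular contrast covariance: use the Moore-Penrose pseudo-inverse.
        x = np.linalg.pinv(contrast_vcov) @ contrast_values
    return float(contrast_values @ x)
```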
Examples:
>>> contrast_values = np.array([2.0, 5.0])
>>> contrast_vcov = np.array([[2.5, 0.5], [0.5, 3.0]])
>>> W = compute_wald_statistic(contrast_values, contrast_vcov)
>>> W > 0
True
compute_welch_satterthwaite_df_per_coef¶
compute_welch_satterthwaite_df_per_coef(X: NDArray[np.float64], cell_info: CellInfo) -> NDArray[np.float64]
Compute per-coefficient Welch-Satterthwaite degrees of freedom.
For each coefficient j, decomposes its sandwich variance into per-cell contributions and applies the Satterthwaite approximation independently. This gives genuinely different df values for different coefficients.
For a two-group treatment-coded design y ~ factor(group):
Intercept (reference group mean): df = n_ref - 1
Slope (group difference): classic two-group Welch formula
Algorithm for each coefficient j::
W = X @ (X'X)^{-1} # (n, p) projection weights
cell_W_sq[k, j] = sum_{i in cell_k} W[i,j]^2
v[k, j] = s_k^2 * cell_W_sq[k, j] # cell k's contribution
df_j = (sum_k v[k,j])^2 / sum_k [v[k,j]^2 / (n_k - 1)]
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
X | NDArray[float64] | Design matrix, shape (n, p). | required |
cell_info | CellInfo | Cell-based variance information from compute_cell_info(). | required |
Returns:
| Type | Description |
|---|---|
NDArray[float64] | Per-coefficient degrees of freedom, shape (p,). |
Notes:
This is a bossanova extension with no direct R equivalent. R’s
sandwich/emmeans/coeftest use residual df with HC SEs.
Bossanova computes proper Satterthwaite df per coefficient, which
for a two-group model exactly matches the classic Welch t-test
(t.test(var.equal=FALSE) / scipy.stats.ttest_ind).
For symmetric designs (equal n and variance), all df are equal
The total variance sum_k v[k,j] equals vcov[j,j] from compute_welch_vcov()
Loops over cells (typically 2-5), vectorized over coefficients
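The algorithm above can be sketched with explicit `cells`, `s2`, and `n` arguments standing in for the fields of `CellInfo` (a hypothetical layout, not the actual dataclass):

```python
import numpy as np

def welch_df_per_coef_sketch(X, cells, s2, n):
    """Per-coefficient Satterthwaite df from per-cell variance contributions.

    cells: list of row-index arrays, one per factor cell
    s2, n: per-cell residual variances and cell sizes
    """
    W = X @ np.linalg.inv(X.T @ X)                       # (n_obs, p) projection weights
    v = np.array([s2_k * (W[idx] ** 2).sum(axis=0)       # cell k's contribution
                  for idx, s2_k in zip(cells, s2)])      # shape (n_cells, p)
    num = v.sum(axis=0) ** 2
    den = (v ** 2 / (np.asarray(n) - 1)[:, None]).sum(axis=0)
    return num / den
```

For a two-group treatment-coded design this reproduces df = n_ref - 1 for the intercept and the classic Welch df for the slope, as the notes claim.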
Examples:
import numpy as np
# Two-group treatment-coded design
X = np.column_stack([np.ones(20), np.array([0]*10 + [1]*10)])
# ... with cell_info from compute_cell_info ...
df = compute_welch_satterthwaite_df_per_coef(X, cell_info)
# df[0] ≈ n_ref - 1 (intercept)
# df[1] ≈ classic Welch df (slope)
compute_z_critical¶
compute_z_critical(conf_int: float, alternative: str = 'two-sided') -> float
Compute z-distribution critical value for confidence interval.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
conf_int | float | Confidence level (0 < conf_int < 1). | required |
alternative | str | “two-sided” (default), “greater”, or “less”. | ‘two-sided’ |
Returns:
| Type | Description |
|---|---|
float | Critical value from standard normal distribution. |
Examples:
>>> crit = compute_z_critical(0.95)
>>> abs(crit - 1.96) < 0.01
True
convert_theta_ci_to_sd¶
convert_theta_ci_to_sd(ci_theta: dict[str, tuple[float, float]], theta_opt: np.ndarray, sigma_opt: float, group_names: list[str], random_names: list[str] | dict[str, list[str]], re_structure: str | list[str] | dict[str, str]) -> tuple[dict[str, tuple[float, float]], np.ndarray, np.ndarray]
Convert theta-scale CIs to SD-scale CIs.
For variance components, SD = sigma * |theta| for simple cases. For slope models with Cholesky factors, we use the diagonal elements of sigma^2 * L @ L.T.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
ci_theta | dict[str, tuple[float, float]] | CIs on theta scale, keyed by parameter name. | required |
theta_opt | ndarray | Optimal theta values. | required |
sigma_opt | float | Optimal sigma value. | required |
group_names | list[str] | Grouping factor names. | required |
random_names | list[str] | dict[str, list[str]] | Random effect names. | required |
re_structure | str | list[str] | dict[str, str] | RE structure per group. | required |
Returns:
| Type | Description |
|---|---|
tuple[dict, ndarray, ndarray] | Tuple of (ci_sd dict, ci_lower_sd array, ci_upper_sd array). |
tuple[dict, ndarray, ndarray] | Arrays are ordered to match result_varying rows. |
delta_method_se¶
delta_method_se(X_pred: np.ndarray, vcov: np.ndarray) -> np.ndarray
Compute standard errors for predictions via delta method.
For predictions Xβ, the variance is: Var(Xβ) = X Var(β) X^T
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
X_pred | ndarray | Design matrix for predictions, shape (n, p). | required |
vcov | ndarray | Variance-covariance matrix of parameter estimates, shape (p, p). | required |
Returns:
| Type | Description |
|---|---|
ndarray | Standard errors for each prediction, shape (n,). |
Note: Uses efficient computation: SE_i = sqrt(sum((X_pred @ vcov) * X_pred, axis=1)) which avoids forming the full quadratic form.
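The row-wise computation in the note avoids materializing the full n-by-n matrix X V X'; a sketch (illustrative, not the bossanova source):

```python
import numpy as np

def delta_method_se_sketch(X_pred, vcov):
    # Row-wise quadratic form: SE_i = sqrt(x_i' V x_i),
    # without forming the full X V X' matrix.
    return np.sqrt(np.sum((X_pred @ vcov) * X_pred, axis=1))
```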
Examples:
>>> X_new = np.array([[1, 2], [1, 3]])
>>> vcov = np.array([[0.1, 0.01], [0.01, 0.05]])
>>> se = delta_method_se(X_new, vcov)
extract_ci_bound¶
extract_ci_bound(spline: UnivariateSpline, zeta_target: float, lower_bound: float, upper_guess: float) -> float
Extract CI bound by finding where spline equals target zeta.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
spline | UnivariateSpline | Reverse spline (zeta -> param). | required |
zeta_target | float | Target zeta value (negative for lower, positive for upper). | required |
lower_bound | float | Parameter lower bound. | required |
upper_guess | float | Upper search bound guess. | required |
Returns:
| Type | Description |
|---|---|
float | Parameter value where zeta = zeta_target. |
extract_factors_from_formula¶
extract_factors_from_formula(formula: str, data: pl.DataFrame) -> list[str]
Extract factor (categorical) column names from a model formula.
Parses the RHS of the formula and identifies which columns are categorical (String, Categorical, or Enum type in polars).
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
formula | str | Model formula like “y ~ A + B + x” or “y ~ A * B + center(x)”. | required |
data | DataFrame | DataFrame containing the columns referenced in the formula. | required |
Returns:
| Type | Description |
|---|---|
list[str] | List of column names that are categorical/factor variables. |
Examples:
import polars as pl
df = pl.DataFrame({
"y": [1.0, 2.0, 3.0],
"group": ["A", "B", "A"],
"x": [0.1, 0.2, 0.3],
})
extract_factors_from_formula("y ~ group + x", df)
# ['group']
# Multiple factors
df2 = df.with_columns(pl.col("group").alias("treatment"))
extract_factors_from_formula("y ~ group * treatment", df2)
# ['group', 'treatment']
Notes:
Handles interaction terms (*, :) by extracting component columns
Handles transforms like center(x) by extracting the inner column
Skips intercept terms (1, 0, -1)
Only returns columns that exist in the data
format_pvalue_with_stars¶
format_pvalue_with_stars(p_val: float) -> str
Format p-value with R-style significance codes.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
p_val | float | The p-value to format. | required |
Returns:
| Type | Description |
|---|---|
str | Formatted string with p-value and significance stars (fixed 12-char width). |
str | Uses R’s significance codes: ‘***’ for p < 0.001, ‘**’ for p < 0.01, ‘*’ for p < 0.05, ‘.’ for p < 0.1. |
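The significance coding can be sketched as below. The threshold logic follows R's convention; the exact field widths are illustrative, not necessarily bossanova's 12-character layout:

```python
def pvalue_stars_sketch(p):
    """R-style significance codes; spacing here is illustrative."""
    stars = ("***" if p < 0.001 else "**" if p < 0.01 else
             "*" if p < 0.05 else "." if p < 0.1 else "")
    # Very small p-values are reported as a bound rather than a number.
    text = "< 0.001" if p < 0.001 else f"{p:7.3f}"
    return f"{text} {stars:<3} "
```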
Examples:
>>> format_pvalue_with_stars(0.0001)
'< 0.001 *** '
>>> format_pvalue_with_stars(0.023)
' 0.023 * '
parse_conf_int¶
parse_conf_int(conf_int: float | int | str) -> float
Parse flexible confidence interval input to float.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
conf_int | float | int | str | Confidence interval specification. Accepts: float in (0, 1): treated as confidence level, int in (1, 99]: converted to proportion (e.g., 95 -> 0.95), str: parsed as number with optional ‘%’ (e.g., “95%”, “95”). | required |
Returns:
| Type | Description |
|---|---|
float | Confidence level as proportion in (0, 1). |
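The accepted input forms map onto a short normalization routine; a sketch of the documented behavior (illustrative, not the bossanova source):

```python
def parse_conf_int_sketch(conf_int):
    """Normalize 0.95 / 95 / '95%' to a proportion in (0, 1)."""
    if isinstance(conf_int, str):
        conf_int = float(conf_int.rstrip("%").strip())  # "95%" -> 95.0
    conf_int = float(conf_int)
    if conf_int >= 1.0:                                 # e.g. 95 -> 0.95
        conf_int /= 100.0
    if not 0.0 < conf_int < 1.0:
        raise ValueError(f"invalid confidence level: {conf_int}")
    return conf_int
```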
Examples:
>>> parse_conf_int(0.95)
0.95
>>> parse_conf_int(95)
0.95
>>> parse_conf_int("95%")
0.95
profile_likelihood¶
profile_likelihood(model: MixedModelProtocol, conf_level: float = 0.95, threshold: float = 4.0, n_steps: int = 15, verbose: bool = False) -> dict
Compute profile likelihood confidence intervals for variance components.
Profiles theta parameters (relative Cholesky factors) to compute profile likelihood CIs that are more accurate than Wald CIs, especially when variance components are near the boundary (zero).
Sigma (residual SD) is derived analytically at each profile step. Sigma CI is computed via the Wald approximation.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
model | MixedModelProtocol | Fitted mixed-effects model (lmer, glmer, or unified model). Must have attributes: _theta, _sigma, _deviance, _deviance_fn, _group_names, _random_names, and methods get_theta_lower_bounds(), _get_re_structure(). | required |
conf_level | float | Confidence level for CIs (default 0.95). | 0.95 |
threshold | float | Maximum \|zeta\| to profile before stopping (default 4.0). | 4.0 |
n_steps | int | Number of steps per direction from MLE (default 15). | 15 |
verbose | bool | Print progress messages. | False |
Returns:
| Type | Description |
|---|---|
dict | Dict with keys matching ProfileState fields: table, spline_forward, |
dict | spline_reverse, ci_theta, ci_sd, ci_lower_sd, ci_upper_sd, |
dict | conf_level, dev_opt, threshold. The caller constructs ProfileState |
dict | to respect the maths-never-imports-containers boundary. |
Notes:
Profile CIs respect parameter bounds (variances >= 0)
Computational cost: ~2 * n_steps * n_theta deviance evaluations
For large models, consider using Wald CIs (cheaper) or bootstrap
Examples:
>>> # Legacy API
>>> m = lmer("y ~ x + (1|group)", data).fit()
>>> prof = profile_likelihood(m)
>>> prof.ci_sd # CIs on SD scale
{'group:Intercept': (15.21, 40.23), 'Residual': (22.83, 28.69)}
>>> # New unified API
>>> m = model("y ~ x + (1|group)", data).fit()
>>> prof = profile_likelihood(m)
profile_theta_parameter¶
profile_theta_parameter(param_idx: int, param_name: str, param_opt: float, theta_opt: np.ndarray, dev_opt: float, deviance_fn: Callable[[np.ndarray], float], lower_bounds: np.ndarray, n_steps: int, threshold: float, verbose: bool) -> dict
Profile a single theta parameter bidirectionally from its MLE.
At each step, fixes the focal theta and re-optimizes all other theta elements to find the conditional MLE deviance.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
param_idx | int | Index of parameter in theta vector. | required |
param_name | str | Human-readable name. | required |
param_opt | float | Optimal value of this parameter. | required |
theta_opt | ndarray | Full optimal theta vector. | required |
dev_opt | float | Optimal deviance. | required |
deviance_fn | Callable[[ndarray], float] | Function theta -> deviance (model’s internal fn). | required |
lower_bounds | ndarray | Lower bounds for ALL theta parameters. | required |
n_steps | int | Steps in each direction. | required |
threshold | float | Maximum \|zeta\| to profile. | required |
verbose | bool | Print progress. | required |
Returns:
| Type | Description |
|---|---|
dict | Dict with ‘rows’, ‘focal’, ‘zeta’ lists. |
satterthwaite_df_for_contrasts¶
satterthwaite_df_for_contrasts(L: np.ndarray, vcov_beta: np.ndarray, vcov_varpar: np.ndarray, jac_list: list[np.ndarray], min_df: float = 1.0, max_df: float = 1000000.0) -> np.ndarray
Compute Satterthwaite degrees of freedom for arbitrary contrasts.
This generalizes compute_satterthwaite_df() to handle arbitrary linear contrasts
L @ β, not just individual coefficients. This is needed for emmeans where
each EMM is a linear combination of coefficients.
The Satterthwaite formula for contrast L is:
df = 2 * var_contrast² / Var(var_contrast)
where:
var_contrast = L @ Vcov(β) @ L.T
grad[i] = L @ Jac_i @ L.T (gradient w.r.t. variance parameter i)
Var(var_contrast) = grad @ Vcov(varpar) @ grad
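The formula above translates directly into numpy; this is an illustrative sketch of the computation, not the bossanova implementation:

```python
import numpy as np

def contrast_satterthwaite_df_sketch(L, vcov_beta, vcov_varpar, jac_list,
                                     min_df=1.0, max_df=1e6):
    """df = 2 * (L V L')^2 / Var(L V L') for each contrast row of L."""
    L2 = np.atleast_2d(L)
    df = np.empty(L2.shape[0])
    for i, l in enumerate(L2):
        var_c = l @ vcov_beta @ l                       # contrast variance
        grad = np.array([l @ J @ l for J in jac_list])  # d var_c / d varpar_i
        var_of_var = grad @ vcov_varpar @ grad
        df[i] = max_df if var_of_var <= 0 else 2.0 * var_c**2 / var_of_var
    df = np.clip(df, min_df, max_df)
    return float(df[0]) if np.ndim(L) == 1 else df      # scalar for 1D input
```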
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
L | ndarray | Contrast matrix, shape (n_contrasts, n_coef). Each row is a contrast. For a single contrast, shape can be (n_coef,). | required |
vcov_beta | ndarray | Variance-covariance matrix of fixed effects, shape (p, p). | required |
vcov_varpar | ndarray | Variance-covariance matrix of variance parameters, shape (k, k). This is 2 * H^{-1} where H is the Hessian of deviance. | required |
jac_list | list[ndarray] | List of k Jacobian matrices, each shape (p, p). jac_list[i] = ∂Vcov(β)/∂varpar_i. | required |
min_df | float | Minimum allowable df (default 1.0). | 1.0 |
max_df | float | Maximum allowable df (default 1e6). | 1000000.0 |
Returns:
| Type | Description |
|---|---|
ndarray | Array of degrees of freedom for each contrast, shape (n_contrasts,). |
ndarray | For a single contrast (1D input), returns a scalar float. |
Examples:
# Compute df for EMMs where X_ref is the prediction matrix
df = satterthwaite_df_for_contrasts(
X_ref, vcov_beta, vcov_varpar, jac_list
)
# df[i] is the denominator df for EMM i
Modules¶
diagnostics¶
Residuals, influence measures, and leverage statistics.
Functions:
| Name | Description |
|---|---|
compute_cooks_distance | Compute Cook’s distance for influence. |
compute_leverage | Compute diagonal of hat matrix (leverage values). |
compute_studentized_residuals | Compute internally studentized (standardized) residuals. |
compute_vif | Compute variance inflation factors. |
Attributes¶
Classes¶
Functions¶
compute_cooks_distance¶
compute_cooks_distance(residuals: np.ndarray, hat: np.ndarray, sigma: float, p: int) -> np.ndarray
Compute Cook’s distance for influence.
Cook’s distance measures the influence of each observation on the fitted coefficients.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
residuals | ndarray | Raw residuals of shape (n,). | required |
hat | ndarray | Leverage values (diagonal of hat matrix) of shape (n,). | required |
sigma | float | Residual standard error. | required |
p | int | Number of parameters (including intercept). | required |
Returns:
| Type | Description |
|---|---|
ndarray | Cook’s distances of shape (n,). |
Note: Cook’s distance formula (matching R’s cooks.distance()): D_i = (e_i² / (p * σ²)) * (h_i / (1 - h_i)²) where e_i is the raw residual, σ is sigma, h_i is leverage.
Values > 1 are traditionally considered influential. More conservatively, D_i > 4/(n-p) is used.
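The formula in the note vectorizes directly; a sketch (illustrative, not the bossanova source):

```python
import numpy as np

def cooks_distance_sketch(residuals, hat, sigma, p):
    # D_i = (e_i^2 / (p * sigma^2)) * (h_i / (1 - h_i)^2)
    return (residuals**2 / (p * sigma**2)) * hat / (1.0 - hat) ** 2
```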
Examples:
>>> D = compute_cooks_distance(residuals, hat, sigma=2.5, p=3)
>>> influential = np.where(D > 1)[0]
compute_leverage¶
compute_leverage(X: np.ndarray, weights: np.ndarray | None = None, XtWX_inv: np.ndarray | None = None) -> np.ndarray
Compute diagonal of hat matrix (leverage values).
For the unweighted case (OLS), uses efficient QR decomposition. For the weighted case (GLM), can use pre-computed (X’WX)^{-1} to avoid redundant QR decomposition when this matrix is already available from IRLS.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
X | ndarray | Design matrix of shape (n, p). | required |
weights | ndarray | None | Optional observation weights for weighted hat matrix (GLM). | None |
XtWX_inv | ndarray | None | Optional pre-computed (X’WX)^{-1} of shape (p, p). When provided with weights, avoids QR decomposition by computing leverage directly as h_i = (W^{1/2} x_i)’ (X’WX)^{-1} (W^{1/2} x_i). This is more efficient when (X’WX)^{-1} is already available (e.g., from IRLS). | None |
Returns:
| Type | Description |
|---|---|
ndarray | Diagonal of hat matrix (leverage values), shape (n,). |
Note: Unweighted case: H = X(X’X)^{-1} X’, computed via QR as diag(H) = ||Q[i, :]||². Weighted case: H = W^{1/2} X (X’WX)^{-1} X’ W^{1/2}. The sum of leverage values equals p (number of predictors).
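The QR route in the note can be sketched for both cases (omitting the pre-computed `XtWX_inv` shortcut); illustrative, not the bossanova source:

```python
import numpy as np

def leverage_sketch(X, weights=None):
    """diag(H) via thin QR; weighted case scales rows by sqrt(w)."""
    Xw = X if weights is None else X * np.sqrt(weights)[:, None]
    Q, _ = np.linalg.qr(Xw)         # thin QR: Q has shape (n, p)
    return np.sum(Q**2, axis=1)     # h_i = ||Q[i, :]||^2
```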
Examples:
>>> X = np.array([[1, 2], [1, 3], [1, 4]])
>>> h = compute_leverage(X)
>>> np.sum(h) # Should equal p (number of predictors)
2.0
>>> # GLM with pre-computed XtWX_inv (avoids redundant QR)
>>> weights = np.array([1.0, 0.5, 0.8])
>>> XtWX = X.T @ np.diag(weights) @ X
>>> XtWX_inv = np.linalg.inv(XtWX)
>>> h = compute_leverage(X, weights=weights, XtWX_inv=XtWX_inv)
compute_studentized_residuals¶
compute_studentized_residuals(residuals: np.ndarray, hat: np.ndarray, sigma: float) -> np.ndarray
Compute internally studentized (standardized) residuals.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
residuals | ndarray | Raw residuals of shape (n,). | required |
hat | ndarray | Leverage values (diagonal of hat matrix) of shape (n,). | required |
sigma | float | Residual standard error. | required |
Returns:
| Type | Description |
|---|---|
ndarray | Studentized residuals of shape (n,). |
Note: Formula: r_i / (σ * sqrt(1 - h_i)). Also called “internally studentized” or “standardized” residuals. They have approximately unit variance under model assumptions.
Examples:
>>> residuals = np.array([0.5, -0.3, 0.8])
>>> hat = np.array([0.2, 0.3, 0.5])
>>> sigma = 1.2
>>> r_std = compute_studentized_residuals(residuals, hat, sigma)
compute_vif¶
compute_vif(X: np.ndarray, X_names: list[str]) -> pl.DataFrame
Compute variance inflation factors.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
X | ndarray | Design matrix of shape (n, p), including intercept if present. | required |
X_names | list[str] | Column names for design matrix. | required |
Returns:
| Type | Description |
|---|---|
DataFrame | DataFrame with columns: [term, vif, ci_increase_factor] |
DataFrame | where ci_increase_factor = sqrt(vif). |
Note: VIF measures how much the variance of a coefficient is inflated due to multicollinearity. VIF_j = 1 / (1 - R²_j) where R²_j is from regressing predictor j on all other predictors.
Excludes intercept column. Rules of thumb: VIF > 10: severe multicollinearity, VIF > 5: moderate multicollinearity, VIF < 5: acceptable.
Examples:
>>> X = np.array([[1, 2, 3], [1, 3, 4], [1, 4, 5]])
>>> names = ['Intercept', 'x1', 'x2']
>>> vif_df = compute_vif(X, names)
estimation¶
Standard errors, confidence intervals, and coefficient inference.
Classes:
| Name | Description |
|---|---|
InferenceResult | Results from coefficient inference computation. |
Functions:
| Name | Description |
|---|---|
compute_ci | Compute confidence interval bounds. |
compute_coefficient_inference | Compute inference statistics for regression coefficients. |
compute_se_from_vcov | Compute standard errors from variance-covariance matrix. |
compute_t_critical | Compute t-distribution critical value for confidence interval. |
compute_z_critical | Compute z-distribution critical value for confidence interval. |
delta_method_se | Compute standard errors for predictions via delta method. |
parse_conf_int | Parse flexible confidence interval input to float. |
Classes¶
InferenceResult¶
InferenceResult(se: np.ndarray, statistic: np.ndarray, p_values: np.ndarray, ci_lower: np.ndarray, ci_upper: np.ndarray, df_values: list[float]) -> None
Results from coefficient inference computation.
Attributes:
| Name | Type | Description |
|---|---|---|
se | ndarray | Standard errors for each coefficient. |
statistic | ndarray | Test statistics (t or z values). |
p_values | ndarray | Two-tailed p-values. |
ci_lower | ndarray | Lower confidence interval bounds. |
ci_upper | ndarray | Upper confidence interval bounds. |
df_values | list[float] | Degrees of freedom for each coefficient (inf for z-distribution). |
Attributes¶
ci_lower¶
ci_lower: np.ndarray
ci_upper¶
ci_upper: np.ndarray
df_values¶
df_values: list[float]
p_values¶
p_values: np.ndarray
se¶
se: np.ndarray
statistic¶
statistic: np.ndarray
Functions¶
compute_ci¶
compute_ci(estimate: np.ndarray, se: np.ndarray, critical: float | np.ndarray, alternative: str = 'two-sided') -> tuple[np.ndarray, np.ndarray]
Compute confidence interval bounds.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
estimate | ndarray | Point estimates. | required |
se | ndarray | Standard errors. | required |
critical | float | ndarray | Critical value(s) from distribution (t or z). Scalar or per-coefficient array that broadcasts with estimate and se. | required |
alternative | str | “two-sided” (default), “greater”, or “less”. - “greater”: lower bound only, upper = +inf - “less”: upper bound only, lower = -inf | ‘two-sided’ |
Returns:
| Type | Description |
|---|---|
tuple[ndarray, ndarray] | Tuple of (ci_lower, ci_upper) arrays with confidence bounds. |
Examples:
>>> estimate = np.array([1.5, 2.0])
>>> se = np.array([0.2, 0.3])
>>> critical = 1.96
>>> ci_lower, ci_upper = compute_ci(estimate, se, critical)
compute_coefficient_inference¶
compute_coefficient_inference(coef: np.ndarray, vcov: np.ndarray, df: float | np.ndarray | None, conf_level: float = 0.95, null: float = 0.0, alternative: str = 'two-sided') -> InferenceResult
Compute inference statistics for regression coefficients.
Orchestrates computation of standard errors, test statistics, p-values, and confidence intervals for coefficient estimates.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
coef | ndarray | Coefficient estimates, shape (p,). | required |
vcov | ndarray | Variance-covariance matrix, shape (p, p). | required |
df | float | ndarray | None | Degrees of freedom for t-distribution. Scalar (same df for all coefficients), array of shape (p,) for per-coefficient df, or None for z-distribution. | required |
conf_level | float | Confidence level for intervals (default 0.95). | 0.95 |
null | float | Null hypothesis value (default 0.0). Statistic = (coef - null) / se. | 0.0 |
alternative | str | Direction of the alternative hypothesis. - “two-sided”: H₁: β ≠ null (default) - “greater”: H₁: β > null - “less”: H₁: β < null | ‘two-sided’ |
Returns:
| Type | Description |
|---|---|
InferenceResult | InferenceResult with se, statistic, p_values, ci_lower, ci_upper, df_values. |
Examples:
>>> coef = np.array([1.5, 2.0])
>>> vcov = np.array([[0.04, 0.01], [0.01, 0.09]])
>>> result = compute_coefficient_inference(coef, vcov, df=20)
>>> result.se.shape
(2,)
compute_se_from_vcov¶
compute_se_from_vcov(vcov: np.ndarray) -> np.ndarray
Compute standard errors from variance-covariance matrix.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
vcov | ndarray | Variance-covariance matrix of parameter estimates, shape (p, p). | required |
Returns:
| Type | Description |
|---|---|
ndarray | Standard errors for each parameter, shape (p,). |
Examples:
>>> vcov = np.array([[0.04, 0.01], [0.01, 0.09]])
>>> se = compute_se_from_vcov(vcov)
>>> np.allclose(se, [0.2, 0.3])
True
compute_t_critical¶
compute_t_critical(conf_int: float, df: float | np.ndarray, alternative: str = 'two-sided') -> float | np.ndarray
Compute t-distribution critical value for confidence interval.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
conf_int | float | Confidence level (0 < conf_int < 1). | required |
df | float | ndarray | Degrees of freedom (scalar or per-coefficient array). | required |
alternative | str | “two-sided” (default), “greater”, or “less”. | ‘two-sided’ |
Returns:
| Type | Description |
|---|---|
float | ndarray | Critical value(s) from t-distribution. Scalar if df is scalar, array if df is array. |
Examples:
>>> crit = compute_t_critical(0.95, df=10)
>>> abs(crit - 2.228) < 0.01
True
compute_z_critical¶
compute_z_critical(conf_int: float, alternative: str = 'two-sided') -> float
Compute z-distribution critical value for confidence interval.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
conf_int | float | Confidence level (0 < conf_int < 1). | required |
alternative | str | “two-sided” (default), “greater”, or “less”. | ‘two-sided’ |
Returns:
| Type | Description |
|---|---|
float | Critical value from standard normal distribution. |
Examples:
>>> crit = compute_z_critical(0.95)
>>> abs(crit - 1.96) < 0.01
True
delta_method_se¶
delta_method_se(X_pred: np.ndarray, vcov: np.ndarray) -> np.ndarray
Compute standard errors for predictions via delta method.
For predictions Xβ, the variance is: Var(Xβ) = X Var(β) X^T
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
X_pred | ndarray | Design matrix for predictions, shape (n, p). | required |
vcov | ndarray | Variance-covariance matrix of parameter estimates, shape (p, p). | required |
Returns:
| Type | Description |
|---|---|
ndarray | Standard errors for each prediction, shape (n,). |
Note: Uses efficient computation: SE_i = sqrt(sum((X_pred @ vcov) * X_pred, axis=1)) which avoids forming the full quadratic form.
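The efficient form in the note can be written as a one-liner with `einsum`; the sketch below (hypothetical helper name) checks it against the naive quadratic form:

```python
import numpy as np

def delta_method_se_sketch(X_pred, vcov):
    # row-wise quadratic forms x_i' V x_i without materializing X V X'
    return np.sqrt(np.einsum("ij,jk,ik->i", X_pred, vcov, X_pred))
```

For n predictions this costs O(n p^2) instead of the O(n^2 p) needed to form the full n-by-n matrix `X_pred @ vcov @ X_pred.T`.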
Examples:
>>> X_new = np.array([[1, 2], [1, 3]])
>>> vcov = np.array([[0.1, 0.01], [0.01, 0.05]])
>>> se = delta_method_se(X_new, vcov)
parse_conf_int¶
parse_conf_int(conf_int: float | int | str) -> float
Parse flexible confidence interval input to float.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
conf_int | float | int | str | Confidence interval specification. Accepts: float in (0, 1): treated as confidence level, int in (1, 99]: converted to proportion (e.g., 95 -> 0.95), str: parsed as number with optional ‘%’ (e.g., “95%”, “95”). | required |
Returns:
| Type | Description |
|---|---|
float | Confidence level as proportion in (0, 1). |
Examples:
>>> parse_conf_int(0.95)
0.95
>>> parse_conf_int(95)
0.95
>>> parse_conf_int("95%")
0.95
hypothesis¶
Statistical hypothesis tests (F, chi-square, t) for linear contrasts.
Classes:
| Name | Description |
|---|---|
Chi2TestResult | Result container for chi-square test. |
FTestResult | Result container for F-test. |
TTestResult | Result container for t-test. |
Functions:
| Name | Description |
|---|---|
adjust_pvalues | Adjust p-values for multiple comparisons. |
compute_chi2_test | Compute Wald chi-square test for L @ β = 0. |
compute_contrast_variance | Compute variance-covariance of linear contrasts L @ β. |
compute_f_pvalue | Compute p-value from F-statistic. |
compute_f_test | Compute F-test for linear hypothesis L @ β = 0. |
compute_pvalue | Compute p-values from test statistics. |
compute_t_test | Compute t-test for a single contrast L @ β = 0. |
compute_wald_statistic | Compute Wald statistic for testing L @ β = 0. |
format_pvalue_with_stars | Format p-value with R-style significance codes. |
Classes¶
Chi2TestResult¶
Chi2TestResult(num_df: int, chi2: float, p_value: float) -> None
Result container for chi-square test.
Attributes:
| Name | Type | Description |
|---|---|---|
num_df | int | Degrees of freedom (q, number of contrasts). |
chi2 | float | Wald chi-square statistic. |
p_value | float | P-value from chi-square distribution. |
Attributes¶
chi2¶
chi2: float
num_df¶
num_df: int
p_value¶
p_value: float
FTestResult¶
FTestResult(num_df: int, den_df: float, F_value: float, p_value: float) -> None
Result container for F-test.
Attributes:
| Name | Type | Description |
|---|---|---|
num_df | int | Numerator degrees of freedom (q, number of contrasts). |
den_df | float | Denominator degrees of freedom (residual df). |
F_value | float | F-statistic value. |
p_value | float | P-value from F-distribution. |
Attributes¶
F_value¶
F_value: float
den_df¶
den_df: float
num_df¶
num_df: int
p_value¶
p_value: float
TTestResult¶
TTestResult(estimate: float, se: float, t_value: float, df: float, p_value: float) -> None
Result container for t-test.
Attributes:
| Name | Type | Description |
|---|---|---|
estimate | float | Contrast estimate. |
se | float | Standard error. |
t_value | float | t-statistic. |
df | float | Degrees of freedom. |
p_value | float | Two-tailed p-value. |
Attributes¶
df¶
df: float
estimate¶
estimate: float
p_value¶
p_value: float
se¶
se: float
t_value¶
t_value: float
Functions¶
adjust_pvalues¶
adjust_pvalues(pvalues: np.ndarray, method: str = 'none') -> np.ndarray
Adjust p-values for multiple comparisons.
Implements standard p-value adjustment methods matching R’s p.adjust().
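Two of these methods are simple enough to sketch directly; the helper below (hypothetical name, not bossanova's implementation) shows Bonferroni and the Holm step-down, matching R's `p.adjust()`:

```python
import numpy as np

def adjust_pvalues_sketch(p, method="holm"):
    """Illustrative Bonferroni and Holm adjustments."""
    p = np.asarray(p, dtype=float)
    n = p.size
    if method == "bonferroni":
        return np.clip(p * n, 0.0, 1.0)
    if method == "holm":
        order = np.argsort(p)
        adj = p[order] * (n - np.arange(n))   # step-down multipliers n, n-1, ..., 1
        adj = np.maximum.accumulate(adj)      # enforce monotonicity
        out = np.empty(n)
        out[order] = adj
        return np.clip(out, 0.0, 1.0)
    return p
```

The step-up methods (Hochberg, BH, BY) follow the same sort-multiply-monotonize pattern, but accumulate minima from the largest p-value downward.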
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
pvalues | ndarray | Raw p-values, shape (n,). | required |
method | str | Adjustment method. One of: - “none”: No adjustment (return raw p-values) - “bonferroni”: Bonferroni correction (p * n) - “holm”: Holm step-down (default in many R packages) - “hochberg”: Hochberg step-up - “bh” or “fdr”: Benjamini-Hochberg FDR control - “by”: Benjamini-Yekutieli FDR (conservative, handles dependence) | ‘none’ |
Returns:
| Type | Description |
|---|---|
ndarray | Adjusted p-values, clipped to [0, 1]. |
Examples:
>>> p = np.array([0.01, 0.04, 0.03, 0.005])
>>> adjust_pvalues(p, method="bonferroni")
array([0.04, 0.16, 0.12, 0.02])
>>> adjust_pvalues(p, method="holm")
array([0.03, 0.06, 0.06, 0.02])
compute_chi2_test¶
compute_chi2_test(L: np.ndarray, coef: np.ndarray, vcov: np.ndarray) -> Chi2TestResult
Compute Wald chi-square test for L @ β = 0.
Used for GLMs and GLMMs where we don’t have residual df. The Wald statistic is asymptotically chi-squared distributed.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
L | ndarray | Contrast matrix of shape (q, p). | required |
coef | ndarray | Coefficient estimates of shape (p,). | required |
vcov | ndarray | Variance-covariance matrix of shape (p, p). | required |
Returns:
| Type | Description |
|---|---|
Chi2TestResult | Chi2TestResult containing num_df, chi2, p_value. |
Examples:
>>> L = np.array([[-1, 1, 0], [-1, 0, 1]])
>>> coef = np.array([0.5, 1.2, 0.8])
>>> vcov = np.diag([0.1, 0.15, 0.12])
>>> result = compute_chi2_test(L, coef, vcov)
>>> result.num_df
2
compute_contrast_variance¶
compute_contrast_variance(L: np.ndarray, vcov: np.ndarray) -> np.ndarray
Compute variance-covariance of linear contrasts L @ β.
For a contrast matrix L and coefficient vcov V, computes: Var(L @ β) = L @ V @ L’
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
L | ndarray | Contrast matrix of shape (q, p). | required |
vcov | ndarray | Variance-covariance matrix of shape (p, p). | required |
Returns:
| Type | Description |
|---|---|
ndarray | Variance-covariance of contrasts, shape (q, q). |
Examples:
>>> L = np.array([[-1, 1, 0], [-1, 0, 1]])
>>> vcov = np.diag([1.0, 1.5, 2.0])
>>> var_L = compute_contrast_variance(L, vcov)
>>> var_L.shape
(2, 2)
compute_f_pvalue¶
compute_f_pvalue(f_stat: float, df1: int, df2: float) -> float
Compute p-value from F-statistic.
Uses the upper tail of the F-distribution: P(F > f_stat | df1, df2).
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
f_stat | float | F-statistic value. | required |
df1 | int | Numerator degrees of freedom (number of restrictions). | required |
df2 | float | Denominator degrees of freedom (residual df). | required |
Returns:
| Type | Description |
|---|---|
float | Upper-tail p-value from F-distribution. |
Examples:
>>> p = compute_f_pvalue(4.0, df1=2, df2=97)
>>> 0 < p < 0.05
True
>>> compute_f_pvalue(0.0, df1=1, df2=100)
1.0
compute_f_test¶
compute_f_test(L: np.ndarray, coef: np.ndarray, vcov: np.ndarray, df_resid: float) -> FTestResult
Compute F-test for linear hypothesis L @ β = 0.
Uses the Wald statistic divided by the number of constraints to produce an F-statistic with appropriate degrees of freedom.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
L | ndarray | Contrast matrix of shape (q, p) where q is the number of constraints (contrasts) and p is the number of coefficients. | required |
coef | ndarray | Coefficient estimates of shape (p,). | required |
vcov | ndarray | Variance-covariance matrix of coefficients, shape (p, p). | required |
df_resid | float | Residual degrees of freedom (denominator df for F-test). | required |
Returns:
| Type | Description |
|---|---|
FTestResult | FTestResult containing num_df, den_df, F_value, p_value. |
Note: The Wald statistic is: W = (L @ β)’ @ (L @ V @ L’)⁻¹ @ (L @ β)
The F-statistic is F = W / q, distributed as F(q, df_resid).
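The note's two formulas translate directly into numpy/scipy; the sketch below (hypothetical helper name) is independent of bossanova:

```python
import numpy as np
from scipy import stats

def f_test_sketch(L, coef, vcov, df_resid):
    # W = (L b)' (L V L')^{-1} (L b);  F = W / q ~ F(q, df_resid)
    Lb = L @ coef
    W = float(Lb @ np.linalg.solve(L @ vcov @ L.T, Lb))
    q = L.shape[0]
    F = W / q
    return F, float(stats.f.sf(F, q, df_resid))   # upper-tail p-value
```

Using `solve` on the q-by-q contrast covariance avoids forming its explicit inverse.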
Examples:
>>> L = np.array([[-1, 1, 0], [-1, 0, 1]]) # 2 contrasts
>>> coef = np.array([10.0, 12.0, 15.0])
>>> vcov = np.diag([1.0, 1.5, 2.0])
>>> result = compute_f_test(L, coef, vcov, df_resid=97)
>>> result.num_df
2
compute_pvalue¶
compute_pvalue(statistic: np.ndarray, df: float | np.ndarray | None = None, alternative: str = 'two-sided') -> np.ndarray
Compute p-values from test statistics.
Uses t-distribution if df is provided, otherwise uses normal distribution. Supports two-sided, greater, and less alternatives.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
statistic | ndarray | Test statistics (t or z values). | required |
df | float | ndarray | None | Degrees of freedom for t-distribution (scalar or per-coefficient array). If None, uses normal distribution. | None |
alternative | str | Direction of the alternative hypothesis. - “two-sided”: H₁: β ≠ null (default) - “greater”: H₁: β > null (upper tail) - “less”: H₁: β < null (lower tail) | ‘two-sided’ |
Returns:
| Type | Description |
|---|---|
ndarray | P-values for the specified alternative. |
Examples:
>>> statistic = np.array([2.5, -1.8])
>>> p = compute_pvalue(statistic, df=20)
>>> all((p >= 0) & (p <= 1))
True
compute_t_test¶
compute_t_test(L: np.ndarray, coef: np.ndarray, vcov: np.ndarray, df: float) -> TTestResult
Compute t-test for a single contrast L @ β = 0.
For a single contrast (q=1), returns t-statistic instead of F.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
L | ndarray | Contrast vector of shape (1, p) or (p,). | required |
coef | ndarray | Coefficient estimates of shape (p,). | required |
vcov | ndarray | Variance-covariance matrix of shape (p, p). | required |
df | float | Degrees of freedom for t-distribution. | required |
Returns:
| Type | Description |
|---|---|
TTestResult | TTestResult containing estimate, se, t_value, df, p_value. |
Examples:
>>> L = np.array([-1, 1, 0]) # Compare coef[1] to coef[0]
>>> coef = np.array([10.0, 12.0, 15.0])
>>> vcov = np.diag([1.0, 1.5, 2.0])
>>> result = compute_t_test(L, coef, vcov, df=97)
>>> result.estimate
2.0
compute_wald_statistic¶
compute_wald_statistic(contrast_values: np.ndarray, contrast_vcov: np.ndarray) -> float
Compute Wald statistic for testing L @ β = 0.
Computes: W = (L @ β)’ @ (L @ V @ L’)⁻¹ @ (L @ β)
Uses solve for efficiency, with pinv fallback for singular matrices.
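The solve-with-pinv-fallback pattern looks roughly like this (a sketch with a hypothetical name, not bossanova's exact code):

```python
import numpy as np

def wald_statistic_sketch(contrast_values, contrast_vcov):
    try:
        x = np.linalg.solve(contrast_vcov, contrast_values)   # fast path
    except np.linalg.LinAlgError:
        x = np.linalg.pinv(contrast_vcov) @ contrast_values   # singular fallback
    return float(contrast_values @ x)
```

The pseudo-inverse branch makes the statistic well-defined even when redundant contrasts make L @ V @ L' rank-deficient.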
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
contrast_values | ndarray | Contrast estimates L @ β, shape (q,). | required |
contrast_vcov | ndarray | Variance of contrasts L @ V @ L’, shape (q, q). | required |
Returns:
| Type | Description |
|---|---|
float | Wald statistic value. |
Examples:
>>> contrast_values = np.array([2.0, 5.0])
>>> contrast_vcov = np.array([[2.5, 0.5], [0.5, 3.0]])
>>> W = compute_wald_statistic(contrast_values, contrast_vcov)
>>> W > 0
True
format_pvalue_with_stars¶
format_pvalue_with_stars(p_val: float) -> str
Format p-value with R-style significance codes.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
p_val | float | The p-value to format. | required |
Returns:
| Type | Description |
|---|---|
str | Formatted string with p-value and significance stars (fixed 12-char width). |
str | Uses R’s significance coding: '***' p < 0.001, '**' p < 0.01, '*' p < 0.05, '.' p < 0.1. |
Examples:
>>> format_pvalue_with_stars(0.0001)
'< 0.001 *** '
>>> format_pvalue_with_stars(0.023)
' 0.023 * '
information_criteria¶
Deviance, AIC, BIC, and Akaike weights.
Functions:
| Name | Description |
|---|---|
compute_aic | Compute Akaike Information Criterion. |
compute_akaike_weights | Compute Akaike weights from information criterion values. |
compute_bic | Compute Bayesian Information Criterion. |
compute_deviance | Compute deviance from log-likelihood. |
Functions¶
compute_aic¶
compute_aic(loglik: float, k: int) -> float
Compute Akaike Information Criterion.
AIC = -2 * loglik + 2 * k. Trades off model fit against complexity, penalizing by 2 per parameter.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
loglik | float | Log-likelihood value. | required |
k | int | Number of estimated parameters (including intercept and any scale/variance parameters). | required |
Returns:
| Type | Description |
|---|---|
float | AIC value (lower is better). |
Examples:
>>> compute_aic(-100.0, k=3)
206.0
>>> compute_aic(-50.0, k=5)
110.0
compute_akaike_weights¶
compute_akaike_weights(ic_values: np.ndarray) -> np.ndarray
Compute Akaike weights from information criterion values.
Weights quantify relative probability that each model is the best model in the set. Works with AIC or BIC values.
w_i = exp(-0.5 * delta_i) / sum(exp(-0.5 * delta_j))
where delta_i = IC_i - min(IC).
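Because the deltas are computed relative to min(IC), the exponentials stay bounded even for large IC values; a minimal sketch (hypothetical helper name):

```python
import numpy as np

def akaike_weights_sketch(ic_values):
    delta = ic_values - ic_values.min()   # best model has delta = 0
    w = np.exp(-0.5 * delta)              # bounded in (0, 1]: no overflow
    return w / w.sum()
```

Note the shift changes nothing mathematically (it cancels in the ratio) but is what makes the computation numerically safe.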
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
ic_values | ndarray | Array of information criterion values (AIC or BIC) for each model, shape (m,) where m is the number of models. | required |
Returns:
| Type | Description |
|---|---|
ndarray | Array of weights summing to 1.0, shape (m,). |
Examples:
>>> weights = compute_akaike_weights(np.array([100.0, 102.0, 110.0]))
>>> np.isclose(weights.sum(), 1.0)
True
>>> weights[0] > weights[1] > weights[2]
True
compute_bic¶
compute_bic(loglik: float, k: int, n: int) -> float
Compute Bayesian Information Criterion.
BIC = -2 * loglik + k * log(n). Penalizes complexity more heavily than AIC for n >= 8, preferring simpler models.
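The n >= 8 threshold follows from comparing the penalties: k * log(n) exceeds 2k exactly when log(n) > 2, i.e. n > e² ≈ 7.39:

```python
import math

# BIC penalty k*log(n) > AIC penalty 2k  <=>  log(n) > 2  <=>  n > e^2 ~ 7.39,
# so n >= 8 is the first integer sample size where BIC penalizes harder
assert math.log(7) < 2 < math.log(8)
```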
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
loglik | float | Log-likelihood value. | required |
k | int | Number of estimated parameters. | required |
n | int | Number of observations. | required |
Returns:
| Type | Description |
|---|---|
float | BIC value (lower is better). |
Examples:
>>> compute_bic(-100.0, k=3, n=100)
213.81...
>>> compute_bic(-100.0, k=3, n=50)
211.73...
compute_deviance¶
compute_deviance(loglik: float) -> float
Compute deviance from log-likelihood.
Deviance = -2 * loglik. Measures lack of fit relative to a saturated model.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
loglik | float | Log-likelihood value. | required |
Returns:
| Type | Description |
|---|---|
float | Deviance value (always >= 0 for valid log-likelihoods). |
Examples:
>>> compute_deviance(-100.0)
200.0
>>> compute_deviance(0.0)
0.0
multiplicity¶
Multiplicity adjustment for contrast confidence intervals.
Provides critical value computation for multiplicity-adjusted CIs, matching R’s emmeans defaults:
Tukey HSD for pairwise contrasts (studentized range).
Multivariate-t (MVT) for sequential, treatment, sum, helmert contrasts (Genz-Bretz CDF + root-finding).
No adjustment for polynomial contrasts.
compute_tukey_critical: Tukey HSD critical value via studentized range. compute_mvt_critical: MVT critical value via CDF inversion.
Functions:
| Name | Description |
|---|---|
compute_mvt_critical | Compute multivariate-t critical value for simultaneous inference. |
compute_tukey_critical | Compute Tukey HSD critical value for pairwise comparisons. |
Functions¶
compute_mvt_critical¶
compute_mvt_critical(conf_level: float, corr: np.ndarray, df: float, *, tol: float = 1e-06) -> float
Compute multivariate-t critical value for simultaneous inference.
Finds c such that P(-c < T_i < c for all i) = conf_level where
T ~ MVT(df, corr). Uses scipy’s Genz-Bretz CDF integration with
Brent root-finding, matching R’s mvtnorm::qmvt().
For df = inf (asymptotic), uses multivariate normal instead.
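The defining property can be cross-checked without Genz-Bretz integration: c is simply the conf_level quantile of max_i |T_i|. A Monte Carlo sketch (hypothetical helper, not the CDF-inversion implementation described above) samples T as a scale mixture of normals:

```python
import numpy as np

def mvt_critical_mc(conf_level, corr, df, n_draws=200_000, seed=0):
    """Monte Carlo check: conf_level quantile of max_i |T_i|, T ~ MVT(df, corr)."""
    rng = np.random.default_rng(seed)
    z = rng.multivariate_normal(np.zeros(corr.shape[0]), corr, size=n_draws)
    if np.isfinite(df):
        w = rng.chisquare(df, size=n_draws) / df   # chi-square mixing variable
        z = z / np.sqrt(w)[:, None]                # t = z / sqrt(chi2_df / df)
    return float(np.quantile(np.abs(z).max(axis=1), conf_level))
```

This is far less accurate than CDF inversion but useful for validating the deterministic root-finding result.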
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
conf_level | float | Confidence level (e.g. 0.95). | required |
corr | ndarray | Correlation matrix of contrast estimates, shape (m, m). | required |
df | float | Degrees of freedom. Use np.inf for asymptotic. | required |
tol | float | Root-finding tolerance (default 1e-6). | 1e-06 |
Returns:
| Type | Description |
|---|---|
float | Critical value for MVT-adjusted confidence intervals. |
Examples:
>>> corr = np.array([[1.0, 0.5], [0.5, 1.0]])
>>> crit = compute_mvt_critical(0.95, corr, df=50)
>>> crit > 1.96 # wider than unadjusted
True
compute_tukey_critical¶
compute_tukey_critical(conf_level: float, k: int, df: float) -> float
Compute Tukey HSD critical value for pairwise comparisons.
Uses the studentized range distribution to compute the critical value
for simultaneous pairwise comparisons of k group means. Matches
R’s qtukey(conf_level, k, df) / sqrt(2).
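The qtukey/sqrt(2) relation can be reproduced with scipy's studentized range distribution (available since SciPy 1.7); the helper name below is hypothetical:

```python
import numpy as np
from scipy.stats import studentized_range

def tukey_critical_sketch(conf_level, k, df):
    # equivalent of R's qtukey(conf_level, k, df) / sqrt(2)
    return studentized_range.ppf(conf_level, k, df) / np.sqrt(2.0)
```

The sqrt(2) rescales the studentized range quantile onto the scale of a pairwise difference's standard error.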
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
conf_level | float | Confidence level (e.g. 0.95). | required |
k | int | Number of group means being compared (family size). | required |
df | float | Degrees of freedom. Use np.inf for asymptotic (z-based). | required |
Returns:
| Type | Description |
|---|---|
float | Critical value for Tukey-adjusted confidence intervals. |
Examples:
>>> crit = compute_tukey_critical(0.95, k=3, df=28)
>>> abs(crit - 2.474) < 0.001
True
profile¶
Profile likelihood for variance components in mixed-effects models.
This module implements the profile likelihood algorithm for computing confidence intervals for variance components (theta parameters and sigma) in linear mixed-effects models.
The algorithm follows MixedModels.jl and lme4:
For each theta parameter, step bidirectionally from the MLE
At each step, fix that theta and re-optimize all other thetas
Sigma is derived analytically at each step (profile concentrated likelihood)
Compute the signed likelihood root: zeta = sign * sqrt(dev - dev_opt)
Fit B-splines for interpolation
Extract CIs where |zeta| = sqrt(chi2(conf_level, df=1))
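The last two steps hinge on the signed likelihood root and its chi-square cutoff; a minimal sketch of just those quantities:

```python
import numpy as np
from scipy import stats

def zeta(dev, dev_opt, direction):
    """Signed likelihood root at a profiled point (direction is +1 or -1)."""
    return direction * np.sqrt(max(dev - dev_opt, 0.0))

# CI bounds sit where |zeta| hits sqrt(chi2(conf_level, df=1));
# for 95% this is the familiar 1.96, since sqrt of the chi-square(1)
# quantile is the standard normal quantile
cutoff_95 = float(np.sqrt(stats.chi2.ppf(0.95, df=1)))
```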
Bates et al. (2015). Fitting Linear Mixed-Effects Models Using lme4.
MixedModels.jl: https://github.com/JuliaStats/MixedModels.jl
Classes:
| Name | Description |
|---|---|
MixedModelProtocol | Protocol for mixed-effects models compatible with profile likelihood. |
Functions:
| Name | Description |
|---|---|
compute_sigma_se_wald | Compute Wald standard error for sigma. |
convert_theta_ci_to_sd | Convert theta-scale CIs to SD-scale CIs. |
extract_ci_bound | Extract CI bound by finding where spline equals target zeta. |
profile_likelihood | Compute profile likelihood confidence intervals for variance components. |
profile_theta_parameter | Profile a single theta parameter bidirectionally from its MLE. |
Classes¶
MixedModelProtocol¶
Bases: Protocol
Protocol for mixed-effects models compatible with profile likelihood.
This protocol defines the minimal interface required for profile_likelihood(), supporting both legacy lmer/glmer classes and the new unified model() class.
The protocol is runtime checkable, allowing isinstance() checks if needed, but the function primarily uses duck typing for flexibility.
Functions:
| Name | Description |
|---|---|
get_theta_lower_bounds | Get lower bounds for theta parameters. |
Functions¶
get_theta_lower_bounds¶
get_theta_lower_bounds(n_theta: int) -> list[float]
Get lower bounds for theta parameters.
Functions¶
compute_sigma_se_wald¶
compute_sigma_se_wald(model: MixedModelProtocol) -> float
Compute Wald standard error for sigma.
Uses the delta method approximation from the Hessian.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
model | MixedModelProtocol | Fitted mixed-effects model with Satterthwaite inference. | required |
Returns:
| Type | Description |
|---|---|
float | Standard error of sigma. |
convert_theta_ci_to_sd¶
convert_theta_ci_to_sd(ci_theta: dict[str, tuple[float, float]], theta_opt: np.ndarray, sigma_opt: float, group_names: list[str], random_names: list[str] | dict[str, list[str]], re_structure: str | list[str] | dict[str, str]) -> tuple[dict[str, tuple[float, float]], np.ndarray, np.ndarray]
Convert theta-scale CIs to SD-scale CIs.
For variance components, SD = sigma * |theta| for simple cases. For slope models with Cholesky factors, we use the diagonal elements of sigma^2 * L @ L.T.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
ci_theta | dict[str, tuple[float, float]] | CIs on theta scale, keyed by parameter name. | required |
theta_opt | ndarray | Optimal theta values. | required |
sigma_opt | float | Optimal sigma value. | required |
group_names | list[str] | Grouping factor names. | required |
random_names | list[str] | dict[str, list[str]] | Random effect names. | required |
re_structure | str | list[str] | dict[str, str] | RE structure per group. | required |
Returns:
| Type | Description |
|---|---|
tuple[dict[str, tuple[float, float]], ndarray, ndarray] | Tuple of (ci_sd dict, ci_lower_sd array, ci_upper_sd array). Arrays are ordered to match result_varying rows. |
extract_ci_bound¶
extract_ci_bound(spline: UnivariateSpline, zeta_target: float, lower_bound: float, upper_guess: float) -> float
Extract CI bound by finding where spline equals target zeta.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
spline | UnivariateSpline | Reverse spline (zeta -> param). | required |
zeta_target | float | Target zeta value (negative for lower, positive for upper). | required |
lower_bound | float | Parameter lower bound. | required |
upper_guess | float | Upper search bound guess. | required |
Returns:
| Type | Description |
|---|---|
float | Parameter value where zeta = zeta_target. |
profile_likelihood¶
profile_likelihood(model: MixedModelProtocol, conf_level: float = 0.95, threshold: float = 4.0, n_steps: int = 15, verbose: bool = False) -> dict
Compute profile likelihood confidence intervals for variance components.
Profiles theta parameters (relative Cholesky factors) to compute profile likelihood CIs that are more accurate than Wald CIs, especially when variance components are near the boundary (zero).
Sigma (residual SD) is derived analytically at each profile step. Sigma CI is computed via the Wald approximation.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
model | MixedModelProtocol | Fitted mixed-effects model (lmer, glmer, or unified model). Must have attributes: _theta, _sigma, _deviance, _deviance_fn, _group_names, _random_names, and methods get_theta_lower_bounds(), _get_re_structure(). | required |
conf_level | float | Confidence level for CIs (default 0.95). | 0.95 |
threshold | float | Maximum \|zeta\| to profile out to before stopping (default 4.0). | 4.0 |
n_steps | int | Number of steps per direction from MLE (default 15). | 15 |
verbose | bool | Print progress messages. | False |
Returns:
| Type | Description |
|---|---|
dict | Dict with keys matching ProfileState fields: table, spline_forward, spline_reverse, ci_theta, ci_sd, ci_lower_sd, ci_upper_sd, conf_level, dev_opt, threshold. The caller constructs ProfileState to respect the maths-never-imports-containers boundary. |
Notes:
Profile CIs respect parameter bounds (variances >= 0)
Computational cost: ~2 * n_steps * n_theta deviance evaluations
For large models, consider using Wald CIs (cheaper) or bootstrap
Examples:
>>> # Legacy API
>>> m = lmer("y ~ x + (1|group)", data).fit()
>>> prof = profile_likelihood(m)
>>> prof.ci_sd # CIs on SD scale
{'group:Intercept': (15.21, 40.23), 'Residual': (22.83, 28.69)}
>>> # New unified API
>>> m = model("y ~ x + (1|group)", data).fit()
>>> prof = profile_likelihood(m)
profile_theta_parameter¶
profile_theta_parameter(param_idx: int, param_name: str, param_opt: float, theta_opt: np.ndarray, dev_opt: float, deviance_fn: Callable[[np.ndarray], float], lower_bounds: np.ndarray, n_steps: int, threshold: float, verbose: bool) -> dict
Profile a single theta parameter bidirectionally from its MLE.
At each step, fixes the focal theta and re-optimizes all other theta elements to find the conditional MLE deviance.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
param_idx | int | Index of parameter in theta vector. | required |
param_name | str | Human-readable name. | required |
param_opt | float | Optimal value of this parameter. | required |
theta_opt | ndarray | Full optimal theta vector. | required |
dev_opt | float | Optimal deviance. | required |
deviance_fn | Callable[[ndarray], float] | Function theta -> deviance (model’s internal fn). | required |
lower_bounds | ndarray | Lower bounds for ALL theta parameters. | required |
n_steps | int | Steps in each direction. | required |
threshold | float | Maximum \|zeta\| to profile out to. | required |
verbose | bool | Print progress. | required |
Returns:
| Type | Description |
|---|---|
dict | Dict with ‘rows’, ‘focal’, ‘zeta’ lists. |
sandwich¶
Sandwich covariance matrix estimators for robust standard errors.
Provides:
HC0-HC3: Heteroscedasticity-consistent estimators for OLS/GLM.
CR0-CR3: Cluster-robust estimators for mixed models, clustering by the random-effects grouping variable.
Functions:
| Name | Description |
|---|---|
compute_cr_vcov | Compute cluster-robust covariance matrix for Gaussian mixed models. |
compute_glm_cr_vcov | Compute cluster-robust covariance matrix for non-Gaussian mixed models. |
compute_glm_hc_vcov | Compute heteroscedasticity-consistent covariance matrix for GLM. |
compute_hc_vcov | Compute heteroscedasticity-consistent covariance matrix. |
Functions¶
compute_cr_vcov¶
compute_cr_vcov(X: NDArray[np.float64], residuals: NDArray[np.float64], cluster_ids: NDArray[np.intp], XtX_inv: NDArray[np.float64], cr_type: str = 'CR0') -> NDArray[np.float64]
Compute cluster-robust covariance matrix for Gaussian mixed models.
Implements the sandwich estimator clustered by random-effects grouping
variable: (X'X)^{-1} M (X'X)^{-1} where M sums outer products of
per-cluster score vectors.
Matches sandwich::vcovCL() in R.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
X | NDArray[float64] | Design matrix, shape (n, p). | required |
residuals | NDArray[float64] | Marginal residuals y - X @ beta, shape (n,). | required |
cluster_ids | NDArray[intp] | Integer cluster labels, shape (n,). Values in [0, G). | required |
XtX_inv | NDArray[float64] | (X'X)^{-1} from OLS, shape (p, p). | required |
cr_type | str | Type of CR estimator: - “CR0”: No small-sample adjustment. - “CR1”: G/(G-1) * (n-1)/(n-p) correction. - “CR2”: Bell-McCaffrey bias-reduced (square-root leverage). - “CR3”: Jackknife (inverse leverage, most conservative). | ‘CR0’ |
Returns:
| Type | Description |
|---|---|
NDArray[float64] | Cluster-robust covariance matrix, shape (p, p). |
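The CR0/CR1 portion of the estimator can be sketched compactly; the helper name is hypothetical and the leverage-adjusted CR2/CR3 variants are omitted:

```python
import numpy as np

def cr_vcov_sketch(X, residuals, cluster_ids, XtX_inv, cr_type="CR0"):
    """Clustered sandwich: (X'X)^{-1} M (X'X)^{-1}, M from per-cluster scores."""
    n, p = X.shape
    G = int(cluster_ids.max()) + 1
    scores = np.zeros((G, p))
    np.add.at(scores, cluster_ids, X * residuals[:, None])  # sum x_i * e_i per cluster
    meat = scores.T @ scores
    if cr_type == "CR1":
        meat *= G / (G - 1) * (n - 1) / (n - p)             # small-sample correction
    return XtX_inv @ meat @ XtX_inv
```

With singleton clusters (one observation each), CR0 collapses to the HC0 heteroscedasticity-consistent estimator.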
compute_glm_cr_vcov¶
compute_glm_cr_vcov(X: NDArray[np.float64], residuals: NDArray[np.float64], irls_weights: NDArray[np.float64], cluster_ids: NDArray[np.intp], XtWX_inv: NDArray[np.float64], cr_type: str = 'CR0') -> NDArray[np.float64]
Compute cluster-robust covariance matrix for non-Gaussian mixed models.
Implements the weighted sandwich estimator clustered by random-effects grouping variable. Only CR0 and CR1 are supported (CR2/CR3 require leverage adjustments that are not well-defined for GLM).
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
X | NDArray[float64] | Design matrix, shape (n, p). | required |
residuals | NDArray[float64] | Response residuals y - mu, shape (n,). | required |
irls_weights | NDArray[float64] | IRLS weights from GLM fit, shape (n,). | required |
cluster_ids | NDArray[intp] | Integer cluster labels, shape (n,). Values in [0, G). | required |
XtWX_inv | NDArray[float64] | (X'WX)^{-1} from GLM fit, shape (p, p). | required |
cr_type | str | Type of CR estimator: - “CR0”: No small-sample adjustment (default). - “CR1”: G/(G-1) * (n-1)/(n-p) correction. | ‘CR0’ |
Returns:
| Type | Description |
|---|---|
NDArray[float64] | Cluster-robust covariance matrix, shape (p, p). |
compute_glm_hc_vcov¶
compute_glm_hc_vcov(X: NDArray[np.float64], residuals: NDArray[np.float64], irls_weights: NDArray[np.float64], XtWX_inv: NDArray[np.float64], hc_type: str = 'HC0') -> NDArray[np.float64]
Compute heteroscedasticity-consistent covariance matrix for GLM.
Implements the weighted sandwich estimator: (X'WX)^{-1} X'W Omega W X (X'WX)^{-1}
where W are the IRLS weights and Omega accounts for variance misspecification.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
X | NDArray[float64] | Design matrix, shape (n, p). | required |
residuals | NDArray[float64] | Response residuals (y - mu), shape (n,). | required |
irls_weights | NDArray[float64] | IRLS weights from GLM fit, shape (n,). | required |
XtWX_inv | NDArray[float64] | (X’WX)^{-1} from GLM fit, shape (p, p). | required |
hc_type | str | Type of HC estimator: - “HC0”: No small-sample adjustment (default for GLM) - “HC1”: df correction | ‘HC0’ |
Returns:
| Type | Description |
|---|---|
NDArray[float64] | Heteroscedasticity-consistent covariance matrix, shape (p, p). |
compute_hc_vcov¶
compute_hc_vcov(X: NDArray[np.float64], residuals: NDArray[np.float64], XtX_inv: NDArray[np.float64], hc_type: str = 'HC3') -> NDArray[np.float64]
Compute heteroscedasticity-consistent covariance matrix.
Implements the sandwich estimator: (X’X)^{-1} X’ Omega X (X’X)^{-1} where Omega is a diagonal matrix of squared (possibly adjusted) residuals.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
X | NDArray[float64] | Design matrix, shape (n, p). | required |
residuals | NDArray[float64] | OLS residuals, shape (n,). | required |
XtX_inv | NDArray[float64] | (X’X)^{-1} from OLS, shape (p, p). | required |
hc_type | str | Type of HC estimator: - “HC0”: White’s original (no adjustment) - “HC1”: df correction - “HC2”: Leverage adjustment - “HC3”: Squared leverage (default, most conservative) | ‘HC3’ |
Returns:
| Type | Description |
|---|---|
NDArray[float64] | Heteroscedasticity-consistent covariance matrix, shape (p, p). |
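The four HC variants differ only in how the squared residuals in Omega are scaled; a sketch (hypothetical helper name) makes that explicit:

```python
import numpy as np

def hc_vcov_sketch(X, residuals, XtX_inv, hc_type="HC3"):
    """Sandwich (X'X)^{-1} X' Omega X (X'X)^{-1} with HC0-HC3 scalings."""
    n, p = X.shape
    h = np.einsum("ij,jk,ik->i", X, XtX_inv, X)  # leverage h_ii
    u2 = residuals**2
    if hc_type == "HC1":
        u2 = u2 * n / (n - p)                    # df correction
    elif hc_type == "HC2":
        u2 = u2 / (1.0 - h)                      # leverage adjustment
    elif hc_type == "HC3":
        u2 = u2 / (1.0 - h) ** 2                 # squared leverage
    meat = (X * u2[:, None]).T @ X               # X' Omega X
    return XtX_inv @ meat @ XtX_inv
```

Since 0 < h_ii < 1, HC3 inflates each squared residual relative to HC0, which is why it is the most conservative choice.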
satterthwaite¶
Satterthwaite degrees of freedom approximation for linear mixed models.
Functions:
| Name | Description |
|---|---|
compute_satterthwaite_df | Compute Satterthwaite degrees of freedom for each fixed effect. |
compute_satterthwaite_summary_table | Compute full coefficient table with Satterthwaite df and p-values. |
compute_satterthwaite_t_test | Compute t-statistics, p-values, and confidence intervals. |
satterthwaite_df_for_contrasts | Compute Satterthwaite degrees of freedom for arbitrary contrasts. |
Functions¶
compute_satterthwaite_df¶
compute_satterthwaite_df(vcov_beta: np.ndarray, jacobian_vcov: np.ndarray, hessian_deviance: np.ndarray, min_df: float = 1.0, max_df: float = 1000000.0, tol: float = 1e-08) -> np.ndarray
Compute Satterthwaite degrees of freedom for each fixed effect.
Implements the Satterthwaite approximation for denominator degrees of freedom:
df_j = 2 * (SE(β̂_j)²)² / Var(SE(β̂_j)²)
where Var(SE(β̂_j)²) is computed via the delta method, using the Jacobian of Vcov(beta) w.r.t. the variance parameters and the inverse Hessian of the deviance function.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
vcov_beta | ndarray | Variance-covariance matrix of fixed effects, shape (p, p). | required |
jacobian_vcov | ndarray | Jacobian of Vcov(beta) w.r.t. variance parameters, shape (p, p, k) where k = number of variance parameters. | required |
hessian_deviance | ndarray | Hessian of deviance w.r.t. variance parameters, shape (k, k). | required |
min_df | float | Minimum allowable df (default 1.0). | 1.0 |
max_df | float | Maximum allowable df (default 1e6). | 1000000.0 |
tol | float | Tolerance for determining positive eigenvalues in Hessian. | 1e-08 |
Returns:
| Type | Description |
|---|---|
ndarray | Array of degrees of freedom for each fixed effect, shape (p,). |
Notes:
Uses Moore-Penrose pseudo-inverse for the Hessian to handle negative or near-zero eigenvalues (convergence issues)
Warns when negative eigenvalues are detected
Clips df to [min_df, max_df] range for numerical stability
Returns max_df when variance of variance is near zero
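A minimal NumPy sketch of this computation, assuming vcov_varpar = 2 * H⁻¹ as stated elsewhere in this module (`satterthwaite_df_sketch` is a hypothetical helper; it omits the eigenvalue warnings and `tol` handling of the real function):

```python
import numpy as np

def satterthwaite_df_sketch(vcov_beta, jacobian_vcov, hessian_deviance,
                            min_df=1.0, max_df=1e6):
    """df_j = 2 * Var(beta_j)^2 / Var(Var(beta_j)) via the delta method."""
    p = vcov_beta.shape[0]
    v = np.diag(vcov_beta)                            # Var(beta_hat_j)
    g = jacobian_vcov[np.arange(p), np.arange(p), :]  # d v_j / d varpar, (p, k)
    A = 2.0 * np.linalg.pinv(hessian_deviance)        # vcov of variance params
    var_v = np.einsum("jk,kl,jl->j", g, A, g)         # delta-method Var(v_j)
    return np.clip(2.0 * v**2 / np.maximum(var_v, 1e-300), min_df, max_df)

# toy one-parameter example
df = satterthwaite_df_sketch(np.array([[2.0]]),   # Vcov(beta)
                             np.ones((1, 1, 1)),  # dVcov(beta)/dtheta
                             np.array([[1.0]]))   # Hessian of deviance
# df[0] == 2 * 2**2 / (1 * 2 * 1) == 4.0
```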
Examples:
# After fitting lmer model and computing Hessian/Jacobian
df = compute_satterthwaite_df(vcov_beta, jac, hess)
# df[0] is the denominator df for testing beta[0]
compute_satterthwaite_summary_table¶
compute_satterthwaite_summary_table(beta: np.ndarray, beta_names: list[str], vcov_beta: np.ndarray, vcov_varpar: np.ndarray, jac_list: list[np.ndarray]) -> dict[str, list]
Compute full coefficient table with Satterthwaite df and p-values.
Computes t-statistics, degrees of freedom, and p-values for all fixed effects using Satterthwaite’s approximation.
This is a convenience function that combines compute_satterthwaite_df()
and compute_satterthwaite_t_test() into a single summary table.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
beta | ndarray | Fixed effects estimates, shape (p,). | required |
beta_names | list[str] | Names of fixed effects. | required |
vcov_beta | ndarray | Variance-covariance matrix of beta, shape (p, p). | required |
vcov_varpar | ndarray | Variance-covariance matrix of variance parameters. | required |
jac_list | list[ndarray] | List of Jacobian matrices, each shape (p, p). | required |
Returns:
| Type | Description |
|---|---|
dict[str, list] | Dictionary with keys: |
dict[str, list] | - ‘Parameter’: List of parameter names |
dict[str, list] | - ‘Estimate’: Coefficient estimates |
dict[str, list] | - ‘Std. Error’: Standard errors |
dict[str, list] | - ‘df’: Satterthwaite degrees of freedom |
dict[str, list] | - ‘t value’: t-statistics |
dict[str, list] | - 'Pr(> |
Notes:
Computes one t-test per fixed effect
Uses standard t-distribution for p-values
All computations vectorized for efficiency
Examples:
# After fitting lmer and computing Hessian/Jacobian
table = compute_satterthwaite_summary_table(
beta, beta_names, vcov_beta, vcov_varpar, jac_list
)
import pandas as pd
df = pd.DataFrame(table)
print(df)
compute_satterthwaite_t_test¶
compute_satterthwaite_t_test(beta: np.ndarray, se: np.ndarray, df: np.ndarray, conf_level: float = 0.95) -> dict[str, np.ndarray]
Compute t-statistics, p-values, and confidence intervals.
Uses Satterthwaite degrees of freedom for each coefficient to compute t-test statistics and p-values.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
beta | ndarray | Coefficient estimates, shape (p,). | required |
se | ndarray | Standard errors, shape (p,). | required |
df | ndarray | Satterthwaite degrees of freedom for each coefficient, shape (p,). | required |
conf_level | float | Confidence level for intervals (default 0.95 for 95% CI). | 0.95 |
Returns:
| Type | Description |
|---|---|
dict[str, ndarray] | Dictionary with: |
dict[str, ndarray] | - ‘statistic’: t-statistics, shape (p,) |
dict[str, ndarray] | - ‘p_value’: Two-tailed p-values, shape (p,) |
dict[str, ndarray] | - ‘ci_lower’: Lower confidence bounds, shape (p,) |
dict[str, ndarray] | - ‘ci_upper’: Upper confidence bounds, shape (p,) |
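The returned quantities can be reproduced with a short scipy sketch (`satterthwaite_t_test_sketch` is a hypothetical helper, not bossanova's implementation):

```python
import numpy as np
from scipy import stats

def satterthwaite_t_test_sketch(beta, se, df, conf_level=0.95):
    """Per-coefficient t-tests using element-wise Satterthwaite df."""
    t_stat = beta / se
    p_value = 2.0 * stats.t.sf(np.abs(t_stat), df)   # two-tailed
    t_crit = stats.t.ppf(0.5 + conf_level / 2.0, df)
    return {"statistic": t_stat, "p_value": p_value,
            "ci_lower": beta - t_crit * se, "ci_upper": beta + t_crit * se}

res = satterthwaite_t_test_sketch(np.array([2.0]), np.array([1.0]), np.array([10.0]))
```

Note that `stats.t.sf` and `stats.t.ppf` broadcast over `df`, so each coefficient gets its own reference distribution.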
Examples:
# After computing df from compute_satterthwaite_df()
se = np.sqrt(np.diag(vcov_beta))
results = compute_satterthwaite_t_test(beta, se, df)
print(results['p_value'])  # Two-tailed p-values
satterthwaite_df_for_contrasts¶
satterthwaite_df_for_contrasts(L: np.ndarray, vcov_beta: np.ndarray, vcov_varpar: np.ndarray, jac_list: list[np.ndarray], min_df: float = 1.0, max_df: float = 1000000.0) -> np.ndarray
Compute Satterthwaite degrees of freedom for arbitrary contrasts.
This generalizes compute_satterthwaite_df() to handle arbitrary linear contrasts
L @ β, not just individual coefficients. This is needed for emmeans where
each EMM is a linear combination of coefficients.
The Satterthwaite formula for contrast L is:
df = 2 * var_contrast² / Var(var_contrast)
var_contrast = L @ Vcov(β) @ L.T
grad[i] = L @ Jac_i @ L.T   (gradient w.r.t. variance parameter i)
Var(var_contrast) = grad @ Vcov(varpar) @ grad
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
L | ndarray | Contrast matrix, shape (n_contrasts, n_coef). Each row is a contrast. For a single contrast, shape can be (n_coef,). | required |
vcov_beta | ndarray | Variance-covariance matrix of fixed effects, shape (p, p). | required |
vcov_varpar | ndarray | Variance-covariance matrix of variance parameters, shape (k, k). This is 2 * H^{-1} where H is the Hessian of deviance. | required |
jac_list | list[ndarray] | List of k Jacobian matrices, each shape (p, p). jac_list[i] = ∂Vcov(β)/∂varpar_i. | required |
min_df | float | Minimum allowable df (default 1.0). | 1.0 |
max_df | float | Maximum allowable df (default 1e6). | 1000000.0 |
Returns:
| Type | Description |
|---|---|
ndarray | Array of degrees of freedom for each contrast, shape (n_contrasts,). |
ndarray | For a single contrast (1D input), returns a scalar float. |
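A NumPy sketch of the contrast formula above (`contrast_df_sketch` is a hypothetical helper; unlike the documented function, it returns an array even for a single 1D contrast):

```python
import numpy as np

def contrast_df_sketch(L, vcov_beta, vcov_varpar, jac_list,
                       min_df=1.0, max_df=1e6):
    """Satterthwaite df for each row of a contrast matrix L."""
    L = np.atleast_2d(np.asarray(L, dtype=float))
    var_contrast = np.einsum("ij,jk,ik->i", L, vcov_beta, L)
    # grad[i, m] = L_i @ Jac_m @ L_i  (sensitivity to variance parameter m)
    grad = np.stack([np.einsum("ij,jk,ik->i", L, J, L) for J in jac_list], axis=1)
    var_of_var = np.einsum("im,mn,in->i", grad, vcov_varpar, grad)
    df = 2.0 * var_contrast**2 / np.maximum(var_of_var, 1e-300)
    return np.clip(df, min_df, max_df)

df = contrast_df_sketch(
    np.array([1.0]),        # single contrast on one coefficient
    np.array([[2.0]]),      # Vcov(beta)
    np.array([[2.0]]),      # Vcov(varpar) = 2 * H^{-1}
    [np.array([[1.0]])],    # dVcov(beta)/dtheta
)
# df[0] == 2 * 2**2 / 2 == 4.0
```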
Examples:
# Compute df for EMMs where X_ref is the prediction matrix
df = satterthwaite_df_for_contrasts(
X_ref, vcov_beta, vcov_varpar, jac_list
)
# df[i] is the denominator df for EMM i
wald_variance¶
Wald confidence intervals for variance components using delta method.
This module computes asymptotic (Wald) confidence intervals for variance components in mixed-effects models. It uses the delta method to transform the asymptotic covariance of variance parameters (theta, sigma) to the standard deviation scale.
The key insight is that the Hessian of the deviance w.r.t. variance parameters is already computed for Satterthwaite inference, and we can reuse it here: vcov_varpar = 2 * H^{-1}
For transformation from theta scale to SD scale, we use: SD[i] = sigma * sqrt(diag(L @ L.T)[i])
where L is the Cholesky factor reconstructed from theta.
Functions:
| Name | Description |
|---|---|
build_cholesky_with_derivs | Build Cholesky factor L and its derivatives w.r.t. theta. |
compute_sd_jacobian | Compute SDs and Jacobian of SDs w.r.t. varpar = [theta, sigma]. |
compute_wald_ci_varying | Compute Wald CIs for variance components on SD scale. |
Functions¶
build_cholesky_with_derivs¶
build_cholesky_with_derivs(theta_group: np.ndarray, n_re: int, structure: str) -> tuple[np.ndarray, list[np.ndarray]]
Build Cholesky factor L and its derivatives w.r.t. theta.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
theta_group | ndarray | Theta values for this group. | required |
n_re | int | Number of random effects. | required |
structure | str | “intercept”, “diagonal”, or “slope”. | required |
Returns:
| Type | Description |
|---|---|
ndarray | Tuple of (L, dL_dtheta) where: |
list[ndarray] | - L: Cholesky factor, shape (n_re, n_re) |
tuple[ndarray, list[ndarray]] | - dL_dtheta: List of derivative matrices, one per theta element |
compute_sd_jacobian¶
compute_sd_jacobian(theta: np.ndarray, sigma: float, group_names: list[str], random_names: list[str] | dict[str, list[str]], re_structure: str | list[str] | dict[str, str]) -> tuple[np.ndarray, np.ndarray]
Compute SDs and Jacobian of SDs w.r.t. varpar = [theta, sigma].
Sigma = sigma^2 * L @ L.T
SD[i] = sqrt(Sigma[i,i]) = sigma * sqrt((L @ L.T)[i,i])
The Jacobian has shape (n_sds, n_theta + 1) where the last column corresponds to sigma.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
theta | ndarray | Theta parameters (relative Cholesky factors). | required |
sigma | float | Residual standard deviation. | required |
group_names | list[str] | Grouping factor names. | required |
random_names | list[str] | dict[str, list[str]] | Random effect names per group. | required |
re_structure | str | list[str] | dict[str, str] | RE structure per group. | required |
Returns:
| Type | Description |
|---|---|
ndarray | Tuple of (sds, jacobian) where: |
ndarray | - sds: Array of SD values, length = n_variance_components + 1 |
tuple[ndarray, ndarray] | - jacobian: Shape (n_sds, n_theta + 1) |
compute_wald_ci_varying¶
compute_wald_ci_varying(theta: np.ndarray, sigma: float, vcov_varpar: np.ndarray, group_names: list[str], random_names: list[str] | dict[str, list[str]], re_structure: str | list[str] | dict[str, str], conf_level: float = 0.95) -> tuple[np.ndarray, np.ndarray]
Compute Wald CIs for variance components on SD scale.
Uses the delta method to transform asymptotic variance from varpar (theta, sigma) scale to the standard deviation scale.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
theta | ndarray | Optimized theta parameters (relative scale). | required |
sigma | float | Residual standard deviation. | required |
vcov_varpar | ndarray | Asymptotic covariance of [theta, sigma], shape (k, k) where k = len(theta) + 1. This is 2 * H^{-1} from Satterthwaite. | required |
group_names | list[str] | Names of grouping factors. | required |
random_names | list[str] | dict[str, list[str]] | Names of random effects per group. | required |
re_structure | str | list[str] | dict[str, str] | RE structure type(s) per group. | required |
conf_level | float | Confidence level (default 0.95). | 0.95 |
Returns:
| Type | Description |
|---|---|
ndarray | Tuple of (ci_lower, ci_upper) arrays on SD scale. |
ndarray | Length = number of variance components + 1 (for residual). |
Notes: The delta method approximation assumes:
Asymptotic normality of theta and sigma estimates
Sample size large enough for the approximation to hold
For variance components near zero (boundary), profile likelihood CIs are more accurate as they respect the parameter bounds.
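The delta-method step itself can be sketched generically (`wald_ci_sketch` is a hypothetical helper; the real function first derives the SDs and their Jacobian from theta and sigma via compute_sd_jacobian):

```python
import numpy as np
from scipy import stats

def wald_ci_sketch(sds, jacobian, vcov_varpar, conf_level=0.95):
    """Delta-method Wald CI on the SD scale: Var(SD) = J Vcov(varpar) J'."""
    var_sd = np.einsum("ik,kl,il->i", jacobian, vcov_varpar, jacobian)
    se_sd = np.sqrt(np.maximum(var_sd, 0.0))
    z = stats.norm.ppf(0.5 + conf_level / 2.0)
    return sds - z * se_sd, sds + z * se_sd

lower, upper = wald_ci_sketch(
    np.array([1.0]),     # SD estimate
    np.array([[1.0]]),   # Jacobian d(SD)/d(varpar)
    np.array([[0.04]]),  # vcov_varpar = 2 * H^{-1}
)
```

With an SE of 0.2 on the SD scale, this gives the usual ± 1.96 × SE interval around 1.0.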
welch¶
Welch-Satterthwaite degrees of freedom for unequal variance inference.
Classes:
| Name | Description |
|---|---|
CellInfo | Information about factor cells for Welch-style inference. |
Functions:
| Name | Description |
|---|---|
compute_cell_info | Compute cell-based variance information for Welch inference. |
compute_welch_satterthwaite_df_for_L | Vectorized Satterthwaite df for each row of a contrast matrix L. |
compute_welch_satterthwaite_df_for_contrast | Satterthwaite df for a single linear contrast L @ beta. |
compute_welch_satterthwaite_df_per_coef | Compute per-coefficient Welch-Satterthwaite degrees of freedom. |
compute_welch_vcov | Compute Welch sandwich variance-covariance matrix. |
extract_factors_from_formula | Extract factor (categorical) column names from a model formula. |
Classes¶
CellInfo¶
CellInfo(cell_labels: list[tuple[str, ...]], cell_variances: NDArray[np.float64], cell_counts: NDArray[np.int64], cell_indices: NDArray[np.int64], factor_columns: list[str]) -> None
Information about factor cells for Welch-style inference.
When using errors='unequal_var', residual variances are computed
within each cell defined by the crossing of all factors in the model.
This implements the Welch-James approach for factorial designs.
Attributes:
| Name | Type | Description |
|---|---|---|
cell_labels | list[tuple[str, ...]] | Labels for each cell as tuples of factor levels. For single factor: [(“A”,), (“B”,), ...] For multiple factors: [(“A”, “X”), (“A”, “Y”), (“B”, “X”), ...] |
cell_variances | NDArray[float64] | Sample variance of residuals within each cell. |
cell_counts | NDArray[int64] | Number of observations in each cell. |
cell_indices | NDArray[int64] | Cell membership for each observation (0-indexed). |
factor_columns | list[str] | Names of factor columns used to define cells. |
Attributes¶
cell_counts¶
cell_counts: NDArray[np.int64]
cell_indices¶
cell_indices: NDArray[np.int64]
cell_labels¶
cell_labels: list[tuple[str, ...]]
cell_variances¶
cell_variances: NDArray[np.float64]
factor_columns¶
factor_columns: list[str]
Functions¶
compute_cell_info¶
compute_cell_info(residuals: NDArray[np.float64], data: pl.DataFrame, factor_columns: list[str]) -> CellInfo
Compute cell-based variance information for Welch inference.
For multi-factor models, cells are defined by the crossing of all factors. This implements the Welch-James approach for factorial designs.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
residuals | NDArray[float64] | OLS residuals, shape (n,). | required |
data | DataFrame | DataFrame with factor columns. | required |
factor_columns | list[str] | List of factor column names. | required |
Returns:
| Type | Description |
|---|---|
CellInfo | CellInfo containing variance information for each cell. |
Examples:
import numpy as np
import polars as pl
df = pl.DataFrame({
"group": ["A", "A", "B", "B", "B"],
"x": [1.0, 2.0, 3.0, 4.0, 5.0],
})
residuals = np.array([0.1, -0.1, 0.2, -0.1, -0.1])
info = compute_cell_info(residuals, df, ["group"])
info.cell_labels
# [('A',), ('B',)]
info.cell_counts
# array([2, 3])
compute_welch_satterthwaite_df_for_L¶
compute_welch_satterthwaite_df_for_L(L: NDArray[np.float64], X: NDArray[np.float64], cell_info: CellInfo) -> NDArray[np.float64]
Vectorized Satterthwaite df for each row of a contrast matrix L.
Applies compute_welch_satterthwaite_df_for_contrast to each row of L
and returns per-row degrees of freedom.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
L | NDArray[float64] | Contrast matrix, shape (q, p). | required |
X | NDArray[float64] | Design matrix, shape (n, p). | required |
cell_info | CellInfo | Cell-based variance information from compute_cell_info(). | required |
Returns:
| Type | Description |
|---|---|
NDArray[float64] | Per-row degrees of freedom, shape (q,). |
compute_welch_satterthwaite_df_for_contrast¶
compute_welch_satterthwaite_df_for_contrast(L_row: NDArray[np.float64], X: NDArray[np.float64], cell_info: CellInfo) -> float
Satterthwaite df for a single linear contrast L @ beta.
Generalizes compute_welch_satterthwaite_df_per_coef from unit vectors to
arbitrary contrast vectors. For unit vector e_j, this recovers
compute_welch_satterthwaite_df_per_coef[j]. For a two-group A-B contrast,
this gives the classic Welch t-test df.
Algorithm::
W = X @ (X'X)^{-1} # (n, p) projection weights
For each cell k:
LW_k = W[cell_k, :] @ L_row # scalar contributions
v_k = s_k^2 * sum(LW_k^2)
df = (sum_k v_k)^2 / sum_k [v_k^2 / (n_k - 1)]
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
L_row | NDArray[float64] | Contrast vector, shape (p,). | required |
X | NDArray[float64] | Design matrix, shape (n, p). | required |
cell_info | CellInfo | Cell-based variance information from compute_cell_info(). | required |
Returns:
| Type | Description |
|---|---|
float | Satterthwaite degrees of freedom (scalar). |
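A NumPy sketch of the algorithm above (`welch_contrast_df_sketch` is a hypothetical helper taking the cell arrays directly rather than a CellInfo):

```python
import numpy as np

def welch_contrast_df_sketch(L_row, X, cell_variances, cell_counts, cell_indices):
    """Per-contrast Welch-Satterthwaite df from per-cell variance contributions."""
    W = X @ np.linalg.inv(X.T @ X)  # (n, p) projection weights
    lw = W @ L_row                  # per-observation contrast contributions
    v = np.array([var_k * np.sum(lw[cell_indices == k] ** 2)
                  for k, var_k in enumerate(cell_variances)])
    return v.sum() ** 2 / np.sum(v**2 / (cell_counts - 1))

# two-group check: the slope contrast reproduces the classic Welch df
X = np.column_stack([np.ones(10), np.r_[np.zeros(4), np.ones(6)]])
idx = np.r_[np.zeros(4, int), np.ones(6, int)]
s2 = np.array([1.0, 4.0])   # per-cell residual variances
n = np.array([4, 6])
df = welch_contrast_df_sketch(np.array([0.0, 1.0]), X, s2, n, idx)
```

For this treatment-coded design the per-cell contributions reduce to v_k = s_k²/n_k, so `df` equals the textbook Welch formula.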
Examples:
For a unit vector e_j::
L_row = np.zeros(p)
L_row[j] = 1.0
df_j = compute_welch_satterthwaite_df_for_contrast(L_row, X, cell_info)
# Matches compute_welch_satterthwaite_df_per_coef(X, cell_info)[j]
compute_welch_satterthwaite_df_per_coef¶
compute_welch_satterthwaite_df_per_coef(X: NDArray[np.float64], cell_info: CellInfo) -> NDArray[np.float64]
Compute per-coefficient Welch-Satterthwaite degrees of freedom.
For each coefficient j, decomposes its sandwich variance into per-cell contributions and applies the Satterthwaite approximation independently. This gives genuinely different df values for different coefficients.
For a two-group treatment-coded design y ~ factor(group):
Intercept (reference group mean): df = n_ref - 1
Slope (group difference): classic two-group Welch formula
Algorithm for each coefficient j::
W = X @ (X'X)^{-1} # (n, p) projection weights
cell_W_sq[k, j] = sum_{i in cell_k} W[i,j]^2
v[k, j] = s_k^2 * cell_W_sq[k, j] # cell k's contribution
df_j = (sum_k v[k,j])^2 / sum_k [v[k,j]^2 / (n_k - 1)]
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
X | NDArray[float64] | Design matrix, shape (n, p). | required |
cell_info | CellInfo | Cell-based variance information from compute_cell_info(). | required |
Returns:
| Type | Description |
|---|---|
NDArray[float64] | Per-coefficient degrees of freedom, shape (p,). |
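The vectorized algorithm above, sketched in NumPy (`welch_df_per_coef_sketch` is a hypothetical helper taking cell arrays directly rather than a CellInfo):

```python
import numpy as np

def welch_df_per_coef_sketch(X, cell_variances, cell_counts, cell_indices):
    """Per-coefficient Welch-Satterthwaite df, vectorized over columns of X."""
    W = X @ np.linalg.inv(X.T @ X)  # (n, p) projection weights
    K = len(cell_variances)
    # cell_W_sq[k, j] = sum over cell k of W[i, j]^2
    cell_W_sq = np.array([np.sum(W[cell_indices == k] ** 2, axis=0)
                          for k in range(K)])             # (K, p)
    v = cell_variances[:, None] * cell_W_sq               # per-cell contributions
    return v.sum(axis=0) ** 2 / np.sum(v**2 / (cell_counts - 1)[:, None], axis=0)

# two-group treatment-coded design: n_ref = 4, n_other = 6
X = np.column_stack([np.ones(10), np.r_[np.zeros(4), np.ones(6)]])
idx = np.r_[np.zeros(4, int), np.ones(6, int)]
df = welch_df_per_coef_sketch(X, np.array([1.0, 4.0]), np.array([4, 6]), idx)
# df[0] ≈ 3 (n_ref - 1); df[1] ≈ classic two-group Welch df
```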
Notes:
This is a bossanova extension with no direct R equivalent. R’s
sandwich/emmeans/coeftest use residual df with HC SEs.
Bossanova computes proper Satterthwaite df per coefficient, which
for a two-group model exactly matches the classic Welch t-test
(t.test(var.equal=FALSE) / scipy.stats.ttest_ind).
For symmetric designs (equal n and variance), all df are equal
The total variance sum_k v[k,j] equals vcov[j,j] from compute_welch_vcov
Loops over cells (typically 2-5), vectorized over coefficients
Examples:
import numpy as np
# Two-group treatment-coded design
X = np.column_stack([np.ones(20), np.array([0]*10 + [1]*10)])
# ... with cell_info from compute_cell_info ...
df = compute_welch_satterthwaite_df_per_coef(X, cell_info)
# df[0] ≈ n_ref - 1 (intercept)
# df[1] ≈ classic Welch df (slope)
compute_welch_vcov¶
compute_welch_vcov(X: NDArray[np.float64], cell_info: CellInfo) -> NDArray[np.float64]
Compute Welch sandwich variance-covariance matrix.
Uses per-cell residual variances to build a heteroscedasticity-consistent vcov that accounts for unequal variances across groups:
V = (X'X)⁻¹ X' Ω X (X'X)⁻¹
where Ω = diag(σ²_j) assigns each observation the variance of its cell.
For a two-group comparison this gives the same SE as the classical Welch t-test: SE = sqrt(s₁²/n₁ + s₂²/n₂).
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
X | NDArray[float64] | Design matrix, shape (n, p). | required |
cell_info | CellInfo | Cell-based variance information from compute_cell_info(). | required |
Returns:
| Type | Description |
|---|---|
NDArray[float64] | Variance-covariance matrix, shape (p, p). |
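A NumPy sketch of this sandwich (`welch_vcov_sketch` is a hypothetical helper taking cell arrays directly rather than a CellInfo), with the two-group check described above:

```python
import numpy as np

def welch_vcov_sketch(X, cell_variances, cell_indices):
    """V = (X'X)^-1 X' Omega X (X'X)^-1 with per-cell residual variances."""
    XtX_inv = np.linalg.inv(X.T @ X)
    omega = cell_variances[cell_indices]  # each obs gets its cell's variance
    meat = (X * omega[:, None]).T @ X     # X' Omega X
    return XtX_inv @ meat @ XtX_inv

# two-group check: SE of the group difference is sqrt(s1²/n1 + s2²/n2)
X = np.column_stack([np.ones(10), np.r_[np.zeros(4), np.ones(6)]])
idx = np.r_[np.zeros(4, int), np.ones(6, int)]
V = welch_vcov_sketch(X, np.array([1.0, 4.0]), idx)
se_diff = np.sqrt(V[1, 1])
```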
extract_factors_from_formula¶
extract_factors_from_formula(formula: str, data: pl.DataFrame) -> list[str]
Extract factor (categorical) column names from a model formula.
Parses the RHS of the formula and identifies which columns are categorical (String, Categorical, or Enum type in polars).
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
formula | str | Model formula like “y ~ A + B + x” or “y ~ A * B + center(x)”. | required |
data | DataFrame | DataFrame containing the columns referenced in the formula. | required |
Returns:
| Type | Description |
|---|---|
list[str] | List of column names that are categorical/factor variables. |
Examples:
import polars as pl
df = pl.DataFrame({
"y": [1.0, 2.0, 3.0],
"group": ["A", "B", "A"],
"x": [0.1, 0.2, 0.3],
})
extract_factors_from_formula("y ~ group + x", df)
# ['group']
# Multiple factors
df2 = df.with_columns(pl.col("group").alias("treatment"))
extract_factors_from_formula("y ~ group * treatment", df2)
# ['group', 'treatment']
Notes:
Handles interaction terms (*, :) by extracting component columns
Handles transforms like center(x) by extracting the inner column
Skips intercept terms (1, 0, -1)
Only returns columns that exist in the data