Heteroscedastic Errors

Every linear model makes assumptions about the errors $\epsilon_i$ . The default assumption is iid (independent and identically distributed): all errors have the same variance and are uncorrelated. When this assumption doesn’t hold, you don’t need a different test—you just need a different error assumption.

bossanova lets you switch error assumptions with a single argument to .infer():

Assumption	`errors=`	When to use
Equal variance (default)	`"iid"`	Standard setting, homoscedastic data
Unequal group variances	`"unequal_var"`	Groups have different spreads
Unknown heteroscedasticity	`"hetero"`	Variance depends on predictors

The model stays the same—only the inference changes. This is the model-based advantage: separating the structural model from the error model.

import numpy as np
from scipy import stats
import polars as pl
from bossanova.model import model
from bossanova import load_dataset

np.random.seed(42)

# Load penguins dataset
penguins = load_dataset("penguins").drop_nulls().filter(pl.col("sex") != "NA")

# Extract for scipy comparison
adelie = penguins.filter(pl.col("species") == "Adelie")["body_mass_g"].to_numpy()
gentoo = penguins.filter(pl.col("species") == "Gentoo")["body_mass_g"].to_numpy()

Equal Variance (Default)¶

The standard assumption: all errors share the same variance $\sigma^2$ . Under this assumption, the coefficient variance is:

\text{Var}(\hat{\beta}) = \hat{\sigma}^2 (\mathbf{X}'\mathbf{X})^{-1}, \quad \hat{\sigma}^2 = \frac{\text{RSS}}{n - p}

(1)

This gives the classical t-test when comparing two groups.

from scipy.stats import ttest_ind

# scipy (equal variance)
scipy_eq = ttest_ind(gentoo, adelie, equal_var=True)
scipy_eq

TtestResult(statistic=np.float64(23.466803147391744), pvalue=np.float64(1.8806652580952688e-66), df=np.float64(263.0))

df = penguins.filter(pl.col("species").is_in(["Adelie", "Gentoo"]))
m = model("body_mass_g ~ species", df).fit().infer(errors="iid")

m.params[1].select("term", "statistic", "df", "p_value")

Unequal Group Variances¶

When groups have different spreads, the standard t-test is anticonservative. The "unequal_var" option replaces the pooled $\hat{\sigma}^2$ with per-group variance estimates and applies a Welch-Satterthwaite degrees of freedom adjustment — the same correction behind Welch’s t-test:

\text{Var}(\hat{\beta}_j) = \sum_g \frac{s_g^2}{n_g} c_{gj}^2, \quad df_j = \frac{\text{Var}(\hat{\beta}_j)^2}{\sum_g \frac{(s_g^2 c_{gj}^2 / n_g)^2}{n_g - 1}}

(2)

where $c_{gj}$ are the contrast coefficients for group $g$ on parameter $j$ .

scipy¶

scipy_welch = ttest_ind(gentoo, adelie, equal_var=False)
scipy_welch

TtestResult(statistic=np.float64(23.25392442915641), pvalue=np.float64(1.223170419256714e-63), df=np.float64(242.14429956468885))

bossanova¶

m_welch = model("body_mass_g ~ species", df).fit().infer(errors="unequal_var")

m_welch.params[1].select("term", "statistic", "df", "p_value")

The fractional df (instead of the integer $n - p$ ) confirms the Welch-Satterthwaite adjustment is active.

bn_t_w = float(m_welch.params["statistic"][1])
bn_df_w = float(m_welch.params["df"][1])
assert np.isclose(abs(bn_t_w), abs(scipy_welch.statistic), rtol=1e-3), f"t mismatch: {bn_t_w} vs {scipy_welch.statistic}"
assert np.isclose(bn_df_w, scipy_welch.df, rtol=1e-3), f"df mismatch: {bn_df_w} vs {scipy_welch.df}"

This works with covariates too—no classical test equivalent:

m_cov = model("body_mass_g ~ species + flipper_length_mm", penguins).fit().infer(
    errors="unequal_var"
)

m_cov.params.select("term", "estimate", "df", "p_value")

Heteroscedasticity-Consistent Errors¶

When you suspect non-constant variance but don’t know its form, use errors="hetero". This applies a sandwich (HC3) variance estimator that gives valid standard errors regardless of the variance structure:

\text{Var}(\hat{\beta}) = (\mathbf{X}'\mathbf{X})^{-1} \mathbf{X}' \boldsymbol{\Omega} \mathbf{X} (\mathbf{X}'\mathbf{X})^{-1}

(3)

where $\boldsymbol{\Omega} = \text{diag}(\hat{e}_i^2 / (1-h_{ii})^2)$ with $h_{ii}$ being leverage.

df_reg = penguins.select("body_mass_g", "flipper_length_mm")

# Standard (assumes equal variance)
m_iid = model("body_mass_g ~ flipper_length_mm", df_reg).fit().infer(errors="iid")

# Heteroscedasticity-consistent
m_hetero = model("body_mass_g ~ flipper_length_mm", df_reg).fit().infer(errors="hetero")

pl.DataFrame({
    "errors": ["iid", "hetero"],
    "slope_se": [
        float(m_iid.params["se"][1]),
        float(m_hetero.params["se"][1])
    ],
    "slope_p": [
        float(m_iid.params["p_value"][1]),
        float(m_hetero.params["p_value"][1])
    ]
})

When the equal-variance assumption holds, both methods give similar SEs. When it doesn’t, the "hetero" SEs are more trustworthy.