Skip to article frontmatterSkip to article content
Site not loading correctly?

This may be due to an incorrect BASE_URL configuration. See the MyST Documentation for reference.

Classical Testbossanova EquivalentUse Case
One-way ANOVAmodel("y ~ group", df).infer("joint").effects3+ groups, equal variance
Welch’s ANOVAmodel("y ~ group", df).infer("joint").infer(errors="unequal_var").effects3+ groups, unequal variance
Kruskal-Wallismodel("rank(y) ~ group", df).infer("joint").effects3+ groups, robust
Two-way ANOVAmodel("y ~ a * b", df).infer("joint").effectsFactorial design
ANCOVAmodel("y ~ group + covariate", df)Group comparison with adjustment

All examples use the included penguins dataset.

One-Way ANOVA

Classical:

F=MSbetweenMSwithin=SSB/(k1)SSW/(Nk),FF(k1,  Nk) under H0F = \frac{MS_{\text{between}}}{MS_{\text{within}}} = \frac{SS_B/(k-1)}{SS_W/(N-k)}, \quad F \sim F(k-1,\; N-k) \text{ under } H_0

As GLM:

yijN(μj,σ2),μj=β0+β1x1j++βk1x(k1)jy_{ij} \sim \mathcal{N}(\mu_j, \sigma^2), \quad \mu_j = \beta_0 + \beta_1 x_{1j} + \cdots + \beta_{k-1} x_{(k-1)j}

Joint test H0:β1==βk1=0H_0: \beta_1 = \cdots = \beta_{k-1} = 0 yields the same F(k1,Nk)F(k-1, N-k) statistic.

scipy

from scipy.stats import f_oneway

adelie = penguins.filter(pl.col("species") == "Adelie")["body_mass_g"].to_numpy()
chinstrap = penguins.filter(pl.col("species") == "Chinstrap")["body_mass_g"].to_numpy()
gentoo = penguins.filter(pl.col("species") == "Gentoo")["body_mass_g"].to_numpy()

scipy_anova = f_oneway(adelie, chinstrap, gentoo)
scipy_anova
F_onewayResult(statistic=np.float64(341.8948949481461), pvalue=np.float64(3.74450512630046e-81))

bossanova

m = model("body_mass_g ~ species", penguins).fit().infer("joint")

m.effects
Loading...

Welch’s ANOVA (Unequal Variances)

Classical:

FW=1k1j=1kwj(xˉjμ^w)21+2(k2)k21j=1k(1wj/W)2nj1,wj=njsj2,W=wjF_W = \frac{1}{k-1} \frac{\displaystyle\sum_{j=1}^{k} w_j (\bar{x}_j - \hat{\mu}_w)^2}{1 + \frac{2(k-2)}{k^2-1} \displaystyle\sum_{j=1}^{k} \frac{(1 - w_j/W)^2}{n_j - 1}}, \quad w_j = \frac{n_j}{s_j^2}, \quad W = \sum w_j

μ^w=wjxˉjW,FW˙F(k1,  dfW)\hat{\mu}_w = \frac{\sum w_j \bar{x}_j}{W}, \quad F_W \dot{\sim} F(k-1,\; df_W)

As GLM:

yijN(μj,σj2),μj=β0+β1x1j++βk1x(k1)jy_{ij} \sim \mathcal{N}(\mu_j, \sigma_j^2), \quad \mu_j = \beta_0 + \beta_1 x_{1j} + \cdots + \beta_{k-1} x_{(k-1)j}

Same structural model as one-way ANOVA — only the variance estimator changes. Per-group variances σj2\sigma_j^2 yield Welch-Satterthwaite adjusted degrees of freedom for each coefficient’s tt-test.

bossanova

m_welch = model("body_mass_g ~ species", penguins).fit().infer(errors="unequal_var")

m_welch.params.select("term", "estimate", "df", "p_value")
Loading...

Kruskal-Wallis Test

Classical:

H=12N(N+1)j=1kRj2nj3(N+1),H˙χ2(k1) under H0H = \frac{12}{N(N+1)} \sum_{j=1}^{k} \frac{R_j^2}{n_j} - 3(N+1), \quad H \dot{\sim} \chi^2(k-1) \text{ under } H_0

where RjR_j is the sum of ranks in group jj.

As GLM:

yijN(μj,σ2),μj=β0+β1x1j+,where yij=rank(yij)y_{ij}^* \sim \mathcal{N}(\mu_j, \sigma^2), \quad \mu_j = \beta_0 + \beta_1 x_{1j} + \cdots, \quad \text{where } y_{ij}^* = \text{rank}(y_{ij})

Joint test H0:β1==βk1=0H_0: \beta_1 = \cdots = \beta_{k-1} = 0 yields FranksH/(k1)F_{\text{ranks}} \approx H/(k-1). The rank transformation makes ANOVA robust to outliers and non-normality.

scipy

from scipy.stats import kruskal

scipy_kw = kruskal(
    penguins.filter(pl.col("species") == "Adelie")["body_mass_g"].to_numpy(),
    penguins.filter(pl.col("species") == "Chinstrap")["body_mass_g"].to_numpy(),
    penguins.filter(pl.col("species") == "Gentoo")["body_mass_g"].to_numpy(),
)
scipy_kw
KruskalResult(statistic=np.float64(212.08513173193893), pvalue=np.float64(8.836876744281845e-47))

bossanova

m_kw = model("rank(body_mass_g) ~ species", penguins).fit().infer("joint")

m_kw.effects
Loading...

scipy reports the Kruskal-Wallis H statistic (χ²-distributed); bossanova reports an F-ratio on ranks. The test statistics differ but both test H₀: all group locations equal and yield equivalent p-values.

Two-Way ANOVA

Classical:

FA=MSAMSEF(a1,  Nab),FB=MSBMSEF(b1,  Nab),FAB=MSABMSEF((a1)(b1),  Nab)F_A = \frac{MS_A}{MS_E} \sim F(a-1,\; N-ab), \quad F_B = \frac{MS_B}{MS_E} \sim F(b-1,\; N-ab), \quad F_{AB} = \frac{MS_{AB}}{MS_E} \sim F((a-1)(b-1),\; N-ab)

As GLM:

yijkN(μij,σ2),μij=β0+αi+γj+(αγ)ijy_{ijk} \sim \mathcal{N}(\mu_{ij}, \sigma^2), \quad \mu_{ij} = \beta_0 + \alpha_i + \gamma_j + (\alpha\gamma)_{ij}

Each classical FF corresponds to a joint test on the corresponding GLM coefficients. The interaction (αγ)ij(\alpha\gamma)_{ij} tests whether the effect of one factor depends on the level of the other.

bossanova

m_2way = model("body_mass_g ~ species * sex", penguins).fit().infer("joint")

m_2way.effects
Loading...

ANCOVA (Covariate Adjustment)

Classical:

yˉjadj=yˉjβ^w(xˉjxˉ),F=MSgroups (adj)MSEF(k1,  Nk1) under H0\bar{y}_j^{\text{adj}} = \bar{y}_j - \hat{\beta}_w(\bar{x}_j - \bar{x}_{\cdot\cdot}), \quad F = \frac{MS_{\text{groups (adj)}}}{MS_E} \sim F(k-1,\; N-k-1) \text{ under } H_0

where β^w\hat{\beta}_w is the pooled within-group regression slope.

As GLM:

yiN(μi,σ2),μi=β0+β1gi+β2xiy_i \sim \mathcal{N}(\mu_i, \sigma^2), \quad \mu_i = \beta_0 + \beta_1 g_i + \beta_2 x_i

The group effect β1\beta_1 is the adjusted mean difference after controlling for the covariate xix_i. ANCOVA reduces residual variance and removes confounding by the covariate.

bossanova

# Compare species controlling for flipper length
m_ancova = model("body_mass_g ~ species + flipper_length_mm", penguins).fit().infer()

m_ancova.params.select("term", "estimate", "p_value")
Loading...