
Within-Subject Designs

UCSD Psychology
| Classical Test | bossanova Equivalent | When |
| --- | --- | --- |
| Paired t-test | `model("y ~ condition + (1\|subject)", df)` | 2 conditions |
| RM-ANOVA | `model("y ~ condition + (1\|subject)", df)` | 3+ conditions |
| Wilcoxon signed-rank (paired) | `model("rank(y) ~ condition + (1\|subject)", df)` | 2 conditions, robust |
| Friedman test | `model("rank(y) ~ condition + (1\|subject)", df)` | 3+ conditions, robust |
| RM-ANOVA + covariate | `model("y ~ condition + covariate + (1\|subject)", df)` | Between-subject covariate |

Notice that paired tests and repeated-measures tests use the same formula. Adding (1|subject) accounts for within-subject variation: with 2 conditions this is equivalent to the paired t-test; with 3+ conditions it becomes RM-ANOVA.

Two conditions (paired t-test)

Classical:

$$
d_i = x_{1i} - x_{2i}, \quad t = \frac{\bar{d}}{s_d / \sqrt{n}}, \quad t \sim t(n-1) \text{ under } H_0: \mu_d = 0
$$

where $n$ is the number of paired observations.

As GLM (mixed):

$$
y_{ij} \sim \mathcal{N}(\mu_{ij}, \sigma^2), \quad \mu_{ij} = \beta_0 + \beta_1 x_j + u_i, \quad u_i \sim \mathcal{N}(0, \sigma_u^2)
$$

The varying intercept $u_i$ absorbs subject-level baseline differences -- equivalent to computing difference scores. The test of $H_0: \beta_1 = 0$ asks whether the conditions differ after accounting for within-subject variation.
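This reduction can be checked directly with scipy on simulated paired data (hypothetical values, not the penguins dataset): the paired t-test is exactly a one-sample t-test on the difference scores.

```python
import numpy as np
from scipy.stats import ttest_1samp, ttest_rel

# Simulated paired data (hypothetical): each subject measured in two conditions
rng = np.random.default_rng(0)
n = 30
baseline = rng.normal(100, 10, n)            # subject-level baselines (the u_i)
cond1 = baseline + rng.normal(0, 5, n)
cond2 = baseline + 3 + rng.normal(0, 5, n)   # true condition effect = +3

# The paired t-test equals a one-sample t-test on difference scores --
# the same reduction the varying intercept performs in the mixed model
t_paired = ttest_rel(cond2, cond1)
t_diff = ttest_1samp(cond2 - cond1, 0.0)
print(t_paired.statistic, t_diff.statistic)  # identical
```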

scipy

male = penguins.filter(pl.col("sex") == "male")["body_mass_g"].to_numpy()
female = penguins.filter(pl.col("sex") == "female")["body_mass_g"].to_numpy()

scipy_ttest = ttest_ind(male, female)
scipy_ttest
TtestResult(statistic=np.float64(8.541720337994516), pvalue=np.float64(4.897246751596224e-16), df=np.float64(331.0))

bossanova (simple linear model)

Without varying effects, the linear model recovers the classical independent t-test exactly:

m_lm = model("body_mass_g ~ sex", penguins).fit().infer()

m_lm.params[1].select("term", "statistic", "df", "p_value")
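That equivalence can also be verified by hand with numpy on simulated two-group data (hypothetical values standing in for the sex comparison): the OLS slope t-statistic for a 0/1 dummy equals the pooled-variance independent t-test.

```python
import numpy as np
from scipy.stats import ttest_ind

# Simulated two-group data (hypothetical body-mass-like values)
rng = np.random.default_rng(1)
g1 = rng.normal(4500, 400, 60)
g2 = rng.normal(4100, 400, 60)

# OLS with a 0/1 dummy predictor: y = b0 + b1 * x
y = np.concatenate([g1, g2])
x = np.concatenate([np.ones(60), np.zeros(60)])
X = np.column_stack([np.ones_like(x), x])
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
resid = y - X @ beta
sigma2 = resid @ resid / (len(y) - 2)                  # residual variance
se_b1 = np.sqrt(sigma2 * np.linalg.inv(X.T @ X)[1, 1])
t_ols = beta[1] / se_b1

# Identical to the pooled-variance independent t-test
t_classic = ttest_ind(g1, g2).statistic
print(t_ols, t_classic)
```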

bossanova (mixed model)

Adding (1|species) accounts for species-level variation, giving a more powerful test -- the t-statistic is larger because residual variance is reduced:

m = model("body_mass_g ~ sex + (1|species)", penguins).fit().infer()

m.params.select("term", "statistic", "df", "p_value")

The mixed-model statistic differs from the classical t-test because it partitions variance into species-level and residual components. By accounting for group structure, the test becomes more sensitive.
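A minimal sketch of why this happens, using simulated data and within-group mean-centering as a stand-in for the varying intercept (plain scipy, not bossanova): removing group baselines shrinks the residual variance, so the same effect yields a larger t-statistic.

```python
import numpy as np
from scipy.stats import ttest_ind

# Simulated data: a small condition effect (+2) inside three groups whose
# baselines differ by far more than the effect (the "species" role)
rng = np.random.default_rng(2)
group_means = np.repeat([0.0, 30.0, 60.0], 40)
cond = np.tile([0, 1], 60)
y = group_means + 2.0 * cond + rng.normal(0, 3, 120)

# Pooled test: group-level variance inflates the residual, masking the effect
t_pooled = ttest_ind(y[cond == 1], y[cond == 0]).statistic

# Centering within each group removes baseline differences, mimicking
# what the varying intercept does
y_centered = y.copy()
for g in range(3):
    idx = slice(g * 40, (g + 1) * 40)
    y_centered[idx] -= y_centered[idx].mean()
t_within = ttest_ind(y_centered[cond == 1], y_centered[cond == 0]).statistic
print(t_pooled, t_within)  # t_within is much larger
```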

Three+ conditions (RM-ANOVA)

Classical:

$$
F = \frac{MS_{\text{condition}}}{MS_{\text{error}}}, \quad F \sim F(k-1,\; (k-1)(n-1)) \text{ under } H_0 \text{ (assuming sphericity)}
$$

As GLM (mixed):

$$
y_{ij} \sim \mathcal{N}(\mu_{ij}, \sigma^2), \quad \mu_{ij} = \beta_0 + \sum_{l=1}^{k-1} \beta_l x_{lj} + u_i, \quad u_i \sim \mathcal{N}(0, \sigma_u^2)
$$

Standard ANOVA ignores within-group correlation, inflating Type I error. The varying intercept captures group-level baseline differences, properly partitioning variance.
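The classical F ratio can be computed by hand. This sketch uses simulated three-group data (hypothetical flipper-length-like values, one-way between-subjects to match the scipy comparison) and reproduces `f_oneway` exactly.

```python
import numpy as np
from scipy.stats import f_oneway

# Simulated three-group data (hypothetical values)
rng = np.random.default_rng(3)
groups = [rng.normal(m, 5, 30) for m in (190.0, 196.0, 217.0)]

# Between- and within-group mean squares, as in the F ratio
y = np.concatenate(groups)
k = len(groups)
grand = y.mean()
ss_between = sum(len(g) * (g.mean() - grand) ** 2 for g in groups)
ss_within = sum(((g - g.mean()) ** 2).sum() for g in groups)
F = (ss_between / (k - 1)) / (ss_within / (len(y) - k))

print(F, f_oneway(*groups).statistic)  # identical
```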

scipy

adelie = penguins.filter(pl.col("species") == "Adelie")["flipper_length_mm"].to_numpy()
chinstrap = penguins.filter(pl.col("species") == "Chinstrap")["flipper_length_mm"].to_numpy()
gentoo = penguins.filter(pl.col("species") == "Gentoo")["flipper_length_mm"].to_numpy()

scipy_anova = f_oneway(adelie, chinstrap, gentoo)
scipy_anova
F_onewayResult(statistic=np.float64(567.4069920123421), pvalue=np.float64(1.5874180554406345e-107))

bossanova

m_rm = model("flipper_length_mm ~ species + (1|island)", penguins).fit().infer()

m_rm.params.select("term", "estimate", "statistic", "p_value")
# Joint F-test for species effect
m_rm.infer("joint").effects
# Random effects variance and model fit
m_rm.diagnostics.select("aic", "bic", "rsquared_marginal", "rsquared_conditional", "icc")

Rank-based variants (robust)

The rank() transformation makes within-subject inference robust to outliers and non-normality. With 2 conditions this parallels the Wilcoxon signed-rank test; with 3+ it parallels the Friedman test.

Two conditions (Wilcoxon)

Classical:

$$
W^+ = \sum_{d_i > 0} R_i, \quad \text{where } R_i = \text{rank}(|d_i|) \text{ and } d_i = x_{1i} - x_{2i}
$$

As GLM (mixed):

$$
y_{ij}^* \sim \mathcal{N}(\mu_{ij}, \sigma^2), \quad \mu_{ij} = \beta_0 + \beta_1 x_j + u_i, \quad \text{where } y_{ij}^* = \text{rank}(y_{ij})
$$

scipy

scipy_mann = mannwhitneyu(male, female)
scipy_mann
MannwhitneyuResult(statistic=np.float64(20845.5), pvalue=np.float64(1.8133343032461053e-15))

bossanova

m_wilcox = model("rank(body_mass_g) ~ sex + (1|species)", penguins).fit().infer()

m_wilcox.params.filter(pl.col("term").str.contains("sex")).select("term", "statistic", "p_value")

scipy reports the Mann-Whitney U statistic; bossanova reports a t-statistic on ranks with varying intercepts. The test statistics differ, but both test $H_0$: no group difference, and they yield equivalent conclusions.
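That agreement can be sketched with scipy alone (simulated skewed data, no bossanova): a t-test on the pooled ranks tracks the Mann-Whitney test, and both reject when the groups genuinely differ.

```python
import numpy as np
from scipy.stats import mannwhitneyu, rankdata, ttest_ind

# Simulated skewed (lognormal) data where a plain t-test would be fragile
rng = np.random.default_rng(4)
a = rng.lognormal(1.0, 0.5, 50)
b = rng.lognormal(1.6, 0.5, 50)

# Pool, rank, then run the parametric test on the ranks
ranks = rankdata(np.concatenate([a, b]))
t_ranks = ttest_ind(ranks[:50], ranks[50:])
u_test = mannwhitneyu(a, b)
print(t_ranks.pvalue, u_test.pvalue)  # both detect the difference
```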

Three+ conditions (Friedman)

Classical:

$$
\chi^2_F = \frac{12}{nk(k+1)} \sum_{j=1}^{k} R_j^2 - 3n(k+1), \quad \chi^2_F \dot{\sim} \chi^2(k-1) \text{ under } H_0
$$

where $R_j$ is the sum of ranks for condition $j$ across $n$ subjects.

As GLM (mixed):

$$
y_{ij}^* \sim \mathcal{N}(\mu_{ij}, \sigma^2), \quad \mu_{ij} = \beta_0 + \sum_{l=1}^{k-1} \beta_l x_{lj} + u_i, \quad \text{where } y_{ij}^* = \text{rank}(y_{ij})
$$

A joint $F$-test on the condition coefficients parallels the Friedman $\chi^2$.
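For the between-subjects analog (Kruskal-Wallis, which the scipy comparison uses), the rank-GLM correspondence is exact: a one-way F on the pooled ranks and the Kruskal-Wallis H are linked by the identity F = ((N-k) H) / ((k-1)(N-1-H)) for tie-free data. A sketch on simulated data:

```python
import numpy as np
from scipy.stats import f_oneway, kruskal, rankdata

# Simulated tie-free data for three groups
rng = np.random.default_rng(5)
groups = [rng.normal(m, 1.0, 25) for m in (0.0, 0.5, 1.2)]

H = kruskal(*groups).statistic

# One-way F on the pooled ranks
ranks = rankdata(np.concatenate(groups))
rank_groups = [ranks[i * 25:(i + 1) * 25] for i in range(3)]
F_ranks = f_oneway(*rank_groups).statistic

# Exact identity (no ties): F = ((N-k) H) / ((k-1)(N-1-H))
N, k = 75, 3
print(F_ranks, ((N - k) * H) / ((k - 1) * (N - 1 - H)))
```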

scipy

scipy_kruskal = kruskal(adelie, chinstrap, gentoo)
scipy_kruskal
KruskalResult(statistic=np.float64(237.34574750210166), pvalue=np.float64(2.890851468876691e-52))
bossanova

m_friedman = model("rank(flipper_length_mm) ~ species + (1|island)", penguins).fit().infer("joint")

m_friedman.effects

Adding covariates

Mixed models naturally extend to include between-subject covariates -- there is no simple classical equivalent.

m_cov = model("flipper_length_mm ~ species + sex + (1|island)", penguins).fit().infer()

m_cov.params.select("term", "estimate", "p_value")
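What covariate adjustment buys can be sketched with plain OLS on simulated data (no varying intercept, hypothetical values): when the covariate is confounded with group membership, the unadjusted group coefficient absorbs the covariate's effect, while the adjusted model isolates the true group effect.

```python
import numpy as np

# Simulated confounding: true group effect = 1, true covariate effect = 3
rng = np.random.default_rng(6)
n = 100
group = rng.integers(0, 2, n).astype(float)
covariate = 2.0 * group + rng.normal(0, 1, n)   # correlated with group
y = 1.0 * group + 3.0 * covariate + rng.normal(0, 1, n)

# Unadjusted: the group coefficient absorbs the covariate's effect (~7)
X1 = np.column_stack([np.ones(n), group])
b_unadj, *_ = np.linalg.lstsq(X1, y, rcond=None)

# Adjusted: including the covariate isolates the group effect (~1)
X2 = np.column_stack([np.ones(n), group, covariate])
b_adj, *_ = np.linalg.lstsq(X2, y, rcond=None)
print(b_unadj[1], b_adj[1])
```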