In mixed models, there are two fundamentally different ways to interpret the effects of predictors. Conditional effects describe what happens for a specific group -- a particular classroom, patient, or experimental subject. They answer: “given that I know which group this observation belongs to, what’s the expected outcome?” Marginal effects describe what happens on average across all groups, integrating over the distribution of random effects. They answer: “for a randomly chosen observation from the population, what’s the expected outcome?”
For linear mixed models, the distinction is often academic -- marginal and conditional slopes are identical because the identity link function preserves additivity. But for generalized linear mixed models (GLMMs), the two can differ dramatically. The nonlinear link function (logit, log, etc.) means that averaging predictions across groups is not the same as predicting at the average random effect. Understanding this distinction is crucial for interpreting mixed model results correctly, and for knowing what your software is actually reporting.
Linear mixed models: the easy case¶
Intercepts cancel in slopes (MAR.1)¶
In a random-intercept LMM, each group has its own intercept but shares a common slope. When computing the effect of a predictor (the change in y per unit change in x), the group intercepts cancel out. This means the conditional slope (within any specific group) and the marginal slope (averaged across groups) are the same.
n_groups = 5
n_per = 30
n = n_groups * n_per
group = np.repeat(np.arange(n_groups), n_per)
# Random intercepts, common slope
group_intercepts = np.array([60, 65, 70, 75, 80])
true_slope = 2.0
x = np.random.normal(0, 1, n)
y = group_intercepts[group] + true_slope * x + np.random.normal(0, 1, n)
# Conditional prediction: y_i = (intercept_g + slope * x)
# Marginal prediction: E[y] = E[intercept_g] + slope * x = grand_mean + slope * x
# The SLOPE is the same either way!
for g in range(n_groups):
    mask = group == g
    # Group-specific OLS slope
    x_g = x[mask]
    y_g = y[mask]
    slope_g = np.cov(x_g, y_g)[0, 1] / np.var(x_g, ddof=1)  # ddof=1 to match np.cov's default
    print(f"Group {g} slope: {slope_g:.3f}")
print(f"\nTrue slope: {true_slope:.3f}")
print("→ Marginal slope = Conditional slope (intercepts cancel)")
Group 0 slope: 2.298
Group 1 slope: 2.105
Group 2 slope: 1.782
Group 3 slope: 2.147
Group 4 slope: 1.992
True slope: 2.000
→ Marginal slope = Conditional slope (intercepts cancel)
Property MAR.1: In a random-intercept LMM, marginal and conditional slopes are identical because group intercepts cancel when computing derivatives.
Why do the intercepts cancel?
For a random-intercept model, y = (β₀ + b_g) + β₁x + ε. The slope ∂y/∂x = β₁ regardless of b_g. Marginalizing over groups, E[y | x] = β₀ + β₁x (since E[b_g] = 0), so the marginal slope is also β₁. The intercepts shift each group's line up or down, but the slope -- the rate of change with respect to x -- is shared across all groups and survives marginalization.
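The cancellation can also be checked directly, without any noise. This is a minimal sketch with made-up intercepts (independent of the simulation above): build five shifted lines, average them pointwise, and recover the slope of the averaged line.

```python
import numpy as np

xs = np.linspace(-2, 2, 9)
common_slope = 2.0
line_intercepts = np.array([60, 65, 70, 75, 80])

# Each row is one group's (noise-free) line; the marginal line is
# the pointwise average over groups.
shifted_lines = line_intercepts[:, None] + common_slope * xs[None, :]
marginal_line = shifted_lines.mean(axis=0)

# Slope of the averaged line, recovered by least squares
marginal_slope_check = np.polyfit(xs, marginal_line, 1)[0]
print(f"Slope of averaged line: {marginal_slope_check:.6f}")  # 2.000000
```

Averaging the intercepts moves the line up or down but leaves its slope untouched, which is exactly the statement of MAR.1.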
Conditional = Fixed + Random (MAR.2)¶
The conditional mean for a specific group combines two components: the population-level prediction (fixed effects) and the group-specific deviation (random effects). This decomposition is what makes mixed models interpretable -- you can see both the “typical” prediction and how each group deviates from it.
Property MAR.2: The conditional mean for group g is E[y | g, x] = (β₀ + b_g) + β₁x: the fixed-effects prediction plus the group-specific random deviation.
# Population prediction (fixed effects only)
grand_intercept = np.mean(group_intercepts)
y_pop = grand_intercept + true_slope * np.linspace(-2, 2, 5)
# Group-specific prediction (fixed + random)
print(f"Population intercept: {grand_intercept}")
print(f"\nConditional means = Fixed + Random:")
for g in range(n_groups):
    offset = group_intercepts[g] - grand_intercept
    print(f"  Group {g}: {grand_intercept:.0f} + {offset:+.0f} = {group_intercepts[g]:.0f}")
Population intercept: 70.0
Conditional means = Fixed + Random:
Group 0: 70 + -10 = 60
Group 1: 70 + -5 = 65
Group 2: 70 + +0 = 70
Group 3: 70 + +5 = 75
Group 4: 70 + +10 = 80
Balanced design equivalence (MAR.3)¶
With balanced designs (equal group sizes), the marginal mean equals the simple average of group means. This is the simplest case: no group is over- or under-represented, so the population average is just the arithmetic mean of the group-level predictions.
Property MAR.3: With balanced designs (equal group sizes), the marginal mean equals the simple average of group means.
# Balanced design: each group has exactly n_per observations
group_means = np.array([y[group == g].mean() for g in range(n_groups)])
marginal_mean = y.mean()
avg_of_group_means = group_means.mean()
print("Group means:")
for g in range(n_groups):
    print(f"  Group {g}: {group_means[g]:.2f}")
print(f"\nMarginal mean (overall): {marginal_mean:.2f}")
print(f"Average of group means: {avg_of_group_means:.2f}")
print(f"Difference: {abs(marginal_mean - avg_of_group_means):.4f}")
print("\n→ With balanced groups, these are identical")
Group means:
Group 0: 59.90
Group 1: 64.82
Group 2: 70.07
Group 3: 74.82
Group 4: 79.92
Marginal mean (overall): 69.91
Average of group means: 69.91
Difference: 0.0000
→ With balanced groups, these are identical
GLMMs: where things get interesting¶
Link scale additivity (MAR.4)¶
In GLMMs, the random effects are additive on the link scale (e.g., log-odds for logistic regression), not on the response scale (probabilities). This is a key distinction: a shift of +1 on the log-odds scale produces a different change in probability depending on where you start. Near p = 0.5, a log-odds shift has a large effect on probability; near p = 0 or p = 1, the same shift has a much smaller effect.
Property MAR.4: On the link scale, η_g = β₀ + β₁x + b_g (additive). On the response scale, p_g = expit(β₀ + β₁x + b_g) (nonlinear in b_g).
# Logistic GLMM: log-odds are additive
beta_fixed = np.array([-1.0, 0.5]) # intercept, slope
# 5 groups with varying intercepts (on log-odds scale)
random_intercepts = np.array([-1.5, -0.5, 0.0, 0.5, 1.5])
x_grid = np.linspace(-3, 3, 7)
print(f"{'x':>4} {'Pop prob':>9} " + " ".join(f"Group {g}" for g in range(5)))
for x_val in x_grid:
    eta_pop = beta_fixed[0] + beta_fixed[1] * x_val
    p_pop = expit(eta_pop)
    group_probs = [expit(eta_pop + b) for b in random_intercepts]
    pop_str = f"{p_pop:.3f}"
    grp_str = " ".join(f"{p:.3f}" for p in group_probs)
    print(f"{x_val:>4.0f} {pop_str:>9} {grp_str}")
   x  Pop prob Group 0 Group 1 Group 2 Group 3 Group 4
-3 0.076 0.018 0.047 0.076 0.119 0.269
-2 0.119 0.029 0.076 0.119 0.182 0.378
-1 0.182 0.047 0.119 0.182 0.269 0.500
0 0.269 0.076 0.182 0.269 0.378 0.622
1 0.378 0.119 0.269 0.378 0.500 0.731
2 0.500 0.182 0.378 0.500 0.622 0.818
3 0.622 0.269 0.500 0.622 0.731 0.881
Jensen’s inequality: the core issue (MAR.5)¶
The problem¶
For GLMMs, the marginal mean E[expit(η + b)] is NOT the same as expit(η), the conditional mean at the average random effect (b = 0).
This is Jensen’s inequality: for a concave function f, E[f(X)] ≤ f(E[X]) (with the inequality reversed for convex f). The logistic function is convex for η < 0 and concave for η > 0, which means that averaging the transformed values is not the same as transforming the average.
Property MAR.5: Due to Jensen’s inequality, the marginal probability is NOT simply the inverse-link applied to the fixed effects alone.
# Jensen's inequality in action
# E[expit(eta + b)] ≠ expit(eta + E[b]) = expit(eta)
eta = 0.0 # population log-odds = 0 → prob = 0.5
sigma_b = 1.5 # random intercept SD
# Monte Carlo: average expit over random intercepts
b_samples = np.random.normal(0, sigma_b, 10000)
marginal_prob = np.mean(expit(eta + b_samples))
conditional_prob = expit(eta) # at E[b] = 0
print(f"Conditional prob (at b=0): {conditional_prob:.4f}")
print(f"Marginal prob (averaged): {marginal_prob:.4f}")
print(f"Difference: {marginal_prob - conditional_prob:.4f}")
print("\n→ At eta = 0 the two coincide by symmetry; the tiny gap is Monte Carlo noise")
Conditional prob (at b=0): 0.5000
Marginal prob (averaged): 0.4993
Difference: -0.0007
→ At eta = 0 the two coincide by symmetry; the tiny gap is Monte Carlo noise
Why does averaging pull the probability toward 0.5?
The logistic function is concave above p = 0.5 (η > 0) and convex below (η < 0). For η > 0, Jensen’s inequality pulls the averaged probability down toward 0.5; for η < 0 it pulls it up toward 0.5; at exactly η = 0 the two effects cancel by symmetry. Intuitively, the logistic curve flattens near 0 and 1 -- groups with extreme random effects are “squashed” toward the boundaries, while groups near the center move more freely. The net effect is that the population-average probability is less extreme than the conditional probability evaluated at the mean random effect (b = 0). The larger the random-effect variance σ_b², the stronger this effect.
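Because the gap vanishes exactly at η = 0 (expit(b) and expit(-b) average to 0.5), a nonzero fixed-effect predictor is needed to see Jensen’s inequality bite. A sketch (the η values here are arbitrary choices for illustration, not from the model above):

```python
import numpy as np
from scipy.special import expit

rng = np.random.default_rng(0)
b = rng.normal(0, 1.5, 200_000)  # random intercepts, sigma_b = 1.5

for eta in [0.0, 1.0, 2.0]:
    conditional = expit(eta)          # probability at the mean random effect (b = 0)
    marginal = expit(eta + b).mean()  # population-average probability
    print(f"eta={eta:.0f}  conditional={conditional:.3f}  marginal={marginal:.3f}")
```

At η = 0 the two agree to Monte Carlo precision; at η = 1 and η = 2 the marginal probability is visibly pulled back toward 0.5.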
The attenuation effect (MAR.6)¶
The problem¶
Marginal slopes in GLMMs are smaller in magnitude (attenuated) than conditional slopes. This is the slope-level consequence of Jensen’s inequality: random effects add variability on the link scale, and when that variability is transformed through the nonlinear inverse link, the resulting curves are flatter on average.
Property MAR.6: On the response scale, the marginal slope in a GLMM is attenuated relative to the conditional slope: |marginal slope| < |conditional slope|.
# Attenuation: marginal slope < conditional slope
beta_cond = 0.5 # conditional (within-group) slope
# Marginal slope via Monte Carlo
x_lo, x_hi = -0.5, 0.5 # small step
b_samples = np.random.normal(0, sigma_b, 10000)
p_lo = np.mean(expit(beta_fixed[0] + beta_cond * x_lo + b_samples))
p_hi = np.mean(expit(beta_fixed[0] + beta_cond * x_hi + b_samples))
marginal_slope_prob = (p_hi - p_lo) / (x_hi - x_lo)
# Conditional slope at b=0
p_lo_cond = expit(beta_fixed[0] + beta_cond * x_lo)
p_hi_cond = expit(beta_fixed[0] + beta_cond * x_hi)
cond_slope_prob = (p_hi_cond - p_lo_cond) / (x_hi - x_lo)
print(f"Conditional slope (prob scale): {cond_slope_prob:.4f}")
print(f"Marginal slope (prob scale): {marginal_slope_prob:.4f}")
print(f"Attenuation ratio: {marginal_slope_prob / cond_slope_prob:.3f}")
print(f"\n→ Marginal slope is ~{(1 - marginal_slope_prob/cond_slope_prob)*100:.0f}% smaller")
Conditional slope (prob scale): 0.0981
Marginal slope (prob scale): 0.0796
Attenuation ratio: 0.812
→ Marginal slope is ~19% smaller
Why are marginal slopes always smaller?
Random intercepts add “noise” to the link-scale prediction. When transformed through the nonlinear link function, this noise averages out to flatter curves. Think of it this way: some groups have their logistic curve shifted left, others shifted right. When you average all these shifted S-curves together, you get a shallower S-curve. More group variation (larger σ_b) means more attenuation. This is why you should always report which type of effect you’re interpreting in GLMM results -- conditional and marginal slopes can differ substantially.
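For the logit link there is a widely used closed-form approximation for this shrinkage (from Zeger, Liang, and Albert’s work on marginalized models): E[expit(η + b)] ≈ expit(η / √(1 + c²σ_b²)) with c = 16√3/(15π) ≈ 0.588, so the marginal curve is roughly a logistic whose linear predictor, and hence slope, is shrunk by a constant factor. A sketch checking it against Monte Carlo, reusing σ_b = 1.5:

```python
import numpy as np
from scipy.special import expit

rng = np.random.default_rng(1)
sigma_b = 1.5
b = rng.normal(0, sigma_b, 200_000)

# Logit-probit approximation constant: expit(x) ≈ Phi(c * x)
c2 = (16 * np.sqrt(3) / (15 * np.pi)) ** 2  # c^2 ≈ 0.346
shrink = 1 / np.sqrt(1 + c2 * sigma_b**2)   # ≈ 0.75 for sigma_b = 1.5

for eta in [-2.0, -1.0, 0.0, 1.0, 2.0]:
    mc = expit(eta + b).mean()    # Monte Carlo marginal probability
    approx = expit(shrink * eta)  # closed-form approximation
    print(f"eta={eta:+.0f}  MC={mc:.3f}  approx={approx:.3f}")
```

The approximation is accurate to a couple of probability points for moderate σ_b, which is why the shrink factor 1/√(1 + c²σ_b²) is a handy rule of thumb for the amount of attenuation to expect.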
Link scale invariance (MAR.7)¶
The attenuation effect only occurs on the response scale. On the link scale (log-odds for logistic, log for Poisson), marginal and conditional effects are identical -- just as they are in linear models. The discrepancy is entirely introduced by the nonlinear transformation from the link scale to the response scale.
Property MAR.7: On the link scale, marginal and conditional effects are the same -- the attenuation only occurs when transforming to the response scale.
# On log-odds scale: marginal = conditional
print("Log-odds scale (LINEAR):")
print(f" Conditional slope: {beta_cond:.4f}")
print(f" Marginal slope: {beta_cond:.4f} (same!)")
print(f"\nProbability scale (NONLINEAR):")
print(f" Conditional slope: {cond_slope_prob:.4f}")
print(f" Marginal slope: {marginal_slope_prob:.4f} (attenuated!)")
Log-odds scale (LINEAR):
Conditional slope: 0.5000
Marginal slope: 0.5000 (same!)
Probability scale (NONLINEAR):
Conditional slope: 0.0981
Marginal slope: 0.0796 (attenuated!)
Practical implications¶
The table below summarizes when marginal and conditional effects agree and when they diverge. The key factor is the link function: identity links preserve the equivalence, while nonlinear links (logit, log) introduce attenuation.
print("When are marginal and conditional effects the same?")
print("=" * 55)
print(f"{'Model':>20} {'Link scale':>12} {'Response scale':>15}")
print(f"{'LMM (identity)':>20} {'Same':>12} {'Same':>15}")
print(f"{'GLMM (logit)':>20} {'Same':>12} {'Different':>15}")
print(f"{'GLMM (log)':>20} {'Same':>12} {'Different':>15}")
print(f"\n→ Attenuation increases with random effect variance")
print(f"→ For LMMs, this distinction doesn't matter")
print(f"→ For GLMMs, always specify which you're reporting")
When are marginal and conditional effects the same?
=======================================================
Model Link scale Response scale
LMM (identity) Same Same
GLMM (logit) Same Different
GLMM (log) Same Different
→ Attenuation increases with random effect variance
→ For LMMs, this distinction doesn't matter
→ For GLMMs, always specify which you're reporting
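One nuance the table hides: for the log link with normal random intercepts the integral is exact, E[exp(η + b)] = exp(η) · exp(σ_b²/2), so the marginal mean is the conditional mean inflated by a constant factor. Because the factor does not depend on x, multiplicative effects (rate ratios) are identical marginally and conditionally, even though response-scale slopes still differ. A sketch with made-up coefficients (β₀, β₁, and σ_b below are illustrative choices, not from the models above):

```python
import numpy as np

rng = np.random.default_rng(2)
sigma_b = 1.0
b = rng.normal(0, sigma_b, 500_000)
beta0, beta1 = 0.5, 0.3  # made-up fixed effects for illustration

for x_val in [0.0, 1.0]:
    eta = beta0 + beta1 * x_val
    conditional = np.exp(eta)          # rate for the "typical" group (b = 0)
    marginal = np.exp(eta + b).mean()  # population-average rate
    print(f"x={x_val:.0f}  conditional={conditional:.3f}  marginal={marginal:.3f}  "
          f"inflation={marginal / conditional:.3f}")

# The inflation factor exp(sigma_b**2 / 2) ≈ 1.649 is the same at every x,
# so the marginal rate ratio equals the conditional one, exp(beta1).
print(f"marginal rate ratio: {np.exp(beta0 + beta1 + b).mean() / np.exp(beta0 + b).mean():.4f}")
print(f"exp(beta1):          {np.exp(beta1):.4f}")
```

This is why Poisson GLMM coefficients exponentiate to the same rate ratio under either interpretation, while logistic GLMM coefficients do not exponentiate to the marginal odds ratio.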
Summary¶
| Property | Statement | When it matters |
|---|---|---|
| MAR.1 | Intercepts cancel in slopes | LMM: marginal = conditional slopes |
| MAR.2 | Conditional = Fixed + Random | Interpreting group-specific predictions |
| MAR.3 | Balanced design equivalence | Marginal mean = average of group means |
| MAR.4 | Link scale additivity | Random effects additive on link scale |
| MAR.5 | Jensen’s inequality | Marginal ≠ conditional on response scale |
| MAR.6 | Attenuation effect | Marginal slopes < conditional slopes |
| MAR.7 | Link scale invariance | No attenuation on link scale |