Marginal vs Conditional Effects

UCSD Psychology

In mixed models, there are two fundamentally different ways to interpret the effects of predictors. Conditional effects describe what happens for a specific group -- a particular classroom, patient, or experimental subject. They answer: “given that I know which group this observation belongs to, what’s the expected outcome?” Marginal effects describe what happens on average across all groups, integrating over the distribution of random effects. They answer: “for a randomly chosen observation from the population, what’s the expected outcome?”

For linear mixed models, the distinction is often academic -- marginal and conditional slopes are identical because the identity link function preserves additivity. But for generalized linear mixed models (GLMMs), the two can differ dramatically. The nonlinear link function (logit, log, etc.) means that averaging predictions across groups is not the same as predicting at the average random effect. Understanding this distinction is crucial for interpreting mixed model results correctly, and for knowing what your software is actually reporting.


Linear mixed models: the easy case

Intercepts cancel in slopes (MAR.1)

In a random-intercept LMM, each group has its own intercept but the same slope. When computing the effect of a predictor (the change in $y$ per unit change in $x$), the group intercepts cancel out. This means the conditional slope (within any specific group) and the marginal slope (averaged across groups) are the same.

import numpy as np

n_groups = 5
n_per = 30
n = n_groups * n_per
group = np.repeat(np.arange(n_groups), n_per)

# Random intercepts, common slope
group_intercepts = np.array([60, 65, 70, 75, 80])
true_slope = 2.0
x = np.random.normal(0, 1, n)
y = group_intercepts[group] + true_slope * x + np.random.normal(0, 1, n)

# Conditional prediction: y_i = (intercept_g + slope * x)
# Marginal prediction:   E[y] = E[intercept_g] + slope * x = grand_mean + slope * x
# The SLOPE is the same either way!

for g in range(n_groups):
    mask = group == g
    # Group-specific OLS slope
    x_g = x[mask]
    y_g = y[mask]
    slope_g = np.cov(x_g, y_g)[0, 1] / np.var(x_g, ddof=1)  # ddof=1 to match np.cov's default
    print(f"Group {g} slope: {slope_g:.3f}")
print(f"\nTrue slope:     {true_slope:.3f}")
print("→ Marginal slope = Conditional slope (intercepts cancel)")
Group 0 slope: 2.221
Group 1 slope: 2.035
Group 2 slope: 1.723
Group 3 slope: 2.075
Group 4 slope: 1.926

True slope:     2.000
→ Marginal slope = Conditional slope (intercepts cancel)

Property MAR.1: In a random-intercept LMM, marginal and conditional slopes are identical because group intercepts cancel when computing derivatives.

Why do the intercepts cancel?

For a random-intercept model, $E[y \mid \text{group}=g, x] = \alpha_g + \beta x$. The slope $\partial E[y] / \partial x = \beta$ regardless of $g$. Marginalizing over groups: $E[y \mid x] = E[\alpha_g] + \beta x$, so the marginal slope is also $\beta$. The intercepts shift each group's line up or down, but the slope -- the rate of change with respect to $x$ -- is shared across all groups and survives marginalization.
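This argument can be checked numerically: build the five conditional lines (using the same intercepts and slope as the simulation above), average them into the marginal line, and confirm that every slope is the same $\beta$.

```python
import numpy as np

# Group intercepts and shared slope from the simulation above
alphas = np.array([60.0, 65.0, 70.0, 75.0, 80.0])
beta = 2.0
x = np.linspace(-2, 2, 9)

# Conditional lines: one per group; marginal line: average over groups
cond_lines = alphas[:, None] + beta * x      # shape (5, 9)
marg_line = cond_lines.mean(axis=0)          # = mean(alphas) + beta * x

# Slopes via finite differences: all equal beta
cond_slopes = np.diff(cond_lines, axis=1) / np.diff(x)
marg_slope = np.diff(marg_line) / np.diff(x)
print(np.allclose(cond_slopes, beta), np.allclose(marg_slope, beta))  # → True True
```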


Conditional = Fixed + Random (MAR.2)

The conditional mean for a specific group combines two components: the population-level prediction (fixed effects) and the group-specific deviation (random effects). This decomposition is what makes mixed models interpretable -- you can see both the “typical” prediction and how each group deviates from it.

Property MAR.2: The conditional mean for group $g$ is: $E[y \mid b_g] = \mathbf{X}\boldsymbol{\beta} + \mathbf{Z} b_g$

# Population prediction (fixed effects only)
grand_intercept = np.mean(group_intercepts)
y_pop = grand_intercept + true_slope * np.linspace(-2, 2, 5)

# Group-specific prediction (fixed + random)
print(f"Population intercept: {grand_intercept}")
print(f"\nConditional means = Fixed + Random:")
for g in range(n_groups):
    offset = group_intercepts[g] - grand_intercept
    print(f"  Group {g}: {grand_intercept:.0f} + {offset:+.0f} = {group_intercepts[g]:.0f}")
Population intercept: 70.0

Conditional means = Fixed + Random:
  Group 0: 70 + -10 = 60
  Group 1: 70 + -5 = 65
  Group 2: 70 + +0 = 70
  Group 3: 70 + +5 = 75
  Group 4: 70 + +10 = 80

Balanced design equivalence (MAR.3)

With balanced designs (equal group sizes), the marginal mean equals the simple average of group means. This is the simplest case: no group is over- or under-represented, so the population average is just the arithmetic mean of the group-level predictions.

Property MAR.3: With balanced designs (equal group sizes), the marginal mean equals the simple average of group means.

# Balanced design: each group has exactly n_per observations
group_means = np.array([y[group == g].mean() for g in range(n_groups)])
marginal_mean = y.mean()
avg_of_group_means = group_means.mean()

print("Group means:")
for g in range(n_groups):
    print(f"  Group {g}: {group_means[g]:.2f}")
print(f"\nMarginal mean (overall):         {marginal_mean:.2f}")
print(f"Average of group means:          {avg_of_group_means:.2f}")
print(f"Difference:                      {abs(marginal_mean - avg_of_group_means):.4f}")
print("\n→ With balanced groups, these are identical")
Group means:
  Group 0: 59.90
  Group 1: 64.82
  Group 2: 70.07
  Group 3: 74.82
  Group 4: 79.92

Marginal mean (overall):         69.91
Average of group means:          69.91
Difference:                      0.0000

→ With balanced groups, these are identical

GLMMs: where things get interesting

In GLMMs, the random effects are additive on the link scale (e.g., log-odds for logistic regression), not on the response scale (probabilities). This is a key distinction: a shift of +1 on the log-odds scale produces a different change in probability depending on where you start. Near $p = 0.5$, a log-odds shift has a large effect on probability; near $p = 0$ or $p = 1$, the same shift has a much smaller effect.
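A quick illustration of this point: apply the same +1 log-odds shift at three different baseline probabilities and compare the resulting changes (using `scipy.special.logit` and `expit`).

```python
from scipy.special import expit, logit

# The same +1 shift on the log-odds scale changes the probability by
# different amounts depending on the starting probability
for p0 in [0.05, 0.50, 0.95]:
    p1 = expit(logit(p0) + 1.0)
    print(f"p = {p0:.2f} → {p1:.3f}  (change: {p1 - p0:+.3f})")
```

The change is largest at $p = 0.5$ and shrinks toward the boundaries.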

Property MAR.4: On the link scale, $\eta_g = \mathbf{X}\boldsymbol{\beta} + b_g$ (additive). On the response scale, $\mu_g = g^{-1}(\mathbf{X}\boldsymbol{\beta} + b_g)$ (nonlinear).

from scipy.special import expit

# Logistic GLMM: log-odds are additive
beta_fixed = np.array([-1.0, 0.5])  # intercept, slope

# 5 groups with varying intercepts (on log-odds scale)
random_intercepts = np.array([-1.5, -0.5, 0.0, 0.5, 1.5])

x_grid = np.linspace(-3, 3, 7)
print(f"{'x':>4}  {'Pop prob':>9}  " + "  ".join(f"Group {g}" for g in range(5)))
for x_val in x_grid:
    eta_pop = beta_fixed[0] + beta_fixed[1] * x_val
    p_pop = expit(eta_pop)
    group_probs = [expit(eta_pop + b) for b in random_intercepts]
    pop_str = f"{p_pop:.3f}"
    grp_str = "  ".join(f"{p:.3f}" for p in group_probs)
    print(f"{x_val:>4.0f}  {pop_str:>9}  {grp_str}")
   x   Pop prob  Group 0  Group 1  Group 2  Group 3  Group 4
  -3      0.076  0.018  0.047  0.076  0.119  0.269
  -2      0.119  0.029  0.076  0.119  0.182  0.378
  -1      0.182  0.047  0.119  0.182  0.269  0.500
   0      0.269  0.076  0.182  0.269  0.378  0.622
   1      0.378  0.119  0.269  0.378  0.500  0.731
   2      0.500  0.182  0.378  0.500  0.622  0.818
   3      0.622  0.269  0.500  0.622  0.731  0.881

Jensen’s inequality: the core issue (MAR.5)

The problem

For GLMMs, the marginal mean $E[\mu] = E[g^{-1}(\mathbf{X}\boldsymbol{\beta} + b)]$ is NOT the same as $g^{-1}(\mathbf{X}\boldsymbol{\beta} + E[b]) = g^{-1}(\mathbf{X}\boldsymbol{\beta})$.

This is Jensen's inequality: for a concave function $f$, $E[f(x)] \leq f(E[x])$ (with the inequality reversed for a convex $f$). The logistic function is concave above its midpoint and convex below it, so averaging the transformed values is not the same as transforming the average.

Property MAR.5: Due to Jensen’s inequality, the marginal probability is NOT simply the inverse-link applied to the fixed effects alone.

# Jensen's inequality in action
# E[expit(eta + b)] ≠ expit(eta + E[b]) = expit(eta) in general

eta = 0.0  # population log-odds = 0 → prob = 0.5
sigma_b = 1.5  # random intercept SD

# Monte Carlo: average expit over random intercepts
b_samples = np.random.normal(0, sigma_b, 10000)
marginal_prob = np.mean(expit(eta + b_samples))
conditional_prob = expit(eta)  # at E[b] = 0

print(f"Conditional prob (at b=0):   {conditional_prob:.4f}")
print(f"Marginal prob (averaged):    {marginal_prob:.4f}")
print(f"Difference:                  {marginal_prob - conditional_prob:.4f}")
print("\n→ At eta = 0 the gap is essentially zero: expit is antisymmetric")
print("  around 0.5, so the Jensen gap vanishes there and grows as eta moves away")
Conditional prob (at b=0):   0.5000
Marginal prob (averaged):    0.4993
Difference:                  -0.0007

→ At eta = 0 the gap is essentially zero: expit is antisymmetric
  around 0.5, so the Jensen gap vanishes there and grows as eta moves away
Why does averaging pull the probability toward 0.5?

The logistic function is concave above $p = 0.5$ and convex below. Averaging over random intercepts therefore pulls the marginal probability toward 0.5: when the conditional probability at $b = 0$ is above 0.5 the marginal probability is lower, and when it is below 0.5 the marginal probability is higher. Intuitively, the logistic curve flattens near 0 and 1 -- groups with extreme random effects are "squashed" toward the boundaries, while groups near the center move more freely. The net effect is that the population-average probability is less extreme than the conditional probability evaluated at the mean random effect ($b = 0$). The larger the random effect variance $\sigma_b^2$, the stronger this shrinkage. (At exactly $p = 0.5$ the pulls from either side cancel by symmetry, so the gap there is essentially zero.)
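To see the gap appear once the conditional probability moves away from 0.5, here is a small Monte Carlo check at a conditional log-odds of 1 (same $\sigma_b = 1.5$ as above; the seed and sample size are arbitrary choices):

```python
import numpy as np
from scipy.special import expit

rng = np.random.default_rng(0)
eta = 1.0          # conditional log-odds away from the symmetric point
sigma_b = 1.5      # same random-intercept SD as above
b = rng.normal(0, sigma_b, 100_000)

p_cond = expit(eta)              # conditional prob at b = 0 (≈ 0.731)
p_marg = expit(eta + b).mean()   # marginal prob, shrunk toward 0.5

print(f"Conditional: {p_cond:.3f}")
print(f"Marginal:    {p_marg:.3f}  (pulled toward 0.5)")
```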


The attenuation effect (MAR.6)

The problem

Marginal slopes on the response scale are smaller in magnitude than conditional slopes in GLMMs. This is a direct consequence of Jensen's inequality applied to the derivative of the inverse-link function. Random effects add variability on the link scale, and when that variability is transformed through the nonlinear inverse link, the resulting average curve is flatter than any single group's curve.

Property MAR.6: The marginal slope in a GLMM is attenuated relative to the conditional slope: $\beta_{\text{marginal}} < \beta_{\text{conditional}}$.

# Attenuation: marginal slope < conditional slope
beta_cond = 0.5  # conditional (within-group) slope

# Marginal slope via Monte Carlo
x_lo, x_hi = -0.5, 0.5  # small step
b_samples = np.random.normal(0, sigma_b, 10000)

p_lo = np.mean(expit(beta_fixed[0] + beta_cond * x_lo + b_samples))
p_hi = np.mean(expit(beta_fixed[0] + beta_cond * x_hi + b_samples))
marginal_slope_prob = (p_hi - p_lo) / (x_hi - x_lo)

# Conditional slope at b=0
p_lo_cond = expit(beta_fixed[0] + beta_cond * x_lo)
p_hi_cond = expit(beta_fixed[0] + beta_cond * x_hi)
cond_slope_prob = (p_hi_cond - p_lo_cond) / (x_hi - x_lo)

print(f"Conditional slope (prob scale): {cond_slope_prob:.4f}")
print(f"Marginal slope (prob scale):    {marginal_slope_prob:.4f}")
print(f"Attenuation ratio:              {marginal_slope_prob / cond_slope_prob:.3f}")
print(f"\n→ Marginal slope is ~{(1 - marginal_slope_prob/cond_slope_prob)*100:.0f}% smaller")
Conditional slope (prob scale): 0.0981
Marginal slope (prob scale):    0.0796
Attenuation ratio:              0.812

→ Marginal slope is ~19% smaller
Why are marginal slopes always smaller?

Random intercepts add “noise” to the link-scale prediction. When transformed through the nonlinear link function, this noise averages out to flatter curves. Think of it this way: some groups have their logistic curve shifted left, others shifted right. When you average all these shifted S-curves together, you get a shallower S-curve. More group variation (larger $\sigma_b$) means more attenuation. This is why you should always report which type of effect you’re interpreting in GLMM results -- conditional and marginal slopes can differ substantially.
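The dependence on the random-effect SD can be checked directly by recomputing the attenuation ratio at several values of $\sigma_b$ (a sketch reusing the fixed-effect values from the earlier cells; the $\sigma$ grid and seed are arbitrary choices):

```python
import numpy as np
from scipy.special import expit

rng = np.random.default_rng(42)
intercept, beta_cond = -1.0, 0.5   # same fixed effects as above
x_lo, x_hi = -0.5, 0.5

ratios = []
for sigma in [0.5, 1.0, 2.0]:
    b = rng.normal(0, sigma, 200_000)
    # Marginal slope: average probabilities over random intercepts, then difference
    p_lo = expit(intercept + beta_cond * x_lo + b).mean()
    p_hi = expit(intercept + beta_cond * x_hi + b).mean()
    marginal = (p_hi - p_lo) / (x_hi - x_lo)
    # Conditional slope evaluated at b = 0
    conditional = (expit(intercept + beta_cond * x_hi)
                   - expit(intercept + beta_cond * x_lo)) / (x_hi - x_lo)
    ratios.append(marginal / conditional)
    print(f"sigma_b = {sigma}: attenuation ratio = {ratios[-1]:.3f}")
```

The ratio falls further below 1 as $\sigma_b$ grows, matching the claim above.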


The attenuation effect only occurs on the response scale. On the link scale (log-odds for logistic, log for Poisson), marginal and conditional effects are identical -- just as they are in linear models. The discrepancy is entirely introduced by the nonlinear transformation from the link scale to the response scale.

Property MAR.7: On the link scale, marginal and conditional effects are the same -- the attenuation only occurs when transforming to the response scale.

# On log-odds scale: marginal = conditional
print("Log-odds scale (LINEAR):")
print(f"  Conditional slope: {beta_cond:.4f}")
print(f"  Marginal slope:    {beta_cond:.4f}  (same!)")

print(f"\nProbability scale (NONLINEAR):")
print(f"  Conditional slope: {cond_slope_prob:.4f}")
print(f"  Marginal slope:    {marginal_slope_prob:.4f}  (attenuated!)")
Log-odds scale (LINEAR):
  Conditional slope: 0.5000
  Marginal slope:    0.5000  (same!)

Probability scale (NONLINEAR):
  Conditional slope: 0.0981
  Marginal slope:    0.0796  (attenuated!)

Practical implications

The table below summarizes when marginal and conditional effects agree and when they diverge. The key factor is the link function: identity links preserve the equivalence, while nonlinear links (logit, log) introduce attenuation.

print("When are marginal and conditional effects the same?")
print("=" * 55)
print(f"{'Model':>20}  {'Link scale':>12}  {'Response scale':>15}")
print(f"{'LMM (identity)':>20}  {'Same':>12}  {'Same':>15}")
print(f"{'GLMM (logit)':>20}  {'Same':>12}  {'Different':>15}")
print(f"{'GLMM (log)':>20}  {'Same':>12}  {'Different':>15}")
print(f"\n→ Attenuation increases with random effect variance")
print(f"→ For LMMs, this distinction doesn't matter")
print(f"→ For GLMMs, always specify which you're reporting")
When are marginal and conditional effects the same?
=======================================================
               Model    Link scale   Response scale
      LMM (identity)          Same             Same
        GLMM (logit)          Same        Different
          GLMM (log)          Same        Different

→ Attenuation increases with random effect variance
→ For LMMs, this distinction doesn't matter
→ For GLMMs, always specify which you're reporting

Summary

| Property | Statement | When it matters |
| --- | --- | --- |
| MAR.1 | Intercepts cancel in slopes | LMM: marginal = conditional slopes |
| MAR.2 | Conditional = Fixed + Random | Interpreting group-specific predictions |
| MAR.3 | Balanced design equivalence | Marginal mean = average of group means |
| MAR.4 | Link scale additivity | Random effects additive on link scale |
| MAR.5 | Jensen’s inequality | Marginal ≠ conditional on response scale |
| MAR.6 | Attenuation effect | Marginal slopes < conditional slopes |
| MAR.7 | Link scale invariance | No attenuation on link scale |
