
Why bossanova?

A Library for Statistical Thinking

University of California San Diego

Confusion arises not from a disagreement about core beliefs, but from the absence of a shared language for discussing tools.

Motivation

Bossa Nova emerged in 1950s Rio de Janeiro as a fusion of samba and jazz—a synthesis of Brazilian and North American musical traditions that became something entirely its own. When critics dismissed it as Americanized imitation, composer Antônio Carlos Jobim countered that this synthesis had become “the biggest influence on American music in the last thirty years.”

In a similar spirit, bossanova aims to bridge two traditionally distinct statistical cultures (Breiman, 2001; Shmueli, 2010):

  1. Explanation modeling culture assumes data comes from a stochastic model. The goal is to estimate parameters and test hypotheses. “Is the effect significant?”

  2. Prediction modeling culture treats the data-generating mechanism as unknown. The goal is prediction. “How accurately can we forecast new observations?”

Most experimental social science training grew from the first culture, while much modern machine learning grew from the second (Yarkoni & Westfall, 2017). And the tools we use reflect this divide: R grew from statisticians asking “what are the effects?”; scikit-learn grew from computer scientists asking “what’s the test accuracy?”

But the division is artificial. Researchers need both. A neuroscientist fitting a mixed-effects model wants to know whether an effect is “real” and whether it generalizes beyond the current sample. A data scientist building a classifier wants predictive accuracy and some understanding of why the model works. The practical consequence is tooling fragmentation. A typical analysis might require (in Python):
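For instance, even a toy analysis that asks both cultures’ questions about one relationship already straddles two libraries. This is an illustrative sketch on synthetic data; the package mix shown (statsmodels plus scikit-learn) is one plausible combination, not a prescribed stack:

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf              # explanation: formulas, p-values
from sklearn.linear_model import LinearRegression  # prediction: estimators, scoring
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
df = pd.DataFrame({"x": rng.normal(size=200)})
df["y"] = 2 * df["x"] + rng.normal(size=200)

# Explanation culture: fit via a formula string, inspect a p-value
p_value = smf.ols("y ~ x", data=df).fit().pvalues["x"]

# Prediction culture: fit via X/y arrays, score on held-out folds
cv_r2 = cross_val_score(LinearRegression(), df[["x"]], df["y"], cv=5).mean()
```

The friction is visible even here: the explanation step speaks in formula strings and DataFrames, the prediction step in array-shaped X and y, each with its own defaults and return types.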

Each package has its own conventions, data structures, and mental models, adding cognitive overhead and distracting from the central question: what is my model telling me?

A Cultural Bridge

After fitting any model, you can ask three fundamentally different questions:

| Question | Culture | What You Learn |
| --- | --- | --- |
| “Is β ≠ 0?” | Explanation | Statistical significance of predictors |
| “Does this generalize?” | Prediction | Out-of-sample accuracy |
| “What if X changed?” | Both | Expected outcomes under counterfactuals |

The third question—“What would happen if...?”—is where the cultures meet. This is the domain of marginal effects (Searle et al., 1980; Lenth, 2016; Arel-Bundock et al., 2024): the model’s implications under controlled changes.

“What if?” questions have deep roots. Judea Pearl’s Ladder of Causation (Pearl & Mackenzie, 2018) distinguishes three levels of reasoning:

  1. Association: “What do I expect to see?” — Observational data, correlations

  2. Intervention: “What if I do X?” — The do(X=x) operator (Pearl, 1995; Pearl, 2009)

  3. Counterfactuals: “What if I had done X differently?” — Individual-level potential outcomes (Rubin, 1974)

Traditional regression lives on Rung 1—it describes patterns in data. Causal inference (Hernán & Robins, 2020) asks how to climb higher: when can we interpret associations as interventions?

Marginal effects occupy a productive middle ground. Computationally, they answer: “What does the model predict under counterfactual covariate values?” This resembles interventional reasoning but remains a Rung 1 calculation—a summary of the fitted model, not a causal claim. Whether the resemblance is causally meaningful depends entirely on assumptions the model cannot verify: no unmeasured confounding, correct functional form, no selection bias. These require domain knowledge to evaluate (Shmueli, 2010).

The value of marginal effects isn’t that they automatically provide causal answers. It’s that they phrase questions in substantively meaningful terms. Instead of “the coefficient is 0.3,” you ask “what’s the predicted difference between treatment and control, averaging over the covariate distribution?” or “what would happen if the treatment were twice as strong?” This reframing connects statistical output to the counterfactual questions researchers actually care about (Harrell, 2015).
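To make this concrete, here is a minimal sketch of the “average predicted difference” question (synthetic data and a statsmodels logistic model; the variable names are invented for illustration): evaluate the fitted model on two counterfactual copies of the data and average.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(1)
n = 1000
df = pd.DataFrame({
    "treatment": rng.integers(0, 2, size=n),
    "age": rng.normal(40, 10, size=n),
})
# Simulate a binary outcome with a true treatment effect on the logit scale
logit = 0.8 * df["treatment"] + 0.05 * (df["age"] - 40)
df["y"] = (rng.random(n) < 1 / (1 + np.exp(-logit))).astype(int)

fit = smf.logit("y ~ treatment + age", data=df).fit(disp=0)

# Counterfactual question: predicted outcome if everyone were treated
# vs. untreated, averaging over the observed covariate distribution
p_treated = fit.predict(df.assign(treatment=1))
p_control = fit.predict(df.assign(treatment=0))
avg_difference = (p_treated - p_control).mean()
```

Nothing here is causal by itself; it is two calls to predict on modified copies of the data. But the answer arrives on the outcome scale, as a predicted difference in probability, rather than as a log-odds coefficient.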

A Bicycle for the Mind

Richard McElreath describes statistical models as golems—powerful constructs that do exactly what you tell them, whether or not that’s what you actually wanted (McElreath, 2020). A golem has no intent; it follows instructions. If your instructions are wrong, the golem will faithfully execute them. But as engineers we can help users wrangle their golem a bit more easily.

Take the Wilkinson formula syntax familiar to R users (Wilkinson & Rogers, 1973; Chambers & Hastie, 1992): y ~ x1 * x2. Simple enough for beginners to read, its real strength lies in declaratively specifying the structure of a model, separate from the how of linear algebra and statistical inference. bossanova extends this idea with a handful of grammar-like methods (Wickham, 2010) that span the full combinatorial space of ways you might use a model as a tool:

  1. Formulas: declare what your model is

  2. Methods: declare what question you want to ask

  3. Inference: declare how you want to answer it
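As a concrete illustration of the first of these (using statsmodels, whose formula interface implements the same Wilkinson notation; bossanova’s own API is not shown here), the formula declares structure while the fitting machinery stays out of sight:

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(0)
df = pd.DataFrame({"x1": rng.normal(size=100), "x2": rng.normal(size=100)})
df["y"] = (1 + 2 * df["x1"] - df["x2"]
           + 0.5 * df["x1"] * df["x2"] + rng.normal(size=100))

# "x1 * x2" expands to x1 + x2 + x1:x2 (main effects plus interaction)
fit = smf.ols("y ~ x1 * x2", data=df).fit()
terms = list(fit.params.index)
```

The declared structure is recoverable from the fitted object: the design expands to an intercept, both main effects, and the interaction term, without the user ever touching a design matrix.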

The goal isn’t to hide complexity behind “smart” defaults that guess your intent, but to make the golem’s behavior legible, helping you interpret what you’re asking for. Statistical jargon creates unnecessary confusion. The terms “fixed” and “random” effects have at least five conflicting definitions across statistics, econometrics, and machine learning (Gelman, 2005; Gelman & Hill, 2007). bossanova follows Gelman’s recommendation: we use varying for effects that differ across groups, avoiding terminology that means different things to different communities.

Conclusion

Bossa Nova roughly means “new trend” or “new wave” in Portuguese. The musical genre emerged from a desire to create something fresh by synthesizing existing traditions—not rejecting them, but building on them with a different sensibility. bossanova aspires to the same: a synthesis of statistical cultures, explanation and prediction, that becomes a tool for exploration.

References
  1. Breiman, L. (2001). Statistical Modeling: The Two Cultures. Statistical Science, 16(3), 199–231. 10.1214/ss/1009213726
  2. Shmueli, G. (2010). To Explain or to Predict? Statistical Science, 25(3), 289–310. 10.1214/10-STS330
  3. Yarkoni, T., & Westfall, J. (2017). Choosing Prediction Over Explanation in Psychology: Lessons From Machine Learning. Perspectives on Psychological Science, 12(6), 1100–1122. 10.1177/1745691617693393
  4. Jolly, E. (2018). Pymer4: Connecting R and Python for Linear Mixed Modeling. Journal of Open Source Software, 3(31), 862. 10.21105/joss.00862
  5. Searle, S. R., Speed, F. M., & Milliken, G. A. (1980). Population Marginal Means in the Linear Model: An Alternative to Least Squares Means. The American Statistician, 34(4), 216–221. 10.1080/00031305.1980.10483031
  6. Lenth, R. V. (2016). Least-Squares Means: The R Package lsmeans. Journal of Statistical Software, 69(1), 1–33. 10.18637/jss.v069.i01
  7. Arel-Bundock, V., Greifer, N., & Heiss, A. (2024). How to Interpret Statistical Models Using marginaleffects for R and Python. Journal of Statistical Software, 111(9), 1–32. 10.18637/jss.v111.i09
  8. Pearl, J., & Mackenzie, D. (2018). The Book of Why: The New Science of Cause and Effect. Basic Books.
  9. Pearl, J. (1995). Causal Diagrams for Empirical Research. Biometrika, 82(4), 669–688. 10.1093/biomet/82.4.669
  10. Pearl, J. (2009). Causality: Models, Reasoning, and Inference (2nd ed.). Cambridge University Press.
  11. Rubin, D. B. (1974). Estimating Causal Effects of Treatments in Randomized and Nonrandomized Studies. Journal of Educational Psychology, 66(5), 688–701. 10.1037/h0037350
  12. Hernán, M. A., & Robins, J. M. (2020). Causal Inference: What If. Chapman & Hall/CRC. https://www.hsph.harvard.edu/miguel-hernan/causal-inference-book/
  13. Harrell, F. E. (2015). Regression Modeling Strategies: With Applications to Linear Models, Logistic and Ordinal Regression, and Survival Analysis (2nd ed.). Springer. 10.1007/978-3-319-19425-7
  14. McElreath, R. (2020). Statistical Rethinking: A Bayesian Course with Examples in R and Stan (2nd ed.). CRC Press.
  15. Wilkinson, G. N., & Rogers, C. E. (1973). Symbolic Description of Factorial Models for Analysis of Variance. Journal of the Royal Statistical Society, Series C (Applied Statistics), 22(3), 392–399. 10.2307/2346786