Containers Reference

This page lists all container classes, builder functions, validators, and schema constants from `bossanova.internal.containers`. See Container Overview for the pipeline diagram.

Container Classes

`DataBundle`

Validated model data (valid observations only)

Field	Type
`X`	`NDArray[np.floating]`
`X_names`	`tuple[str, ...]`
`Z`	`sp.csc_matrix \| None`
`contrast_types`	`dict[str, str]`
`factor_levels`	`dict[str, tuple[str, ...]]`
`has_random_effects`	`bool`
`n`	`int`
`n_total`	`int`
`offset`	`NDArray[np.floating] \| None`
`p`	`int`
`rank`	`int`
`rank_info`	`RankInfo \| None`
`re_metadata`	`REInfo \| None`
`response_levels`	`tuple[str, ...] \| None`
`valid_mask`	`NDArray[np.bool_]`
`weights`	`NDArray[np.floating] \| None`
`y`	`NDArray[np.floating]`
`y_name`	`str`
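
The table does not spell out how `n`, `n_total`, and `valid_mask` relate. One plausible reading, offered here as an assumption, is that `valid_mask` flags the rows that survived missing-value filtering, `n` counts them, and `n_total` counts all input rows. The sketch below uses a hypothetical `BundleSketch` stand-in with plain tuples instead of NumPy arrays; it is not the library class.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class BundleSketch:
    """Hypothetical stand-in for a few DataBundle fields (not the library class)."""

    y: tuple[float, ...]
    valid_mask: tuple[bool, ...]
    n: int
    n_total: int


def bundle_from_rows(y_raw: list[float | None]) -> BundleSketch:
    # Assumption: a row counts as valid when its response is not missing.
    mask = tuple(value is not None for value in y_raw)
    y = tuple(value for value in y_raw if value is not None)
    return BundleSketch(y=y, valid_mask=mask, n=len(y), n_total=len(y_raw))


bundle = bundle_from_rows([1.2, None, 3.4, 0.7])
```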

`REInfo`

Random effects metadata

Field	Type
`X_re`	`NDArray[np.float64] \| list[NDArray[np.float64]] \| None`
`group_ids_list`	`list[NDArray[np.intp]]`
`group_indices`	`dict[str, NDArray[np.intp]]`
`grouping_vars`	`tuple[str, ...]`
`metadata`	`dict`
`n_groups`	`dict[str, int]`
`n_groups_list`	`list[int]`
`random_names`	`list[str]`
`re_structure`	`str`
`term_names`	`tuple[str, ...]`

`RankInfo`

Rank deficiency information for a design matrix

Field	Type
`dropped_indices`	`tuple[int, ...]`
`dropped_names`	`tuple[str, ...]`
`is_deficient`	`bool`
`kept_indices`	`NDArray[np.intp]`
`p`	`int`
`rank`	`int`
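
To make the fields above concrete, here is a minimal sketch of how they might fit together once the redundant columns are known. It assumes that `p` is the column count before dropping and that `rank` equals the number of kept columns; `RankInfoSketch` and `rank_info_from_dropped` are hypothetical names, and the actual rank detection (for example a pivoted QR) is not shown.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class RankInfoSketch:
    """Hypothetical stand-in mirroring the RankInfo fields."""

    p: int
    rank: int
    kept_indices: tuple[int, ...]
    dropped_indices: tuple[int, ...]
    dropped_names: tuple[str, ...]

    @property
    def is_deficient(self) -> bool:
        # Deficient when fewer independent columns than total columns.
        return self.rank < self.p


def rank_info_from_dropped(names: tuple[str, ...], dropped: list[int]) -> RankInfoSketch:
    dropped_sorted = tuple(sorted(set(dropped)))
    kept = tuple(i for i in range(len(names)) if i not in dropped_sorted)
    return RankInfoSketch(
        p=len(names),
        rank=len(kept),
        kept_indices=kept,
        dropped_indices=dropped_sorted,
        dropped_names=tuple(names[i] for i in dropped_sorted),
    )


info = rank_info_from_dropped(("Intercept", "x1", "x2", "x1_copy"), dropped=[3])
```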

`MathDisplay`

Display wrapper for model equation with IPython rich display

Field	Type
`equation`	`str`
`explanations`	`tuple[str, ...]`
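
IPython and Jupyter render an object as LaTeX when it defines a `_repr_latex_` method. The sketch below shows that hook on a hypothetical `MathDisplaySketch` with the same two fields; whether the library class uses this hook or another rich-display method is an assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MathDisplaySketch:
    """Hypothetical stand-in with the same two fields as MathDisplay."""

    equation: str
    explanations: tuple = ()

    def _repr_latex_(self) -> str:
        # IPython/Jupyter call this hook and typeset the returned LaTeX.
        return f"$${self.equation}$$"

    def __str__(self) -> str:
        # Plain-text fallback: the equation followed by one explanation per line.
        return "\n".join([self.equation, *self.explanations])


display_obj = MathDisplaySketch(
    equation=r"y_i = \beta_0 + \beta_1 x_i + \epsilon_i",
    explanations=("beta_0: intercept", "beta_1: slope for x"),
)
```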

`TermInfo`

Parsed information about a single model term

Field	Type
`base_var`	`str`
`explanation`	`str`
`latex`	`str`
`name`	`str`
`symbol`	`str`
`term_type`	`str`

`Condition`

A conditioning specification in an explore formula

Field	Type
`at_quantile`	`int \| None`
`at_range`	`int \| None`
`at_values`	`tuple \| None`
`contrast_expr`	`ContrastExpr \| None`
`var`	`str`

`ContrastExpr`

Bracket contrast expression: `Drug[A - B, C - D]`

Field	Type
`items`	`tuple[ContrastItem, ...]`
`var`	`str`

`ContrastItem`

A single contrast: left operand minus right operand

Field	Type
`left`	`ContrastOperand`
`right`	`ContrastOperand`

`ContrastOperand`

One side of a bracket contrast item

Field	Type
`is_wildcard`	`bool`
`levels`	`tuple[str, ...]`
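
The three contrast containers nest as expression, items, operands. As a rough illustration of that nesting, the sketch below parses the `Drug[A - B, C - D]` example into hypothetical stand-in dataclasses. It is not the library's parser: it handles only single-level operands, never sets `is_wildcard`, and would mis-split level names that contain a hyphen.

```python
from __future__ import annotations

import re
from dataclasses import dataclass


@dataclass(frozen=True)
class OperandSketch:
    levels: tuple[str, ...]
    is_wildcard: bool = False


@dataclass(frozen=True)
class ItemSketch:
    left: OperandSketch
    right: OperandSketch


@dataclass(frozen=True)
class ExprSketch:
    var: str
    items: tuple[ItemSketch, ...]


# Matches "name[ ... ]" and captures the variable name and the bracket body.
_BRACKET = re.compile(r"^\s*(\w+)\s*\[(.*)\]\s*$")


def parse_bracket_contrast(text: str) -> ExprSketch:
    match = _BRACKET.match(text)
    if match is None:
        raise ValueError(f"not a bracket contrast: {text!r}")
    var, body = match.groups()
    items = []
    for chunk in body.split(","):
        left, sep, right = chunk.partition("-")
        if not sep or not left.strip() or not right.strip():
            raise ValueError(f"expected 'left - right', got {chunk!r}")
        items.append(
            ItemSketch(OperandSketch((left.strip(),)), OperandSketch((right.strip(),)))
        )
    return ExprSketch(var=var, items=tuple(items))


expr = parse_bracket_contrast("Drug[A - B, C - D]")
```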

`ExploreFormulaError`

Error in explore formula syntax

Field	Type
`formula`	(type not specified)
`position`	(type not specified)

`ExploreFormulaSpec`

Parsed explore formula

Field	Type
`conditions`	`tuple[Condition, ...]`
`contrast_degree`	`int \| None`
`contrast_expr`	`ContrastExpr \| None`
`contrast_level_ordering`	`tuple[str, ...] \| None`
`contrast_ref`	`str \| None`
`contrast_type`	`str \| None`
`focal_at_quantile`	`int \| None`
`focal_at_range`	`int \| None`
`focal_at_values`	`tuple[float \| str, ...] \| None`
`focal_var`	`str`
`has_conditions`	`bool`
`has_contrast`	`bool`
`has_contrast_expr`	`bool`
`has_rhs_contrasts`	`bool`

`FitState`

Immutable fitting result

Field	Type
`XtWX_inv`	`NDArray[np.floating] \| None`
`coef`	`NDArray[np.floating]`
`converged`	`bool`
`deviance`	`float \| None`
`df_resid`	`float`
`dispersion`	`float \| None`
`fitted`	`NDArray[np.floating]`
`irls_weights`	`NDArray[np.floating] \| None`
`leverage`	`NDArray[np.floating]`
`loglik`	`float`
`n_iter`	`int`
`null_deviance`	`float \| None`
`residuals`	`NDArray[np.floating]`
`sigma`	`float \| None`
`theta`	`NDArray[np.floating] \| None`
`u`	`NDArray[np.floating] \| None`
`vcov`	`NDArray[np.floating]`

`FormulaSpec`

Learned formula encoding: everything needed to replay the same encoding on new data

Field	Type
`contrast_matrices`	`dict[str, NDArray]`
`contrast_types`	`dict[str, str]`
`custom_contrasts`	`dict[str, NDArray]`
`factors`	`dict[str, tuple[str, ...]]`
`formula`	`str`
`has_intercept`	`bool`
`has_random_effects`	`bool`
`nested_metadata`	`dict`
`re_terms`	`tuple`
`response_transform`	`tuple[str, ...] \| None`
`response_var`	`str \| None`
`rhs_terms`	`tuple`
`transform_state`	`dict[str, dict]`
`transforms`	`dict[str, object]`
`uncorr_metadata`	`dict`
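
"Replay on new data" means that anything learned from the training data (factor levels, transform parameters) is stored once and reused, never re-estimated. The sketch below illustrates the idea with a plain dict that borrows the `factors` and `transform_state` field names. The key `"center(age)"`, the treatment coding, and both helper functions are assumptions made for illustration.

```python
from __future__ import annotations


def learn_encoding(ages: list[float], groups: list[str]) -> dict:
    """Record what is needed to encode future rows the same way."""
    return {
        "transform_state": {"center(age)": {"mean": sum(ages) / len(ages)}},
        "factors": {"group": tuple(sorted(set(groups)))},
    }


def replay(spec: dict, age: float, group: str) -> list[float]:
    """Encode one new row using only the learned state.

    Treatment coding with the first level as reference is assumed here.
    """
    levels = spec["factors"]["group"]
    if group not in levels:
        raise ValueError(f"unseen level {group!r}; known levels: {levels}")
    # Center with the training mean, not the mean of the new data.
    centered = age - spec["transform_state"]["center(age)"]["mean"]
    dummies = [1.0 if group == level else 0.0 for level in levels[1:]]
    return [1.0, centered, *dummies]


spec = learn_encoding(ages=[20.0, 30.0, 40.0], groups=["b", "a", "c"])
row = replay(spec, age=35.0, group="c")
```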

`CVState`

Cross-validation results for model evaluation

Field	Type
`accuracy`	`float \| None`
`auc`	`float \| None`
`deviance`	`float \| None`
`f1`	`float \| None`
`fold_assignments`	`np.ndarray \| None`
`fold_metrics`	`dict[str, np.ndarray]`
`k`	`int`
`mae`	`float`
`mse`	`float`
`oos_predictions`	`np.ndarray \| None`
`oos_residuals`	`np.ndarray \| None`
`r_squared`	`float`
`rmse`	`float`
`sensitivity`	`float \| None`
`specificity`	`float \| None`
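
The regression metrics above can all be derived from the out-of-sample predictions. The sketch below uses the textbook definitions, with residuals taken as observed minus predicted; the function name is hypothetical, and the library may aggregate per fold or define `r_squared` differently.

```python
import math


def cv_metrics(y: list, oos_predictions: list) -> dict:
    """Pooled out-of-sample metrics from observed values and predictions."""
    residuals = [obs - pred for obs, pred in zip(y, oos_predictions)]
    n = len(residuals)
    mse = sum(r * r for r in residuals) / n
    mae = sum(abs(r) for r in residuals) / n
    y_bar = sum(y) / n
    ss_tot = sum((obs - y_bar) ** 2 for obs in y)
    # 1 - SS_res / SS_tot, where SS_res = mse * n.
    r_squared = 1.0 - (mse * n) / ss_tot
    return {
        "mse": mse,
        "rmse": math.sqrt(mse),
        "mae": mae,
        "r_squared": r_squared,
        "oos_residuals": residuals,
    }


metrics = cv_metrics(y=[1.0, 2.0, 3.0, 4.0], oos_predictions=[1.5, 2.0, 2.5, 4.0])
```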

`InferenceState`

Inference results that augment params or estimates

Field	Type
`alternative`	`str`
`boot_samples`	`np.ndarray \| None`
`ci_lower`	`np.ndarray`
`ci_upper`	`np.ndarray`
`conf_level`	`float`
`df`	`np.ndarray`
`method`	`str`
`n_resamples`	`int \| None`
`null`	`float`
`p_value`	`np.ndarray`
`perm_samples`	`np.ndarray \| None`
`pre`	`np.ndarray \| None`
`pre_sd`	`np.ndarray \| None`
`se`	`np.ndarray`
`statistic`	`np.ndarray`
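
As an illustration of how `se`, `statistic`, `p_value`, `ci_lower`, `ci_upper`, `conf_level`, and `null` relate, here is a Wald-style sketch for a single estimate. It uses a normal reference distribution from the standard library for simplicity; the library also carries a `df` field, so its asymptotic method presumably uses a t distribution, and the function name is hypothetical.

```python
from statistics import NormalDist


def wald_inference(estimate: float, se: float, conf_level: float = 0.95, null: float = 0.0) -> dict:
    """Two-sided Wald test and interval under a normal approximation."""
    z = NormalDist()
    statistic = (estimate - null) / se
    p_value = 2.0 * (1.0 - z.cdf(abs(statistic)))
    # Critical value for a central interval with the requested coverage.
    crit = z.inv_cdf(0.5 + conf_level / 2.0)
    return {
        "statistic": statistic,
        "p_value": p_value,
        "ci_lower": estimate - crit * se,
        "ci_upper": estimate + crit * se,
        "conf_level": conf_level,
        "null": null,
    }


result = wald_inference(estimate=2.0, se=0.5)
```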

`JointTestState`

Joint hypothesis test results for model terms

Field	Type
`df1`	`np.ndarray`
`df2`	`np.ndarray \| None`
`p_value`	`np.ndarray`
`ss_type`	`str`
`statistic`	`np.ndarray`
`terms`	`tuple[str, ...]`
`test_type`	`str`

`ResamplesState`

Unified resampling results from bootstrap or permutation inference

Field	Type
`context`	`str`
`method`	`str`
`n_resamples`	`int`
`names`	`tuple[str, ...]`
`observed`	`np.ndarray`
`samples`	`np.ndarray`
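
A long-format view of raw resamples has one row per (resample, name) pair. The sketch below assumes `samples` is laid out as `n_resamples` rows by `len(names)` columns and uses plain lists of dicts in place of a DataFrame; the output keys `"resample"`, `"term"`, and `"value"` are illustrative guesses, not the library's column names.

```python
def resamples_to_long(samples: list, names: tuple) -> list:
    """Flatten an (n_resamples x n_names) table into long-format rows."""
    rows = []
    for resample_idx, draw in enumerate(samples):
        if len(draw) != len(names):
            raise ValueError("each resample needs one value per name")
        for name, value in zip(names, draw):
            rows.append({"resample": resample_idx, "term": name, "value": value})
    return rows


long_rows = resamples_to_long(
    samples=[[0.9, 2.1], [1.1, 1.8]],
    names=("Intercept", "x"),
)
```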

`MeeState`

Marginal effects / estimated marginal means results

Field	Type
`L_matrix`	`np.ndarray \| None`
`L_matrix_link`	`np.ndarray \| None`
`ci_lower`	`np.ndarray \| None`
`ci_upper`	`np.ndarray \| None`
`conf_level`	`float \| None`
`contrast_method`	`str \| None`
`df`	`np.ndarray \| None`
`effect_scale`	`str`
`estimate`	`np.ndarray`
`explore_formula`	`str`
`focal_var`	`str`
`grid`	`'pl.DataFrame'`
`has_inference`	`bool`
`how`	`str`
`inference_method`	`str \| None`
`link`	`str \| None`
`n_contrast_levels`	`int \| None`
`p_value`	`np.ndarray \| None`
`se`	`np.ndarray \| None`
`statistic`	`np.ndarray \| None`
`type`	`str`
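
In the usual formulation of estimated marginal means, each row of an L matrix is a linear combination of coefficients, so the estimate is `L @ coef` and the standard error is the square root of the diagonal of `L @ vcov @ L.T`. The sketch below shows that arithmetic with plain lists; that the library's `L_matrix` is used this way is an assumption, and the function name is hypothetical.

```python
import math


def linear_estimates(L: list, coef: list, vcov: list) -> tuple:
    """Estimate and standard error for each row of L."""
    estimates, ses = [], []
    for row in L:
        estimates.append(sum(l * b for l, b in zip(row, coef)))
        # Quadratic form row @ vcov @ row gives the variance of the combination.
        variance = sum(
            row[i] * vcov[i][j] * row[j]
            for i in range(len(row))
            for j in range(len(row))
        )
        ses.append(math.sqrt(variance))
    return estimates, ses


coef = [1.0, 2.0]
vcov = [[0.04, 0.0], [0.0, 0.09]]
L = [[1.0, 0.0], [1.0, 1.0]]  # reference-level mean, then second-level mean
estimates, ses = linear_estimates(L, coef, vcov)
```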

`ProfileState`

Profile likelihood state for variance component CIs

Field	Type
`ci_lower_sd`	`NDArray[np.floating]`
`ci_sd`	`dict[str, tuple[float, float]]`
`ci_theta`	`dict[str, tuple[float, float]]`
`ci_upper_sd`	`NDArray[np.floating]`
`conf_level`	`float`
`dev_opt`	`float`
`spline_forward`	`dict[str, Any]`
`spline_reverse`	`dict[str, Any]`
`table`	`'pl.DataFrame'`
`threshold`	`float`

`VaryingSpreadState`

Variance components for mixed models

Field	Type
`ci_lower`	`dict[str, float] \| None`
`ci_method`	`str \| None`
`ci_upper`	`dict[str, float] \| None`
`components`	`'pl.DataFrame'`
`conf_level`	`float \| None`
`has_inference`	`bool`
`icc`	`float \| None`
`rho`	`dict[str, float]`
`sigma2`	`float`
`tau2`	`dict[str, float]`
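
For a random-intercept model the intraclass correlation is the between-group variance divided by the total variance. The sketch below applies that textbook formula to `tau2` and `sigma2`. Summing all `tau2` entries when there are several components is an assumption; the library may compute `icc` differently or leave it as `None` in that case.

```python
def intraclass_correlation(tau2: dict, sigma2: float) -> float:
    """ICC = between-group variance / (between-group + residual variance)."""
    between = sum(tau2.values())
    return between / (between + sigma2)


icc = intraclass_correlation({"subject": 3.0}, sigma2=1.0)
```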

`VaryingState`

Random effects (BLUPs) for mixed models

Field	Type
`conf_level`	`float \| None`
`effects`	`dict[str, np.ndarray]`
`grid`	`'pl.DataFrame'`
`grouping_var`	`str`
`has_inference`	`bool`
`n_groups`	`int`
`pi_lower`	`dict[str, np.ndarray] \| None`
`pi_upper`	`dict[str, np.ndarray] \| None`

`PredictionConfig`

Configuration that produced a PredictionState

Field	Type
`allow_new_levels`	`bool`
`formula_spec`	`Any`
`newdata`	`'pl.DataFrame \| None'`
`pred_type`	`str`
`training_data`	`'pl.DataFrame \| None'`
`varying`	`str`

`PredictionState`

Prediction results with optional intervals

Field	Type
`X_pred`	`np.ndarray \| None`
`ci_lower`	`np.ndarray \| None`
`ci_upper`	`np.ndarray \| None`
`conf_level`	`float \| None`
`config`	`PredictionConfig \| None`
`cv_fitted`	`np.ndarray \| None`
`cv_fold`	`np.ndarray \| None`
`cv_residual`	`np.ndarray \| None`
`fitted`	`np.ndarray`
`grid`	`'pl.DataFrame \| None'`
`has_cv`	`bool`
`has_inference`	`bool`
`interval_type`	`str \| None`
`link`	`np.ndarray \| None`
`se`	`np.ndarray \| None`

`SimulationInferenceState`

Simulation inference results for post-fit or power analysis simulations

Field	Type
`alpha`	`float`
`bias`	`dict[str, float]`
`coverage`	`dict[str, float]`
`n_sims`	`int`
`power`	`dict[str, float]`
`rmse`	`dict[str, float]`
`sim_mean`	`np.ndarray \| None`
`sim_quantiles`	`dict[str, np.ndarray]`
`sim_sd`	`np.ndarray \| None`
`sim_type`	`str`
`true_coef`	`dict[str, float]`
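
The `power`, `coverage`, `bias`, and `rmse` fields are standard simulation summaries. The sketch below computes them for a single coefficient from per-simulation estimates, intervals, and p-values, using the usual definitions; the function name and input layout are hypothetical, and the library stores these per coefficient in dicts.

```python
import math


def summarize_simulations(
    estimates: list,
    ci_lower: list,
    ci_upper: list,
    p_values: list,
    true_value: float,
    alpha: float = 0.05,
) -> dict:
    n_sims = len(estimates)
    bias = sum(estimates) / n_sims - true_value
    rmse = math.sqrt(sum((e - true_value) ** 2 for e in estimates) / n_sims)
    # Coverage: share of intervals that contain the true value.
    coverage = sum(lo <= true_value <= hi for lo, hi in zip(ci_lower, ci_upper)) / n_sims
    # Power: share of simulations that reject at level alpha.
    power = sum(p < alpha for p in p_values) / n_sims
    return {"n_sims": n_sims, "bias": bias, "rmse": rmse, "coverage": coverage, "power": power}


summary = summarize_simulations(
    estimates=[0.4, 0.5, 0.6, 0.5],
    ci_lower=[0.1, 0.2, 0.55, 0.2],
    ci_upper=[0.7, 0.8, 0.9, 0.8],
    p_values=[0.01, 0.2, 0.03, 0.04],
    true_value=0.5,
)
```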

`Distribution`

Protocol for distribution objects that can generate random samples

`ModelSpec`

Immutable model configuration

Field	Type
`family`	`str`
`fixed_terms`	`tuple[str, ...]`
`formula`	`str`
`has_random_effects`	`bool`
`link`	`str`
`method`	`str`
`random_terms`	`tuple[str, ...]`
`response_var`	`str`
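
"Immutable" here means a spec is never edited in place; a changed configuration is a new object. The sketch below demonstrates the pattern with a frozen standard-library dataclass. The validator signatures on this page suggest the library uses attrs instead, so `ModelSpecSketch` and the example family and link strings are stand-ins, not the real class or its accepted values.

```python
from dataclasses import FrozenInstanceError, dataclass, replace


@dataclass(frozen=True)
class ModelSpecSketch:
    """Hypothetical stand-in with three of the ModelSpec fields."""

    formula: str
    family: str = "gaussian"
    link: str = "identity"


spec = ModelSpecSketch(formula="y ~ x")

# Assigning to a field of a frozen instance raises instead of mutating.
try:
    spec.family = "binomial"
    mutated = True
except FrozenInstanceError:
    mutated = False

# A modified copy is created explicitly; the original is left untouched.
logit_spec = replace(spec, family="binomial", link="logit")
```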

`SimulationSpec`

Specification for data generation in simulation-first workflows

Field	Type
`coef`	`dict[str, float]`
`distributions`	`dict[str, Distribution]`
`n`	`int`
`re_spec`	`dict[str, VaryingSpec]`
`seed`	`int \| None`
`sigma`	`float`
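
To show how `n`, `distributions`, `coef`, `sigma`, and `seed` could combine, the sketch below draws predictors from distribution objects and generates a Gaussian response. Everything named here is hypothetical: the `sample(n, rng)` method on the protocol, the `"Intercept"` coefficient key, and the `simulate` function are assumptions, and random effects (`re_spec`) are left out.

```python
from __future__ import annotations

import random
from typing import Protocol


class DistributionSketch(Protocol):
    """Assumed protocol shape: anything with a sample(n, rng) method."""

    def sample(self, n: int, rng: random.Random) -> list[float]: ...


class NormalSketch:
    def __init__(self, mean: float = 0.0, sd: float = 1.0) -> None:
        self.mean = mean
        self.sd = sd

    def sample(self, n: int, rng: random.Random) -> list[float]:
        return [rng.gauss(self.mean, self.sd) for _ in range(n)]


def simulate(
    n: int,
    distributions: dict[str, DistributionSketch],
    coef: dict[str, float],
    sigma: float,
    seed: int | None = None,
) -> dict[str, list[float]]:
    rng = random.Random(seed)  # one seeded generator makes the run reproducible
    columns = {name: dist.sample(n, rng) for name, dist in distributions.items()}
    y = []
    for i in range(n):
        mean = coef.get("Intercept", 0.0) + sum(
            coef.get(name, 0.0) * columns[name][i] for name in distributions
        )
        y.append(mean + rng.gauss(0.0, sigma))
    columns["y"] = y
    return columns


data = simulate(5, {"x": NormalSketch()}, {"Intercept": 1.0, "x": 2.0}, sigma=0.0, seed=1)
```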

`VaryingSpec`

Specification for random effect grouping in simulation

Field	Type
`correlations`	`dict[tuple[str, str], float]`
`n`	`int`
`n_per`	`int \| None`
`sd`	`float`
`slope_sds`	`dict[str, float]`
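
The `sd`, `slope_sds`, and `correlations` fields together determine a random-effects covariance matrix, since each covariance is `rho * sd_a * sd_b`. The sketch below assembles that matrix with plain lists. The `"Intercept"` label, the treatment of missing pairs as uncorrelated, and the function name are assumptions.

```python
def random_effects_cov(
    sd: float,
    slope_sds: dict,
    correlations: dict,
    intercept_name: str = "Intercept",
) -> tuple:
    """Return (names, covariance matrix) for intercept plus slopes."""
    names = [intercept_name, *slope_sds]
    sds = {intercept_name: sd, **slope_sds}
    cov = []
    for a in names:
        row = []
        for b in names:
            if a == b:
                rho = 1.0
            else:
                # Accept either key order; unspecified pairs default to 0.
                rho = correlations.get((a, b), correlations.get((b, a), 0.0))
            row.append(rho * sds[a] * sds[b])
        cov.append(row)
    return names, cov


names, cov = random_effects_cov(
    sd=2.0,
    slope_sds={"x": 0.5},
    correlations={("Intercept", "x"): 0.3},
)
```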

`Col`

Namespace for all column name constants used across DataFrame outputs

Field	Type
`AIC`	`str`
`AIC_R`	`str`
`BIAS`	`str`
`BIC`	`str`
`BIC_R`	`str`
`CHI2`	`str`
`CHISQ`	`str`
`CI_INCREASE_FACTOR`	`str`
`CI_LOWER`	`str`
`CI_METHOD`	`str`
`CI_UPPER`	`str`
`COHENS_D`	`str`
`COMPONENT`	`str`
`CONTRAST`	`str`
`CONVERGED`	`str`
`COOKSD`	`str`
`CORR`	`str`
`COVERAGE`	`str`
`CV_DEVIANCE`	`str`
`CV_FITTED`	`str`
`CV_FOLD`	`str`
`CV_K`	`str`
`CV_MAE`	`str`
`CV_MAE_SD`	`str`
`CV_MSE`	`str`
`CV_MSE_SD`	`str`
`CV_RESIDUAL`	`str`
`CV_RMSE`	`str`
`CV_RMSE_SD`	`str`
`CV_RSQUARED`	`str`
`CV_RSQUARED_SD`	`str`
`CV_SCORE`	`str`
`CV_SE`	`str`
`DELTA_AIC`	`str`
`DELTA_BIC`	`str`
`DEVIANCE`	`str`
`DEV_DIFF`	`str`
`DF`	`str`
`DF1`	`str`
`DF2`	`str`
`DF_MODEL`	`str`
`DF_RESID`	`str`
`DIFF`	`str`
`DIFF_SE`	`str`
`DISPERSION`	`str`
`D_LOWER`	`str`
`D_UPPER`	`str`
`EFFECT1`	`str`
`EFFECT2`	`str`
`EMPIRICAL_SE`	`str`
`ESTIMATE`	`str`
`ETA_SQ`	`str`
`FITTED`	`str`
`FSTATISTIC`	`str`
`FSTATISTIC_PVALUE`	`str`
`F_RATIO`	`str`
`F_STAT`	`str`
`GROUP`	`str`
`HAT`	`str`
`ICC`	`str`
`IS_SINGULAR`	`str`
`LEVEL`	`str`
`LINK`	`str`
`LOGLIK`	`str`
`MEAN_SE`	`str`
`METRIC`	`str`
`MODEL`	`str`
`MSE`	`str`
`N`	`str`
`NGROUPS`	`str`
`NOBS`	`str`
`NOBS_MISSING`	`str`
`NOBS_TOTAL`	`str`
`NPAR`	`str`
`NPARAMS`	`str`
`NULL_DEVIANCE`	`str`
`N_FAILED`	`str`
`N_ITER`	`str`
`N_SIMS`	`str`
`N_THETA`	`str`
`OBSERVED`	`str`
`ODDS_RATIO`	`str`
`OPTIMIZER`	`str`
`PI_LOWER_PREFIX`	`str`
`PI_UPPER_PREFIX`	`str`
`POWER`	`str`
`POWER_CI_LOWER`	`str`
`POWER_CI_UPPER`	`str`
`PRE`	`str`
`PRE_R`	`str`
`PRE_SD`	`str`
`PSEUDO_RSQUARED`	`str`
`P_VALUE`	`str`
`RESAMPLE`	`str`
`RESID`	`str`
`RHO_PREFIX`	`str`
`RHS_CONTRAST`	`str`
`RMSE`	`str`
`RSQUARED`	`str`
`RSQUARED_ADJ`	`str`
`RSQUARED_CONDITIONAL`	`str`
`RSQUARED_MARGINAL`	`str`
`RSS`	`str`
`R_SEMI`	`str`
`SE`	`str`
`SIGMA`	`str`
`SIGMA2`	`str`
`SIM_MEAN`	`str`
`SIM_Q025`	`str`
`SIM_Q975`	`str`
`SIM_SD`	`str`
`SS`	`str`
`STATISTIC`	`str`
`STD_RESID`	`str`
`TAU2_PREFIX`	`str`
`TERM`	`str`
`TERM_TYPE`	`str`
`TRUE_VALUE`	`str`
`T_STAT`	`str`
`VALUE`	`str`
`VIF`	`str`
`WEIGHT`	`str`
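
A constants namespace like this keeps column names in one place so that builders and consumers cannot drift apart. The sketch below shows the pattern with three entries; the string values are guesses (this page lists only the constant names and their `str` type), and `ColSketch` and `tau2_column` are hypothetical.

```python
class ColSketch:
    """Hypothetical subset of a column-name namespace; values are guesses."""

    TERM = "term"
    ESTIMATE = "estimate"
    TAU2_PREFIX = "tau2_"


def tau2_column(group: str) -> str:
    # Prefix constants are combined with a group name to form a column name.
    return f"{ColSketch.TAU2_PREFIX}{group}"


row = {
    ColSketch.TERM: "x",
    ColSketch.ESTIMATE: 0.5,
    tau2_column("subject"): 1.2,
}
```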

Builder Functions

Function	Signature	Description	Module
`build_cv_state`	`(k, mse, rmse, mae, r_squared, deviance, accuracy, sensitivity, specificity, f1, auc, fold_metrics, oos_predictions, oos_residuals, fold_assignments) -> CVState`	Build a CVState from cross-validation computation	`builders`
`build_effects_dataframe`	`(mee, method) -> pl.DataFrame`	Build the `.effects` DataFrame from marginal effects state	`builders`
`build_fit_state`	`(coef, vcov, fitted, residuals, leverage, df_resid, loglik, converged, n_iter, sigma, dispersion, null_deviance, deviance, theta, u, irls_weights, XtWX_inv) -> FitState`	Build a FitState instance with validation	`builders`
`build_inference_state`	`(se, statistic, df, p_value, ci_lower, ci_upper, conf_level, method, null, alternative, n_resamples, boot_samples, perm_samples, pre, pre_sd) -> InferenceState`	Build an InferenceState from computed inference values	`builders`
`build_joint_test_dataframe`	`(state) -> pl.DataFrame`	Build an ANOVA-style DataFrame from joint test results	`builders`
`build_joint_test_state`	`(terms, df1, statistic, p_value, test_type, ss_type, df2) -> JointTestState`	Build a JointTestState from computed joint test values	`builders`
`build_mee_resamples`	`(mee, samples, how) -> ResamplesState \| None`	Build ResamplesState from MEE inference if samples are available	`builders`
`build_mee_state`	`(grid, estimate, explore_formula, focal_var, mee_type, how, effect_scale, L_matrix, contrast_method, n_contrast_levels, link, L_matrix_link, boot_X_plus, boot_X_minus, boot_delta, se, df, statistic, p_value, ci_lower, ci_upper, conf_level) -> MeeState`	Build a MeeState from marginal effects computation	`builders`
`build_model_spec`	`(formula, family, link, method, response_var, fixed_terms, random_terms, has_random_effects) -> ModelSpec`	Build a ModelSpec from raw inputs	`builders`
`build_model_spec_from_formula`	`(formula, family, link, method, structure) -> ModelSpec`	Build ModelSpec from a pre-parsed formula structure and resolve defaults	`builders`
`build_params_dataframe`	`(bundle, fit, params_inference) -> pl.DataFrame`	Build the `.params` DataFrame from fit state	`builders`
`build_params_resamples`	`(inference, fit_coef, x_names, how) -> ResamplesState \| None`	Build ResamplesState from params inference if samples are available	`builders`
`build_prediction_state`	`(fitted, link, X_pred, config, se, ci_lower, ci_upper, interval_type, conf_level, grid) -> PredictionState`	Build a PredictionState from prediction computation	`builders`
`build_predictions_dataframe`	`(pred) -> pl.DataFrame`	Build the `.predictions` DataFrame from prediction state	`builders`
`build_resamples_dataframe`	`(rs) -> pl.DataFrame`	Build a long-format DataFrame of raw resampled values	`builders`
`build_resamples_state`	`(samples, observed, names, method, n_resamples, context) -> ResamplesState`	Build a ResamplesState from resampling results	`builders`
`build_simulation_inference_state`	`(sim_type, n_sims, sim_mean, sim_sd, sim_quantiles, power, coverage, bias, rmse, alpha, true_coef) -> SimulationInferenceState`	Build a SimulationInferenceState from computed values	`builders`
`build_simulation_spec`	`(n, distributions, coef, sigma, re_spec, seed) -> SimulationSpec`	Build a SimulationSpec for data generation	`builders`
`build_simulation_spec_from_formula`	`(formula, n, distributions, coef, sigma, seed) -> SimulationSpec`	Build SimulationSpec from formula with defaults for unspecified variables	`builders`
`build_simulations_dataframe`	`(simulations, sim_inference) -> pl.DataFrame`	Build the `.simulations` DataFrame with optional inference columns	`builders`
`build_varying_corr_dataframe`	`(varying_spread) -> pl.DataFrame`	Build the `.varying_corr` DataFrame from random effect correlations	`builders`
`build_varying_offsets_dataframe`	`(varying_offsets) -> pl.DataFrame`	Build the `.varying_offsets` DataFrame from varying state	`builders`
`build_varying_params_dataframe`	`(bundle, fit, varying_offsets) -> pl.DataFrame`	Build the `.varying_params` DataFrame (population + offsets)	`builders`
`build_varying_spec`	`(n, sd, slope_sds, correlations, n_per) -> VaryingSpec`	Build a VaryingSpec for random effect structure	`builders`
`build_varying_spread_dataframe`	`(varying_spread) -> pl.DataFrame`	Build long-form `.varying_spread` DataFrame from variance components	`builders`
`build_varying_spread_state`	`(components, sigma2, tau2, rho, icc, ci_lower, ci_upper, conf_level, ci_method) -> VaryingSpreadState`	Build a VaryingSpreadState from variance component estimates	`builders`
`build_varying_state`	`(grid, effects, grouping_var, n_groups, pi_lower, pi_upper, conf_level) -> VaryingState`	Build a VaryingState from computed BLUPs	`builders`
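
The builders in the table share one shape: take raw computed values, check them, and return an immutable container. The sketch below illustrates that shape for a cut-down fit result. `FitStateSketch`, `build_fit_state_sketch`, and the specific checks are assumptions made for illustration; the actual validation inside `build_fit_state` is not documented on this page.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class FitStateSketch:
    """Hypothetical stand-in with four of the FitState fields."""

    coef: tuple[float, ...]
    vcov: tuple[tuple[float, ...], ...]
    df_resid: float
    converged: bool = True


def build_fit_state_sketch(coef, vcov, df_resid, converged=True) -> FitStateSketch:
    p = len(coef)
    # Cross-field check: vcov must be square and match the coefficient count.
    if len(vcov) != p or any(len(row) != p for row in vcov):
        raise ValueError(f"vcov must be {p}x{p} to match coef")
    if df_resid < 0:
        raise ValueError("df_resid must be non-negative")
    return FitStateSketch(
        coef=tuple(coef),
        vcov=tuple(tuple(row) for row in vcov),
        df_resid=float(df_resid),
        converged=converged,
    )


fit = build_fit_state_sketch([1.0, 2.0], [[0.04, 0.0], [0.0, 0.09]], df_resid=18)
```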

Validators

Function	Signature	Description
`is_choice_str`	`(choices) -> ...`	Build a validator for a string constrained to a fixed set of choices
`is_conf_level`	`(instance, attribute, value) -> None`	Validate that a value is a normalized confidence level in (0, 1)
`is_ndarray`	`(instance, attribute, value) -> None`	Validate that a value is a numpy ndarray
`is_nonnegative_int`	`(instance, attribute, value) -> None`	Validate that a value is a non-negative integer
`is_optional_conf_level`	`(instance, attribute, value) -> None`	Validate that a value is a normalized confidence level or None
`is_optional_int`	`(instance, attribute, value) -> None`	Validate that a value is an int or None
`is_optional_ndarray`	`(instance, attribute, value) -> None`	Validate that a value is a numpy ndarray or None
`is_optional_positive_int`	`(instance, attribute, value) -> None`	Validate that a value is a positive integer or None
`is_optional_sparse_csc`	`(instance, attribute, value) -> None`	Validate that a value is a scipy.sparse.csc_matrix or None
`is_optional_str`	`(instance, attribute, value) -> None`	Validate that a value is a string or None
`is_optional_str_key_dict`	`(instance, attribute, value) -> None`	Validate that a value is a dict with string keys or None
`is_optional_tuple_of_str`	`(instance, attribute, value) -> None`	Validate that a value is a tuple of strings or None
`is_positive_int`	`(instance, attribute, value) -> None`	Validate that a value is a positive integer
`is_tuple_of_str`	`(instance, attribute, value) -> None`	Validate that a value is a tuple of strings
`normalize_conf_level`	`(conf_level) -> float`	Normalize conf_level to a float in (0, 1)
`normalize_optional_conf_level`	`(conf_level) -> float \| None`	Normalize an optional confidence level
`validate_correlations`	`(instance, attribute, value) -> None`	Validate correlation values are in [-1, 1]
`validate_sigma`	`(instance, attribute, value) -> None`	Validate sigma is non-negative
`validate_slope_sds`	`(instance, attribute, value) -> None`	Validate slope SDs are non-negative
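
The `(instance, attribute, value) -> None` signature matches the attrs validator convention: a validator returns nothing on success and raises on failure, and a factory such as `is_choice_str` returns a validator. The sketch below mimics three entries with stand-in functions and no attrs dependency. The `_sketch` names, the exception types, and the rule that values above 1 are treated as percentages are all assumptions.

```python
from types import SimpleNamespace


def normalize_conf_level_sketch(conf_level) -> float:
    value = float(conf_level)
    if value > 1.0:
        # Assumption: values such as 95 are percentages.
        value /= 100.0
    if not 0.0 < value < 1.0:
        raise ValueError(f"conf_level must be in (0, 1), got {conf_level!r}")
    return value


def is_conf_level_sketch(instance, attribute, value) -> None:
    is_number = isinstance(value, (int, float)) and not isinstance(value, bool)
    if not is_number or not 0.0 < value < 1.0:
        raise ValueError(f"{attribute.name} must be in (0, 1), got {value!r}")


def is_choice_str_sketch(choices):
    allowed = tuple(choices)

    def validator(instance, attribute, value) -> None:
        if not isinstance(value, str) or value not in allowed:
            raise ValueError(f"{attribute.name} must be one of {allowed}, got {value!r}")

    return validator


# Minimal stand-in for the attribute object a validator receives.
conf_attr = SimpleNamespace(name="conf_level")
method_attr = SimpleNamespace(name="method")
is_method = is_choice_str_sketch(("asymp", "boot", "perm"))
```

Calling `is_conf_level_sketch(None, conf_attr, 0.9)` returns `None`, while passing `1.5` raises `ValueError`.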

Schemas & Constants

Constant
`AugmentedDataCols`
`ComparisonAic`
`ComparisonBic`
`ComparisonCv`
`ComparisonDevianceChi2`
`ComparisonDevianceF`
`ComparisonFTest`
`ComparisonLrt`
`DiagnosticsCvCols`
`DiagnosticsGaussian`
`DiagnosticsGlm`
`DiagnosticsLmer`
`DiagnosticsPredCvCols`
`EffectsAsympCols`
`EffectsBaseCols`
`EffectsBootCols`
`EffectsPermCols`
`MetadataBase`
`MetadataMixed`
`ParamsAsymp`
`ParamsBase`
`ParamsBoot`
`ParamsCv`
`ParamsPerm`
`PowerSummaryCols`
`PredictionsAsymp`
`PredictionsBase`
`PredictionsCv`
`ResamplesRawSchema`
`SimulationsInferCols`
`VaryingCorrSchema`
`VaryingOffsetsBaseCols`
`VaryingOffsetsInferSuffix`
`VaryingParamsBaseCols`
`VaryingSpreadBase`
`VaryingSpreadInfer`
`VifSchema`