Smart constructors: basic types → containers.

Builder functions validate inputs and construct frozen container instances. Each builder accepts primitive types and returns a frozen attrs container.
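
As a rough illustration of the smart-constructor pattern described above, the sketch below uses the standard library's frozen dataclasses as a stand-in for attrs. The names `ToyFitState` and `build_toy_fit_state` are hypothetical and not part of this package; the validation rules are assumptions chosen for the example.

```python
from dataclasses import dataclass, FrozenInstanceError
from typing import Tuple


@dataclass(frozen=True)
class ToyFitState:
    """Hypothetical frozen container (stand-in for an attrs class)."""
    coef: Tuple[float, ...]
    df_resid: float
    converged: bool = True


def build_toy_fit_state(*, coef, df_resid, converged=True):
    """Hypothetical builder: validate primitive inputs, then freeze them."""
    coef = tuple(float(c) for c in coef)
    if not coef:
        raise ValueError("coef must contain at least one value")
    if df_resid < 0:
        raise ValueError("df_resid must be non-negative")
    return ToyFitState(coef=coef, df_resid=float(df_resid), converged=converged)


state = build_toy_fit_state(coef=[1.0, 2.0], df_resid=98)

try:
    state.df_resid = 10.0  # frozen instances reject attribute assignment
except FrozenInstanceError:
    mutation_blocked = True
else:
    mutation_blocked = False
```

The keyword-only signature forces callers to name every field, and the frozen container guarantees that a validated state cannot drift afterwards.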

Functions:

| Name | Description |
| --- | --- |
| append_inference_columns | Append standard inference columns to a DataFrame if available. |
| build_cv_state | Build a CVState from cross-validation computation. |
| build_effects_dataframe | Build the .effects DataFrame from marginal effects state. |
| build_fit_state | Build a FitState instance with validation. |
| build_inference_state | Build an InferenceState from computed inference values. |
| build_joint_test_dataframe | Build an ANOVA-style DataFrame from joint test results. |
| build_joint_test_state | Build a JointTestState from computed joint test values. |
| build_mee_resamples | Build ResamplesState from MEE inference if samples are available. |
| build_mee_state | Build a MeeState from marginal effects computation. |
| build_model_spec | Build a ModelSpec from raw inputs. |
| build_model_spec_from_formula | Build ModelSpec from a pre-parsed formula structure and resolve defaults. |
| build_params_dataframe | Build the .params DataFrame from fit state. |
| build_params_resamples | Build ResamplesState from params inference if samples are available. |
| build_prediction_state | Build a PredictionState from prediction computation. |
| build_predictions_dataframe | Build the .predictions DataFrame from prediction state. |
| build_resamples_dataframe | Build a long-format DataFrame of raw resampled values. |
| build_resamples_state | Build a ResamplesState from resampling results. |
| build_simulation_inference_state | Build a SimulationInferenceState from computed values. |
| build_simulation_spec | Build a SimulationSpec for data generation. |
| build_simulation_spec_from_formula | Build SimulationSpec from formula with defaults for unspecified variables. |
| build_simulations_dataframe | Build the .simulations DataFrame with optional inference columns. |
| build_varying_corr_dataframe | Build the .varying_corr DataFrame from random effect correlations. |
| build_varying_offsets_dataframe | Build the .varying_offsets DataFrame from varying state. |
| build_varying_params_dataframe | Build the .varying_params DataFrame (population + offsets). |
| build_varying_spec | Build a VaryingSpec for random effect structure. |
| build_varying_spread_dataframe | Build the .varying_spread DataFrame from variance components. |
| build_varying_spread_state | Build a VaryingSpreadState from variance component estimates. |
| build_varying_state | Build a VaryingState from computed BLUPs. |
| extract_mee_names | Extract human-readable names from a MeeState. |
| get_varying_random_terms | Get all random terms (Intercept + slope terms) for a VaryingSpec. |

Modules:

| Name | Description |
| --- | --- |
| dataframes | DataFrame builders for user-facing property accessors. |
| resamples | Builder functions for resamples-related containers. |
| results | Result DataFrame assembly utilities. |
| specs | Builder functions for specification containers. |
| state | Builder functions for computation state containers. |

Functions

append_inference_columns

append_inference_columns(df: pl.DataFrame, state: object, method: str | None = None) -> pl.DataFrame

Append standard inference columns to a DataFrame if available.

Checks state.has_inference and, when True, adds each inference column that is not None on state. Columns are added in canonical order: se, ci_lower, ci_upper, statistic, df, p_value.

When method is "perm", the ci_lower and ci_upper columns are excluded.
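
The column-selection rule above can be sketched without polars by treating a DataFrame as a plain mapping of column names to lists. This is an illustrative stand-in under that assumption, not the package's implementation; `append_inference_columns_sketch` and `ToyState` are hypothetical names.

```python
CANONICAL_ORDER = ("se", "ci_lower", "ci_upper", "statistic", "df", "p_value")


def append_inference_columns_sketch(columns, state, method=None):
    """Return a new column mapping; the input mapping is not mutated."""
    out = dict(columns)
    if not getattr(state, "has_inference", False):
        return out
    for name in CANONICAL_ORDER:
        # Permutation inference carries no confidence interval columns.
        if method == "perm" and name in ("ci_lower", "ci_upper"):
            continue
        values = getattr(state, name, None)
        if values is not None:
            out[name] = list(values)
    return out


class ToyState:
    """Hypothetical state object; df is None, so no df column is added."""
    has_inference = True
    se = [0.1, 0.2]
    ci_lower = [0.3, 0.1]
    ci_upper = [0.7, 0.9]
    statistic = [5.0, 2.5]
    df = None
    p_value = [0.001, 0.014]


base = {"term": ["Intercept", "x"], "estimate": [0.5, 0.5]}
asymp = append_inference_columns_sketch(base, ToyState(), method="asymp")
perm = append_inference_columns_sketch(base, ToyState(), method="perm")
```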

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| df | DataFrame | Base DataFrame to augment (not mutated). | required |
| state | object | Object with a has_inference bool and optional se, statistic, df, p_value, ci_lower, ci_upper array attributes. | required |
| method | str \| None | Inference method ("asymp", "boot", "perm", or None). Controls which columns are included. | None |

Returns:

| Type | Description |
| --- | --- |
| DataFrame | A new DataFrame with inference columns appended (or the original DataFrame unchanged if inference is not available). |

build_cv_state

build_cv_state(k: int, rmse: float, mae: float, r_squared: float, *, deviance: float | None = None, accuracy: float | None = None, sensitivity: float | None = None, specificity: float | None = None, f1: float | None = None, auc: float | None = None, fold_metrics: dict[str, np.ndarray] | None = None, oos_predictions: np.ndarray | None = None, oos_residuals: np.ndarray | None = None, fold_assignments: np.ndarray | None = None) -> CVState

Build a CVState from cross-validation computation.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| k | int | Number of folds used. | required |
| rmse | float | Root mean squared error. | required |
| mae | float | Mean absolute error. | required |
| r_squared | float | Coefficient of determination. | required |
| deviance | float \| None | Mean deviance (GLM only). | None |
| accuracy | float \| None | Classification accuracy (binomial only). | None |
| sensitivity | float \| None | Sensitivity / true positive rate (binomial only). | None |
| specificity | float \| None | Specificity / true negative rate (binomial only). | None |
| f1 | float \| None | F1 score (binomial only). | None |
| auc | float \| None | Area under ROC curve (binomial only). | None |
| fold_metrics | dict[str, ndarray] \| None | Per-fold metrics dictionary. | None |
| oos_predictions | ndarray \| None | Out-of-sample predictions. | None |
| oos_residuals | ndarray \| None | Out-of-sample residuals. | None |
| fold_assignments | ndarray \| None | Array indicating which fold each observation belongs to. | None |

Returns:

| Type | Description |
| --- | --- |
| CVState | Frozen CVState instance. |

Examples:

>>> state = build_cv_state(
...     k=10,
...     rmse=0.523,
...     mae=0.412,
...     r_squared=0.891,
... )
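
The summary metrics passed to build_cv_state can be derived from out-of-sample predictions. The sketch below uses the textbook definitions on toy values. Whether the package pools observations across folds or averages per-fold values is not stated here, and the residual sign convention (observed minus predicted) is an assumption.

```python
from math import sqrt

observed = [3.0, 5.0, 7.0, 9.0]
predicted = [2.5, 5.5, 7.0, 8.0]  # toy out-of-sample predictions

residuals = [y - p for y, p in zip(observed, predicted)]
n = len(residuals)

# Root mean squared error and mean absolute error.
rmse = sqrt(sum(r * r for r in residuals) / n)
mae = sum(abs(r) for r in residuals) / n

# Coefficient of determination: 1 - SS_res / SS_tot.
mean_y = sum(observed) / n
ss_res = sum(r * r for r in residuals)
ss_tot = sum((y - mean_y) ** 2 for y in observed)
r_squared = 1.0 - ss_res / ss_tot

# Keyword arguments in the shape build_cv_state expects (k is the fold count).
cv_kwargs = {"k": 2, "rmse": rmse, "mae": mae, "r_squared": r_squared}
```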

build_effects_dataframe

build_effects_dataframe(mee: MeeState, method: str | None = None) -> pl.DataFrame

Build the .effects DataFrame from marginal effects state.

Column set varies by inference method: bootstrap excludes p_value, permutation excludes ci_lower/ci_upper.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| mee | MeeState | MeeState with grid, estimates, and optional inference. | required |
| method | str \| None | Inference method ("asymp", "boot", "perm", or None). Controls which inference columns are included. | None |

Returns:

| Type | Description |
| --- | --- |
| DataFrame | DataFrame with grid columns, estimate, and method-appropriate inference columns. |

build_fit_state

build_fit_state(*, coef: NDArray[np.floating], vcov: NDArray[np.floating], fitted: NDArray[np.floating], residuals: NDArray[np.floating], leverage: NDArray[np.floating], df_resid: float, loglik: float, converged: bool = True, n_iter: int = 1, sigma: float | None = None, dispersion: float | None = None, null_deviance: float | None = None, deviance: float | None = None, theta: NDArray[np.floating] | None = None, u: NDArray[np.floating] | None = None, irls_weights: NDArray[np.floating] | None = None, XtWX_inv: NDArray[np.floating] | None = None) -> FitState

Build a FitState instance with validation.

This builder function provides a keyword-only interface for constructing FitState instances, ensuring all required fields are explicitly provided.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| coef | NDArray[floating] | Coefficient estimates (1D array of length p). | required |
| vcov | NDArray[floating] | Variance-covariance matrix (p x p array). | required |
| fitted | NDArray[floating] | Fitted values (1D array of length n). | required |
| residuals | NDArray[floating] | Residuals (1D array of length n). | required |
| leverage | NDArray[floating] | Hat matrix diagonal / leverage values (1D array of length n). | required |
| df_resid | float | Residual degrees of freedom. | required |
| loglik | float | Log-likelihood at convergence. | required |
| converged | bool | Whether the optimization converged. | True |
| n_iter | int | Number of iterations (1 for closed-form solutions). | 1 |
| sigma | float \| None | Residual standard deviation (OLS models only). | None |
| dispersion | float \| None | Dispersion parameter (GLM models only). | None |
| null_deviance | float \| None | Null model deviance (GLM models only). | None |
| deviance | float \| None | Residual deviance, sum of unit deviances (GLM models only). | None |
| theta | NDArray[floating] \| None | Random effect variance parameters (mixed models only). | None |
| u | NDArray[floating] \| None | Spherical random effects (mixed models only). | None |
| irls_weights | NDArray[floating] \| None | IRLS weights from GLM fit (GLM sandwich estimator). | None |
| XtWX_inv | NDArray[floating] \| None | Inverse of X'WX from GLM fit (GLM sandwich estimator). | None |

Returns:

| Type | Description |
| --- | --- |
| FitState | A new FitState instance. |

Examples:

>>> import numpy as np
>>> from state import build_fit_state
>>> state = build_fit_state(
...     coef=np.array([1.0, 2.0]),
...     vcov=np.eye(2),
...     fitted=np.array([1.0, 2.0, 3.0]),
...     residuals=np.array([0.1, -0.1, 0.0]),
...     leverage=np.array([0.3, 0.3, 0.4]),
...     df_resid=1.0,
...     loglik=-10.0,
...     sigma=0.5,
... )
>>> state.sigma
0.5

build_inference_state

build_inference_state(se: np.ndarray, statistic: np.ndarray, df: np.ndarray, p_value: np.ndarray, ci_lower: np.ndarray, ci_upper: np.ndarray, *, conf_level: float = 0.95, method: str = 'asymp', null: float = 0.0, alternative: str = 'two-sided', n_resamples: int | None = None, boot_samples: np.ndarray | None = None, perm_samples: np.ndarray | None = None, pre: np.ndarray | None = None, pre_sd: np.ndarray | None = None) -> InferenceState

Build an InferenceState from computed inference values.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| se | ndarray | Standard errors for each coefficient. | required |
| statistic | ndarray | Test statistics (t or z). | required |
| df | ndarray | Degrees of freedom. | required |
| p_value | ndarray | P-values. | required |
| ci_lower | ndarray | Lower confidence interval bounds. | required |
| ci_upper | ndarray | Upper confidence interval bounds. | required |
| conf_level | float | Confidence level (default 0.95). | 0.95 |
| method | str | Inference method ("asymp", "boot", "perm", "cv"). | 'asymp' |
| null | float | Null hypothesis value (default 0.0). | 0.0 |
| alternative | str | Alternative hypothesis direction (default "two-sided"). | 'two-sided' |
| n_resamples | int \| None | Number of bootstrap/permutation resamples. | None |
| boot_samples | ndarray \| None | Raw bootstrap samples. | None |
| perm_samples | ndarray \| None | Null distribution of test statistics from permutation tests. | None |
| pre | ndarray \| None | PRE (Proportion Reduction in Error) per coefficient (CV ablation). | None |
| pre_sd | ndarray \| None | Standard deviation of PRE across CV folds (CV ablation). | None |

Returns:

| Type | Description |
| --- | --- |
| InferenceState | Frozen InferenceState instance. |

Examples:

>>> import numpy as np
>>> state = build_inference_state(
...     se=np.array([0.1, 0.2]),
...     statistic=np.array([5.0, 2.5]),
...     df=np.array([98.0, 98.0]),
...     p_value=np.array([0.001, 0.014]),
...     ci_lower=np.array([0.3, 0.1]),
...     ci_upper=np.array([0.7, 0.9]),
... )

build_joint_test_dataframe

build_joint_test_dataframe(state: JointTestState) -> pl.DataFrame

Build an ANOVA-style DataFrame from joint test results.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| state | JointTestState | JointTestState with terms, df, statistics, and p-values. | required |

Returns:

| Type | Description |
| --- | --- |
| DataFrame | DataFrame with term, df1, optional df2, f_ratio or Chisq, and p_value columns. |

build_joint_test_state

build_joint_test_state(terms: tuple[str, ...] | list[str], df1: np.ndarray, statistic: np.ndarray, p_value: np.ndarray, *, test_type: str = 'F', ss_type: str = 'III', df2: np.ndarray | None = None) -> JointTestState

Build a JointTestState from computed joint test values.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| terms | tuple[str, ...] \| list[str] | Names of terms being tested. | required |
| df1 | ndarray | Numerator degrees of freedom per term. | required |
| statistic | ndarray | Test statistic values (F or chi2). | required |
| p_value | ndarray | P-values for each term. | required |
| test_type | str | Type of test ("F" for linear models, "chi2" for GLMs). | 'F' |
| ss_type | str | Sum of squares type ("II" or "III"). | 'III' |
| df2 | ndarray \| None | Denominator degrees of freedom (required for F-tests). | None |

Returns:

| Type | Description |
| --- | --- |
| JointTestState | Frozen JointTestState instance. |

Examples:

F-test results (linear model):

>>> import numpy as np
>>> state = build_joint_test_state(
...     terms=("a", "b", "a:b"),
...     df1=np.array([2, 1, 2]),
...     df2=np.array([94, 94, 94]),
...     statistic=np.array([5.2, 12.1, 0.8]),
...     p_value=np.array([0.007, 0.001, 0.45]),
...     test_type="F",
... )

Chi-square results (GLM):

>>> state = build_joint_test_state(
...     terms=("a", "b"),
...     df1=np.array([2, 1]),
...     statistic=np.array([8.5, 15.2]),
...     p_value=np.array([0.014, 0.0001]),
...     test_type="chi2",
... )
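
The two examples differ in whether df2 is supplied, which follows from the rule that denominator degrees of freedom are required for F-tests. A minimal sketch of how a builder might enforce that rule is shown below. `ToyJointTest` and `build_toy_joint_test` are hypothetical names, frozen dataclasses stand in for attrs, and the length check is an assumption rather than documented behavior.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class ToyJointTest:
    """Hypothetical frozen container for joint test results."""
    terms: Tuple[str, ...]
    df1: Tuple[float, ...]
    statistic: Tuple[float, ...]
    p_value: Tuple[float, ...]
    test_type: str = "F"
    df2: Optional[Tuple[float, ...]] = None


def build_toy_joint_test(terms, df1, statistic, p_value, *, test_type="F", df2=None):
    """Hypothetical builder: check the test type, df2, and per-term lengths."""
    terms = tuple(terms)
    if test_type not in ("F", "chi2"):
        raise ValueError(f"unknown test_type: {test_type!r}")
    if test_type == "F" and df2 is None:
        raise ValueError("df2 is required for F-tests")
    columns = {"df1": df1, "statistic": statistic, "p_value": p_value}
    if df2 is not None:
        columns["df2"] = df2
    for name, values in columns.items():
        if len(values) != len(terms):
            raise ValueError(f"{name} must have one value per term")
    return ToyJointTest(
        terms=terms,
        df1=tuple(df1),
        statistic=tuple(statistic),
        p_value=tuple(p_value),
        test_type=test_type,
        df2=None if df2 is None else tuple(df2),
    )


chi2 = build_toy_joint_test(
    ("a", "b"), [2, 1], [8.5, 15.2], [0.014, 0.0001], test_type="chi2"
)

try:
    build_toy_joint_test(("a",), [2], [5.2], [0.007], test_type="F")
    f_without_df2_rejected = False
except ValueError:
    f_without_df2_rejected = True
```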

build_mee_resamples

build_mee_resamples(mee: MeeState | None, samples: np.ndarray | None, how: str) -> ResamplesState | None

Build ResamplesState from MEE inference if samples are available.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| mee | MeeState \| None | The MeeState from explore, or None. | required |
| samples | ndarray \| None | Raw resample array from dispatch_mee_inference, or None. | required |
| how | str | Inference method used ("boot", "perm", etc.). | required |

Returns:

| Type | Description |
| --- | --- |
| ResamplesState \| None | ResamplesState if boot/perm samples were saved, else None. |

build_mee_state

build_mee_state(grid: 'pl.DataFrame', estimate: np.ndarray, explore_formula: str, focal_var: str, mee_type: str, *, how: str = 'mem', effect_scale: str = 'link', L_matrix: np.ndarray | None = None, contrast_method: str | None = None, n_contrast_levels: int | None = None, link: str | None = None, L_matrix_link: np.ndarray | None = None, boot_X_plus: np.ndarray | None = None, boot_X_minus: np.ndarray | None = None, boot_delta: float | None = None, se: np.ndarray | None = None, df: np.ndarray | None = None, statistic: np.ndarray | None = None, p_value: np.ndarray | None = None, ci_lower: np.ndarray | None = None, ci_upper: np.ndarray | None = None, conf_level: float | None = None) -> MeeState

Build a MeeState from marginal effects computation.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| grid | 'pl.DataFrame' | Polars DataFrame with the evaluation grid. | required |
| estimate | ndarray | Point estimates for each grid row. | required |
| explore_formula | str | The explore formula string. | required |
| focal_var | str | The primary variable being explored. | required |
| mee_type | str | Type of effect ("means", "slopes", "contrasts"). | required |
| how | str | Averaging method: "mem" (Marginal Estimated Mean, balanced reference grid) or "ame" (Average Marginal Effect, g-computation over observed data). | 'mem' |
| effect_scale | str | Scale of estimates: "link" (linear predictor) or "response" (inverse-link / data scale). | 'link' |
| L_matrix | ndarray \| None | Design matrix for delta method inference (optional). Shape (n_estimates, n_coef). For EMMs this is X_ref. | None |
| contrast_method | str \| None | Original contrast type for multiplicity adjustment ("pairwise", "sequential", "poly", "treatment", "sum", "helmert", or None). | None |
| n_contrast_levels | int \| None | Number of EMM levels before contrasting (family size). | None |
| link | str \| None | Link function name for response-scale CI back-transformation. | None |
| L_matrix_link | ndarray \| None | Link-scale L_matrix for CI back-transformation. | None |
| boot_X_plus | ndarray \| None | Per-combo average design matrix at focal_var + delta/2. For exact response-scale bootstrap AME recomputation. | None |
| boot_X_minus | ndarray \| None | Per-combo average design matrix at focal_var - delta/2. | None |
| boot_delta | float \| None | Finite-difference step size for bootstrap slope recomputation. | None |
| se | ndarray \| None | Standard errors (optional, from .infer()). | None |
| df | ndarray \| None | Degrees of freedom (optional). | None |
| statistic | ndarray \| None | Test statistics (optional). | None |
| p_value | ndarray \| None | P-values (optional). | None |
| ci_lower | ndarray \| None | Lower CI bounds (optional). | None |
| ci_upper | ndarray \| None | Upper CI bounds (optional). | None |
| conf_level | float \| None | Confidence level (optional). | None |

Returns:

| Type | Description |
| --- | --- |
| MeeState | Frozen MeeState instance. |

Examples:

>>> import numpy as np
>>> import polars as pl
>>> grid = pl.DataFrame({"treatment": ["A", "B", "C"]})
>>> state = build_mee_state(
...     grid=grid,
...     estimate=np.array([1.0, 2.0, 3.0]),
...     explore_formula="treatment",
...     focal_var="treatment",
...     mee_type="means",
... )
>>> state.has_inference
False
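
The boot_X_plus, boot_X_minus, and boot_delta parameters point to a central finite difference for slope estimates. The sketch below shows that calculation on the link scale with plain lists. The function name and the restriction to a linear predictor are assumptions for illustration, not the package's implementation.

```python
def central_difference_slopes(X_plus, X_minus, coef, delta):
    """Slope per grid row: (X_plus @ coef - X_minus @ coef) / delta."""
    def predict(X):
        # Linear predictor for each design row.
        return [sum(x * b for x, b in zip(row, coef)) for row in X]

    upper = predict(X_plus)
    lower = predict(X_minus)
    return [(hi - lo) / delta for hi, lo in zip(upper, lower)]


coef = [1.0, 0.5]       # intercept and slope for x
delta = 0.01            # finite-difference step size
x_values = [0.0, 2.0]

# Design rows [1, x] evaluated at x + delta/2 and x - delta/2.
X_plus = [[1.0, x + delta / 2] for x in x_values]
X_minus = [[1.0, x - delta / 2] for x in x_values]

slopes = central_difference_slopes(X_plus, X_minus, coef, delta)
```

For a linear predictor the recovered slope equals the coefficient of x at every grid point, which makes this a convenient sanity check before applying an inverse link.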

build_model_spec

build_model_spec(formula: str, *, family: str = 'gaussian', link: str | None = None, method: str | None = None, response_var: str | None = None, fixed_terms: tuple[str, ...] | list[str] | None = None, random_terms: tuple[str, ...] | list[str] | None = None, has_random_effects: bool | None = None) -> ModelSpec

Build a ModelSpec from raw inputs.

This factory function handles defaults, validation, and inference of missing fields. In a full implementation, formula parsing would extract response_var, fixed_terms, and random_terms automatically.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| formula | str | The model formula string. | required |
| family | str | Distribution family (default: "gaussian"). | 'gaussian' |
| link | str \| None | Link function. If None, uses canonical link for family. | None |
| method | str \| None | Estimation method. If None, inferred from family and RE. | None |
| response_var | str \| None | Response variable name. Required if not parsed. | None |
| fixed_terms | tuple[str, ...] \| list[str] \| None | Fixed effect terms. Required if not parsed. | None |
| random_terms | tuple[str, ...] \| list[str] \| None | Random effect terms (default: empty tuple). | None |
| has_random_effects | bool \| None | Whether model has RE. Inferred from random_terms. | None |

Returns:

| Type | Description |
| --- | --- |
| ModelSpec | A validated ModelSpec instance. |

Examples:

>>> spec = build_model_spec(
...     formula="y ~ x + treatment",
...     response_var="y",
...     fixed_terms=["Intercept", "x", "treatment"],
... )
>>> spec.method
'ols'
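
The defaulting rules for link, method, and has_random_effects can be sketched as a small resolver. `resolve_defaults` is a hypothetical helper, and the link labels are assumed spellings of the standard canonical links. Only the gaussian case without random effects is resolved to a method, because that is the only method name ("ols") this reference shows.

```python
# Canonical links are standard GLM facts; the string labels are assumed spellings.
CANONICAL_LINKS = {"gaussian": "identity", "binomial": "logit", "poisson": "log"}


def resolve_defaults(family="gaussian", link=None, method=None, random_terms=None):
    """Hypothetical helper mirroring the defaulting rules described above."""
    random_terms = tuple(random_terms or ())
    has_random_effects = len(random_terms) > 0
    if link is None:
        link = CANONICAL_LINKS[family]
    if method is None:
        if family == "gaussian" and not has_random_effects:
            method = "ols"  # matches the documented example output
        else:
            # Method names for GLMs and mixed models are not given in this reference.
            raise NotImplementedError("method inference not sketched for this case")
    return {
        "family": family,
        "link": link,
        "method": method,
        "random_terms": random_terms,
        "has_random_effects": has_random_effects,
    }


resolved = resolve_defaults()
```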

build_model_spec_from_formula

build_model_spec_from_formula(formula: str, *, family: str = 'gaussian', link: str | None = None, method: str | None = None, structure: FormulaStructure) -> ModelSpec

Build ModelSpec from a pre-parsed formula structure and resolve defaults.

The caller must parse the formula into a FormulaStructure first (via extract_formula_structure). This keeps containers/ free of imports from formula/.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| formula | str | R-style model formula (e.g., "y ~ x + (1\|group)"). | required |
| family | str | Distribution family (default: "gaussian"). | 'gaussian' |
| link | str \| None | Link function. If None, uses canonical link for family. | None |
| method | str \| None | Estimation method. If None, inferred from family and RE presence. Validated against family/RE constraints if specified. | None |
| structure | FormulaStructure | Pre-parsed formula structure from extract_formula_structure(formula). | required |

Returns:

| Type | Description |
| --- | --- |
| ModelSpec | A validated ModelSpec instance. |

Examples:

>>> from parse import extract_formula_structure
>>> s = extract_formula_structure("y ~ x + treatment")
>>> spec = build_model_spec_from_formula("y ~ x + treatment", structure=s)
>>> spec.method
'ols'

build_params_dataframe

build_params_dataframe(bundle: DataBundle, fit: FitState, params_inference: InferenceState | None) -> pl.DataFrame

Build the .params DataFrame from fit state.

Column set varies by inference method.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| bundle | DataBundle | Data bundle containing X_names (coefficient labels). | required |
| fit | FitState | Fit state containing coef (coefficient estimates). | required |
| params_inference | InferenceState \| None | Optional inference state with SE, CI, p-values. | required |

Returns:

| Type | Description |
| --- | --- |
| DataFrame | DataFrame with term, estimate, and method-appropriate inference columns. |

build_params_resamples

build_params_resamples(inference: InferenceState | None, fit_coef: np.ndarray, x_names: tuple[str, ...], how: str) -> ResamplesState | None

Build ResamplesState from params inference if samples are available.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| inference | InferenceState \| None | The InferenceState from params inference, or None. | required |
| fit_coef | ndarray | Coefficient estimates from the FitState. | required |
| x_names | tuple[str, ...] | Design matrix column names from the DataBundle. | required |
| how | str | Inference method used ("boot", "perm", etc.). | required |

Returns:

| Type | Description |
| --- | --- |
| ResamplesState \| None | ResamplesState if boot/perm samples were saved, else None. |

build_prediction_state

build_prediction_state(fitted: np.ndarray, *, link: np.ndarray | None = None, X_pred: np.ndarray | None = None, config: PredictionConfig | None = None, se: np.ndarray | None = None, ci_lower: np.ndarray | None = None, ci_upper: np.ndarray | None = None, interval_type: str | None = None, conf_level: float | None = None, grid: 'pl.DataFrame | None' = None) -> PredictionState

Build a PredictionState from prediction computation.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| fitted | ndarray | Predicted values on response scale. | required |
| link | ndarray \| None | Predicted values on link scale (for GLM/GLMM). | None |
| X_pred | ndarray \| None | Design matrix used for predictions. Stored so that .infer() can compute delta-method SEs on the correct X. | None |
| config | PredictionConfig \| None | Prediction configuration for bootstrap replay. | None |
| se | ndarray \| None | Standard errors of predictions. | None |
| ci_lower | ndarray \| None | Lower interval bounds. | None |
| ci_upper | ndarray \| None | Upper interval bounds. | None |
| interval_type | str \| None | Type of interval ("confidence" or "prediction"). | None |
| conf_level | float \| None | Confidence level for intervals. | None |
| grid | 'pl.DataFrame \| None' | Grid DataFrame for formula-mode predictions. When present, build_predictions_dataframe() prepends these columns. | None |

Returns:

| Type | Description |
| --- | --- |
| PredictionState | Frozen PredictionState instance. |

Examples:

>>> import numpy as np
>>> state = build_prediction_state(
...     fitted=np.array([1.0, 2.0, 3.0]),
... )
>>> state.has_inference
False
>>> # With inference
>>> state = build_prediction_state(
...     fitted=np.array([1.0, 2.0, 3.0]),
...     se=np.array([0.1, 0.1, 0.1]),
...     ci_lower=np.array([0.8, 1.8, 2.8]),
...     ci_upper=np.array([1.2, 2.2, 3.2]),
...     interval_type="confidence",
...     conf_level=0.95,
... )
>>> state.has_inference
True

build_predictions_dataframe

build_predictions_dataframe(pred: PredictionState) -> pl.DataFrame

Build the .predictions DataFrame from prediction state.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| pred | PredictionState | Prediction state with fitted values and optional link-scale, inference, and CV columns. | required |

Returns:

| Type | Description |
| --- | --- |
| DataFrame | DataFrame with optional grid columns (formula mode), fitted, optional link, inference columns, and optional CV columns. |

build_resamples_dataframe

build_resamples_dataframe(rs: ResamplesState) -> pl.DataFrame

Build a long-format DataFrame of raw resampled values.

Returns one row per (resample, term) combination with the raw resampled value — coefficient estimates for bootstrap, null t-statistics for permutation.

Columns: resample (int), term (str), value (float).
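
The long-format layout described above amounts to flattening an (n_resamples, k) array into one row per (resample, term) pair. The sketch below does this with plain Python lists rather than polars. `to_long_rows` is a hypothetical name, and the zero-based resample index is an assumption.

```python
def to_long_rows(samples, names):
    """Flatten an (n_resamples, k) nested list into long-format rows."""
    rows = []
    for i, row in enumerate(samples):
        if len(row) != len(names):
            raise ValueError("each resample needs one value per term")
        for name, value in zip(names, row):
            rows.append({"resample": i, "term": name, "value": float(value)})
    return rows


names = ("Intercept", "x")
samples = [[1.02, 0.48], [0.97, 0.53], [1.05, 0.51]]  # 3 resamples x 2 terms
rows = to_long_rows(samples, names)
```

With 3 resamples and 2 terms the result has 6 rows, which is the n_resamples × k row count the function documents.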

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| rs | ResamplesState | Frozen ResamplesState from bootstrap or permutation inference. | required |

Returns:

| Type | Description |
| --- | --- |
| DataFrame | Polars DataFrame with n_resamples × k rows, where k is the number of terms/effects. |

Examples:

>>> df = build_resamples_dataframe(rs)
>>> df.columns
['resample', 'term', 'value']
>>> df.shape
(1000, 3)  # 100 resamples × 10 terms

build_resamples_state

build_resamples_state(*, samples: NDArray[np.floating], observed: NDArray[np.floating], names: tuple[str, ...] | list[str], method: str, n_resamples: int, context: str) -> ResamplesState

Build a ResamplesState from resampling results.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| samples | NDArray[floating] | Resampled statistics array, shape (n_resamples, k). | required |
| observed | NDArray[floating] | Observed statistics, shape (k,). | required |
| names | tuple[str, ...] \| list[str] | Term/effect names corresponding to columns of samples. | required |
| method | str | Resampling method ("boot" or "perm"). | required |
| n_resamples | int | Number of resamples. | required |
| context | str | What was resampled ("params" or "effects"). | required |

Returns:

| Type | Description |
| --- | --- |
| ResamplesState | Frozen ResamplesState instance. |

Examples:

>>> import numpy as np
>>> state = build_resamples_state(
...     samples=np.random.randn(100, 2),
...     observed=np.array([1.0, 2.0]),
...     names=("Intercept", "x"),
...     method="boot",
...     n_resamples=100,
...     context="params",
... )
>>> state.method
'boot'

build_simulation_inference_state

build_simulation_inference_state(sim_type: str, n_sims: int, *, sim_mean: np.ndarray | None = None, sim_sd: np.ndarray | None = None, sim_quantiles: dict[str, np.ndarray] | None = None, power: dict[str, float] | None = None, coverage: dict[str, float] | None = None, bias: dict[str, float] | None = None, rmse: dict[str, float] | None = None, alpha: float = 0.05, true_coef: dict[str, float] | None = None) -> SimulationInferenceState

Build a SimulationInferenceState from computed values.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| sim_type | str | Type of simulation ("post_fit" or "power_analysis"). | required |
| n_sims | int | Number of simulations. | required |
| sim_mean | ndarray \| None | Mean of simulated values per observation. | None |
| sim_sd | ndarray \| None | SD of simulated values per observation. | None |
| sim_quantiles | dict[str, ndarray] \| None | Dict of quantile name -> array mappings. | None |
| power | dict[str, float] \| None | Dict of term name -> power mappings. | None |
| coverage | dict[str, float] \| None | Dict of term name -> coverage mappings. | None |
| bias | dict[str, float] \| None | Dict of term name -> bias mappings. | None |
| rmse | dict[str, float] \| None | Dict of term name -> RMSE mappings. | None |
| alpha | float | Significance level for power calculation. | 0.05 |
| true_coef | dict[str, float] \| None | True coefficient values for coverage/bias. | None |

Returns:

| Type | Description |
| --- | --- |
| SimulationInferenceState | Frozen SimulationInferenceState instance. |

Examples:

>>> import numpy as np
>>> state = build_simulation_inference_state(
...     sim_type="post_fit",
...     n_sims=100,
...     sim_mean=np.array([1.0, 2.0, 3.0]),
...     sim_sd=np.array([0.1, 0.2, 0.3]),
... )
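
For a power-analysis simulation, the power, coverage, bias, and rmse dictionaries hold one summary per term. The sketch below computes those summaries for a single term from toy per-simulation results, using the conventional definitions. `summarize_term` is a hypothetical helper, and the package may compute these quantities differently.

```python
from math import sqrt


def summarize_term(estimates, p_values, ci_lower, ci_upper, true_value, alpha=0.05):
    """Conventional simulation summaries for one coefficient."""
    n = len(estimates)
    bias = sum(estimates) / n - true_value
    rmse = sqrt(sum((e - true_value) ** 2 for e in estimates) / n)
    # Power: share of simulations rejecting the null at level alpha.
    power = sum(p < alpha for p in p_values) / n
    # Coverage: share of intervals that contain the true value.
    coverage = sum(lo <= true_value <= hi for lo, hi in zip(ci_lower, ci_upper)) / n
    return {"bias": bias, "rmse": rmse, "power": power, "coverage": coverage}


summary = summarize_term(
    estimates=[0.4, 0.6, 0.5, 0.7],
    p_values=[0.03, 0.01, 0.20, 0.04],
    ci_lower=[0.1, 0.3, -0.1, 0.55],
    ci_upper=[0.7, 0.9, 1.1, 0.85],
    true_value=0.5,
)
```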

build_simulation_spec

build_simulation_spec(n: int, *, distributions: dict[str, Distribution] | None = None, coef: dict[str, float] | None = None, sigma: float = 1.0, re_spec: dict[str, VaryingSpec] | None = None, seed: int | None = None) -> SimulationSpec

Build a SimulationSpec for data generation.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| n | int | Total number of observations. | required |
| distributions | dict[str, Distribution] \| None | Variable name -> Distribution mappings. | None |
| coef | dict[str, float] \| None | Coefficient name -> value mappings. | None |
| sigma | float | Residual standard deviation. | 1.0 |
| re_spec | dict[str, VaryingSpec] \| None | Grouping variable -> VaryingSpec mappings. | None |
| seed | int \| None | Random seed for reproducibility. | None |

Returns:

| Type | Description |
| --- | --- |
| SimulationSpec | SimulationSpec instance. |

Examples:

>>> spec = build_simulation_spec(n=100)

build_simulation_spec_from_formula

build_simulation_spec_from_formula(formula: str, n: int, *, distributions: dict[str, Distribution] | None = None, coef: dict[str, float] | None = None, sigma: float = 1.0, seed: int | None = None) -> SimulationSpec

Build SimulationSpec from formula with defaults for unspecified variables.

Parses the formula to identify variable types and creates appropriate default distributions for variables not explicitly specified.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| formula | str | Model formula (e.g., "y ~ x + factor(group) + (1\|subject)"). | required |
| n | int | Number of observations to generate. | required |
| distributions | dict[str, Distribution] \| None | User-provided distributions for specific variables. | None |
| coef | dict[str, float] \| None | True coefficient values (defaults to all zeros). | None |
| sigma | float | Residual standard deviation. | 1.0 |
| seed | int \| None | Random seed. | None |

Returns:

| Type | Description |
| --- | --- |
| SimulationSpec | SimulationSpec ready for data generation. |

Examples:

>>> spec = build_simulation_spec_from_formula("y ~ x + factor(group)", n=100)
>>> spec.n
100

build_simulations_dataframe

build_simulations_dataframe(simulations: pl.DataFrame, sim_inference: SimulationInferenceState | None) -> pl.DataFrame

Build the .simulations DataFrame with optional inference columns.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| simulations | DataFrame | Base simulations DataFrame (generated data or sim columns). | required |
| sim_inference | SimulationInferenceState \| None | Optional simulation inference state with summary stats. | required |

Returns:

| Type | Description |
| --- | --- |
| DataFrame | DataFrame with simulation data and optional sim_mean, sim_sd, and quantile columns. |

build_varying_corr_dataframe

build_varying_corr_dataframe(varying_spread: VaryingSpreadState) -> pl.DataFrame

Build the .varying_corr DataFrame from random effect correlations.

Extracts correlation entries from the VaryingSpreadState rho dict into a tidy DataFrame. Returns an empty DataFrame (with the correct schema) when no correlations are present (e.g., intercept-only or diagonal RE structures).
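
The extraction step can be sketched as splitting each "effect1:effect2" key into two columns. The sketch uses plain column lists instead of polars. `corr_table` is a hypothetical name, and taking the group label as an argument is an assumption, since this reference does not say where the group column comes from.

```python
SCHEMA = ("group", "effect1", "effect2", "corr")


def corr_table(rho, group):
    """Turn {"effect1:effect2": corr} into tidy column lists."""
    table = {name: [] for name in SCHEMA}
    for key, corr in (rho or {}).items():
        # Split on the first colon only; effect names containing ":" would be ambiguous.
        effect1, effect2 = key.split(":", 1)
        table["group"].append(group)
        table["effect1"].append(effect1)
        table["effect2"].append(effect2)
        table["corr"].append(float(corr))
    return table


with_corr = corr_table({"Intercept:x": -0.25}, group="subject")
empty = corr_table(None, group="subject")  # no correlations: empty columns, same schema
```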

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| varying_spread | VaryingSpreadState | Variance component state containing rho (dict mapping "effect1:effect2" keys to correlation values). | required |

Returns:

| Type | Description |
| --- | --- |
| DataFrame | DataFrame with columns group, effect1, effect2, corr. |

build_varying_offsets_dataframe

build_varying_offsets_dataframe(varying_offsets: VaryingState) -> pl.DataFrame

Build the .varying_offsets DataFrame from varying state.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| varying_offsets | VaryingState | Varying state with grid, effects, and optional PIs. | required |

Returns:

| Type | Description |
| --- | --- |
| DataFrame | DataFrame with group, level, effect columns, and optional prediction interval columns. |

build_varying_params_dataframe

build_varying_params_dataframe(bundle: DataBundle, fit: FitState, varying_offsets: VaryingState) -> pl.DataFrame

Build the .varying_params DataFrame (population + offsets).

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| bundle | DataBundle | Data bundle containing X_names for population param lookup. | required |
| fit | FitState | Fit state containing coef (population coefficients). | required |
| varying_offsets | VaryingState | Varying state with grid and per-group effects. | required |

Returns:

| Type | Description |
| --- | --- |
| DataFrame | DataFrame with group, level, and effect columns where each value is population_param + BLUP. |
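
The "population_param + BLUP" rule can be illustrated with plain Python structures. The variable names and toy values below are assumptions for the example; the real function reads them from the DataBundle, FitState, and VaryingState containers.

```python
x_names = ("Intercept", "x")   # design matrix column names
coef = (2.0, 0.5)              # population (fixed-effect) coefficients
levels = ["S1", "S2", "S3"]    # group levels
offsets = {                    # per-level BLUPs for each varying effect
    "Intercept": [0.5, -0.3, 0.1],
    "x": [0.0, 0.1, -0.1],
}

# Look up each population coefficient by name, then add the per-level offset.
population = dict(zip(x_names, coef))
varying_params = {"level": levels}
for effect, blups in offsets.items():
    varying_params[effect] = [population[effect] + b for b in blups]
```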

build_varying_spec

build_varying_spec(n: int, sd: float = 1.0, *, slope_sds: dict[str, float] | None = None, correlations: dict[tuple[str, str], float] | None = None, n_per: int | None = None) -> VaryingSpec

Build a VaryingSpec for random effect structure.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| n | int | Number of groups. | required |
| sd | float | Standard deviation for random intercept. | 1.0 |
| slope_sds | dict[str, float] \| None | Dictionary of term -> slope SD mappings. | None |
| correlations | dict[tuple[str, str], float] \| None | Dictionary of (term1, term2) -> correlation mappings. | None |
| n_per | int \| None | Number of units per group for nested effects. | None |

Returns:

| Type | Description |
| --- | --- |
| VaryingSpec | VaryingSpec instance. |

Examples:

>>> spec = build_varying_spec(n=50, sd=0.3)

build_varying_spread_dataframe

build_varying_spread_dataframe(varying_spread: VaryingSpreadState) -> pl.DataFrame

Build the .varying_spread DataFrame from variance components.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| varying_spread | VaryingSpreadState | Variance component state with components DataFrame and optional CI information. | required |

Returns:

| Type | Description |
| --- | --- |
| DataFrame | DataFrame with component, estimate, and optional ci_lower, ci_upper, ci_method columns. |

build_varying_spread_state

build_varying_spread_state(components: 'pl.DataFrame', sigma2: float, tau2: dict[str, float], *, rho: dict[str, float] | None = None, icc: float | None = None, ci_lower: dict[str, float] | None = None, ci_upper: dict[str, float] | None = None, conf_level: float | None = None, ci_method: str | None = None) -> VaryingSpreadState

Build a VaryingSpreadState from variance component estimates.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| components | 'pl.DataFrame' | Polars DataFrame with component estimates. | required |
| sigma2 | float | Residual variance. | required |
| tau2 | dict[str, float] | Dict mapping effect names to variance estimates. | required |
| rho | dict[str, float] \| None | Dict mapping effect pairs to correlations (optional). | None |
| icc | float \| None | Intraclass correlation coefficient (optional). | None |
| ci_lower | dict[str, float] \| None | Lower CI bounds (optional, from .infer()). | None |
| ci_upper | dict[str, float] \| None | Upper CI bounds (optional, from .infer()). | None |
| conf_level | float \| None | Confidence level (optional). | None |
| ci_method | str \| None | CI method used (optional). | None |

Returns:

| Type | Description |
| --- | --- |
| VaryingSpreadState | Frozen VaryingSpreadState instance. |

Examples:

>>> import polars as pl
>>> components = pl.DataFrame({
...     "component": ["sigma2", "tau2_Intercept", "icc"],
...     "estimate": [1.0, 0.5, 0.33],
... })
>>> state = build_varying_spread_state(
...     components=components,
...     sigma2=1.0,
...     tau2={"Intercept": 0.5},
...     icc=0.33,
... )
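The values in this example are consistent with the usual random-intercept intraclass correlation, tau2 / (tau2 + sigma2) = 0.5 / 1.5 ≈ 0.33. The sketch below works through that arithmetic. It assumes a single random intercept, and `icc_from_variances` is a hypothetical helper rather than part of the documented API; the library may compute `icc` differently for richer random-effect structures.

```python
def icc_from_variances(tau2: float, sigma2: float) -> float:
    """Intraclass correlation for a single random intercept (illustrative helper)."""
    total = tau2 + sigma2
    if total <= 0:
        raise ValueError("total variance must be positive")
    return tau2 / total


icc = icc_from_variances(tau2=0.5, sigma2=1.0)
print(round(icc, 2))  # 0.33
```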

build_varying_state

build_varying_state(grid: 'pl.DataFrame', effects: dict[str, np.ndarray], grouping_var: str, n_groups: int, *, pi_lower: dict[str, np.ndarray] | None = None, pi_upper: dict[str, np.ndarray] | None = None, conf_level: float | None = None) -> VaryingState

Build a VaryingState from computed BLUPs.

Parameters:

NameTypeDescriptionDefault
grid'pl.DataFrame'Polars DataFrame with group identifiers.required
effectsdict[str, ndarray]Dict mapping effect names to BLUP arrays.required
grouping_varstrName of the grouping variable.required
n_groupsintNumber of groups.required
pi_lowerdict[str, ndarray] | NoneLower prediction interval bounds (optional).None
pi_upperdict[str, ndarray] | NoneUpper prediction interval bounds (optional).None
conf_levelfloat | NoneConfidence level for intervals (optional).None

Returns:

TypeDescription
VaryingStateFrozen VaryingState instance.

Examples:

>>> import numpy as np
>>> import polars as pl
>>> state = build_varying_state(
...     grid=pl.DataFrame({"subject": ["S1", "S2", "S3"]}),
...     effects={"Intercept": np.array([0.5, -0.3, 0.1])},
...     grouping_var="subject",
...     n_groups=3,
... )

extract_mee_names

extract_mee_names(mee: MeeState) -> tuple[str, ...]

Extract human-readable names from a MeeState.

Parameters:

NameTypeDescriptionDefault
meeMeeStateThe MeeState to extract names from.required

Returns:

TypeDescription
tuple[str, ...]Tuple of effect names for each estimate.

get_varying_random_terms

get_varying_random_terms(spec: VaryingSpec) -> tuple[str, ...]

Get all random terms (Intercept + slope terms) for a VaryingSpec.

Parameters:

NameTypeDescriptionDefault
specVaryingSpecA VaryingSpec instance.required

Returns:

TypeDescription
tuple[str, ...]Tuple of random term names, starting with “Intercept”.

Modules

dataframes

DataFrame builders for user-facing property accessors.

Pure functions that assemble Polars DataFrames from internal state containers. Each builder corresponds to a model property (.params, .effects, etc.) and contains only the DataFrame construction logic that was previously inlined in core.py property methods.

Functions:

NameDescription
build_effects_dataframeBuild the .effects DataFrame from marginal effects state.
build_joint_test_dataframeBuild an ANOVA-style DataFrame from joint test results.
build_params_dataframeBuild the .params DataFrame from fit state.
build_predictions_dataframeBuild the .predictions DataFrame from prediction state.
build_simulations_dataframeBuild the .simulations DataFrame with optional inference columns.
build_varying_corr_dataframeBuild the .varying_corr DataFrame from random effect correlations.
build_varying_offsets_dataframeBuild the .varying_offsets DataFrame from varying state.
build_varying_params_dataframeBuild the .varying_params DataFrame (population + offsets).
build_varying_spread_dataframeBuild the .varying_spread DataFrame from variance components.

Functions

build_effects_dataframe
build_effects_dataframe(mee: MeeState, method: str | None = None) -> pl.DataFrame

Build the .effects DataFrame from marginal effects state.

Column set varies by inference method: bootstrap excludes p_value, permutation excludes ci_lower/ci_upper.

Parameters:

NameTypeDescriptionDefault
meeMeeStateMeeState with grid, estimates, and optional inference.required
methodstr | NoneInference method ("asymp", "boot", "perm", or None). Controls which inference columns are included.None

Returns:

TypeDescription
DataFrameDataFrame with grid columns, estimate, and method-appropriate inference columns.

build_joint_test_dataframe
build_joint_test_dataframe(state: JointTestState) -> pl.DataFrame

Build an ANOVA-style DataFrame from joint test results.

Parameters:

NameTypeDescriptionDefault
stateJointTestStateJointTestState with terms, df, statistics, and p-values.required

Returns:

TypeDescription
DataFrameDataFrame with term, df1, optional df2, f_ratio or Chisq, and p_value columns.

build_params_dataframe
build_params_dataframe(bundle: DataBundle, fit: FitState, params_inference: InferenceState | None) -> pl.DataFrame

Build the .params DataFrame from fit state.

Column set varies by inference method:

Parameters:

NameTypeDescriptionDefault
bundleDataBundleData bundle containing X_names (coefficient labels).required
fitFitStateFit state containing coef (coefficient estimates).required
params_inferenceInferenceState | NoneOptional inference state with SE, CI, p-values.required

Returns:

TypeDescription
DataFrameDataFrame with term, estimate, and method-appropriate inference columns.

build_predictions_dataframe
build_predictions_dataframe(pred: PredictionState) -> pl.DataFrame

Build the .predictions DataFrame from prediction state.

Parameters:

NameTypeDescriptionDefault
predPredictionStatePrediction state with fitted values and optional link-scale, inference, and CV columns.required

Returns:

TypeDescription
DataFrameDataFrame with optional grid columns (formula mode), fitted, optional link, inference columns, and optional CV columns.

build_simulations_dataframe
build_simulations_dataframe(simulations: pl.DataFrame, sim_inference: SimulationInferenceState | None) -> pl.DataFrame

Build the .simulations DataFrame with optional inference columns.

Parameters:

NameTypeDescriptionDefault
simulationsDataFrameBase simulations DataFrame (generated data or sim columns).required
sim_inferenceSimulationInferenceState | NoneOptional simulation inference state with summary stats.required

Returns:

TypeDescription
DataFrameDataFrame with simulation data and optional sim_mean, sim_sd, and quantile columns.

build_varying_corr_dataframe
build_varying_corr_dataframe(varying_spread: VaryingSpreadState) -> pl.DataFrame

Build the .varying_corr DataFrame from random effect correlations.

Extracts correlation entries from the VaryingSpreadState rho dict into a tidy DataFrame. Returns an empty DataFrame (with the correct schema) when no correlations are present (e.g., intercept-only or diagonal RE structures).

Parameters:

NameTypeDescriptionDefault
varying_spreadVaryingSpreadStateVariance component state containing rho (dict mapping "effect1:effect2" keys to correlation values).required

Returns:

TypeDescription
DataFrameDataFrame with columns group, effect1, effect2, corr.
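As a rough illustration of the tidy layout described for build_varying_corr_dataframe, the sketch below splits "effect1:effect2" keys into rows using plain Python. The helper name `rho_to_rows` and the explicit `group` argument are assumptions made for this example; how the library derives the group column, and how it handles effect names that themselves contain a colon, is not stated here.

```python
def rho_to_rows(rho: dict, group: str) -> list:
    """Split "effect1:effect2" keys into tidy rows (illustrative, plain Python)."""
    rows = []
    for key, corr in rho.items():
        # Assumes effect names do not themselves contain ":".
        effect1, effect2 = key.split(":", 1)
        rows.append({"group": group, "effect1": effect1, "effect2": effect2, "corr": corr})
    return rows


rows = rho_to_rows({"Intercept:x": -0.25}, group="subject")
print(rows)
print(rho_to_rows({}, group="subject"))  # [] when no correlations are present
```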
build_varying_offsets_dataframe
build_varying_offsets_dataframe(varying_offsets: VaryingState) -> pl.DataFrame

Build the .varying_offsets DataFrame from varying state.

Parameters:

NameTypeDescriptionDefault
varying_offsetsVaryingStateVarying state with grid, effects, and optional PIs.required

Returns:

TypeDescription
DataFrameDataFrame with group, level, effect columns, and optional prediction interval columns.

build_varying_params_dataframe
build_varying_params_dataframe(bundle: DataBundle, fit: FitState, varying_offsets: VaryingState) -> pl.DataFrame

Build the .varying_params DataFrame (population + offsets).

Parameters:

NameTypeDescriptionDefault
bundleDataBundleData bundle containing X_names for population param lookup.required
fitFitStateFit state containing coef (population coefficients).required
varying_offsetsVaryingStateVarying state with grid and per-group effects.required

Returns:

TypeDescription
DataFrameDataFrame with group, level, and effect columns where each value is population_param + BLUP.

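The "population_param + BLUP" rule can be sketched with plain Python lists. This is only an illustration: `varying_params` is a hypothetical helper, and it assumes each effect name matches an entry in the coefficient labels.

```python
def varying_params(x_names: list, coef: list, effects: dict) -> dict:
    """Add each group's BLUP to the matching population coefficient (illustrative)."""
    population = dict(zip(x_names, coef))
    return {
        name: [population[name] + blup for blup in blups]
        for name, blups in effects.items()
    }


params = varying_params(
    x_names=["Intercept", "x"],
    coef=[2.0, 0.5],
    effects={"Intercept": [0.5, -0.25, 0.0]},  # one BLUP per group
)
print(params)  # {'Intercept': [2.5, 1.75, 2.0]}
```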
build_varying_spread_dataframe
build_varying_spread_dataframe(varying_spread: VaryingSpreadState) -> pl.DataFrame

Build the .varying_spread DataFrame from variance components.

Parameters:

NameTypeDescriptionDefault
varying_spreadVaryingSpreadStateVariance component state with components DataFrame and optional CI information.required

Returns:

TypeDescription
DataFrameDataFrame with component, estimate, and optional ci_lower, ci_upper, ci_method columns.

resamples

Builder functions for resamples-related containers.

Provides constructors for ResamplesState and helpers for building resamples from inference results. Moved from state.py to keep modules under the 800-line limit.

Functions:

NameDescription
build_mee_resamplesBuild ResamplesState from MEE inference if samples are available.
build_params_resamplesBuild ResamplesState from params inference if samples are available.
build_resamples_dataframeBuild a long-format DataFrame of raw resampled values.
build_resamples_stateBuild a ResamplesState from resampling results.
extract_mee_namesExtract human-readable names from a MeeState.

Functions

build_mee_resamples
build_mee_resamples(mee: MeeState | None, samples: np.ndarray | None, how: str) -> ResamplesState | None

Build ResamplesState from MEE inference if samples are available.

Parameters:

NameTypeDescriptionDefault
meeMeeState | NoneThe MeeState from explore, or None.required
samplesndarray | NoneRaw resample array from dispatch_mee_inference, or None.required
howstrInference method used ("boot", "perm", etc.).required

Returns:

TypeDescription
ResamplesState | NoneResamplesState if boot/perm samples were saved, else None.
build_params_resamples
build_params_resamples(inference: InferenceState | None, fit_coef: np.ndarray, x_names: tuple[str, ...], how: str) -> ResamplesState | None

Build ResamplesState from params inference if samples are available.

Parameters:

NameTypeDescriptionDefault
inferenceInferenceState | NoneThe InferenceState from params inference, or None.required
fit_coefndarrayCoefficient estimates from the FitState.required
x_namestuple[str, ...]Design matrix column names from the DataBundle.required
howstrInference method used ("boot", "perm", etc.).required

Returns:

TypeDescription
ResamplesState | NoneResamplesState if boot/perm samples were saved, else None.
build_resamples_dataframe
build_resamples_dataframe(rs: ResamplesState) -> pl.DataFrame

Build a long-format DataFrame of raw resampled values.

Returns one row per (resample, term) combination with the raw resampled value — coefficient estimates for bootstrap, null t-statistics for permutation.

Columns: resample (int), term (str), value (float).

Parameters:

NameTypeDescriptionDefault
rsResamplesStateFrozen ResamplesState from bootstrap or permutation inference.required

Returns:

TypeDescription
DataFramePolars DataFrame with n_resamples × k rows, where k is the number of terms/effects.

Examples:

>>> df = build_resamples_dataframe(rs)
>>> df.columns
['resample', 'term', 'value']
>>> df.shape
(1000, 3)  # 100 resamples × 10 terms
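The long-format reshaping can be sketched without Polars. The helper `to_long` below is hypothetical and uses nested lists in place of the (n_resamples, k) array; numbering resamples from 0 is an assumption.

```python
def to_long(samples: list, names: list) -> list:
    """Flatten an (n_resamples, k) table into one row per (resample, term)."""
    rows = []
    for i, draw in enumerate(samples):
        for term, value in zip(names, draw):
            rows.append({"resample": i, "term": term, "value": value})
    return rows


samples = [[1.0, 0.4], [1.1, 0.6], [0.9, 0.5]]  # 3 resamples, 2 terms
rows = to_long(samples, names=["Intercept", "x"])
print(len(rows))  # 6 rows: 3 resamples x 2 terms
print(rows[0])
```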
build_resamples_state
build_resamples_state(*, samples: NDArray[np.floating], observed: NDArray[np.floating], names: tuple[str, ...] | list[str], method: str, n_resamples: int, context: str) -> ResamplesState

Build a ResamplesState from resampling results.

Parameters:

NameTypeDescriptionDefault
samplesNDArray[floating]Resampled statistics array, shape (n_resamples, k).required
observedNDArray[floating]Observed statistics, shape (k,).required
namestuple[str, ...] | list[str]Term/effect names corresponding to columns of samples.required
methodstrResampling method ("boot" or "perm").required
n_resamplesintNumber of resamples.required
contextstrWhat was resampled ("params" or "effects").required

Returns:

TypeDescription
ResamplesStateFrozen ResamplesState instance.

Examples:

>>> import numpy as np
>>> state = build_resamples_state(
...     samples=np.random.randn(100, 2),
...     observed=np.array([1.0, 2.0]),
...     names=("Intercept", "x"),
...     method="boot",
...     n_resamples=100,
...     context="params",
... )
>>> state.method
'boot'
extract_mee_names
extract_mee_names(mee: MeeState) -> tuple[str, ...]

Extract human-readable names from a MeeState.

Parameters:

NameTypeDescriptionDefault
meeMeeStateThe MeeState to extract names from.required

Returns:

TypeDescription
tuple[str, ...]Tuple of effect names for each estimate.

results

Result DataFrame assembly utilities.

Shared helpers for building user-facing DataFrames from internal state containers. These are used by the model property accessors (effects, predictions, etc.) to assemble Polars DataFrames with optional inference columns.

Functions:

NameDescription
append_inference_columnsAppend standard inference columns to a DataFrame if available.

Functions

append_inference_columns
append_inference_columns(df: pl.DataFrame, state: object, method: str | None = None) -> pl.DataFrame

Append standard inference columns to a DataFrame if available.

Checks state.has_inference and, when True, adds each inference column that is not None on state. Columns are added in canonical order: se, ci_lower, ci_upper, statistic, df, p_value.

When method is "perm", the ci_lower and ci_upper columns are excluded.

Parameters:

NameTypeDescriptionDefault
dfDataFrameBase DataFrame to augment (not mutated).required
stateobjectObject with a has_inference bool and optional se, statistic, df, p_value, ci_lower, ci_upper array attributes.required
methodstr | NoneInference method ("asymp", "boot", "perm", or None). Controls which columns are included.None

Returns:

TypeDescription
DataFrameA new DataFrame with inference columns appended (or the original DataFrame unchanged if inference is not available).
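The behaviour described for append_inference_columns (canonical order, skipping `None` attributes, dropping CI columns for "perm", and leaving the input untouched) can be sketched with a plain dict of columns standing in for a DataFrame. `append_columns` and the `SimpleNamespace` state are stand-ins invented for this example, not the library's implementation.

```python
from __future__ import annotations

from types import SimpleNamespace

CANONICAL_ORDER = ("se", "ci_lower", "ci_upper", "statistic", "df", "p_value")


def append_columns(columns: dict, state: object, method: str | None = None) -> dict:
    """Return a new column mapping with the available inference columns appended."""
    if not getattr(state, "has_inference", False):
        return columns
    out = dict(columns)  # shallow copy so the input mapping is not mutated
    for name in CANONICAL_ORDER:
        if method == "perm" and name in ("ci_lower", "ci_upper"):
            continue
        values = getattr(state, name, None)
        if values is not None:
            out[name] = list(values)
    return out


state = SimpleNamespace(
    has_inference=True, se=[0.1], statistic=[5.0], df=None,
    p_value=[0.001], ci_lower=[0.3], ci_upper=[0.7],
)
base = {"term": ["x"], "estimate": [0.5]}
print(list(append_columns(base, state, method="perm")))
# ['term', 'estimate', 'se', 'statistic', 'p_value']
```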

specs

Builder functions for specification containers.

Functions:

NameDescription
build_model_specBuild a ModelSpec from raw inputs.
build_model_spec_from_formulaBuild ModelSpec from a pre-parsed formula structure and resolve defaults.
build_simulation_specBuild a SimulationSpec for data generation.
build_simulation_spec_from_formulaBuild SimulationSpec from formula with defaults for unspecified variables.
build_varying_specBuild a VaryingSpec for random effect structure.
get_varying_random_termsGet all random terms (Intercept + slope terms) for a VaryingSpec.
strip_backticksRemove surrounding backtick quotes from a name.

Functions

build_model_spec
build_model_spec(formula: str, *, family: str = 'gaussian', link: str | None = None, method: str | None = None, response_var: str | None = None, fixed_terms: tuple[str, ...] | list[str] | None = None, random_terms: tuple[str, ...] | list[str] | None = None, has_random_effects: bool | None = None) -> ModelSpec

Build a ModelSpec from raw inputs.

This factory function handles defaults, validation, and inference of missing fields. In a full implementation, formula parsing would extract response_var, fixed_terms, and random_terms automatically.

Parameters:

NameTypeDescriptionDefault
formulastrThe model formula string.required
familystrDistribution family (default: "gaussian").'gaussian'
linkstr | NoneLink function. If None, uses canonical link for family.None
methodstr | NoneEstimation method. If None, inferred from family and RE.None
response_varstr | NoneResponse variable name. Required if not parsed.None
fixed_termstuple[str, ...] | list[str] | NoneFixed effect terms. Required if not parsed.None
random_termstuple[str, ...] | list[str] | NoneRandom effect terms (default: empty tuple).None
has_random_effectsbool | NoneWhether model has RE. Inferred from random_terms.None

Returns:

TypeDescription
ModelSpecA validated ModelSpec instance.

Examples:

>>> spec = build_model_spec(
...     formula="y ~ x + treatment",
...     response_var="y",
...     fixed_terms=["Intercept", "x", "treatment"],
... )
>>> spec.method
'ols'
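The default-resolution step can be pictured as a small builder that fills in missing fields and returns a frozen container. The sketch below uses a frozen dataclass as a stand-in for the attrs container. Only the "ols" result for a Gaussian model without random effects comes from the example above; the canonical links are standard GLM facts, and the other method labels are placeholders invented here.

```python
from __future__ import annotations

from dataclasses import dataclass

# Canonical links are standard; method labels other than "ols" are placeholders.
CANONICAL_LINK = {"gaussian": "identity", "binomial": "logit", "poisson": "log"}


@dataclass(frozen=True)
class SpecSketch:
    formula: str
    family: str
    link: str
    method: str
    random_terms: tuple
    has_random_effects: bool


def build_spec_sketch(formula, *, family="gaussian", link=None, method=None, random_terms=None):
    """Resolve defaults and return a frozen container (illustrative only)."""
    if family not in CANONICAL_LINK:
        raise ValueError(f"unknown family: {family!r}")
    terms = tuple(random_terms or ())
    has_re = bool(terms)
    if link is None:
        link = CANONICAL_LINK[family]
    if method is None:
        if has_re:
            method = "mixed-placeholder"
        elif family == "gaussian" and link == "identity":
            method = "ols"
        else:
            method = "glm-placeholder"
    return SpecSketch(formula, family, link, method, terms, has_re)


spec = build_spec_sketch("y ~ x + treatment")
print(spec.method, spec.link)  # ols identity
```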
build_model_spec_from_formula
build_model_spec_from_formula(formula: str, *, family: str = 'gaussian', link: str | None = None, method: str | None = None, structure: FormulaStructure) -> ModelSpec

Build ModelSpec from a pre-parsed formula structure and resolve defaults.

The caller must parse the formula into a FormulaStructure first (via extract_formula_structure). This keeps containers/ free of imports from formula/.

Parameters:

NameTypeDescriptionDefault
formulastrR-style model formula (e.g., "y ~ x + (1|group)").required
familystrDistribution family (default: "gaussian").'gaussian'
linkstr | NoneLink function. If None, uses canonical link for family.None
methodstr | NoneEstimation method. If None, inferred from family and RE presence. Validated against family/RE constraints if specified.None
structureFormulaStructurePre-parsed formula structure from extract_formula_structure(formula).required

Returns:

TypeDescription
ModelSpecA validated ModelSpec instance.

Examples:

>>> from parse import extract_formula_structure
>>> s = extract_formula_structure("y ~ x + treatment")
>>> spec = build_model_spec_from_formula("y ~ x + treatment", structure=s)
>>> spec.method
'ols'
build_simulation_spec
build_simulation_spec(n: int, *, distributions: dict[str, Distribution] | None = None, coef: dict[str, float] | None = None, sigma: float = 1.0, re_spec: dict[str, VaryingSpec] | None = None, seed: int | None = None) -> SimulationSpec

Build a SimulationSpec for data generation.

Parameters:

NameTypeDescriptionDefault
nintTotal number of observations.required
distributionsdict[str, Distribution] | NoneVariable name -> Distribution mappings.None
coefdict[str, float] | NoneCoefficient name -> value mappings.None
sigmafloatResidual standard deviation.1.0
re_specdict[str, VaryingSpec] | NoneGrouping variable -> VaryingSpec mappings.None
seedint | NoneRandom seed for reproducibility.None

Returns:

TypeDescription
SimulationSpecSimulationSpec instance.

Examples:

>>> spec = build_simulation_spec(n=100)
build_simulation_spec_from_formula
build_simulation_spec_from_formula(formula: str, n: int, *, distributions: dict[str, Distribution] | None = None, coef: dict[str, float] | None = None, sigma: float = 1.0, seed: int | None = None) -> SimulationSpec

Build SimulationSpec from formula with defaults for unspecified variables.

Parses the formula to identify variable types and creates appropriate default distributions for variables not explicitly specified.

Parameters:

NameTypeDescriptionDefault
formulastrModel formula (e.g., "y ~ x + factor(group) + (1|subject)").required
nintNumber of observations to generate.required
distributionsdict[str, Distribution] | NoneUser-provided distributions for specific variables.None
coefdict[str, float] | NoneTrue coefficient values (defaults to all zeros).None
sigmafloatResidual standard deviation.1.0
seedint | NoneRandom seed.None

Returns:

TypeDescription
SimulationSpecSimulationSpec ready for data generation.

Examples:

>>> spec = build_simulation_spec_from_formula("y ~ x + factor(group)", n=100)
>>> spec.n
100
build_varying_spec
build_varying_spec(n: int, sd: float = 1.0, *, slope_sds: dict[str, float] | None = None, correlations: dict[tuple[str, str], float] | None = None, n_per: int | None = None) -> VaryingSpec

Build a VaryingSpec for random effect structure.

Parameters:

NameTypeDescriptionDefault
nintNumber of groups.required
sdfloatStandard deviation for random intercept.1.0
slope_sdsdict[str, float] | NoneDictionary of term -> slope SD mappings.None
correlationsdict[tuple[str, str], float] | NoneDictionary of (term1, term2) -> correlation mappings.None
n_perint | NoneNumber of units per group for nested effects.None

Returns:

TypeDescription
VaryingSpecVaryingSpec instance.

Examples:

>>> spec = build_varying_spec(n=50, sd=0.3)
get_varying_random_terms
get_varying_random_terms(spec: VaryingSpec) -> tuple[str, ...]

Get all random terms (Intercept + slope terms) for a VaryingSpec.

Parameters:

NameTypeDescriptionDefault
specVaryingSpecA VaryingSpec instance.required

Returns:

TypeDescription
tuple[str, ...]Tuple of random term names, starting with “Intercept”.
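A minimal sketch of the "Intercept first, then slopes" ordering, assuming the slope terms are taken from the keys of `slope_sds` in insertion order. `random_terms` is a hypothetical stand-in that takes the mapping directly rather than a VaryingSpec.

```python
from __future__ import annotations


def random_terms(slope_sds: dict | None) -> tuple:
    """Return "Intercept" followed by any slope terms (illustrative)."""
    return ("Intercept", *(slope_sds or {}))


print(random_terms(None))                   # ('Intercept',)
print(random_terms({"x": 0.2, "z": 0.1}))   # ('Intercept', 'x', 'z')
```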
strip_backticks
strip_backticks(name: str) -> str

Remove surrounding backtick quotes from a name.

Parameters:

NameTypeDescriptionDefault
namestrA column name, possibly surrounded by backticks.required

Returns:

TypeDescription
strThe name with backticks stripped if present.
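One plausible reading of this behaviour is sketched below: a single pair of surrounding backticks is removed, and any other input is returned unchanged. `strip_backticks_sketch` is written for illustration and may differ from the library function in edge cases.

```python
def strip_backticks_sketch(name: str) -> str:
    """Drop one pair of surrounding backticks, if present (illustrative)."""
    if len(name) >= 2 and name.startswith("`") and name.endswith("`"):
        return name[1:-1]
    return name


print(strip_backticks_sketch("`my var`"))  # my var
print(strip_backticks_sketch("x"))         # x
```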

state

Builder functions for computation state containers.

Functions:

NameDescription
build_cv_stateBuild a CVState from cross-validation computation.
build_fit_stateBuild a FitState instance with validation.
build_inference_stateBuild an InferenceState from computed inference values.
build_joint_test_stateBuild a JointTestState from computed joint test values.
build_mee_stateBuild a MeeState from marginal effects computation.
build_prediction_stateBuild a PredictionState from prediction computation.
build_simulation_inference_stateBuild a SimulationInferenceState from computed values.
build_varying_spread_stateBuild a VaryingSpreadState from variance component estimates.
build_varying_stateBuild a VaryingState from computed BLUPs.

Functions

build_cv_state
build_cv_state(k: int, rmse: float, mae: float, r_squared: float, *, deviance: float | None = None, accuracy: float | None = None, sensitivity: float | None = None, specificity: float | None = None, f1: float | None = None, auc: float | None = None, fold_metrics: dict[str, np.ndarray] | None = None, oos_predictions: np.ndarray | None = None, oos_residuals: np.ndarray | None = None, fold_assignments: np.ndarray | None = None) -> CVState

Build a CVState from cross-validation computation.

Parameters:

NameTypeDescriptionDefault
kintNumber of folds used.required
rmsefloatRoot mean squared error.required
maefloatMean absolute error.required
r_squaredfloatCoefficient of determination.required
deviancefloat | NoneMean deviance (GLM only).None
accuracyfloat | NoneClassification accuracy (binomial only).None
sensitivityfloat | NoneSensitivity / true positive rate (binomial only).None
specificityfloat | NoneSpecificity / true negative rate (binomial only).None
f1float | NoneF1 score (binomial only).None
aucfloat | NoneArea under ROC curve (binomial only).None
fold_metricsdict[str, ndarray] | NonePer-fold metrics dictionary.None
oos_predictionsndarray | NoneOut-of-sample predictions.None
oos_residualsndarray | NoneOut-of-sample residuals.None
fold_assignmentsndarray | NoneArray indicating which fold each observation belongs to.None

Returns:

TypeDescription
CVStateFrozen CVState instance.

Examples:

>>> state = build_cv_state(
...     k=10,
...     rmse=0.523,
...     mae=0.412,
...     r_squared=0.891,
... )
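For orientation, the three required metrics have standard definitions that can be computed from out-of-sample predictions. The sketch below pools all observations into one calculation; whether the library pools or averages per fold is not stated here, and `cv_metrics` is a hypothetical helper.

```python
import math


def cv_metrics(y_true: list, y_pred: list) -> dict:
    """Compute RMSE, MAE, and R-squared from out-of-sample predictions."""
    n = len(y_true)
    residuals = [t - p for t, p in zip(y_true, y_pred)]
    ss_res = sum(r * r for r in residuals)
    mean_y = sum(y_true) / n
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return {
        "rmse": math.sqrt(ss_res / n),
        "mae": sum(abs(r) for r in residuals) / n,
        "r_squared": 1.0 - ss_res / ss_tot,
    }


metrics = cv_metrics([1.0, 2.0, 3.0, 4.0], [1.5, 1.5, 3.5, 3.5])
print(metrics)  # rmse 0.5, mae 0.5, r_squared approximately 0.8
```

The resulting values could then be passed to the builder as `rmse`, `mae`, and `r_squared`.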
build_fit_state
build_fit_state(*, coef: NDArray[np.floating], vcov: NDArray[np.floating], fitted: NDArray[np.floating], residuals: NDArray[np.floating], leverage: NDArray[np.floating], df_resid: float, loglik: float, converged: bool = True, n_iter: int = 1, sigma: float | None = None, dispersion: float | None = None, null_deviance: float | None = None, deviance: float | None = None, theta: NDArray[np.floating] | None = None, u: NDArray[np.floating] | None = None, irls_weights: NDArray[np.floating] | None = None, XtWX_inv: NDArray[np.floating] | None = None) -> FitState

Build a FitState instance with validation.

This builder function provides a keyword-only interface for constructing FitState instances, ensuring all required fields are explicitly provided.

Parameters:

NameTypeDescriptionDefault
coefNDArray[floating]Coefficient estimates (1D array of length p).required
vcovNDArray[floating]Variance-covariance matrix (p x p array).required
fittedNDArray[floating]Fitted values (1D array of length n).required
residualsNDArray[floating]Residuals (1D array of length n).required
leverageNDArray[floating]Hat matrix diagonal / leverage values (1D array of length n).required
df_residfloatResidual degrees of freedom.required
loglikfloatLog-likelihood at convergence.required
convergedboolWhether the optimization converged.True
n_iterintNumber of iterations (1 for closed-form solutions).1
sigmafloat | NoneResidual standard deviation (OLS models only).None
dispersionfloat | NoneDispersion parameter (GLM models only).None
null_deviancefloat | NoneNull model deviance (GLM models only).None
deviancefloat | NoneResidual deviance, sum of unit deviances (GLM models only).None
thetaNDArray[floating] | NoneRandom effect variance parameters (mixed models only).None
uNDArray[floating] | NoneSpherical random effects (mixed models only).None
irls_weightsNDArray[floating] | NoneIRLS weights from GLM fit (GLM sandwich estimator).None
XtWX_invNDArray[floating] | NoneInverse of X’WX from GLM fit (GLM sandwich estimator).None

Returns:

TypeDescription
FitStateA new FitState instance.

Examples:

>>> import numpy as np
>>> from state import build_fit_state
>>> state = build_fit_state(
...     coef=np.array([1.0, 2.0]),
...     vcov=np.eye(2),
...     fitted=np.array([1.0, 2.0, 3.0]),
...     residuals=np.array([0.1, -0.1, 0.0]),
...     leverage=np.array([0.3, 0.3, 0.4]),
...     df_resid=1.0,
...     loglik=-10.0,
...     sigma=0.5,
... )
>>> state.sigma
0.5
build_inference_state
build_inference_state(se: np.ndarray, statistic: np.ndarray, df: np.ndarray, p_value: np.ndarray, ci_lower: np.ndarray, ci_upper: np.ndarray, *, conf_level: float = 0.95, method: str = 'asymp', null: float = 0.0, alternative: str = 'two-sided', n_resamples: int | None = None, boot_samples: np.ndarray | None = None, perm_samples: np.ndarray | None = None, pre: np.ndarray | None = None, pre_sd: np.ndarray | None = None) -> InferenceState

Build an InferenceState from computed inference values.

Parameters:

NameTypeDescriptionDefault
sendarrayStandard errors for each coefficient.required
statisticndarrayTest statistics (t or z).required
dfndarrayDegrees of freedom.required
p_valuendarrayP-values.required
ci_lowerndarrayLower confidence interval bounds.required
ci_upperndarrayUpper confidence interval bounds.required
conf_levelfloatConfidence level (default 0.95).0.95
methodstrInference method ("asymp", "boot", "perm", "cv").'asymp'
nullfloatNull hypothesis value (default 0.0).0.0
alternativestrAlternative hypothesis direction (default "two-sided").'two-sided'
n_resamplesint | NoneNumber of bootstrap/permutation resamples.None
boot_samplesndarray | NoneRaw bootstrap samples.None
perm_samplesndarray | NoneNull distribution of test statistics from permutation tests.None
prendarray | NonePRE (Proportion Reduction in Error) per coefficient (CV ablation).None
pre_sdndarray | NoneStandard deviation of PRE across CV folds (CV ablation).None

Returns:

TypeDescription
InferenceStateFrozen InferenceState instance.

Examples:

>>> import numpy as np
>>> state = build_inference_state(
...     se=np.array([0.1, 0.2]),
...     statistic=np.array([5.0, 2.5]),
...     df=np.array([98.0, 98.0]),
...     p_value=np.array([0.001, 0.014]),
...     ci_lower=np.array([0.3, 0.1]),
...     ci_upper=np.array([0.7, 0.9]),
... )
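The inputs to this builder are computed upstream. As a rough guide to how such values relate to each other, the sketch below derives a statistic, a two-sided p-value, and a confidence interval from one estimate and its standard error using a normal approximation. It ignores `df` (a t-based version would use it), and `wald_inference` is a hypothetical helper, not the library's inference routine.

```python
from statistics import NormalDist


def wald_inference(estimate: float, se: float, *, null: float = 0.0, conf_level: float = 0.95):
    """Normal-approximation statistic, two-sided p-value, and CI for one estimate."""
    dist = NormalDist()
    statistic = (estimate - null) / se
    p_value = 2.0 * (1.0 - dist.cdf(abs(statistic)))
    crit = dist.inv_cdf(0.5 + conf_level / 2.0)  # about 1.96 for 95%
    return statistic, p_value, estimate - crit * se, estimate + crit * se


stat, p, lo, hi = wald_inference(estimate=0.5, se=0.1)
print(round(stat, 2), round(lo, 3), round(hi, 3))  # 5.0 0.304 0.696
```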
build_joint_test_state
build_joint_test_state(terms: tuple[str, ...] | list[str], df1: np.ndarray, statistic: np.ndarray, p_value: np.ndarray, *, test_type: str = 'F', ss_type: str = 'III', df2: np.ndarray | None = None) -> JointTestState

Build a JointTestState from computed joint test values.

Parameters:

NameTypeDescriptionDefault
termstuple[str, ...] | list[str]Names of terms being tested.required
df1ndarrayNumerator degrees of freedom per term.required
statisticndarrayTest statistic values (F or chi2).required
p_valuendarrayP-values for each term.required
test_typestrType of test ("F" for linear models, "chi2" for GLMs).'F'
ss_typestrSum of squares type ("II" or "III").'III'
df2ndarray | NoneDenominator degrees of freedom (required for F-tests).None

Returns:

TypeDescription
JointTestStateFrozen JointTestState instance.

Examples:

F-test results (linear model)::

>>> import numpy as np
>>> state = build_joint_test_state(
...     terms=("a", "b", "a:b"),
...     df1=np.array([2, 1, 2]),
...     df2=np.array([94, 94, 94]),
...     statistic=np.array([5.2, 12.1, 0.8]),
...     p_value=np.array([0.007, 0.001, 0.45]),
...     test_type="F",
... )

Chi-square results (GLM)::

>>> state = build_joint_test_state(
...     terms=("a", "b"),
...     df1=np.array([2, 1]),
...     statistic=np.array([8.5, 15.2]),
...     p_value=np.array([0.014, 0.0001]),
...     test_type="chi2",
... )
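The parameter table says `df2` is required for F-tests. A builder could enforce that, along with matching lengths, roughly as follows. `check_joint_test` is a hypothetical validator written for this example; the checks the library actually performs are not documented here.

```python
from __future__ import annotations


def check_joint_test(terms, df1, statistic, p_value, *, test_type="F", df2=None) -> None:
    """Raise ValueError if the joint-test inputs are inconsistent (illustrative)."""
    if test_type not in ("F", "chi2"):
        raise ValueError(f"unknown test_type: {test_type!r}")
    n = len(terms)
    for label, values in (("df1", df1), ("statistic", statistic), ("p_value", p_value)):
        if len(values) != n:
            raise ValueError(f"{label} has length {len(values)}, expected {n}")
    if test_type == "F":
        if df2 is None:
            raise ValueError("df2 is required for F-tests")
        if len(df2) != n:
            raise ValueError(f"df2 has length {len(df2)}, expected {n}")


# Chi-square inputs pass without df2; the same call with test_type="F" would raise.
check_joint_test(("a", "b"), [2, 1], [8.5, 15.2], [0.014, 0.0001], test_type="chi2")
```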
build_mee_state
build_mee_state(grid: 'pl.DataFrame', estimate: np.ndarray, explore_formula: str, focal_var: str, mee_type: str, *, how: str = 'mem', effect_scale: str = 'link', L_matrix: np.ndarray | None = None, contrast_method: str | None = None, n_contrast_levels: int | None = None, link: str | None = None, L_matrix_link: np.ndarray | None = None, boot_X_plus: np.ndarray | None = None, boot_X_minus: np.ndarray | None = None, boot_delta: float | None = None, se: np.ndarray | None = None, df: np.ndarray | None = None, statistic: np.ndarray | None = None, p_value: np.ndarray | None = None, ci_lower: np.ndarray | None = None, ci_upper: np.ndarray | None = None, conf_level: float | None = None) -> MeeState

Build a MeeState from marginal effects computation.

Parameters:

NameTypeDescriptionDefault
grid'pl.DataFrame'Polars DataFrame with the evaluation grid.required
estimatendarrayPoint estimates for each grid row.required
explore_formulastrThe explore formula string.required
focal_varstrThe primary variable being explored.required
mee_typestrType of effect (“means”, “slopes”, “contrasts”).required
howstrAveraging method: "mem" (Marginal Estimated Mean, balanced reference grid) or "ame" (Average Marginal Effect, g-computation over observed data).'mem'
effect_scalestrScale of estimates: "link" (linear predictor) or "response" (inverse-link / data scale).'link'
L_matrixndarray | NoneDesign matrix for delta method inference (optional). Shape (n_estimates, n_coef). For EMMs this is X_ref.None
contrast_methodstr | NoneOriginal contrast type for multiplicity adjustment (“pairwise”, “sequential”, “poly”, “treatment”, “sum”, “helmert”, or None).None
n_contrast_levelsint | NoneNumber of EMM levels before contrasting (family size).None
linkstr | NoneLink function name for response-scale CI back-transformation.None
L_matrix_linkndarray | NoneLink-scale L_matrix for CI back-transformation.None
boot_X_plusndarray | NonePer-combo average design matrix at focal_var + delta/2. For exact response-scale bootstrap AME recomputation.None
boot_X_minusndarray | NonePer-combo average design matrix at focal_var - delta/2.None
boot_deltafloat | NoneFinite-difference step size for bootstrap slope recomputation.None
sendarray | NoneStandard errors (optional, from .infer()).None
dfndarray | NoneDegrees of freedom (optional).None
statisticndarray | NoneTest statistics (optional).None
p_valuendarray | NoneP-values (optional).None
ci_lowerndarray | NoneLower CI bounds (optional).None
ci_upperndarray | NoneUpper CI bounds (optional).None
conf_levelfloat | NoneConfidence level (optional).None

Returns:

TypeDescription
MeeStateFrozen MeeState instance.

Examples:

>>> import numpy as np
>>> import polars as pl
>>> grid = pl.DataFrame({"treatment": ["A", "B", "C"]})
>>> state = build_mee_state(
...     grid=grid,
...     estimate=np.array([1.0, 2.0, 3.0]),
...     explore_formula="treatment",
...     focal_var="treatment",
...     mee_type="means",
... )
>>> state.has_inference
False
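The `L_matrix` parameter is described as the design matrix for delta-method inference. For estimates that are linear combinations `L @ coef`, the standard errors are the square roots of the diagonal of `L V L^T`, where `V` is the coefficient covariance. The sketch below computes this in plain Python; `delta_method_se` is a hypothetical helper, and it covers only the linear (link-scale) case.

```python
import math


def delta_method_se(L: list, vcov: list) -> list:
    """Standard errors sqrt(diag(L V L^T)) for linear combinations L @ coef."""
    ses = []
    for row in L:
        p = len(row)
        # v = V @ row, then variance = row . v
        v = [sum(vcov[i][j] * row[j] for j in range(p)) for i in range(p)]
        ses.append(math.sqrt(sum(r * vi for r, vi in zip(row, v))))
    return ses


L = [[1.0, 0.0], [1.0, 1.0]]       # two estimates, two coefficients
vcov = [[0.04, 0.0], [0.0, 0.09]]  # coefficient covariance
print(delta_method_se(L, vcov))    # approximately [0.2, 0.3606]
```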
build_prediction_state
build_prediction_state(fitted: np.ndarray, *, link: np.ndarray | None = None, X_pred: np.ndarray | None = None, config: PredictionConfig | None = None, se: np.ndarray | None = None, ci_lower: np.ndarray | None = None, ci_upper: np.ndarray | None = None, interval_type: str | None = None, conf_level: float | None = None, grid: 'pl.DataFrame | None' = None) -> PredictionState

Build a PredictionState from prediction computation.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| fitted | `ndarray` | Predicted values on response scale. | required |
| link | `ndarray \| None` | Predicted values on link scale (for GLM/GLMM). | `None` |
| X_pred | `ndarray \| None` | Design matrix used for predictions. Stored so that .infer() can compute delta-method SEs on the correct X. | `None` |
| config | `PredictionConfig \| None` | Prediction configuration for bootstrap replay. | `None` |
| se | `ndarray \| None` | Standard errors of predictions. | `None` |
| ci_lower | `ndarray \| None` | Lower interval bounds. | `None` |
| ci_upper | `ndarray \| None` | Upper interval bounds. | `None` |
| interval_type | `str \| None` | Type of interval ("confidence" or "prediction"). | `None` |
| conf_level | `float \| None` | Confidence level for intervals. | `None` |
| grid | `'pl.DataFrame \| None'` | Grid DataFrame for formula-mode predictions. When present, build_predictions_dataframe() prepends these columns. | `None` |

Returns:

| Type | Description |
| --- | --- |
| `PredictionState` | Frozen PredictionState instance. |

Examples:

>>> import numpy as np
>>> state = build_prediction_state(
...     fitted=np.array([1.0, 2.0, 3.0]),
... )
>>> state.has_inference
False
>>> # With inference
>>> state = build_prediction_state(
...     fitted=np.array([1.0, 2.0, 3.0]),
...     se=np.array([0.1, 0.1, 0.1]),
...     ci_lower=np.array([0.8, 1.8, 2.8]),
...     ci_upper=np.array([1.2, 2.2, 3.2]),
...     interval_type="confidence",
...     conf_level=0.95,
... )
>>> state.has_inference
True
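
The `X_pred` description mentions delta-method standard errors. As a rough illustration of what such a calculation can look like on the linear-predictor scale, the sketch below computes `sqrt(x' V x)` for each row of a design matrix and forms a Wald-style interval. The functions `delta_method_se` and `wald_interval`, the coefficient covariance matrix `vcov`, and the example numbers are all assumptions made for this sketch; they are not taken from the library, and response-scale SEs for a GLM would additionally involve the derivative of the inverse link.

```python
import math
from statistics import NormalDist


def delta_method_se(X_pred, vcov):
    """Return sqrt(x_i' V x_i) for each row x_i of X_pred."""
    ses = []
    for row in X_pred:
        # V @ x for this row
        v_times_x = [sum(v * x for v, x in zip(v_row, row)) for v_row in vcov]
        # x' @ (V @ x)
        variance = sum(x * vx for x, vx in zip(row, v_times_x))
        ses.append(math.sqrt(variance))
    return ses


def wald_interval(fitted, se, conf_level=0.95):
    """Symmetric normal-approximation interval around each fitted value."""
    z = NormalDist().inv_cdf(0.5 + conf_level / 2.0)
    lower = [f - z * s for f, s in zip(fitted, se)]
    upper = [f + z * s for f, s in zip(fitted, se)]
    return lower, upper


# Hypothetical two-row design matrix (intercept, x) and coefficient covariance.
X_pred = [[1.0, 0.0], [1.0, 1.0]]
vcov = [[0.04, 0.0], [0.0, 0.09]]
beta = [1.0, 2.0]

fitted = [sum(x * b for x, b in zip(row, beta)) for row in X_pred]
se = delta_method_se(X_pred, vcov)
ci_lower, ci_upper = wald_interval(fitted, se, conf_level=0.95)
```

Arrays like `fitted`, `se`, `ci_lower`, and `ci_upper` are the kind of values the inference arguments of `build_prediction_state` are documented to receive.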
build_simulation_inference_state
build_simulation_inference_state(sim_type: str, n_sims: int, *, sim_mean: np.ndarray | None = None, sim_sd: np.ndarray | None = None, sim_quantiles: dict[str, np.ndarray] | None = None, power: dict[str, float] | None = None, coverage: dict[str, float] | None = None, bias: dict[str, float] | None = None, rmse: dict[str, float] | None = None, alpha: float = 0.05, true_coef: dict[str, float] | None = None) -> SimulationInferenceState

Build a SimulationInferenceState from computed values.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| sim_type | `str` | Type of simulation ("post_fit" or "power_analysis"). | required |
| n_sims | `int` | Number of simulations. | required |
| sim_mean | `ndarray \| None` | Mean of simulated values per observation. | `None` |
| sim_sd | `ndarray \| None` | SD of simulated values per observation. | `None` |
| sim_quantiles | `dict[str, ndarray] \| None` | Dict of quantile name -> array mappings. | `None` |
| power | `dict[str, float] \| None` | Dict of term name -> power mappings. | `None` |
| coverage | `dict[str, float] \| None` | Dict of term name -> coverage mappings. | `None` |
| bias | `dict[str, float] \| None` | Dict of term name -> bias mappings. | `None` |
| rmse | `dict[str, float] \| None` | Dict of term name -> RMSE mappings. | `None` |
| alpha | `float` | Significance level for power calculation. | `0.05` |
| true_coef | `dict[str, float] \| None` | True coefficient values for coverage/bias. | `None` |

Returns:

| Type | Description |
| --- | --- |
| `SimulationInferenceState` | Frozen SimulationInferenceState instance. |

Examples:

>>> import numpy as np
>>> state = build_simulation_inference_state(
...     sim_type="post_fit",
...     n_sims=100,
...     sim_mean=np.array([1.0, 2.0, 3.0]),
...     sim_sd=np.array([0.1, 0.2, 0.3]),
... )
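
For a "power_analysis" run, the `power`, `coverage`, `bias`, and `rmse` dictionaries hold one summary per term. The sketch below shows one conventional way such summaries can be computed from per-simulation results, using the usual textbook definitions. The function `summarize_term` and the example numbers are hypothetical and are not drawn from the library, which may define or compute these quantities differently.

```python
import math
from statistics import mean


def summarize_term(estimates, p_values, ci_lowers, ci_uppers, true_value, alpha=0.05):
    """Summarize one term across simulations using standard definitions."""
    n = len(estimates)
    # Power: share of simulations rejecting the null at level alpha.
    power = sum(p < alpha for p in p_values) / n
    # Coverage: share of intervals that contain the true coefficient.
    coverage = sum(lo <= true_value <= hi for lo, hi in zip(ci_lowers, ci_uppers)) / n
    # Bias: mean estimate minus the truth.
    bias = mean(estimates) - true_value
    # RMSE: root of the mean squared deviation from the truth.
    rmse = math.sqrt(mean([(e - true_value) ** 2 for e in estimates]))
    return {"power": power, "coverage": coverage, "bias": bias, "rmse": rmse}


# Four made-up simulation results for a single term whose true value is 0.5.
summary = summarize_term(
    estimates=[0.4, 0.5, 0.6, 0.5],
    p_values=[0.20, 0.03, 0.01, 0.04],
    ci_lowers=[0.1, 0.2, 0.55, 0.3],
    ci_uppers=[0.7, 0.8, 0.90, 0.7],
    true_value=0.5,
    alpha=0.05,
)
```

Collecting one such result per term yields mappings of the shape `{"x": 0.75}`, which matches the `dict[str, float]` types documented for these arguments.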
build_varying_spread_state
build_varying_spread_state(components: 'pl.DataFrame', sigma2: float, tau2: dict[str, float], *, rho: dict[str, float] | None = None, icc: float | None = None, ci_lower: dict[str, float] | None = None, ci_upper: dict[str, float] | None = None, conf_level: float | None = None, ci_method: str | None = None) -> VaryingSpreadState

Build a VaryingSpreadState from variance component estimates.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| components | `'pl.DataFrame'` | Polars DataFrame with component estimates. | required |
| sigma2 | `float` | Residual variance. | required |
| tau2 | `dict[str, float]` | Dict mapping effect names to variance estimates. | required |
| rho | `dict[str, float] \| None` | Dict mapping effect pairs to correlations (optional). | `None` |
| icc | `float \| None` | Intraclass correlation coefficient (optional). | `None` |
| ci_lower | `dict[str, float] \| None` | Lower CI bounds (optional, from .infer()). | `None` |
| ci_upper | `dict[str, float] \| None` | Upper CI bounds (optional, from .infer()). | `None` |
| conf_level | `float \| None` | Confidence level (optional). | `None` |
| ci_method | `str \| None` | CI method used (optional). | `None` |

Returns:

| Type | Description |
| --- | --- |
| `VaryingSpreadState` | Frozen VaryingSpreadState instance. |

Examples:

>>> import polars as pl
>>> components = pl.DataFrame({
...     "component": ["sigma2", "tau2_Intercept", "icc"],
...     "estimate": [1.0, 0.5, 0.33],
... })
>>> state = build_varying_spread_state(
...     components=components,
...     sigma2=1.0,
...     tau2={"Intercept": 0.5},
...     icc=0.33,
... )
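
The values in this example are consistent with the standard random-intercept definition of the intraclass correlation, ICC = tau2 / (tau2 + sigma2). The short sketch below reproduces that arithmetic. The helper `random_intercept_icc` is a hypothetical name, and the formula is the textbook one for a single random intercept; how the library computes `icc` for models with several varying effects is not stated here.

```python
def random_intercept_icc(sigma2, tau2_intercept):
    """Share of total variance attributable to the grouping factor."""
    return tau2_intercept / (tau2_intercept + sigma2)


sigma2 = 1.0
tau2 = {"Intercept": 0.5}

icc = random_intercept_icc(sigma2, tau2["Intercept"])
icc_rounded = round(icc, 2)
```

With a residual variance of 1.0 and an intercept variance of 0.5, one third of the total variance lies between groups, which rounds to the 0.33 passed as `icc` above.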
build_varying_state
build_varying_state(grid: 'pl.DataFrame', effects: dict[str, np.ndarray], grouping_var: str, n_groups: int, *, pi_lower: dict[str, np.ndarray] | None = None, pi_upper: dict[str, np.ndarray] | None = None, conf_level: float | None = None) -> VaryingState

Build a VaryingState from computed BLUPs.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| grid | `'pl.DataFrame'` | Polars DataFrame with group identifiers. | required |
| effects | `dict[str, ndarray]` | Dict mapping effect names to BLUP arrays. | required |
| grouping_var | `str` | Name of the grouping variable. | required |
| n_groups | `int` | Number of groups. | required |
| pi_lower | `dict[str, ndarray] \| None` | Lower prediction interval bounds (optional). | `None` |
| pi_upper | `dict[str, ndarray] \| None` | Upper prediction interval bounds (optional). | `None` |
| conf_level | `float \| None` | Confidence level for intervals (optional). | `None` |

Returns:

| Type | Description |
| --- | --- |
| `VaryingState` | Frozen VaryingState instance. |

Examples:

>>> import numpy as np
>>> import polars as pl
>>> state = build_varying_state(
...     grid=pl.DataFrame({"subject": ["S1", "S2", "S3"]}),
...     effects={"Intercept": np.array([0.5, -0.3, 0.1])},
...     grouping_var="subject",
...     n_groups=3,
... )
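
The builders on this page are described as validating inputs and returning frozen containers. The sketch below illustrates that general pattern with a standard-library frozen dataclass standing in for an attrs class. `VaryingSketch` and `build_varying_sketch` are hypothetical, and the length check is only an assumed example of validation; the checks that `build_varying_state` actually performs are not documented in this section.

```python
from dataclasses import dataclass, FrozenInstanceError


@dataclass(frozen=True)
class VaryingSketch:
    """Hypothetical stand-in for a frozen state container."""
    effects: dict
    grouping_var: str
    n_groups: int


def build_varying_sketch(effects, grouping_var, n_groups):
    """Validate inputs, then construct the frozen container."""
    for name, values in effects.items():
        # Assumed rule: one BLUP per group for every effect.
        if len(values) != n_groups:
            raise ValueError(
                f"effect {name!r} has {len(values)} values, expected {n_groups}"
            )
    return VaryingSketch(effects=effects, grouping_var=grouping_var, n_groups=n_groups)


sketch = build_varying_sketch(
    effects={"Intercept": [0.5, -0.3, 0.1]},
    grouping_var="subject",
    n_groups=3,
)

# Attribute assignment on a frozen instance is rejected.
try:
    sketch.n_groups = 4
    mutated = True
except FrozenInstanceError:
    mutated = False
```

Freezing the container means later stages can share the state without defensive copies, although a dict or array stored in a field can still be modified in place.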