Numerical differentiation — Richardson extrapolation, Jacobian, Hessian.
Functions:
| Name | Description |
|---|---|
compute_gradient_richardson | Compute gradient using Richardson extrapolation. |
compute_hessian_numerical | Compute Hessian using central finite differences. |
compute_hessian_richardson | Compute Hessian using Richardson extrapolation with genD method. |
compute_jacobian_numerical | Compute Jacobian using central finite differences. |
compute_jacobian_richardson | Compute Jacobian using Richardson extrapolation. |
Functions¶
compute_gradient_richardson¶
compute_gradient_richardson(func: Callable[[np.ndarray], float], x: np.ndarray, d: float = 0.0001, eps: float = 0.0001, r: int = 4, v: int = 2, zero_tol: float | None = None) -> np.ndarray

Compute gradient using Richardson extrapolation.
Matches numDeriv::grad() behavior exactly by using Richardson extrapolation to achieve high-order accuracy (O(h^(2r))) in numerical differentiation.
The algorithm:
1. Computes r gradient approximations using central differences with step sizes h, h/v, h/v², ..., h/v^(r-1)
2. Applies Richardson extrapolation recursively to cancel truncation errors
3. Returns the final extrapolated estimate
The extrapolation recurrence is
A_{k,m} = (v^(2m) * A_{k+1,m-1} - A_{k,m-1}) / (v^(2m) - 1)
where A_{k,0} is the k-th central difference approximation.
The per-parameter step size is
h[i] = |d * x[i]| + eps * I(|x[i]| < zero_tol)
This ensures:
Relative step for large parameters: h ≈ d * |x|
Absolute step for small parameters: h = eps
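The step-size rule and the Richardson recurrence above can be sketched as follows. This is an illustrative re-implementation, not the library's actual code; the function name and the fixed `zero_tol` default are assumptions:

```python
import numpy as np

def richardson_gradient(func, x, d=1e-4, eps=1e-4, r=4, v=2, zero_tol=1e-5):
    """Sketch: gradient via central differences + Richardson extrapolation."""
    x = np.asarray(x, dtype=float)
    n = x.size
    # Relative step for large parameters, absolute step for small ones
    h = np.abs(d * x) + eps * (np.abs(x) < zero_tol)
    grad = np.empty(n)
    for i in range(n):
        # r central-difference estimates with steps h, h/v, ..., h/v^(r-1)
        A, hi = [], h[i]
        for _ in range(r):
            e = np.zeros(n)
            e[i] = hi
            A.append((func(x + e) - func(x - e)) / (2 * hi))
            hi /= v
        # Recurrence A_{k,m} = (v^(2m)*A_{k+1,m-1} - A_{k,m-1}) / (v^(2m) - 1)
        for m in range(1, r):
            A = [(v**(2 * m) * A[k + 1] - A[k]) / (v**(2 * m) - 1)
                 for k in range(len(A) - 1)]
        grad[i] = A[0]
    return grad
```

Each pass of the inner recurrence cancels the next even-order truncation term, which is how r = 4 reaches O(h^8) from O(h^2) building blocks.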
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
func | Callable[[ndarray], float] | Scalar function R^n -> R to differentiate. | required |
x | ndarray | Point at which to compute gradient, shape (n,). | required |
d | float | Relative step size multiplier (default 1e-4). | 0.0001 |
eps | float | Absolute step size for small x values (default 1e-4). | 0.0001 |
r | int | Number of Richardson iterations (default 4 gives O(h^8) accuracy). | 4 |
v | int | Step reduction factor between iterations (default 2). | 2 |
zero_tol | float | None | Threshold for "small" x values. If None, computed as sqrt(machine_eps / 7e-7) ≈ 1.8e-5. | None |
Returns:
| Type | Description |
|---|---|
ndarray | Gradient vector of shape (n,). |
Notes:
With default r=4, achieves O(h^8) accuracy vs O(h^2) for simple central differences
Requires 2*r*n function evaluations (vs 2*n for simple central differences)
Matches numDeriv’s default parameters exactly for lmerTest compatibility
Examples:

    import numpy as np

    def rosenbrock(x):
        return (1 - x[0])**2 + 100 * (x[1] - x[0]**2)**2

    x = np.array([1.0, 1.0])
    grad = compute_gradient_richardson(rosenbrock, x)

compute_hessian_numerical¶
compute_hessian_numerical(func: Callable[[np.ndarray], float], x: np.ndarray, step_size: float | np.ndarray | None = None) -> np.ndarray

Compute Hessian using central finite differences.
Uses 2nd-order accurate central differences:
Diagonal: H[i,i] ≈ (f(x+h) - 2*f(x) + f(x-h)) / h^2
Off-diagonal: H[i,j] ≈ (f(x+h_i+h_j) - f(x+h_i-h_j) - f(x-h_i+h_j) + f(x-h_i-h_j)) / (4h_ih_j)
This is a simpler alternative to Richardson extrapolation, trading accuracy for simplicity. For production use, prefer compute_hessian_richardson().
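The two stencils above can be sketched with a uniform step (an illustrative stand-in, not the library's implementation; the adaptive step-size logic is omitted):

```python
import numpy as np

def hessian_central(func, x, h=1e-4):
    """Sketch: Hessian from 2nd-order central differences, uniform step h."""
    x = np.asarray(x, dtype=float)
    n = x.size
    H = np.empty((n, n))
    f0 = func(x)
    for i in range(n):
        ei = np.zeros(n)
        ei[i] = h
        # Diagonal: (f(x+h) - 2*f(x) + f(x-h)) / h^2
        H[i, i] = (func(x + ei) - 2 * f0 + func(x - ei)) / h**2
        for j in range(i + 1, n):
            ej = np.zeros(n)
            ej[j] = h
            # Off-diagonal: 4-point mixed-partial stencil / (4*h_i*h_j)
            H[i, j] = (func(x + ei + ej) - func(x + ei - ej)
                       - func(x - ei + ej) + func(x - ei - ej)) / (4 * h**2)
            H[j, i] = H[i, j]
    return H
```

For a quadratic function both stencils are exact up to roundoff, which makes quadratics a convenient sanity check.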
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
func | Callable[[ndarray], float] | Function R^n -> R to differentiate (deviance function). | required |
x | ndarray | Point at which to compute Hessian. | required |
step_size | float | ndarray | None | Step size(s) for finite differences: None for an adaptive step size (0.01 * max(abs(x), ...) per parameter), a float for a single shared step, or an ndarray of per-parameter steps. | None |
Returns:
| Type | Description |
|---|---|
ndarray | Hessian matrix of shape (n, n). |
Notes:
More accurate than forward differences but less than Richardson
Requires O(n^2) function evaluations
Pragmatic step size balances truncation O(h^2) with roundoff O(eps*f/h^2)
compute_hessian_richardson¶
compute_hessian_richardson(func: Callable[[np.ndarray], float], x: np.ndarray, d: float = 0.0001, eps: float = 0.0001, r: int = 4, v: int = 2, zero_tol: float | None = None) -> np.ndarray

Compute Hessian using Richardson extrapolation with genD method.
Matches numDeriv::hessian() behavior exactly by using the same two-step approach as numDeriv’s genD() function:
First computes diagonal Hessian elements H[i,i] using Richardson extrapolation
Then computes off-diagonal elements H[i,j] using a modified mixed partial formula that subtracts the diagonal contributions:
H[i,j] = (f(x+h_i+h_j) - 2*f(x) + f(x-h_i-h_j) - H[i,i]h_i² - H[j,j]h_j²) / (2h_ih_j)
This formula is more stable than the standard 4-point mixed partial formula and better isolates the cross-derivative term from numerical noise.
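The diagonal-subtracted formula can be sketched in isolation. This is illustrative only: the function name is made up, and the diagonal entries `H_diag` are assumed to have been computed already (in the genD scheme, via Richardson extrapolation):

```python
import numpy as np

def offdiag_gen_d(func, x, H_diag, i, j, h):
    """Sketch: H[i,j] from the diagonal-subtracted mixed-partial formula.

    H_diag holds precomputed diagonal Hessian entries; h holds per-parameter
    step sizes.
    """
    e = np.zeros(x.size)
    e[i] += h[i]
    e[j] += h[j]
    # f(x+h_i+h_j) - 2*f(x) + f(x-h_i-h_j) = h_i^2*f_ii + h_j^2*f_jj
    #                                        + 2*h_i*h_j*f_ij + O(h^4)
    num = (func(x + e) - 2 * func(x) + func(x - e)
           - H_diag[i] * h[i]**2 - H_diag[j] * h[j]**2)
    return num / (2 * h[i] * h[j])
```

Because the diagonal terms are measured separately and subtracted exactly, the cross term f_ij is isolated from the (typically much larger) pure second-derivative contributions.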
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
func | Callable[[ndarray], float] | Scalar function R^n -> R to differentiate. | required |
x | ndarray | Point at which to compute Hessian, shape (n,). | required |
d | float | Relative step size multiplier (default 1e-4). | 0.0001 |
eps | float | Absolute step size for small x values (default 1e-4). | 0.0001 |
r | int | Number of Richardson iterations (default 4). | 4 |
v | int | Step reduction factor (default 2). | 2 |
zero_tol | float | None | Threshold for small x values (default sqrt(eps/7e-7)). | None |
Returns:
| Type | Description |
|---|---|
ndarray | Hessian matrix of shape (n, n). |
Notes:
Uses genD method exactly as in numDeriv R package
Diagonal elements computed with Richardson extrapolation
Off-diagonal elements use diagonal-subtracted formula
More accurate than simple finite differences for smooth functions
Requires 1 + r*n + (n*(n-1)/2)*2 function evaluations
Essential for matching lmerTest’s Satterthwaite df computation
Derivation of the off-diagonal formula. The Taylor expansion gives:
f(x+h_i+h_j) ≈ f(x) + h_i f_i + h_j f_j + 0.5 h_i² f_ii + 0.5 h_j² f_jj + h_i h_j f_ij + ...
Taking f(x+h_i+h_j) - 2f(x) + f(x-h_i-h_j) gives: h_i²f_ii + h_j²f_jj + 2h_ih_jf_ij + O(h^4)
Subtracting the known diagonal terms H[i,i]h_i² + H[j,j]h_j² and dividing by 2h_ih_j isolates f_ij with O(h²) accuracy.
compute_jacobian_numerical¶
compute_jacobian_numerical(func: Callable[[np.ndarray], np.ndarray], x: np.ndarray, step_size: float | np.ndarray | None = None) -> np.ndarray

Compute Jacobian using central finite differences.
For func: R^n -> R^(p×p) (returning Vcov matrix), computes the derivative of each output element w.r.t. each input parameter.
This is a simpler alternative to Richardson extrapolation. For production use, prefer compute_jacobian_richardson().
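The element-wise central-difference Jacobian for an array-valued function can be sketched as follows (an illustrative stand-in with a uniform step; the adaptive step-size logic is omitted):

```python
import numpy as np

def jacobian_central(func, x, h=1e-5):
    """Sketch: central-difference Jacobian of an array-valued func.

    Returns shape (*func(x).shape, n): one derivative slice per parameter.
    """
    x = np.asarray(x, dtype=float)
    out_shape = np.asarray(func(x)).shape
    J = np.empty(out_shape + (x.size,))
    for k in range(x.size):
        e = np.zeros(x.size)
        e[k] = h
        # Central difference of every output element w.r.t. x[k] at once
        J[..., k] = (np.asarray(func(x + e)) - np.asarray(func(x - e))) / (2 * h)
    return J
```

Stacking the derivative along a trailing axis keeps the output layout identical to `func(x)`, so a (p, p)-valued function yields a (p, p, n) array.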
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
func | Callable[[ndarray], ndarray] | Function R^n -> R^(p×p) to differentiate (Vcov(beta) function). | required |
x | ndarray | Point at which to compute Jacobian. | required |
step_size | float | ndarray | None | Step size(s) for finite differences: None for an adaptive step size (0.001 * max(abs(x), ...) per parameter), a float for a single shared step, or an ndarray of per-parameter steps. | None |
Returns:
| Type | Description |
|---|---|
ndarray | Jacobian array of shape (*out_shape, n) where out_shape = func(x).shape. |
ndarray | For Vcov(beta), this is (p, p, n). |
Notes:
Uses central differences for 2nd-order accuracy
Requires O(n) function evaluations
Step size balances truncation O(h^2) with roundoff O(eps*f/h)
compute_jacobian_richardson¶
compute_jacobian_richardson(func: Callable[[np.ndarray], np.ndarray], x: np.ndarray, d: float = 0.0001, eps: float = 0.0001, r: int = 4, v: int = 2, zero_tol: float | None = None) -> np.ndarray

Compute Jacobian using Richardson extrapolation.
Matches numDeriv::jacobian() behavior exactly. For a vector-valued function f: R^n -> R^m, computes the Jacobian matrix J where J[i,j] = ∂f_i/∂x_j.
Uses Richardson extrapolation on each column (partial derivatives w.r.t. each input variable) to achieve high-order accuracy.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
func | Callable[[ndarray], ndarray] | Vector function R^n -> R^m to differentiate. Returns array of shape (m,) or multi-dimensional (*out_shape,). | required |
x | ndarray | Point at which to compute Jacobian, shape (n,). | required |
d | float | Relative step size multiplier (default 1e-4). | 0.0001 |
eps | float | Absolute step size for small x values (default 1e-4). | 0.0001 |
r | int | Number of Richardson iterations (default 4). | 4 |
v | int | Step reduction factor (default 2). | 2 |
zero_tol | float | None | Threshold for small x values (default sqrt(eps/7e-7)). | None |
Returns:
| Type | Description |
|---|---|
ndarray | Jacobian array of shape (*out_shape, n) where out_shape = func(x).shape. |
ndarray | For matrix-valued functions (e.g., Vcov matrix), returns shape (p, p, n). |
Notes:
Each column of the Jacobian is computed using Richardson extrapolation
For func: R^n -> R^(p×p), returns array of shape (p, p, n)
Matches numDeriv’s row-major flattening convention
Requires 2*r*n function evaluations
Examples:
For Vcov(beta): R^k -> R^(p×p), the Jacobian has shape (p, p, k) where Jac[i, j, l] = ∂Vcov[i,j]/∂varpar[l]
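A toy version of this shape convention, using a made-up 2-parameter Vcov stand-in and a plain central-difference Jacobian in place of `compute_jacobian_richardson` (both are illustrative assumptions):

```python
import numpy as np

def toy_vcov(theta):
    """Hypothetical Vcov(theta): R^2 -> R^(2x2), standing in for Vcov(beta)."""
    return np.array([[theta[0], theta[0] * theta[1]],
                     [theta[0] * theta[1], theta[1]**2]])

def jac(func, x, h=1e-6):
    """Central-difference stand-in for compute_jacobian_richardson."""
    out = np.asarray(func(x))
    J = np.empty(out.shape + (x.size,))
    for k in range(x.size):
        e = np.zeros(x.size)
        e[k] = h
        J[..., k] = (func(x + e) - func(x - e)) / (2 * h)
    return J

J = jac(toy_vcov, np.array([1.0, 2.0]))
# J has shape (2, 2, 2): J[i, j, l] = d Vcov[i, j] / d theta[l]
```

For instance, Vcov[0, 1] = theta_0 * theta_1, so J[0, 1] recovers its gradient (theta_1, theta_0) = (2, 1) at the evaluation point.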