Numerical differentiation — Richardson extrapolation, Jacobian, Hessian.
Functions:
| Name | Description |
|---|---|
compute_gradient_richardson | Compute gradient using Richardson extrapolation. |
compute_hessian_numerical | Compute Hessian using central finite differences. |
compute_hessian_richardson | Compute Hessian using Richardson extrapolation with genD method. |
compute_jacobian_numerical | Compute Jacobian using central finite differences. |
compute_jacobian_richardson | Compute Jacobian using Richardson extrapolation. |
Functions¶
compute_gradient_richardson¶
compute_gradient_richardson(func: Callable[[np.ndarray], float], x: np.ndarray, d: float = 0.0001, eps: float = 0.0001, r: int = 4, v: int = 2, zero_tol: float | None = None) -> np.ndarray

Compute gradient using Richardson extrapolation.
Matches numDeriv::grad() behavior exactly by using Richardson extrapolation to achieve high-order accuracy (O(h^(2r))) in numerical differentiation.
The algorithm:
1. Computes r gradient approximations using central differences with step sizes h, h/v, h/v², ..., h/v^(r-1)
2. Applies Richardson extrapolation recursively to cancel truncation errors
3. Returns the final extrapolated estimate
The extrapolation recurrence is
A_{k,m} = (v^(2m) * A_{k+1,m-1} - A_{k,m-1}) / (v^(2m) - 1)
where A_{k,0} is the k-th central difference approximation.
The per-parameter step size is
h[i] = |d * x[i]| + eps * I(|x[i]| < zero_tol)
This ensures:
Relative step for large parameters: h ≈ d * |x|
Absolute step for small parameters: h = eps
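The step-size rule and the Richardson recurrence above can be sketched as follows. This is an illustrative re-implementation, not the library's actual code; the function name and the fixed `zero_tol` default are assumptions:

```python
import numpy as np

def richardson_gradient(func, x, d=1e-4, eps=1e-4, r=4, v=2, zero_tol=1e-5):
    """Sketch: gradient via central differences + Richardson extrapolation."""
    x = np.asarray(x, dtype=float)
    n = x.size
    # Relative step for large parameters, absolute step for small ones
    h = np.abs(d * x) + eps * (np.abs(x) < zero_tol)
    grad = np.empty(n)
    for i in range(n):
        # r central-difference estimates with steps h, h/v, ..., h/v^(r-1)
        A, hi = [], h[i]
        for _ in range(r):
            e = np.zeros(n)
            e[i] = hi
            A.append((func(x + e) - func(x - e)) / (2 * hi))
            hi /= v
        # Recurrence A_{k,m} = (v^(2m)*A_{k+1,m-1} - A_{k,m-1}) / (v^(2m) - 1)
        for m in range(1, r):
            A = [(v**(2 * m) * A[k + 1] - A[k]) / (v**(2 * m) - 1)
                 for k in range(len(A) - 1)]
        grad[i] = A[0]
    return grad
```

Each pass of the inner recurrence cancels the next even-order truncation term, which is how r = 4 reaches O(h^8) from O(h^2) building blocks.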
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
func | Callable[[ndarray], float] | Scalar function R^n -> R to differentiate. | required |
x | ndarray | Point at which to compute gradient, shape (n,). | required |
d | float | Relative step size multiplier (default 1e-4). | 0.0001 |
eps | float | Absolute step size for small x values (default 1e-4). | 0.0001 |
r | int | Number of Richardson iterations (default 4 gives O(h^8) accuracy). | 4 |
v | int | Step reduction factor between iterations (default 2). | 2 |
zero_tol | float | None | Threshold for "small" x values. If None, computed as sqrt(machine_eps / 7e-7) ≈ 1.8e-5. | None |
Returns:
| Type | Description |
|---|---|
ndarray | Gradient vector of shape (n,). |
Notes:
With default r=4, achieves O(h^8) accuracy vs O(h^2) for simple central differences
Requires 2*r*n function evaluations (vs 2*n for simple central differences)
Matches numDeriv’s default parameters exactly for lmerTest compatibility
Examples:

    import numpy as np

    def rosenbrock(x):
        return (1 - x[0])**2 + 100 * (x[1] - x[0]**2)**2

    x = np.array([1.0, 1.0])
    grad = compute_gradient_richardson(rosenbrock, x)

compute_hessian_numerical¶
compute_hessian_numerical(func: Callable[[np.ndarray], float], x: np.ndarray, step_size: float | np.ndarray | None = None) -> np.ndarray

Compute Hessian using central finite differences.
Uses 2nd-order accurate central differences:
Diagonal: H[i,i] ≈ (f(x+h) - 2*f(x) + f(x-h)) / h^2
Off-diagonal: H[i,j] ≈ (f(x+h_i+h_j) - f(x+h_i-h_j) - f(x-h_i+h_j) + f(x-h_i-h_j)) / (4h_ih_j)
This is a simpler alternative to Richardson extrapolation, trading accuracy for simplicity. For production use, prefer compute_hessian_richardson().
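The two stencils above can be sketched with a uniform step (an illustrative stand-in, not the library's implementation; the adaptive step-size logic is omitted):

```python
import numpy as np

def hessian_central(func, x, h=1e-4):
    """Sketch: Hessian from 2nd-order central differences, uniform step h."""
    x = np.asarray(x, dtype=float)
    n = x.size
    H = np.empty((n, n))
    f0 = func(x)
    for i in range(n):
        ei = np.zeros(n)
        ei[i] = h
        # Diagonal: (f(x+h) - 2*f(x) + f(x-h)) / h^2
        H[i, i] = (func(x + ei) - 2 * f0 + func(x - ei)) / h**2
        for j in range(i + 1, n):
            ej = np.zeros(n)
            ej[j] = h
            # Off-diagonal: 4-point mixed-partial stencil / (4*h_i*h_j)
            H[i, j] = (func(x + ei + ej) - func(x + ei - ej)
                       - func(x - ei + ej) + func(x - ei - ej)) / (4 * h**2)
            H[j, i] = H[i, j]
    return H
```

For a quadratic function both stencils are exact up to roundoff, which makes quadratics a convenient sanity check.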
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
func | Callable[[ndarray], float] | Function R^n -> R to differentiate (deviance function). | required |
x | ndarray | Point at which to compute Hessian. | required |
step_size | float | ndarray | None | Step size(s) for finite differences: None for an adaptive step size (0.01 * max(abs(x), ...) per parameter), a float for a single shared step, or an ndarray of per-parameter steps. | None |
Returns:
| Type | Description |
|---|---|
ndarray | Hessian matrix of shape (n, n). |
Notes:
More accurate than forward differences but less than Richardson
Requires O(n^2) function evaluations
Pragmatic step size balances truncation O(h^2) with roundoff O(eps*f/h^2)
compute_hessian_richardson¶
compute_hessian_richardson(func: Callable[[np.ndarray], float], x: np.ndarray, d: float = 0.0001, eps: float = 0.0001, r: int = 4, v: int = 2, zero_tol: float | None = None) -> np.ndarray

Compute Hessian using Richardson extrapolation with genD method.
Matches numDeriv::hessian() behavior exactly by using the same two-step approach as numDeriv’s genD() function:
First computes diagonal Hessian elements H[i,i] using Richardson extrapolation
Then computes off-diagonal elements H[i,j] using a modified mixed partial formula that subtracts the diagonal contributions:
H[i,j] = (f(x+h_i+h_j) - 2*f(x) + f(x-h_i-h_j) - H[i,i]h_i² - H[j,j]h_j²) / (2h_ih_j)
This formula is more stable than the standard 4-point mixed partial formula and better isolates the cross-derivative term from numerical noise.
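The diagonal-subtracted formula can be sketched in isolation. This is illustrative only: the function name is made up, and the diagonal entries `H_diag` are assumed to have been computed already (in the genD scheme, via Richardson extrapolation):

```python
import numpy as np

def offdiag_gen_d(func, x, H_diag, i, j, h):
    """Sketch: H[i,j] from the diagonal-subtracted mixed-partial formula.

    H_diag holds precomputed diagonal Hessian entries; h holds per-parameter
    step sizes.
    """
    e = np.zeros(x.size)
    e[i] += h[i]
    e[j] += h[j]
    # f(x+h_i+h_j) - 2*f(x) + f(x-h_i-h_j) = h_i^2*f_ii + h_j^2*f_jj
    #                                        + 2*h_i*h_j*f_ij + O(h^4)
    num = (func(x + e) - 2 * func(x) + func(x - e)
           - H_diag[i] * h[i]**2 - H_diag[j] * h[j]**2)
    return num / (2 * h[i] * h[j])
```

Because the diagonal terms are measured separately and subtracted exactly, the cross term f_ij is isolated from the (typically much larger) pure second-derivative contributions.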
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
func | Callable[[ndarray], float] | Scalar function R^n -> R to differentiate. | required |
x | ndarray | Point at which to compute Hessian, shape (n,). | required |
d | float | Relative step size multiplier (default 1e-4). | 0.0001 |
eps | float | Absolute step size for small x values (default 1e-4). | 0.0001 |
r | int | Number of Richardson iterations (default 4). | 4 |
v | int | Step reduction factor (default 2). | 2 |
zero_tol | float | None | Threshold for small x values (default sqrt(eps/7e-7)). | None |
Returns:
| Type | Description |
|---|---|
ndarray | Hessian matrix of shape (n, n). |
Notes:
Uses genD method exactly as in numDeriv R package
Diagonal elements computed with Richardson extrapolation
Off-diagonal elements use diagonal-subtracted formula
More accurate than simple finite differences for smooth functions
Requires 1 + r*n + (n*(n-1)/2)*2 function evaluations
Essential for matching lmerTest’s Satterthwaite df computation
Derivation of the off-diagonal formula. The Taylor expansion gives:
f(x+h_i+h_j) ≈ f(x) + h_i f_i + h_j f_j + 0.5 h_i² f_ii + 0.5 h_j² f_jj + h_i h_j f_ij + ...
Taking f(x+h_i+h_j) - 2f(x) + f(x-h_i-h_j) gives: h_i²f_ii + h_j²f_jj + 2h_ih_jf_ij + O(h^4)
Subtracting the known diagonal terms H[i,i]h_i² + H[j,j]h_j² and dividing by 2h_ih_j isolates f_ij with O(h²) accuracy.
compute_jacobian_numerical¶
compute_jacobian_numerical(func: Callable[[np.ndarray], np.ndarray], x: np.ndarray, step_size: float | np.ndarray | None = None) -> np.ndarray

Compute Jacobian using central finite differences.
For func: R^n -> R^(p×p) (returning Vcov matrix), computes the derivative of each output element w.r.t. each input parameter.
This is a simpler alternative to Richardson extrapolation. For production use, prefer compute_jacobian_richardson().
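The element-wise central-difference Jacobian for an array-valued function can be sketched as follows (an illustrative stand-in with a uniform step; the adaptive step-size logic is omitted):

```python
import numpy as np

def jacobian_central(func, x, h=1e-5):
    """Sketch: central-difference Jacobian of an array-valued func.

    Returns shape (*func(x).shape, n): one derivative slice per parameter.
    """
    x = np.asarray(x, dtype=float)
    out_shape = np.asarray(func(x)).shape
    J = np.empty(out_shape + (x.size,))
    for k in range(x.size):
        e = np.zeros(x.size)
        e[k] = h
        # Central difference of every output element w.r.t. x[k] at once
        J[..., k] = (np.asarray(func(x + e)) - np.asarray(func(x - e))) / (2 * h)
    return J
```

Stacking the derivative along a trailing axis keeps the output layout identical to `func(x)`, so a (p, p)-valued function yields a (p, p, n) array.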
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
func | Callable[[ndarray], ndarray] | Function R^n -> R^(p×p) to differentiate (Vcov(beta) function). | required |
x | ndarray | Point at which to compute Jacobian. | required |
step_size | float | ndarray | None | Step size(s) for finite differences: None for an adaptive step size (0.001 * max(abs(x), ...) per parameter), a float for a single shared step, or an ndarray of per-parameter steps. | None |
Returns:
| Type | Description |
|---|---|
ndarray | Jacobian array of shape (*out_shape, n) where out_shape = func(x).shape. |
ndarray | For Vcov(beta), this is (p, p, n). |
Notes:
Uses central differences for 2nd-order accuracy
Requires O(n) function evaluations
Step size balances truncation O(h^2) with roundoff O(eps*f/h)
compute_jacobian_richardson¶
compute_jacobian_richardson(func: Callable[[np.ndarray], np.ndarray], x: np.ndarray, d: float = 0.0001, eps: float = 0.0001, r: int = 4, v: int = 2, zero_tol: float | None = None) -> np.ndarray

Compute Jacobian using Richardson extrapolation.
Matches numDeriv::jacobian() behavior exactly. For a vector-valued function f: R^n -> R^m, computes the Jacobian matrix J where J[i,j] = ∂f_i/∂x_j.
Uses Richardson extrapolation on each column (partial derivatives w.r.t. each input variable) to achieve high-order accuracy.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
func | Callable[[ndarray], ndarray] | Vector function R^n -> R^m to differentiate. Returns array of shape (m,) or multi-dimensional (*out_shape,). | required |
x | ndarray | Point at which to compute Jacobian, shape (n,). | required |
d | float | Relative step size multiplier (default 1e-4). | 0.0001 |
eps | float | Absolute step size for small x values (default 1e-4). | 0.0001 |
r | int | Number of Richardson iterations (default 4). | 4 |
v | int | Step reduction factor (default 2). | 2 |
zero_tol | float | None | Threshold for small x values (default sqrt(eps/7e-7)). | None |
Returns:
| Type | Description |
|---|---|
ndarray | Jacobian array of shape (*out_shape, n) where out_shape = func(x).shape. |
ndarray | For matrix-valued functions (e.g., Vcov matrix), returns shape (p, p, n). |
Notes:
Each column of the Jacobian is computed using Richardson extrapolation
For func: R^n -> R^(p×p), returns array of shape (p, p, n)
Matches numDeriv’s row-major flattening convention
Requires 2*r*n function evaluations
Examples:
For Vcov(beta): R^k -> R^(p×p), the Jacobian has shape (p, p, k) where Jac[i, j, l] = ∂Vcov[i,j]/∂varpar[l]
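A toy version of this shape convention, using a made-up 2-parameter Vcov stand-in and a plain central-difference Jacobian in place of `compute_jacobian_richardson` (both are illustrative assumptions):

```python
import numpy as np

def toy_vcov(theta):
    """Hypothetical Vcov(theta): R^2 -> R^(2x2), standing in for Vcov(beta)."""
    return np.array([[theta[0], theta[0] * theta[1]],
                     [theta[0] * theta[1], theta[1]**2]])

def jac(func, x, h=1e-6):
    """Central-difference stand-in for compute_jacobian_richardson."""
    out = np.asarray(func(x))
    J = np.empty(out.shape + (x.size,))
    for k in range(x.size):
        e = np.zeros(x.size)
        e[k] = h
        J[..., k] = (func(x + e) - func(x - e)) / (2 * h)
    return J

J = jac(toy_vcov, np.array([1.0, 2.0]))
# J has shape (2, 2, 2): J[i, j, l] = d Vcov[i, j] / d theta[l]
```

For instance, Vcov[0, 1] = theta_0 * theta_1, so J[0, 1] recovers its gradient (theta_1, theta_0) = (2, 1) at the evaluation point.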