gpmp.modeldiagnosis module
The modeldiagnosis module reports parameter-selection results, predictive
performance, leave-one-out scores, selection-criterion cross sections, and
simple data summaries. Use it after parameters have been selected and
model.covparam is set.
Typical inputs
model: a gpmp.core.Model with selected covparam.
info: the diagnostics object returned by gpmp.kernel.select_* with info=True.
xi, zi: observation points and scalar observations.
Inputs can be backend-native gpmp.num objects. Reporting and plotting
helpers convert to NumPy internally where required by tabular display or
Matplotlib.
Main diagnostics
diag(model, info, xi, zi) prints a model report. perf prints
prediction-performance metrics. selection_criterion_statistics and
selection_criterion_statistics_fast summarize one-dimensional criterion
cross sections around selected covariance parameters.
The performance metrics use the following definitions. For targets z,
tss is \(\sum_i (z_i - \bar z)^2\). For leave-one-out prediction,
press is \(\sum_i (z_i - \widehat z_{-i}(x_i))^2\) and
Q2 is \(1 - \mathrm{press}/\mathrm{tss}\). For test-set prediction,
rss is \(\sum_i (z_i - \widehat z(x_i))^2\) and R2 is
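\(1 - \mathrm{rss}/\mathrm{tss}\). rmse is
\(\sqrt{\mathrm{sse}/n}\), where sse is the relevant sum of squared errors
(press for leave-one-out, rss for test-set prediction) and n is the number
of reference values. rmse/std(z) divides it by the empirical
standard deviation of the reference values.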
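The definitions above can be checked by hand on a small example. The following is a minimal NumPy sketch, not the module's implementation: the helper name `score_metrics` is hypothetical, and the population standard deviation (`ddof=0`) is an assumption, since the text does not say which convention `std(z)` uses. Passing leave-one-out predictions gives press and Q2; passing test-set predictions gives rss and R2.

```python
import numpy as np

def score_metrics(z, z_pred):
    """Hypothetical helper: tss, sse, 1 - sse/tss, rmse, rmse/std(z)."""
    z = np.asarray(z, dtype=float)
    z_pred = np.asarray(z_pred, dtype=float)
    n = z.size
    tss = float(np.sum((z - z.mean()) ** 2))   # total sum of squares
    sse = float(np.sum((z - z_pred) ** 2))     # press (LOO) or rss (test set)
    rmse = float(np.sqrt(sse / n))
    return {
        "tss": tss,
        "sse": sse,
        "score": 1.0 - sse / tss,              # Q2 (LOO) or R2 (test set)
        "rmse": rmse,
        # Assumption: population std (ddof=0); the library may differ.
        "rmse_over_std": rmse / float(np.std(z)),
    }

m = score_metrics([1.0, 2.0, 3.0, 4.0], [1.0, 2.0, 3.0, 5.0])
```

Here tss is 5, the sum of squared errors is 1, and the score is 0.8.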
Plotting functions are loaded lazily to avoid importing Matplotlib at package import time.
Model diagnosis utilities for GPmp.
Notes
This package groups helpers for:
parameter statistics on selection criteria;
predictive performance metrics;
report construction and display;
plotting helpers;
small utilities for printing and data description.
Public API
The most common entry points are re-exported at package level. Importing gpmp.modeldiagnosis does not import matplotlib. Plotting functions are imported lazily via the plotting submodule.
- class gpmp.modeldiagnosis.Unnormalized1DDistribution(log_pdf: Callable[[float], float], bounds: Tuple[float, float], *, quad_opts: dict | None = None)
One-dimensional distribution defined by an unnormalized scalar log-pdf.
- Parameters:
log_pdf (callable) – Function log_pdf(x: float) -> float.
bounds (tuple of float) – Integration bounds (a, b) with a < b. May be infinite for normalization/integration.
quad_opts (dict, optional) – Extra keyword arguments passed to scipy.integrate.quad.
- log_pdf
Stored log-pdf callable.
- Type:
callable
- bounds
Stored bounds (a, b).
- Type:
tuple of float
- Z
Normalization constant.
- Type:
float
Notes
Quantiles require finite bounds.
- cdf(x: float) -> float
Evaluate the CDF at a scalar point.
- Parameters:
x (float) – Evaluation point.
- Returns:
CDF value.
- Return type:
float
- f(x: Sequence[float]) -> ndarray[tuple[Any, ...], dtype[floating]]
Evaluate the unnormalized density on a 1D grid.
- Parameters:
x (sequence of float) – Evaluation points.
- Returns:
Unnormalized density values.
- Return type:
gpmp.num.ndarray
- pdf(x: Sequence[float]) -> ndarray[tuple[Any, ...], dtype[floating]]
Evaluate the normalized density on a 1D grid.
- Parameters:
x (sequence of float) – Evaluation points.
- Returns:
Density values.
- Return type:
gpmp.num.ndarray
- quantile(p: float, *, xtol: float = 1e-06) -> float
Compute the quantile at level p.
- Parameters:
p (float) – Probability level in (0, 1).
xtol (float, optional) – Absolute tolerance for brentq.
- Returns:
Quantile value.
- Return type:
float
- Raises:
ValueError – If p is outside (0, 1) or bounds are not finite.
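To illustrate how normalization and quantiles fit together for an unnormalized log-pdf, here is a minimal standalone sketch built on scipy.integrate.quad and scipy.optimize.brentq, the two SciPy routines named above. It is not the class's code: the function name `quantile_from_log_pdf` is hypothetical, and the way the CDF is re-integrated at each root-finding step is an assumption made for brevity.

```python
import math
from scipy.integrate import quad
from scipy.optimize import brentq

def quantile_from_log_pdf(log_pdf, bounds, p, xtol=1e-6):
    """Hypothetical sketch: quantile of exp(log_pdf) normalized on (a, b)."""
    a, b = bounds
    if not 0.0 < p < 1.0:
        raise ValueError("p must lie in (0, 1)")
    if not (math.isfinite(a) and math.isfinite(b)):
        raise ValueError("quantiles require finite bounds")

    def f(x):
        return math.exp(log_pdf(x))          # unnormalized density

    Z, _ = quad(f, a, b)                     # normalization constant

    def cdf(x):
        return quad(f, a, x)[0] / Z

    # The CDF is monotone on [a, b], so a bracketing root finder applies.
    return brentq(lambda x: cdf(x) - p, a, b, xtol=xtol)

# Constant log-pdf on [2, 6] is a uniform distribution.
q_uniform = quantile_from_log_pdf(lambda x: 0.0, (2.0, 6.0), 0.25)
# exp(-x) truncated to [0, 5].
q_expo = quantile_from_log_pdf(lambda x: -x, (0.0, 5.0), 0.5)
```

The finite-bounds check mirrors the note above: the root finder needs a bracketing interval, which infinite bounds do not provide.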
- gpmp.modeldiagnosis.compute_performance(model: Any, xi: Any, zi: Any, loo: bool = True, loo_res: Tuple[Any, Any, Any] | None = None, xtzt: Tuple[Any, Any] | None = None, zpmzpv: Tuple[Any, Any] | None = None, compute_pit: bool = False) -> Dict[str, Any]
Compute LOO and optional test-set performance metrics.
- Parameters:
model (object) – Must provide:
  - loo(xi, zi) -> (zloom, zloov, eloo)
  - predict(xi, zi, xt) -> (zpm, zpv)
xi (array-like) – Observation inputs, shape (n, d).
zi (array-like) – Observation targets.
loo (bool, optional) – If True, compute LOO metrics.
loo_res (tuple, optional) – Precomputed (zloom, zloov, eloo).
xtzt (tuple, optional) – (xt, zt) for test-set metrics.
zpmzpv (tuple, optional) – Precomputed (zpm, zpv) for the test set.
compute_pit (bool, optional) – If True, include PIT arrays when available.
- Returns:
LOO keys (if loo is True):
  - loo_n, loo_std, loo_tss, loo_press
  - loo_press_over_tss, loo_log10_press_over_tss
  - loo_rmse, loo_rmse_over_std, loo_Q2
  - loo_pit (optional)
Test keys (if xtzt is not None):
  - test_n, test_std, test_tss, test_rss
  - test_rss_over_tss, test_log10_rss_over_tss
  - test_rmse, test_rmse_over_std, test_R2
  - test_pit (optional)
- Return type:
dict
- gpmp.modeldiagnosis.describe_array(x, rownames, sigma_factor=None)
Build simple descriptive statistics for an array.
- Parameters:
x (array_like) – Input data. Shape (n,) or (n, d).
rownames (list of str) – Row names for the output DataFrame. Length 1 if x is 1D, else d.
sigma_factor (float or array_like, optional) – Scaling used for the last column. If scalar, the same factor is applied to all dimensions. If array_like, must have length d.
- Returns:
Statistics per dimension.
- Return type:
DataFrame
- gpmp.modeldiagnosis.diag(model: Any, info_select_parameters: Any, xi: Any, zi: Any, *, model_type: str = 'linear_mean_matern_anisotropic', param_obj: Any | None = None) -> None
Build and display a model diagnosis report.
- Parameters:
model (object) – GP model.
info_select_parameters (object) – Selection/optimization info passed to modeldiagnosis_init.
xi (array-like) – Observation inputs.
zi (array-like) – Observation targets.
model_type (str, optional) – Passed to modeldiagnosis_init when param_obj is not provided.
param_obj (Param, optional) – If provided, use this Param directly.
- gpmp.modeldiagnosis.fast_univariate_stats(single_param_fn: Callable[[float], Any], lower_bound: float, upper_bound: float, n_points: int = 100) -> Tuple[float, float, Dict[str, float], float]
Compute weighted statistics on a scalar function evaluated on a grid.
The pseudo density is w(x) = exp(-single_param_fn(x)).
- Parameters:
single_param_fn (callable) – Function of one scalar returning a scalar-like value.
lower_bound (float) – Lower integration bound.
upper_bound (float) – Upper integration bound.
n_points (int, optional) – Number of grid points.
- Returns:
mean_val (float)
variance (float)
quantiles (dict) – Keys are “0.1”, “0.25”, “0.5”, “0.75”, “0.9”.
mode_val (float) – Grid mode (argmax of w).
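The grid-based statistics described above can be sketched as follows. This is an illustration of the idea under stated assumptions, not the function's actual code: the name `grid_weighted_stats`, the trapezoid rule, the linear interpolation of the CDF, and the shift by the criterion minimum (which cancels after normalization) are all choices made for this example.

```python
import numpy as np

def _trapz(y, x):
    # Trapezoid rule written out to stay independent of the NumPy version.
    return float(np.sum(0.5 * (y[1:] + y[:-1]) * np.diff(x)))

def grid_weighted_stats(fn, lower, upper, n_points=201):
    """Hypothetical sketch: statistics of w(x) = exp(-fn(x)) on a grid."""
    x = np.linspace(lower, upper, n_points)
    c = np.array([float(fn(v)) for v in x])
    w = np.exp(-(c - c.min()))               # shift avoids overflow/underflow
    steps = 0.5 * (w[1:] + w[:-1]) * np.diff(x)
    cdf = np.concatenate(([0.0], np.cumsum(steps)))
    total = cdf[-1]
    pdf, cdf = w / total, cdf / total
    mean = _trapz(x * pdf, x)
    variance = _trapz((x - mean) ** 2 * pdf, x)
    levels = (0.1, 0.25, 0.5, 0.75, 0.9)
    quantiles = {str(p): float(np.interp(p, cdf, x)) for p in levels}
    mode = float(x[np.argmax(w)])            # grid mode
    return mean, variance, quantiles, mode

# A quadratic criterion 0.5 * x**2 corresponds to a standard normal density.
mean, variance, quantiles, mode = grid_weighted_stats(
    lambda x: 0.5 * x**2, -6.0, 6.0
)
```

With the quadratic criterion, the mean and mode sit at 0 and the variance is close to 1, which gives a quick sanity check of the grid resolution.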
- gpmp.modeldiagnosis.make_single_param_criterion_function(selection_criterion: Callable[[Any], Any], covparam: Any, param_index: int) -> Callable[[float], Any]
Freeze all parameters except one in a covariance parameter vector.
- Parameters:
selection_criterion (callable) – Function f(covparam) -> scalar-like.
covparam (array-like) – Reference covariance parameter vector.
param_index (int) – Index of the parameter to vary.
- Returns:
Function g(x) that evaluates f(covparam with covparam[param_index]=x).
- Return type:
callable
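Freezing all parameters but one amounts to a closure over a copy of the reference vector. The sketch below shows the pattern with NumPy arrays; the name `freeze_all_but_one` is hypothetical, and the real function may handle backend-native arrays differently.

```python
import numpy as np

def freeze_all_but_one(criterion, covparam, param_index):
    """Hypothetical sketch: return g(x) = criterion(covparam with one entry set to x)."""
    base = np.array(covparam, dtype=float)   # private copy of the reference vector

    def g(x):
        p = base.copy()                      # never mutate the reference
        p[param_index] = x
        return criterion(p)

    return g

# Toy criterion: sum of squares of the parameters.
g = freeze_all_but_one(lambda p: float(np.sum(p ** 2)), [1.0, 2.0, 3.0], 1)
values = (g(0.0), g(2.0))
```

Copying inside `g` keeps successive evaluations independent, so the returned function can be passed safely to a grid or quadrature routine.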
- gpmp.modeldiagnosis.model_diagnosis_disp(md: Dict[str, Any], xi: Any, zi: Any, *, model_type: str = 'linear_mean_matern_anisotropic') -> None
Print a diagnosis report.
- Parameters:
md (dict) – Output of modeldiagnosis_init. Must contain “param_obj” and “parameters”.
xi (array-like) – Inputs, shape (n, d).
zi (array-like) – Targets, shape (n,) or (n, p).
model_type (str, optional) – Unused. Kept for backward compatibility with the former monolithic module.
- gpmp.modeldiagnosis.modeldiagnosis_init(model: Any, info: Any, *, model_type: str = 'linear_mean_matern_anisotropic', param_obj: Any | None = None) -> Dict[str, Any]
Build a diagnosis dictionary from a model and selection/optimization info.
- Parameters:
model (object) –
Must expose these attributes:
covparam
meanparam (optional)
info (object) –
Must expose these attributes:
success
best_value_returned
nfev
total_time
selection_criterion(params) for initial_val
initial_params
fun
May expose:
bounds: array-like, shape (n_total_params, 2). Bounds in optimizer parameter space, ordered as [meanparam, covparam].
model_type (str, optional) –
Used only when param_obj is not provided. Supported values:
"linear_mean_matern_anisotropic"
"linear_mean_matern_anisotropic_noisy"
param_obj (Param, optional) – If provided, this Param is used directly. If info.bounds is present, bounds are still projected onto the covariance part of this Param when possible.
- Returns:
md – Keys:
"optim_info": info
"param_selection": dict
"parameters": dict, from param_obj.to_simple_dict()
"param_obj": Param
"loo": dict, reserved
"data": dict, reserved
- Return type:
dict
- gpmp.modeldiagnosis.perf(model: Any, xi: Any, zi: Any, loo: bool = True, loo_res: Tuple[Any, Any, Any] | None = None, xtzt: Tuple[Any, Any] | None = None, zpmzpv: Tuple[Any, Any] | None = None) -> None
Print compute_performance() results (PIT omitted) as DataFrames.
- Parameters:
model (object) – See compute_performance.
xi (array-like) – See compute_performance.
zi (array-like) – See compute_performance.
loo (bool, optional) – If True, include LOO metrics.
loo_res (tuple, optional) – Precomputed (zloom, zloov, eloo).
xtzt (tuple, optional) – (xt, zt) for test-set metrics.
zpmzpv (tuple, optional) – Precomputed (zpm, zpv) for the test set.
- gpmp.modeldiagnosis.pretty_print_dictionary(d: Dict[str, Any], fp: int = 4) -> None
Print a dictionary with aligned keys.
- Parameters:
d (dict) – Values can be scalars or backend arrays with one element.
fp (int, optional) – Number of decimal places for float formatting.
- Return type:
None
- gpmp.modeldiagnosis.pretty_print_dictionnary(d: Dict[str, Any], fp: int = 4) -> None
Backward-compatible alias for pretty_print_dictionary.
- Parameters:
d (dict)
fp (int, optional)
- Return type:
None
- gpmp.modeldiagnosis.selection_criterion_statistics(info: Any | None = None, model: Any | None = None, xi: Any | None = None, selection_criterion: Callable[[Any], Any] | None = None, covparam: Any | None = None, ind: Iterable[int] | None = None, param_box: Any | None = None, delta: float = 5.0, verbose: bool = False) -> Dict[str, Any]
Integration-based parameter statistics and Fisher information.
Each 1D marginal uses the pseudo log-pdf log p(x) = -criterion(x).
- Parameters:
info – Same meaning as in selection_criterion_statistics_fast.
model – Same meaning as in selection_criterion_statistics_fast.
xi – Same meaning as in selection_criterion_statistics_fast.
selection_criterion – Same meaning as in selection_criterion_statistics_fast.
covparam – Same meaning as in selection_criterion_statistics_fast.
ind – Same meaning as in selection_criterion_statistics_fast.
param_box – Same meaning as in selection_criterion_statistics_fast.
delta – Same meaning as in selection_criterion_statistics_fast.
verbose – Same meaning as in selection_criterion_statistics_fast.
- Returns:
Keys are “parameter_statistics” (DataFrame) and “fisher_information”.
- Return type:
dict
- gpmp.modeldiagnosis.selection_criterion_statistics_fast(info: Any | None = None, model: Any | None = None, xi: Any | None = None, selection_criterion: Callable[[Any], Any] | None = None, covparam: Any | None = None, ind: Iterable[int] | None = None, param_box: Any | None = None, delta: float = 5.0, n_points: int = 250, verbose: bool = False) -> Dict[str, Any]
Grid-based parameter statistics and Fisher information.
- Parameters:
info (object, optional) – If provided, defaults are taken from attributes: selection_criterion_nograd, covparam, model, xi.
model (object) – Must expose fisher_information(xi, covparam, epsilon=…).
xi (array-like) – Inputs passed to fisher_information.
selection_criterion (callable, optional) – Function f(covparam) -> scalar-like.
covparam (array-like, optional) – Covariance parameter vector.
ind (iterable of int, optional) – Parameter indices. Default is all.
param_box (array-like, optional) – Shape (2, n_params). Bounds per parameter.
delta (float, optional) – Range is [opt - delta, opt + delta] when param_box is None.
n_points (int, optional) – Grid size per parameter.
verbose (bool, optional) – Print per-parameter summaries.
- Returns:
Keys are “parameter_statistics” (DataFrame) and “fisher_information”.
- Return type:
dict
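As an illustration of the default range [opt - delta, opt + delta] and the (2, n_params) layout of param_box, the sketch below builds such a box and a per-parameter grid. The helper names are hypothetical, and the row order (lower bounds in row 0, upper bounds in row 1) is an assumption that the text above does not state.

```python
import numpy as np

def default_param_box(covparam, delta=5.0):
    """Hypothetical helper: box of shape (2, n_params) centered on covparam."""
    opt = np.asarray(covparam, dtype=float)
    # Assumption: row 0 holds lower bounds, row 1 holds upper bounds.
    return np.vstack([opt - delta, opt + delta])

def parameter_grid(param_box, index, n_points=250):
    """Hypothetical helper: evaluation grid for one parameter."""
    return np.linspace(param_box[0, index], param_box[1, index], n_points)

box = default_param_box([0.0, 2.0, -1.0], delta=5.0)
grid = parameter_grid(box, 1)
```

A narrower box around a sharply peaked criterion concentrates the grid points where the pseudo density has mass.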
- gpmp.modeldiagnosis.sigma_rho_from_covparam(covparam: Any) -> Dict[str, Any]
Extract sigma and rho parameters from a covariance parameter vector.
Assumes the convention:
  - covparam[0] = log(sigma^2)
  - covparam[i] = log(1/rho_{i-1}) for i >= 1
- Parameters:
covparam (array-like, shape (p,)) – Covariance parameters.
- Returns:
Dictionary with keys:
  - "sigma": sigma
  - "rho0", "rho1", … : rho values
- Return type:
dict
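Inverting the stated convention gives sigma = exp(covparam[0] / 2) and rho_{i-1} = exp(-covparam[i]). A minimal NumPy sketch of that conversion follows; the function name `sigma_rho_sketch` is hypothetical and the code is not the library's implementation.

```python
import numpy as np

def sigma_rho_sketch(covparam):
    """Hypothetical sketch of the log-parameter conversion."""
    p = np.asarray(covparam, dtype=float)
    out = {"sigma": float(np.exp(0.5 * p[0]))}        # covparam[0] = log(sigma^2)
    for i, v in enumerate(p[1:]):
        out[f"rho{i}"] = float(np.exp(-v))            # covparam[i+1] = log(1/rho_i)
    return out

# sigma^2 = 4, rho0 = 2, rho1 = 0.2
params = sigma_rho_sketch([np.log(4.0), -np.log(2.0), np.log(5.0)])
```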