gpmp.modeldiagnosis module

The modeldiagnosis module reports parameter-selection results, predictive performance, leave-one-out scores, selection-criterion cross sections, and simple data summaries. Use it after parameters have been selected and model.covparam is set.

Typical inputs

model: a gpmp.core.Model with selected covparam.
info: the diagnostics object returned by gpmp.kernel.select_* with info=True.
xi, zi: observation points and scalar observations.

Inputs can be backend-native gpmp.num objects. Reporting and plotting helpers convert to NumPy internally where required by tabular display or Matplotlib.

Main diagnostics

diag(model, info, xi, zi) prints a model report. perf prints prediction-performance metrics. selection_criterion_statistics and selection_criterion_statistics_fast summarize one-dimensional criterion cross sections around selected covariance parameters.

The performance metrics use the following definitions. For targets z with mean \(\bar z\), tss is \(\sum_i (z_i - \bar z)^2\). For leave-one-out prediction, press is \(\sum_i (z_i - \widehat z_{-i}(x_i))^2\) and Q2 is \(1 - \mathrm{press}/\mathrm{tss}\). For test-set prediction, rss is \(\sum_i (z_i - \widehat z(x_i))^2\) and R2 is \(1 - \mathrm{rss}/\mathrm{tss}\). rmse is \(\sqrt{\mathrm{sse}/n}\), where sse is press for leave-one-out prediction and rss for test-set prediction, and n is the number of reference values. rmse/std(z) divides rmse by the empirical standard deviation of the reference values.
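As a worked illustration of these definitions, the following minimal sketch computes the metrics in plain Python. The function name performance_metrics is hypothetical and is not part of gpmp. The standard deviation is taken with a 1/n normalization, which is an assumption; the library may normalize differently.

```python
import math

def performance_metrics(z, z_pred):
    """Return tss, sse, rmse, rmse/std and 1 - sse/tss for scalar targets."""
    n = len(z)
    z_bar = sum(z) / n
    tss = sum((zi - z_bar) ** 2 for zi in z)
    # sse is "press" when z_pred holds leave-one-out predictions,
    # and "rss" when z_pred holds test-set predictions.
    sse = sum((zi - pi) ** 2 for zi, pi in zip(z, z_pred))
    rmse = math.sqrt(sse / n)
    # Assumption: population standard deviation (divide by n).
    std = math.sqrt(tss / n)
    return {
        "tss": tss,
        "sse": sse,
        "rmse": rmse,
        "rmse_over_std": rmse / std,
        "score": 1.0 - sse / tss,  # Q2 for LOO, R2 for a test set
    }

metrics = performance_metrics([1.0, 2.0, 3.0, 4.0], [1.5, 2.0, 2.5, 4.0])
```

With the 1/n convention used here, rmse/std equals the square root of sse/tss, so the two ratios carry the same information on different scales.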

Plotting functions are loaded lazily to avoid importing Matplotlib at package import time.

Model diagnosis utilities for GPmp.

Notes

This package groups helpers for:

parameter statistics on selection criteria;
predictive performance metrics;
report construction and display;
plotting helpers;
small utilities for printing and data description.

Public API

The most common entry points are re-exported at package level. Importing gpmp.modeldiagnosis does not import matplotlib. Plotting functions are imported lazily via the plotting submodule.
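One common way to obtain this kind of lazy loading in Python is a module-level __getattr__ hook (PEP 562). The sketch below is illustrative only and does not claim to reproduce how GPmp implements it: demo_pkg is a hypothetical package, and the standard-library json module stands in for the Matplotlib-dependent plotting submodule.

```python
import importlib
import types

# Hypothetical stand-in package; "json" plays the role of the heavy dependency.
demo = types.ModuleType("demo_pkg")
_LAZY = {"plotting": "json"}
load_count = {"n": 0}

def _lazy_getattr(name):
    # Called only when normal attribute lookup on the module fails (PEP 562).
    if name in _LAZY:
        load_count["n"] += 1
        module = importlib.import_module(_LAZY[name])
        setattr(demo, name, module)  # cache so later lookups bypass this hook
        return module
    raise AttributeError(f"module {demo.__name__!r} has no attribute {name!r}")

demo.__getattr__ = _lazy_getattr
```

Nothing is imported until demo.plotting is first accessed. Later accesses find the cached attribute directly.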

class gpmp.modeldiagnosis.Unnormalized1DDistribution(log_pdf: Callable[[float], float], bounds: Tuple[float, float], *, quad_opts: dict | None = None)

One-dimensional distribution defined by an unnormalized scalar log-pdf.

Parameters:

log_pdf (callable) – Function log_pdf(x: float) -> float.
bounds (tuple of float) – Integration bounds (a, b) with a < b. May be infinite for normalization/integration.
quad_opts (dict, optional) – Extra keyword arguments passed to scipy.integrate.quad.

log_pdf

Stored log-pdf callable.

Type: callable

bounds

Stored bounds (a, b).

Type: tuple of float

Z

Normalization constant.

Type: float

Notes

Quantiles require finite bounds.

cdf(x: float) → float

Evaluate the CDF at a scalar point.

Parameters: x (float) – Evaluation point.
Returns: CDF value.
Return type: float

f(x: Sequence[float]) → ndarray[tuple[Any, ...], dtype[floating]]

Evaluate the unnormalized density on a 1D grid.

Parameters: x (sequence of float) – Evaluation points.
Returns: Unnormalized density values.
Return type: gpmp.num.ndarray

mean() → float

Compute the mean.

Returns: Mean.
Return type: float

pdf(x: Sequence[float]) → ndarray[tuple[Any, ...], dtype[floating]]

Evaluate the normalized density on a 1D grid.

Parameters: x (sequence of float) – Evaluation points.
Returns: Density values.
Return type: gpmp.num.ndarray

quantile(p: float, *, xtol: float = 1e-06) → float

Compute the quantile at level p.

Parameters:

p (float) – Probability level in (0, 1).
xtol (float, optional) – Absolute tolerance for brentq.

Returns:

Quantile value.

Return type:

float

Raises:

ValueError – If p is outside (0, 1) or bounds are not finite.

var() → float

Compute the variance.

Returns: Variance.
Return type: float
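To make the roles of Z, pdf, cdf and quantile concrete, here is a minimal sketch of the same idea in plain Python. It is not the class documented above: GridDistribution is a hypothetical name, and it replaces scipy.integrate.quad with a composite trapezoid rule and brentq with bisection, so it only accepts finite bounds.

```python
import math

class GridDistribution:
    """Hypothetical stand-in: normalizes exp(log_pdf) on finite bounds."""

    def __init__(self, log_pdf, bounds, n=2001):
        a, b = bounds
        if not (math.isfinite(a) and math.isfinite(b) and a < b):
            raise ValueError("this sketch needs finite bounds with a < b")
        self.log_pdf, self.bounds, self.n = log_pdf, (a, b), n
        # Normalization constant: integral of the unnormalized density.
        self.Z = self._integrate(lambda x: math.exp(log_pdf(x)), a, b)

    def _integrate(self, g, lo, hi):
        # Composite trapezoid rule with n nodes on [lo, hi].
        if hi <= lo:
            return 0.0
        h = (hi - lo) / (self.n - 1)
        inner = sum(g(lo + k * h) for k in range(1, self.n - 1))
        return h * (0.5 * (g(lo) + g(hi)) + inner)

    def pdf(self, x):
        return math.exp(self.log_pdf(x)) / self.Z

    def cdf(self, x):
        a, b = self.bounds
        return self._integrate(self.pdf, a, min(max(x, a), b))

    def mean(self):
        return self._integrate(lambda x: x * self.pdf(x), *self.bounds)

    def quantile(self, p, xtol=1e-6):
        if not 0.0 < p < 1.0:
            raise ValueError("p must lie in (0, 1)")
        lo, hi = self.bounds
        while hi - lo > xtol:  # bisection on the monotone CDF
            mid = 0.5 * (lo + hi)
            lo, hi = (mid, hi) if self.cdf(mid) < p else (lo, mid)
        return 0.5 * (lo + hi)

# Unnormalized standard normal restricted to [-6, 6]
dist = GridDistribution(lambda x: -0.5 * x * x, (-6.0, 6.0))
```

For this example, dist.Z is close to the Gaussian normalization constant sqrt(2*pi), and the median returned by dist.quantile(0.5) is close to 0.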

gpmp.modeldiagnosis.compute_performance(model: Any, xi: Any, zi: Any, loo: bool = True, loo_res: Tuple[Any, Any, Any] | None = None, xtzt: Tuple[Any, Any] | None = None, zpmzpv: Tuple[Any, Any] | None = None, compute_pit: bool = False) → Dict[str, Any]

Compute LOO and optional test-set performance metrics.

Parameters:

model (object) –
Must provide:
- loo(xi, zi) -> (zloom, zloov, eloo)
- predict(xi, zi, xt) -> (zpm, zpv)
xi (array-like) – Observation inputs, shape (n, d).
zi (array-like) – Observation targets.
loo (bool, optional) – If True, compute LOO metrics.
loo_res (tuple, optional) – Precomputed (zloom, zloov, eloo).
xtzt (tuple, optional) – (xt, zt) for test-set metrics.
zpmzpv (tuple, optional) – Precomputed (zpm, zpv) for the test set.
compute_pit (bool, optional) – If True, include PIT arrays when available.

Returns:

LOO keys (if loo is True):
- loo_n, loo_std, loo_tss, loo_press
- loo_press_over_tss, loo_log10_press_over_tss
- loo_rmse, loo_rmse_over_std, loo_Q2
- loo_pit (optional)

Test keys (if xtzt is not None):
- test_n, test_std, test_tss, test_rss
- test_rss_over_tss, test_log10_rss_over_tss
- test_rmse, test_rmse_over_std, test_R2
- test_pit (optional)

Return type:

dict
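The PIT arrays are not defined in this page. Assuming PIT means the probability integral transform under a Gaussian predictive distribution, a value is obtained by evaluating the predictive CDF at the observed target. The sketch below illustrates that reading; gaussian_pit is a hypothetical helper, not a gpmp function, and the library's own computation may differ.

```python
import math

def gaussian_pit(z, mean, var):
    """PIT values Phi((z - mean) / sqrt(var)) for Gaussian predictions."""
    pit = []
    for zi, mi, vi in zip(z, mean, var):
        u = (zi - mi) / math.sqrt(vi)  # standardized residual
        pit.append(0.5 * (1.0 + math.erf(u / math.sqrt(2.0))))
    return pit

# Targets, predictive means, predictive variances (illustrative values)
pit = gaussian_pit([0.0, 1.0, -2.0], [0.0, 0.0, 0.0], [1.0, 1.0, 4.0])
```

If the predictive distributions are well calibrated, such values should look approximately uniform on [0, 1].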

gpmp.modeldiagnosis.describe_array(x, rownames, sigma_factor=None)

Build simple descriptive statistics for an array.

Parameters:

x (array_like) – Input data. Shape (n,) or (n, d).
rownames (list of str) – Row names for the output DataFrame. Length 1 if x is 1D, else d.
sigma_factor (float or array_like, optional) – Scaling used for the last column. If scalar, the same factor is applied to all dimensions. If array_like, must have length d.

Returns:

Statistics per dimension.

Return type:

DataFrame

gpmp.modeldiagnosis.diag(model: Any, info_select_parameters: Any, xi: Any, zi: Any, *, model_type: str = 'linear_mean_matern_anisotropic', param_obj: Any | None = None) → None

Build and display a model diagnosis report.

Parameters:

model (object) – GP model.
info_select_parameters (object) – Selection/optimization info passed to modeldiagnosis_init.
xi (array-like) – Observation inputs.
zi (array-like) – Observation targets.
model_type (str, optional) – Passed to modeldiagnosis_init when param_obj is not provided.
param_obj (Param, optional) – If provided, use this Param directly.

gpmp.modeldiagnosis.fast_univariate_stats(single_param_fn: Callable[[float], Any], lower_bound: float, upper_bound: float, n_points: int = 100) → Tuple[float, float, Dict[str, float], float]

Compute weighted statistics on a scalar function evaluated on a grid.

The pseudo density is w(x) = exp(-single_param_fn(x)).

Parameters:

single_param_fn (callable) – Function of one scalar returning a scalar-like value.
lower_bound (float) – Lower integration bound.
upper_bound (float) – Upper integration bound.
n_points (int, optional) – Number of grid points.

Returns:

mean_val (float)
variance (float)
quantiles (dict) – Keys are "0.1", "0.25", "0.5", "0.75", "0.9".
mode_val (float) – Grid mode (argmax of w).
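The following sketch shows one way to compute such grid statistics from the pseudo density w(x) = exp(-single_param_fn(x)). It is an illustration under assumptions, not the library code: grid_stats is a hypothetical name, the grid is uniform, and each quantile is taken as the first grid point at which the cumulative normalized weight reaches the level.

```python
import math

def grid_stats(fn, lower, upper, n_points=100,
               levels=(0.1, 0.25, 0.5, 0.75, 0.9)):
    """Weighted mean, variance, quantiles and mode for w(x) = exp(-fn(x))."""
    step = (upper - lower) / (n_points - 1)
    xs = [lower + k * step for k in range(n_points)]
    vals = [fn(x) for x in xs]
    shift = min(vals)  # subtract the minimum so exp() cannot overflow
    w = [math.exp(-(v - shift)) for v in vals]
    total = sum(w)
    probs = [wi / total for wi in w]
    mean = sum(p * x for p, x in zip(probs, xs))
    var = sum(p * (x - mean) ** 2 for p, x in zip(probs, xs))
    quantiles, acc, k = {}, 0.0, 0
    for level in levels:
        # advance until the cumulative weight first reaches the level
        while k < n_points - 1 and acc + probs[k] < level:
            acc += probs[k]
            k += 1
        quantiles[str(level)] = xs[k]
    mode = xs[max(range(n_points), key=w.__getitem__)]
    return mean, var, quantiles, mode

# Quadratic criterion: the pseudo density is Gaussian with mean 1, variance 1
mean, var, q, mode = grid_stats(lambda x: 0.5 * (x - 1.0) ** 2,
                                -5.0, 7.0, n_points=241)
```

Shifting the criterion by its minimum leaves the normalized weights unchanged, which is why only differences in criterion value matter for these statistics.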

gpmp.modeldiagnosis.make_single_param_criterion_function(selection_criterion: Callable[[Any], Any], covparam: Any, param_index: int) → Callable[[float], Any]

Freeze all parameters except one in a covariance parameter vector.

Parameters:

selection_criterion (callable) – Function f(covparam) -> scalar-like.
covparam (array-like) – Reference covariance parameter vector.
param_index (int) – Index of the parameter to vary.

Returns:

Function g(x) that evaluates f(covparam with covparam[param_index]=x).

Return type:

callable
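The idea of freezing all parameters except one can be sketched with a closure. The helper below is a hypothetical illustration on plain Python lists; the documented function operates on gpmp covariance parameter vectors and may handle copying differently.

```python
def freeze_all_but_one(criterion, covparam, param_index):
    """Return g(x) = criterion(covparam with entry param_index set to x)."""
    reference = list(covparam)  # copy so the caller's vector is never mutated

    def g(x):
        trial = list(reference)
        trial[param_index] = x
        return criterion(trial)

    return g

# Hypothetical quadratic criterion with its minimum at (1, -2, 0.5)
def criterion(p):
    return (p[0] - 1.0) ** 2 + (p[1] + 2.0) ** 2 + (p[2] - 0.5) ** 2

g = freeze_all_but_one(criterion, [1.0, -2.0, 0.5], param_index=1)
```

Here g(-2.0) evaluates the criterion at the reference vector, and g(0.0) moves only the second parameter away from it.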

gpmp.modeldiagnosis.model_diagnosis_disp(md: Dict[str, Any], xi: Any, zi: Any, *, model_type: str = 'linear_mean_matern_anisotropic') → None

Print a diagnosis report.

Parameters:

md (dict) – Output of modeldiagnosis_init. Must contain "param_obj" and "parameters".
xi (array-like) – Inputs, shape (n, d).
zi (array-like) – Targets, shape (n,) or (n, p).
model_type (str, optional) – Unused. Kept for backward compatibility with the former monolithic module.

gpmp.modeldiagnosis.modeldiagnosis_init(model: Any, info: Any, *, model_type: str = 'linear_mean_matern_anisotropic', param_obj: Any | None = None) → Dict[str, Any]

Build a diagnosis dictionary from a model and selection/optimization info.

Parameters:

model (object) –
Must expose these attributes:
- covparam
- meanparam (optional)
info (object) –
Must expose these attributes:
- success
- best_value_returned
- nfev
- total_time
- selection_criterion(params), used to compute initial_val
- initial_params
- fun
May expose:
- bounds : array-like, shape (n_total_params, 2). Bounds in optimizer parameter space, ordered as [meanparam, covparam].
model_type (str, optional) –
Used only when param_obj is not provided. Supported values:
- "linear_mean_matern_anisotropic"
- "linear_mean_matern_anisotropic_noisy"
param_obj (Param, optional) – If provided, this Param is used directly. If info.bounds is present, bounds are still projected onto the covariance part of this Param when possible.

Returns:

md – Keys:

"optim_info" : info
"param_selection" : dict
"parameters" : dict, from param_obj.to_simple_dict()
"param_obj" : Param
"loo" : dict, reserved
"data" : dict, reserved

Return type:

dict

gpmp.modeldiagnosis.perf(model: Any, xi: Any, zi: Any, loo: bool = True, loo_res: Tuple[Any, Any, Any] | None = None, xtzt: Tuple[Any, Any] | None = None, zpmzpv: Tuple[Any, Any] | None = None) → None

Print compute_performance() results (PIT omitted) as DataFrames.

Parameters:

model (object) – See compute_performance.
xi (array-like) – See compute_performance.
zi (array-like) – See compute_performance.
loo (bool, optional) – If True, include LOO metrics.
loo_res (tuple, optional) – Precomputed (zloom, zloov, eloo).
xtzt (tuple, optional) – (xt, zt) for test-set metrics.
zpmzpv (tuple, optional) – Precomputed (zpm, zpv) for the test set.

gpmp.modeldiagnosis.pretty_print_dictionary(d: Dict[str, Any], fp: int = 4) → None

Print a dictionary with aligned keys.

Parameters:

d (dict) – Values can be scalars or backend arrays with one element.
fp (int, optional) – Number of decimal places for float formatting.

Return type:

None

gpmp.modeldiagnosis.pretty_print_dictionnary(d: Dict[str, Any], fp: int = 4) → None

Backward-compatible alias for pretty_print_dictionary.

Parameters:

d (dict)
fp (int, optional)

Return type:

None

gpmp.modeldiagnosis.selection_criterion_statistics(info: Any | None = None, model: Any | None = None, xi: Any | None = None, selection_criterion: Callable[[Any], Any] | None = None, covparam: Any | None = None, ind: Iterable[int] | None = None, param_box: Any | None = None, delta: float = 5.0, verbose: bool = False) → Dict[str, Any]

Integration-based parameter statistics and Fisher information.

Each 1D marginal uses the pseudo log-pdf log p(x) = -criterion(x).

Parameters:

info – Same meaning as in selection_criterion_statistics_fast.
model – Same meaning as in selection_criterion_statistics_fast.
xi – Same meaning as in selection_criterion_statistics_fast.
selection_criterion – Same meaning as in selection_criterion_statistics_fast.
covparam – Same meaning as in selection_criterion_statistics_fast.
ind – Same meaning as in selection_criterion_statistics_fast.
param_box – Same meaning as in selection_criterion_statistics_fast.
delta – Same meaning as in selection_criterion_statistics_fast.
verbose – Same meaning as in selection_criterion_statistics_fast.

Returns:

Keys are "parameter_statistics" (DataFrame) and "fisher_information".

Return type:

dict

gpmp.modeldiagnosis.selection_criterion_statistics_fast(info: Any | None = None, model: Any | None = None, xi: Any | None = None, selection_criterion: Callable[[Any], Any] | None = None, covparam: Any | None = None, ind: Iterable[int] | None = None, param_box: Any | None = None, delta: float = 5.0, n_points: int = 250, verbose: bool = False) → Dict[str, Any]

Grid-based parameter statistics and Fisher information.

Parameters:

info (object, optional) – If provided, defaults are taken from attributes: selection_criterion_nograd, covparam, model, xi.
model (object) – Must expose fisher_information(xi, covparam, epsilon=…).
xi (array-like) – Inputs passed to fisher_information.
selection_criterion (callable, optional) – Function f(covparam) -> scalar-like.
covparam (array-like, optional) – Covariance parameter vector.
ind (iterable of int, optional) – Parameter indices. Default is all.
param_box (array-like, optional) – Shape (2, n_params). Bounds per parameter.
delta (float, optional) – Range is [opt - delta, opt + delta] when param_box is None.
n_points (int, optional) – Grid size per parameter.
verbose (bool, optional) – Print per-parameter summaries.

Returns:

Keys are "parameter_statistics" (DataFrame) and "fisher_information".

Return type:

dict
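The interplay between ind, param_box and delta can be sketched as follows. This is a hypothetical helper, not the library code. It assumes that row 0 of param_box holds lower bounds and row 1 holds upper bounds, which the description above does not state explicitly.

```python
def parameter_ranges(covparam, ind=None, param_box=None, delta=5.0):
    """Return {index: (lower, upper)} for each selected parameter."""
    indices = range(len(covparam)) if ind is None else ind
    ranges = {}
    for i in indices:
        if param_box is None:
            # Default: symmetric window around the selected value
            ranges[i] = (covparam[i] - delta, covparam[i] + delta)
        else:
            # Assumption: shape (2, n_params), row 0 lower, row 1 upper
            ranges[i] = (param_box[0][i], param_box[1][i])
    return ranges

windowed = parameter_ranges([0.0, 2.0, -1.0], ind=[0, 2], delta=5.0)
boxed = parameter_ranges([0.0, 2.0], param_box=[[-1.0, 0.0], [1.0, 3.0]])
```

Each resulting interval is the range over which a one-dimensional cross section of the criterion would then be evaluated.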

gpmp.modeldiagnosis.sigma_rho_from_covparam(covparam: Any) → Dict[str, Any]

Extract sigma and rho parameters from a covariance parameter vector.

Assumes the convention:
- covparam[0] = log(sigma^2)
- covparam[i] = log(1/rho_{i-1}) for i >= 1

Parameters: covparam (array-like, shape (p,)) – Covariance parameters.
Returns: Dictionary with keys:
- "sigma": sigma
- "rho0", "rho1", … : rho values
Return type: dict
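Inverting the stated convention gives sigma = exp(covparam[0] / 2) and rho_{i-1} = exp(-covparam[i]). A minimal sketch of that inversion on a plain Python list follows; sigma_rho is a hypothetical name, and the documented function may return backend-native values instead of floats.

```python
import math

def sigma_rho(covparam):
    """Invert covparam[0] = log(sigma^2), covparam[i] = log(1 / rho_{i-1})."""
    out = {"sigma": math.exp(0.5 * covparam[0])}
    for i, value in enumerate(covparam[1:]):
        out[f"rho{i}"] = math.exp(-value)
    return out

# sigma^2 = 4, 1/rho0 = 2, 1/rho1 = 1
params = sigma_rho([math.log(4.0), math.log(2.0), 0.0])
```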