Diagnostics and inspection¶
Diagnostics answer two questions:
Did parameter selection end in a usable state?
Are the LOO and test-set scores consistent with the expected predictive accuracy?
Run diagnostics after select_params:
model.select_params(xi, zi)
model.run_diagnosis(xi, zi)
model.run_perf(xi, zi, xtzt=(xt, zt), zpmzpv=(zpm, zpv))
Model diagnosis¶
run_diagnosis(xi, zi) prints one block per output. For the seeded Hartmann4
example in Getting started, the output is:
~ Model [0]
[Model diagnosis]
* Parameter selection
cvg_reached: True
optimal_val: True
n_evals: 22
time: 0.2095
initial_val: 43.818328528617435
final_val: 37.7947
* Parameters
Name: Path Norm Bounds Value Denorm
z0_sigma2: covparam->variance log [-9.56, 10.44] 0.191 1.210
z0_rho_0: covparam->lengthscale log_inv [-9.577, 10.42] 0.705 0.494
z0_rho_1: covparam->lengthscale log_inv [-9.545, 10.45] 0.308 0.735
z0_rho_2: covparam->lengthscale log_inv [-9.599, 10.4] -0.568 1.764
z0_rho_3: covparam->lengthscale log_inv [-9.582, 10.42] 0.874 0.417
* Data
count: 40
-----
min max delta mean std delta_over_sigma
zi: -2.678 1.292 3.970 0.0221 1.047 3.280
xi_0: 8.399e-3 0.984 0.976 0.493 0.283 1.974
xi_1: 0.0450 0.991 0.946 0.510 0.294 1.288
xi_2: 1.544e-3 1.000 0.998 0.515 0.300 0.566
xi_3: 0.0164 0.997 0.981 0.526 0.307 2.350
The displayed block is produced by model.run_diagnosis(xi, zi) after
model.select_params(xi, zi).
The first part reports the optimizer status:
cvg_reachedSciPy optimizer success flag.
optimal_valWhether the final criterion value is finite and usable.
n_evalsNumber of criterion evaluations.
initial_valandfinal_valCriterion value before and after optimization. Compare them to detect optimizer-start or bounds problems.
The parameter table prints normalized coordinates, bounds, raw values, and
denormalized values. For Matérn lengthscales, the raw covariance vector stores
-log(rho) while the denormalized display reports rho.
Prediction performance¶
run_perf reports leave-one-out summaries and, when test values are supplied,
test-set summaries:
~ Model [0]
[Prediction performances]
LOO (n=40)
value
std(z): 1.047
tss: 43.829
press: 11.259
press/tss: 0.257
log10(press/tss): -0.590
rmse: 0.531
rmse/std(z): 0.507
Q2: 0.743
Test (n=300)
value
std(z): 0.989
tss: 293.704
rss: 41.061
rss/tss: 0.140
log10(rss/tss): -0.854
rmse: 0.370
rmse/std(z): 0.374
R2: 0.860
The displayed block is produced by
model.run_perf(xi, zi, xtzt=(xt, zt), zpmzpv=(zpm, zpv)).
For each output, the displayed quantities are defined as follows. For reference values \(z_i\),
For leave-one-out prediction, let \(e_i^{\mathrm{LOO}} = z_i - \widehat z_{-i}(x_i)\), where \(\widehat z_{-i}\) is computed without observation \(i\). Then
For a test set, let \(e_i^{\mathrm{test}} = z_i - \widehat z(x_i)\). Then
In both blocks,
\(\mathrm{RMSE} = \sqrt{\mathrm{SSE}/n}\) with
\(\mathrm{SSE}=\mathrm{PRESS}\) for LOO and
\(\mathrm{SSE}=\mathrm{RSS}\) for the test set. std(z) is the
empirical standard deviation of the reference values in the block.
rmse/std(z) is a scale-free error. press/tss and rss/tss are
error-to-variance ratios, so smaller values are better.
A low Q2 or R2 does not identify the cause by itself. Inspect the
observation design, covariance class, bounds, optimizer start, and prior anchors.
Model state checks¶
Use per-output entries to inspect the selected raw vectors, readable Param
object, optimizer report, and prior object when one exists:
covparam = model[0]["model"].covparam
param = model[0].get_param()
info = model[0]["info"]
prior = model[0].get("prior", None)
Use Model state and parameter objects for the storage convention and Priors for REMAP selection for prior objects.
Selection-criterion inspection¶
The optimizer report stores selection_criterion_nograd. Use this callable
for diagnosis, plotting, and posterior parameter sampling when gradients are not
needed:
criterion = model[0]["info"]["selection_criterion_nograd"]
value = criterion(model[0]["model"].covparam)
Cross sections of the selection criterion help check whether the optimizer
result is isolated or lies on a flat region. Cross sections are provided by
gpmp.modeldiagnosis.
Posterior parameter sampling¶
sample_parameters forwards to posterior-sampling functions in
gpmp.mcmc.param_posterior. The common methods are "mh", "nuts",
and "smc":
res = model.sample_parameters(method="mh", n_steps_total=2000)
The sampler uses the stored info object, so select_params must be called
first. The return dictionary contains method-specific objects and, for methods
that record them, criterion or log-target traces.