Parameter selection¶
gpmp-contrib delegates numerical parameter selection to gpmp.kernel.
The container class chooses the criterion, builds an optimizer start, calls
SciPy, and stores the result for each output.
Parameter selection chooses covariance parameters inside the selected GP family. For the provided ML class, it also chooses the constant mean parameter. Background on covariance modeling and kriging is given by Stein [7] and Chilès and Delfiner [2]. Empirical comparisons of GP parameter selection are discussed by Petit et al. [5].
Covariance parameter vector¶
For the Matérn classes, the covariance parameter vector has length
1 + input_dim:
covparam = [log(sigma2), -log(rho_0), ..., -log(rho_{d-1})]
The first entry is the log process variance. The remaining entries use the
inverse log-lengthscale convention. A positive value means a shorter
lengthscale than 1. A negative value means a longer lengthscale than 1.
The physical parameters are recovered by
Optimizer starts, bounds, and reported raw values are expressed in the
covparam coordinates.
Which criterion is used¶
Model_ConstantMean_Maternp_MLUses maximum likelihood. The constant mean parameter and covariance parameters are optimized together.
Model_ConstantMean_Maternp_REMLUses restricted likelihood. The mean basis is constant or linear. Mean coefficients are profiled out by the restricted likelihood, and the criterion is optimized over covariance parameters.
Model_ConstantMean_Maternp_REMAPUses restricted likelihood plus prior terms. The default REMAP class uses a log-variance prior and a log-lengthscale prior. See Priors for REMAP selection.
Maximum likelihood¶
In negative-log form, the ML criterion is
Here theta denotes covariance parameters, beta mean parameters,
K_theta the covariance matrix, and H the mean design matrix. For
Model_ConstantMean_Maternp_ML, H is a column of ones.
Restricted likelihood¶
The REML criterion adds the standard correction for the mean space:
where
The matrix H is built from the mean basis specified by
mean_specification.
REMAP¶
REMAP adds a log-prior term to REML:
In gpmp-contrib, REMAP prior anchors are separate from optimizer starts.
Changing the optimizer start does not by itself change the prior anchor. Use
set_prior to change prior anchors or hyperparameters.
What select_params does¶
Example:
model.select_params(
xi,
zi,
method="SLSQP",
method_options={"maxiter": 100},
)
For each output, select_params:
checks the observation and response shapes.
builds or rebuilds the selection criterion when needed.
chooses an optimizer start.
chooses bounds.
calls
gpmp.kernel.autoselect_parameters.stores the selected raw vectors, readable
Paramobject, and optimizer report inmodel[k].
Optimizer starts¶
param0 sets the optimizer start. For one output, it can be a raw vector or
a Param object. For several outputs, pass a list with one entry per output.
param0 = model[0].get_param()
model.select_params(xi, zi, param0=param0)
When param0 is not provided, the start is selected as follows:
if
force_param_initial_guess=True, use the configured initial-guess procedure.otherwise, if the underlying
gpmp.core.Modelalready has parameters, use the current model parameters.otherwise, use the configured initial-guess procedure.
When param0 is provided, it takes precedence and
force_param_initial_guess has no effect.
Bounds and SciPy options¶
Bounds are expressed in the normalized coordinates optimized by SciPy. They are selected in this order:
explicit
bounds.bounds_factory(model_dict, xi, zi_vector, param0_init).bounds stored in the current
Paramobject whenuse_bounds_from_param_obj=True.no bounds.
bounds_delta intersects the selected bounds with a local interval around the
optimizer start. method and method_options are forwarded to SciPy
through gpmp.kernel.autoselect_parameters.
Example:
model.select_params(
xi,
zi,
bounds_delta=5.0,
method="SLSQP",
method_options={"maxiter": 80, "ftol": 1e-6},
)
Stored results¶
After select_params(xi, zi), each output entry model[k] stores:
model[k]["model"].covparam: selected covariance parameter vector.model[k]["model"].meanparam: selected mean parameter vector, if any.model[k]["param"]ormodel[k].get_param(): readableParamobject.model[k]["info"]: optimizer report and criterion wrappers.model[k]["info"]["param0"]: optimizer start used for that output.model[k]["info"]["selection_criterion_nograd"]: criterion callable for diagnosis, plotting, and posterior sampling without gradient tracking.
For REMAP models with priors, the resolved prior object is stored in
model.get_prior(k).
Failure diagnosis¶
If selection fails or returns low prediction scores, inspect these objects in order:
model[k]["info"]for optimizer success, initial value, final value, and number of evaluations.model[k].get_param()for selected values and bounds.model.get_prior(k)for REMAP prior anchors, when applicable.the design
xiand responsezifor poor coverage, duplicates, or scale problems.the optimizer start
model[k]["info"]["param0"]when the criterion is multimodal.