Sampling criteria and optimization procedures

Sampling criteria operate on posterior means and variances. Strategy classes combine those criteria with observation updates and candidate-set management.

Array conventions

The criterion functions accept backend objects compatible with gpmp.num. For scalar-output criteria, zpm and zpv have shape (m, 1) or (m,). For multi-output box criteria, zpm and zpv have shape (m, q) and box has shape (2, q). Variances must be non-negative.

Sampling criteria

gpmpcontrib.samplingcriteria.box_logprobability(box, zpm, zpv)[source]

Compute log probabilities of predictions being within given bounds.

Returns:

  • sum_log_probs: (n, 1) array of summed log probabilities across dimensions

  • log_probs: (n, dim_output) array of log probabilities per dimension

Return type:

tuple of (gnp.array, gnp.array)

gpmpcontrib.samplingcriteria.box_misclassification_logprobability(box, zpm, zpv, clip_logthreshold=1e-06)[source]

Computes log misclassification probabilities for box estimation.

Numerically stable version using log probabilities to avoid underflow.

Returns:

  • sum_log_tau: (n, 1) array of summed log misclassification probs

  • log_tau: (n, dim_output) array of log misclassification probs

Return type:

tuple of (gnp.array, gnp.array)

gpmpcontrib.samplingcriteria.box_misclassification_probability(box, zpm, zpv)[source]

Computes the probability of misclassification for a box.

Returns:

Misclassification probabilities for each point.

Return type:

gnp.array, shape (M, 1)

gpmpcontrib.samplingcriteria.box_probability(box, zpm, zpv)[source]

Compute probability that predictions fall within given bounds for each dimension.

For Gaussian predictions N(zpm, zpv), calculates P(lower <= value <= upper) per dimension.

Parameters:
  • box ((2, dim_output) array-like) – [lower, upper] bounds for each output dimension

  • zpm ((n, dim_output) gnp.array) – Predictive means

  • zpv ((n, dim_output) gnp.array) – Predictive variances (non-negative)

Returns:

  • product_probs: (n, 1) array of product probabilities across dimensions

  • probs: (n, dim_output) array of probabilities per dimension

Return type:

tuple of (gnp.array, gnp.array)

Notes

  • Returns 1 when zpv=0 and zpm is within bounds, 0 otherwise

  • Handles infinite bounds

gpmpcontrib.samplingcriteria.excursion_logprobability(t, zpm, zpv)[source]

Computes the log probabilities of exceeding the threshold t for Gaussian predictive distributions with means zpm and variances zpv.

Parameters:
  • t (float) – Threshold value.

  • zpm (gnp.array) – Predictive means, shape (M, 1).

  • zpv (gnp.array) – Predictive variances, shape (M, 1).

Returns:

Log probabilities of exceeding the threshold, shape (M, 1).

Return type:

gnp.array

gpmpcontrib.samplingcriteria.excursion_misclassification_probability(t, zpm, zpv)[source]

Computes the probability of misclassification for the excursion set.

The misclassification probability is defined as:

tau_n(x) = min(P_n(ξ(x) > t), 1 - P_n(ξ(x) > t))

This measures the uncertainty in classifying whether a point belongs to the excursion set or not assuming a hard classifier eta_n(x) = 1_{p_n(x) > 1/2}

See Sequential design of computer experiments for the estimation of a probability of failure. J Bect, D Ginsbourger, L Li, V Picheny, E Vazquez. Statistics and Computing 22, 773-793 https://arxiv.org/pdf/1009.5177

Parameters:
  • t (float) – Threshold value defining the excursion set.

  • zpm (gnp.array, shape (M, 1)) – Predictive mean values at M points.

  • zpv (gnp.array, shape (M, 1)) – Predictive variance values at M points.

Returns:

Misclassification probabilities for each point.

Return type:

gnp.array, shape (M, 1)

gpmpcontrib.samplingcriteria.excursion_probability(t, zpm, zpv)[source]

Computes the probabilities of exceeding the threshold t for Gaussian predictive distributions with means zpm and variances zpv. The input argument must have the following sizes:

  • zpm M x 1,

  • zpv M x 1,

where M is the number of points where the EI must be computed. The output has size M x 1.

gpmpcontrib.samplingcriteria.expected_improvement(t, zpm, zpv)[source]
Computes the Expected Improvement (EI) criterion for a

maximization problem given a threshold t and Gaussian predictive distributions with means zpm and variances zpv. The input argument must have the following sizes:

  • zpm M x 1,

  • zpv M x 1,

where M is the number of points where the EI must be computed. The output has size M x 1.

REFERENCES

[1] D. R. Jones, M. Schonlau and William J. Welch. Efficient global

optimization of expensive black-box functions. Journal of Global Optimization, 13(4):455-492, 1998.

[2] J. Mockus, V. Tiesis and A. Zilinskas. The application of Bayesian

methods for seeking the extremum. In L.C.W. Dixon and G.P. Szego, editors, Towards Global Optimization, volume 2, pages 117-129, North Holland, New York, 1978.

gpmpcontrib.samplingcriteria.excursion_wMSE(t, zpm, zpv, alpha=1.0, beta=0.5)[source]

Criterion used for excursion-set learning at threshold t.

zpm and zpv are posterior means and variances. The criterion combines posterior classification uncertainty and posterior variance. It is evaluated pointwise on a candidate set or particle set. Returns a backend object with shape matching zpm.

gpmpcontrib.samplingcriteria.box_wMSE(box, zpm, zpv, normalization=1.0, alpha=1.0, beta=0.5)[source]

Multi-output criterion for inverse-image estimation with an output-space box.

box contains lower and upper target bounds for each output. The function returns a scalar criterion per candidate point and auxiliary per-output quantities: (wmse_sum, wmse) with shapes (m, 1) and (m, q).

gpmpcontrib.samplingcriteria.expected_improvement(t, zpm, zpv)[source]

Expected improvement for a maximization convention. t is the current threshold, and zpm and zpv are Gaussian posterior means and variances. For minimization examples in this package, callers pass -z_best, -zpm, and zpv.

Expected improvement

class gpmpcontrib.optim.expectedimprovement.ExpectedImprovementGridSearch(problem, model, xt, options=None)[source]

Bases: SequentialStrategyGridSearch

Implements the Expected Improvement (EI) optimization algorithm using a finite search space.

sampling_criterion()[source]

Computes the Expected Improvement (EI) at given points.

update_current_estimate()[source]

Updates the current estimate of the minimum.

class gpmpcontrib.optim.expectedimprovement.ExpectedImprovementSMC(problem, model, options=None)[source]

Bases: SequentialStrategySMC

Implements the Expected Improvement (EI) optimization algorithm using a Sequential Monte Carlo (SMC) method to adapt the search space.

sampling_criterion()[source]

Computes the Expected Improvement (EI) at given points.

smc_log_density(x, u)[source]

Defines the log probability of an excursion, used as a target density for SMC.

update_current_estimate()[source]

Updates the current estimate of the minimum.

update_smc_target_log_density_param()[source]

Updates the target log density parameter for SMC based on the current estimate.

class gpmpcontrib.optim.expectedimprovement.ExpectedImprovementGridSearch(problem, model, xt, options=None)[source]

Sequential minimization strategy on a fixed candidate set. xt has shape (m, d). The current estimate is min(zi). The sampling criterion is expected_improvement(-current_estimate, -zpm, zpv).

class gpmpcontrib.optim.expectedimprovement.ExpectedImprovementSMC(problem, model, options=None)[source]

Sequential minimization strategy with an SMC candidate set. The SMC target log-density is the posterior log-probability of improving on the current observed minimum, computed inside the input box and set to -inf outside it.

Excursion sets

class gpmpcontrib.optim.excursionset.ExcursionSetGridSearch(problem, model, xt, u_target, options=None)[source]

Bases: SequentialStrategyGridSearch

Sequential strategy for estimating the excursion set over a finite search space.

This class refines the estimate of the excursion set Γ = {x ∈ X | ξ(x)> u_target}, where ξ(x) is a Gaussian process model of the objective function.

At each iteration, it:

  • Computes excursion (excursion_probability) and misclassification probabilities (excursion_misclassification_probability).

  • Updates key metrics: - alpha: Expected excursion volume (mean probability of exceeding u_target). - beta: Expected misclassification volume (mean probability of classification error).

  • Selects the next evaluation point using excursion_wMSE.

Parameters:
  • problem (object) – Problem instance with the objective function.

  • model (object) – Predictive model (e.g., Gaussian process).

  • xt (array_like, shape (N, d)) – Finite set of candidate points.

  • u_target (float) – Excursion threshold.

  • options (dict, optional) – Configuration options.

g

Excursion probabilities.

Type:

gnp.array, shape (N, 1) or None

tau

Misclassification probabilities.

Type:

gnp.array, shape (N, 1) or None

alpha

Expected excursion volume.

Type:

float or None

beta

Expected misclassification volume.

Type:

float or None

sampling_criterion()[source]

Computes the selection criterion at given points.

update_current_estimate()[source]

Updates estimates.

class gpmpcontrib.optim.excursionset.ExcursionSetBSS(problem, model, u_init, u_target, options=None)[source]

BSS-style strategy for excursion-set estimation.

The target set is {x : xi(x) > u_target}. The strategy introduces mu in [0, 1] and the intermediate threshold u(mu) = (1 - mu) * u_init + mu * u_target. At a fixed mu, SMC particles target the posterior log-probability of excursion above u(mu) inside the input box and -inf outside the box.

Set inversion

class gpmpcontrib.optim.setinversion.SetInversionGridSearch(problem, model, xt, box_target, options=None)[source]

Bases: SequentialStrategyGridSearch

Sequential strategy for estimating the inverse image of a box over a finite search space.

This class refines the estimate of the inversion set Γ = {x ∈ X | ξ(x) ∈ box_target}, where ξ(x) is a Gaussian process model of the objective function.

At each iteration, it:
  • Computes set membership probabilities

  • Updates misclassification metrics

  • Selects points using summed wMSE across output dimensions

Parameters:
  • problem (object) – Problem instance with the objective function.

  • model (object) – Predictive model (e.g., Gaussian process).

  • xt (array_like, shape (N, d)) – Finite set of candidate points.

  • box_target (array_like, shape (2, dim_output)) – Target box bounds [lower, upper] for each dimension.

  • options (dict, optional) – Configuration options including: - normalization: scaling factors for variance (default 1.0) - alpha: misclassification exponent (default 1.0) - beta: variance exponent (default 0.5)

sampling_criterion()[source]

Computes the summed wMSE across output dimensions for point selection.

Returns:

Summed wMSE values for each candidate point

Return type:

gnp.array, shape (N, 1)

update_current_estimate()[source]

Updates probability estimates and metrics.

class gpmpcontrib.optim.setinversion.SetInversionBSS(problem, model, box_init, box_target, options=None)[source]

BSS-style strategy for inverse-image estimation.

The target set is {x : xi(x) in box_target}. The strategy interpolates between box_init and box_target with mu in [0, 1]. At each intermediate box, SMC particles target posterior box-membership probability inside the input domain.

Pareto utilities

Pareto utilities use NumPy arrays. z has shape (n, q) and stores costs to minimize.

pareto_points(z)

Return a boolean mask selecting non-dominated rows.

pareto_filter(zopt, z)

Return a boolean mask indicating rows of z dominated by at least one row of zopt.

dominated_area_2d(z_ref, z_opt)

Return the two-dimensional dominated area relative to z_ref.

hausdorff_distance(z1, z2)

Return the symmetric Hausdorff distance between two point clouds.

gpmpcontrib.optim.pareto.directed_hausdorff_distance(z1, z2)[source]
gpmpcontrib.optim.pareto.distance(x, y)[source]

Compute a distance matrix

Parameters:
  • x (numpy.array(n,dim)) – _description_

  • y (numpy.array(m,dim)) – If y is None, it is assumed y is x, by default None

Returns:

  • numpy.array(n,m) – distance matrix such that

  • .. math:: d_{i,j} = (sum_{k=1}^dim (x_{i,k} - y_{i,k})^2)^(1/2)

gpmpcontrib.optim.pareto.dominated_area_2d(z_ref, z_opt)[source]
gpmpcontrib.optim.pareto.hausdorff_distance(z1, z2)[source]
gpmpcontrib.optim.pareto.pareto_filter(zopt, z)[source]

Returns a boolean array indicating whether each point in z is Pareto-dominated by at least one point in zopt.

gpmpcontrib.optim.pareto.pareto_gt(z, z0)[source]

Get a boolean array corresponding to points in z that are Pareto-dominated by z0

gpmpcontrib.optim.pareto.pareto_lt(z, z0)[source]

Get a boolean array corresponding to points in z that Pareto-dominate z0

gpmpcontrib.optim.pareto.pareto_points(z, idx=None)[source]

Returns the pareto-optimal points :param z: nd array n_points x n_costs :return: A (n_points, ) boolean array, indicating whether each point is Pareto optimal

gpmpcontrib.optim.pareto.pareto_points_unsorted(z)[source]

Returns the pareto-optimal points :param z: nd array n_points x n_costs :return: A (n_points, ) boolean array, indicating whether each point is Pareto optimal

gpmpcontrib.optim.pareto.plot_pareto(axis, z_opt, color='red')[source]
gpmpcontrib.optim.pareto.symmdiff_area_2d(z_ref, z_opt_1, z_opt_2)[source]
gpmpcontrib.optim.pareto.test_pareto()[source]