Benchmark Problems#

This section gives an overview of the benchmark problems provided by PrismBO: how to register a new problem, followed by the built-in synthetic, hyperparameter optimization, configurable software optimization, RNA inverse design, and protein inverse folding problems.

Registering a New Benchmark Problem#

To register a new benchmark problem in the PrismBO framework, follow the steps below.

I. Import the Problem Registry#

First, you need to import the problem_registry from the prismbo.agent.registry module:

from prismbo.agent.registry import problem_registry

II. Define a New Problem Class#

Next, define a new problem class. This class should be decorated with the @problem_registry.register("ProblemName") decorator, where "ProblemName" is the unique identifier for the problem. The new problem class must inherit from one of the following base classes:

  • NonTabularProblem

  • TabularProblem

For example, to create a new problem named “new_problem”, you would define the class as follows:

@problem_registry.register("new_problem")
class new_problem(NonTabularProblem):
    pass  # Further implementation required

III. Implement Required Methods#

After defining the class, you need to implement the following three abstract methods:

  1. get_configuration_space: This method is responsible for defining the configuration space of the new problem.

    def get_configuration_space(self):
        # Define and return the configuration space
        pass
    
  2. get_fidelity_space: This method should define the fidelity space for the problem, if applicable.

    def get_fidelity_space(self):
        # Define and return the fidelity space
        pass
    
  3. objective_function: This method evaluates the problem’s objective function based on the provided configuration and other parameters.

    def objective_function(self, configuration, fidelity=None, seed=None, **kwargs) -> Dict:
        # Evaluate the configuration and return the results as a dictionary
        pass
    

Here’s an example outline of the sphere class:

@problem_registry.register("sphere")
class sphere(NonTabularProblem):

  def get_configuration_space(self):
        # Define the configuration space here
     variables =  [Continuous(f'x{i}', (-5.12, 5.12)) for i in range(self.input_dim)]
     ss = SearchSpace(variables)
     return ss

  def get_fidelity_space(self) -> FidelitySpace:
     fs = FidelitySpace([])
     return fs

  def objective_function(self, configuration, fidelity=None, seed=None, **kwargs) -> Dict:
     # Implement the evaluation logic and return the results as a dictionary
     X = np.array([[configuration[k] for idx, k in enumerate(configuration.keys())]])
     y = np.sum((X) ** 2, axis=1)
     results = {'function_value': float(y)}

     return results

By following these steps, you can successfully register a new benchmark problem in the PrismBO framework.

Synthetic Problem#

The synthetic problems in this section are widely used in the optimization literature for benchmarking optimization algorithms. These problems exhibit diverse characteristics and levels of complexity, making them ideal for testing the robustness and efficiency of different optimization strategies. Below is an overview of the synthetic problems included in this benchmark suite:

  • Sphere: A simple convex problem that is often used as a baseline. The global minimum is located at the origin, and the objective function value increases quadratically with distance from the origin.

  • Rastrigin: A non-convex problem characterized by a large number of local minima, making it challenging for optimization algorithms to find the global minimum.

  • Schwefel: Known for its complex landscape with many local minima, the Schwefel function requires optimization algorithms to balance exploration and exploitation effectively.

  • Ackley: A multi-modal function with a nearly flat outer region and a large hole at the center, making it difficult for algorithms to escape local minima and converge to the global minimum.

  • Levy: A multi-modal problem with a complex landscape that tests an algorithm’s ability to handle irregularities and identify global optima.

  • Griewank: A function with many widespread local minima, making it challenging to converge to the global optimum. It is often used to assess the ability of algorithms to avoid getting trapped in local minima.

  • Rosenbrock: A non-convex problem with a narrow, curved valley that contains the global minimum. This function is commonly used to test the convergence properties of optimization algorithms.

  • Dropwave: A challenging multi-modal function with steep drops, requiring careful search strategies to avoid local minima.

  • Langermann: This problem has many local minima and a highly irregular structure, testing an algorithm’s ability to explore complex search spaces.

  • Rotated Hyper-Ellipsoid: A rotated version of the ellipsoid function, which tests an algorithm’s capability to optimize problems with rotated and ill-conditioned landscapes.

  • Sum of Different Powers: A problem where each term in the sum contributes differently to the overall objective, requiring optimization algorithms to handle varying sensitivities across dimensions.

  • Styblinski-Tang: A multi-modal function with many local minima and a single global minimum, commonly used to test an algorithm’s ability to avoid suboptimal solutions.

  • Powell: A problem designed to challenge optimization algorithms with a mixture of convex and non-convex characteristics across different dimensions.

  • Dixon-Price: This function has a smooth, narrow valley leading to the global minimum, testing an algorithm’s ability to navigate such features.

  • Ellipsoid: A test problem that features high conditioning and elliptical level sets, requiring algorithms to efficiently search in skewed spaces.

  • Discus: A variant of the sphere function with a large difference in scale between the first variable and the rest, making it a test of handling unbalanced scales.

  • BentCigar: A highly anisotropic function where one direction has a much larger scale than the others, challenging algorithms to adjust their search strategies accordingly.

  • SharpRidge: This function has a sharp ridge along one dimension, testing an algorithm’s ability to optimize in narrow, high-gradient regions.

  • Katsuura: A multi-fractal function that combines periodicity and complexity, testing the capability of algorithms to explore intricate landscapes.

  • Weierstrass: A problem with a fractal structure, characterized by a large number of local minima and requiring algorithms to handle varying scales of roughness.

  • Different Powers: A problem where each term contributes differently to the objective, challenging algorithms to manage varying sensitivities and scales.

  • Trid: A function that has a curved and ridge-like structure, often used to assess the convergence properties of optimization algorithms.

  • LinearSlope: A simple linear function with a varying slope across dimensions, used to test the basic exploration capabilities of optimization methods.

  • Elliptic: Similar to the Ellipsoid function but with exponentially increasing scales, testing an algorithm’s ability to search efficiently in poorly conditioned spaces.

  • PERM: A problem whose terms combine different powers of the coordinates, with a global minimum at a permutation-like pattern of values, testing an algorithm’s ability to handle strong coupling between variables.

  • Power Sum: A problem where each dimension contributes a power sum to the objective, requiring algorithms to handle large variations in sensitivity across variables.

  • Zakharov: A problem with a complex, non-linear interaction between variables, used to test an algorithm’s ability to navigate multi-variable coupling.

  • Six-Hump Camel: A low-dimensional, multi-modal problem with several local minima, requiring precise search strategies to find the global optimum.

  • Michalewicz: A problem known for its challenging steepness and periodicity, making it difficult for algorithms to locate the global minimum.

  • Moving Peak: A dynamic optimization problem where the objective function changes over time, used to assess an algorithm’s adaptability to changing landscapes.

These problems collectively provide a comprehensive suite for evaluating optimization algorithms across a broad range of difficulties, including convexity, multi-modality, separability, and conditioning.

| Problem name | Mathematical formulation | Range |
|---|---|---|
| Sphere | \(f(\mathbf{x}) = \sum_{i=1}^d x_i^2\) | \(x_i \in [-5.12, 5.12]\) |
| Rastrigin | \(f(\mathbf{x}) = 10 d + \sum_{i=1}^d \left[ x_i^2 - 10 \cos(2 \pi x_i) \right]\) | \(x_i \in [-32.768, 32.768]\) |
| Schwefel | \(f(\mathbf{x}) = 418.9829 d - \sum_{i=1}^d x_i \sin\left(\sqrt{\lvert x_i \rvert}\right)\) | \(x_i \in [-500, 500]\) |
| Ackley | \(f(\mathbf{x}) = -a \exp\left(-b \sqrt{\frac{1}{d} \sum_{i=1}^d x_i^2}\right) - \exp\left(\frac{1}{d} \sum_{i=1}^d \cos\left(c x_i\right)\right) + a + \exp(1)\) | \(x_i \in [-32.768, 32.768]\) |
| Levy | \(f(\mathbf{x}) = \sin^2\left(\pi w_1\right) + \sum_{i=1}^{d-1}\left(w_i - 1\right)^2 \left[1 + 10 \sin^2\left(\pi w_i + 1\right)\right] + \left(w_d - 1\right)^2 \left[1 + \sin^2\left(2 \pi w_d\right)\right], \; w_i = 1 + \frac{x_i - 1}{4}\) | \(x_i \in [-10, 10]\) |
| Griewank | \(f(\mathbf{x}) = \sum_{i=1}^d \frac{x_i^2}{4000} - \prod_{i=1}^d \cos\left(\frac{x_i}{\sqrt{i}}\right) + 1\) | \(x_i \in [-600, 600]\) |
| Rosenbrock | \(f(\mathbf{x}) = \sum_{i=1}^{d-1}\left[100\left(x_{i+1} - x_i^2\right)^2 + \left(x_i - 1\right)^2\right]\) | \(x_i \in [-5, 10]\) |
| Dropwave | \(f(\mathbf{x}) = -\frac{1 + \cos\left(12 \sqrt{x_1^2 + x_2^2}\right)}{0.5\left(x_1^2 + x_2^2\right) + 2}\) | \(x_i \in [-5.12, 5.12]\) |
| Langermann | \(f(\mathbf{x}) = \sum_{i=1}^m c_i \exp\left(-\frac{1}{\pi} \sum_{j=1}^d \left(x_j - A_{ij}\right)^2\right) \cos\left(\pi \sum_{j=1}^d\left(x_j - A_{ij}\right)^2\right)\) | \(x_i \in [0, 10]\) |
| Rotated Hyper-Ellipsoid | \(f(\mathbf{x}) = \sum_{i=1}^d \sum_{j=1}^i x_j^2\) | \(x_i \in [-65.536, 65.536]\) |
| Sum of Different Powers | \(f(\mathbf{x}) = \sum_{i=1}^d x_i^{i+1}\) | \(x_i \in [-1, 1]\) |
| Styblinski-Tang | \(f(\mathbf{x}) = \frac{1}{2} \sum_{i=1}^d\left(x_i^4 - 16 x_i^2 + 5 x_i\right)\) | \(x_i \in [-5, 5]\) |
| Powell | \(f(\mathbf{x}) = \sum_{i=1}^{d/4}\left[\left(x_{4i-3} + 10 x_{4i-2}\right)^2 + 5\left(x_{4i-1} - x_{4i}\right)^2 + \left(x_{4i-2} - 2 x_{4i-1}\right)^4 + 10\left(x_{4i-3} - x_{4i}\right)^4\right]\) | \(x_i \in [-4, 5]\) |
| Dixon-Price | \(f(\mathbf{x}) = \left(x_1 - 1\right)^2 + \sum_{i=2}^d i\left(2 x_i^2 - x_{i-1}\right)^2\) | \(x_i \in [-10, 10]\) |
| Ellipsoid | \(f(\mathbf{x}) = \sum_{i=1}^D 10^{6 \frac{i-1}{D-1}} z_i^2 + f_{\mathrm{opt}}\) | \(x_i \in [-5, 5]\) |
| Discus | \(f(\mathbf{x}) = 10^6 x_1^2 + \sum_{i=2}^D x_i^2\) | \(x_i \in [-5, 5]\) |
| BentCigar | \(f(\mathbf{x}) = x_1^2 + 10^6 \sum_{i=2}^n x_i^2\) | \(x_i \in [-5, 5]\) |
| SharpRidge | \(f(\mathbf{x}) = x_1^2 + 100 \sqrt{\sum_{i=2}^D x_i^2}\) | \(x_i \in [-5, 5]\) |
| Katsuura | \(f(\mathbf{x}) = \frac{10}{D^2} \prod_{i=1}^D \left(1 + i \sum_{j=1}^{32} \frac{2^j x_i - \left[2^j x_i\right]}{2^j}\right)^{10 / D^{1.2}} - \frac{10}{D^2} + f_{\mathrm{pen}}(\mathbf{x})\) | \(x_i \in [-5, 5]\) |
| Weierstrass | \(f(\mathbf{x}) = 10 \left(\frac{1}{D} \sum_{i=1}^D \sum_{k=0}^{11} \frac{1}{2^k} \cos\left(2 \pi 3^k\left(z_i + \frac{1}{2}\right)\right) - f_0\right)^3 + \frac{10}{D} f_{\mathrm{pen}}(\mathbf{x})\) | \(x_i \in [-5, 5]\) |
| DifferentPowers | \(f(\mathbf{x}) = \sqrt{\sum_{i=1}^D x_i^{2 + 4 \frac{i-1}{D-1}}}\) | \(x_i \in [-5, 5]\) |
| Trid | \(f(\mathbf{x}) = \sum_{i=1}^d \left(x_i - 1\right)^2 - \sum_{i=2}^d x_i x_{i-1}\) | \(x_i \in [-d^2, d^2]\) |
| LinearSlope | \(f(\mathbf{x}) = \sum_{i=1}^D \left(5 s_i - s_i x_i\right), \; s_i = \operatorname{sign}\left(x_i^{\mathrm{opt}}\right) 10^{\frac{i-1}{D-1}}, \; i = 1, \ldots, D\) | \(x_i \in [-5, 5]\) |
| Elliptic | \(f(\mathbf{x}) = \sum_{i=1}^D \left(10^6\right)^{\frac{i-1}{D-1}} x_i^2\) | \(x_i \in [-5, 5]\) |
| PERM | \(f(\mathbf{x}) = \sum_{i=1}^d \left(\sum_{j=1}^d \left(j + \beta\right)\left(x_j^i - \frac{1}{j^i}\right)\right)^2\) | \(x_i \in [-d, d]\) |
| Power Sum | \(f(\mathbf{x}) = \sum_{i=1}^d \left[\left(\sum_{j=1}^d x_j^i\right) - b_i\right]^2\) | \(x_i \in [0, d]\) |
| Zakharov | \(f(\mathbf{x}) = \sum_{i=1}^d x_i^2 + \left(\sum_{i=1}^d 0.5 i x_i\right)^2 + \left(\sum_{i=1}^d 0.5 i x_i\right)^4\) | \(x_i \in [-5, 10]\) |
| Six-Hump Camel | \(f(\mathbf{x}) = \left(4 - 2.1 x_1^2 + \frac{x_1^4}{3}\right) x_1^2 + x_1 x_2 + \left(-4 + 4 x_2^2\right) x_2^2\) | \(x_1 \in [-3, 3], x_2 \in [-2, 2]\) |
| Michalewicz | \(f(\mathbf{x}) = -\sum_{i=1}^d \sin\left(x_i\right) \sin^{2m}\left(\frac{i x_i^2}{\pi}\right)\) | \(x_i \in [0, \pi]\) |
| Moving Peak | \(f(\mathbf{x}) = \sum_{i=1}^D \left(10^6\right)^{\frac{i-1}{D-1}} x_i^2\) | \(x_i \in [0, 100]\) |
| PERM 2 | \(f(\mathbf{x}) = \sum_{i=1}^d\left(\sum_{j=1}^d\left(j^i+\beta\right)\left(\left(\frac{x_j}{j}\right)^i-1\right)\right)^2\) | \(x_i \in [-d, d]\) |
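For reference, the first few formulas in the table above can be written directly in NumPy. The snippet below is a standalone illustration of the mathematical definitions, not the PrismBO implementation; the function and variable names are chosen for this example only.

import numpy as np

def sphere(x: np.ndarray) -> float:
    # f(x) = sum_i x_i^2, global minimum 0 at the origin
    return float(np.sum(x ** 2))

def rastrigin(x: np.ndarray) -> float:
    # f(x) = 10 d + sum_i [x_i^2 - 10 cos(2 pi x_i)]
    d = x.size
    return float(10 * d + np.sum(x ** 2 - 10 * np.cos(2 * np.pi * x)))

def ackley(x: np.ndarray, a: float = 20.0, b: float = 0.2, c: float = 2 * np.pi) -> float:
    # f(x) = -a exp(-b sqrt(mean(x^2))) - exp(mean(cos(c x))) + a + e
    d = x.size
    term1 = -a * np.exp(-b * np.sqrt(np.sum(x ** 2) / d))
    term2 = -np.exp(np.sum(np.cos(c * x)) / d)
    return float(term1 + term2 + a + np.e)

# All three attain their global minimum (value 0, up to floating-point error) at the origin
print(sphere(np.zeros(5)), rastrigin(np.zeros(5)), ackley(np.zeros(5)))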

Hyperparameter Optimization Problem#

This section provides an overview of the hyperparameter optimization problems, including the hyperparameters used for various machine learning models and the machine learning tasks used to generate problem instances.

Hyperparameters for Support Vector Machine (SVM)#

Support Vector Machines (SVM) are widely used for classification and regression tasks. They are particularly effective in high-dimensional spaces and situations where the number of dimensions exceeds the number of samples. The hyperparameters for SVM control the regularization and the kernel function, which are crucial for model performance.

| Hyperparameter | Range | Type |
|---|---|---|
| C | \([-10, 10]\) | Continuous |
| gamma | \([-10, 10]\) | Continuous |
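Below is a minimal sketch of how this search space and its evaluation might look as a PrismBO problem, reusing the SearchSpace and Continuous helpers from the sphere example. The class name svm_hpo_example, the interpretation of the \([-10, 10]\) ranges as base-2 exponents, the use of scikit-learn's SVC, and the self.train_x / self.train_y data attributes are all assumptions made for illustration, not something fixed by the table above.

from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC

@problem_registry.register("svm_hpo_example")  # hypothetical problem name
class svm_hpo_example(NonTabularProblem):

    def get_configuration_space(self) -> SearchSpace:
        # Both hyperparameters are continuous; the ranges are treated as log2 exponents (assumption)
        variables = [
            Continuous('C', (-10.0, 10.0)),
            Continuous('gamma', (-10.0, 10.0)),
        ]
        return SearchSpace(variables)

    def get_fidelity_space(self) -> FidelitySpace:
        return FidelitySpace([])

    def objective_function(self, configuration, fidelity=None, seed=None, **kwargs) -> Dict:
        # Decode the log-scaled values and report 3-fold cross-validation error on a
        # dataset assumed to be stored in self.train_x / self.train_y (placeholders)
        model = SVC(C=2.0 ** configuration['C'], gamma=2.0 ** configuration['gamma'], random_state=seed)
        score = cross_val_score(model, self.train_x, self.train_y, cv=3).mean()
        return {'function_value': float(1.0 - score)}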

Hyperparameters for AdaBoost#

AdaBoost is a popular ensemble method that combines multiple weak learners to create a strong classifier. It is particularly useful for boosting the performance of decision trees. The hyperparameters control the number of estimators and the learning rate, which affects the contribution of each classifier.

| Hyperparameter | Range | Type |
|---|---|---|
| n_estimators | \([1, 100]\) | Integer |
| learning_rate | \([0.01, 1]\) | Continuous |

Hyperparameters for Random Forest#

Random Forest is an ensemble learning method that builds multiple decision trees and merges them to get a more accurate and stable prediction. It is widely used for both classification and regression tasks. The hyperparameters include the number of trees, the depth of the trees, and various criteria for splitting nodes.

| Hyperparameter | Range | Type |
|---|---|---|
| n_estimators | \([1, 1000]\) | Integer |
| max_depth | \([1, 100]\) | Integer |
| criterion | {gini, entropy} | Categorical |
| min_samples_leaf | \([1, 20]\) | Integer |
| min_weight_fraction_leaf | \([0.0, 0.5]\) | Continuous |
| min_impurity_decrease | \([0.0, 1.0]\) | Continuous |
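By analogy with the SVM sketch, the Random Forest space mixes integer, categorical, and continuous variables. The Integer and Categorical variable types in the sketch below are assumed to exist alongside Continuous; their exact names in PrismBO may differ, so treat this purely as an illustration of declaring a mixed-type space.

def get_configuration_space(self) -> SearchSpace:
    # Integer and Categorical are assumed variable types, named by analogy with Continuous
    variables = [
        Integer('n_estimators', (1, 1000)),
        Integer('max_depth', (1, 100)),
        Categorical('criterion', ['gini', 'entropy']),
        Integer('min_samples_leaf', (1, 20)),
        Continuous('min_weight_fraction_leaf', (0.0, 0.5)),
        Continuous('min_impurity_decrease', (0.0, 1.0)),
    ]
    return SearchSpace(variables)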

Hyperparameters for XGBoost#

XGBoost is an efficient and scalable implementation of gradient boosting, designed for speed and performance. It is widely used in machine learning competitions and industry for classification and regression tasks. The hyperparameters include learning rates, tree depths, and regularization parameters, which control the complexity of the model and its ability to generalize.

| Hyperparameter | Range | Type |
|---|---|---|
| eta | \([-10.0, 0.0]\) | Continuous |
| max_depth | \([1, 15]\) | Integer |
| min_child_weight | \([0.0, 7.0]\) | Continuous |
| colsample_bytree | \([0.01, 1.0]\) | Continuous |
| colsample_bylevel | \([0.01, 1.0]\) | Continuous |
| reg_lambda | \([-10.0, 10.0]\) | Continuous |
| reg_alpha | \([-10.0, 10.0]\) | Continuous |
| subsample_per_it | \([0.1, 1.0]\) | Continuous |
| n_estimators | \([1, 50]\) | Integer |
| gamma | \([0.0, 1.0]\) | Continuous |
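To make the mapping from such a configuration onto an actual model concrete, here is a hedged sketch of an objective_function that trains an XGBoost classifier with the xgboost Python package and reports cross-validation error. Decoding eta, reg_lambda, and reg_alpha as base-2 exponents, and the self.train_x / self.train_y data attributes, are assumptions made only for this illustration.

def objective_function(self, configuration, fidelity=None, seed=None, **kwargs) -> Dict:
    from sklearn.model_selection import cross_val_score
    from xgboost import XGBClassifier

    model = XGBClassifier(
        learning_rate=2.0 ** configuration['eta'],      # eta treated as a log2 exponent (assumption)
        max_depth=int(configuration['max_depth']),
        min_child_weight=configuration['min_child_weight'],
        colsample_bytree=configuration['colsample_bytree'],
        colsample_bylevel=configuration['colsample_bylevel'],
        reg_lambda=2.0 ** configuration['reg_lambda'],  # log2 exponent (assumption)
        reg_alpha=2.0 ** configuration['reg_alpha'],    # log2 exponent (assumption)
        subsample=configuration['subsample_per_it'],
        n_estimators=int(configuration['n_estimators']),
        gamma=configuration['gamma'],
        random_state=seed,
    )
    # self.train_x / self.train_y stand in for however the task dataset is provided
    score = cross_val_score(model, self.train_x, self.train_y, cv=3).mean()
    return {'function_value': float(1.0 - score)}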

Hyperparameters for GLMNet#

GLMNet is a regularized regression model that supports both LASSO and ridge regression. It is particularly useful for high-dimensional datasets where regularization is necessary to prevent overfitting. The hyperparameters control the strength of the regularization and the balance between L1 and L2 penalties.

| Hyperparameter | Range | Type |
|---|---|---|
| lambda | \([0, 10^5]\) | Log-integer |
| alpha | \([0, 1]\) | Continuous |
| nlambda | \([1, 100]\) | Integer |

Hyperparameters for AlexNet#

AlexNet is a convolutional neural network (CNN) architecture that revolutionized the field of computer vision by achieving significant improvements on the ImageNet dataset. The hyperparameters include learning rate, dropout rate, weight decay, and the choice of activation function, all of which are crucial for training deep neural networks.

| Hyperparameter | Range | Type |
|---|---|---|
| learning_rate | \([10^{-5}, 10^{-1}]\) | Continuous |
| dropout_rate | \([0.0, 0.5]\) | Continuous |
| weight_decay | \([10^{-5}, 10^{-2}]\) | Continuous |
| activation_function | {ReLU, Leaky ReLU, ELU} | Categorical |

Hyperparameters for 2-Layer Bayesian Neural Network (BNN)#

Bayesian Neural Networks (BNNs) provide a probabilistic interpretation of deep learning models by introducing uncertainty in the weights. This allows BNNs to express model uncertainty, which is crucial for tasks where uncertainty quantification is important. The hyperparameters include layer sizes, step length, burn-in period, and momentum decay.

| Hyperparameter | Range | Type |
|---|---|---|
| layer 1 | \([2^4, 2^9]\) | Log-integer |
| layer 2 | \([2^4, 2^9]\) | Log-integer |
| step_length | \([10^{-6}, 10^{-1}]\) | Log-continuous |
| burn_in | \([0, 8]\) | Integer |
| momentum_decay | \([0, 1]\) | Log-continuous |

Hyperparameters for CNNs#

Convolutional Neural Networks (CNNs) are the backbone of most modern computer vision systems. They are designed to automatically and adaptively learn spatial hierarchies of features through backpropagation. The hyperparameters include learning rate, momentum, regularization parameter, dropout rate, and activation function.

| Hyperparameter | Range | Type |
|---|---|---|
| learning_rate | \([10^{-6}, 10^{-1}]\) | Continuous |
| momentum | \([0.0, 0.9]\) | Continuous |
| regularization_parameter | \([10^{-6}, 10^{-2}]\) | Continuous |
| dropout_rate | \([0, 0.5]\) | Continuous |
| activation_function | {ReLU, Leaky ReLU, Tanh, Sigmoid} | Categorical |

Hyperparameters for ResNet18#

ResNet18 is a residual network architecture that introduced the concept of residual connections, allowing for the training of very deep networks by mitigating the vanishing gradient problem. The hyperparameters include learning rate, momentum, dropout rate, and weight decay.

| Hyperparameter | Range | Type |
|---|---|---|
| learning_rate | \([10^{-5}, 10^{-1}]\) | Continuous |
| momentum | \([0, 1]\) | Continuous |
| dropout_rate | \([0, 0.5]\) | Continuous |
| weight_decay | \([10^{-5}, 10^{-2}]\) | Continuous |

Hyperparameters for DenseNet#

DenseNet is a densely connected convolutional network that connects each layer to every other layer in a feed-forward fashion. This architecture improves the flow of information and gradients throughout the network, making it easier to train. The hyperparameters include learning rate, momentum, dropout rate, and weight decay.

| Hyperparameter | Range | Type |
|---|---|---|
| learning_rate | \([2^3, 2^8]\) | Integer |
| momentum | \([0, 1]\) | Continuous |
| dropout_rate | \([0, 0.5]\) | Continuous |
| weight_decay | \([10^{-5}, 10^{-1}]\) | Continuous |

Machine Learning Tasks#

This section lists the various sources of machine learning tasks used for hyperparameter optimization, including classification and regression problems. These datasets are widely recognized in the machine learning community and are used for benchmarking algorithms.

| Source | Type | Number | IDs |
|---|---|---|---|
| OpenML-CC18 | Classification | 78 | 1-78 |
| UC Irvine Repository | Classification/Regression | 10 | 79-88 |
| NAS-Bench-360 | Classification/Regression | 5 | 89-93 |
| NATS-Bench | Classification | 3 | 94-96 |
| SVHN | Classification | 1 | 97 |
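As an illustration of how the classification tasks can be obtained, the snippet below fetches the first OpenML-CC18 task with the openml Python package. This is an independent example of loading the data, not a description of PrismBO's own task loader.

import openml

# OpenML-CC18 is published as benchmark suite 99 on OpenML
suite = openml.study.get_suite(99)
task = openml.tasks.get_task(suite.tasks[0])  # first task ID in the suite
X, y = task.get_X_and_y()
print(X.shape, y.shape)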

Configurable Software Optimization Problem#

This section provides a summary of the configurable software optimization (CSO) tasks, which involve optimizing various software systems. The tasks are characterized by the number of variables, objectives, and workloads, along with the sources of these workloads.

| Software Name | Variables | Objectives | Workloads | Workloads Source |
|---|---|---|---|---|
| LLVM | 93 | 8 | 50 | PolyBench, mibench |
| GCC | 105 | 8 | 50 | PolyBench, mibench |
| Mysql | 28 | 14 | 18 | benchbase, sysbench |
| Hadoop | 206 | 1 | 29 | HiBench |

RNA Inverse Design Problem#

RNA inverse design involves designing RNA sequences that fold into specific secondary structures. This task is crucial for understanding and manipulating RNA function in various biological processes. The datasets listed here are commonly used benchmarks for RNA design algorithms.

| Source | Min-Max Length (nt) | Samples |
|---|---|---|
| Eterna100 | 11-399 | 100 |
| Rfam-learn test | 50-446 | 100 |
| RNA-Strand | 4-4381 | 50 |
| RNAStralign | 30-1851 | 37149 |
| ArchiveII | 28-2968 | 2975 |
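As an illustration of how an RNA inverse design candidate can be scored, the sketch below folds a sequence with the ViennaRNA Python bindings and counts mismatches against a target secondary structure in dot-bracket notation. This is a generic example, not PrismBO's evaluation code, and assumes the RNA module is installed.

import RNA  # ViennaRNA Python bindings (assumed to be available)

def structure_distance(sequence: str, target_structure: str) -> int:
    # Predict the minimum-free-energy structure and count positions that differ from the target
    predicted_structure, _mfe = RNA.fold(sequence)
    return sum(1 for a, b in zip(predicted_structure, target_structure) if a != b)

# A short hairpin example: distance 0 means the sequence folds into the target structure
print(structure_distance("GGGGAAAACCCC", "((((....))))"))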

Protein Inverse Folding Problem#

Protein inverse folding involves designing new amino acid sequences that fold into a desired backbone structure. These problems are essential for applications in drug design, biotechnology, and synthetic biology. The datasets listed here are widely used in protein inverse folding research.

| Source | Type | Numbers |
|---|---|---|
| Absolute | Antibody design | 159 |
| CATH | Single-chain protein design | 19752 |
| Protein Data Bank | Multi-chain protein design | 26361 |

Parallelization#

To-do