here we illustrate the integration of SoTA BO method TuRBO [2] with LA-MCTS. We use TuRBO-1 (no bandit) for solving minf(x) within the selected region, and make the following changes inside TuRBO, which is summarized in Fig. 2(c).
a) At every re-starts, we initialize TuRBO with random samples only in Ωselected. The shape of Ωselected can be arbitrary,so we use the rejected sampling (uniformly samples and reject outliers with SVM) to get a few points inside Ωselected. Since we only need a few samples for the initialization, the reject sampling is sufficient.
b) TuRBO centers a bounding box at the best solution so far, while we restrict the center to be the best solution in Ωselected.
c) TuRBO uniformly samples from the bounding box to feed the acquisition to select the best as the next sample, and we restrict the TuRBO to uniformly sample from the intersection of the bounding box and Ωselected. The intersection is guaranteed to exist because the center is within Ωselected. At each iteration, we keep TuRBO running until the
size of trust-region goes 0, and all the evaluations, i.e. xi and f(xi), are returned to LA-MCTS to
refine learned boundaries in the next iteration. Noted our method is also extensible to other solvers