View Publication

Benchmarking Cross Entropy Exploiting partial Decomposability: Analysis of a Batch Sampling Method for Large-Scale Structured Optimization

Siamak Ravanbakhsh, Dept of Computing Science, University of Alberta
Russ Greiner, Dept of Computing Science; PI of AICML

Full Text: poster_final copy_from_jpeg.pdf

While many of the Monte Carlo methods for optimization are sequential sampling methods (e.g., simulated annealing), some important ones are batch sampling techniques, including Cross Entropy (a.k.a. CE) [1] and Probability Collectives [2]. These methods each start by defining a distribution over the optimization domain, which is iteratively refined, to improve the chance that an instance from this distribution is likely to be (near) an optimum. CE performs iterative improvements by sampling the current distribution, identifying the "elite" instances (the ones producing the best scores), then producing the maximum likelihood distribution over these instances. At convergence, CE returns the mode of final distribution as an approximation for a global optimum. These batch sampling methods are highly parallelizable, and also provide sensitivity information about the optimization landscape: Finding a distribution is peaked wrt a variable means a small change to that variable's value (in that region) will have a large impact on the value of the optimization function. This observation inspired our Cross Entropy Method for Exploiting partial Decomposability; CEED [3,4], which can deal with an optimization function that is the sum of loss sub-functions L(x) = L_i(ÃÂx_i), each involving a subset of variables. (Note a variable may appear in several loss sub-functions.) CEED first uses the CE approach for each Li individually: given a current distribution D_i over its X_i variables, CEED produces a new distribution D_i^t+1,. based on the elite subset of instances drawn from the D_i. CEED then forms a single distribution Dt over all of the X variables by combining the distributions {Dt_ii}: this involves first computing a single distribution over each variable Xj , by combining the marginals wrt each Dit that involves Xj . To be effective, this combination process must use the sensitivity of each Dti wrt Xj ; here we considered Fisher Information and the variance of the estimated max-likelihood parameters for the marginals of Dti over each single variable [3]. While variance of this ML estimate can be readily computed from the estimated parameters for discrete variables, it it is not readily available for continuous variables (when dealing with a member of the exponential family) We therefore use an adaptive binning approach to approximate a continuous variable as a categorical one. This report explores the effect of different characteristics of the problem as well as that of CEED''ÃÂÃÂs parameters on the convergence rate, for a difficult high-dimensional multi-modal optimization prob- lem. We evaluate how entanglement of the loss subfunctions {L_i} (corresponding to the variables in common across subfunctions) affects the performance of CEED, and illuminate the situations where CEED is most advantageous. The results demonstrate that incorporating information from the different L_i, rather than only looking at their sum, can increase the convergence rate by many folds. We compare our CEED method against its "father"Â CE, as well as other methods, including simulated annealing method. These analyses will also help us to further extend CEED, towards developing extensions capable of solving some major challenges, such as protein folding, protein- protein and protein-ligand docking and in general any energy minimization task that involves many losses (energy terms), when most of the terms depend on only a small subset of variables.

Citation

S. Ravanbakhsh, R. Greiner. "Benchmarking Cross Entropy Exploiting partial Decomposability: Analysis of a Batch Sampling Method for Large-Scale Structured Optimization". NIPS: Monte Carlo Methods for Bayesian Inference in Modern Day Applications, December 2010.

Keywords:	Optimization, Benchmark, decomposable, Importance Sampling, Monte-Carlo, Machine Learning
Category:

BibTeX

@incollection{Ravanbakhsh+Greiner:10,
  author = {Siamak Ravanbakhsh and Russ Greiner},
  title = {Benchmarking Cross Entropy Exploiting partial Decomposability:
    Analysis of a Batch Sampling Method for Large-Scale Structured
    Optimization},
  booktitle = {NIPS: Monte Carlo Methods for Bayesian Inference in Modern Day
    Applications},
  year = 2010,
}

Last Updated: March 30, 2011
Submitted by Russ Greiner

Not Logged In

PapersDB

Benchmarking Cross Entropy Exploiting partial Decomposability: Analysis of a Batch Sampling Method for Large-Scale Structured Optimization

Citation

BibTeX