CMA-ES with Optimal Covariance Update and Storage Complexity

Authors: Oswin Krause, Dídac Rodríguez Arbonès, Christian Igel

NeurIPS 2016

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "We compared the Cholesky-CMA-ES with other CMA-ES variants." We provide empirical performance results comparing the original CMA-ES with the new Cholesky-CMA-ES using various benchmark functions in Section 4.
Researcher Affiliation | Academia | Oswin Krause, Dept. of Computer Science, University of Copenhagen, Copenhagen, Denmark, oswin.krause@di.ku.dk; Dídac R. Arbonès, Dept. of Computer Science, University of Copenhagen, Copenhagen, Denmark, didac@di.ku.dk; Christian Igel, Dept. of Computer Science, University of Copenhagen, Copenhagen, Denmark, igel@di.ku.dk
Pseudocode | Yes | "Algorithm 1: The Cholesky-CMA-ES." "Algorithm 2: rankOneUpdate(A, β, v)"
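The rank-one update named above maintains a Cholesky factor of the covariance matrix in quadratic time. A minimal sketch, using the classic rank-one Cholesky update via Givens-style rotations; the function name, in-place convention, and the assumption β ≥ 0 are ours, not the paper's code:

```python
import numpy as np

def rank_one_update(A, beta, v):
    """Update a lower-triangular Cholesky factor A (with A @ A.T = C)
    in place so that afterwards A @ A.T = C + beta * outer(v, v).
    Runs in O(d^2) instead of the O(d^3) of a full re-factorization.
    Assumes beta >= 0 so the updated matrix stays positive definite.
    """
    d = A.shape[0]
    w = np.sqrt(beta) * np.asarray(v, dtype=float).copy()
    for j in range(d):
        # Rotate the j-th diagonal entry together with w[j].
        r = np.hypot(A[j, j], w[j])
        c = r / A[j, j]
        s = w[j] / A[j, j]
        A[j, j] = r
        if j + 1 < d:
            # Apply the same rotation to the rest of column j,
            # then fold the updated column back into w.
            A[j + 1:, j] = (A[j + 1:, j] + s * w[j + 1:]) / c
            w[j + 1:] = c * w[j + 1:] - s * A[j + 1:, j]
    return A
```

Keeping and updating the factor directly is what removes the need to store and decompose the full covariance matrix in each generation.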
Open Source Code | Yes | "We added our algorithm to the open-source machine learning library Shark [Igel et al., 2008] and used LAPACK for high efficiency."
Open Datasets | No | "We considered standard benchmark functions for derivative-free optimization given in Table 1." The paper uses mathematical benchmark functions (Sphere, Rosenbrock, etc.), which are defined by equations rather than public datasets that would require specific links or citations for access.
Dataset Splits | No | The paper describes running 100 trials from different initial points and monitoring metrics. However, it does not specify explicit training, validation, or test splits, since it evaluates optimization algorithms on mathematical benchmark functions rather than fixed datasets.
Hardware Specification | No | The paper does not provide specific hardware details (e.g., exact GPU/CPU models, processor types, or memory amounts) used for running its experiments.
Software Dependencies | No | The paper mentions using "Shark [Igel et al., 2008]" and "LAPACK", but does not provide specific version numbers for these software components.
Experiment Setup | Yes | "All parameters (µ, λ, ω, cσ, dσ, cc, c1, cµ) are set to their default values [Hansen, 2015, Table 1]." "All starting points were drawn uniformly from [0, 1], except for Sphere, where we sampled from N(0, I)." "For each function, we vary d ∈ {4, 8, 16, . . . , 256}."
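The setup above can be sketched as follows. This is an illustrative reconstruction, not the paper's code: the function definitions are the standard ones, and the seed and loop structure are assumptions.

```python
import numpy as np

def sphere(x):
    # Sphere benchmark: sum of squares, minimum 0 at the origin.
    return np.sum(x ** 2)

def rosenbrock(x):
    # Rosenbrock benchmark: curved valley, minimum 0 at (1, ..., 1).
    return np.sum(100.0 * (x[1:] - x[:-1] ** 2) ** 2
                  + (1.0 - x[:-1]) ** 2)

rng = np.random.default_rng(0)          # seed is an arbitrary choice
dims = [2 ** k for k in range(2, 9)]    # d in {4, 8, 16, ..., 256}

for d in dims:
    # Uniform starting point in [0, 1] per coordinate for most benchmarks;
    # standard normal N(0, I) starting point for Sphere, as stated above.
    x0_uniform = rng.uniform(0.0, 1.0, size=d)
    x0_sphere = rng.standard_normal(d)
    # ... run each CMA-ES variant from these starting points ...
```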