Transferring Optimality Across Data Distributions via Homotopy Methods
Authors: Matilde Gargiani, Andrea Zanelli, Quoc Tran-Dinh, Moritz Diehl, Frank Hutter
ICLR 2020 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Empirical evaluations on a toy regression dataset and for transferring optimized parameters from MNIST to Fashion-MNIST and CIFAR-10 show substantial improvement of the numerical performance over random initialization and pre-training. |
| Researcher Affiliation | Collaboration | Matilde Gargiani (1), Andrea Zanelli (2), Quoc Tran-Dinh (3), Moritz Diehl (2,4), Frank Hutter (1,5). (1) Department of Computer Science, University of Freiburg, {gargiani, fh}@cs.uni-freiburg.de; (2) Department of Microsystems Engineering (IMTEK), University of Freiburg, {andrea.zanelli, moritz.diehl}@imtek.uni-freiburg.de; (3) Department of Statistics and Operations Research, University of North Carolina, quoctd@email.unc.edu; (4) Department of Mathematics, University of Freiburg; (5) Bosch Center for Artificial Intelligence |
| Pseudocode | Yes | Conceptually, Algorithm 1 describes the basic steps of a general homotopy algorithm. Algorithm 1 (A Conceptual Homotopy Algorithm): 1: θ_0 ← θ*_0 ∈ arg min_θ H(θ, 0); 2: γ > 0, γ ∈ Z; 3: λ_0 = 0, Δλ = 1/γ; 4: k > 0, k ∈ Z; 5: for i = 1, . . . , γ do; 6: λ_i ← λ_{i-1} + Δλ; 7: θ_i ← ITERATIVESOLVER(θ_{i-1}, k, H(θ, λ_i)); 8: return θ_γ. (An illustrative Python rendering of this loop is sketched below the table.) |
| Open Source Code | No | The paper does not provide any explicit statement about releasing source code for the described methodology, nor does it include a link to a code repository. |
| Open Datasets | Yes | Empirical evaluations on a toy regression dataset and for transferring optimized parameters from MNIST to Fashion-MNIST and CIFAR-10 show substantial improvement of the numerical performance over random initialization and pre-training. |
| Dataset Splits | No | Section 6.1 states: 'Each considered dataset has 10000 samples split across training and testing...' While a train/test split is mentioned, no specific information about a validation split, exact percentages for splits, or cross-validation methodology is provided. |
| Hardware Specification | No | The paper does not provide any specific details about the hardware used for running the experiments, such as CPU or GPU models, memory, or cloud instance specifications. |
| Software Dependencies | No | The paper mentions using 'Adam as optimizer' and 'VGG-type network'. However, it does not specify any version numbers for these or other software components (e.g., Adam version, PyTorch version, Python version), which are necessary for reproducibility. |
| Experiment Setup | Yes | For the experiments in Figures 1a and 1b, Figures 7a and 7b in the appendix, and Figure 2a, we set α = 0.001, γ = 10, k = 200 and then performed an additional 500 epochs on the final target problem, while for the experiments in Figure 2b, we set γ = 10, k = 300 and performed an additional 600 epochs on the final target problem. In this last scenario we set α = 0.001 and then decrease it with a cosine annealing schedule to observe convergence to an optimum. (An illustrative optimizer configuration for this setting is sketched below the table.) |
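
The pseudocode row above maps onto a short continuation loop. The following is a minimal Python sketch of Algorithm 1 under stated assumptions: the helper names `homotopy` and `gd`, as well as the toy quadratic homotopy map, are illustrative and do not come from the paper, which does not release code.

```python
def homotopy(solve_source_problem, iterative_solver, H, gamma=10, k=200):
    """Trace an approximate minimizer of H(theta, lam) from lam = 0 to lam = 1."""
    theta = solve_source_problem()            # step 1: minimizer of the source problem H(., 0)
    delta_lam = 1.0 / gamma                   # step 3: continuation step size
    lam = 0.0
    for _ in range(gamma):                    # step 5: gamma continuation steps
        lam += delta_lam                      # step 6: move lambda toward the target problem
        # step 7: warm-start the solver at the previous solution and run k iterations
        theta = iterative_solver(theta, k, lambda th, lam=lam: H(th, lam))
    return theta                              # step 8: approximate solution of the target problem

def gd(theta, k, f, lr=0.1, eps=1e-6):
    """Hypothetical ITERATIVESOLVER: k gradient-descent steps with finite-difference gradients."""
    for _ in range(k):
        g = (f(theta + eps) - f(theta - eps)) / (2 * eps)
        theta -= lr * g
    return theta

# Toy usage: interpolate between two scalar quadratics; the target minimizer is b = 3.0.
a, b = 0.0, 3.0
H = lambda th, lam: (1 - lam) * (th - a) ** 2 + lam * (th - b) ** 2
print(homotopy(lambda: a, gd, H, gamma=10, k=200))  # ~3.0
```

Warm-starting each subproblem at the previous solution is what distinguishes this loop from simply optimizing the target problem from a random initialization.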
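
For the hyperparameters reported in the Experiment Setup row, a plausible PyTorch rendering of the Figure 2b optimizer configuration is sketched below. The stand-in model and the synthetic loss are illustrative assumptions; only Adam, the initial learning rate α = 0.001, the cosine annealing schedule, and the 600 additional epochs are taken from the paper's description.

```python
import torch

# Stand-in for the VGG-type network trained on the final target problem.
model = torch.nn.Linear(10, 10)

# Adam with alpha = 0.001; the learning rate is then decayed with cosine
# annealing over the additional 600 epochs on the final target problem.
optimizer = torch.optim.Adam(model.parameters(), lr=0.001)
scheduler = torch.optim.lr_scheduler.CosineAnnealingLR(optimizer, T_max=600)

for epoch in range(600):
    optimizer.zero_grad()
    # Placeholder objective: a real run would loop over target-problem batches here.
    loss = model(torch.randn(32, 10)).pow(2).mean()
    loss.backward()
    optimizer.step()
    scheduler.step()  # anneal the learning rate once per epoch
```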