Trust Region Methods for Nonconvex Stochastic Optimization beyond Lipschitz Smoothness

Authors: Chenghan Xie, Chenxi Li, Chuwen Zhang, Qi Deng, Dongdong Ge, Yinyu Ye

AAAI 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We perform three sets of experiments in machine learning with a focus on DRO to justify our analysis. Figures (1a) and (1b) present the training curves of SGD, the first-order (FOTRGS) and second-order (SOTRGS) trust region methods, as well as their variance-reduced variants, on the MNIST and Fashion MNIST datasets, respectively. Tables (2a) and (2b) present the test accuracy of the different methods.
Researcher Affiliation | Academia | School of Information Management and Engineering, Shanghai University of Finance and Economics; School of Mathematical Sciences, Fudan University; Department of Management Science and Engineering, Stanford University
Pseudocode | Yes | Algorithm 1: the trust region framework. Algorithm 2: the variance-reduced trust region method.
Open Source Code | Yes | The complete description is left in the Appendix, and code is available at https://github.com/bzhangcw/pydrsom-dro.
Open Datasets | Yes | We focus on classification tasks with imbalanced distributions arising from applications with heterogeneous (but often latent) subpopulations. Since in standard datasets like MNIST, Fashion MNIST, and CIFAR-10 the population ratios (number of images per class) are the same, we create a perturbed dataset that inherits a disparity (Hashimoto et al. 2018) by choosing only a subset of training samples for each of the categories.
Dataset Splits | No | The paper mentions training samples and reports test accuracy, but it does not specify explicit percentages or counts for the training, validation, and test splits needed for reproduction.
Hardware Specification | No | The paper does not provide specific details about the hardware (e.g., GPU model, CPU type, memory) used to run the experiments.
Software Dependencies | No | The paper does not provide version numbers for any software dependencies or libraries used in the experiments (e.g., Python, PyTorch, or TensorFlow versions).
Experiment Setup | No | The paper states that a "grid search over the parameters" was performed and that "The complete description is left in the Appendix," but it does not provide specific hyperparameter values or detailed training configurations in the main text.
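The trust region framework named under Pseudocode (Algorithm 1) follows the classical accept/reject pattern: minimize a local quadratic model of the objective within a ball of radius delta, compare predicted versus actual decrease, and grow or shrink the radius accordingly. The sketch below is a generic deterministic illustration of that pattern using a Cauchy-point step; it is not the paper's FOTRGS or SOTRGS method, and all function and parameter names here are our own.

```python
import numpy as np

def cauchy_point(g, B, delta):
    """Minimizer of the quadratic model g@p + 0.5*p@B@p along -g,
    restricted to the ball of radius delta (Cauchy point)."""
    gBg = g @ B @ g
    tau = 1.0
    if gBg > 0:
        tau = min(1.0, np.linalg.norm(g) ** 3 / (delta * gBg))
    return -tau * (delta / np.linalg.norm(g)) * g

def trust_region(f, grad, hess, x, delta=1.0, max_iter=100, tol=1e-8):
    """Generic trust-region loop: step, check model quality, adapt radius."""
    for _ in range(max_iter):
        g = grad(x)
        if np.linalg.norm(g) < tol:
            break
        B = hess(x)
        p = cauchy_point(g, B, delta)
        pred = -(g @ p + 0.5 * p @ B @ p)    # decrease predicted by the model
        actual = f(x) - f(x + p)             # decrease actually achieved
        rho = actual / pred if pred > 0 else -1.0
        if rho > 0.75 and np.isclose(np.linalg.norm(p), delta):
            delta *= 2.0     # model is trustworthy at the boundary: expand
        elif rho < 0.25:
            delta *= 0.25    # model is poor: shrink the region
        if rho > 0.1:        # accept the step only if it made real progress
            x = x + p
    return x
```

On a convex quadratic, the ratio rho is exactly 1 and every step is accepted; the stochastic variants analyzed in the paper replace the exact gradient and Hessian with sampled estimates.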
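The perturbed-dataset construction described under Open Datasets, which keeps only a subset of training samples per class to induce a population disparity, can be sketched as follows. The function name and the keep_fractions values are illustrative assumptions; the paper's exact per-class ratios are left to its Appendix.

```python
import numpy as np

def make_imbalanced(X, y, keep_fractions, seed=0):
    """Subsample each class to induce class imbalance.

    keep_fractions maps a class label to the fraction of its samples
    to keep; classes not listed are kept in full. (Illustrative only:
    the paper's actual subsampling ratios are in its Appendix.)
    """
    rng = np.random.default_rng(seed)
    keep_idx = []
    for label in np.unique(y):
        idx = np.flatnonzero(y == label)
        frac = keep_fractions.get(label, 1.0)
        n_keep = max(1, int(round(frac * idx.size)))
        keep_idx.append(rng.choice(idx, size=n_keep, replace=False))
    keep_idx = np.concatenate(keep_idx)
    rng.shuffle(keep_idx)  # avoid blocks of a single class
    return X[keep_idx], y[keep_idx]
```

For example, applying this to a balanced two-class dataset with keep_fractions={0: 0.1} keeps 10% of class 0 and all of class 1, producing the kind of latent-subpopulation disparity the DRO experiments target.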