Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].

Swift Sampler: Efficient Learning of Sampler by 10 Parameters

Authors: Jiawei Yao, Chuming Li, Canran Xiao

NeurIPS 2024 | Venue PDF | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Comprehensive experiments on various tasks demonstrate that SS-powered sampling can achieve obvious improvements (e.g., 1.5% on ImageNet) and transfer among different neural networks.
Researcher Affiliation | Academia | 1 University of Washington, 2 Shanghai Artificial Intelligence Laboratory, 3 The University of Sydney, 4 Central South University. EMAIL, EMAIL, EMAIL
Pseudocode | Yes | Algorithm 1 SS
Open Source Code | Yes | Project page: https://github.com/Alexander-Yao/Swift-Sampler.
Open Datasets | Yes | We apply SS to training neural networks with various sizes, including ResNet-18 and SE-ResNeXt101, with training data from different datasets including ImageNet [Russakovsky et al., 2015], CIFAR10 and CIFAR100 [Krizhevsky et al., 2009].
Dataset Splits | Yes | For a target task, e.g., image classification, its training set and validation set are respectively denoted by Dt and Dv, and the parameters of the target model are denoted by w. ... Specifically, the network with parameters w(τ) obtained from the inner loop is used for searching the sampler τ that has the best score P(Dv; w(τ)) on the validation set Dv.
Hardware Specification | Yes | We set the number of segments S as 4 in all cases and utilize 8 NVIDIA A100 GPUs to ensure efficient processing.
Software Dependencies | No | The paper mentions software components and optimizers such as SGD with Nesterov momentum and L2 regularization, but does not provide specific version numbers for any libraries or frameworks (e.g., TensorFlow, PyTorch, or Python).
Experiment Setup | Yes | In all experiments, the optimization step Eo is fixed as 40, and the fine-tune epochs Ef are set to 5. We set the number of segments S as 4 in all cases... We set the batch size as 128 and the L2 regularization as 1e-3. The training process lasts 80 epochs, and the learning rate is initialized as 0.1 and decays by a factor of 0.1 at the 40-th and 80-th epoch. We adopt mini-batch SGD with Nesterov momentum set to 0.9.
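The Experiment Setup row fully specifies the target-model training schedule. As a minimal sketch of that step-decay schedule (the function and constant names are illustrative, not taken from the paper's code), assuming the quoted hyperparameters:

```python
import math

# Hyperparameters quoted in the Experiment Setup row above.
BASE_LR = 0.1            # initial learning rate
GAMMA = 0.1              # decay factor applied at each milestone
MILESTONES = (40, 80)    # epochs at which the rate decays
TOTAL_EPOCHS = 80        # total training epochs

def learning_rate(epoch: int) -> float:
    """Step-decay schedule: multiply BASE_LR by GAMMA once per milestone reached."""
    decays = sum(epoch >= m for m in MILESTONES)
    return BASE_LR * GAMMA ** decays

# Sanity check: 0.1 for epochs 0-39, 0.01 for epochs 40-79.
assert math.isclose(learning_rate(39), 0.1)
assert math.isclose(learning_rate(40), 0.01)
```

In a PyTorch-style setup, the same recipe would presumably correspond to `SGD` with `lr=0.1`, `momentum=0.9`, `nesterov=True`, and `weight_decay=1e-3` (the L2 term), paired with a multi-step scheduler at epochs 40 and 80, though the paper does not name a framework.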