Profiling Pareto Front With Multi-Objective Stein Variational Gradient Descent

Authors: Xingchao Liu, Xin Tong, Qiang Liu

NeurIPS 2021 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We demonstrate the effectiveness of our method, especially the SVGD algorithm, through extensive experiments, showing its superiority over existing gradient-based algorithms. We show our method, Multi-Objective Stein Variational Gradient Descent (MOO-SVGD), can efficiently profile the Pareto front on a variety of problems, including the ZDT problem set, the tri-objective problem MaF1, the trade-off between accuracy and fairness metrics in fair ML, as well as multi-task learning on MultiMNIST, MultiFashion and MultiFashion+MNIST.
Researcher Affiliation | Academia | Xingchao Liu, Department of Computer Science, University of Texas at Austin (xcliu@utexas.edu); Xin T. Tong, Department of Mathematics, National University of Singapore (mattxin@nus.edu.sg); Qiang Liu, Department of Computer Science, University of Texas at Austin (lqiang@cs.utexas.edu).
Pseudocode | No | The paper does not contain structured pseudocode or algorithm blocks labeled as such. Equation (3) describes the update rule but is not formatted as an algorithm. (A hedged sketch of what such a particle update might look like is given below the table.)
Open Source Code | Yes | Code is available at https://github.com/gnobitab/MultiObjectiveSampling.
Open Datasets | Yes | MultiMNIST, MultiFashion and MultiFashion+MNIST, introduced in [34], are three standard benchmarks for multi-task deep learning. For this experiment, please refer to the appendix to see how we construct our dataset. (An illustrative construction sketch is given below the table.)
Dataset Splits | No | The paper mentions training and test sets but does not specify a separate validation set or its split percentage/count. For example, 'All three datasets have 120,000 samples in the training set, and 20,000 samples for test set.' provides no validation split information.
Hardware Specification | No | The paper states 'we run all the algorithms with CPU on the same computational platform with 192GB memory and 48 cores.' While this provides some detail, it lacks specific CPU model numbers or GPU specifications that would allow for precise replication of the hardware environment.
Software Dependencies | No | The paper mentions that models are 'optimized with SGD' but does not provide specific software names with version numbers for libraries, frameworks, or other dependencies needed for replication.
Experiment Setup | Yes | We use 50 particles for MOO-SVGD. We initialize the networks with the same parameters, then warm them up by training with a naive linear scalarization objective for one epoch, where the weights are sampled uniformly from [0, 1]. We switch back to MOO-SVGD in the remaining epochs. For all the experiments, all the methods are trained for 100 epochs with a batch size of 256. Moreover, they are optimized with SGD and the learning rate is 0.001. (A hedged training-loop sketch reflecting these settings is given below the table.)
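
For readers who want a concrete feel for the particle update that Equation (3) describes, below is a minimal sketch of an SVGD-style multi-objective step. This is an assumption-laden illustration, not the paper's exact formulation: the RBF kernel and bandwidth h, the repulsion weight alpha, and the mgda_direction helper (meant to return a common descent direction, e.g. the min-norm convex combination of the objective gradients) are all choices made here for illustration.

```python
import numpy as np

def rbf_kernel(X, h=1.0):
    """RBF kernel matrix K[j, i] = k(x_j, x_i) and its gradient w.r.t. x_j."""
    diff = X[:, None, :] - X[None, :, :]            # diff[j, i] = x_j - x_i
    K = np.exp(-np.sum(diff ** 2, axis=-1) / (2 * h ** 2))
    gradK = -diff / (h ** 2) * K[:, :, None]        # grad_{x_j} k(x_j, x_i)
    return K, gradK

def moo_svgd_step(X, mgda_direction, eps=1e-2, alpha=1.0, h=1.0):
    """One illustrative MOO-SVGD-style step on particles X of shape (n, d).

    mgda_direction(x) is assumed to return a common descent direction for all
    objectives at x (e.g. the min-norm convex combination of their gradients).
    """
    n = X.shape[0]
    G = np.stack([mgda_direction(x) for x in X])    # (n, d) descent directions
    K, gradK = rbf_kernel(X, h)
    # Kernel-smoothed descent drive, plus a repulsive term that spreads the
    # particles along the Pareto front instead of letting them collapse.
    drive = (K[:, :, None] * G[:, None, :]).sum(axis=0) / n
    repulse = gradK.sum(axis=0) / n
    return X - eps * (drive - alpha * repulse)
```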
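The dataset-construction details are deferred to the paper's appendix. As a rough illustration only, the commonly used MultiMNIST-style recipe overlays two 28x28 digits on a slightly larger canvas, one shifted toward the top-left and the other toward the bottom-right, and keeps both labels; the canvas size and shift range below are assumptions, not necessarily the authors' exact procedure.

```python
import numpy as np

def make_multimnist_pair(img_a, label_a, img_b, label_b,
                         canvas=36, shift=4, rng=np.random):
    """Overlay two 28x28 digits (values in [0, 1]) on a canvas x canvas image.

    img_a is placed near the top-left, img_b near the bottom-right, with a
    random offset of up to `shift` pixels; the sample keeps both labels.
    """
    out = np.zeros((canvas, canvas), dtype=np.float32)
    ra, ca = rng.randint(0, shift + 1, size=2)                          # top-left
    rb, cb = rng.randint(canvas - 28 - shift, canvas - 28 + 1, size=2)  # bottom-right
    out[ra:ra + 28, ca:ca + 28] = np.maximum(out[ra:ra + 28, ca:ca + 28], img_a)
    out[rb:rb + 28, cb:cb + 28] = np.maximum(out[rb:rb + 28, cb:cb + 28], img_b)
    return np.clip(out, 0.0, 1.0), (label_a, label_b)
```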
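Finally, a hedged sketch of how the stated setup (50 particles, a one-epoch linear-scalarization warm-up with weights drawn uniformly from [0, 1], MOO-SVGD for the remaining epochs, SGD with learning rate 0.001, batch size 256, 100 epochs) might be wired together in PyTorch. The model constructor, the two-task data loader, the per-task losses, and the moo_svgd_update helper are placeholders, not the authors' code.

```python
import torch

# Hyperparameters stated in the paper
N_PARTICLES, EPOCHS, BATCH_SIZE, LR = 50, 100, 256, 1e-3

# Hypothetical pieces: make_network(), train_loader (built with BATCH_SIZE and
# yielding two-task batches), task1_loss / task2_loss, and moo_svgd_update()
# standing in for the paper's Eq. (3) update over the particle set.
particles = [make_network() for _ in range(N_PARTICLES)]
optimizers = [torch.optim.SGD(net.parameters(), lr=LR) for net in particles]

# Warm-up: one epoch of linear scalarization; here each particle draws its own
# weight w ~ Uniform[0, 1] per batch (one plausible reading of the description).
for net, opt in zip(particles, optimizers):
    for x, y1, y2 in train_loader:
        w = torch.rand(())
        out1, out2 = net(x)
        loss = w * task1_loss(out1, y1) + (1 - w) * task2_loss(out2, y2)
        opt.zero_grad()
        loss.backward()
        opt.step()

# Remaining epochs: switch to the MOO-SVGD particle update over the whole set.
for epoch in range(1, EPOCHS):
    for x, y1, y2 in train_loader:
        moo_svgd_update(particles, optimizers, (x, y1, y2))  # placeholder
```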