Improving Pareto Front Learning via Multi-Sample Hypernetworks

Authors: Long P. Hoang, Dung D. Le, Tran Anh Tuan, Tran Ngoc Thang

AAAI 2023

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | The experimental results on several MOO machine learning tasks show that the proposed framework significantly outperforms the baselines in producing the trade-off Pareto front. For all methods based on hypernetworks, we use a feed-forward network with various outputs to parameterize h(ϕ, r). The target network's weight tensor is produced by each output in a distinct way. Specifically, the input r is first mapped by a multi-layer perceptron to a higher-dimensional space in order to create shared features. A weight matrix for each layer in the target network is created by passing these features through fully connected layers. Experiments demonstrate that PHN-HVI outperforms other methods.
Researcher Affiliation | Academia | 1 College of Engineering and Computer Science, VinUniversity; 2 School of Applied Mathematics and Informatics, Hanoi University of Science and Technology. long.hp@vinuni.edu.vn, dung.ld@vinuni.edu.vn, tuan.ta181295@sis.hust.edu.vn, thang.tranngoc@hust.edu.vn
Pseudocode | Yes | Algorithm 1: PHN-HVI optimization algorithm
Open Source Code | Yes | We publish the code at https://github.com/longhoangphi225/Multi-Sample-Hypernetworks
Open Datasets | Yes | Three benchmark datasets, Multi-MNIST, Multi-Fashion, and Multi-Fashion+MNIST (Lin et al. 2019), are used in our evaluation. In this investigation, we concentrated on the Drug Review dataset (Grässer et al. 2018). Jura (Goovaerts et al. 1997): the target variables are zinc, cadmium, copper, and lead (4 tasks), whereas the predictive features are the other metals, the type of land use, the type of rock, and the position coordinates at 359 different locations. SARCOS (Vijayakumar 2000): the goal is to predict the pertinent 7 joint torques (7 tasks) from a 21-dimensional input space (7 joint locations, 7 joint velocities, 7 joint accelerations).
Dataset Splits | Yes | On multi-task problems, the dataset is split into three subsets: training, validation, and testing. 10% of the training data are used for the validation split.
Hardware Specification | Yes | All experiments and methods in this paper are implemented with PyTorch (Paszke et al. 2019) and trained on a single NVIDIA GeForce RTX 3090.
Software Dependencies | No | The paper mentions 'implemented with PyTorch (Paszke et al. 2019)' but does not provide a specific version number for PyTorch or any other software dependencies.
Experiment Setup | Yes | For the toy examples, the optimization processes are run for 10,000 iterations, and 200 evenly distributed preference vectors are used for testing. We set p = 16, λ = 5 on the Multi-MNIST dataset, and p = 16, λ = 4 on the Multi-Fashion and Multi-Fashion+MNIST datasets for PHN-HVI.
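The hypernetwork design quoted in the Research Type row (preference vector r mapped by an MLP to shared features, then per-layer fully connected heads that emit the target network's weight tensors) can be sketched as follows. This is a minimal NumPy illustration of the general architecture, not the authors' implementation; the hidden width, target shapes, and function names are illustrative assumptions.

```python
import numpy as np

def init_hypernet(rng, n_obj=2, hidden=64, target_shapes=((10, 5), (5, 1))):
    """Toy hypernetwork h(phi, r): phi = these parameters, r = preference vector.

    Sizes here are placeholders, not the paper's configuration.
    """
    params = {
        # Shared MLP that lifts r into a higher-dimensional feature space.
        "W_shared": rng.normal(0.0, 0.1, (hidden, n_obj)),
        "b_shared": np.zeros(hidden),
        # One fully connected head per target layer; each emits a flat
        # weight tensor that is reshaped to that layer's shape.
        "heads": [rng.normal(0.0, 0.1, (int(np.prod(s)), hidden)) for s in target_shapes],
    }
    return params, target_shapes

def hypernet_forward(params, shapes, r):
    # Shared features from the preference vector.
    z = np.tanh(params["W_shared"] @ r + params["b_shared"])
    # Each head produces one target-layer weight matrix from the shared features.
    return [(H @ z).reshape(s) for H, s in zip(params["heads"], shapes)]

rng = np.random.default_rng(0)
params, shapes = init_hypernet(rng)
r = np.array([0.3, 0.7])  # preference over the 2 objectives
weights = hypernet_forward(params, shapes, r)
print([w.shape for w in weights])
```

Varying r and re-running the forward pass yields different target-network weights, which is how a single hypernetwork can represent an entire Pareto front.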
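For two objectives, the "200 evenly distributed preference vectors" used for testing can be generated by sampling the unit simplex on a uniform grid. A minimal sketch (the function name is ours, not the paper's; for more than two objectives a structured simplex grid would be needed instead):

```python
import numpy as np

def uniform_preferences(n=200):
    """n evenly spaced preference vectors (r1, r2) with r1 + r2 = 1."""
    t = np.linspace(0.0, 1.0, n)
    return np.stack([t, 1.0 - t], axis=1)

prefs = uniform_preferences()
print(prefs.shape)  # one row per preference vector
```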