Learning Neural Set Functions Under the Optimal Subset Oracle

Authors: Zijing Ou, Tingyang Xu, Qinliang Su, Yingzhen Li, Peilin Zhao, Yatao Bian

NeurIPS 2022

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Empirical studies on three real-world applications (including Amazon product recommendation, set anomaly detection, and compound selection for virtual screening) demonstrate that EquiVSet outperforms the baselines by a large margin.
Researcher Affiliation | Collaboration | (1) Tencent AI Lab, China; (2) Imperial College London, United Kingdom; (3) Sun Yat-sen University, China
Pseudocode | Yes | Algorithm 1 MFVI(ψ, V, K), Algorithm 2 DiffMF(V, S), Algorithm 3 EquiVSet(V, S) (see the hedged MFVI sketch below the table)
Open Source Code | Yes | Code is available at: https://github.com/SubsetSelection/EquiVSet
Open Datasets | Yes | In this experiment, we use the Amazon baby registry dataset (Gillenwater et al., 2014)...; Double MNIST: the dataset consists of 1,000 images for each digit ranging from 00 to 99 (Sun, 2019), https://github.com/shaohua0116/MultiDigitMNIST; CelebA: the CelebA dataset contains 202,599 images with 40 attributes (Liu et al., 2015b); PDBBind (Liu et al., 2015a): this dataset consists of experimentally measured binding affinities for bio-molecular complexes; BindingDB: a public database of measured binding affinities... We take the curated one from https://tdcommons.ai/multi_pred_tasks/dti/
Dataset Splits | Yes | Then we split the remaining subset collection S into training, validation, and test folds with a 1:1:1 ratio. We collect 1,000 samples for training, validation, and test, respectively. For each dataset, we randomly split the training, validation, and test set to the size of 10,000, 1,000, and 1,000, respectively. We finally obtain training, validation, and test sets of size 1,000, 100, and 100, respectively, for both datasets. (a split sketch follows the table)
Hardware Specification | No | The paper does not provide specific details about the hardware used for running experiments (e.g., GPU/CPU models, memory specifications).
Software Dependencies | No | The paper mentions using a 'BERT model (Devlin et al., 2018)' but does not list specific software dependencies with version numbers (e.g., Python, PyTorch, TensorFlow versions, or other libraries).
Experiment Setup | Yes | The model architectures and training details are deferred to Appendix E (main text). Appendix E.2, Training Details: We use Adam (Kingma & Ba, 2014) as the optimizer with initial learning rate 0.001, weight decay 0.0001, gradient clipping by value 1.0, and batch size 64. We train the model for 200 epochs and apply a cosine annealing learning rate scheduler with 5 warm-up epochs. (a hedged training-loop sketch follows the table)
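
The Pseudocode row lists MFVI(ψ, V, K) among the paper's algorithms. For orientation only, the following is a minimal, generic sketch of a mean-field fixed-point loop for a Bernoulli variational family over subsets of a ground set V. The Monte-Carlo marginal-gain estimator, the `set_fn` callable, and the `num_samples` argument are illustrative assumptions and not necessarily the paper's exact update rule.

```python
# Illustrative sketch only: generic mean-field fixed-point iteration matching
# the signature MFVI(psi, V, K) listed above. set_fn(mask, V) -> scalar utility
# is a placeholder (assumption); the marginal-gain estimate below is a plain
# Monte-Carlo approximation, not necessarily the paper's estimator.
import torch

def mfvi(psi, V, K, set_fn, num_samples=8):
    """psi: (n,) initial Bernoulli marginals in [0, 1]; V: (n, d) element
    features; K: number of fixed-point iterations."""
    n = psi.shape[0]
    for _ in range(K):
        for i in range(n):
            gains = []
            for _ in range(num_samples):
                # Sample a subset from the current marginals, then force
                # element i in/out to estimate its marginal gain.
                mask = torch.bernoulli(psi)
                with_i, without_i = mask.clone(), mask.clone()
                with_i[i], without_i[i] = 1.0, 0.0
                gains.append(set_fn(with_i, V) - set_fn(without_i, V))
            # Coordinate update: sigmoid of the estimated marginal gain.
            psi = psi.clone()
            psi[i] = torch.sigmoid(torch.stack(gains).mean())
    return psi
```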
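The Dataset Splits row quotes a random 1:1:1 train/validation/test split of the subset collection. A minimal sketch, assuming the collection is held as a Python list; the function name and fixed seed are illustrative.

```python
# Minimal sketch (assumption: the subset collection is a Python list). Only
# the 1:1:1 random split is taken from the quoted text; the seed is arbitrary.
import random

def split_1_1_1(subset_collection, seed=0):
    items = list(subset_collection)
    random.Random(seed).shuffle(items)
    k = len(items) // 3
    return items[:k], items[k:2 * k], items[2 * k:]
```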
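The Experiment Setup row quotes the optimisation settings from Appendix E.2. Below is a hedged PyTorch sketch wiring those reported values together; the model, dataset, loss function, and the linear warm-up schedule are placeholder assumptions, as only the hyperparameter values appear in the quoted text.

```python
# Hedged sketch of the reported optimisation settings (Appendix E.2). The
# model, dataset, and loss are placeholders (assumptions); only the
# hyperparameter values come from the quoted text.
import torch
from torch.utils.data import DataLoader

def train(model, train_set, loss_fn, epochs=200, warmup_epochs=5):
    loader = DataLoader(train_set, batch_size=64, shuffle=True)
    opt = torch.optim.Adam(model.parameters(), lr=1e-3, weight_decay=1e-4)
    sched = torch.optim.lr_scheduler.CosineAnnealingLR(
        opt, T_max=epochs - warmup_epochs)
    for epoch in range(epochs):
        # Warm-up for the first 5 epochs; the exact warm-up schedule is not
        # specified in the quote, so linear ramping is an assumption.
        if epoch < warmup_epochs:
            for g in opt.param_groups:
                g["lr"] = 1e-3 * (epoch + 1) / warmup_epochs
        for batch, target in loader:
            opt.zero_grad()
            loss = loss_fn(model(batch), target)
            loss.backward()
            # Gradient clipping by value at 1.0, as reported.
            torch.nn.utils.clip_grad_value_(model.parameters(), 1.0)
            opt.step()
        if epoch >= warmup_epochs:
            sched.step()
    return model
```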