Learning Neural Set Functions Under the Optimal Subset Oracle
Authors: Zijing Ou, Tingyang Xu, Qinliang Su, Yingzhen Li, Peilin Zhao, Yatao Bian
NeurIPS 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Empirical studies on three real-world applications (including Amazon product recommendation, set anomaly detection, and compound selection for virtual screening) demonstrate that EquiVSet outperforms the baselines by a large margin. |
| Researcher Affiliation | Collaboration | Tencent AI Lab, China; Imperial College London, United Kingdom; Sun Yat-sen University, China |
| Pseudocode | Yes | Algorithm 1 MFVI(ψ, V, K); Algorithm 2 DiffMF(V, S*); Algorithm 3 EquiVSet(V, S*) |
| Open Source Code | Yes | Code is available at: https://github.com/SubsetSelection/EquiVSet. |
| Open Datasets | Yes | In this experiment, we use the Amazon baby registry dataset (Gillenwater et al., 2014)...; Double MNIST: The dataset consists of 1000 images for each digit ranging from 00 to 99 (Sun, 2019). URL https://github.com/shaohua0116/MultiDigitMNIST; CelebA: The CelebA dataset contains 202,599 images with 40 attributes (Liu et al., 2015b); PDBBind (Liu et al., 2015a): This dataset consists of experimentally measured binding affinities for bio-molecular complexes; BindingDB: It is a public database of measured binding affinities... We take the curated one from https://tdcommons.ai/multi_pred_tasks/dti/ |
| Dataset Splits | Yes | Then we split the remaining subset collection S into training, validation, and test folds with a 1:1:1 ratio. We collect 1,000 samples for training, validation, and test, respectively. For each dataset, we randomly split the training, validation, and test set to the size of 10,000, 1,000, and 1,000, respectively. We finally obtain training, validation, and test sets of size 1,000, 100, and 100, respectively, for both datasets. (See the split sketch after this table.) |
| Hardware Specification | No | The paper does not provide specific details about the hardware used for running experiments (e.g., GPU/CPU models, memory specifications). |
| Software Dependencies | No | The paper mentions using 'BERT model (Devlin et al., 2018)' but does not list specific software dependencies with version numbers (e.g., Python, PyTorch, TensorFlow versions, or other libraries). |
| Experiment Setup | Yes | The model architectures and training details are deferred to Appendix E. (Main text) Appendix E.2: Training Details. We use Adam (Kingma & Ba, 2014) as the optimizer with initial learning rate 0.001, weight decay 0.0001, gradient clipping by value 1.0, and batch size 64. We train the model for 200 epochs, and apply a cosine annealing learning rate scheduler with 5 warm-up epochs. (Appendix E.2; see the training-loop sketch after this table.) |
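
The 1:1:1 split quoted in the Dataset Splits row amounts to a plain random partition. The snippet below is a minimal sketch of that step, not the authors' code; `subsets` is a placeholder for the collected subset annotations, and the fold sizes follow directly from the ratio.

```python
import random

# Minimal sketch of a 1:1:1 random split (placeholder data, not the paper's loader).
subsets = list(range(3000))        # stand-in for the collected subset annotations S
random.seed(0)
random.shuffle(subsets)

n = len(subsets) // 3              # equal-sized folds: train / validation / test
train, val, test = subsets[:n], subsets[n:2 * n], subsets[2 * n:]
print(len(train), len(val), len(test))   # 1000 1000 1000
```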
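
The training details quoted from Appendix E.2 map onto a standard PyTorch loop. The sketch below assumes PyTorch; the model, data, and loss are placeholders, and only the Adam settings, batch size 64, 200 epochs, 5 warm-up epochs, cosine schedule, and clipping by value 1.0 come from the paper (the linear warm-up shape is an assumption).

```python
import math
import torch
from torch.utils.data import DataLoader, TensorDataset

# Placeholder model and data; only the hyperparameters below mirror Appendix E.2.
model = torch.nn.Linear(128, 1)
data = TensorDataset(torch.randn(640, 128), torch.randn(640, 1))
loader = DataLoader(data, batch_size=64, shuffle=True)        # batch size 64

optimizer = torch.optim.Adam(model.parameters(), lr=1e-3,     # initial LR 0.001
                             weight_decay=1e-4)               # weight decay 0.0001

epochs, warmup = 200, 5
scheduler = torch.optim.lr_scheduler.LambdaLR(
    optimizer,
    # assumed linear warm-up for 5 epochs, then cosine annealing for the rest
    lr_lambda=lambda e: (e + 1) / warmup if e < warmup
    else 0.5 * (1.0 + math.cos(math.pi * (e - warmup) / (epochs - warmup))),
)

loss_fn = torch.nn.MSELoss()                                   # placeholder objective
for epoch in range(epochs):
    for x, y in loader:
        loss = loss_fn(model(x), y)
        optimizer.zero_grad()
        loss.backward()
        torch.nn.utils.clip_grad_value_(model.parameters(), 1.0)  # clip by value 1.0
        optimizer.step()
    scheduler.step()
```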