Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].
Learning Neural Set Functions Under the Optimal Subset Oracle
Authors: Zijing Ou, Tingyang Xu, Qinliang Su, Yingzhen Li, Peilin Zhao, Yatao Bian
NeurIPS 2022 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Empirical studies on three real-world applications (including Amazon product recommendation, set anomaly detection and compound selection for virtual screening) demonstrate that EquiVSet outperforms the baselines by a large margin. |
| Researcher Affiliation | Collaboration | 1Tencent AI Lab, China 2Imperial College London, United Kingdom 3Sun Yat-sen University, China |
| Pseudocode | Yes | Algorithm 1 MFVI(ψ, V, K), Algorithm 2 DiffMF(V, S), Algorithm 3 EquiVSet(V, S) |
| Open Source Code | Yes | Code is available at: https://github.com/SubsetSelection/EquiVSet. |
| Open Datasets | Yes | In this experiment, we use the Amazon baby registry dataset (Gillenwater et al., 2014)...; Double MNIST: The dataset consists of 1000 images for each digit ranging from 00 to 99. (Sun, 2019). URL https://github.com/shaohua0116/MultiDigitMNIST.; CelebA: The CelebA dataset contains 202,599 images with 40 attributes. (Liu et al., 2015b).; PDBBind (Liu et al., 2015a): This dataset consists of experimentally measured binding affinities for bio-molecular complexes.; BindingDB: It is a public database of measured binding affinities... We take the curated one from https://tdcommons.ai/multi_pred_tasks/dti/ |
| Dataset Splits | Yes | Then we split the remaining subset collection S into training, validation and test folds with a 1:1:1 ratio. We collect 1,000 samples for training, validation, and test, respectively. For each dataset, we randomly split the training, validation, and test set to the size of 10,000, 1,000, and 1,000, respectively. We finally obtain the training, validation, and test set with the size of 1,000, 100, and 100, respectively, for both datasets. |
| Hardware Specification | No | The paper does not provide specific details about the hardware used for running experiments (e.g., GPU/CPU models, memory specifications). |
| Software Dependencies | No | The paper mentions using 'BERT model (Devlin et al., 2018)' but does not list specific software dependencies with version numbers (e.g., Python, PyTorch, TensorFlow versions, or other libraries). |
| Experiment Setup | Yes | The model architectures and training details are deferred to Appendix E. (Main text) Appendix E.2: Training Details. We use Adam (Kingma & Ba, 2014) as the optimizer with initial learning rate 0.001, weight decay 0.0001, gradient clipping by value 1.0, and batch size 64. We train the model for 200 epochs, and apply a cosine annealing learning rate scheduler with 5 warm-up epochs. (Appendix E.2) |
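The training hyperparameters quoted above (Adam with learning rate 0.001 and weight decay 0.0001, gradient clipping by value 1.0, batch size 64, 200 epochs, cosine annealing with 5 warm-up epochs) can be sketched as a plain schedule function. This is a minimal illustration of the reported configuration, not the authors' code: the exact warm-up shape (linear is assumed here) and the annealing floor are not specified in the quoted text.

```python
import math

# Hyperparameters as quoted from Appendix E.2 of the paper.
EPOCHS = 200        # total training epochs
WARMUP = 5          # warm-up epochs before cosine annealing begins
BASE_LR = 1e-3      # initial learning rate for Adam
WEIGHT_DECAY = 1e-4 # Adam weight decay
CLIP_VALUE = 1.0    # gradient clipping by value
BATCH_SIZE = 64

def learning_rate(epoch):
    """Per-epoch learning rate: linear warm-up, then cosine annealing to 0.

    The linear warm-up and zero floor are assumptions; the paper only
    states 'cosine annealing learning rate scheduler with 5 warm-up epochs'.
    """
    if epoch < WARMUP:
        return BASE_LR * (epoch + 1) / WARMUP
    progress = (epoch - WARMUP) / (EPOCHS - WARMUP)
    return BASE_LR * 0.5 * (1.0 + math.cos(math.pi * progress))
```

In a PyTorch training loop, this would correspond to `Adam(params, lr=1e-3, weight_decay=1e-4)`, `torch.nn.utils.clip_grad_value_(params, 1.0)` after `backward()`, and a `LambdaLR` scheduler wrapping a factor function like the one above.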