Most Influential Subset Selection: Challenges, Promises, and Beyond
Authors: Yuzheng Hu, Pingbang Hu, Han Zhao, Jiaqi Ma
NeurIPS 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We conduct experiments on both synthetic and real-world datasets. The experimental results not only corroborate the theoretical findings but also extend to more complex settings including classification tasks and non-linear models, showcasing the consistent benefits of adaptivity. |
| Researcher Affiliation | Academia | Yuzheng Hu¹, Pingbang Hu², Han Zhao¹, Jiaqi W. Ma² — ¹Department of Computer Science, ²School of Information Sciences, University of Illinois Urbana-Champaign; {yh46,pbb,hanzhao,jiaqima}@illinois.edu |
| Pseudocode | No | The paper describes algorithms such as ZAMinfluence and the adaptive greedy algorithm in prose, but it does not present them in a structured pseudocode block or algorithm box (a hedged sketch of the adaptive greedy idea appears after this table). |
| Open Source Code | Yes | Our code is publicly available at https://github.com/sleepymalc/MISS. |
| Open Datasets | Yes | For regression, we choose a popular UCI dataset Concrete Compressive Strength [Yeh, 2007]. For classification, we experiment with a moderate-scale UCI tabular dataset Waveform Database Generator [Breiman and Stone, 1988] and an image dataset MNIST [LeCun et al., 1998]. |
| Dataset Splits | No | For the first two UCI datasets, we randomly sample 50 data points as the test set and use the remaining for training. For MNIST, to control the scale of the experiments, we sample 5000 data points from the train split for training and 50 data points from the test split for testing. (A minimal sampling sketch follows this table.) |
| Hardware Specification | Yes | We conduct our experiments on Intel(R) Xeon(R) Gold 6338 CPU @ 2.00GHz with Nvidia A40 GPU. |
| Software Dependencies | No | The paper mentions activation functions (ReLU), optimizers (SGD), and an approximation algorithm (EK-FAC), but does not provide specific software dependencies with version numbers (e.g., Python, PyTorch, TensorFlow versions or specific library versions). |
| Experiment Setup | Yes | We train the model using Stochastic Gradient Descent (SGD) [Ruder, 2016] till convergence, with a learning rate of 0.01 and momentum of 0.9. Empirically, we observe that after 30 epochs the model converges, hence for simplicity, we set the default epochs to be 30. Hyper-parameter selection: the reported hyper-parameters above were selected via grid search. We swept across hidden unit number (denoted as width) {64, 128}, learning rate (denoted as lr) {0.01, 0.05, 0.1, 0.5}, momentum (denoted as β) {0.9, 0.95}, and training epochs (denoted as epochs) {30, 50}. (A training-loop sketch with these settings follows this table.) |
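
Since the paper provides no pseudocode, the following is a minimal sketch of the adaptive greedy selection idea for linear regression, contrasted with a non-adaptive top-k baseline in the spirit of ZAMinfluence. The function names, the closed-form ridge refit, and the sign convention are illustrative assumptions, not the authors' implementation in the MISS repository.

```python
import numpy as np

def fit_ridge(X, y, lam=1e-6):
    """Closed-form ridge fit; the small lam keeps the Hessian invertible (assumption)."""
    d = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ y)

def influence_scores(X, y, x_test, lam=1e-6):
    """First-order influence of removing each training point on the prediction at x_test,
    for linear (ridge) regression: grad_i^T H^{-1} x_test, up to a positive scaling."""
    theta = fit_ridge(X, y, lam)
    H_inv = np.linalg.inv(X.T @ X + lam * np.eye(X.shape[1]))
    residuals = X @ theta - y                 # per-point residuals
    per_point_grads = X * residuals[:, None]  # gradient of each squared-loss term w.r.t. theta
    return per_point_grads @ (H_inv @ x_test)

def static_topk(X, y, x_test, k):
    """Non-adaptive baseline (ZAMinfluence-style): score once, keep the top-k."""
    scores = influence_scores(X, y, x_test)
    return list(np.argsort(scores)[::-1][:k])

def adaptive_greedy(X, y, x_test, k):
    """Adaptive greedy: pick the most influential remaining point, drop it,
    refit, and recompute influences before the next pick."""
    remaining = list(range(len(y)))
    selected = []
    for _ in range(k):
        scores = influence_scores(X[remaining], y[remaining], x_test)
        best = int(np.argmax(scores))
        selected.append(remaining.pop(best))
    return selected

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = rng.normal(size=(200, 5))
    y = X @ rng.normal(size=5) + 0.1 * rng.normal(size=200)
    x_test = rng.normal(size=5)
    print("static :", static_topk(X, y, x_test, k=5))
    print("adaptive:", adaptive_greedy(X, y, x_test, k=5))
```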
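The split description quoted under "Dataset Splits" can be read as the sampling procedure below. The random seed, pool sizes, and function names are assumptions; none are reported in the quoted text.

```python
import numpy as np

def uci_random_split(X, y, n_test=50, seed=0):
    """UCI datasets (Concrete, Waveform): 50 random test points, the rest for training."""
    rng = np.random.default_rng(seed)  # seed is an assumption
    idx = rng.permutation(len(y))
    test_idx, train_idx = idx[:n_test], idx[n_test:]
    return X[train_idx], y[train_idx], X[test_idx], y[test_idx]

def mnist_subsample(n_train_pool=60_000, n_test_pool=10_000, n_train=5000, n_test=50, seed=0):
    """MNIST: 5000 points sampled from the train split and 50 from the test split."""
    rng = np.random.default_rng(seed)
    train_idx = rng.choice(n_train_pool, size=n_train, replace=False)
    test_idx = rng.choice(n_test_pool, size=n_test, replace=False)
    return train_idx, test_idx
```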
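The defaults and grid quoted under "Experiment Setup" translate to roughly the following PyTorch loop. The one-hidden-layer MLP and cross-entropy loss are assumptions inferred from the width and ReLU mentions; only the SGD settings and the swept values are taken from the quoted text.

```python
import itertools
import torch
import torch.nn as nn

def make_mlp(in_dim, width, out_dim):
    """One-hidden-layer ReLU MLP; the exact architecture is an assumption,
    only width (64 or 128) and ReLU are mentioned in the quoted text."""
    return nn.Sequential(nn.Linear(in_dim, width), nn.ReLU(), nn.Linear(width, out_dim))

def train(model, loader, lr=0.01, momentum=0.9, epochs=30):
    """SGD with the reported defaults: lr=0.01, momentum (beta)=0.9, 30 epochs."""
    opt = torch.optim.SGD(model.parameters(), lr=lr, momentum=momentum)
    loss_fn = nn.CrossEntropyLoss()  # classification loss is an assumption
    for _ in range(epochs):
        for xb, yb in loader:
            opt.zero_grad()
            loss_fn(model(xb), yb).backward()
            opt.step()
    return model

# Grid reported for hyper-parameter selection.
grid = {"width": [64, 128], "lr": [0.01, 0.05, 0.1, 0.5],
        "momentum": [0.9, 0.95], "epochs": [30, 50]}
configs = [dict(zip(grid, vals)) for vals in itertools.product(*grid.values())]
```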