Most Influential Subset Selection: Challenges, Promises, and Beyond

Authors: Yuzheng Hu, Pingbang Hu, Han Zhao, Jiaqi Ma

NeurIPS 2024 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We conduct experiments on both synthetic and real-world datasets. The experimental results not only corroborate the theoretical findings but also extend to more complex settings including classification tasks and non-linear models, showcasing the consistent benefits of adaptivity.
Researcher Affiliation | Academia | Yuzheng Hu (1), Pingbang Hu (2), Han Zhao (1), Jiaqi W. Ma (2); (1) Department of Computer Science, (2) School of Information Sciences, University of Illinois Urbana-Champaign; {yh46,pbb,hanzhao,jiaqima}@illinois.edu
Pseudocode | No | The paper describes algorithms such as ZAMinfluence and the adaptive greedy algorithm in text, but it does not present them in a structured pseudocode block or algorithm box (an illustrative sketch of the adaptive greedy loop is given after this table).
Open Source Code | Yes | Our code is publicly available at https://github.com/sleepymalc/MISS.
Open Datasets | Yes | For regression, we choose a popular UCI dataset Concrete Compressive Strength [Yeh, 2007]. For classification, we experiment with a moderate-scale UCI tabular dataset Waveform Database Generator [Breiman and Stone, 1988] and an image dataset MNIST [LeCun et al., 1998].
Dataset Splits | No | For the first two UCI datasets, we randomly sample 50 data points as the test set and use the remaining for training. For MNIST, to control the scale of the experiments, we sample 5000 data points from the train split for training and 50 data points from the test split for testing. (A split sketch follows this table.)
Hardware Specification | Yes | We conduct our experiments on Intel(R) Xeon(R) Gold 6338 CPU @ 2.00GHz with Nvidia A40 GPU.
Software Dependencies | No | The paper mentions activation functions (ReLU), optimizers (SGD), and an approximation algorithm (EK-FAC), but it does not provide specific software dependencies with version numbers (e.g., Python, PyTorch, or TensorFlow versions, or versions of other libraries).
Experiment Setup | Yes | We train the model using Stochastic Gradient Descent (SGD) [Ruder, 2016] till convergence, with a learning rate of 0.01 and momentum of 0.9. Empirically, we observe that after 30 epochs the model converges, hence for simplicity, we set the default epochs to be 30. Hyper-parameter selection: the reported hyper-parameters were selected via grid search, sweeping across hidden unit number (width) {64, 128}, learning rate (lr) {0.01, 0.05, 0.1, 0.5}, momentum (β) {0.9, 0.95}, and training epochs (epochs) {30, 50}. (A configuration sketch follows this table.)
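
The Pseudocode row notes that the adaptive greedy algorithm is described only in prose. For illustration, the sketch below shows how such an adaptive greedy loop is typically structured: refit the model and recompute influence scores after every removal, which is the adaptivity the paper's abstract refers to. The helpers `fit_model` and `influence_scores` are hypothetical placeholders, not the authors' released API; their actual implementation is in the linked repository.

```python
# Illustrative sketch of an adaptive greedy loop for Most Influential Subset
# Selection (MISS). `fit_model` and `influence_scores` are hypothetical
# placeholders, not the authors' API; training data are assumed NumPy arrays.

def adaptive_greedy_miss(X_train, y_train, X_test, y_test, k,
                         fit_model, influence_scores):
    remaining = list(range(len(X_train)))    # indices still in the training set
    selected = []                            # indices removed so far (the MISS)
    for _ in range(k):
        model = fit_model(X_train[remaining], y_train[remaining])
        scores = influence_scores(model,
                                  X_train[remaining], y_train[remaining],
                                  X_test, y_test)
        # Drop the currently most influential point, then refit on the rest.
        best = max(range(len(remaining)), key=lambda i: scores[i])
        selected.append(remaining.pop(best))
    return selected
```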
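The split protocol quoted in the Dataset Splits row can be written out in a few lines. The sketch below is a minimal reconstruction assuming NumPy arrays; the random seed is arbitrary, since the paper does not report one.

```python
import numpy as np

rng = np.random.default_rng(0)  # arbitrary seed; the paper does not report one

# UCI datasets (Concrete, Waveform): 50 random test points, the rest for training.
def split_uci(X, y, n_test=50):
    idx = rng.permutation(len(X))
    test, train = idx[:n_test], idx[n_test:]
    return X[train], y[train], X[test], y[test]

# MNIST: subsample 5000 training and 50 test points from the official splits.
def subsample_mnist(X_tr, y_tr, X_te, y_te, n_train=5000, n_test=50):
    tr = rng.choice(len(X_tr), size=n_train, replace=False)
    te = rng.choice(len(X_te), size=n_test, replace=False)
    return X_tr[tr], y_tr[tr], X_te[te], y_te[te]
```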
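The defaults and sweep reported in the Experiment Setup row map onto a small configuration sketch. Only the quoted values (learning rate 0.01, momentum 0.9, 30 epochs, and the four sweep grids) come from the paper; the two-layer ReLU classifier and full-batch updates below are simplifying assumptions for illustration.

```python
import itertools
import torch
import torch.nn as nn

# Reported defaults: lr=0.01, momentum=0.9, 30 epochs. The two-layer ReLU
# classifier and full-batch updates are assumptions; X is a float feature
# tensor and y a long tensor of class labels.
def train(X, y, width=128, lr=0.01, momentum=0.9, epochs=30):
    model = nn.Sequential(nn.Linear(X.shape[1], width), nn.ReLU(),
                          nn.Linear(width, int(y.max()) + 1))
    opt = torch.optim.SGD(model.parameters(), lr=lr, momentum=momentum)
    loss_fn = nn.CrossEntropyLoss()
    for _ in range(epochs):
        opt.zero_grad()
        loss_fn(model(X), y).backward()
        opt.step()
    return model

# Grid reported for the hyper-parameter sweep.
grid = itertools.product([64, 128],               # hidden width
                         [0.01, 0.05, 0.1, 0.5],  # learning rate
                         [0.9, 0.95],             # momentum
                         [30, 50])                # epochs
# for width, lr, beta, n_epochs in grid:
#     model = train(X, y, width=width, lr=lr, momentum=beta, epochs=n_epochs)
```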