Set Based Stochastic Subsampling
Authors: Bruno Andreis, Seanie Lee, A. Tuan Nguyen, Juho Lee, Eunho Yang, Sung Ju Hwang
ICML 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We validate SSS on multiple datasets and tasks such as 1D function regression, 2D image reconstruction and classification for both feature and instance selection. The experimental results show that SSS is able to subsample with minimal degradation on the target task performance under extremely low subsampling rates, largely outperforming the relevant baselines. |
| Researcher Affiliation | Collaboration | (1) Graduate School of AI, Korea Advanced Institute of Science and Technology (KAIST), Seoul, South Korea; (2) University of Oxford, Oxford, United Kingdom; (3) AITRICS, Seoul, South Korea. |
| Pseudocode | Yes | Algorithm 1 Greedy Training Algorithm... Algorithm 2 Fixed Size Subsampling. |
| Open Source Code | No | The paper refers to an 'open-source implementation' for a baseline (FPS) but does not state that the code for their own proposed methodology (SSS) is open-source or publicly available. |
| Open Datasets | Yes | We validate SSS on multiple datasets and tasks such as 1D function regression, 2D image reconstruction and classification for both feature and instance selection. ... MNIST Given an MNIST image ... CelebA The CelebA dataset ... miniImageNet dataset (Vinyals et al., 2016) |
| Dataset Splits | No | The paper mentions testing on the 'full MNIST test set' but does not provide explicit details on training/validation/test splits for all datasets used, such as specific percentages, sample counts, or explicit cross-validation setups. |
| Hardware Specification | No | The paper mentions 'GPU memory' in Appendix J but does not provide specific details on the hardware used for running the experiments, such as exact GPU/CPU models or processor types. |
| Software Dependencies | No | The paper does not provide specific software dependencies with version numbers (e.g., Python, PyTorch, CUDA versions) that would be needed to replicate the experiments. |
| Experiment Setup | Yes | Constraining the size of Dc: For computational efficiency, we want to restrict the size of Dc to save computational cost when constructing Ds. Hence we introduce a sparse Bernoulli prior p(Z) = ∏_{i=1}^{n} Ber(z_i; r) with small r > 0 and minimize the KL divergence along with a target downstream task loss ℓ(·, Ds) w.r.t. θ as follows: E_{p(D)} E_{p_θ(Ds|D)}[ℓ(·, Ds)] + β KL[p_θ(Z|D) ‖ p(Z)], where p_θ(Z|D) = ∏_{i=1}^{n} p_θ(z_i | d_i, D) and β > 0 is a hyperparameter used to control the sparsity level in Z. ... MLP: We use an architecture with 5 layers with outputs 784, 256, 128, 128 and 10 respectively. With the exception of the last layer, all layers are followed by a LeakyReLU activation function. The 3rd linear layer is also followed by a dropout layer with p = 0.2. |
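Because both p_θ(Z|D) and the prior p(Z) in the quoted objective factorize into independent Bernoullis, the KL term reduces to a sum of per-element Bernoulli KL divergences. The sketch below illustrates that decomposition; the function names and example posterior probabilities are hypothetical illustrations, not taken from the paper:

```python
import math

def bernoulli_kl(q: float, r: float) -> float:
    """KL[Ber(q) || Ber(r)] for a single inclusion variable z_i.

    Assumes 0 < q < 1 and 0 < r < 1 so both log terms are finite.
    """
    return q * math.log(q / r) + (1.0 - q) * math.log((1.0 - q) / (1.0 - r))

def sparsity_penalty(posteriors, r=0.05, beta=1.0):
    """beta * KL[p_theta(Z|D) || p(Z)].

    The KL of a product of independent Bernoullis against another such
    product is the sum of the element-wise KLs, so the penalty is a
    simple sum over the per-element inclusion probabilities.
    """
    return beta * sum(bernoulli_kl(q, r) for q in posteriors)

# Posteriors far above the sparse prior r incur a large penalty,
# pushing the model toward selecting only a few elements.
loose = sparsity_penalty([0.9, 0.8, 0.7], r=0.05)  # keeps most elements
tight = sparsity_penalty([0.06, 0.05, 0.04], r=0.05)  # near the prior
assert loose > tight
```

Raising β strengthens this penalty and drives the expected subset size toward n·r, which is how the quoted setup controls the size of Dc.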