Active Mini-Batch Sampling Using Repulsive Point Processes

Authors: Cheng Zhang, Cengiz Öztireli, Stephan Mandt, Giampiero Salvi (pp. 5741-5748)

AAAI 2019

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "We show empirically that our approach improves over standard SGD both in terms of convergence speed as well as final model performance."
Researcher Affiliation | Collaboration | 1. Microsoft Research, Cambridge, UK, Cheng.Zhang@microsoft.com 2. Disney Research, Zurich, Switzerland, cengiz.oztireli@disneyresearch.com 3. University of California, Irvine, Los Angeles, USA, stephan.mandt@gmail.com 4. KTH Royal Institute of Technology, Stockholm, Sweden, giampi@kth.se
Pseudocode | Yes | Algorithm 1: Dart throwing for Dense PDS
Open Source Code | No | The paper neither provides concrete access to source code for the methodology it describes nor explicitly states that its code is released.
Open Datasets | Yes | Oxford flower classification task as in (Zhang et al. 2017b), MNIST dataset (LeCun et al. 1998a), and the speech command classification task as described in (Sainath and Parada 2015).
Dataset Splits | Yes | "We use half of the training data and the full test data." (MNIST); "Figure 6 shows the accuracy on the validation set evaluated every 50 training iterations." (Speech Command Recognition)
Hardware Specification | No | The paper mentions CPU time measurements but does not provide specific hardware details (e.g., exact GPU/CPU models, processor speeds, or memory sizes) used to run its experiments.
Software Dependencies | Yes | A standard multi-layer convolutional neural network (CNN) from TensorFlow is used in this experiment (footnote 3 links to https://www.tensorflow.org/versions/r0.12/tutorials/mnist/pros/).
Experiment Setup | Yes | "We sample one mini-batch with batch size 30 using different sampling methods. For each method, we train a neural network classifier with one hidden layer of five units, using a single mini-batch." (Synthetic Data); "A standard multi-layer convolutional neural network (CNN) from Tensorflow is used in this experiment with standard experimental settings (details in appendix)." (MNIST)
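To make the pseudocode finding above concrete: the paper's Algorithm 1 is a dart-throwing procedure for dense Poisson disk sampling (PDS), which selects mini-batch points that are mutually repulsive in feature space. The sketch below is a minimal, hypothetical reconstruction of naive dart throwing for mini-batch selection, not the authors' implementation; the function name, the Euclidean distance metric, and the `max_trials` cutoff are all assumptions for illustration.

```python
import numpy as np

def dart_throwing_minibatch(features, batch_size, radius, seed=0, max_trials=10000):
    """Naive dart throwing for Poisson-disk-style mini-batch selection.

    Repeatedly draws a random candidate index and accepts it only if its
    feature vector lies at least `radius` (Euclidean distance) away from
    every point accepted so far. Stops once `batch_size` indices are
    accepted or `max_trials` candidates have been drawn.
    """
    rng = np.random.default_rng(seed)
    n = len(features)
    accepted = []
    for _ in range(max_trials):
        if len(accepted) == batch_size:
            break
        idx = int(rng.integers(n))
        candidate = features[idx]
        # Reject the dart if it falls inside any existing disk of radius `radius`.
        if all(np.linalg.norm(candidate - features[j]) >= radius for j in accepted):
            accepted.append(idx)
    return accepted
```

Rejected darts (including re-draws of already accepted indices, whose self-distance is zero) are simply discarded, so the accepted set forms a valid Poisson disk sample of the data; with `radius = 0` the procedure degenerates to uniform sampling without replacement in expectation, matching standard SGD.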