Scalable Subset Sampling with Neural Conditional Poisson Networks

Authors: Adeel Pervez, Phillip Lippe, Efstratios Gavves

ICLR 2023

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We validate our approach extensively, on image and text model explanation, image subsampling and stochastic k-nearest neighbor tasks outperforming existing methods in accuracy, efficiency and scalability.
Researcher Affiliation | Academia | Adeel Pervez, QUVA Lab, Informatics Institute, University of Amsterdam, a.a.pervez@uva.nl
Pseudocode | Yes | The full algorithm is described in Algorithm 3 and pseudocode is given in the appendix in Algorithm 4.
Open Source Code | No | The paper does not provide concrete access to source code for the methodology described, such as a specific repository link or an explicit code release statement.
Open Datasets | Yes | For the text classification experiments we use the Large Movie Review Dataset (Maas et al., 2011). We also use the 20Newsgroups dataset (Rennie & Lang, 2008). CIFAR-10 (Krizhevsky, 2009) and STL-10 (Coates et al., 2011). CelebA-HQ (Lee et al., 2020) dataset.
Dataset Splits | Yes | The models to be explained in both cases are convolutional neural networks, which achieve 90% and 70% test set accuracy on IMDB and 20Newsgroups, respectively. For CIFAR-10 we explain a simple CNN model with 8 convolutional layers that achieves 80% val. accuracy, and for STL-10 a ResNet-10 model that achieves 75% val. accuracy.
Hardware Specification | No | The paper does not provide specific hardware details (e.g., exact GPU/CPU models, memory amounts, or detailed computer specifications) used for running its experiments.
Software Dependencies | No | The paper does not provide specific ancillary software details, such as library or solver names with version numbers, needed to replicate the experiment.
Experiment Setup | Yes | In our experiments, we choose t between 5 and 8. For this we add a squared loss term in the loss expression as γ(µ_k − k̂)², where µ_k is the mini-batch average k computed by the network, and γ is the regularization strength chosen from {0.1, 0.01, 0.001}. We use a small 6-layer CNN with max-pooling layers for downsampling, a final global average pooling layer for the output, and train the model for 80 epochs.
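
The quoted setup adds a squared penalty γ(µ_k − k̂)² that pushes the mini-batch average expected subset size µ_k toward a target size k̂, with strength γ chosen from {0.1, 0.01, 0.001}. Below is a minimal sketch of how such a term could be attached to a training loss; it is not the authors' code, and the function and argument names, the sigmoid parameterization of inclusion probabilities, and the use of PyTorch are assumptions for illustration.

import torch

def regularized_loss(task_loss: torch.Tensor,
                     inclusion_logits: torch.Tensor,
                     k_hat: float,
                     gamma: float = 0.01) -> torch.Tensor:
    """Add the subset-size penalty gamma * (mu_k - k_hat)^2 to a task loss.

    inclusion_logits: (batch, n) unnormalized scores over the n candidate elements
    (hypothetical parameterization; the paper's network may differ).
    """
    probs = torch.sigmoid(inclusion_logits)           # per-element inclusion probabilities
    mu_k = probs.sum(dim=-1).mean()                   # mini-batch average expected subset size
    return task_loss + gamma * (mu_k - k_hat) ** 2    # squared penalty toward the target size k_hat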