Deep probabilistic subsampling for task-adaptive compressed sensing

Authors: Iris A.M. Huijben, Bastiaan S. Veeling, Ruud J.G. van Sloun

ICLR 2020

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We demonstrate strong performance on reconstruction and classification tasks of a toy dataset, MNIST, and CIFAR10 under stringent subsampling rates in both the pixel and the spatial frequency domain. We test the applicability of the proposed task-adaptive DPS framework for three datasets and two distinct tasks: image classification and image reconstruction. The results presented in fig. 2a show that image-domain sampling using DPS significantly outperforms the fixed sampling baselines (uniform and disk).
Researcher Affiliation | Academia | Iris A.M. Huijben, Department of Electrical Engineering, Eindhoven University of Technology, Eindhoven, The Netherlands, i.a.m.huijben@tue.nl; Bastiaan S. Veeling, Department of Computer Science, University of Amsterdam, Amsterdam, The Netherlands, basveeling@gmail.com; Ruud J.G. van Sloun, Department of Electrical Engineering, Eindhoven University of Technology, Eindhoven, The Netherlands, r.j.g.v.sloun@tue.nl
Pseudocode | Yes | Algorithm 1 Deep Probabilistic Subsampling (DPS). Require: training dataset D, number of iterations n_iter, temperature parameter τ, initialized trainable parameters Φ and θ. Ensure: trained logits matrix Φ and task network parameters θ. (A hedged code sketch of the sampling step follows the table.)
Open Source Code | Yes | The code used for this paper is made publicly available at https://github.com/IamHuijben/Deep-Probabilistic-Subsampling.git
Open Datasets | Yes | Classification performance was tested on the MNIST database (LeCun et al., 1998), comprising 70,000 28×28 grayscale images of handwritten digits 0 to 9. The CIFAR10 database (Krizhevsky et al., 2009) contains 60,000 images of 32×32 pixels in 10 different classes.
Dataset Splits | Yes | We split the dataset into 50,000 training images, 5,000 validation, and 5,000 test images. We converted all images to grayscale, and subsequently split them into 50,000 training images, 5,000 validation and 5,000 test images. (A hedged split sketch follows the table.)
Hardware Specification | No | The paper mentions training models and using optimizers, but does not specify any particular hardware components such as GPU models, CPU types, or cloud computing instance specifications.
Software Dependencies | No | The paper mentions using the 'ADAM solver' and provides its hyperparameters but does not list specific versions for any software libraries or frameworks (e.g., Python, PyTorch, TensorFlow, etc.).
Experiment Setup | Yes | Task model: After sampling M elements, all N zero-masked samples (or 2N in the case of complex Fourier samples) are passed through a series of 5 fully-connected layers, having N, 256, 128, 128 and 10 output nodes, respectively. The activations for all but the last layer were leaky ReLUs, and 20% dropout was applied after the first three layers. Training details: We train the network to maximize the log-likelihood of the observations D = {(x_i, s_i) | i = 0, . . . , L} through minimization of the categorical cross-entropy between the predictions and the labels, denoted by L_s. Penalty multiplier µ was set to linearly increase by 1e-5 per epoch, starting from 0.0. The temperature parameter τ in eq. (7) was set to 2.0, and the sampling distribution parameters Φ were initialized randomly, following a zero-mean Gaussian distribution with standard deviation 0.25. Equation 9 was optimized using stochastic gradient descent on batches of 32 examples, approximating the expectation by a mean across the train dataset. To that end, we used the ADAM solver (β1 = 0.9, β2 = 0.999, and ϵ = 1e-7). We adopted different learning rates for the sampling parameters Φ and the parameters of the task model θ, being 2e-3 and 2e-4, respectively. For reconstruction tasks, learning rates for Φ and θ were 1e-3 and 1e-4, respectively, and µ and τ were respectively set to 2e-4 and 5.0. Mini-batches of 128 examples were used. For CIFAR10, λ was set to 0.004, learning rates for {Φ, ψ} and θ were 1e-3 and 2e-4, µ was set to 1e-6, and τ was kept constant at 2.0. Batches of 8 images were used. (Hedged code sketches of the task model and optimizer settings follow the table.)
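
The pseudocode row above describes DPS as training a logits matrix Φ jointly with task parameters θ at temperature τ. The following is a minimal, illustrative PyTorch sketch of the sampling step this implies, not the authors' released code: each row of Φ relaxes a one-hot selection over the N candidate sample positions with a Gumbel-softmax draw, and the M selections are merged into a zero-masking pattern. Function names, the straight-through estimator, and the absence of without-replacement masking are simplifying assumptions.

import torch
import torch.nn.functional as F

def sample_mask(phi: torch.Tensor, tau: float = 2.0, hard: bool = True) -> torch.Tensor:
    # One Gumbel-softmax draw per row of the logits matrix: each row relaxes a
    # one-hot choice over the N candidate sample positions.
    rows = F.gumbel_softmax(phi, tau=tau, hard=hard, dim=-1)  # shape (M, N)
    # Union of the per-row selections gives the length-N subsampling mask.
    # Simplification: rows may select the same index; the paper's Algorithm 1
    # samples without replacement.
    return rows.sum(dim=0).clamp(max=1.0)

# Toy usage: select M = 8 out of N = 784 pixel positions of flattened images.
phi = torch.nn.Parameter(0.25 * torch.randn(8, 784))  # logits init as quoted above
x = torch.randn(32, 784)                              # a batch of flattened images
x_sub = x * sample_mask(phi, tau=2.0)                 # zero-masked input for the task model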
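
For the split quoted in the Dataset Splits row (50,000 train / 5,000 validation / 5,000 test, with grayscale conversion for CIFAR10), a hedged torchvision sketch could look as follows. The library choice and the seeded random split of the held-out images are assumptions; the paper does not state how the validation/test division was drawn.

import torch
from torchvision import datasets, transforms

to_gray = transforms.Compose([transforms.Grayscale(), transforms.ToTensor()])

# CIFAR10 ships as 50,000 training and 10,000 held-out images.
train_set = datasets.CIFAR10("./data", train=True, download=True, transform=to_gray)
held_out = datasets.CIFAR10("./data", train=False, download=True, transform=to_gray)
val_set, test_set = torch.utils.data.random_split(
    held_out, [5000, 5000], generator=torch.Generator().manual_seed(0)
)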
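
The experiment-setup row quotes the MNIST classification task model (five fully-connected layers with N, 256, 128, 128 and 10 output nodes, leaky ReLUs, 20% dropout after the first three layers) and ADAM with separate learning rates for Φ and θ. A minimal sketch under those numbers, again assuming PyTorch rather than the authors' framework, and with M = 8 as a hypothetical sample budget:

import torch
import torch.nn as nn

N = 784  # number of candidate sample positions (28×28 MNIST pixels)

# Five fully-connected layers with N, 256, 128, 128 and 10 output nodes,
# leaky ReLUs on all but the last layer, 20% dropout after the first three.
task_model = nn.Sequential(
    nn.Linear(N, N), nn.LeakyReLU(), nn.Dropout(0.2),
    nn.Linear(N, 256), nn.LeakyReLU(), nn.Dropout(0.2),
    nn.Linear(256, 128), nn.LeakyReLU(), nn.Dropout(0.2),
    nn.Linear(128, 128), nn.LeakyReLU(),
    nn.Linear(128, 10),  # class logits; categorical cross-entropy applied outside
)

# Separate learning rates for the sampling logits Φ (2e-3) and the task
# parameters θ (2e-4), with the quoted ADAM hyperparameters.
phi = torch.nn.Parameter(0.25 * torch.randn(8, N))
optimizer = torch.optim.Adam(
    [
        {"params": [phi], "lr": 2e-3},
        {"params": task_model.parameters(), "lr": 2e-4},
    ],
    betas=(0.9, 0.999),
    eps=1e-7,
)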