Learning Ordered Representations with Nested Dropout
Authors: Oren Rippel, Michael Gelbart, Ryan Adams
ICML 2014 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In this paper, we present results on ordered representations of data in which different dimensions have different degrees of importance. ... We then extend the algorithm to deep models and demonstrate the relevance of ordered representations to a number of applications. ... We also show that ordered representations are a promising way to learn adaptive compression for efficient online data reconstruction. ... Specifically, we use the ordered property of the learned codes to construct hash-based data structures that permit very fast retrieval, achieving retrieval in time logarithmic in the database size and independent of the dimensionality of the representation. (See the retrieval sketch below the table.) |
| Researcher Affiliation | Academia | Oren Rippel RIPPEL@MATH.MIT.EDU Department of Mathematics, MIT; School of Engineering and Applied Sciences, Harvard University. Michael A. Gelbart MGELBART@SEAS.HARVARD.EDU Program in Biophysics and School of Engineering and Applied Sciences, Harvard University. Ryan P. Adams RPA@SEAS.HARVARD.EDU School of Engineering and Applied Sciences, Harvard University |
| Pseudocode | No | The paper does not contain any sections or figures explicitly labeled as 'Pseudocode' or 'Algorithm', nor does it present any structured, code-like blocks outlining a procedure. |
| Open Source Code | No | The paper does not contain any statements indicating that source code for the described methodology is publicly available, nor does it provide a link to a code repository. |
| Open Datasets | Yes | We trained on 80MTI a binarized nested dropout autoencoder... The 80MTI are 79,302,017 color images of size 32x32. (Torralba et al., 2008) |
| Dataset Splits | No | The paper mentions training on 'minibatches of size 10,000' and training for '2 epochs', but it does not specify explicit training, validation, or test dataset splits (e.g., percentages or sample counts). |
| Hardware Specification | No | The paper mentions training 'on two GPUs', but it does not provide specific hardware details such as the model or type of GPUs, CPU specifications, or memory. |
| Software Dependencies | No | The paper mentions 'nonlinear conjugate gradients algorithm' and 'libjpeg library', but it does not specify any version numbers for these or any other software components used in the experiments. |
| Experiment Setup | Yes | We train for 2 epochs on minibatches of size 10,000. ... with probability 0.1 we independently corrupt input elements to 0. ... For all layers other than the representation layer, we apply standard dropout with probability 0.2. ... We chose p_B(·) ∼ Geom(0.97) and the binarization quantile β = 0.2. (See the configuration sketch below the table.) |
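
The Experiment Setup row quotes a geometric prior over the nested dropout truncation index and per-layer dropout rates. The sketch below shows, in plain NumPy, how such a mask could be sampled and applied to a code layer. The function name, the 64-unit code size, and the exact geometric parametrization (here success probability 1 − 0.97) are illustrative assumptions, not the authors' released code.

```python
import numpy as np

def nested_dropout_mask(num_units, rho=0.97, rng=None):
    """Sample a nested dropout mask over an ordered code layer.

    A truncation index b is drawn from a geometric distribution and only
    the first b units are kept. `rho = 0.97` echoes the Geom(0.97) prior
    quoted above; the authors' exact parametrization may differ.
    """
    rng = np.random.default_rng() if rng is None else rng
    # numpy's geometric counts trials until the first success with
    # probability p, so p = 1 - rho gives a tail that decays like rho**b.
    b = min(rng.geometric(1.0 - rho), num_units)
    mask = np.zeros(num_units)
    mask[:b] = 1.0  # units 1..b survive; all later units are dropped
    return mask

# Hypothetical usage: multiply the mask into a 64-unit code during training,
# so early units are trained under nearly every truncation and end up
# encoding the most important directions of the data.
code = np.random.randn(64)
truncated_code = code * nested_dropout_mask(64)
```

Dropping all units after a random index, rather than dropping units independently, is what induces the ordering: unit 1 is trained under every truncation, unit 2 under almost every truncation, and so on.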
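The abstract excerpts in the Research Type row also mention hash-based retrieval that is logarithmic in the database size and independent of the representation dimensionality, enabled by the ordered property of the codes. The following sketch illustrates the general idea with a prefix index over binarized ordered codes; the class name `PrefixIndex`, the zero threshold in `binarize`, and the `max_candidates` stopping rule are hypothetical stand-ins for the paper's quantile-based binarization (β = 0.2) and its retrieval data structure.

```python
from collections import defaultdict

import numpy as np


def binarize(code, threshold=0.0):
    """Threshold an ordered real-valued code into bits.

    The paper uses a quantile-based threshold (binarization quantile
    β = 0.2); the fixed zero threshold here is only a stand-in.
    """
    return tuple(int(c > threshold) for c in code)


class PrefixIndex:
    """Minimal sketch of prefix-based retrieval over ordered binary codes.

    Because nested dropout orders code units by importance, a short prefix
    of the binary code already acts as a coarse hash key. Extending the
    prefix one bit at a time shrinks the candidate set roughly
    geometrically, so the number of extensions grows with the logarithm of
    the database size rather than with the full code length.
    """

    def __init__(self, codes):
        self.codes = list(codes)
        # buckets[k][prefix] holds the ids of all codes sharing that
        # length-k prefix (a memory-heavy but simple indexing scheme).
        self.buckets = defaultdict(lambda: defaultdict(list))
        for idx, code in enumerate(self.codes):
            for k in range(1, len(code) + 1):
                self.buckets[k][code[:k]].append(idx)

    def query(self, code, max_candidates=16):
        """Return ids sharing the longest prefix with `code`, stopping once
        the candidate set is no larger than `max_candidates`."""
        candidates = list(range(len(self.codes)))
        for k in range(1, len(code) + 1):
            bucket = self.buckets[k].get(code[:k])
            if not bucket:
                break  # no item shares this longer prefix; keep the last set
            candidates = bucket
            if len(candidates) <= max_candidates:
                break
        return candidates


# Hypothetical usage on random codes standing in for learned representations.
rng = np.random.default_rng(0)
database = [binarize(c) for c in rng.standard_normal((1000, 32))]
index = PrefixIndex(database)
neighbours = index.query(database[0])
```

Because the leading code units carry the most information, short prefixes already partition the database well; longer prefixes are consulted only while a bucket remains too large, which is why query cost does not scale with the full code dimensionality.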