Learning to Linearize Under Uncertainty
Authors: Ross Goroshin, Michael F. Mathieu, Yann LeCun
NeurIPS 2015 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Section 4 presents experimental results on relatively simple datasets to illustrate the main ideas of our work. The following experiments evaluate the proposed feature learning architecture and loss. In the next set of experiments we trained deep feature hierarchies that have the capacity to linearize a richer class of transformations. |
| Researcher Affiliation | Collaboration | Ross Goroshin (1), Michael Mathieu (1), Yann LeCun (1,2); (1) Dept. of Computer Science, Courant Institute of Mathematical Sciences, New York, NY; (2) Facebook AI Research, New York, NY; {goroshin,mathieu,yann}@cs.nyu.edu |
| Pseudocode | Yes | Algorithm 1: Minibatch stochastic gradient descent training for prediction with uncertainty. The number of δ-gradient descent steps (k) is treated as a hyper-parameter. (See the illustrative sketch of this training loop after the table.) |
| Open Source Code | No | The paper does not provide any explicit statements or links indicating that the source code for the described methodology is publicly available. |
| Open Datasets | Yes | Trained on simulated 32x32 movie frames taken from YouTube videos [4]; also trained on simulated videos generated using the NORB dataset. |
| Dataset Splits | No | The paper discusses training and testing on datasets such as NORB and YouTube videos but does not provide specific percentages or absolute counts for training, validation, or test splits; it refers only generally to a 'test set' for evaluation. |
| Hardware Specification | Yes | We also gratefully acknowledge NVIDIA Corporation for the donation of a Tesla K40 GPU used for this research. |
| Software Dependencies | No | The paper does not provide specific version numbers for any software dependencies or libraries used in the research, only describing the architectural components. |
| Experiment Setup | Yes | The number of δ-gradient descent steps (k) is treated as a hyper-parameter. Table 1: Summary of architectures |
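The pseudocode and experiment-setup rows above refer to the paper's Algorithm 1, which alternates k gradient descent steps on a latent uncertainty variable δ with a minibatch parameter update. The following is a minimal sketch of that training pattern, assuming a PyTorch implementation; the `Encoder`/`Decoder` modules, the `predict_code` form (linear extrapolation in code space modulated by δ), and all hyper-parameter values are illustrative assumptions, not the paper's exact architecture.

```python
# Hypothetical sketch of the Algorithm 1 training pattern: k inner gradient
# steps infer a latent uncertainty variable delta, then the network parameters
# are updated with delta held fixed. Module shapes and the way delta enters
# the prediction are assumptions for illustration only.
import torch
import torch.nn as nn

class Encoder(nn.Module):
    def __init__(self, dim=1024, code=256):
        super().__init__()
        self.net = nn.Sequential(nn.Flatten(), nn.Linear(dim, code), nn.ReLU())
    def forward(self, x):          # x: (batch, 1, 32, 32)
        return self.net(x)         # -> (batch, code)

class Decoder(nn.Module):
    def __init__(self, dim=1024, code=256):
        super().__init__()
        self.net = nn.Linear(code, dim)
    def forward(self, z):          # z: (batch, code)
        return self.net(z)         # -> (batch, dim)

def predict_code(z1, z2, delta):
    # Linear extrapolation in code space, modulated by the latent delta
    # (assumed combination; the paper's exact form may differ).
    return (2 * z2 - z1) * (1 + delta)

def train_step(enc, dec, optimizer, x1, x2, x3, k=3, delta_lr=0.1):
    z1, z2 = enc(x1), enc(x2)
    target = x3.flatten(1)
    # Infer delta with k gradient descent steps; codes are detached so the
    # inner loop does not backpropagate into the encoder.
    delta = torch.zeros_like(z1, requires_grad=True)
    for _ in range(k):
        recon = dec(predict_code(z1.detach(), z2.detach(), delta))
        inner_loss = ((recon - target) ** 2).mean()
        (grad,) = torch.autograd.grad(inner_loss, delta)
        delta = (delta - delta_lr * grad).detach().requires_grad_(True)
    # Update encoder/decoder parameters with the inferred delta held fixed.
    recon = dec(predict_code(z1, z2, delta.detach()))
    loss = ((recon - target) ** 2).mean()
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()

# Usage sketch: optimizer over both modules, minibatches of frame triplets.
# enc, dec = Encoder(), Decoder()
# opt = torch.optim.SGD(list(enc.parameters()) + list(dec.parameters()), lr=1e-2)
# loss = train_step(enc, dec, opt, x1, x2, x3, k=3)
```

The design point reflected here is the one the caption of Algorithm 1 states: δ is inferred per minibatch by an inner loop whose length k is a hyper-parameter, and the network parameters are then updated with δ fixed.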