The Sparse Manifold Transform

Authors: Yubei Chen, Dylan Paiton, Bruno Olshausen

NeurIPS 2018

| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | "We provide a theoretical description of the transform and demonstrate properties of the learned representation on both synthetic data and natural videos. ... All experiments performed on natural scenes used the same dataset, described in Supplement D." |
| Researcher Affiliation | Academia | Yubei Chen (1,2), Dylan M. Paiton (1,3), Bruno A. Olshausen (1,3,4). (1) Redwood Center for Theoretical Neuroscience; (2) Department of Electrical Engineering and Computer Science; (3) Vision Science Graduate Group; (4) Helen Wills Neuroscience Institute & School of Optometry. University of California, Berkeley, Berkeley, CA 94720. yubeic@eecs.berkeley.edu |
| Pseudocode | No | The paper does not contain structured pseudocode or algorithm blocks. |
| Open Source Code | No | The paper does not provide an explicit statement about releasing source code for the described methodology, nor does it provide a link to a code repository. |
| Open Datasets | No | The paper states, "All experiments performed on natural scenes used the same dataset, described in Supplement D." However, Supplement D is not provided in the main text, and no specific name, citation, or link to a publicly available dataset is given within the paper itself. |
| Dataset Splits | No | The paper mentions applying the SMT to "sequences of whitened 20×20 pixel image patches extracted from natural videos" and training networks, but it does not specify explicit training, validation, or test dataset splits (e.g., percentages or sample counts). |
| Hardware Specification | No | The paper does not provide specific details about the hardware (e.g., GPU/CPU models, memory) used to run the experiments. |
| Software Dependencies | No | The paper does not provide specific ancillary software details, such as library names with version numbers, needed to replicate the experiment. |
| Experiment Setup | Yes | "We applied the SMT optimization procedure on sequences of whitened 20×20 pixel image patches extracted from natural videos. We first learned a 10× overcomplete spatial dictionary Φ ∈ ℝ^(400×4000) and coded each frame x_t as a 4000-dimensional sparse coefficient vector α_t. We then derived an embedding matrix P ∈ ℝ^(200×4000) by solving equation 8." |
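The Experiment Setup row describes a two-stage pipeline: sparse-code each whitened patch against an overcomplete dictionary Φ, then derive an embedding matrix P from the resulting code sequence (equation 8 of the paper, not reproduced here). The sketch below illustrates only the second stage, under the assumption that P minimizes the second-order temporal difference of the projected codes subject to a whitening constraint, solved as a generalized eigenproblem; the shapes, the regularizer, and the random stand-in codes are illustrative, scaled down from the paper's 400×4000 dictionary and 200×4000 embedding.

```python
import numpy as np

# Illustrative shapes (assumptions, far smaller than the paper's setup).
rng = np.random.default_rng(0)
n_dict, n_embed, T = 64, 8, 500

# Stand-in for the sparse codes alpha_t of T consecutive frames; in the
# paper these come from sparse coding each patch against the dictionary Phi.
A = rng.standard_normal((n_dict, T)) * (rng.random((n_dict, T)) < 0.1)

# Second-order temporal difference operator: row t applies weights (1, -2, 1)
# to frames t-1, t, t+1, so D @ A.T stacks alpha_{t-1} - 2*alpha_t + alpha_{t+1}.
D = np.zeros((T - 2, T))
for t in range(T - 2):
    D[t, t:t + 3] = [1.0, -2.0, 1.0]

# Assumed objective: minimize ||P A D^T||_F^2 subject to the whitening
# constraint P C P^T = I, where C is the (regularized) code covariance.
S = (A @ D.T) @ (D @ A.T)                 # temporal-curvature matrix
C = A @ A.T / T + 1e-6 * np.eye(n_dict)   # regularized code covariance

# Solve the generalized eigenproblem S v = w C v via Cholesky whitening of C;
# the rows of P are the generalized eigenvectors with smallest eigenvalues.
Lc = np.linalg.cholesky(C)
Linv = np.linalg.inv(Lc)
w, U = np.linalg.eigh(Linv @ S @ Linv.T)  # eigenvalues in ascending order
P = (Linv.T @ U[:, :n_embed]).T           # embedding matrix, (n_embed, n_dict)

print(P.shape)                                               # (8, 64)
print(np.allclose(P @ C @ P.T, np.eye(n_embed), atol=1e-6))  # True
```

Projecting a code vector then gives its low-dimensional embedding, `P @ A[:, t]`; the whitening constraint keeps the embedding from collapsing to the trivial all-zero solution.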