Task structure and nonlinearity jointly determine learned representational geometry
Authors: Matteo Alleman, Jack Lindsey, Stefano Fusi
ICLR 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In this study, we conduct an in-depth investigation of the impact of input geometry, label geometry, and nonlinearity on learned representations. We employ a parameterized family of classification tasks that allows us to probe the impact of each of these factors independently and focus on single-hidden-layer networks in which we can precisely describe representation learning dynamics over the course of training. |
| Researcher Affiliation | Academia | Matteo Alleman, Jack Lindsey & Stefano Fusi, Department of Neuroscience, Columbia University. ma3811@columbia.edu, jackwlindsey@gmail.com, sf2237@columbia.edu |
| Pseudocode | No | The paper does not contain structured pseudocode or algorithm blocks (clearly labeled algorithm sections or code-like formatted procedures). |
| Open Source Code | No | The paper does not provide concrete access to source code (specific repository link, explicit code release statement, or code in supplementary materials) for the methodology described in this paper. |
| Open Datasets | Yes | To assess the applicability of our findings to more realistic tasks, we trained convolutional networks on an image classification task, experimenting with two architectures (a small network with two convolutional and two fully connected layers, and the ResNet-18 architecture) and two datasets, CIFAR-10 and STL-10. |
| Dataset Splits | No | The paper does not provide specific dataset split information (exact percentages, sample counts, citations to predefined splits, or detailed splitting methodology) needed to reproduce the data partitioning. |
| Hardware Specification | No | The paper does not provide specific hardware details (exact GPU/CPU models, processor types with speeds, memory amounts, or detailed computer specifications) used for running its experiments. |
| Software Dependencies | No | The paper does not provide specific ancillary software details (e.g., library or solver names with version numbers like Python 3.8, CPLEX 12.4) needed to replicate the experiment. |
| Experiment Setup | No | The paper lacks specific experimental setup details such as concrete hyperparameter values (learning rate, batch size, number of epochs), optimizer settings, or other system-level training configurations in the main text. |
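Since the paper releases no code, a reader wanting to probe its core setup must reconstruct it. The sketch below is an illustrative reconstruction, not the authors' implementation: it trains a single-hidden-layer ReLU network on a toy parameterized classification task (four input clusters with XOR-style labels, standing in for the paper's controlled input/label geometries) and then reads out the hidden-layer Gram matrix, one common proxy for learned representational geometry. All hyperparameters (hidden width, learning rate, step count) are assumptions chosen for this demo.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy parameterized task: four inputs at the corners of a square (input geometry)
# with an XOR labeling (label geometry). Values are illustrative, not from the paper.
X = np.array([[1.0, 1.0], [1.0, -1.0], [-1.0, 1.0], [-1.0, -1.0]])
y = np.array([1.0, 0.0, 0.0, 1.0])

# Single hidden layer, as in the paper's analytically tractable setting.
n_hidden = 64            # assumed width
lr = 0.1                 # assumed learning rate
W1 = rng.normal(scale=0.5, size=(2, n_hidden))
b1 = np.zeros(n_hidden)
w2 = rng.normal(scale=0.5, size=n_hidden)
b2 = 0.0

def forward(X):
    h = np.maximum(0.0, X @ W1 + b1)   # ReLU hidden representation
    return h, h @ w2 + b2

for step in range(2000):
    h, logits = forward(X)
    p = 1.0 / (1.0 + np.exp(-logits))   # sigmoid output
    grad_logits = (p - y) / len(y)      # gradient of mean BCE w.r.t. logits
    # Backpropagate through the single hidden layer.
    grad_w2 = h.T @ grad_logits
    grad_b2 = grad_logits.sum()
    grad_h = np.outer(grad_logits, w2) * (h > 0)
    grad_W1 = X.T @ grad_h
    grad_b1 = grad_h.sum(axis=0)
    w2 -= lr * grad_w2; b2 -= lr * grad_b2
    W1 -= lr * grad_W1; b1 -= lr * grad_b1

h, logits = forward(X)
accuracy = float(((logits > 0) == (y > 0.5)).mean())
# Representational geometry proxy: the hidden-layer kernel across the 4 inputs.
K = h @ h.T
```

Inspecting how `K` evolves over training (e.g., snapshotting it every few hundred steps) is one way to observe the kind of representation-learning dynamics the paper characterizes analytically.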