Estimating Shape Distances on Neural Representations with Limited Samples
Authors: Dean A Pospisil, Brett W. Larsen, Sarah E Harvey, Alex H Williams
ICLR 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | 4 APPLICATIONS AND EXPERIMENTS. 4.1 VALIDATION ON SYNTHETIC DATA: We first validate our method-of-moments estimator (section 3.3) on simulated responses from a multivariate normal distribution. 4.2 APPLICATION TO BIOLOGICAL DATA: Here we investigate noisy non-Gaussian data where the covariance of the $\hat{W}_p$ and the denominator of the similarity score must be estimated from data. We do so by applying our estimator to neural data: calcium recordings from mouse primary visual cortex in response to a set of 2,800 natural images repeated twice (Stringer et al., 2019). 4.3 APPLICATION TO ARTIFICIAL NEURAL NETWORK REPRESENTATIONS: In Appendix E we apply the plug-in and moment-based estimator to penultimate layer representations between two ResNet-50 architectures (He et al., 2016) trained on ImageNet classification (Deng et al., 2009). |
| Researcher Affiliation | Collaboration | (1) Princeton University, Princeton, NJ, 08544, dp4846@princeton.edu; (2) New York University, Center for Neural Science, New York, NY, 10003; (3) Flatiron Institute, Center for Computational Neuroscience, New York, NY, 10010; (4) Flatiron Institute, Center for Computational Mathematics, New York, NY, 10010 |
| Pseudocode | No | The paper does not contain explicit pseudocode or algorithm blocks. |
| Open Source Code | Yes | The code for our project is available at https://github.com/dp4846/eigmom_shape_stats/. |
| Open Datasets | Yes | All analyses done in this paper were performed on the pre-processed data available on figshare (https://figshare.com/articles/Recordings_of_ten_thousand_neurons_in_visual_cortex_in_response_to_2_800_natural_images/6845348). |
| Dataset Splits | No | The paper describes using different sample sizes (M) for analysis and comparison, and setting 'ground truth' through specific data manipulations (e.g., 'To set similarity to 0 we measured similarity between different subpopulations...', 'To set the similarity to 1 we measured similarity between the same subpopulation...'). It mentions 'three independent folds of the stimulus set (M = 400 stimuli each)' for analysis in Figure 4C, and 'When we included all stimuli (M 2800)'. However, it does not specify conventional training, validation, and test dataset splits for model development or evaluation in the way machine learning papers typically do. |
| Hardware Specification | No | The paper does not explicitly describe the specific hardware (e.g., GPU models, CPU types, memory) used to run its experiments. |
| Software Dependencies | No | The paper mentions software components such as 'Suite2p toolbox', 'non-negative spike deconvolution (Friedrich et al., 2017)', 'ResNet-50 (He et al., 2016) architectures', and 'PyTorch'. However, it does not give version numbers for any of these dependencies, which are needed to reproduce the setup. |
| Experiment Setup | No | The paper focuses on the statistical properties of shape distance estimators and their application. While it describes how the estimator is applied to neural data and to pre-trained neural networks (ResNet-50), it does not provide experimental setup details such as hyperparameters, optimizer settings, or training schedules for these networks. For its own estimator, it mentions controlling 'bias less than 5%' or '10% bias', but this is a characteristic of the estimator, not a system-level training parameter. |
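
The table refers to a plug-in estimator, to sample sizes M, and to ground-truth similarities of 0 (different subpopulations) and 1 (the same subpopulation). The sketch below illustrates why sample size matters for such an estimator. It is not the authors' code: it assumes one common form of the Procrustes similarity score (nuclear norm of the cross-product matrix divided by the product of Frobenius norms), and the function name `plugin_similarity`, the dimension `N = 10`, and the Gaussian test data are all hypothetical choices made for illustration.

```python
import numpy as np


def plugin_similarity(X, Y):
    """Plug-in similarity score between two (M stimuli x N neurons) matrices.

    Assumed form (not taken from the paper's code): the nuclear norm of
    X^T Y divided by the product of the Frobenius norms of X and Y,
    computed after centering each column.
    """
    X = X - X.mean(axis=0, keepdims=True)
    Y = Y - Y.mean(axis=0, keepdims=True)
    nuclear = np.linalg.norm(X.T @ Y, ord="nuc")
    return nuclear / (np.linalg.norm(X) * np.linalg.norm(Y))


rng = np.random.default_rng(0)
N = 10  # hypothetical number of neurons per population

# Independent populations: the ground-truth similarity is 0, yet the
# plug-in score stays positive and shrinks only as M grows.
scores = {}
for M in (20, 400, 2800):
    X = rng.standard_normal((M, N))
    Y = rng.standard_normal((M, N))
    scores[M] = plugin_similarity(X, Y)

# Identical populations: the score equals 1 up to floating-point error.
same = plugin_similarity(X, X)

print(scores)
print(same)
```

In this toy setting the score for independent populations decreases as M increases, which is the kind of small-sample bias that a moment-based correction is meant to reduce.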