Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].
On the special role of class-selective neurons in early training
Authors: Omkar Ranadive, Nikhil Thakurdesai, Ari S. Morcos, Matthew L. Leavitt, Stéphane Deny
TMLR 2023 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We attempt to answer this question in a series of experiments on ResNet-50s trained on ImageNet. We first show that class-selective neurons emerge during the first few epochs of training, before receding rapidly but not completely; this suggests that class-selective neurons found in trained networks are in fact vestigial remains of early training. With single-neuron ablation experiments, we then show that class-selective neurons are important for network function in this early phase of training. We also observe that the network is close to a linear regime in this early phase; we thus speculate that class-selective neurons appear early in training as quasi-linear shortcut solutions to the classification task. Finally, in causal experiments where we regularize against class selectivity at different points in training, we show that the presence of class-selective neurons early in training is critical to the successful training of the network. |
| Researcher Affiliation | Collaboration | Omkar Ranadive EMAIL Alchera X, Northwestern University; Nikhil Thakurdesai EMAIL Independent Researcher; Ari S. Morcos EMAIL Meta AI (FAIR); Matthew Leavitt EMAIL MosaicML; Stéphane Deny EMAIL Aalto University |
| Pseudocode | No | No explicit pseudocode or algorithm blocks are provided. The methodology is described in narrative text and mathematical equations for the Class Selectivity Index and regularizer. |
| Open Source Code | No | The paper does not explicitly state that source code for the described methodology is released, nor does it provide a link to a code repository. It mentions using the torch_cka library (Subramanian, 2021) but not their own code. |
| Open Datasets | Yes | We use ResNet-50s (He et al., 2016) trained on ImageNet for all our experiments. |
| Dataset Splits | Yes | The class selectivity indices were calculated over the validation set of 50k images for every epoch, from epoch 0 to epoch 90. |
| Hardware Specification | No | The paper does not provide specific hardware details such as exact GPU/CPU models, processor types, or memory amounts used for running its experiments. |
| Software Dependencies | No | The paper mentions using PyTorch (Paszke et al., 2019), Seaborn Library (Waskom, 2021), and torch_cka library (Subramanian, 2021) but does not provide specific version numbers for these software dependencies. |
| Experiment Setup | Yes | All instances were trained for 90 epochs with a batch size of 256, learning rate of 0.1, weight decay of 1e-4, and momentum of 0.9. |
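The Class Selectivity Index mentioned in the Pseudocode row follows the definition of Leavitt & Morcos: for a neuron, take its class-conditional mean activations, and compare the largest one against the mean of the rest. The sketch below is an assumption-based illustration of that formula (the paper releases no code per the Open Source Code row); the function name and the `eps` constant are our choices, not the authors'.

```python
import numpy as np

def class_selectivity_index(mean_activations, eps=1e-7):
    """Class Selectivity Index for a single neuron.

    mean_activations: the neuron's mean activation for each class.
    Returns (mu_max - mu_-max) / (mu_max + mu_-max + eps), where
    mu_max is the largest class-conditional mean activation and
    mu_-max is the average over the remaining classes. The small
    eps guards against division by zero for an all-silent neuron.
    """
    a = np.asarray(mean_activations, dtype=float)
    top = int(np.argmax(a))
    mu_max = a[top]
    mu_rest = np.delete(a, top).mean()
    return (mu_max - mu_rest) / (mu_max + mu_rest + eps)
```

A neuron firing for exactly one class scores near 1; a neuron responding equally to all classes scores 0, which is what the paper's selectivity regularizer pushes toward.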
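The Experiment Setup row fully specifies the optimizer: SGD with learning rate 0.1, momentum 0.9, and weight decay 1e-4, for 90 epochs at batch size 256. As a minimal sketch of how those three optimizer hyperparameters combine in a single update (the function and variable names are ours; this is the standard PyTorch-style momentum convention, not code from the paper):

```python
# Reported hyperparameters: lr 0.1, momentum 0.9, weight decay 1e-4.
LR, MOMENTUM, WEIGHT_DECAY = 0.1, 0.9, 1e-4

def sgd_step(param, grad, velocity):
    """One SGD update with momentum and L2 weight decay (scalar form)."""
    g = grad + WEIGHT_DECAY * param       # weight decay folded into the gradient
    velocity = MOMENTUM * velocity + g    # momentum buffer accumulates gradients
    param = param - LR * velocity         # step against the smoothed gradient
    return param, velocity
```

In practice this corresponds to `torch.optim.SGD(model.parameters(), lr=0.1, momentum=0.9, weight_decay=1e-4)` over 90 epochs.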