Usable Information and Evolution of Optimal Representations During Training

Authors: Michael Kleinman, Alessandro Achille, Daksh Idnani, Jonathan Kao

ICLR 2021

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We show these effects both on perceptual decision-making tasks inspired by the neuroscience literature and on standard image classification tasks. We trained multiple network architectures on these tasks and assessed the usable information in representations across layers and training epochs.
Researcher Affiliation | Academia | University of California, Los Angeles; Caltech. {michael.kleinman,dakshidnani}@ucla.edu, aachille@caltech.edu, kao@seas.ucla.edu
Pseudocode | No | The paper does not contain any structured pseudocode or algorithm blocks.
Open Source Code | No | The paper does not provide any explicit statements about releasing source code or links to a code repository.
Open Datasets | Yes | We then use this framework to examine how relevant and irrelevant information are represented in more realistic tasks and architectures, and how hyper-parameters affect the learning dynamics. We define a coarse labelling of task labels and study how the network represents the fine and coarse labelling through training, using a ResNet-18 (He et al., 2016) and All-CNN (Springenberg et al., 2015) on CIFAR-10 and CIFAR-100. (A hedged data-preparation sketch follows the table.)
Dataset Splits | Yes | FC Small, n = 2: batch size 32, learning rate 0.05, number of data samples 10,000 (90% train, 10% validation)
Hardware Specification | No | The paper does not specify any particular hardware (e.g., GPU/CPU models, memory) used for running the experiments.
Software Dependencies | No | The paper names network architectures such as ResNet-18 and All-CNN, but does not provide version numbers for software dependencies such as Python, PyTorch, TensorFlow, or CUDA.
Experiment Setup | Yes | We trained a ResNet-18 (He et al., 2016) to output the coarse label of CIFAR-10, using an initial learning rate of 0.1 with exponential annealing (0.97), momentum (0.9), and a batch size of 128. For the All-CNN (Springenberg et al., 2015) we used a batch size of 128, initial learning rate of 0.05 decaying smoothly by a factor of 0.97 at each epoch, momentum of 0.9, and weight decay with coefficient 0.001. (A hedged training-setup sketch follows the table.)
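
To make the Open Datasets and Dataset Splits rows concrete, below is a minimal PyTorch sketch of a coarse labelling of CIFAR-10. The grouping of the ten fine classes into two coarse classes (vehicles vs. animals), the CoarseCIFAR10 wrapper name, and the use of torchvision are illustrative assumptions; the paper defines its own coarse labelling, and the quoted text does not spell out the mapping.

# Minimal sketch, assuming PyTorch/torchvision; the fine-to-coarse mapping below
# is a hypothetical example, NOT the paper's own grouping.
from torch.utils.data import DataLoader
from torchvision import datasets, transforms

# Hypothetical coarse grouping: 0 = vehicle (airplane, automobile, ship, truck),
# 1 = animal (all remaining CIFAR-10 classes).
FINE_TO_COARSE = {0: 0, 1: 0, 8: 0, 9: 0, 2: 1, 3: 1, 4: 1, 5: 1, 6: 1, 7: 1}

class CoarseCIFAR10(datasets.CIFAR10):
    """CIFAR-10 variant that returns a coarse label in place of the fine label."""
    def __getitem__(self, index):
        image, fine_label = super().__getitem__(index)
        return image, FINE_TO_COARSE[fine_label]

train_set = CoarseCIFAR10(root="./data", train=True, download=True,
                          transform=transforms.ToTensor())
train_loader = DataLoader(train_set, batch_size=128, shuffle=True)

The 90%/10% split quoted for the FC Small configuration (10,000 samples, batch size 32) could likewise be produced with torch.utils.data.random_split(dataset, [9000, 1000]), though the paper's exact splitting procedure is not described in the quoted text.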
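
The Experiment Setup row quotes enough hyper-parameters for a rough reconstruction of the ResNet-18 run. The sketch below is an assumption-laden outline rather than the authors' code: torchvision's resnet18 stands in for their ResNet-18, ExponentialLR(gamma=0.97) stands in for the exponential annealing, the epoch count is a placeholder the paper does not report, and the commented optimizer line shows the corresponding All-CNN settings (learning rate 0.05, weight decay 0.001).

# Minimal sketch of the quoted training setup (assumed structure, not the
# authors' implementation): SGD with lr 0.1 and momentum 0.9, batch size 128,
# and the learning rate multiplied by 0.97 after every epoch.
import torch
import torch.nn as nn
from torchvision import models

device = "cuda" if torch.cuda.is_available() else "cpu"
# Two output classes, matching the hypothetical coarse grouping sketched above.
model = models.resnet18(num_classes=2).to(device)

optimizer = torch.optim.SGD(model.parameters(), lr=0.1, momentum=0.9)
# All-CNN variant reported in the paper (architecture definition not shown here):
# optimizer = torch.optim.SGD(all_cnn.parameters(), lr=0.05, momentum=0.9,
#                             weight_decay=1e-3)
scheduler = torch.optim.lr_scheduler.ExponentialLR(optimizer, gamma=0.97)
criterion = nn.CrossEntropyLoss()

num_epochs = 100  # placeholder; the quoted text does not report the epoch count
for epoch in range(num_epochs):
    for images, coarse_labels in train_loader:  # loader from the sketch above
        images, coarse_labels = images.to(device), coarse_labels.to(device)
        optimizer.zero_grad()
        loss = criterion(model(images), coarse_labels)
        loss.backward()
        optimizer.step()
    scheduler.step()  # exponential annealing: lr <- 0.97 * lr

Stepping ExponentialLR once per epoch matches the quoted description of the learning rate decaying smoothly by a factor of 0.97 at each epoch; a per-iteration decay would also be consistent with that wording.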