Symmetry Induces Structure and Constraint of Learning

Authors: Liu Ziyin

ICML 2024 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We apply the result to four different problems and numerically validate the theory in Section 3. |
| Researcher Affiliation | Collaboration | MIT, NTT Research. |
| Pseudocode | No | The paper describes an algorithm (DCS) in prose and mathematical formulation in Section 2.6 but does not provide structured pseudocode or an algorithm block. |
| Open Source Code | No | The paper does not explicitly state that source code for the methodology described in this paper is released, nor does it provide a link to it. |
| Open Datasets | Yes | We train a Resnet18 on the CIFAR-10 dataset, following the standard training procedures. |
| Dataset Splits | No | The paper mentions training on the CIFAR-10 dataset and using unseen test points but does not explicitly detail the training/validation/test splits or cross-validation methodology. |
| Hardware Specification | No | The paper does not provide specific details about the hardware (e.g., GPU models, CPU types, memory) used to run the experiments. |
| Software Dependencies | No | The paper mentions using SGD and Adam optimizers but does not specify version numbers for any software libraries or dependencies (e.g., Python, PyTorch, TensorFlow versions). |
| Experiment Setup | Yes | We train a Resnet18 on the CIFAR-10 dataset, following the standard training procedures. We compute the correlation matrix of neuron firing of the penultimate layer of the model, which follows a fully connected layer. We compare the matrix for both training with and without weight decay and for both pre- and post-activations (see Appendix B). See Figure 3-right, which shows that homogeneous solutions are preferred when weight decay is used, in agreement with the prediction of Theorem 1. Here, the training proceeds with SGD with 0.9 momentum and batch size 128, consistent with standard practice. We use a cosine learning rate scheduler for 200 epochs. |
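Since the paper does not release code, the following is a minimal sketch of the experiment setup quoted above (ResNet18 on CIFAR-10, SGD with momentum 0.9, batch size 128, cosine learning rate schedule for 200 epochs, runs with and without weight decay, and the correlation matrix of penultimate-layer activations), assuming a standard PyTorch/torchvision pipeline. The learning rate, weight-decay coefficient, and data augmentation are assumptions not stated in the paper.

```python
# Hedged sketch of the reported setup; hyperparameters not quoted in the paper
# (learning rate, weight-decay value, augmentation) are assumptions.
import torch
import torch.nn as nn
import torchvision
import torchvision.transforms as T

device = "cuda" if torch.cuda.is_available() else "cpu"

# Standard CIFAR-10 training pipeline (augmentation choices are assumed).
transform = T.Compose([
    T.RandomCrop(32, padding=4),
    T.RandomHorizontalFlip(),
    T.ToTensor(),
    T.Normalize((0.4914, 0.4822, 0.4465), (0.2470, 0.2435, 0.2616)),
])
train_set = torchvision.datasets.CIFAR10("data", train=True, download=True,
                                         transform=transform)
train_loader = torch.utils.data.DataLoader(train_set, batch_size=128,
                                           shuffle=True, num_workers=4)

model = torchvision.models.resnet18(num_classes=10).to(device)

# Two runs are compared: with and without weight decay (coefficient assumed).
use_weight_decay = True
optimizer = torch.optim.SGD(model.parameters(), lr=0.1, momentum=0.9,
                            weight_decay=5e-4 if use_weight_decay else 0.0)
scheduler = torch.optim.lr_scheduler.CosineAnnealingLR(optimizer, T_max=200)
criterion = nn.CrossEntropyLoss()

for epoch in range(200):
    model.train()
    for x, y in train_loader:
        x, y = x.to(device), y.to(device)
        optimizer.zero_grad()
        criterion(model(x), y).backward()
        optimizer.step()
    scheduler.step()

# Correlation matrix of penultimate-layer neuron firing (the features feeding
# the final fully connected layer); the pre-activation variant would hook an
# earlier module analogously.
features = []
def hook(_module, _inputs, output):
    features.append(output.flatten(1).detach().cpu())
model.avgpool.register_forward_hook(hook)

model.eval()
with torch.no_grad():
    for x, _ in train_loader:
        model(x.to(device))
acts = torch.cat(features)      # shape: (num_samples, num_neurons)
corr = torch.corrcoef(acts.T)   # (num_neurons, num_neurons) correlation matrix
```

Comparing `corr` between the two runs is what Figure 3-right reports: with weight decay the matrix is expected to look more homogeneous, in line with Theorem 1.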