Directional convergence and alignment in deep learning

Authors: Ziwei Ji, Matus Telgarsky

NeurIPS 2020

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "we additionally provide empirical support not just close to the theory (e.g., the AlexNet), but also on non-homogeneous networks (e.g., the DenseNet). The experiments in Figures 1 and 2 are performed in as standard a way as possible to highlight that directional convergence is a reliable property; full details are in Appendix A. Briefly, Figure 1 uses synthetic data and vanilla gradient descent... Figure 2 uses standard CIFAR firstly with a modified homogeneous AlexNet and secondly with an unmodified DenseNet"
Researcher Affiliation | Academia | Ziwei Ji, Matus Telgarsky ({ziweiji2,mjt}@illinois.edu), University of Illinois, Urbana-Champaign
Pseudocode | No | The paper contains mathematical theorems and proofs but does not include any structured pseudocode or algorithm blocks.
Open Source Code | No | The paper neither states that the source code for the described methodology is released nor links to a code repository.
Open Datasets | Yes | "Figure 2 uses standard CIFAR firstly with a modified homogeneous AlexNet and secondly with an unmodified DenseNet"
Dataset Splits | No | The paper mentions using synthetic data and standard CIFAR, but does not provide details on training, validation, or test splits (e.g., percentages or sample counts).
Hardware Specification | No | "All computations were performed on standard CPUs." This statement is vague and omits specifics such as CPU model, core count, or memory.
Software Dependencies | No | "PyTorch [Paszke et al., 2019] was used for implementation." Only the software name is given, with no version number; no other versioned dependencies are listed.
Experiment Setup | Yes | "Figure 1 uses synthetic data and vanilla gradient descent (no momentum, no weight decay, etc.) on a 10,000 node wide 2-layer squared ReLU network. Figure 2 uses standard CIFAR firstly with a modified homogeneous AlexNet and secondly with an unmodified DenseNet; SGD was used on CIFAR due to training set size."
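
The quoted Figure 1 setup can be illustrated concretely. Below is a minimal sketch, not the authors' code: it assumes PyTorch (the framework the paper names) and matches the quoted 10,000-node width and vanilla gradient descent, while the input dimension, sample count, learning rate, loss, and step budget are illustrative assumptions. It also tracks the cosine similarity between successive normalized parameter vectors, one simple way to observe the directional convergence the paper studies.

    # Minimal sketch of the Figure 1 setup (not the authors' code). Width and
    # plain gradient descent match the paper's description; the data, learning
    # rate, loss, and step counts are illustrative assumptions.
    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    torch.manual_seed(0)

    class SquaredReLUNet(nn.Module):
        """2-layer net with squared-ReLU activation; bias-free layers keep it homogeneous."""
        def __init__(self, in_dim, width=10_000):
            super().__init__()
            self.hidden = nn.Linear(in_dim, width, bias=False)
            self.out = nn.Linear(width, 1, bias=False)

        def forward(self, x):
            return self.out(F.relu(self.hidden(x)) ** 2)

    def direction(model):
        # Flatten all parameters and normalize: the direction w / ||w||.
        w = torch.cat([p.detach().reshape(-1) for p in model.parameters()])
        return w / w.norm()

    # Synthetic binary-classification data (the paper's exact setup is in its Appendix A).
    n, d = 100, 10
    X = torch.randn(n, d)
    y = torch.sign(torch.randn(n))

    model = SquaredReLUNet(d)
    # Full-batch SGD with no momentum or weight decay is plain gradient descent.
    opt = torch.optim.SGD(model.parameters(), lr=0.01)

    prev = direction(model)
    for step in range(1, 5001):
        opt.zero_grad()
        margins = y * model(X).squeeze(-1)
        loss = F.softplus(-margins).mean()  # logistic loss log(1 + exp(-margin))
        loss.backward()
        opt.step()
        if step % 1000 == 0:
            cur = direction(model)
            # Cosine similarity of successive directions approaches 1 under
            # directional convergence.
            print(f"step {step}: loss {loss.item():.4f}, cos {(prev @ cur).item():.6f}")
            prev = cur

The Figure 2 runs described above would follow the same pattern, swapping in the modified homogeneous AlexNet or unmodified DenseNet and minibatch SGD over CIFAR.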