Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].
On the Dual Problem of Convexified Convolutional Neural Networks
Authors: Site Bai, Chuyang Ke, Jean Honorio
TMLR 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In this section, we evaluate DCCNN on real-world data as a sanity check for the proposed method. The performance is by no means state-of-the-art for the relevant tasks, and the baselines reflect only performance in our experimental settings, which are the same for all methods. For DCCNN, we solve the dual problem by a coordinate-descent approach (Algorithm 2 in Appendix E.1). For CCNN, we used a projected gradient descent approach and r = 25. We apply hinge loss for evaluation. The specific form of the problem in Eq. (6) with hinge loss can be found in Appendix D. We briefly report results in Table 2 and leave the detailed analysis to Appendix G. In Table 2a, for the binary classification task on the MNIST data (Lecun et al., 1998), we can see that DCCNN outperforms CNNs trained using SGD and different kernel matrix factorizations for CCNN on one-conv-layer and two-conv-layer networks with only one exception, which verifies the effectiveness of DCCNN. On the more complicated ImageNet dataset (Deng et al., 2009), DCCNN also performs comparably well with the end-to-end SGD-optimized CNNs under both AlexNet (Krizhevsky et al., 2012) and VGG11 (Simonyan & Zisserman, 2015) architectures, and significantly outperforms the CCNN method. From Table 2b, we see that in multiclass classification, the performance level of DCCNN is better than various factorization versions of CCNN and is comparable with SGD. |
| Researcher Affiliation | Academia | Site Bai, Department of Computer Science, Purdue University; Chuyang Ke, Department of Computer Science, Purdue University; Jean Honorio, School of Computing and Information Systems, The University of Melbourne |
| Pseudocode | Yes | The pseudocode of the algorithm is provided in Algorithm 2 in Appendix E.1. The complete algorithm workflow is demonstrated in Algorithm 1. The complete algorithm of learning a D-layer DCCNN is illustrated in Algorithm 3 in Appendix E.2. |
| Open Source Code | No | The paper does not provide any explicit statement about open-sourcing the code, nor does it include a link to a code repository. |
| Open Datasets | Yes | In Table 2a, for the binary classification task on the MNIST data (Lecun et al., 1998), we can see that DCCNN outperforms CNNs trained using SGD and different kernel matrix factorizations for CCNN on one-conv-layer and two-conv-layer networks with only one exception, which verifies the effectiveness of DCCNN. On the more complicated ImageNet dataset (Deng et al., 2009), DCCNN also performs comparably well with the end-to-end SGD-optimized CNNs under both AlexNet (Krizhevsky et al., 2012) and VGG11 (Simonyan & Zisserman, 2015) architectures, and significantly outperforms the CCNN method. |
| Dataset Splits | No | The paper mentions using the MNIST and ImageNet datasets for binary and multiclass classification tasks but does not explicitly describe how these datasets were split into training, validation, or test sets, nor does it cite a standard split. |
| Hardware Specification | No | The paper does not provide specific hardware details (e.g., GPU models, CPU types, memory amounts) used for running the experiments. |
| Software Dependencies | No | The paper does not provide specific software dependency details, such as library names with version numbers, or programming language versions used for implementation. |
| Experiment Setup | Yes | For CCNN, we used a projected gradient descent approach and r = 25. We apply hinge loss for evaluation. The specific form of the problem in Eq. (6) with hinge loss can be found in Appendix D. We briefly report results in Table 2 and leave the detailed analysis to Appendix G. |
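The experiment-setup evidence mentions solving the CCNN baseline with projected gradient descent under a rank/norm constraint. The paper's exact Algorithm 2 and its problem form (Eq. (6) with hinge loss) are not reproduced here, so the following is only a generic, hedged sketch of the projected-gradient-descent pattern referenced above, using a nuclear-norm ball as the feasible set (the constraint type commonly used in convexified CNN formulations); the function names, step size, and radius are illustrative assumptions, not the authors' code.

```python
import numpy as np

def project_nuclear_ball(A, radius):
    """Project A onto {X : ||X||_* <= radius}: SVD, then project the
    singular values onto the scaled simplex (water-filling)."""
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    if s.sum() <= radius:
        return A  # already feasible
    u = np.sort(s)[::-1]
    css = np.cumsum(u)
    idx = np.arange(1, len(u) + 1)
    rho = np.max(idx[u - (css - radius) / idx > 0])
    theta = (css[rho - 1] - radius) / rho
    return (U * np.maximum(s - theta, 0.0)) @ Vt

def projected_gradient_descent(grad, X0, radius, lr=0.1, iters=200):
    """Generic projected gradient descent: take a gradient step on the
    (convex) objective, then project back onto the constraint set."""
    X = project_nuclear_ball(X0, radius)
    for _ in range(iters):
        X = project_nuclear_ball(X - lr * grad(X), radius)
    return X

# Toy usage: minimize ||X - M||_F^2 subject to ||X||_* <= 1.
M = np.diag([3.0, 2.0])
X_star = projected_gradient_descent(lambda X: 2 * (X - M),
                                    np.zeros((2, 2)), radius=1.0, lr=0.2)
```

For this toy quadratic objective the solution thresholds the singular values of M, yielding approximately diag(1, 0); the paper's actual baseline applies the same project-then-step pattern to its hinge-loss objective.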