Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].
On the Dual Problem of Convexified Convolutional Neural Networks
Authors: Site Bai, Chuyang Ke, Jean Honorio
TMLR 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In this section, we evaluate DCCNN on real-world data as a sanity check for the proposed method. The performance is by no means state-of-the-art for the relevant tasks, and the baselines reflect only performance in our experimental settings, which are the same for all methods. For DCCNN, we solve the dual problem by a coordinate-descent approach (Algorithm 2 in Appendix E.1). For CCNN, we used a projected gradient descent approach and r = 25. We apply hinge loss for evaluation. The specific form of the problem in Eq. (6) with hinge loss can be found in Appendix D. We briefly report results in Table 2 and leave the detailed analysis to Appendix G. In Table 2a, for the binary classification task on the MNIST data (Lecun et al., 1998), we can see that DCCNN outperforms CNNs trained using SGD and different kernel matrix factorizations for CCNN on one-conv-layer and two-conv-layer networks with only one exception, which verifies the effectiveness of DCCNN. On the more complicated ImageNet dataset (Deng et al., 2009), DCCNN also performs comparably well with the end-to-end SGD-optimized CNNs under both AlexNet (Krizhevsky et al., 2012) and VGG11 (Simonyan & Zisserman, 2015) architectures, and significantly outperforms the CCNN method. From Table 2b, we see that in multiclass classification, the performance level of DCCNN is better than various factorization versions of CCNN and is comparable with SGD. |
| Researcher Affiliation | Academia | Site Bai, Department of Computer Science, Purdue University; Chuyang Ke, Department of Computer Science, Purdue University; Jean Honorio, School of Computing and Information Systems, The University of Melbourne |
| Pseudocode | Yes | The pseudocode of the algorithm is provided in Algorithm 2 in Appendix E.1. The complete algorithm workflow is demonstrated in Algorithm 1. The complete algorithm of learning a D-layer DCCNN is illustrated in Algorithm 3 in Appendix E.2. |
| Open Source Code | No | The paper does not provide any explicit statement about open-sourcing the code, nor does it include a link to a code repository. |
| Open Datasets | Yes | In Table 2a, for the binary classification task on the MNIST data (Lecun et al., 1998), we can see that DCCNN outperforms CNNs trained using SGD and different kernel matrix factorizations for CCNN on one-conv-layer and two-conv-layer networks with only one exception, which verifies the effectiveness of DCCNN. On the more complicated ImageNet dataset (Deng et al., 2009), DCCNN also performs comparably well with the end-to-end SGD-optimized CNNs under both AlexNet (Krizhevsky et al., 2012) and VGG11 (Simonyan & Zisserman, 2015) architectures, and significantly outperforms the CCNN method. |
| Dataset Splits | No | The paper mentions using the MNIST and ImageNet datasets for binary and multiclass classification tasks but does not explicitly describe how these datasets were split into training, validation, or test sets, nor does it cite a standard split. |
| Hardware Specification | No | The paper does not provide specific hardware details (e.g., GPU models, CPU types, memory amounts) used for running the experiments. |
| Software Dependencies | No | The paper does not provide specific software dependency details, such as library names with version numbers, or programming language versions used for implementation. |
| Experiment Setup | Yes | For CCNN, we used a projected gradient descent approach and r = 25. We apply hinge loss for evaluation. The specific form of the problem in Eq. (6) with hinge loss can be found in Appendix D. We briefly report results in Table 2 and leave the detailed analysis to Appendix G. |
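The experiment-setup evidence mentions solving the CCNN baseline with projected gradient descent under a rank/norm constraint. The paper's exact Algorithm 2 and its problem form (Eq. (6) with hinge loss) are not reproduced here, so the following is only a generic, hedged sketch of the projected-gradient-descent pattern referenced above, using a nuclear-norm ball as the feasible set (the constraint type commonly used in convexified CNN formulations); the function names, step size, and radius are illustrative assumptions, not the authors' code.

```python
import numpy as np

def project_nuclear_ball(A, radius):
    """Project A onto {X : ||X||_* <= radius}: SVD, then project the
    singular values onto the scaled simplex (water-filling)."""
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    if s.sum() <= radius:
        return A  # already feasible
    u = np.sort(s)[::-1]
    css = np.cumsum(u)
    idx = np.arange(1, len(u) + 1)
    rho = np.max(idx[u - (css - radius) / idx > 0])
    theta = (css[rho - 1] - radius) / rho
    return (U * np.maximum(s - theta, 0.0)) @ Vt

def projected_gradient_descent(grad, X0, radius, lr=0.1, iters=200):
    """Generic projected gradient descent: take a gradient step on the
    (convex) objective, then project back onto the constraint set."""
    X = project_nuclear_ball(X0, radius)
    for _ in range(iters):
        X = project_nuclear_ball(X - lr * grad(X), radius)
    return X

# Toy usage: minimize ||X - M||_F^2 subject to ||X||_* <= 1.
M = np.diag([3.0, 2.0])
X_star = projected_gradient_descent(lambda X: 2 * (X - M),
                                    np.zeros((2, 2)), radius=1.0, lr=0.2)
```

For this toy quadratic objective the solution thresholds the singular values of M, yielding approximately diag(1, 0); the paper's actual baseline applies the same project-then-step pattern to its hinge-loss objective.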