HoD-Net: High-Order Differentiable Deep Neural Networks and Applications
Authors: Siyuan Shen, Tianjia Shao, Kun Zhou, Chenfanfu Jiang, Feng Luo, Yin Yang
AAAI 2022, pp. 8249-8258
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We implement HoD-Net with PyTorch in CR form on a workstation computer with an Intel i9-10980XE CPU and an Nvidia 3090 GPU with 24GB GPU memory. In time-critical applications, we port the trained network to CUDA and use cuBLAS (Nvidia 2008) for the CR-based network pass. ... The result is reported in Fig. 3. In the first-order case, CSFD-BP is not an option, and we only use the CSFD-based HoD-Net implementation. ... The datasets used are MNIST, SVHN, and CIFAR10. The training curves are plotted in Fig. 4. In a nutshell, HoD-Net provides an efficient approach to extract curvature information of the optimization manifold, and our method shows superior performance in those DL tasks. (A minimal complex-step differentiation sketch appears after this table.) |
| Researcher Affiliation | Academia | ¹State Key Lab of CAD&CG, Zhejiang University; ²Department of Mathematics, University of California, Los Angeles; ³School of Computing, Clemson University |
| Pseudocode | Yes | Algorithm 1: HoD-Net Newton-Krylov training. (A generic Newton-Krylov training-step sketch follows the table.) |
| Open Source Code | No | The paper does not provide concrete access to source code (specific repository link, explicit code release statement, or code in supplementary materials) for the methodology described in this paper. |
| Open Datasets | Yes | The datasets used are MNIST, SVHN, and CIFAR10. ... We compare HoD CAE with the traditional autoencoder and Jacobian-penalized CAE (Rifai et al. 2011b) on MNIST, MNIST-rot (Vincent et al. 2010), and CIFAR-bw, a grey-scale version of CIFAR10. (Sketches of the Rifai-style contractive penalty and of loading these datasets follow the table.) |
| Dataset Splits | No | The paper does not provide the dataset split information (exact percentages, sample counts, citations to predefined splits, or a detailed splitting methodology) needed to reproduce the data partitioning. It mentions a "validation error plateau" but gives no explicit split percentages or counts. |
| Hardware Specification | Yes | We implement HoD-Net with PyTorch in CR form on a workstation computer with an Intel i9-10980XE CPU and an Nvidia 3090 GPU with 24GB GPU memory. |
| Software Dependencies | No | The paper mentions PyTorch, TensorFlow, and CUDA but does not provide version numbers for these or other key software components used in the experiments. A citation to PyTorch (Paszke et al. 2019) is given, but the paper does not state that this cited version was the one used for the experiments. |
| Experiment Setup | No | The paper describes the network structures (e.g., '8 FC layers, and each layer is activated by ELU', '4-layer MLP activated by ELU', 'network structure is 2-100-100-100-100-100-8, and each hidden layer is activated by tanh'), but it does not provide hyperparameters such as the learning rate, batch size, number of epochs, or optimizer settings used for training. (The last quoted structure is sketched as a PyTorch module after this table.) |
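
The quoted responses reference CSFD (complex-step finite differentiation), the numerical machinery behind HoD-Net. As orientation only, here is a minimal sketch of the standard first-order complex-step formula f'(x) ≈ Im(f(x + ih))/h; this is not the paper's CR-form or high-order multicomplex implementation, and the function name `csfd_grad` is ours.

```python
import numpy as np

def csfd_grad(f, x, h=1e-30):
    """First derivative of a real-analytic scalar function f at x via the
    complex-step formula f'(x) ~ Im(f(x + i*h)) / h.  Because no subtraction
    of nearly equal terms occurs, h can be taken near machine underflow,
    giving derivatives accurate to machine precision."""
    return np.imag(f(x + 1j * h)) / h

# Usage: differentiate f(x) = exp(x) * sin(x) at x = 1.5.
f = lambda x: np.exp(x) * np.sin(x)
print(csfd_grad(f, 1.5))                          # complex-step derivative
print(np.exp(1.5) * (np.sin(1.5) + np.cos(1.5)))  # analytic reference
```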
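Algorithm 1 in the paper is a HoD-Net Newton-Krylov trainer. The paper obtains Hessian information through CSFD; the sketch below instead shows the generic Newton-Krylov pattern (a conjugate-gradient inner solve driven by autograd Hessian-vector products), so the structure of such a training step is concrete. `newton_krylov_step`, the damping value, and the CG iteration budget are all our assumptions, not values from the paper.

```python
import torch

def flat_grad(loss, params, create_graph=False, retain_graph=None):
    """Gradient of `loss` w.r.t. `params`, flattened into one vector."""
    grads = torch.autograd.grad(loss, params,
                                create_graph=create_graph,
                                retain_graph=retain_graph)
    return torch.cat([g.reshape(-1) for g in grads])

def newton_krylov_step(loss_fn, params, cg_iters=10, damping=1e-3):
    """One damped Newton step: solve (H + damping*I) d = -g by conjugate
    gradient, using Hessian-vector products from double backprop so the
    Hessian is never formed explicitly."""
    loss = loss_fn()
    g = flat_grad(loss, params, create_graph=True)  # keep graph for HVPs

    def hvp(v):
        # Pearlmutter's trick: H v = d/dtheta (g . v).
        return flat_grad((g * v).sum(), params, retain_graph=True) + damping * v

    # Conjugate gradient on (H + damping*I) d = -g.
    d = torch.zeros_like(g)
    r = -g.detach()
    p = r.clone()
    rs = r @ r
    for _ in range(cg_iters):
        Hp = hvp(p)
        alpha = rs / (p @ Hp)
        d = d + alpha * p
        r = r - alpha * Hp
        rs_new = r @ r
        if rs_new < 1e-12:
            break
        p = r + (rs_new / rs) * p
        rs = rs_new

    # Apply the Newton direction as a plain parameter update.
    with torch.no_grad():
        offset = 0
        for prm in params:
            n = prm.numel()
            prm += d[offset:offset + n].view_as(prm)
            offset += n
    return loss.item()
```

A typical call would pass `loss_fn = lambda: criterion(model(x), y)` and `params = list(model.parameters())`; the damping term keeps the inner solve stable when the Hessian is indefinite.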
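The Open Datasets row cites the Jacobian-penalized contractive autoencoder of Rifai et al. (2011b) as a baseline. For readers unfamiliar with it, this is a minimal sketch of the classic single-hidden-layer formulation with its closed-form contractive penalty; the class name and hyperparameter values are illustrative, and this is the baseline, not the paper's HoD CAE.

```python
import torch
import torch.nn as nn

class ContractiveAE(nn.Module):
    """Single-hidden-layer contractive autoencoder (Rifai et al., 2011).
    With a sigmoid encoder h = s(W x + b), the Frobenius norm of the
    encoder Jacobian has the closed form
        ||J(x)||_F^2 = sum_j (h_j (1 - h_j))^2 * ||W_j||^2."""

    def __init__(self, n_in=784, n_hidden=400, lam=1e-4):
        super().__init__()
        self.enc = nn.Linear(n_in, n_hidden)
        self.dec = nn.Linear(n_hidden, n_in)
        self.lam = lam

    def forward(self, x):
        h = torch.sigmoid(self.enc(x))
        return self.dec(h), h

    def loss(self, x):
        x_hat, h = self(x)
        recon = ((x_hat - x) ** 2).sum(dim=1).mean()
        # Closed-form contractive penalty.
        dh = (h * (1 - h)) ** 2                   # (batch, hidden)
        w_sq = (self.enc.weight ** 2).sum(dim=1)  # (hidden,)
        contractive = (dh * w_sq).sum(dim=1).mean()
        return recon + self.lam * contractive
```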
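The datasets named in the table are all standard torchvision downloads. The snippet below shows one way to obtain them, including a grayscale transform in the spirit of "CIFAR-bw"; since the paper reports no split percentages, the 90/10 train/validation split here is purely an assumption for illustration.

```python
import torch
from torchvision import datasets, transforms

# Grayscale transform approximates "CIFAR-bw" (a grey-scale CIFAR10);
# the exact preprocessing used in the paper is not specified.
to_bw = transforms.Compose([transforms.Grayscale(), transforms.ToTensor()])

mnist = datasets.MNIST("data", train=True, download=True,
                       transform=transforms.ToTensor())
svhn = datasets.SVHN("data", split="train", download=True,
                     transform=transforms.ToTensor())
cifar_bw = datasets.CIFAR10("data", train=True, download=True, transform=to_bw)

# The paper gives no split percentages, so this 90/10 train/validation
# split is purely illustrative.
n_val = len(mnist) // 10
train_set, val_set = torch.utils.data.random_split(
    mnist, [len(mnist) - n_val, n_val],
    generator=torch.Generator().manual_seed(0))
```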
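Finally, although the Experiment Setup row notes the missing hyperparameters, the quoted layer widths are concrete. The "2-100-100-100-100-100-8" tanh-activated structure maps directly onto a PyTorch module like the following; the helper name `mlp` is ours, and no training hyperparameters are assumed because the paper gives none.

```python
import torch.nn as nn

def mlp(widths=(2, 100, 100, 100, 100, 100, 8), act=nn.Tanh):
    """MLP matching the quoted layer widths 2-100-...-100-8, with the
    stated tanh activation on every hidden layer."""
    layers = []
    for i in range(len(widths) - 1):
        layers.append(nn.Linear(widths[i], widths[i + 1]))
        if i < len(widths) - 2:   # no activation after the output layer
            layers.append(act())
    return nn.Sequential(*layers)
```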