Towards the Dynamics of a DNN Learning Symbolic Interactions
Authors: Qihan Ren, Junpeng Zhang, Yang Xu, Yue Xin, Dongrui Liu, Quanshi Zhang
NeurIPS 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experiments show that our theory well predicts the real dynamics of interactions on different DNNs trained for various tasks. ... We have conducted experiments to train DNNs with various architectures for different tasks. It shows that our theory can well predict the learning dynamics of interactions in real DNNs. ... We conducted experiments to examine whether our theory could predict the real dynamics of interaction strength of different orders when we trained DNNs in practice. |
| Researcher Affiliation | Academia | Qihan Ren1 , Junpeng Zhang1,2 , Yang Xu3, Yue Xin1, Dongrui Liu1,4, Quanshi Zhang1 1Shanghai Jiao Tong University 2Beijing Institute for General Artificial Intelligence 3Zhejiang University 4Shanghai Artificial Intelligence Laboratory {renqihan, zhangjp63, zqs1022}@sjtu.edu.cn |
| Pseudocode | No | The paper does not contain any explicitly labeled 'Pseudocode' or 'Algorithm' blocks, nor structured steps formatted like code or an algorithm. |
| Open Source Code | No | NeurIPS Paper Checklist - 5. Open access to data and code: Answer: [No] Justification: The code will be released when the paper is accepted. |
| Open Datasets | Yes | We trained various DNNs on different datasets. Specifically, for image data, we trained VGG-11 on the MNIST dataset (Creative Commons Attribution-Share Alike 3.0 license), VGG-11/VGG-16 on the CIFAR-10 dataset (MIT license), AlexNet/VGG-16 on the CUB-200-2011 dataset (license unknown), and VGG-16 on the Tiny ImageNet dataset (license unknown). For natural language data, we trained BERT-Tiny and BERT-Medium on the SST-2 dataset (license unknown). For point cloud data, we trained DGCNN on the ShapeNet dataset (custom non-commercial license). |
| Dataset Splits | No | The paper mentions 'training samples' and a 'testing set' but does not specify explicit validation dataset splits (percentages, counts, or specific predefined validation sets). |
| Hardware Specification | Yes | All DNNs can be trained within 12 hours on a single NVIDIA GeForce RTX 3090 GPU (with 24G GPU memory). |
| Software Dependencies | No | The paper mentions using specific DNN architectures (e.g., AlexNet, VGG, BERT, DGCNN) but does not provide specific version numbers for software dependencies like programming languages, deep learning frameworks (e.g., PyTorch, TensorFlow), or other libraries. |
| Experiment Setup | Yes | We trained all DNNs using the SGD optimizer with a learning rate of 0.01 and a momentum of 0.9. No learning rate decay was used. We trained VGG models, AlexNet models, and BERT models for 256 epochs, and trained the DGCNN model for 512 epochs. The batch size was set to 128 for all DNNs on all datasets. |
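The training hyperparameters quoted in the Experiment Setup row can be collected into a single configuration sketch. This is a minimal illustration assembled from the reported values only; the `TRAIN_CONFIG` dict and the `epochs_for` helper are hypothetical names, not code from the paper (whose implementation is unreleased).

```python
# Hedged sketch of the training configuration reported in the paper.
# All values come from the quoted setup; the structure is an assumption.
TRAIN_CONFIG = {
    "optimizer": "SGD",
    "lr": 0.01,            # learning rate, no decay schedule
    "momentum": 0.9,
    "batch_size": 128,     # identical for all DNNs and datasets
    "epochs": {            # per-architecture epoch counts
        "VGG": 256,
        "AlexNet": 256,
        "BERT": 256,
        "DGCNN": 512,
    },
}


def epochs_for(arch: str) -> int:
    """Look up the epoch count for an architecture family (hypothetical helper)."""
    for family, n in TRAIN_CONFIG["epochs"].items():
        if arch.startswith(family):
            return n
    raise KeyError(f"unknown architecture: {arch}")
```

In a PyTorch run these values would typically be passed to `torch.optim.SGD(model.parameters(), lr=0.01, momentum=0.9)` and a `DataLoader(batch_size=128)`, but the paper does not confirm the framework used.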