Data-free Neural Representation Compression with Riemannian Neural Dynamics
Authors: Zhengqi Pei, Anran Zhang, Shuhui Wang, Xiangyang Ji, Qingming Huang
ICML 2024 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Using backbones like ResNet and Vision Transformer, we conduct extensive experiments on datasets such as MNIST, CIFAR-100, ImageNet-1k, and COCO object detection. Empirical results show that, under equal compression rates and computational complexity, models compressed with RieM achieve superior inference accuracy compared to existing data-free compression methods. |
| Researcher Affiliation | Academia | 1Institute of Computing Technology, Chinese Academy of Sciences. 2School of Artificial Intelligence, University of Chinese Academy of Sciences. 3Peng Cheng Laboratory. 4Department of Automation, Tsinghua University. 5School of Computer Science and Technology, University of Chinese Academy of Sciences. |
| Pseudocode | No | The paper does not contain any structured pseudocode or clearly labeled algorithm blocks. |
| Open Source Code | Yes | Codes are publicly available: https://github.com/pzqpzq/flat-learning |
| Open Datasets | Yes | We conduct experiments on public datasets including MNIST (Deng, 2012), CIFAR-100 (Krizhevsky et al., 2009), ImageNet-1k (Deng et al., 2009) and COCO (Lin et al., 2014). |
| Dataset Splits | Yes | After this data-free training, we select the top three performing models based on their performance on the development set and average their evaluations on the test set as the final results. In the experiments, the global objective is to maximize the model's inference accuracy on the validation set. (A minimal sketch of this top-3 averaging protocol appears after the table.) |
| Hardware Specification | Yes | All experiments can be conducted on a single GeForce RTX 4090 GPU with 24 GB memory, and the total time required to convert a pre-trained model, e.g., Swin-T, to a trained RieM-based model typically ranges from a few hours to several days, depending on the desired model performance. |
| Software Dependencies | No | The paper does not explicitly mention specific software dependencies with their version numbers required for reproducibility (e.g., specific Python, PyTorch, or CUDA versions). |
| Experiment Setup | Yes | For each pre-trained model to be compressed, we randomly select ten sets of seeds, yielding ten differently initialized RieM-based models, each including trainable neural dynamics states and the Riemannian metrics for each weight matrix. We first adjust hyperparameters, such as the neuronal dimension Dq and the internal dimension Dµ of the Riemannian metrics, ensuring that the resulting RieM-based models have storage model size and computational complexity (measured in FLOPs or BOPs) consistent with various baselines. (A hypothetical seed-sweep sketch follows the table.) |
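
The Dataset Splits row describes a simple selection protocol: rank the trained models by development-set performance, keep the top three, and report their averaged test-set evaluations. The following is a minimal sketch of that arithmetic; the accuracy values are placeholders, not results from the paper.

```python
# Placeholder dev/test accuracies for ten trained models (illustrative only).
dev_acc  = [71.2, 69.8, 72.5, 70.1, 71.9, 68.4, 72.0, 69.5, 70.7, 71.1]
test_acc = [70.8, 69.1, 72.0, 69.9, 71.5, 68.0, 71.7, 69.0, 70.2, 70.9]

# Select the top three models by dev-set accuracy.
top3 = sorted(range(len(dev_acc)), key=lambda i: dev_acc[i], reverse=True)[:3]

# Report the mean of their test-set evaluations as the final result.
final = sum(test_acc[i] for i in top3) / len(top3)
print(f"top-3 by dev acc: {top3}, averaged test acc: {final:.2f}")
```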
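The Experiment Setup row describes a seed sweep: ten independently seeded RieM-based initializations, with the neuronal dimension Dq and the internal metric dimension Dµ tuned so the compressed model's storage stays within a baseline budget. Below is a heavily hedged sketch of that loop, assuming a toy parameterization; `RieMLayer`, its weight-generation rule, and the budget check are illustrative inventions, not the authors' API or method.

```python
import torch
import torch.nn as nn

class RieMLayer(nn.Module):
    """Toy stand-in for a RieM-style layer (hypothetical parameterization)."""

    def __init__(self, d_in, d_out, D_q, D_mu):
        super().__init__()
        # Trainable neural-dynamics states: one D_q-dim state per neuron.
        self.q_in = nn.Parameter(torch.randn(d_in, D_q))
        self.q_out = nn.Parameter(torch.randn(d_out, D_q))
        # Low-rank factor standing in for a learnable Riemannian metric.
        self.metric = nn.Parameter(torch.randn(D_q, D_mu))

    def forward(self, x):
        # Weights are generated from states and the metric rather than stored.
        g = self.metric @ self.metric.T            # (D_q, D_q), PSD metric
        w = self.q_in @ g @ self.q_out.T           # (d_in, d_out)
        return x @ w

def n_params(m):
    return sum(p.numel() for p in m.parameters())

budget = 784 * 10                                  # e.g., a dense 784x10 baseline
models = []
for seed in range(10):                             # ten sets of seeds
    torch.manual_seed(seed)
    m = RieMLayer(784, 10, D_q=8, D_mu=4)
    # Shrink D_q / D_mu if the compressed model exceeds the baseline budget.
    assert n_params(m) <= budget, "reduce D_q / D_mu to match the baseline"
    models.append(m)

print(f"{len(models)} inits, {n_params(models[0])} params each (budget {budget})")
```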