Data-free Neural Representation Compression with Riemannian Neural Dynamics

Authors: Zhengqi Pei, Anran Zhang, Shuhui Wang, Xiangyang Ji, Qingming Huang

ICML 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Using backbones like ResNet and Vision Transformer, we conduct extensive experiments on datasets such as MNIST, CIFAR-100, ImageNet-1k, and COCO object detection. Empirical results show that, under equal compression rates and computational complexity, models compressed with RieM achieve superior inference accuracy compared to existing data-free compression methods.
Researcher Affiliation | Academia | 1) Institute of Computing Technology, Chinese Academy of Sciences; 2) School of Artificial Intelligence, University of Chinese Academy of Sciences; 3) Peng Cheng Laboratory; 4) Department of Automation, Tsinghua University; 5) School of Computer Science and Technology, University of Chinese Academy of Sciences.
Pseudocode | No | The paper does not contain any structured pseudocode or clearly labeled algorithm blocks.
Open Source Code | Yes | Codes are publicly available at https://github.com/pzqpzq/flat-learning.
Open Datasets | Yes | We conduct experiments on public datasets including MNIST (Deng, 2012), CIFAR-100 (Krizhevsky et al., 2009), ImageNet-1k (Deng et al., 2009), and COCO (Lin et al., 2014).
Dataset Splits | Yes | After this data-free training, we select the top three performing models based on their performance on the development set and average their evaluations on the test set as the final results. In the experiments, the global objective is to maximize the model's inference accuracy on the validation set. (A sketch of this selection protocol follows the table.)
Hardware Specification | Yes | All experiments can be conducted on a single GeForce RTX 4090 GPU with 24GB of memory, and the total time required to convert a pre-trained model, e.g., Swin-T, to a trained RieM-based model typically ranges from a few hours to several days, depending on the desired model performance.
Software Dependencies | No | The paper does not explicitly mention specific software dependencies with version numbers required for reproducibility (e.g., specific Python, PyTorch, or CUDA versions).
Experiment Setup | Yes | For each pre-trained model to be compressed, we randomly select ten sets of seeds, yielding ten different randomly initialized RieM-based models, including trainable neural dynamics states and the Riemannian metrics for each weight matrix. We first adjust hyperparameters, such as the neuronal dimension D_q and the internal dimension D_µ of the Riemannian metrics, ensuring that the resulting RieM-based models have a storage size and computational complexity (measured in FLOPs or BOPs) consistent with the various baselines.
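
To make the Experiment Setup row more concrete, below is a minimal Python sketch of how one might enumerate seed and dimension configurations whose storage size and FLOPs stay close to a baseline budget. The RieMConfig fields, the placeholder cost model in estimated_cost, the candidate grids, and the 5% tolerance are illustrative assumptions, not the paper's actual formulas or code.

```python
# Minimal sketch (not the authors' code): ten random seeds yield ten independently
# initialized candidate configurations, and D_q / D_mu are adjusted so that parameter
# count and FLOPs roughly match a given baseline budget.
from dataclasses import dataclass
import random

@dataclass
class RieMConfig:
    seed: int
    d_q: int   # neuronal dimension D_q (hyperparameter name assumed here)
    d_mu: int  # internal dimension D_mu of the Riemannian metrics

def estimated_cost(cfg, n_neurons):
    # Placeholder cost model: storage and compute grow with D_q and D_mu.
    params = n_neurons * (cfg.d_q + cfg.d_mu * cfg.d_mu)
    flops = 2 * params
    return params, flops

def candidate_configs(baseline_params, baseline_flops, n_neurons, tol=0.05):
    seeds = random.Random(0).sample(range(10_000), 10)  # ten seeds per pre-trained model
    for seed in seeds:
        for d_q in (8, 16, 32, 64):        # illustrative grid for D_q
            for d_mu in (2, 4, 8):         # illustrative grid for D_mu
                cfg = RieMConfig(seed, d_q, d_mu)
                params, flops = estimated_cost(cfg, n_neurons)
                if (abs(params - baseline_params) / baseline_params <= tol and
                        abs(flops - baseline_flops) / baseline_flops <= tol):
                    yield cfg              # configuration matches the baseline budget
```

Each accepted configuration would then initialize one RieM-based model for data-free training under the protocol quoted in the table.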
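
The Dataset Splits row describes selecting the top three models on the development set and averaging their test-set evaluations as the final result. Below is a minimal sketch of that reporting step, assuming precomputed per-model accuracies; the function name and inputs are hypothetical and not taken from the paper or its repository.

```python
# Minimal sketch of the selection-and-averaging protocol: rank candidates by
# development-set accuracy, keep the top three, and report their mean test accuracy.
def report_final_result(candidates, dev_accuracy, test_accuracy, k=3):
    """candidates: list of model ids; dev_accuracy/test_accuracy: dicts id -> float."""
    ranked = sorted(candidates, key=lambda m: dev_accuracy[m], reverse=True)
    top_k = ranked[:k]                                        # best models on the dev set
    final = sum(test_accuracy[m] for m in top_k) / len(top_k) # averaged test accuracy
    return top_k, final

# Example usage with made-up numbers:
# top, acc = report_final_result(
#     ["s0", "s1", "s2", "s3"],
#     {"s0": 0.71, "s1": 0.74, "s2": 0.69, "s3": 0.73},
#     {"s0": 0.70, "s1": 0.72, "s2": 0.68, "s3": 0.71})
```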