Towards Robust and Efficient Cloud-Edge Elastic Model Adaptation via Selective Entropy Distillation

Authors: Yaofo Chen, Shuaicheng Niu, Yaowei Wang, Shoukai Xu, Hengjie Song, Mingkui Tan

ICLR 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Extensive experimental results on ImageNet-C and ImageNet-R verify the effectiveness of our CEMA. (Section 3, Experiments)
Researcher Affiliation | Academia | Yaofo Chen (1,2), Shuaicheng Niu (3), Yaowei Wang (2), Shoukai Xu (1), Hengjie Song (1), Mingkui Tan (1,4,5). Affiliations: 1: South China University of Technology; 2: Pengcheng Laboratory; 3: Nanyang Technological University; 4: Key Laboratory of Big Data and Intelligent Robot, Ministry of Education; 5: Pazhou Laboratory. Contact: chenyaofo@gmail.com; mingkuitan@scut.edu.cn
Pseudocode | Yes |
Algorithm 1: Adaptation process on the edge.
Require: Test samples D_test = {x_j}_{j=1}^M, the edge model g_w(·), parameters B, E_max and E_min.
1: for a batch X = {x_b}_{b=1}^B in D_test do
2:   Calculate predictions ŷ for all x ∈ X via f_Θ(·).
3:   Calculate S(x) via Eqn. (4) with E_max and E_min.
4:   Update the threshold E_max via Eqn. (2).
5:   Upload samples {x | S(x) = 1, x ∈ X} to the cloud.
6:   Update the parameters w ← w̃ from the cloud.
7: end for
Ensure: The predictions {ŷ_k}_{k=1}^M for all x ∈ D_test.
Algorithm 2: Adaptation process in the cloud.
Require: Test samples X̂ = {x_n}_{n=1}^N, the foundation model f_θ(·) and edge model g_w(·).
1: Update parameters θ ← θ̃ of the foundation model f_θ(·) via entropy minimization (Eqn. 5), and meanwhile store the uploaded samples in a replay buffer.
2: With both the uploaded samples and samples randomly drawn from the replay buffer, CEMA adapts the edge model g_w(·) with guidance from the foundation model f_θ(·) via the knowledge distillation loss (Eqn. 6).
3: Distribute the parameters w to the edge.
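The edge-side selection step (lines 2–5 of Algorithm 1) can be illustrated with a small NumPy sketch. The paper's exact criterion S(x) is its Eqn. (4), which is not quoted here, so this assumes the simple band rule implied above: upload only samples whose prediction entropy lies strictly between E_min and E_max; all helper names are my own.

```python
import numpy as np

def entropy(probs, eps=1e-12):
    # Shannon entropy per row of a (batch, num_classes) probability matrix
    return -np.sum(probs * np.log(probs + eps), axis=1)

def select_for_upload(probs, e_min, e_max):
    # Assumed S(x): upload only when E_min < H(x) < E_max -- confident
    # samples need no adaptation, extremely uncertain ones are unreliable
    h = entropy(probs)
    return (h > e_min) & (h < e_max)

C = 10                                # number of classes
e_max = 0.4 * np.log(C)               # initial E_max, as quoted in the setup
e_min = 0.02 * np.log(C)              # E_min

batch = np.array([
    np.eye(C)[0],                               # one-hot: entropy ~0, below E_min
    np.full(C, 1.0 / C),                        # uniform: entropy ln C, above E_max
    np.r_[0.9, np.full(C - 1, 0.1 / (C - 1))],  # moderately uncertain
])
mask = select_for_upload(batch, e_min, e_max)
print(mask.tolist())  # [False, False, True]: only the uncertain sample is uploaded
```

Only the moderately uncertain sample falls inside the entropy band, so it alone would be sent to the cloud for adaptation.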
Open Source Code | Yes | The code is available at https://github.com/chenyaofo/CEMA.
Open Datasets | Yes | We evaluate our method and the considered baselines on ImageNet-C (Hendrycks & Dietterich, 2019), a distribution-shift dataset built by applying 4 main corruption categories (i.e., noise, blur, weather, and digital), with a total of 15 diverse corruption types, to the ImageNet validation set. We also verify our CEMA on ImageNet-R (Hendrycks et al., 2021), which contains 30,000 images with various artistic renditions of 200 ImageNet classes collected from Flickr. ImageNet-C: https://github.com/hendrycks/robustness; ImageNet-R: https://github.com/hendrycks/imagenet-r
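For context, ImageNet-C's 15 corruption types fall into the four categories quoted above, each rendered at 5 severity levels; the resulting evaluation grid can be sketched as follows (an illustrative enumeration, not code from the paper):

```python
# ImageNet-C corruption types, grouped by the four categories quoted above
CORRUPTIONS = {
    "noise":   ["gaussian_noise", "shot_noise", "impulse_noise"],
    "blur":    ["defocus_blur", "glass_blur", "motion_blur", "zoom_blur"],
    "weather": ["snow", "frost", "fog", "brightness"],
    "digital": ["contrast", "elastic_transform", "pixelate", "jpeg_compression"],
}

def evaluation_grid():
    # ImageNet-C ships every corruption at 5 severity levels (1 = mild, 5 = severe)
    for category, names in CORRUPTIONS.items():
        for name in names:
            for severity in range(1, 6):
                yield category, name, severity

grid = list(evaluation_grid())
print(len(grid))  # 15 corruptions x 5 severities = 75 evaluation settings
```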
Dataset Splits | No | The paper mentions using the ImageNet validation set as the base for ImageNet-C, but does not provide explicit training/validation/test splits with percentages, sample counts, or citations to predefined splits for its own experimental setup.
Hardware Specification | Yes | Taking ResNet-101 as the foundation model and ResNet-18 as the edge model, our CEMA can run at 220 images/second on an NVIDIA A100 GPU.
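A minimal harness for measuring an images/second figure of the kind quoted here might look as follows; the callable `fn` is a hypothetical stand-in for the edge model's batched forward pass, not anything from the paper:

```python
import time

def throughput(fn, batch, num_iters=50):
    # Images/second of a batched inference callable, after one warm-up call
    fn(batch)
    start = time.perf_counter()
    for _ in range(num_iters):
        fn(batch)
    elapsed = time.perf_counter() - start
    return num_iters * len(batch) / elapsed

# Toy stand-in: doubling each element plays the role of the model forward pass
ips = throughput(lambda xs: [x * 2 for x in xs], list(range(128)))
print(ips > 0)  # True
```

On a real GPU one would also synchronize the device before reading the clock, otherwise asynchronous kernel launches inflate the measured rate.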
Software Dependencies | No | The paper mentions models from torchvision, facebookresearch/deit, and openai/CLIP, but does not provide specific version numbers for software dependencies such as PyTorch, CUDA, or other libraries used in the implementation.
Experiment Setup | Yes | We set the entropy threshold E_max = 0.4 · ln C as its initial value following (Niu et al., 2022) and E_min = 0.02 · ln C, where C denotes the number of classes. The threshold E_max then decreases according to Eqn. (2) with λ = 1.0. For adapting both the foundation and edge models, we use an SGD optimizer with a learning rate of 0.00025 and a momentum of 0.9. For the adaptation of the edge model, we set the batch size to 128, in which 32 samples are newly uploaded and the remaining 96 are randomly sampled from the replay buffer. The hyper-parameters α and β are both set to 3.
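The batch composition described above (32 newly uploaded samples plus 96 replayed ones, for a batch of 128) can be sketched as follows; the buffer capacity and all helper names are assumptions for illustration, not details from the paper:

```python
import random

class ReplayBuffer:
    """FIFO buffer of uploaded samples (capacity 1024 is an assumed value)."""
    def __init__(self, capacity=1024):
        self.capacity = capacity
        self.storage = []

    def add(self, samples):
        self.storage.extend(samples)
        self.storage = self.storage[-self.capacity:]  # keep the most recent

    def sample(self, k):
        return random.sample(self.storage, min(k, len(self.storage)))

def build_edge_batch(new_samples, buffer, batch_size=128, num_new=32):
    # Compose a distillation batch: 32 fresh uploads plus 96 replayed samples,
    # matching the setup quoted above
    buffer.add(new_samples)
    replayed = buffer.sample(batch_size - num_new)
    return new_samples[:num_new] + replayed

buf = ReplayBuffer()
buf.add(list(range(200)))                        # pretend 200 earlier uploads
batch = build_edge_batch(list(range(200, 232)), buf)
print(len(batch))  # 128
```

Mixing replayed samples into every edge update is what lets a batch of 128 be formed even though only 32 samples cross the network per step.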