GACT: Activation Compressed Training for Generic Network Architectures
Authors: Xiaoxuan Liu, Lianmin Zheng, Dequan Wang, Yukuo Cen, Weize Chen, Xu Han, Jianfei Chen, Zhiyuan Liu, Jie Tang, Joey Gonzalez, Michael Mahoney, Alvin Cheung
ICML 2022 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Our evaluation shows that GACT can reduce activation memory by up to 8.1×, enabling training with a 24.7× larger batch size on the same GPU. |
| Researcher Affiliation | Collaboration | ¹UC Berkeley, ²Dept. of Comp. Sci. & Tech., Institute for AI, Tsinghua-Bosch Joint Center for ML, BNRist Center, State Key Lab for Intell. Tech. & Sys., Tsinghua University, ³ICSI, ⁴LBNL. |
| Pseudocode | Yes | Algorithm 1 (numerical algorithm for computing c_l(h, θ)). Require: a gradient evaluation function g(·; θ); a series of L+1 random seeds (r_l)_{l=1}^{L+1}; any compression scheme b = (b_l)_{l=1}^{L}. ∀l′, seed Q^(l′) with r_{l′}; g_0 ← g(Q_b(h); θ) {first iteration}. ∀l′ ≠ l, seed Q^(l′) with r_{l′}; seed Q^(l) with r_{L+1}; g_1 ← g(Q_b(h); θ) {second iteration, with another seed}. Return ½‖g_0 − g_1‖² / S(b_l). A runnable sketch of this estimator follows the table. |
| Open Source Code | No | The paper implements GACT as a PyTorch library and shows a usage example (Figure 2), but does not provide an explicit statement or link for public code availability. A hypothetical sketch of the general usage pattern follows the table. |
| Open Datasets | Yes | We conduct experiments on four node classification datasets with standard splits, including Flickr, Reddit, and Yelp from GraphSAINT (Zeng et al., 2019), and ogbn-arxiv from Open Graph Benchmark (OGB) (Hu et al., 2020). |
| Dataset Splits | No | We report accuracy on validation sets (Div. indicates diverge) and the compression rate of context tensors (numbers in brackets) for both tasks. While validation sets are mentioned, explicit training/validation/test split percentages or sample counts are not provided in the paper's main text. |
| Hardware Specification | Yes | We implement the benchmark with PyTorch 1.10 and measure the memory saving and overhead of GACT on an AWS g4dn.4xlarge instance, which has a 16GB NVIDIA T4 GPU and 64GB CPU memory. |
| Software Dependencies | Yes | We implement the benchmark with PyTorch 1.10 and measure the memory saving and overhead of GACT on an AWS g4dn.4xlarge instance... |
| Experiment Setup | Yes | All experiments are run with the same learning rate as full-precision training. |
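
Algorithm 1 above is a two-seed finite-difference estimate of layer l's gradient sensitivity to compression noise: both gradient evaluations share the same quantization seeds everywhere except layer l, so their difference isolates layer l's contribution. Below is a minimal, hypothetical PyTorch sketch of that estimator; the callables `grad_fn` and `quantize` and the divisor `S` stand in for the paper's g(·; θ), Q_b, and S(b_l), and are assumptions, not GACT's actual implementation.

```python
import torch

def sensitivity(grad_fn, quantize, h, theta, l, num_layers, bits, S):
    """Two-seed estimate of c_l(h, theta): how much compressing layer l's
    activation with b_l bits perturbs the parameter gradient.

    grad_fn(h_compressed, theta) -> flattened gradient tensor (the paper's g(.; theta))
    quantize(h, bits, seeds)     -> activations compressed with per-layer seeds (Q_b)
    S                            -> variance scale of layer l's quantizer, S(b_l)
    """
    # Draw L + 1 random seeds r_1 .. r_{L+1}.
    seeds = torch.randint(0, 2**31 - 1, (num_layers + 1,)).tolist()

    # First iteration: every layer l' is quantized with seed r_{l'}.
    g0 = grad_fn(quantize(h, bits, seeds[:num_layers]), theta)

    # Second iteration: identical seeds except layer l, which gets r_{L+1},
    # so the two gradients differ only through layer l's quantization noise.
    seeds2 = list(seeds[:num_layers])
    seeds2[l] = seeds[num_layers]
    g1 = grad_fn(quantize(h, bits, seeds2), theta)

    # c_l = (1/2) * ||g0 - g1||^2 / S(b_l)
    return 0.5 * (g0 - g1).pow(2).sum() / S
```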
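
The usage example in Figure 2 wraps an unmodified model so that activations are compressed when saved for backward and decompressed when the backward pass needs them. The sketch below shows that general pattern using `torch.autograd.graph.saved_tensors_hooks`, a real API introduced in PyTorch 1.10 (the version the paper benchmarks with); the per-tensor 8-bit affine `compress`/`decompress` pair is a simplified stand-in for GACT's adaptive compression scheme, not its library API.

```python
import torch
import torch.nn as nn

def compress(t):
    # Simplified per-tensor 8-bit affine quantization; a stand-in for GACT's
    # adaptive per-layer compression (assumption, not the real scheme).
    # A real implementation would also distinguish activations from parameters.
    t = t.detach()
    if not t.is_floating_point():
        return t, None, None  # leave integer tensors (e.g. indices) untouched
    lo, hi = t.min(), t.max()
    scale = (hi - lo).clamp(min=1e-8) / 255.0
    q = ((t - lo) / scale).round().to(torch.uint8)
    return q, lo, scale

def decompress(packed):
    q, lo, scale = packed
    if lo is None:
        return q
    return q.to(torch.float32) * scale + lo

model = nn.Sequential(nn.Linear(512, 512), nn.ReLU(), nn.Linear(512, 10))
x = torch.randn(64, 512)

# Activations saved for backward are stored quantized, trading extra
# compute for activation memory.
with torch.autograd.graph.saved_tensors_hooks(compress, decompress):
    loss = model(x).sum()
loss.backward()  # hooks dequantize on demand during the backward pass
```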