GACT: Activation Compressed Training for Generic Network Architectures

Authors: Xiaoxuan Liu, Lianmin Zheng, Dequan Wang, Yukuo Cen, Weize Chen, Xu Han, Jianfei Chen, Zhiyuan Liu, Jie Tang, Joey Gonzalez, Michael Mahoney, Alvin Cheung

ICML 2022 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Our evaluation shows that GACT can reduce activation memory by up to 8.1×, enabling training with a 24.7× larger batch size on the same GPU. (See the arithmetic sketch after this table.)
Researcher Affiliation | Collaboration | ¹UC Berkeley; ²Dept. of Comp. Sci. & Tech., Institute for AI, Tsinghua-Bosch Joint Center for ML, BNRist Center, State Key Lab for Intell. Tech. & Sys., Tsinghua University; ³ICSI; ⁴LBNL.
Pseudocode | Yes | Algorithm 1: numerical algorithm for computing c_l(h, θ). Require: a gradient evaluation function g(·; θ); a series of L+1 random seeds (r_l), l = 1, …, L+1; any compression scheme b = (b_l), l = 1, …, L. Step 1: ∀l′, seed Q^(l′) with r_{l′}; g_0 ← g(Q_b(h); θ) {first iteration}. Step 2: ∀l′, seed Q^(l′) with r_{l′}, then seed Q^(l) with r_{L+1}; g_1 ← g(Q_b(h); θ) {second iteration, with another seed for Q^(l)}. Return ½‖g_0 − g_1‖² / S(b_l). (A runnable sketch follows the table.)
Open Source Code | No | The paper implements GACT as a PyTorch library and shows a usage example (Figure 2), but it does not provide an explicit statement about, or a link to, public code availability. (An illustrative sketch of the underlying mechanism follows the table.)
Open Datasets | Yes | We conduct experiments on four node classification datasets with standard splits, including Flickr, Reddit, and Yelp from GraphSAINT (Zeng et al., 2019), and ogbn-arxiv from the Open Graph Benchmark (OGB) (Hu et al., 2020).
Dataset Splits | No | We report accuracy on validation sets (Div. indicates divergence) and the compression rate of context tensors (numbers in brackets) for both tasks. While validation sets are mentioned, explicit training/validation/test split percentages or sample counts are not provided in the paper's main text.
Hardware Specification | Yes | We implement the benchmark with PyTorch 1.10 and measure the memory saving and overhead of GACT on an AWS g4dn.4xlarge instance, which has a 16GB NVIDIA T4 GPU and 64GB of CPU memory.
Software Dependencies | Yes | We implement the benchmark with PyTorch 1.10 and measure the memory saving and overhead of GACT on an AWS g4dn.4xlarge instance...
Experiment Setup | Yes | All experiments are run with the same learning rate as full-precision training.
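
To make the headline numbers in the Research Type row concrete, here is a back-of-the-envelope sketch of how an activation compression ratio translates into batch-size headroom. All memory figures below (model state, per-sample activation cost) are assumed placeholders for illustration, not values from the paper; in this toy model the batch scales exactly with the compression ratio, so the paper's larger 24.7× batch gain presumably reflects effects this simple accounting ignores.

```python
# Illustrative arithmetic only (not from the paper): how activation
# compression turns into batch-size headroom. All numbers below are
# assumed placeholders, not GACT's measured values.

GPU_MEM_GB = 16.0          # e.g. the NVIDIA T4 used in the paper
MODEL_STATE_GB = 2.0       # weights + optimizer state (assumed fixed cost)
ACT_GB_PER_SAMPLE = 0.05   # full-precision activation memory per sample (assumed)

def max_batch(compression_ratio: float) -> int:
    """Largest batch that fits when activations shrink by `compression_ratio`."""
    free = GPU_MEM_GB - MODEL_STATE_GB
    return round(free / (ACT_GB_PER_SAMPLE / compression_ratio))

baseline = max_batch(1.0)    # FP32 activations
compressed = max_batch(8.1)  # the paper reports up to 8.1x activation reduction
print(baseline, compressed, compressed / baseline)  # -> 280 2268 8.1
```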
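
The reconstructed Algorithm 1 in the Pseudocode row is straightforward to sketch in code. Below is a minimal reconstruction under stated assumptions, not the authors' implementation: `grad_fn`, the quantizer objects with a `seed()` method, and `S_bl` are hypothetical interfaces standing in for GACT internals. It evaluates the gradient twice with identical quantizer seeds except for layer l's quantizer, so the squared difference of the two gradients isolates that layer's variance contribution.

```python
import torch

# A minimal sketch (our reconstruction, not the authors' code) of
# Algorithm 1: estimate the sensitivity c_l of layer l by running the
# backward pass twice, re-seeding every stochastic quantizer Q^(l')
# identically except Q^(l), which gets a fresh seed the second time.

def estimate_sensitivity(grad_fn, quantizers, seeds, l, S_bl):
    """c_l ~= ||g0 - g1||^2 / (2 * S(b_l)), per Algorithm 1.

    grad_fn:    runs forward + backward with compressed activations and
                returns the flattened parameter gradient (assumed helper).
    quantizers: list of L stochastic quantizers Q^(1..L), each with a
                .seed() method (assumed interface).
    seeds:      L+1 integers (r_1, ..., r_{L+1}).
    S_bl:       variance scale S(b_l) of the compression scheme at layer l.
    """
    # First iteration: seed every quantizer with its own r_{l'}.
    for q, r in zip(quantizers, seeds[:-1]):
        q.seed(r)
    g0 = grad_fn()

    # Second iteration: same seeds everywhere except quantizer l,
    # which is re-seeded with r_{L+1} so only its noise changes.
    for q, r in zip(quantizers, seeds[:-1]):
        q.seed(r)
    quantizers[l].seed(seeds[-1])
    g1 = grad_fn()

    # Only Q^(l)'s randomness differs between the runs, so the squared
    # gradient gap isolates (twice) the variance contributed by layer l.
    return 0.5 * torch.sum((g0 - g1) ** 2).item() / S_bl
```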
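
Since the paper's usage example (Figure 2) is not reproduced in this report, the sketch below illustrates the general mechanism of activation compressed training in plain PyTorch: a custom `autograd.Function` that stores an 8-bit quantized copy of the activation and dequantizes it during the backward pass. This is a hypothetical, simplified stand-in, not the GACT library API.

```python
import torch

# Self-contained illustration of activation compressed training (NOT
# the GACT API): save a quantized copy of the activation in forward,
# dequantize it on demand in backward.

class CompressedReLU(torch.autograd.Function):
    @staticmethod
    def forward(ctx, x):
        y = torch.relu(x)
        # 8-bit affine quantization of the saved activation (illustrative).
        scale = y.abs().amax().clamp_min(1e-8) / 127.0
        ctx.scale = scale
        ctx.save_for_backward((y / scale).round().to(torch.int8))
        return y

    @staticmethod
    def backward(ctx, grad_out):
        (q,) = ctx.saved_tensors
        y = q.to(grad_out.dtype) * ctx.scale  # dequantize on demand
        return grad_out * (y > 0)             # ReLU gradient mask

x = torch.randn(4, 16, requires_grad=True)
CompressedReLU.apply(x).sum().backward()
print(x.grad.shape)  # torch.Size([4, 16])
```

The design point this illustrates: memory is saved at forward time by storing int8 instead of FP32 activations (roughly 4× here), at the cost of a slightly approximate gradient; controlling that approximation across layers is what the sensitivity estimate c_l in Algorithm 1 is for.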