LLMCarbon: Modeling the End-to-End Carbon Footprint of Large Language Models
Authors: Ahmad Faiz, Sotaro Kaneda, Ruhan Wang, Rita Chukwunyere Osi, Prateek Sharma, Fan Chen, Lei Jiang
ICLR 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | When validated against Google's published LLM carbon footprints, the results generated by LLMCarbon exhibit differences of only 8.2%, and thus are more accurate than those of mlco2. We employ LLMCarbon to compute the operational footprints of five LLMs, including dense and MoE architectures, developed by Google, OpenAI, and Meta during their training phases. We also compute the operational footprint of another LLM, Noor (Lakim et al., 2022), during its storage phase. To validate the predictions of LLMCarbon, we compare our calculated operational footprint values with the previously published data for these LLMs. |
| Researcher Affiliation | Academia | Ahmad Faiz, Sotaro Kaneda, Ruhan Wang, Rita Osi, Prateek Sharma, Fan Chen, Lei Jiang — Indiana University; Jackson State University. {afaiz,skaneda,ruhwang,prateeks,fc7,jiang60}@iu.edu, j00967039@students.jsums.edu |
| Pseudocode | No | No pseudocode or algorithm blocks are explicitly provided or labeled in the paper. |
| Open Source Code | Yes | The source code is released at https://github.com/SotaroKaneda/MLCarbon. |
| Open Datasets | Yes | The inputs on the parameters of LLMs, hardware, and data centers, and the actual training operational carbon footprint values of these LLMs were collected from (Patterson et al., 2021) and (Wu et al., 2022). |
| Dataset Splits | No | The paper validates LLMCarbon against published operational and embodied carbon footprint data from other research, such as Google's published LLM carbon footprints and Meta's XLM, rather than defining explicit training, validation, and test splits for its own experimental setup. |
| Hardware Specification | Yes | Table 4 (validation on the operational carbon footprints of various LLMs): T5 — TPUv3, device TDP 450 W, avg. system power 310 W, peak 123 TFLOPs/s, achieved 45.6 TFLOPs/s, 37% hardware efficiency, 512 devices; GPT-3 — V100, 300 W, 330 W, 125, 24.6, 19.7%, 10K devices; GShard — TPUv3, 450 W, 288 W, 123, 48, 39%, 1K devices; Switch — TPUv3, 450 W, 245 W, 123, 34.4, 28%, 1K devices; XLM — V100, 300 W, 342 W, 125, 26.5, 21.2%, 512 devices. Table 5 (embodied carbon footprint validation against Meta's XLM; columns: hardware, number, kg CO2eq per chip, fraction of lifetime, t CO2eq embodied): GPU — 512, 9.78, 1.12%, 0.056; CPU — 64, 1.47, 1.12%, 0.0018; SSD — 64, 576, 1.12%, 0.412; DRAM — 64, 102.4, 1.12%, 0.073; others — 64, 148.2, 1.12%, 0.096; predicted sum 0.64 t CO2eq vs. actual 0.66 t CO2eq (3.05% difference). |
| Software Dependencies | No | No specific software dependencies with version numbers (e.g., Python, PyTorch, or specific library versions) are mentioned in the paper. |
| Experiment Setup | Yes | Table 4 presents the validation results of LLMCarbon's predictions on the training operational carbon footprint. To validate the training operational carbon footprint estimations yielded by LLMCarbon, we selected five LLMs: T5 (Raffel et al., 2020), GPT-3 (Brown et al., 2020), GShard (Lepikhin et al., 2021), Switch (Fedus et al., 2022), and XLM (Conneau et al., 2020). We list the inputs and outputs of LLMCarbon in Table 4. Within the table, device TDP (W) indicates the chip thermal design power of a computing device, while avg. system power (W) conveys the average system power per computing device, including TPU/GPU, host CPU, DRAM, and network interface. The inputs on the parameters of LLMs, hardware, and data centers, and the actual training operational carbon footprint values of these LLMs were collected from (Patterson et al., 2021) and (Wu et al., 2022). |
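The column relationships in the quoted Tables 4 and 5 can be checked with two small formulas: hardware efficiency is achieved throughput divided by peak throughput, and the per-component embodied footprint is device count times per-chip CO2eq times the fraction of hardware lifetime consumed by the run. The sketch below reproduces that arithmetic; the function names are ours for illustration, not identifiers from the LLMCarbon codebase.

```python
def hardware_efficiency(achieved_tflops: float, peak_tflops: float) -> float:
    """Fraction of peak throughput actually achieved (Table 4 columns)."""
    return achieved_tflops / peak_tflops


def embodied_co2_t(device_count: int, kg_co2eq_per_chip: float,
                   lifetime_fraction: float) -> float:
    """Embodied footprint in t CO2eq attributed to a run (Table 5 columns)."""
    return device_count * kg_co2eq_per_chip * lifetime_fraction / 1000.0


# T5 on TPUv3: 45.6 achieved vs. 123 peak TFLOPs/s -> ~37% (matches Table 4)
eff_t5 = hardware_efficiency(45.6, 123)

# Meta XLM GPUs: 512 devices, 9.78 kg CO2eq per chip, 1.12% of lifetime
# -> ~0.056 t CO2eq (matches the GPU row of Table 5)
gpu_emb = embodied_co2_t(512, 9.78, 0.0112)

print(f"T5 hardware efficiency: {eff_t5:.1%}")
print(f"XLM GPU embodied footprint: {gpu_emb:.3f} t CO2eq")
```

The same two formulas reproduce the other rows of the quoted tables to within rounding of the published figures.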