A Structure-Aware Framework for Learning Device Placements on Computation Graphs
Authors: Shukai Duan, Heng Ping, Nikos Kanakaris, Xiongye Xiao, Panagiotis Kyriakis, Nesreen K. Ahmed, Peiyu Zhang, Guixiang Ma, Mihai Capotă, Shahin Nazarian, Theodore Willke, Paul Bogdan
NeurIPS 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | To evaluate our approach we use the computation graphs created from three popular benchmarks: Inception-V3, ResNet, and BERT. The effectiveness and robustness of the proposed approach are demonstrated through multiple experiments with different benchmark models and a detailed ablation study. |
| Researcher Affiliation | Collaboration | Shukai Duan, Center for Complex Particle Systems, University of Southern California, Los Angeles, USA, shukaidu@usc.edu; Panagiotis Kyriakis, Meta, pkyriakis@meta.com; Nesreen K. Ahmed, Cisco Outshift, nesahmed@cisco.com; Guixiang Ma, Intel Labs, guixiang.ma@intel.com; Mihai Capotă, Intel Labs, mihai.capota@intel.com; Theodore L. Willke, Intel Labs, ted.willke@intel.com |
| Pseudocode | Yes | Algorithm 1 Hierarchical Structure-Aware Device Assignment Graph (HSDAG); Algorithm 2 Graph Parsing Network |
| Open Source Code | Yes | A Code availability The source code is available at https://github.com/hping666/HSDAG. |
| Open Datasets | Yes | To evaluate our approach we use the computation graphs created from three popular benchmarks: (1) Inception-V3: The Inception-V3 architecture [25] is extensively employed for image recognition and visual feature extraction [12]. (2) ResNet: ResNet [10] is a widely-used model for image classification. (3) BERT: BERT [6] is a language model relying on the transformer architecture. |
| Dataset Splits | No | The paper focuses on optimizing device placement using reinforcement learning and does not specify traditional training/validation/test dataset splits for model training. |
| Hardware Specification | Yes | Devices. The available devices for our experiments are the following: (1) CPU: 12th Gen Intel(R) Core(TM) i9-12900K, (2) GPU.0: Intel(R) UHD Graphics 770 (iGPU) and (3) GPU.1: Intel(R) Data Center GPU Flex 170 (dGPU). Our server has 64GB of memory. |
| Software Dependencies | Yes | We run our experiments on real hardware using the OpenVINO toolkit version 2023.3.0. |
| Experiment Setup | Yes | Table 6 (Model Parameters) provides specific values for num_devices, hidden_channel, layer_trans, layer_gnn, layer_parsingnet, gnn_model, dropout_network, dropout_parsing, link_ignore_self_loop, act_final, learning_rate, max_episodes, update_timestep, and K_epochs. |
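For readers reproducing the setup, the hyperparameters named in Table 6 could be organized as a single configuration object. This is a minimal sketch: the parameter *names* are those listed in the paper's Table 6, but the *values* below are illustrative placeholders, not the authors' reported settings, and should be replaced with the values from the paper or the released code.

```python
# Hypothetical configuration sketch for HSDAG training.
# Parameter names follow Table 6 of the paper; every value here is a
# placeholder, NOT the authors' reported setting.
config = {
    "num_devices": 3,              # CPU, iGPU, dGPU in the paper's hardware setup
    "hidden_channel": 64,          # placeholder hidden dimension
    "layer_trans": 2,              # placeholder transformer layer count
    "layer_gnn": 2,                # placeholder GNN layer count
    "layer_parsingnet": 2,         # placeholder graph parsing network depth
    "gnn_model": "gcn",            # placeholder GNN backbone identifier
    "dropout_network": 0.1,        # placeholder dropout rate
    "dropout_parsing": 0.1,        # placeholder dropout rate
    "link_ignore_self_loop": True, # placeholder self-loop handling flag
    "act_final": "softmax",        # placeholder final activation
    "learning_rate": 1e-3,         # placeholder optimizer step size
    "max_episodes": 1000,          # placeholder RL episode budget
    "update_timestep": 10,         # placeholder policy-update interval
    "K_epochs": 4,                 # placeholder PPO-style epochs per update
}

# Basic sanity check: all 14 Table 6 parameters are present.
assert len(config) == 14
```

Keeping the settings in one dictionary like this makes it straightforward to log the exact configuration alongside each run, which is the kind of detail the "Experiment Setup" criterion above is checking for.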