A Structure-Aware Framework for Learning Device Placements on Computation Graphs

Authors: Shukai Duan, Heng Ping, Nikos Kanakaris, Xiongye Xiao, Panagiotis Kyriakis, Nesreen K. Ahmed, Peiyu Zhang, Guixiang Ma, Mihai Capotă, Shahin Nazarian, Theodore Willke, Paul Bogdan

NeurIPS 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | To evaluate our approach we use the computation graphs created from three popular benchmarks: Inception-V3, ResNet, and BERT. The effectiveness and robustness of the proposed approach are demonstrated through multiple experiments with different benchmark models and a detailed ablation study.
Researcher Affiliation | Collaboration | Shukai Duan, Center for Complex Particle Systems, University of Southern California, Los Angeles, USA, shukaidu@usc.edu; Panagiotis Kyriakis, Meta, pkyriakis@meta.com; Nesreen K. Ahmed, Cisco Outshift, nesahmed@cisco.com; Guixiang Ma, Intel Labs, guixiang.ma@intel.com; Mihai Capotă, Intel Labs, mihai.capota@intel.com; Theodore L. Willke, Intel Labs, ted.willke@intel.com
Pseudocode | Yes | Algorithm 1: Hierarchical Structure-Aware Device Assignment Graph (HSDAG); Algorithm 2: Graph Parsing Network
Open Source Code | Yes | Appendix A (Code availability): The source code is available at https://github.com/hping666/HSDAG.
Open Datasets | Yes | To evaluate our approach we use the computation graphs created from three popular benchmarks: (1) Inception-V3: The Inception-V3 architecture [25] is extensively employed for image recognition and visual feature extraction [12]. (2) ResNet: ResNet [10] is a widely-used model for image classification. (3) BERT: BERT [6] is a language model relying on the transformer architecture.
Dataset Splits | No | The paper focuses on optimizing device placement using reinforcement learning and does not specify traditional training/validation/test dataset splits for model training.
Hardware Specification | Yes | Devices. The available devices for our experiments are the following: (1) CPU: 12th Gen Intel(R) Core(TM) i9-12900K, (2) GPU.0: Intel(R) UHD Graphics 770 (iGPU), and (3) GPU.1: Intel(R) Data Center GPU Flex 170 (dGPU). Our server has 64 GB of memory.
Software Dependencies | Yes | We run our experiments on real hardware using the OpenVINO toolkit, version 2023.3.0.
Experiment Setup | Yes | Table 6 (Model Parameters) provides specific values for num_devices, hidden_channel, layer_trans, layer_gnn, layer_parsingnet, gnn_model, dropout_network, dropout_parsing, link_ignore_self_loop, act_final, learning_rate, max_episodes, update_timestep, and K_epochs.
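To make the experiment-setup row concrete, the parameter list from Table 6 could be captured as a single config dict that a reproduction script validates before training. This is a minimal sketch: the parameter names come from the paper, but every value below is an illustrative placeholder, not the authors' reported setting, and the `validate` helper is hypothetical.

```python
# Hypothetical config mirroring the parameter NAMES in Table 6 of the paper.
# All VALUES are illustrative placeholders, not the reported settings.
config = {
    "num_devices": 3,            # placeholder; the paper's testbed has CPU, iGPU, dGPU
    "hidden_channel": 64,
    "layer_trans": 1,
    "layer_gnn": 2,
    "layer_parsingnet": 2,
    "gnn_model": "GCN",
    "dropout_network": 0.1,
    "dropout_parsing": 0.1,
    "link_ignore_self_loop": True,
    "act_final": False,
    "learning_rate": 1e-3,
    "max_episodes": 1000,
    "update_timestep": 20,
    "K_epochs": 4,
}

def validate(cfg):
    """Raise KeyError unless every parameter listed in Table 6 is present."""
    required = {
        "num_devices", "hidden_channel", "layer_trans", "layer_gnn",
        "layer_parsingnet", "gnn_model", "dropout_network", "dropout_parsing",
        "link_ignore_self_loop", "act_final", "learning_rate",
        "max_episodes", "update_timestep", "K_epochs",
    }
    missing = required - cfg.keys()
    if missing:
        raise KeyError(f"missing parameters: {sorted(missing)}")
    return True

validate(config)
```

A reproduction attempt would replace the placeholder values with the ones actually printed in Table 6 of the paper before running.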