MLink: Linking Black-Box Models for Collaborative Multi-Model Inference
Authors: Mu Yuan, Lan Zhang, Xiang-Yang Li (pp. 9475-9483)
AAAI 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We evaluated MLink on a multi-modal dataset with seven different ML models and two real-world video analytics systems with six ML models and 3,264 hours of video. Experimental results show that our proposed model links can be effectively built among various black-box models. Under the budget of GPU memory, MLink can save 66.7% inference computations while preserving 94% inference accuracy, which outperforms multi-task learning, deep reinforcement learning-based scheduler and frame filtering baselines. |
| Researcher Affiliation | Academia | Mu Yuan, Lan Zhang*, Xiang-Yang Li, University of Science and Technology of China; ym0813@mail.ustc.edu.cn, zhanglan@ustc.edu.cn, xiangyangli@ustc.edu.cn |
| Pseudocode | Yes | Algorithm 1 shows the workflow of integrating MLinks with multi-model inference workloads. [A hedged reconstruction of this workflow is sketched after the table.] |
| Open Source Code | Yes | We implemented our designs in Python based on TensorFlow 2.0 (TensorFlow 2021) as a pluggable middleware for inference systems (https://github.com/yuanmu97/MLink). |
| Open Datasets | Yes | Multi-modal dataset and ML models. We used the Hollywood2 video dataset (Marszalek, Laptev, and Schmid 2009). |
| Dataset Splits | No | The original training and test splits in the Hollywood2 dataset contain 823 video clips (around 48%) and 884 video clips, respectively. To test the performance of the model linking with different sizes of training data, we further randomly sampled four subsets of training data with 1%, 5%, 10%, 20% ratios, with respect to the total dataset. [A sampling sketch follows the table.] |
| Hardware Specification | Yes | We use an edge server with one NVIDIA 2080Ti GPU. ... We used five servers, each with four NVIDIA T4 GPUs. ... Intel Xeon Silver 4214R, Intel Core i7-6700, Intel Core i5-8259U, Qualcomm Kryo 485 |
| Software Dependencies | Yes | We implemented our designs in Python based on TensorFlow 2.0 (TensorFlow 2021) as a pluggable middleware for inference systems... We tested the integration on programs implemented with TensorFlow (TensorFlow 2021), PyTorch (PyTorch 2021) and MindSpore (MindSpore 2021). |
| Experiment Setup | Yes | We trained pairwise model links with the RMSprop (Tieleman and Hinton 2012) optimizer and the same hyper-parameters (0.01 learning rate, 100 epochs, 32 batch size). [A training sketch follows the table.] |
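
The subset sampling quoted in the Dataset Splits row is simple to replicate. Below is a minimal sketch, assuming the training clips are available as a Python list; `sample_training_subset`, the fixed seed, and all other names are illustrative, and the paper does not state how its random sampling was seeded.

```python
import random

# Original Hollywood2 split sizes quoted above: 823 train (~48%) / 884 test.
TOTAL_CLIPS = 823 + 884

def sample_training_subset(train_clips, ratio, seed=0):
    """Sample `ratio` of the TOTAL dataset size from the training split.

    Note: the quoted ratios (1%, 5%, 10%, 20%) are taken with respect to
    the total dataset, not the training split alone.
    """
    k = int(ratio * TOTAL_CLIPS)
    return random.Random(seed).sample(train_clips, k)

# e.g.: subsets = {r: sample_training_subset(train_clips, r)
#                  for r in (0.01, 0.05, 0.10, 0.20)}
```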
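The Experiment Setup row pins down the optimizer and hyper-parameters used to train pairwise model links (RMSprop, learning rate 0.01, 100 epochs, batch size 32). Here is a minimal TensorFlow 2 sketch under those settings; the two-layer MLP link architecture and the MSE loss are assumptions, not the paper's documented link design.

```python
import tensorflow as tf

def build_link(src_dim, dst_dim):
    """A small MLP mapping source-model outputs to target-model outputs.

    ASSUMPTION: the paper's actual link architecture is not quoted here.
    """
    return tf.keras.Sequential([
        tf.keras.layers.Dense(128, activation="relu", input_shape=(src_dim,)),
        tf.keras.layers.Dense(dst_dim),
    ])

def train_link(src_outputs, dst_outputs):
    """Train one pairwise link on paired black-box model outputs (numpy arrays)."""
    link = build_link(src_outputs.shape[1], dst_outputs.shape[1])
    link.compile(
        optimizer=tf.keras.optimizers.RMSprop(learning_rate=0.01),  # quoted lr
        loss="mse",  # ASSUMPTION: loss function is not quoted in this summary
    )
    link.fit(src_outputs, dst_outputs, epochs=100, batch_size=32, verbose=0)
    return link
```

In MLink's black-box setting, `src_outputs` and `dst_outputs` would be collected by running both models over the same inputs, so the link never needs access to either model's internals.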
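Algorithm 1 itself is not reproduced in this summary, so the following is only a hedged reconstruction of the workflow it names: under a GPU-memory budget, execute a subset of models and answer the remaining models through their links. The greedy selection rule and every identifier (`serve`, `links`, `memory_budget`) are assumptions rather than the paper's exact procedure.

```python
def serve(models, links, memory_budget, inputs):
    """models: objects with .name, .memory, .predict();
    links: {target_name: (source_name, link_model)}."""
    # Greedily keep models within the GPU-memory budget.
    # ASSUMPTION: the paper's selection criterion may differ.
    active, used = [], 0
    for m in sorted(models, key=lambda m: m.memory):
        if used + m.memory <= memory_budget:
            active.append(m)
            used += m.memory

    # Run the models that fit in memory.
    outputs = {m.name: m.predict(inputs) for m in active}

    # Approximate each remaining model via a link from an active model's
    # output; this assumes every inactive model has such a link available.
    for m in models:
        if m.name not in outputs:
            src_name, link = links[m.name]
            outputs[m.name] = link.predict(outputs[src_name])
    return outputs
```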