MGit: A Model Versioning and Management System

Authors: Wei Hao, Daniel Mendoza, Rafael Mendes, Deepak Narayanan, Amar Phanishayee, Asaf Cidon, Junfeng Yang

ICML 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We find that MGit works well in practice: MGit is able to reduce model storage footprint by up to 7×. Additionally, in a user study with 20 ML practitioners, users complete a model updating task 3× faster on average with MGit.
Researcher Affiliation | Collaboration | Wei Hao* (1,2), Daniel Mendoza* (1,3), Rafael da Silva (4), Deepak Narayanan (1,5), Amar Phanishayee (4), Asaf Cidon (2), Junfeng Yang (2). (1) Work done while authors were at Microsoft Research; (2) Columbia University; (3) Stanford University; (4) Microsoft Research; (5) NVIDIA.
Pseudocode | Yes | Algorithm 1: Pseudocode for model updating. Algorithm 2: Pseudocode for diff between two models m1 and m2. Algorithm 3: Pseudocode for delta compression. (An illustrative delta-compression sketch follows the table.)
Open Source Code | Yes | We have open sourced our implementation of MGit at https://github.com/msr-fiddle/mgit.
Open Datasets | Yes | G2. We started with a vanilla RoBERTa model trained on the standard masked language modeling (MLM) objective, and then fine-tuned task-specific models for each of the GLUE tasks (Wang et al., 2018). ; G3. We trained a ResNet-50 image classification model (He et al., 2016) on the ImageNet-1K dataset (Deng et al., 2009) using federated learning. ; G1 is a lineage graph created from NLP models downloaded directly from the Hugging Face model hub (Hugging Face, b). (A hedged fine-tuning sketch for G2 follows the table.)
Dataset Splits | No | The paper refers to the use of well-known datasets like GLUE and ImageNet, which typically have predefined splits. However, it does not explicitly state the specific training, validation, and test splits (e.g., percentages or sample counts) used for its own experiments or for the models it trained (e.g., fine-tuning details for G2 models, or federated learning for G3).
Hardware Specification | Yes | Unless otherwise noted, experiments were run on a workstation with 4 NVIDIA RTX A6000 GPUs with CUDA 11.7.
Software Dependencies | No | The paper mentions "CUDA 11.7" as part of the hardware setup. While it discusses software frameworks such as PyTorch, TensorFlow, and Hugging Face Transformers, it does not provide specific version numbers for these or other key software components, as required for a reproducible software description.
Experiment Setup | Yes | By default, we set ϵ = 10^-4. ; We trained a ResNet-50 image classification model ... using federated learning. Each worker operates on a data silo with a subset of the 1000 labels in the ImageNet-1K dataset. We ran experiments with 40 workers (data silos), and 10 rounds of federated averaging. In each round, 5 of 40 workers are randomly sampled. ; For each model architecture, we create models with progressively greater sparsities in a two-step process. (A hedged federated-averaging sketch follows the table.)
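
The Pseudocode row above names Algorithm 3 (delta compression). The following is a minimal sketch, assuming a simple scheme in which a child checkpoint is stored as a sparse difference over its parent and differences with magnitude below a tolerance eps are dropped (the quoted default is ϵ = 10^-4). It is not the paper's actual algorithm; the function names and toy state dicts are hypothetical.

# Minimal sketch of threshold-based delta compression between two checkpoints.
# NOT MGit's Algorithm 3; it only illustrates storing a child model as a
# sparse delta over its parent, dropping differences with |diff| <= eps.
import torch


def compress_delta(parent_state, child_state, eps=1e-4):
    """Record only the entries of (child - parent) whose magnitude exceeds eps."""
    delta = {}
    for name, child_param in child_state.items():
        diff = (child_param - parent_state[name]).flatten()
        idx = torch.nonzero(diff.abs() > eps, as_tuple=False).flatten()
        delta[name] = (idx, diff[idx], child_param.shape)
    return delta


def restore_child(parent_state, delta):
    """Approximately reconstruct the child checkpoint from parent + sparse delta."""
    child_state = {}
    for name, (idx, values, shape) in delta.items():
        flat = parent_state[name].flatten().clone()
        flat[idx] += values
        child_state[name] = flat.reshape(shape)
    return child_state


# Toy usage: a "fine-tuned" child that differs from its parent in two entries.
parent = {"w": torch.zeros(3, 3), "b": torch.zeros(3)}
child = {"w": parent["w"].clone(), "b": parent["b"].clone()}
child["w"][0, 0] = 0.5      # large change: kept in the delta
child["b"][1] = 5e-5        # below eps: dropped by compression
rebuilt = restore_child(parent, compress_delta(parent, child))
assert torch.allclose(rebuilt["w"], child["w"])
assert abs(rebuilt["b"][1].item() - child["b"][1].item()) <= 1e-4

The reconstruction error per dropped parameter is bounded by eps, which is why a small tolerance can shrink storage without materially changing model behavior.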
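
The Open Datasets row describes how the G2 lineage graph is built: a shared RoBERTa base with one fine-tuned child model per GLUE task. Below is a hedged sketch of that pattern using Hugging Face Transformers and Datasets; the roberta-base checkpoint, the task subset, the output paths, and all hyperparameters are illustrative assumptions, not the paper's configuration.

# Sketch (with assumed hyperparameters) of fine-tuning one child per GLUE task
# from a single shared base model, as in the G2 lineage graph.
from datasets import load_dataset
from transformers import (AutoModelForSequenceClassification, AutoTokenizer,
                          DataCollatorWithPadding, Trainer, TrainingArguments)

BASE = "roberta-base"                      # shared parent node of the lineage graph
TASK_COLUMNS = {"sst2": ("sentence", None),  # GLUE task -> input column(s)
                "mrpc": ("sentence1", "sentence2"),
                "rte": ("sentence1", "sentence2")}

tokenizer = AutoTokenizer.from_pretrained(BASE)

for task, (col1, col2) in TASK_COLUMNS.items():
    raw = load_dataset("glue", task)

    def tokenize(batch):
        texts = (batch[col1],) if col2 is None else (batch[col1], batch[col2])
        return tokenizer(*texts, truncation=True)

    encoded = raw.map(tokenize, batched=True)
    num_labels = raw["train"].features["label"].num_classes
    model = AutoModelForSequenceClassification.from_pretrained(BASE, num_labels=num_labels)

    trainer = Trainer(
        model=model,
        args=TrainingArguments(output_dir=f"g2-{task}", num_train_epochs=3,
                               per_device_train_batch_size=16),
        train_dataset=encoded["train"],
        data_collator=DataCollatorWithPadding(tokenizer),
    )
    trainer.train()
    trainer.save_model(f"g2-{task}")       # one child checkpoint per GLUE task

Each saved child shares most of its parameters with the common base, which is the structure a lineage-aware store can exploit for delta compression.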
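
The Experiment Setup row quotes the federated-learning configuration used for G3: 40 data silos, 10 rounds, and 5 workers sampled per round. The sketch below only illustrates that sampling-and-averaging schedule; local_update is a hypothetical stand-in, and no ResNet-50 or ImageNet-1K training is performed.

# Illustrative federated-averaging schedule: 40 workers, 10 rounds,
# 5 workers sampled uniformly at random per round.
import random
import torch

NUM_WORKERS, NUM_ROUNDS, WORKERS_PER_ROUND = 40, 10, 5


def local_update(global_params, worker_id):
    # Placeholder for one worker's local training on its data silo:
    # here it just perturbs the global parameters deterministically.
    gen = torch.Generator().manual_seed(worker_id)
    return {name: p + 0.01 * torch.randn(p.shape, generator=gen)
            for name, p in global_params.items()}


def federated_averaging(global_params):
    for _ in range(NUM_ROUNDS):
        sampled = random.sample(range(NUM_WORKERS), WORKERS_PER_ROUND)
        updates = [local_update(global_params, w) for w in sampled]
        # FedAvg step: element-wise mean of the sampled workers' parameters.
        global_params = {name: torch.stack([u[name] for u in updates]).mean(dim=0)
                         for name in global_params}
    return global_params


final_params = federated_averaging({"w": torch.zeros(4, 4), "b": torch.zeros(4)})

Each round's averaged model is a natural versioned node in a lineage graph, one per federated-averaging round.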