Transfer Learning for Diffusion Models
Authors: Yidong Ouyang, Liyan Xie, Hongyuan Zha, Guang Cheng
NeurIPS 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We validate the effectiveness of TGDP on both simulated and real-world datasets. In this section, we present empirical evidence demonstrating the efficacy of the proposed Transfer Guided Diffusion Process (TGDP) on limited data from a target domain. In Section 4.1, we conduct proof-of-concept experiments using a Gaussian mixture model to showcase that the guidance network of TGDP can successfully steer the pre-trained diffusion model toward the target domain. In Section 4.2, we illustrate the effectiveness of TGDP using a real-world electrocardiogram (ECG) dataset. |
| Researcher Affiliation | Academia | (1) Department of Statistics and Data Science, University of California, Los Angeles; (2) Department of Industrial and Systems Engineering, University of Minnesota Twin Cities; (3) School of Data Science, Chinese University of Hong Kong, Shenzhen |
| Pseudocode | Yes | TGDP adopts Algorithms 1 and 2 for training a domain classifier and Algorithms 3 and 4 for training the guidance network. |
| Open Source Code | Yes | We provide the code for all of the experiments together with clear instructions. |
| Open Datasets | Yes | We follow the setup of existing benchmarks on biomedical signal processing [37] that regard the PTB-XL dataset [41] as the source domain and the ICBEB2018 dataset [27] as the target domain. PTB-XL (CC-BY 4.0): https://physionet.org/content/ptb-xl/1.0.3/ ; ICBEB2018 (CC0: Public Domain): https://www.kaggle.com/datasets/bjoernjostein/china-12lead-ecg-challenge-database |
| Dataset Splits | No | The paper describes the number of samples drawn for source (m=10000) and target (n=10, 100, 1000) domains in simulations, and states "We randomly select 10% samples as limited target distribution by stratified sampling" for the ECG dataset, but does not explicitly detail the train/validation/test splits within these datasets for the experiments. |
| Hardware Specification | Yes | We utilized a computing cluster equipped with 6 NVIDIA GeForce 3090 GPUs with 24268MiB of memory and Intel(R) Xeon(R) Platinum 8352Y CPUs @ 2.20GHz. |
| Software Dependencies | No | The paper mentions using specific models like SSSM-ECG [2] and the Adam optimizer, but does not provide specific version numbers for software dependencies such as programming languages (e.g., Python), deep learning frameworks (e.g., PyTorch, TensorFlow), or other libraries. |
| Experiment Setup | Yes | We train the diffusion model on data from the source domain for 100 epochs using the Adam optimizer with a learning rate of 1e-4 and a batch size of 4096. The guidance network is a 4-layer MLP with 512 hidden units and a SiLU activation function. We train the guidance network for 20 epochs for our TGDP, and train a vanilla diffusion model or fine-tune the diffusion model on the target domain for 50 epochs. For Vanilla Diffusion, we train the diffusion model for 100k iterations with the Adam optimizer at a learning rate of 2e-4. For Finetune Generator, we fine-tune the pre-trained diffusion model for 50k iterations with the Adam optimizer at a learning rate of 2e-5. For TGDP, we adopt a 4-layer MLP with 512 hidden units and a SiLU activation function as the backbone of the guidance network. We train the guidance network for 50k iterations with the Adam optimizer at a learning rate of 2e-4. |
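
The rows above on pseudocode, dataset splits, and experiment setup can be illustrated with short, hedged code sketches. First, the Pseudocode row mentions training a domain classifier (Algorithms 1 and 2). The sketch below shows one plausible reading: a binary classifier separating source from target samples. The architecture, loss, and full-batch loop are illustrative assumptions (PyTorch is also an assumption; the paper does not name a framework), and the paper's actual algorithms may operate on noisy samples along the diffusion trajectory rather than clean data.

```python
import torch
import torch.nn as nn

def train_domain_classifier(source_x: torch.Tensor, target_x: torch.Tensor,
                            epochs: int = 20, lr: float = 2e-4) -> nn.Module:
    """Illustrative sketch: binary classifier with source=0, target=1 labels.
    Layer sizes and epoch count are placeholders, not the paper's Algorithms 1-2."""
    clf = nn.Sequential(
        nn.Linear(source_x.shape[1], 512), nn.SiLU(),
        nn.Linear(512, 512), nn.SiLU(),
        nn.Linear(512, 1),
    )
    opt = torch.optim.Adam(clf.parameters(), lr=lr)
    loss_fn = nn.BCEWithLogitsLoss()
    x = torch.cat([source_x, target_x])
    y = torch.cat([torch.zeros(len(source_x), 1), torch.ones(len(target_x), 1)])
    for _ in range(epochs):
        opt.zero_grad()
        loss = loss_fn(clf(x), y)  # full-batch training, for brevity only
        loss.backward()
        opt.step()
    return clf
```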
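
The Dataset Splits row quotes a stratified 10% subsample of the target-domain data. A minimal sketch of that protocol, assuming scikit-learn's train_test_split and placeholder arrays standing in for the actual ICBEB2018 features and labels:

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Placeholder target-domain data; X_target/y_target are hypothetical stand-ins
# for the ICBEB2018 features and labels.
rng = np.random.default_rng(0)
X_target = rng.normal(size=(1000, 12))
y_target = rng.integers(0, 5, size=1000)

# "Randomly select 10% samples ... by stratified sampling": keep 10% of the
# target data while preserving the label proportions.
X_limited, _, y_limited, _ = train_test_split(
    X_target, y_target,
    train_size=0.10,
    stratify=y_target,
    random_state=0,
)
```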
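
Finally, the Experiment Setup row describes the guidance network as a 4-layer MLP with 512 hidden units, SiLU activations, trained with Adam at a learning rate of 2e-4. A hedged configuration sketch, again assuming PyTorch; the input/output dimensions are placeholders, and the exact conditioning on the noisy sample and timestep is not reproduced here.

```python
import torch
import torch.nn as nn

class GuidanceMLP(nn.Module):
    """4-layer MLP with 512 hidden units and SiLU activations, per the setup row.
    Input/output dimensions are placeholders, not taken from the paper."""
    def __init__(self, in_dim: int, out_dim: int, hidden: int = 512):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(in_dim, hidden), nn.SiLU(),
            nn.Linear(hidden, hidden), nn.SiLU(),
            nn.Linear(hidden, hidden), nn.SiLU(),
            nn.Linear(hidden, out_dim),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.net(x)

# Optimizer setting quoted in the setup row: Adam with learning rate 2e-4.
guidance = GuidanceMLP(in_dim=128, out_dim=128)  # 128 is a placeholder dimension
optimizer = torch.optim.Adam(guidance.parameters(), lr=2e-4)
```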