Interpretations of Domain Adaptations via Layer Variational Analysis

Authors: Huan-Hsin Tseng, Hsin-Yi Lin, Kuo-Hsuan Hung, Yu Tsao

ICLR 2023 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Numerical experiments over diverse tasks validated our theory and verified that our analytic expression achieved better performance in domain adaptation than the gradient descent method.
Researcher Affiliation | Academia | Research Center for Information Technology Innovation, Academia Sinica, Taiwan. {htseng, hylin, khhung, yu.tsao}@citi.sinica.edu.tw
Pseudocode | No | The paper contains mathematical derivations and equations, but no explicitly labeled 'Pseudocode' or 'Algorithm' blocks, nor structured code-like procedures.
Open Source Code | Yes | The code is available on GitHub: https://github.com/HHTseng/Layer-Variational-Analysis.git
Open Datasets | Yes | 8,000 utterances (corresponding to N1 = 112,000 patches) were randomly excerpted from the Deep Noise Suppression Challenge (Reddy et al., 2020) dataset; 2,000 high-resolution images were randomly selected as labels from the CUFED dataset (Wang et al., 2016) to train SRCNN.
Dataset Splits | Yes | Speech data pairs were prepared for the source domain D (serving as the training set) and target domain D̃ (serving as the adaptation set). For the training set, the 8,000 clean utterances were equally divided and contaminated by the five noise types... to form the training set. For the testing set, 100 clean utterances were contaminated... For the adaptation set, we prepared 20-400 patches...
Hardware Specification | Yes | One NVIDIA V100 GPU (32 GB GPU memory) with 4 CPUs (128 GB CPU memory).
Software Dependencies | No | The paper mentions optimizers like ADAM and network architectures like BLSTM, but does not provide specific version numbers for any software libraries, programming languages, or frameworks used (e.g., Python, PyTorch, or TensorFlow versions).
Experiment Setup | Yes | The pretrained model f using D was a 3 fully-connected-layer network with 64 nodes at each layer and ReLU used as the activation function except at the output layer. A finetuned model g_GD using Gradient Descent (GD) retrained the last layer of f on D̃. f and g_GD were trained under L2 loss with the ADAM optimizer at learning rate 10^-3 for 8,000 and 12,000 epochs, respectively.
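To make the Experiment Setup row concrete, the following is a minimal sketch of the described pretraining and last-layer gradient-descent finetuning. It assumes a PyTorch implementation; the framework choice, function names, tensor shapes, and data loaders are illustrative assumptions, not taken from the paper or its repository.

```python
# Sketch only: 3 fully connected layers with 64 nodes each, ReLU activations
# except at the output layer, L2 (MSE) loss, ADAM at learning rate 1e-3.
# Pretrain on the source domain D, then retrain only the last layer on D~.
import torch
import torch.nn as nn


def build_model(in_dim: int, out_dim: int) -> nn.Sequential:
    """3 fully connected layers, 64 nodes per hidden layer, ReLU except output."""
    return nn.Sequential(
        nn.Linear(in_dim, 64), nn.ReLU(),
        nn.Linear(64, 64), nn.ReLU(),
        nn.Linear(64, out_dim),          # no activation on the output layer
    )


def train(model, loader, epochs, lr=1e-3, last_layer_only=False):
    """Train under L2 (MSE) loss with ADAM, optionally updating only the last layer."""
    if last_layer_only:
        for p in model.parameters():     # freeze everything ...
            p.requires_grad = False
        for p in model[-1].parameters(): # ... except the final Linear layer
            p.requires_grad = True
    params = [p for p in model.parameters() if p.requires_grad]
    opt = torch.optim.Adam(params, lr=lr)
    loss_fn = nn.MSELoss()
    for _ in range(epochs):
        for x, y in loader:
            opt.zero_grad()
            loss = loss_fn(model(x), y)
            loss.backward()
            opt.step()
    return model


# Hypothetical usage with loaders for the source domain D and adaptation set D~:
#   f = train(build_model(in_dim, out_dim), source_loader, epochs=8000)
#   g_gd = train(copy.deepcopy(f), adaptation_loader, epochs=12000,
#                last_layer_only=True)   # keep f intact; adapt a copy by GD
```

This only illustrates the gradient-descent (GD) finetuning baseline quoted above; the paper's analytic (closed-form) adaptation via layer variational analysis is not reproduced here.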