Population Matching Discrepancy and Applications in Deep Learning

Authors: Jianfei Chen, Chongxuan LI, Yizhong Ru, Jun Zhu

NeurIPS 2017

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Empirical results demonstrate that PMD overcomes the aforementioned drawbacks of MMD, and outperforms MMD on both tasks in terms of the performance as well as the convergence speed.
Researcher Affiliation | Academia | Jianfei Chen, Chongxuan Li, Yizhong Ru, Jun Zhu; Dept. of Comp. Sci. & Tech., TNList Lab, State Key Lab for Intell. Tech. & Sys., Tsinghua University, Beijing, 100084, China; {chenjian14,licx14,ruyz13}@mails.tsinghua.edu.cn, dcszj@tsinghua.edu.cn
Pseudocode | Yes | Figure 1: Pseudocode of PMD for parameter learning with graphical illustration of an iteration. (An illustrative code sketch of this training step appears after the table.)
Open Source Code | No | The paper references a GitHub link for "Generative Moment Matching Networks" by Siddharth Agrawal [2], which is a third-party implementation of a related method, not the authors' own code for the method described in this paper.
Open Datasets | Yes | We compare the performance of PMD and MMD on the standard Office [41] object recognition benchmark for domain adaptation. ... We compare PMD with MMD for image generation on the MNIST [28], SVHN [36] and LFW [20] datasets.
Dataset Splits | Yes | Following [8], we validate the domain regularization strength λ and the MMD kernel bandwidth σ on a random 100-sample labeled dataset on the target domain, but the model is trained without any labeled data from the target domain.
Hardware Specification | Yes | Our experiment is conducted on a machine with Nvidia Titan X (Pascal) GPU and Intel E5-2683v3 CPU.
Software Dependencies | Yes | We implement the models in TensorFlow [1]. The CUDA program is compiled with nvcc 8.0 and the C++ program is compiled with g++ 4.8.4, while the -O3 flag is used for both programs.
Experiment Setup | Yes | The classifier is a fully-connected neural network with a single hidden layer of 256 ReLU [15] units, trained with AdaDelta [51]. We apply batch normalization [21] on the hidden layer... We set the population size N = 2000 for both PMD and MMD, and the mini-batch size |B| = 100 for PMD. We use the Adam optimizer [22] with batch normalization [21], and train the model for 100 epochs for PMD, and 500 epochs for MMD. (A sketch of this classifier appears after the table.)
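
The Figure 1 pseudocode row above summarizes PMD parameter learning: draw a population of N samples from the model, find a minimum-weight bipartite matching against N data points, and take gradient steps on the distances of the matched pairs. Below is a minimal sketch of one such iteration, assuming a PyTorch generator, an L1 ground metric, and samples flattened to vectors; `pmd_step`, `generator`, and `optimizer` are hypothetical names, and this is an illustrative re-implementation, not the authors' TensorFlow code.

```python
import torch
from scipy.optimize import linear_sum_assignment

def pmd_step(generator, optimizer, real, noise_dim, batch_size=100):
    """One PMD iteration (illustrative): draw a model population the same
    size as `real`, match it to the data by minimum-weight bipartite
    matching, then run mini-batch gradient steps on the matched distances."""
    n = real.size(0)                                # population size N
    z = torch.randn(n, noise_dim)
    with torch.no_grad():
        fake = generator(z)                         # model population
    # N x N pairwise L1 costs; the matching step itself is not differentiated.
    cost = torch.cdist(fake, real, p=1).cpu().numpy()
    rows, cols = linear_sum_assignment(cost)        # exact Hungarian solve, O(N^3)
    rows, cols = torch.as_tensor(rows), torch.as_tensor(cols)
    for start in range(0, n, batch_size):
        r = rows[start:start + batch_size]          # generator-side indices
        c = cols[start:start + batch_size]          # matched data-side indices
        # Re-run the generator with gradients enabled on this mini-batch only.
        loss = (generator(z[r]) - real[c]).abs().sum(dim=1).mean()
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
```

With the settings quoted in the table (N = 2000, |B| = 100), each matched population in this sketch supports 20 mini-batch updates before fresh populations are drawn; the cost of solving the matching once per population is the overhead PMD pays relative to MMD.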
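The Experiment Setup row quotes a single-hidden-layer classifier for the domain-adaptation task. A minimal sketch, in PyTorch rather than the authors' TensorFlow, with placeholder input/output dimensions (`make_classifier` is a hypothetical helper):

```python
import torch.nn as nn
import torch.optim as optim

def make_classifier(in_dim: int, n_classes: int):
    """Fully-connected classifier as quoted above: one hidden layer of
    256 ReLU units with batch normalization, trained with AdaDelta."""
    model = nn.Sequential(
        nn.Linear(in_dim, 256),
        nn.BatchNorm1d(256),       # batch normalization on the hidden layer
        nn.ReLU(),
        nn.Linear(256, n_classes),
    )
    optimizer = optim.Adadelta(model.parameters())
    return model, optimizer
```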