Dual Swap Disentangling

Authors: Zunlei Feng, Xinchao Wang, Chenglong Ke, An-Xiang Zeng, Dacheng Tao, Mingli Song

NeurIPS 2018

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Experiments on image datasets from a wide domain show that our model yields state-of-the-art disentangling performances.
Researcher Affiliation | Collaboration | Zunlei Feng (Zhejiang University, zunleifeng@zju.edu.cn); Xinchao Wang (Stevens Institute of Technology, xinchao.wang@stevens.edu); Chenglong Ke (Zhejiang University, chenglongke@zju.edu.cn); Anxiang Zeng (Alibaba Group, renzhong@taobao.com); Dacheng Tao (University of Sydney, dctao@sydney.edu.au); Mingli Song (Zhejiang University, brooksong@zju.edu.cn)
Pseudocode | Yes | Algorithm 1: The Dual Swap Disentangling (DSD) algorithm (see the sketch after this table).
Open Source Code | No | The paper does not include an explicit statement about releasing source code, nor does it provide a specific repository link for the methodology described in the paper.
Open Datasets | Yes | We conduct experiments on six image datasets of different domains: a synthesized Square dataset, Teapot (Moreno et al. [2016], Eastwood and Williams [2018]), MNIST (Haykin and Kosko [2009]), dSprites (Higgins et al. [2016]), Mugshot (Shen et al. [2016]), and CAS-PEAL-R1 (Gao et al. [2008]).
Dataset Splits | Yes | Square: the training, validation and testing sets contain 20,000, 9,000 and 1,000 samples, respectively. Teapot: 50,000 training, 10,000 validation and 10,000 testing samples. dSprites: 100,000 pairs are sampled from the original dSprites and divided into 80,000/10,000/10,000 for training, validation and testing. Mugshot: divided into 20,000/9,000/1,000 for training, validation and testing. CAS-PEAL-R1: divided into 40,000/9,000/1,000 for training, validation and testing.
Hardware Specification | No | The paper mentions network architectures and optimizers but does not provide specific hardware details such as GPU/CPU models or memory used for the experiments.
Software Dependencies | No | The paper mentions software components such as the Adam optimizer, InfoGAN, Wasserstein GAN, and layer normalization, but does not provide specific version numbers for them.
Experiment Setup | Yes | The Adam optimizer (Kingma and Ba [2014]) is adopted with learning rates of 1e-4 (64×64 network) and 0.5e-4 (32×32 network). The batch size is 64. For the above two network architectures, α and β are set as 5 and 0.2, respectively. The encoder / discriminator (D) / auxiliary network (Q) and the decoder / generator (G) are shown in Table 1. (See the training-loop sketch after this table.)
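
As a concrete reading of Algorithm 1, the following is a minimal PyTorch-style sketch of one DSD training step. The MLP encoder/decoder, the helper swap_part, the number of code parts n_parts, and the exact placement of the weights α and β are illustrative assumptions, not the paper's architecture (its actual convolutional networks are listed in its Table 1).

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Illustrative stand-ins for the paper's encoder and decoder.
class Encoder(nn.Module):
    def __init__(self, in_dim=784, code_dim=64):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(in_dim, 256), nn.ReLU(),
                                 nn.Linear(256, code_dim))
    def forward(self, x):
        return self.net(x)

class Decoder(nn.Module):
    def __init__(self, out_dim=784, code_dim=64):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(code_dim, 256), nn.ReLU(),
                                 nn.Linear(256, out_dim))
    def forward(self, z):
        return self.net(z)

def swap_part(z_a, z_b, part, n_parts):
    """Exchange the `part`-th equal-sized chunk of two latent codes."""
    chunk = z_a.shape[1] // n_parts
    lo, hi = part * chunk, (part + 1) * chunk
    z_a_new, z_b_new = z_a.clone(), z_b.clone()
    z_a_new[:, lo:hi], z_b_new[:, lo:hi] = z_b[:, lo:hi], z_a[:, lo:hi]
    return z_a_new, z_b_new

def dsd_step(enc, dec, x_a, x_b, shared_part, n_parts=4,
             labelled=True, alpha=5.0, beta=0.2):
    """Loss for one (x_a, x_b) pair in a DSD-style training step (sketch)."""
    z_a, z_b = enc(x_a), enc(x_b)
    # Plain autoencoder reconstruction on both inputs.
    loss = F.mse_loss(dec(z_a), x_a) + F.mse_loss(dec(z_b), x_b)
    if labelled:
        # Labelled pair: the inputs share the attribute encoded in
        # `shared_part`, so swapping that code part should still
        # reconstruct the original inputs.
        z_a_s, z_b_s = swap_part(z_a, z_b, shared_part, n_parts)
        loss = loss + alpha * (F.mse_loss(dec(z_a_s), x_a) +
                               F.mse_loss(dec(z_b_s), x_b))
    else:
        # Unlabelled pair: dual swap. Swap a random part, decode the hybrids,
        # re-encode them, swap the same part back, and require the final
        # decodings to match the original inputs.
        part = torch.randint(n_parts, (1,)).item()
        z_a_s, z_b_s = swap_part(z_a, z_b, part, n_parts)
        h_a, h_b = dec(z_a_s), dec(z_b_s)
        z_a_ss, z_b_ss = swap_part(enc(h_a), enc(h_b), part, n_parts)
        loss = loss + beta * (F.mse_loss(dec(z_a_ss), x_a) +
                              F.mse_loss(dec(z_b_ss), x_b))
    return loss
```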
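
To make the reported optimizer settings concrete, here is a minimal training loop reusing the Encoder, Decoder, and dsd_step defined in the sketch above. The random tensors standing in for image pairs and the alternation between labelled and unlabelled pairs are assumptions for illustration only.

```python
import torch

enc, dec = Encoder(), Decoder()
params = list(enc.parameters()) + list(dec.parameters())

# Reported settings: Adam, lr 1e-4 for the 64x64 networks (0.5e-4 for 32x32),
# batch size 64, alpha = 5, beta = 0.2.
optimizer = torch.optim.Adam(params, lr=1e-4)

for step in range(100):
    # Hypothetical batch: random stand-ins for a pair of image batches,
    # flattened to the MLP input size used in the sketch above.
    x_a, x_b = torch.rand(64, 784), torch.rand(64, 784)
    loss = dsd_step(enc, dec, x_a, x_b, shared_part=0,
                    labelled=(step % 2 == 0), alpha=5.0, beta=0.2)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
```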