X-model: Improving Data Efficiency in Deep Learning with A Minimax Model

Authors: Ximei Wang, Xinyang Chen, Jianmin Wang, Mingsheng Long

ICLR 2022 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Extensive experiments verify the superiority of the X-model among various tasks, from a single-value prediction task of age estimation to a dense-value prediction task of keypoint localization, a 2D synthetic and a 3D realistic dataset, as well as a multi-category object recognition task.
Researcher Affiliation | Academia | Ximei Wang, Xinyang Chen, Jianmin Wang, Mingsheng Long. School of Software, BNRist, Tsinghua University, China. wxm17@mails.tsinghua.edu.cn, chenxinyang95@gmail.com, jimwang@tsinghua.edu.cn, mingsheng@tsinghua.edu.cn
Pseudocode | No | No explicitly labeled 'Pseudocode' or 'Algorithm' blocks were found.
Open Source Code | No | Code will be made available at https://github.com.
Open Datasets | Yes | dSprites (Higgins et al., 2017) is a standard 2D synthetic dataset...; MPI3D (Gondal et al., 2019) is a simulation-to-real dataset for 3D objects; IMDB-WIKI (Rasmus Rothe, 2016) is a face dataset with age and gender labels...; Hand-3D-Studio (H3D) (Zhao et al., 2020) is a real-world dataset...; We adopt the most difficult CIFAR-100 dataset (Krizhevsky, 2009)...
Dataset Splits | Yes | On IMDB-WIKI, following the data pre-processing method of a recent work (Yang et al., 2021), we also filter out unqualified images and manually construct balanced validation and test sets over the supported ages. After splitting, there are 191.5K images for training and 11.0K images for validation and testing.
Hardware Specification | Yes | We use PyTorch with Titan V to implement our methods.
Software Dependencies | No | We use PyTorch with Titan V to implement our methods. No specific version number for PyTorch or other software dependencies is provided.
Experiment Setup | Yes | The tradeoff hyperparameter η is set as 0.1 for all tasks unless specified. The learning rates of the heads are set to 10 times those of the backbone layers, following the common fine-tuning principle (Yosinski et al., 2014). We adopt mini-batch SGD with a momentum of 0.95.
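
The quoted setup corresponds to a standard PyTorch fine-tuning configuration with per-group learning rates. Below is a minimal sketch of that optimizer setup; the ResNet-18 backbone, the backbone/head split, and the base learning rate of 0.001 are illustrative assumptions, not values from the paper (only the 10x head multiplier and the 0.95 momentum are quoted above).

```python
import torch
import torchvision

# Illustrative model: a pretrained backbone with a freshly initialized head.
# The paper does not specify this architecture; resnet18 is an assumption.
model = torchvision.models.resnet18(weights="IMAGENET1K_V1")
head = model.fc  # the newly trained head; everything else is the backbone

# Split parameters into backbone and head groups by name.
backbone_params = [p for name, p in model.named_parameters()
                   if not name.startswith("fc.")]

base_lr = 0.001  # assumed base learning rate; not reported in the excerpt

optimizer = torch.optim.SGD(
    [
        # Backbone layers train at the base learning rate.
        {"params": backbone_params, "lr": base_lr},
        # Heads train at 10x the backbone rate, per the quoted
        # fine-tuning principle (Yosinski et al., 2014).
        {"params": head.parameters(), "lr": 10 * base_lr},
    ],
    momentum=0.95,  # momentum value quoted in the setup above
)
```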