StyDeSty: Min-Max Stylization and Destylization for Single Domain Generalization
Authors: Songhua Liu, Xin Jin, Xingyi Yang, Jingwen Ye, Xinchao Wang
ICML 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We evaluate StyDeSty on multiple benchmarks and demonstrate that it yields encouraging results, outperforming the state of the art by up to 13.44% on classification accuracy. Codes are available here. ... 4. Experiments ... 4.2. Comparison with State-of-the-arts |
| Researcher Affiliation | Academia | 1National University of Singapore, Singapore 2Eastern Institute of Technology, Ningbo, China. Correspondence to: Xinchao Wang <xinchao@nus.edu.sg>. |
| Pseudocode | Yes | Algorithm 1 Training of StyDeSty |
| Open Source Code | Yes | Codes are available here. |
| Open Datasets | Yes | Digits consists of 5 digit recognition datasets including MNIST (LeCun et al., 1998), SVHN (Netzer et al., 2011), MNIST-M (Ganin & Lempitsky, 2015), SYN (Ganin & Lempitsky, 2015), and USPS (Denker et al., 1988), with variance on foreground shapes and background patterns. MNIST is used as the source domain containing 60,000 training images. (A data-loading sketch follows the table.) |
| Dataset Splits | No | The paper discusses training and testing on different domains (e.g., “MNIST is used as the source domain... while the others are for evaluation.”, “the original CIFAR-10 dataset is used as the training domain and the corrupted images are used for evaluation.”, “using each of the four domains for training respectively, and the other three for evaluation”). However, it does not provide specific training/validation/test splits within a given dataset (e.g., 80/10/10 percentages or specific sample counts for validation). |
| Hardware Specification | No | The paper does not specify the hardware used for experiments, such as GPU or CPU models. |
| Software Dependencies | No | The paper mentions optimizers (SGD, Adam) and network architectures but does not specify any software libraries (e.g., PyTorch, TensorFlow) with version numbers that would be necessary for reproduction. |
| Experiment Setup | Yes | For our method, by default, the batch size is set to 64 and the optimizer is SGD. The optimizer for F and H uses a 0.0005 weight decay and 0.9 momentum in Nesterov mode (Nesterov, 1983). Learning rates for F, H, and G are 0.001, 0.001, and 0.005. The numbers of inner iterations in Algorithm 1 are all 1, except that T_H is 10. The hyperparameters α, λ, and β are 0.1, 1, and 1, respectively. (A configuration sketch follows the table.) |
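
The Open Datasets row lists the five Digits domains with MNIST as the sole source. As a rough illustration (not the authors' pipeline), the sketch below assembles the torchvision-available portion of the benchmark, assuming images are resized to 32×32 and grayscale digits are replicated to three channels; MNIST-M and SYN are not shipped with torchvision and would need separate preparation.

```python
# Minimal sketch of the Digits setup using torchvision (an assumption,
# not the paper's released data pipeline).
from torchvision import datasets, transforms

# Assumed preprocessing: 32x32 resolution, 3 channels for all domains.
gray_to_rgb_32 = transforms.Compose([
    transforms.Resize((32, 32)),
    transforms.Grayscale(num_output_channels=3),  # replicate grayscale to RGB
    transforms.ToTensor(),
])
rgb_32 = transforms.Compose([
    transforms.Resize((32, 32)),
    transforms.ToTensor(),
])

# Source domain: MNIST training split (60,000 images, as quoted above).
source = datasets.MNIST("data", train=True, download=True, transform=gray_to_rgb_32)

# Two of the evaluation domains; MNIST-M and SYN require separate downloads.
targets = {
    "SVHN": datasets.SVHN("data/svhn", split="test", download=True, transform=rgb_32),
    "USPS": datasets.USPS("data/usps", train=False, download=True, transform=gray_to_rgb_32),
}
```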
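
The Experiment Setup row maps directly onto optimizer configuration. The following is a minimal PyTorch sketch under the stated hyperparameters; `net_F`, `net_H`, and `net_G` are dummy placeholders for the paper's modules F, H, and G, and using plain SGD for G is an assumption, since the quoted passage only gives its learning rate.

```python
# Sketch of the reported training configuration (assumptions noted in comments),
# not the authors' released implementation.
import torch
import torch.nn as nn

net_F = nn.Linear(128, 64)  # placeholder for module F
net_H = nn.Linear(64, 10)   # placeholder for module H
net_G = nn.Linear(64, 10)   # placeholder for module G

batch_size = 64

# F and H: SGD with lr 0.001, weight decay 0.0005, Nesterov momentum 0.9.
opt_F = torch.optim.SGD(net_F.parameters(), lr=1e-3, momentum=0.9,
                        weight_decay=5e-4, nesterov=True)
opt_H = torch.optim.SGD(net_H.parameters(), lr=1e-3, momentum=0.9,
                        weight_decay=5e-4, nesterov=True)

# G: lr 0.005; momentum/weight decay are not stated in the quoted passage,
# so plain SGD is assumed here.
opt_G = torch.optim.SGD(net_G.parameters(), lr=5e-3)

# Inner-iteration counts from Algorithm 1: all 1 except T_H = 10.
T_F, T_H, T_G = 1, 10, 1

# Loss-weighting hyperparameters alpha, lambda, beta.
alpha, lam, beta = 0.1, 1.0, 1.0
```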