Simple Disentanglement of Style and Content in Visual Representations

Authors: Lilian Ngweta, Subha Maity, Alex Gittens, Yuekai Sun, Mikhail Yurochkin

ICML 2023

| Reproducibility Variable | Result | LLM Response |
| --- | --- | --- |
| Research Type | Experimental | We verify its efficacy empirically. Our post-processed features yield significant domain generalization performance improvements... We verify the ability of PISCO to disentangle style and content on three image datasets of varying size and complexity via post-processing of various pre-trained deep visual feature extractors. |
| Researcher Affiliation | Collaboration | (1) Department of Computer Science, Rensselaer Polytechnic Institute, Troy, New York, United States; (2) Department of Statistics, University of Michigan, Ann Arbor, Michigan, United States; (3) IBM Research, Cambridge, Massachusetts, United States; (4) MIT-IBM Watson AI Lab, Cambridge, Massachusetts, United States. |
| Pseudocode | Yes | Algorithm 1 PISCO |
| Open Source Code | Yes | The experiments code is available on GitHub. Code: github.com/lilianngweta/PISCO. |
| Open Datasets | Yes | We consider nine transformations in our experiments: four types of image corruptions (rotation, contrast, blur, and saturation) on CIFAR-10 (Krizhevsky et al., 2009)... on ImageNet (Russakovsky et al., 2015)... and a color transformation on MNIST, similar to Colored MNIST (Arjovsky et al., 2019)... |
| Dataset Splits | Yes | Specifically, in the training dataset, we corrupt images from the first half of the classes with probability α and from the second half of the classes with probability 1 − α. In test data the correlation is reversed, i.e., images from the first half of the classes are corrupted with probability 1 − α and images from the second half with probability α (see C for details). |
| Hardware Specification | No | The paper does not provide specific details about the hardware (e.g., GPU models, CPU types, memory) used for running the experiments. |
| Software Dependencies | No | The paper mentions "Pytorch’s Torchvision package" but does not specify version numbers for this or any other key software dependencies. |
| Experiment Setup | Yes | The batch size used when training the logistic regression model on ImageNet was 32768, the learning rate was 0.0001, and the number of epochs was 50. |
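
The Dataset Splits entry describes a class-dependent corruption rate that is reversed between training and test data. The following is a minimal sketch of that sampling rule, not the authors' code: the function name `corrupt_flags`, its parameters, the assumption that labels are integers in `0..num_classes-1`, and the use of Python's `random` module are all illustrative choices. It only decides which images to corrupt; applying the actual transformation (rotation, contrast, and so on) is left out.

```python
import random


def corrupt_flags(labels, num_classes, alpha, train=True, seed=0):
    """Return one boolean per label: True if that image should be corrupted.

    Hypothetical helper. Training split: classes in the first half are
    corrupted with probability alpha, classes in the second half with
    probability 1 - alpha. Test split: the two probabilities are swapped.
    """
    rng = random.Random(seed)
    half = num_classes // 2
    flags = []
    for y in labels:
        # Corruption probability for this class group in the training split.
        p = alpha if y < half else 1.0 - alpha
        if not train:
            # Reverse the style/label correlation at test time.
            p = 1.0 - p
        flags.append(rng.random() < p)
    return flags


# Usage: 10 classes (as in CIFAR-10), alpha chosen arbitrarily for illustration.
labels = [i % 10 for i in range(1000)]
train_flags = corrupt_flags(labels, num_classes=10, alpha=0.9, train=True)
test_flags = corrupt_flags(labels, num_classes=10, alpha=0.9, train=False, seed=1)
```

With α close to 1, corruption is strongly predictive of the class group in training and predictive of the opposite group at test time, which is what makes the split a test of robustness to spurious style correlations.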