Weakly Supervised Representation Learning with Sparse Perturbations
Authors: Kartik Ahuja, Jason S. Hartford, Yoshua Bengio
NeurIPS 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We conducted two sets of experiments: low-dimensional synthetic and image-based inputs that follow the DGP in equation (24). In the low-dimensional synthetic experiments we experimented with two choices for P_Z: a) uniform distribution with independent latents, b) normal distribution with latents that are blockwise independent (with block length d/2). We used an invertible multi-layer perceptron (MLP) (with 2 hidden layers) from Zimmermann et al. (2021) for g. We evaluated for latent dimensions d ∈ {6, 10, 20}. The training and test data sizes were 10000 and 5000 respectively. For the image-based experiments we used PyGame's (Shinners, 2011) rendering engine for g and generated 64 × 64 pixel images that look like those shown in Figure 1. The coordinates of each ball, z_i, were drawn independently from a uniform distribution, z_i ∼ U(0.1, 0.9). We varied the number of balls from 2 (d = 4) to 4 (d = 8). |
| Researcher Affiliation | Academia | Mila Quebec AI Institute, Université de Montréal. CIFAR Fellow. Correspondence to: kartik.ahuja@mila.quebec. |
| Pseudocode | No | The paper does not contain any structured pseudocode or algorithm blocks. |
| Open Source Code | Yes | Finally, the code to reproduce the experiments presented above can be found at https://github.com/ahujak/WSRL. |
| Open Datasets | No | "For the image-based experiments we used PyGame's (Shinners, 2011) rendering engine for g and generated 64 × 64 pixel images that look like those shown in Figure 1. The coordinates of each ball, z_i, were drawn independently from a uniform distribution, z_i ∼ U(0.1, 0.9). ... In the low-dimensional synthetic experiments we experimented with two choices for P_Z: a) uniform distribution with independent latents, b) normal distribution with latents that are blockwise independent (with block length d/2)." The paper describes generating its own experimental data rather than using a pre-existing publicly available dataset for which concrete access information is provided. |
| Dataset Splits | No | "The training and test data size was 10000 and 5000 respectively. For the image-based experiments... instead the images are generated online and we trained to convergence." The paper specifies training and test data sizes but does not explicitly mention a separate validation split or its size/percentage. |
| Hardware Specification | Yes | The experiments were conducted on a machine with 8 NVIDIA GeForce RTX 3090 GPUs. Each run took approximately 12 hours for the image-based experiments. For the low-dimensional synthetic experiments, each run took approximately 3 hours. |
| Software Dependencies | No | "The code was implemented in PyTorch (Paszke et al., 2019) and Python 3.9." The paper pins Python to a specific version (3.9), but for PyTorch it only provides a citation without a version number. |
| Experiment Setup | Yes | For the low-dimensional synthetic experiments, we used Adam optimizer (Kingma and Ba, 2014) with a learning rate of 0.001. We trained for 100 epochs with a batch size of 256. For the image-based experiments, we used Adam optimizer with a learning rate of 0.0001. We trained to convergence with a batch size of 64. |
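The low-dimensional synthetic setup described in the table (independent uniform latents pushed through a random invertible MLP for g, with 10000 training and 5000 test examples) can be sketched as below. This is a minimal NumPy illustration, not the authors' code (which is at the repository linked above): the function names, the leaky-ReLU choice, and the conditioning check used to keep the weight matrices invertible are all assumptions in the spirit of Zimmermann et al. (2021).

```python
import numpy as np

rng = np.random.default_rng(0)

def sample_latents(n, d):
    """Draw n independent uniform latent vectors z ~ U(0, 1)^d, one per row."""
    return rng.uniform(0.0, 1.0, size=(n, d))

def random_invertible_mlp(d, n_hidden_layers=2, alpha=0.2):
    """Build a random map g with square weight matrices and leaky-ReLU
    activations; both components are invertible, so g is invertible."""
    weights = []
    for _ in range(n_hidden_layers + 1):
        W = rng.normal(size=(d, d))
        # Re-sample until the matrix is well-conditioned (hence invertible).
        while np.linalg.cond(W) > 1e3:
            W = rng.normal(size=(d, d))
        weights.append(W)

    def g(z):
        h = z
        for W in weights[:-1]:
            h = h @ W.T
            h = np.where(h > 0, h, alpha * h)  # leaky ReLU
        return h @ weights[-1].T  # linear output layer

    return g

d = 6                                  # one of the reported dims {6, 10, 20}
g = random_invertible_mlp(d)
z_train = sample_latents(10_000, d)    # training latents
z_test = sample_latents(5_000, d)      # test latents
x_train, x_test = g(z_train), g(z_test)  # observed inputs x = g(z)
```

A training run on (x_train, z_train) would then follow the reported hyperparameters: Adam with learning rate 0.001, 100 epochs, batch size 256 for the synthetic case.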