Synergies between Disentanglement and Sparsity: Generalization and Identifiability in Multi-Task Learning

Authors: Sébastien Lachapelle, Tristan Deleu, Divyat Mahajan, Ioannis Mitliagkas, Yoshua Bengio, Simon Lacoste-Julien, Quentin Bertrand

ICML 2023

| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | 5. Experiments: We present experiments on disentanglement and few-shot learning. |
| Researcher Affiliation | Academia | Mila & DIRO, Université de Montréal; Canada CIFAR AI Chair. |
| Pseudocode | No | The paper describes mathematical optimization problems (e.g., Problems (6) and (10)) and algorithmic steps in prose, but it does not include any explicitly labeled 'Pseudocode' or 'Algorithm' blocks. |
| Open Source Code | Yes | Our implementation relies on jax and jaxopt (Bradbury et al., 2018; Blondel et al., 2022) and is available here: https://github.com/tristandeleu/synergies-disentanglement-sparsity |
| Open Datasets | Yes | We validate our theory by showing our approach can indeed disentangle latent factors on tasks constructed from the 3D Shapes dataset (Burgess & Kim, 2018). It obtains competitive results on standard few-shot classification benchmarks, while each task is using only a fraction of the learned representations. |
| Dataset Splits | Yes | In every task, the dataset has size $n = 50$. As opposed to the multi-task setting (i.e., unlike in Section 3.1), one is also given separate test datasets $(\mathcal{D}_t^{\text{test}})_{1 \le t \le T}$ of $n$ samples for each task $t$, to evaluate how well the learned model generalizes to new test samples. In meta-learning, the goal is to learn a learning procedure that will generalize well on new unseen tasks. |
| Hardware Specification | No | The experiments were in part enabled by computational resources provided by Calcul Québec and Compute Canada. (This statement is too general and does not name specific hardware such as CPU or GPU models or memory.) |
| Software Dependencies | No | Our implementation relies on jax and jaxopt (Bradbury et al., 2018; Blondel et al., 2022). (The paper names these libraries but does not provide specific version numbers for reproducibility.) |
| Experiment Setup | Yes | We use the four-layer convolutional neural network typically used in the disentanglement literature (Locatello et al., 2019). In inner-Lasso, we set $\lambda_{\max} := \frac{1}{n}\lVert F^\top y\rVert_\infty$ (where $F \in \mathbb{R}^{n \times m}$ is the design matrix of the features of the samples of a task), while in inner-Ridge we have $\lambda_{\max} := \frac{1}{n}\lVert F\rVert_2$. We consider the 5-shot 5-way experimental setting. |
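
For concreteness, the "four-layer convolutional neural network typically used in the disentanglement literature" quoted in the Experiment Setup row can be sketched as below. This is a hedged reconstruction, not the authors' code: the 4×4 kernels with stride 2, the filter widths, the 256-unit dense layer, and the latent dimension follow the standard Locatello et al. (2019) encoder, and flax.linen is an assumed choice of network library on top of jax (the paper only confirms jax and jaxopt).

```python
import jax
import jax.numpy as jnp
import flax.linen as nn

class ConvEncoder(nn.Module):
    """Standard four-layer convolutional encoder from the disentanglement
    literature (Locatello et al., 2019). All layer sizes here are
    assumptions; the paper does not restate them."""
    latent_dim: int = 10

    @nn.compact
    def __call__(self, x):  # x: (batch, 64, 64, channels)
        for features in (32, 32, 64, 64):
            # Each conv halves the spatial resolution (stride 2).
            x = nn.Conv(features, kernel_size=(4, 4), strides=(2, 2))(x)
            x = nn.relu(x)
        x = x.reshape((x.shape[0], -1))   # flatten spatial dimensions
        x = nn.relu(nn.Dense(256)(x))
        return nn.Dense(self.latent_dim)(x)

# Example initialization on a dummy 3D Shapes-sized input (64x64x3):
params = ConvEncoder().init(jax.random.PRNGKey(0), jnp.ones((1, 64, 64, 3)))
```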
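
The $\lambda_{\max}$ expressions quoted in the same row map directly to a few lines of jax.numpy. The sketch below assumes `F` is the per-task design matrix of shape (n, m) and `y` the length-n target vector; the function names are ours, for illustration only.

```python
import jax.numpy as jnp

def lasso_lambda_max(F, y):
    # inner-Lasso: lambda_max = (1/n) * ||F^T y||_inf, the smallest
    # regularization strength at which the Lasso solution is all zeros.
    n = F.shape[0]
    return jnp.max(jnp.abs(F.T @ y)) / n

def ridge_lambda_max(F):
    # inner-Ridge: lambda_max = (1/n) * ||F||_2 (spectral norm),
    # as quoted from the paper.
    n = F.shape[0]
    return jnp.linalg.norm(F, ord=2) / n
```

In a typical regularization path, $\lambda$ would then be swept on a log-spaced grid below $\lambda_{\max}$; the sketch only reproduces the endpoints quoted above.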