Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026).
Enriching Disentanglement: From Logical Definitions to Quantitative Metrics
Authors: Yivan Zhang, Masashi Sugiyama
NeurIPS 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Finally, we empirically demonstrate the effectiveness of the proposed metrics by isolating different aspects of disentangled representations. |
| Researcher Affiliation | Academia | Yivan Zhang (The University of Tokyo, RIKEN AIP, Tokyo, Japan); Masashi Sugiyama (RIKEN AIP, The University of Tokyo, Tokyo, Japan) |
| Pseudocode | No | The paper includes Python code snippets in Appendix D.6 (e.g., 'def q_product(y: np.ndarray, z: np.ndarray, aggregate, deviation):'), but these are embedded in text for illustration rather than presented as formally labeled 'Pseudocode' or 'Algorithm' blocks or figures. |
| Open Source Code | No | The paper provides code snippets for implementing metrics in Appendix D.6 and mentions using a 'public PyTorch implementation' for existing models, but it does not include an unambiguous statement of releasing its own source code for the described methodology or a direct link to a dedicated repository for its novel contributions. |
| Open Datasets | Yes | We also report the results of several widely used unsupervised disentangled representation learning methods (VAE [Kingma and Welling, 2014], β-VAE [Higgins et al., 2017], FactorVAE [Kim and Mnih, 2018], and β-TCVAE [Chen et al., 2018]) evaluated on four image datasets (3D Cars [Reed et al., 2015], dSprites [Matthey et al., 2017], 3D Shapes [Burgess and Kim, 2018], and MPI3D [Gondal et al., 2019]). |
| Dataset Splits | No | The paper states, 'We used a public PyTorch implementation ... and used the same encoder/decoder architecture with the default hyperparameters described in Locatello et al. [2019b] for all methods for a fair comparison.' While it refers to external hyperparameters, it does not explicitly provide the training/validation/test dataset splits within the paper itself. |
| Hardware Specification | Yes | The experiments were conducted on a NVIDIA Tesla V100 GPU. |
| Software Dependencies | No | The paper mentions software like NumPy [Harris et al., 2020], PyTorch [Paszke et al., 2019], and scikit-learn [Pedregosa et al., 2011], but it does not provide specific version numbers for these key software components required for reproducibility. |
| Experiment Setup | Yes | We used a public PyTorch implementation [Paszke et al., 2019] of these methods and used the same encoder/decoder architecture with the default hyperparameters described in Locatello et al. [2019b] for all methods for a fair comparison. |