Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026).
Learning Discrete Concepts in Latent Hierarchical Models
Authors: Lingjing Kong, Guangyi Chen, Biwei Huang, Eric Xing, Yuejie Chi, Kun Zhang
NeurIPS 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We substantiate our theoretical claims with synthetic data experiments. Further, we discuss our theory's implications for understanding the underlying mechanisms of latent diffusion models and provide corresponding empirical evidence for our theoretical insights. |
| Researcher Affiliation | Academia | ¹Carnegie Mellon University, ²Mohamed bin Zayed University of Artificial Intelligence, ³University of California San Diego |
| Pseudocode | Yes | Algorithm 1: The overall procedure for Rank-based Discrete Latent Causal Model Discovery. |
| Open Source Code | Yes | The code can be found here. |
| Open Datasets | No | We generate the hierarchical model G with randomly sampled parameters, and follow [24] to build the generating process from d to the observed variables x (i.e., graph Γ) by a Gaussian mixture model. |
| Dataset Splits | No | The paper describes generating synthetic data and evaluating on specific graphs, but does not specify training, validation, and test dataset splits. |
| Hardware Specification | Yes | We conduct our experiments on a cluster of 64 CPUs. All experiments can be finished within half an hour. The search algorithm implementation is adapted from Dong et al. [20]. We conduct all our experiments on 2 Nvidia L40 GPUs. |
| Software Dependencies | No | We employ the pre-trained latent diffusion model [28] SD v1.4 across all our experiments. |
| Experiment Setup | Yes | Experimental setup. We generate the hierarchical model G with randomly sampled parameters, and follow [24] to build the generating process from d to the observed variables x (i.e., graph Γ) by a Gaussian mixture model. The inference process consists of 50 steps. For experiments in Section 7.1, we inject concepts by appending keywords to the original prompt. We evaluate ranks in {2, 4, 8} and scales {1, 2, 3, 4, 5}. |
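
The synthetic setup quoted in the Experiment Setup row (a discrete latent variable d that generates observed variables x through a Gaussian mixture model) can be illustrated with a minimal sketch. This is not the authors' code: the function names, the number of latent states, the observation dimension, the noise level, and the parameter distributions are all hypothetical choices made for illustration. Enumerating the quoted rank and scale values as a full Cartesian grid is likewise an assumption.

```python
import itertools

import numpy as np


def sample_gmm_observations(n_samples, n_states=3, obs_dim=4, noise_std=0.5, seed=0):
    """Sample a discrete latent d, then x | d from a Gaussian mixture.

    All defaults are hypothetical; the paper's actual parameters are not given here.
    """
    rng = np.random.default_rng(seed)
    # Randomly sampled mixture parameters (illustrative distributions).
    probs = rng.dirichlet(np.ones(n_states))                 # P(d = k)
    means = rng.normal(0.0, 3.0, size=(n_states, obs_dim))   # one mean per latent state
    # Draw the discrete latent state for each sample.
    d = rng.choice(n_states, size=n_samples, p=probs)
    # Each observation is its component mean plus isotropic Gaussian noise.
    x = means[d] + noise_std * rng.normal(size=(n_samples, obs_dim))
    return d, x


def setting_grid(ranks=(2, 4, 8), scales=(1, 2, 3, 4, 5)):
    """Enumerate every (rank, scale) pair from the values quoted in the table."""
    return list(itertools.product(ranks, scales))


d, x = sample_gmm_observations(1000)
grid = setting_grid()
```

Here `d` holds one latent state index per sample and `x` the corresponding observations; `grid` lists the 15 rank and scale combinations that a sweep over the quoted values would cover under the full-grid assumption.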