Continuously Parameterized Mixture Models

Authors: Christopher M Bender, Yifeng Shi, Marc Niethammer, Junier Oliva

ICML 2023

Reproducibility assessment (variable, result, and supporting LLM response):

- Research Type: Experimental. Evidence: "5. Experiments"
- Researcher Affiliation: Academia. Evidence: "Department of Computer Science, The University of North Carolina, Chapel Hill, North Carolina, USA. Correspondence to: Christopher M. Bender <bender@cs.unc.edu>."
- Pseudocode: No. No structured pseudocode or algorithm blocks were found in the paper.
- Open Source Code: Yes. Evidence: "Code can be found at https://github.com/lupalab/cpmm."
- Open Datasets: Yes. Evidence: "We test training CPMMs on MNIST (Deng, 2012) and Fashion-MNIST (Xiao et al., 2017)"
- Dataset Splits: No. The paper uses MNIST and Fashion-MNIST but does not explicitly state the training, validation, and test splits (e.g., percentages or sample counts) needed for reproduction.
- Hardware Specification: No. The paper states the models need "1-2 GB of GPU RAM to train" but does not name a GPU model, CPU model, or any other hardware details.
- Software Dependencies: No. The paper mentions "PyTorch (Paszke et al., 2019)" and "PyTorch Lightning (Falcon, 2019)" but gives no version numbers, which reproducibility requires.
- Experiment Setup: Yes. Evidence: "Unless otherwise stated, we extract 25 components per trajectory for each hierarchical CPMM. CPMM NODEs are constructed using CNNs with a depth of 3, a hidden channel width of 64, and utilize sin activations. We additionally augment the state space by a factor of four and explicitly condition the ODEs on pseudotime by concatenating it as an extra channel. To avoid degeneracies, we soft-clip log d so that the minimum value cannot be less than -6. We used Adam (Kingma & Ba, 2015) as the optimizer in all cases. In general, we choose to use 51 total spaces (including the latent space, u1, and the input space, x) and allocate one epoch per space. We then fine-tune on the input space."
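The soft-clipping of log d mentioned in the experiment setup can be sketched with a softplus-based lower bound. This is a hypothetical formulation: the paper states only that log d is soft-clipped so its minimum cannot fall below -6, not the exact functional form used.

```python
import math


def softplus(z: float) -> float:
    # Numerically stable softplus: log(1 + exp(z)),
    # written to avoid overflow for large |z|.
    return max(z, 0.0) + math.log1p(math.exp(-abs(z)))


def soft_clip_min(x: float, min_val: float = -6.0) -> float:
    """Smoothly bound x from below by min_val.

    Hypothetical form of the paper's soft-clip on log d: the output
    approaches min_val as x -> -inf and approaches x for x >> min_val,
    so the clipped log d can never drop below -6.
    """
    return min_val + softplus(x - min_val)
```

In a PyTorch implementation the same idea would typically be applied elementwise to the predicted log d tensor (e.g., via `torch.nn.functional.softplus`) before exponentiating.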