Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].
Efficient CDF Approximations for Normalizing Flows
Authors: Chandramouli Shama Sastry, Andreas M. Lehrmann, Marcus A. Brubaker, Alexander Radovic
TMLR 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Our experiments on popular flow architectures and UCI benchmark datasets show a marked improvement in sample efficiency as compared to traditional estimators. |
| Researcher Affiliation | Collaboration | Chandramouli Shama Sastry (Dalhousie University, Vector Institute, Borealis AI); Andreas M. Lehrmann (Borealis AI); Marcus A. Brubaker (York University, Vector Institute, Borealis AI); Alexander Radovic (Borealis AI) |
| Pseudocode | Yes | See Appendix B for a summary of the entire splitting process in pseudo code. |
| Open Source Code | Yes | The code to reproduce our results, including training popular normalizing flow architectures, approximating cumulative densities with the proposed adaptive boundary estimator, and other baseline methods, is publicly available: https://github.com/BorealisAI/nflow-cdf-approximations |
| Open Datasets | Yes | For the purpose of evaluation, we train normalizing flows on d-dimensional (d ∈ {2, 3, 4, 5}) data derived from 4 tabular datasets open sourced as part of the UCI Machine Learning Repository (Dua & Graff, 2017) and preprocessed as in Papamakarios et al. (2017): Power, Gas, Hepmass, and Miniboone. |
| Dataset Splits | No | The paper mentions obtaining "2 random d-dimensional slices of the dataset over which we train the normalizing flows" and creating "5 convex hulls for each choice of the radius" for evaluation. However, it does not provide specific training, validation, or test dataset splits in terms of percentages, sample counts, or references to standard pre-defined splits for the models being trained or evaluated. |
| Hardware Specification | Yes | In order to obtain a fair evaluation, we ran all of the timing experiments in a single non-preemptible job having access to 8 CPUs, 64GB RAM and one Tesla T4 GPU (16GB). |
| Software Dependencies | No | The paper mentions "PyTorch" but does not specify a version number or list any other software dependencies with their corresponding versions. |
| Experiment Setup | Yes | For all normalizing flows, we train the models with a batch size of 10k and stop when the log-likelihoods do not improve over 5 epochs. For the continuous flows, we used the exact divergence for computing the log-determinant. We used exp-scaling in the affine coupling layer of both MAF and Glow models and, in order to prevent numerical overflows, we applied a tanh nonlinearity before the exp-scaling. Finally, we used softplus as our activation function for both the Neural ODE and coupling networks. From Fig. 6 and Fig. 7, we observe both the Continuous and Discrete flows obtain similar log-likelihoods and are able to fit the training data well. For constructing discrete flows, we choose 3, 5 or 7 flow layers and construct coupling layers with 16, 32 or 64 hidden units. While one Glow layer corresponds to a sequence of (ActNorm) → (Glow Coupling) → (Invertible 1×1) transformations, one MAF layer corresponds to a sequence of (ActNorm) → (MAF Coupling) transformations. For continuous flows, we parameterize the neural ODE with 2 hidden layers, each consisting of 16, 32 or 64 hidden units. |
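The experiment-setup row mentions applying a tanh nonlinearity before exp-scaling in the affine coupling layers to prevent numerical overflow. A minimal NumPy sketch of that idea is below; `scale_net` and `shift_net` are hypothetical stand-ins for the paper's coupling networks, and the function is an illustration of the general technique, not the authors' implementation.

```python
import numpy as np

def softplus(x):
    # Numerically stable softplus, the activation used in the paper's
    # coupling and Neural ODE networks.
    return np.log1p(np.exp(-np.abs(x))) + np.maximum(x, 0.0)

def affine_coupling_forward(x, scale_net, shift_net, split):
    """Forward pass of an affine coupling transform with tanh-bounded
    exp-scaling. `scale_net`/`shift_net` map the conditioning half x1
    to per-dimension log-scales and shifts (hypothetical callables)."""
    x1, x2 = x[..., :split], x[..., split:]
    # tanh bounds the log-scale to (-1, 1), so exp() cannot overflow.
    log_s = np.tanh(scale_net(x1))
    y2 = x2 * np.exp(log_s) + shift_net(x1)
    y = np.concatenate([x1, y2], axis=-1)
    log_det = log_s.sum(axis=-1)  # log|det J| of the coupling transform
    return y, log_det
```

Because only x2 is transformed and x1 passes through unchanged, the inverse is closed-form: recompute `log_s` and the shift from x1, then invert the affine map on y2.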