Normalizing Flows for Interventional Density Estimation

Authors: Valentyn Melnychuk, Dennis Frauen, Stefan Feuerriegel

ICML 2023

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Across various experiments, we demonstrate that our Interventional Normalizing Flows are expressive and highly effective, and scale well with both sample size and high-dimensional confounding.
Researcher Affiliation | Academia | LMU Munich & Munich Center for Machine Learning (MCML), Munich, Germany. Correspondence to: Valentyn Melnychuk <melnychuk@lmu.de>.
Pseudocode | Yes | Algorithm 1: Training procedure of INFs. (See the training sketch after the table.)
Open Source Code | Yes | Code is available at https://github.com/Valentyn1997/INFs.
Open Datasets | Yes | To show the effectiveness of our INFs, we use established (semi-)synthetic datasets that have been previously used for treatment effect estimation (Shi et al., 2019; Curth & van der Schaar, 2021).
Dataset Splits | Yes | We performed hyperparameter tuning of the nuisance-parameter models for all the baselines based on five-fold cross-validation using the train subset. For each baseline, we performed a grid search with respect to different tuning criteria, evaluated on the validation subsets. (See the grid-search sketch after the table.)
Hardware Specification | Yes | Experiments are carried out on an Intel(R) Xeon(R) Silver 4316 CPU @ 2.30GHz.
Software Dependencies | No | We implemented our INFs using PyTorch and Pyro. (No version numbers are provided for PyTorch or Pyro.)
Experiment Setup | Yes | For the nuisance flow, we use fully-connected subnetworks, each with one hidden layer (h = 10 hidden units), and the dimensionality of the representation is set to d_R = 10. ... We use stochastic gradient descent (SGD) for fitting the parameters of the nuisance flow and the Adam optimizer (Kingma & Ba, 2015) for the target flow, with learning rates η_N and η_T, respectively. We fix the weighting hyperparameter of the loss to α = 1 and the EMA smoothing hyperparameter to γ = 0.995. ... These include the minibatch size b_T = 64 and the learning rate η_T = 0.005.
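The two-stage training procedure (Algorithm 1) and the Experiment Setup row can be read together as a concrete optimization recipe. The sketch below is a minimal schematic, not the authors' implementation: `NuisanceFlow`/`TargetFlow` and their `loss` methods are placeholder modules (the real flow architectures and the α-weighted loss live in the linked repository), and only the reported choices are hard-coded: SGD for the nuisance flow, Adam with learning rate η_T = 0.005 for the target flow, minibatch size b_T = 64, and EMA parameter smoothing with γ = 0.995.

```python
# Schematic two-stage INF training loop, reflecting only the hyperparameters
# reported in the paper. NuisanceFlow/TargetFlow and their .loss() methods
# are hypothetical placeholders standing in for the authors' flow modules.
import copy
import torch

def train_infs(nuisance_flow, target_flow, loader_nuisance, loader_target,
               eta_n=1e-2, eta_t=5e-3, gamma=0.995, epochs=100):
    # Reported optimizer choices: SGD for the nuisance flow,
    # Adam for the target flow (Kingma & Ba, 2015).
    opt_n = torch.optim.SGD(nuisance_flow.parameters(), lr=eta_n)
    opt_t = torch.optim.Adam(target_flow.parameters(), lr=eta_t)

    # Stage 1: fit the nuisance flow.
    for _ in range(epochs):
        for batch in loader_nuisance:
            opt_n.zero_grad()
            loss = nuisance_flow.loss(batch)  # placeholder loss
            loss.backward()
            opt_n.step()

    # Stage 2: fit the target flow on minibatches (b_T = 64 in the paper),
    # maintaining an EMA copy of its parameters with gamma = 0.995.
    ema_flow = copy.deepcopy(target_flow)
    for _ in range(epochs):
        for batch in loader_target:
            opt_t.zero_grad()
            loss = target_flow.loss(batch, nuisance_flow)  # placeholder
            loss.backward()
            opt_t.step()
            with torch.no_grad():
                for p_ema, p in zip(ema_flow.parameters(),
                                    target_flow.parameters()):
                    p_ema.mul_(gamma).add_(p, alpha=1.0 - gamma)
    return ema_flow
```

The minibatch size enters through the data loader, e.g. `torch.utils.data.DataLoader(dataset, batch_size=64, shuffle=True)` for the target-flow stage.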
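The Dataset Splits row describes five-fold cross-validation with a grid search over the training subset. The sketch below illustrates that generic protocol only; the estimator (`MLPRegressor`), the grid values, and the scoring function are hypothetical placeholders, since the paper specifies the procedure but not these details.

```python
# Minimal sketch of five-fold grid-search tuning on the train subset.
# Estimator, grid, and scoring are placeholders, not the authors' choices.
from itertools import product

import numpy as np
from sklearn.model_selection import KFold
from sklearn.neural_network import MLPRegressor

def grid_search_cv(X, y, grid, n_splits=5, seed=42):
    """Return the hyperparameter setting with the best mean CV score."""
    kf = KFold(n_splits=n_splits, shuffle=True, random_state=seed)
    best_params, best_score = None, -np.inf
    keys = list(grid)
    for values in product(*(grid[k] for k in keys)):
        params = dict(zip(keys, values))
        fold_scores = []
        for train_idx, val_idx in kf.split(X):
            model = MLPRegressor(**params, max_iter=500)
            model.fit(X[train_idx], y[train_idx])
            fold_scores.append(model.score(X[val_idx], y[val_idx]))  # R^2
        if np.mean(fold_scores) > best_score:
            best_params, best_score = params, np.mean(fold_scores)
    return best_params, best_score

# Example grid (hypothetical values):
grid = {"hidden_layer_sizes": [(10,), (32,)],
        "learning_rate_init": [1e-3, 5e-3]}
```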