Normalizing Flows for Interventional Density Estimation
Authors: Valentyn Melnychuk, Dennis Frauen, Stefan Feuerriegel
ICML 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Across various experiments, we demonstrate that our Interventional Normalizing Flows are expressive and highly effective, and scale well with both sample size and high-dimensional confounding. |
| Researcher Affiliation | Academia | LMU Munich & Munich Center for Machine Learning (MCML), Munich, Germany. Correspondence to: Valentyn Melnychuk <melnychuk@lmu.de>. |
| Pseudocode | Yes | Algorithm 1 Training procedure of INFs |
| Open Source Code | Yes | Code is available at https://github.com/Valentyn1997/INFs. |
| Open Datasets | Yes | To show the effectiveness of our INFs, we use established (semi-)synthetic datasets that have been previously used for treatment effect estimation (Shi et al., 2019; Curth & van der Schaar, 2021). |
| Dataset Splits | Yes | We performed hyperparameter tuning of the nuisance-parameter models for all the baselines based on five-fold cross-validation using the train subset. For each baseline, we performed a grid search with respect to different tuning criteria, evaluated on the validation subsets. (A minimal sketch of this procedure appears below the table.) |
| Hardware Specification | Yes | Experiments are carried out on Intel(R) Xeon(R) Silver 4316 CPU @ 2.30GHz. |
| Software Dependencies | No | We implemented our INFs using PyTorch and Pyro. (No version numbers are provided for PyTorch or Pyro; a version-recording snippet follows the table.) |
| Experiment Setup | Yes | For the nuisance flow, we use fully-connected subnetworks, each with one hidden layer (with h = 10 hidden units), and the dimensionality of the representation is set to d_R = 10. ... We use stochastic gradient descent (SGD) for fitting the parameters of the nuisance flow, and the Adam optimizer (Kingma & Ba, 2015) for the target flow, with learning rates η_N and η_T, respectively. We fix the weighting hyperparameter of the loss to α = 1 and the EMA smoothing hyperparameter to γ = 0.995. ... Those include the minibatch size b_T = 64 and the learning rate η_T = 0.005. (A hedged training-setup sketch follows the table.) |
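
The dataset-splits row describes five-fold cross-validation combined with a grid search over tuning criteria. Below is a minimal sketch of that tuning loop, assuming a generic setup: the hyperparameter grid and the `Ridge` stand-in model are illustrative placeholders, not the paper's actual nuisance models or grids.

```python
from itertools import product

import numpy as np
from sklearn.linear_model import Ridge
from sklearn.model_selection import KFold

# Hypothetical grid; the paper's actual grids are baseline-specific.
param_grid = {"ridge_alpha": [0.1, 1.0, 10.0]}

rng = np.random.default_rng(0)
X_train = rng.normal(size=(1000, 25))  # stand-in for the train subset
y_train = rng.normal(size=1000)

def fit_and_score(params, X_tr, y_tr, X_val, y_val):
    # Stand-in nuisance model; the paper tunes neural nuisance models here.
    model = Ridge(alpha=params["ridge_alpha"]).fit(X_tr, y_tr)
    return model.score(X_val, y_val)  # R^2 as a placeholder tuning criterion

best_score, best_params = -np.inf, None
kf = KFold(n_splits=5, shuffle=True, random_state=0)
for values in product(*param_grid.values()):
    params = dict(zip(param_grid.keys(), values))
    fold_scores = [
        fit_and_score(params, X_train[tr], y_train[tr], X_train[va], y_train[va])
        for tr, va in kf.split(X_train)
    ]
    if np.mean(fold_scores) > best_score:
        best_score, best_params = np.mean(fold_scores), params

print(best_params, best_score)
```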
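
Because the software-dependencies row notes that no version numbers are given, a reproduction should record its own environment. One minimal way to do so:

```python
# Record the versions actually used in a reproduction, since the paper
# pins neither PyTorch nor Pyro.
import pyro
import torch

print(f"torch=={torch.__version__}")
print(f"pyro-ppl=={pyro.__version__}")
```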
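
Finally, the experiment-setup row fixes several optimization hyperparameters. The sketch below wires them into a PyTorch two-optimizer setup. The `MLP` modules, the input dimensions, and the value of η_N are assumptions for illustration; only the quoted values (h = 10, d_R = 10, the SGD/Adam split, α = 1, γ = 0.995, b_T = 64, η_T = 0.005) come from the paper, whose real architecture lives in the linked repository.

```python
import torch
import torch.nn as nn

class MLP(nn.Module):
    """Fully-connected subnetwork with one hidden layer (h = 10 units)."""
    def __init__(self, d_in, d_out, h=10):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(d_in, h), nn.ReLU(), nn.Linear(h, d_out))

    def forward(self, x):
        return self.net(x)

# Placeholder stand-ins for the nuisance and target flows; the real
# architectures are in https://github.com/Valentyn1997/INFs.
nuisance_flow = MLP(d_in=25, d_out=10)  # representation dimensionality d_R = 10
target_flow = MLP(d_in=10, d_out=1)

# SGD for the nuisance flow, Adam (Kingma & Ba, 2015) for the target flow.
eta_N = 0.01   # assumed placeholder; the paper reports no fixed value here
eta_T = 0.005  # quoted value
opt_nuisance = torch.optim.SGD(nuisance_flow.parameters(), lr=eta_N)
opt_target = torch.optim.Adam(target_flow.parameters(), lr=eta_T)

alpha = 1.0    # loss weighting hyperparameter (quoted)
gamma = 0.995  # EMA smoothing hyperparameter (quoted)
b_T = 64       # target-flow minibatch size (quoted)

# One common way to apply EMA smoothing to the target-flow parameters;
# the paper's exact usage may differ.
ema_params = [p.detach().clone() for p in target_flow.parameters()]

@torch.no_grad()
def ema_update():
    for ema_p, p in zip(ema_params, target_flow.parameters()):
        ema_p.mul_(gamma).add_(p, alpha=1.0 - gamma)
```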