Counterfactual Density Estimation using Kernel Stein Discrepancies
Authors: Diego Martinez-Taboada, Edward Kennedy
ICLR 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | "First, we present a novel estimator for modelling counterfactual distributions given a parametric class of distributions, along with its theoretical analysis. Second, we illustrate the empirical performance of the estimator in a variety of scenarios." and "5 EXPERIMENTS: We provide a number of experiments with (semi)synthetic data." |
| Researcher Affiliation | Academia | Diego Martinez-Taboada Department of Statistics & Data Science Carnegie Mellon University Pittsburgh, PA 15213, USA diegomar@andrew.cmu.edu Edward H. Kennedy Department of Statistics & Data Science Carnegie Mellon University Pittsburgh, PA 15213, USA edward@stat.cmu.edu |
| Pseudocode | Yes | Algorithm 1 DR-MKSD |
| Open Source Code | Yes | Reproducible code for all experiments is provided in the supplementary materials. |
| Open Datasets | Yes | We start by training a dense neural network with layers of size [784, 100, 20, 10] on the MNIST train dataset. |
| Dataset Splits | Yes | For this, we minimize the log entropy loss on 80% of such train data and we store the parameters that minimize the log entropy loss for the remaining validation data (remaining 20% of the MNIST train dataset). |
| Hardware Specification | No | The paper does not provide specific details about the hardware used for running the experiments, such as GPU/CPU models, memory, or specific cloud instance types. |
| Software Dependencies | No | The paper mentions using 'scikit-learn package' for various classifiers (Logistic Regression, Ada Boost, Random Forest) but does not provide specific version numbers for these packages or for Python. |
| Experiment Setup | Yes | "Parameter θ was estimated by gradient descent for a number of 1000 steps." (Appendix B.1, B.2); "We estimate the minimizer of gn(θ) by gradient descent." (Section 5); "Figure 3 exhibits the values of gn over a grid {(θ1, θ2) : θ1, θ2 ∈ {−5, −4.9, −4.8, . . . , 5}}." (Appendix B.3); "π with Default Logistic Regression from the scikit-learn package with C = 1e5 and max_iter = 1000" (Appendix B.1). |
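The setup quoted above can be sketched in code. This is a minimal illustration, not the authors' implementation: the propensity model π uses the scikit-learn hyperparameters quoted in the paper (C = 1e5, max_iter = 1000), while the objective gn(θ) is replaced by a simple quadratic stand-in, since the paper's actual gn is the doubly robust kernel Stein discrepancy estimate; the data, learning rate, and target values are all hypothetical.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

# Propensity model pi(X): default Logistic Regression with the
# hyperparameters quoted from Appendix B.1 (C=1e5, max_iter=1000).
X = rng.normal(size=(500, 3))
A = rng.binomial(1, 1.0 / (1.0 + np.exp(-X[:, 0])))
pi_model = LogisticRegression(C=1e5, max_iter=1000).fit(X, A)
pi_hat = pi_model.predict_proba(X)[:, 1]

# Stand-in objective for g_n(theta): a quadratic with a known minimizer
# (hypothetical), used only to illustrate the 1000-step gradient descent
# over theta = (theta_1, theta_2) described in the paper.
target = np.array([1.0, -2.0])

def grad_gn(theta):
    # Gradient of ||theta - target||^2.
    return 2.0 * (theta - target)

theta = np.zeros(2)
lr = 0.1
for _ in range(1000):  # "gradient descent for a number of 1000 steps"
    theta -= lr * grad_gn(theta)
```

Swapping `grad_gn` for the gradient of the actual DR-MKSD objective would recover the paper's estimation loop.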