Learning Causal Models from Conditional Moment Restrictions by Importance Weighting

Authors: Masahiro Kato, Masaaki Imaizumi, Kenichiro McAlinn, Shota Yasui, Haruo Kakehi

ICLR 2022

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | In experiments, we confirm the soundness of our proposed method. We implement the following three methods based on our proposed method: first, we use neural networks to predict f and train the model by penalized least-squares in (4) (IW-LS); second, we use neural networks to predict f and train the model by minimizing the sum of approximated moment restrictions in (2) (IW-MM), which is the same as IW-LS except for the penalty term in IW-LS; third, we use a linear-in-parameter model with the Gaussian kernel to predict f and train the model by GMM (IW-Krnl). For all cases, we use neural networks for estimating r. We compare our proposed methods with four methods: DeepGMM (Bennett et al. (2019)), DFIV (Xu et al. (2021a)), DeepIV (Hartford et al. (2017)), and KIV (Singh et al. (2019)). We use the datasets proposed in Newey & Powell (2003), Ai & Chen (2003), and Hartford et al. (2017). (An illustrative sketch of the importance-weighted objective follows the table.)
Researcher Affiliation | Collaboration | Masahiro Kato (1, 2), Masaaki Imaizumi (2), Kenichiro McAlinn (3), Shota Yasui (1), and Haruo Kakehi (1); (1) AI Lab, CyberAgent, Inc.; (2) The University of Tokyo; (3) Temple University
Pseudocode | No | The paper describes methods in prose and mathematical equations but does not include any pseudocode or algorithm blocks.
Open Source Code | No | For DeepGMM, DFIV, DeepIV, and KIV, we use the code and hyperparameters used in Xu et al. (2021a) (https://github.com/liyuan9988/DeepFeatureIV). The paper does not provide a link or state that the code for the authors' own proposed methods is open source.
Open Datasets | Yes | We use the datasets proposed in Newey & Powell (2003), Ai & Chen (2003), and Hartford et al. (2017).
Dataset Splits | No | The paper mentions specific sample sizes for the experiments (e.g., 'n = 1,000', '1,000 samples', '5,000 samples') and reports MSE, but does not give explicit training/validation/test splits or percentages.
Hardware Specification | No | The paper does not specify any hardware details, such as GPU models, CPU types, or cloud computing instances, used to run the experiments.
Software Dependencies | No | The paper mentions using 'neural networks' and methods such as 'least-squares', but it does not name specific software packages with version numbers for reproducibility.
Experiment Setup | Yes | A regularization coefficient η is set to 0.001 as a result of cross-validation. Each fully-connected layer (FC) is followed by a leaky ReLU activation with leakiness α = 0.2. For DeepGMM, DFIV, DeepIV, and KIV, we use the code and hyperparameters used in Xu et al. (2021a). (A sketch of this layer pattern follows the table.)
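
To make the Research Type row concrete, here is a minimal, hypothetical PyTorch sketch of the importance-weighting idea behind IW-MM: the conditional moment E[y − f(X) | Z = z_i] is approximated by a density-ratio-weighted sample average, and the average of squared approximated moments is minimized. The toy data-generating process, network sizes, and the Gaussian-kernel stand-in for the ratio r (which the paper estimates with a neural network) are all assumptions for illustration, not the authors' implementation.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
n = 1000

# Toy instrumental-variable data (an assumption, not a dataset from the paper):
# instrument z drives x; unobserved confounder u affects both x and y.
z = torch.randn(n, 1)
u = torch.randn(n, 1)
x = z + 0.5 * u + 0.1 * torch.randn(n, 1)
y = x ** 2 + u + 0.1 * torch.randn(n, 1)

# Neural network f, as in the quoted description of IW-LS / IW-MM.
f = nn.Sequential(nn.Linear(1, 32), nn.LeakyReLU(0.2), nn.Linear(32, 1))

def ratio_weights(x_flat, z_anchor):
    # Fixed Gaussian-kernel proxy for the density ratio r(x, z) ~ p(x|z)/p(x);
    # the paper fits r with a neural network instead.
    w = torch.exp(-0.5 * (x_flat[None, :] - z_anchor) ** 2)  # (m, n)
    return w / w.mean(dim=1, keepdim=True)                   # self-normalize over j

opt = torch.optim.Adam(f.parameters(), lr=1e-3)
x_flat = x.squeeze(1)
for step in range(500):
    resid = (y - f(x)).squeeze(1)               # psi(y, f(x)) = y - f(x)
    idx = torch.randint(0, n, (64,))            # anchor points z_i
    w = ratio_weights(x_flat, z[idx])           # approximates r(x_j, z_i)
    moments = (w * resid[None, :]).mean(dim=1)  # ~ E[y - f(X) | Z = z_i]
    loss = (moments ** 2).mean()                # squared approximated moments
    opt.zero_grad()
    loss.backward()
    opt.step()
```

In this reading, IW-LS would add a penalty term with coefficient η to a least-squares-style version of this objective, and IW-Krnl would replace the network f with a linear-in-parameter Gaussian-kernel model trained by GMM.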
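
The Experiment Setup row pins down only the activation (a leaky ReLU with leakiness α = 0.2 after each fully-connected layer) and the regularization coefficient η = 0.001. A hedged sketch of that layer pattern follows; the depth and widths are assumptions, and realizing η as weight decay is one plausible reading of "regularization coefficient", not necessarily the penalty term in the paper's equation (4).

```python
import torch.nn as nn
import torch.optim as optim

def fc_leaky_net(in_dim: int, hidden: int = 64, out_dim: int = 1) -> nn.Sequential:
    """Fully-connected layers, each followed by LeakyReLU(0.2) as quoted."""
    return nn.Sequential(
        nn.Linear(in_dim, hidden), nn.LeakyReLU(0.2),  # leakiness alpha = 0.2
        nn.Linear(hidden, hidden), nn.LeakyReLU(0.2),
        nn.Linear(hidden, out_dim),                    # linear output layer
    )

net = fc_leaky_net(in_dim=1)
# eta = 0.001 realized here as weight decay (an assumption; the paper's
# penalty in (4) may take a different form).
opt = optim.Adam(net.parameters(), lr=1e-3, weight_decay=1e-3)
```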