Towards a Learning Theory of Cause-Effect Inference

Authors: David Lopez-Paz, Krikamol Muandet, Bernhard Schölkopf, Ilya Tolstikhin

ICML 2015 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable Result LLM Response
Research Type Experimental We conduct an array of experiments to test the effectiveness of a simple implementation of the presented causal learning framework. Given the use of random embeddings (14) in our classifier, we term our method the Randomized Causation Coefficient (RCC). Throughout our simulations, we featurize each sample S = {(x_i, y_i)}_{i=1}^n as ν(S) = (μ_{k,m}(P_S^x), μ_{k,m}(P_S^y), μ_{k,m}(P_S^{xy})), (15) where the three elements of (15) are the low-dimensional representations (14) of the empirical kernel mean embeddings of {x_i}_{i=1}^n, {y_i}_{i=1}^n, and {(x_i, y_i)}_{i=1}^n, respectively. The representation (15) is motivated by the typical conjecture in causal inference about the existence of asymmetries between the marginal and conditional distributions of causally related pairs of random variables (Schölkopf et al., 2012). Each of these three embeddings has random features sampled to approximate the sum of three Gaussian kernels (2) with hyper-parameters 0.1γ, γ, and 10γ, where γ is found using the median heuristic. In practice, we set m = 1000, and observe no significant improvements when using larger amounts of random features. To classify the embeddings (15) in each of the experiments, we use the random forest implementation from Python's sklearn-0.16-git. The number of trees is chosen from {100, 250, 500, 1000, 5000} via cross-validation.
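As a concrete illustration of the featurization in Eq. (15), here is a minimal Python sketch: random Fourier features (Rahimi & Recht) approximating the sum of three Gaussian kernels, averaged over the sample to produce the empirical mean embeddings. This is not the authors' code (that lives in the repository linked below); the function names, the median-heuristic normalization, and the kernel parameterization k(x, y) = exp(-γ‖x − y‖²) are assumptions.

```python
import numpy as np

def rff_mean_embedding(X, gammas, m=1000, seed=0):
    """Empirical mean embedding of sample X under a sum of Gaussian
    kernels k(x, y) = sum_g exp(-g * ||x - y||^2), approximated with
    m random Fourier features per kernel (an assumption on the form)."""
    rng = np.random.RandomState(seed)
    n, d = X.shape
    feats = []
    for g in gammas:
        # frequencies for exp(-g * ||x - y||^2) have covariance 2g * I
        W = rng.randn(d, m) * np.sqrt(2.0 * g)
        b = rng.uniform(0, 2 * np.pi, m)
        Z = np.sqrt(2.0 / m) * np.cos(X @ W + b)  # (n, m) random features
        feats.append(Z.mean(axis=0))              # average -> mean embedding
    return np.concatenate(feats)

def median_heuristic(X, max_pairs=1000, seed=0):
    """gamma = 1 / (2 * median pairwise squared distance); one common
    convention -- the paper does not spell out the normalization."""
    idx = np.random.RandomState(seed).choice(len(X), min(len(X), max_pairs),
                                             replace=False)
    D2 = ((X[idx, None, :] - X[None, idx, :]) ** 2).sum(-1)
    return 1.0 / (2.0 * np.median(D2[D2 > 0]))

def featurize_sample(x, y, m=1000):
    """nu(S) = (mu(P_x), mu(P_y), mu(P_xy)), as in Eq. (15)."""
    xy = np.column_stack([x, y])
    gamma = median_heuristic(xy)
    gammas = [0.1 * gamma, gamma, 10 * gamma]
    return np.concatenate([
        rff_mean_embedding(x[:, None], gammas, m),
        rff_mean_embedding(y[:, None], gammas, m),
        rff_mean_embedding(xy, gammas, m),
    ])
```

Concatenating one feature block per bandwidth is one simple way to realize a sum of kernels, since the inner products of the blocks add up.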
Researcher Affiliation Collaboration David Lopez-Paz1,2 DAVID@LOPEZPAZ.ORG; Krikamol Muandet1 KRIKAMOL@TUEBINGEN.MPG.DE; Bernhard Schölkopf1 BS@TUEBINGEN.MPG.DE; Ilya Tolstikhin1 ILYA@TUEBINGEN.MPG.DE. 1Max-Planck-Institute for Intelligent Systems; 2University of Cambridge
Pseudocode No The paper does not contain structured pseudocode or algorithm blocks.
Open Source Code Yes Our experiments can be replicated using the source code at https://github.com/lopezpaz/causation_learning_theory.
Open Datasets Yes The Tübingen cause-effect pairs is a collection of heterogeneous, hand-collected, real-world cause-effect samples (Zscheischler, 2014). URL http://webdav.tuebingen.mpg.de/cause-effect/. ... The cause-effect challenges organized by Guyon (2014) provided N = 16,199 training causal samples S_i, each drawn from the distribution of (X_i, Y_i), and labeled either X_i → Y_i, X_i ← Y_i, X_i ← Z_i → Y_i, or X_i ⊥ Y_i. URL https://www.codalab.org/competitions/1381.
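For readers who want to pull the Tübingen pairs directly, a hypothetical loader along these lines should work. The pairNNNN.txt naming and the whitespace-separated two-column layout reflect the public distribution, but some pairs are multivariate, so treat this as a sketch rather than a complete reader.

```python
import urllib.request
import numpy as np

BASE = "http://webdav.tuebingen.mpg.de/cause-effect/"  # URL from the paper

def load_pair(i):
    """Fetch pair number i and return its first two columns as (x, y).
    Assumes the pairNNNN.txt naming of the public distribution."""
    url = BASE + "pair%04d.txt" % i
    data = np.loadtxt(urllib.request.urlopen(url))
    return data[:, 0], data[:, 1]
```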
Dataset Splits Yes The number of trees is chosen from {100, 250, 500, 1000, 5000} via cross-validation.
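The quoted model selection amounts to a grid search over the number of trees. A minimal sketch with a modern scikit-learn (the paper used sklearn-0.16-git, where GridSearchCV lived in sklearn.grid_search); the feature matrix V, the label vector, and the 5-fold setting are assumptions, since the paper does not state the fold count.

```python
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV

# V: (N, 3m) featurized samples from Eq. (15); labels: causal direction.
# Both names are hypothetical; the grid matches the one quoted above.
grid = GridSearchCV(
    RandomForestClassifier(random_state=0),
    param_grid={"n_estimators": [100, 250, 500, 1000, 5000]},
    cv=5,       # fold count is an assumption
    n_jobs=-1,
)
grid.fit(V, labels)
clf = grid.best_estimator_
```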
Hardware Specification No The paper does not provide specific hardware details (e.g., CPU, GPU models, or memory) used for running its experiments.
Software Dependencies Yes To classify the embeddings (15) in each of the experiments, we use the random forest implementation from Python's sklearn-0.16-git.
Experiment Setup Yes In practice, we set m = 1000, and observe no significant improvements when using larger amounts of random features. To classify the embeddings (15) in each of the experiments, we use the random forest implementation from Python's sklearn-0.16-git. The number of trees is chosen from {100, 250, 500, 1000, 5000} via cross-validation. ... A cause vector (x̂_{ij})_{j=1}^n is sampled from a mixture of Gaussians with c components. The mixture weights are sampled from U(0, 1) and normalized to sum to one. The mixture means and standard deviations are sampled from N(0, σ₁) and N(0, σ₂), respectively, accepting only positive standard deviations. ... A noise vector (ε̂_{ij})_{j=1}^n is sampled from a centered Gaussian, with variance sampled from U(0, σ₃). ... A mapping mechanism f̂_i is conceived as a spline fitted using a uniform grid of d_f elements from min((x̂_{ij})_{j=1}^n) to max((x̂_{ij})_{j=1}^n) as inputs, and d_f normally distributed outputs. ... We set n = 1000 and N = 10,000.
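The synthetic-data recipe above translates almost line by line into code. A sketch under stated assumptions: the default hyper-parameter values are placeholders (the paper samples them from its own ranges), a cubic spline stands in for the unspecified spline type, and taking the absolute value of a N(0, σ₂) draw is equivalent to the quoted rejection step for positive standard deviations.

```python
import numpy as np
from scipy.interpolate import CubicSpline

def sample_pair(n=1000, c=5, sigma1=5.0, sigma2=5.0, sigma3=1.0,
                df=10, rng=None):
    """One synthetic cause-effect sample (x, y) following the generative
    recipe quoted above. Hyper-parameter defaults are placeholders."""
    rng = rng or np.random.RandomState(0)
    # cause: mixture of c Gaussians with random weights, means, scales
    w = rng.uniform(0, 1, c)
    w /= w.sum()                                  # normalize to sum to one
    means = rng.randn(c) * sigma1
    stds = np.abs(rng.randn(c) * sigma2)          # half-normal = rejection
    comp = rng.choice(c, size=n, p=w)
    x = means[comp] + stds[comp] * rng.randn(n)
    # noise: centered Gaussian, variance drawn once from U(0, sigma3)
    eps = rng.randn(n) * np.sqrt(rng.uniform(0, sigma3))
    # mechanism: spline through df random outputs on a uniform input grid
    grid = np.linspace(x.min(), x.max(), df)
    f = CubicSpline(grid, rng.randn(df))
    y = f(x) + eps
    return x, y
```

Repeating this N = 10,000 times, once per label configuration, yields the training set described in the paper's simulations.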