Orthogonal Machine Learning: Power and Limitations

Authors: Lester Mackey, Vasilis Syrgkanis, Ilias Zadik

ICML 2018

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We apply these techniques in the setting of demand estimation from pricing and purchase data, where highly non-Gaussian treatment residuals are standard. In this setting, the treatment is the price of a product; commonly, conditional on all observable covariates, the treatment follows a discrete distribution representing random discounts offered to customers over a baseline price that is linear in the observables. Figure 1 portrays the results of a synthetic demand-estimation problem with dense dependence on the observables. There, the standard orthogonal moment estimator has large bias, comparable to its variance, while our second-order orthogonal moments lead to nearly unbiased estimation.
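The described data-generating process is easy to simulate. Below is a minimal Python sketch of the setting; the discount levels and probabilities, the noise scale, and the value of $\theta_0$ are illustrative assumptions rather than the paper's exact design (the sizes n, d, s follow the Experiment Setup row below).

```python
# Minimal sketch (assumed, not the authors' exact design) of the synthetic
# demand-estimation setting: price T is a baseline linear in covariates X
# plus a discrete "random discount" residual, and the purchase outcome Y
# depends linearly on T with confounding through X.
import numpy as np

rng = np.random.default_rng(0)
n, d, s = 5000, 1000, 100           # sizes taken from the Experiment Setup row
theta0 = 3.0                        # true price effect (illustrative value)

X = rng.normal(size=(n, d))
beta = np.zeros(d)
beta[:s] = 1.0 / s                  # sparse linear nuisance with support size s

# Discrete treatment residual: random discounts of 0, 1, or 2 off the baseline.
eta = rng.choice([0.0, -1.0, -2.0], size=n, p=[0.6, 0.3, 0.1])
T = X @ beta + eta                  # price: baseline linear in X plus discount
Y = theta0 * T + X @ beta + rng.normal(scale=0.5, size=n)   # purchase outcome
```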
Researcher Affiliation | Collaboration | ¹Microsoft Research New England, USA; ²Operations Research Center, MIT, USA. Correspondence to: Lester Mackey <lmackey@microsoft.com>, Vasilis Syrgkanis <vasy@microsoft.com>, Ilias Zadik <izadik@mit.edu>.
Pseudocode | No | The paper provides mathematical derivations and descriptions of methods but does not include pseudocode or clearly labeled algorithm blocks.
Open Source Code | Yes | Python code recreating all experiments is available at https://github.com/IliasZadik/double_orthogonal_ml.
Open Datasets | No | We generated n independent replicates of outcome Y, treatment T, and confounding covariates X.
Dataset Splits | Yes | This sample-splitting procedure proceeds as follows. 1. First stage: form an estimate $\hat h \in \mathcal{H}$ of $h_0$ using $(Z_t)_{t=n+1}^{2n}$ (e.g., by running a nonparametric or high-dimensional regression procedure). 2. Second stage: compute a Z-estimate $\hat\theta_{SS} \in \Theta$ of $\theta_0$ using an empirical version of the moment conditions (1) and $\hat h$ as a plug-in estimate of $h_0$: $\hat\theta_{SS}$ solves $\frac{1}{n}\sum_{t=1}^{n} m(Z_t, \theta, \hat h(X_t)) = 0$. ... A form of repeated sample splitting called K-fold cross-fitting (see, e.g., Chernozhukov et al., 2017) addresses both of these concerns. K-fold cross-fitting partitions the index set of the datapoints $[2n]$ into $K$ subsets $I_1, \dots, I_K$ of cardinality $2n/K$ (assuming for simplicity that $K$ divides $2n$) and produces the following two-stage estimate: ... For the first-order method, all remaining $n/2$ points were used for the second-stage estimation of $\theta_0$. For the second-order method, the moments $E[\eta^2]$ and $E[\eta^3]$ were estimated using a subsample of $n/4$ points as described in Theorem 10, and the remaining $n/4$ sample points were used for the second-stage estimation of $\theta_0$. For each method we performed cross-fitting across the first and second stages, and for the second-order method we performed nested cross-fitting between the $n/4$ subsample used for the $E[\eta^2]$ and $E[\eta^3]$ estimation and the $n/4$ subsample used for the second-stage estimation.
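The quoted cross-fitting procedure is straightforward to sketch in code. The following is a minimal illustration of K-fold cross-fitting for a first-order orthogonal (residual-on-residual) moment in a partially linear model; the Lasso first stage, the closed-form second stage, and the parameter defaults are illustrative assumptions, not the paper's exact implementation (whose second-order variant additionally estimates $E[\eta^2]$ and $E[\eta^3]$ on a separate, nested subsample).

```python
# Minimal sketch of K-fold cross-fitting for a first-order orthogonal
# (residual-on-residual) moment in a partially linear model. The Lasso
# first stage and closed-form second stage are illustrative choices.
import numpy as np
from sklearn.linear_model import Lasso
from sklearn.model_selection import KFold

def crossfit_theta(X, T, Y, K=2, alpha=0.1):
    """Two-stage estimate of theta0 with K-fold cross-fitting."""
    t_res, y_res = np.empty_like(T), np.empty_like(Y)
    for train, test in KFold(n_splits=K, shuffle=True, random_state=0).split(X):
        # First stage: fit the nuisances E[T|X] and E[Y|X] on the
        # complementary folds, then residualize the held-out fold.
        t_res[test] = T[test] - Lasso(alpha=alpha).fit(X[train], T[train]).predict(X[test])
        y_res[test] = Y[test] - Lasso(alpha=alpha).fit(X[train], Y[train]).predict(X[test])
    # Second stage: solve the empirical orthogonal moment
    # (1/n) * sum_t t_res[t] * (y_res[t] - theta * t_res[t]) = 0 for theta.
    return float(t_res @ y_res / (t_res @ t_res))
```

Paired with the simulated data in the earlier sketch, `crossfit_theta(X, T, Y)` returns an estimate of $\theta_0$; using held-out folds for the nuisance fits is what removes the overfitting bias that motivates sample splitting in the first place.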
Hardware Specification | No | The paper does not specify any particular hardware used for running the experiments.
Software Dependencies | No | The paper mentions "Python code recreating all experiments" but does not specify the Python version or any other software dependencies with version numbers.
Experiment Setup | Yes | Sample size $n = 5000$, dimension of confounders $d = 1000$, support size of sparse linear nuisance functions $s = 100$. ... The regularization parameter $\lambda_n$ of each Lasso was chosen to be $\sqrt{\cdots}$.
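A configuration sketch under the stated setup follows. Because the exact $\lambda_n$ expression is truncated above, the $\sqrt{\log d / n}$ rate below is a standard theory-driven stand-in and is an assumption, not the paper's value.

```python
# Configuration sketch for the reported setup: n = 5000, d = 1000, s = 100.
# The sqrt(log d / n) regularization rate is an ASSUMED stand-in for the
# paper's (truncated) lambda_n expression, not taken from the paper.
import numpy as np
from sklearn.linear_model import Lasso

n, d, s = 5000, 1000, 100
lam = np.sqrt(np.log(d) / n)     # assumed rate, not the paper's exact constant
first_stage = Lasso(alpha=lam)   # sklearn's `alpha` plays the role of lambda_n
```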