Estimation with Incomplete Data: The Linear Case

Authors: Karthika Mohan, Felix Thoemmes, Judea Pearl

IJCAI 2018

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We denominate the graph based recovery procedure presented in this paper as Model Based Estimation (MBE) and evaluate MBE by simulating partially observed datasets from missingness graphs and estimating their parameters from the incomplete data. We compare our estimates against those yielded by state of the art packages for SEM that apply Multiple Imputation (MI) (using mice package in R) and Maximum Likelihood (ML) (using lavaan in R) techniques [Schminkey et al., 2016; Enders, 2006]. Parameters are evaluated in terms of mean squared error and KL Divergence between original and learned distributions.
Researcher Affiliation | Academia | Karthika Mohan (1), Felix Thoemmes (2) and Judea Pearl (3). (1) University of California, Berkeley; (2) Cornell University; (3) University of California, Los Angeles
Pseudocode | Yes | Algorithm 1 Recover Variance(Y, G, X1, X2). Input: Y: variable whose variance is to be recovered; G: Markovian m-graph in which X1 is a parent of Y and X2 is a child of Y. Output: var(Y) if var(Y) is recoverable; NULL if var(Y) is not recoverable. 1: if var(Y) is recoverable using theorem 3 then 2: Recover estimand var(Y) 3: return var(Y) 4: if βX1,Y is recoverable by lemma 2 then 5: Recover estimand βX1,Y 6: else return NULL 7: αy ← Recover αy(G, Y, X1, X2) 8: if αy == NULL then return NULL 9: cov(X1, Y) ← Recover cov(G, Y, X1, αy) 10: if cov(X1, Y) == NULL then return NULL 11: return cov(X1, Y)
Open Source Code | No | The paper does not provide any specific statements or links indicating that its source code is publicly available.
Open Datasets | No | The paper states: "We generate data according to the following model and evaluate the performance of MBE, MI and FIML in terms of Mean Squared Error (MSE) and time taken to compute mean of X." and "We denominate the graph based recovery procedure presented in this paper as Model Based Estimation (MBE) and evaluate MBE by simulating partially observed datasets from missingness graphs and estimating their parameters from the incomplete data." This indicates the use of synthetically generated data, not a publicly available dataset.
Dataset Splits | No | The paper describes experiments performed on synthetically generated data, varying parameters like sample size and complexity. It mentions "500 simulations" for MSE computation, but does not specify any training, validation, or test dataset splits in the conventional sense.
Hardware Specification | No | The paper does not provide any specific details about the hardware used for running the experiments.
Software Dependencies | No | The paper mentions using "mice package in R" and "lavaan in R" but does not specify version numbers for R or these packages.
Experiment Setup | No | The paper describes an empirical evaluation and comparison to other software packages but does not provide specific experimental setup details such as hyperparameters, optimizer settings, or training configurations.
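
The evaluation protocol quoted under Research Type (simulate partially observed data, estimate a parameter from the incomplete sample, score it by mean squared error over repeated simulations) can be illustrated with a small sketch. This is not the paper's MBE procedure and not its simulation model: the two-variable linear model X -> Y, the coefficients, the logistic missingness mechanism, and all function names below are assumptions made for illustration. The only idea carried over is that when the missingness of Y depends solely on the fully observed X, a regression fitted on complete cases can be combined with the full-sample mean of X to estimate E[Y], whereas the plain complete-case mean is biased.

```python
import numpy as np


def simulate(n, rng):
    """Draw n samples from a hypothetical linear model X -> Y.

    X is fully observed; Y is missing with a probability that depends
    only on X (an assumed missing-at-random mechanism).
    """
    x = rng.normal(loc=1.0, scale=1.0, size=n)
    y = 2.0 + 1.5 * x + rng.normal(scale=1.0, size=n)  # true E[Y] = 3.5
    p_missing = 1.0 / (1.0 + np.exp(-2.0 * (x - 1.0)))
    observed = rng.random(n) > p_missing  # True where Y is observed
    return x, y, observed


def mean_complete_cases(y, observed):
    """Naive estimate: average Y over the rows where it is observed."""
    return y[observed].mean()


def mean_regression_based(x, y, observed):
    """Estimate E[Y] as intercept + slope * mean(X).

    The regression of Y on X is fitted on complete cases only, which is
    valid here because Y is independent of its missingness given X.
    """
    slope, intercept = np.polyfit(x[observed], y[observed], deg=1)
    return intercept + slope * x.mean()


def mse_over_simulations(n_sims=500, n=1000, seed=0):
    """Mean squared error of both estimators over repeated simulations."""
    rng = np.random.default_rng(seed)
    true_mean = 2.0 + 1.5 * 1.0
    sq_err = {"complete_cases": [], "regression_based": []}
    for _ in range(n_sims):
        x, y, obs = simulate(n, rng)
        sq_err["complete_cases"].append(
            (mean_complete_cases(y, obs) - true_mean) ** 2
        )
        sq_err["regression_based"].append(
            (mean_regression_based(x, y, obs) - true_mean) ** 2
        )
    return {name: float(np.mean(errs)) for name, errs in sq_err.items()}


if __name__ == "__main__":
    print(mse_over_simulations())
```

The default of 500 repetitions mirrors the "500 simulations" mentioned under Dataset Splits; the sample size and seed are arbitrary. Comparing against MI or FIML as the paper does would require the R packages it names (mice, lavaan) and is outside this sketch.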