DAGMA: Learning DAGs via M-matrices and a Log-Determinant Acyclicity Characterization

Authors: Kevin Bello, Bryon Aragam, Pradeep Ravikumar

NeurIPS 2022

| Reproducibility Variable | Result | LLM Response |
| --- | --- | --- |
| Research Type | Experimental | Finally, we provide extensive experiments for linear and nonlinear SEMs and show that our approach can reach large speedups and smaller structural Hamming distances against state-of-the-art methods. |
| Researcher Affiliation | Academia | Booth School of Business, University of Chicago, Chicago, IL 60637; Machine Learning Department, Carnegie Mellon University, Pittsburgh, PA 15213 |
| Pseudocode | Yes | Algorithm 1: DAGMA (see the acyclicity-function sketch after this table) |
| Open Source Code | Yes | Code implementing the proposed method is open-source and publicly available at https://github.com/kevinsbello/dagma. |
| Open Datasets | No | For each d, 30 matrices were randomly sampled from a standard Gaussian distribution. Given a data matrix X = [x_1, ..., x_d] ∈ ℝ^{n×d}, we define a score function Q(f; X) to measure the quality of a candidate SEM as follows: Q(f; X) = Σ_{j=1}^{d} loss(x_j, f_j(X)) (see the score sketch after this table). For linear models: In Appendix C.1, we report results for linear SEMs with Gaussian, Gumbel, and exponential noises, and use the least squares loss. This indicates the data is simulated rather than drawn from a fixed, publicly available dataset, and no access information is provided for the generated data. |
| Dataset Splits | No | No explicit training, validation, or test splits (e.g., percentages, sample counts, or split methodology) are mentioned; the paper implies the generated data is used directly for optimization. |
| Hardware Specification | Yes | All experiments were performed on a cluster running Ubuntu 18.04.5 LTS with Intel(R) Xeon(R) Gold 6130 CPU @ 2.10GHz, and an NVIDIA Tesla V100 GPU. |
| Software Dependencies | Yes | For our proposed method DAGMA, we implemented it in Python 3.8 and PyTorch 1.10.0. We use Adam [24] for optimization. |
| Experiment Setup | Yes | For all linear and nonlinear SEM experiments, we set the number of iterations T = 10000, initial central path coefficient µ(0) = 1, decay factor α = 0.5, ℓ1 parameter β1 = 0.01, and log-det parameter s = 1.0. We use the Adam optimizer with learning rate 0.001 (see the optimization sketch after this table). |
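
The Pseudocode row points to Algorithm 1 (DAGMA), which is built around the log-determinant acyclicity characterization named in the paper's title, h(W) = -log det(sI - W∘W) + d·log s. Below is a minimal PyTorch sketch of that function, not the authors' released code; it assumes W is a d×d weight matrix that stays in the domain where sI - W∘W is an M-matrix with positive determinant.

```python
import math
import torch

def h_logdet(W: torch.Tensor, s: float = 1.0) -> torch.Tensor:
    """Log-determinant acyclicity value h(W) = -log det(sI - W∘W) + d*log(s)."""
    d = W.shape[0]
    M = s * torch.eye(d, dtype=W.dtype) - W * W  # sI - W ∘ W (elementwise square)
    # slogdet assumes det(M) > 0, i.e. W remains inside the M-matrix domain
    _, logabsdet = torch.linalg.slogdet(M)
    return -logabsdet + d * math.log(s)
```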
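The Open Datasets row quotes the score Q(f; X) = Σ_{j=1}^{d} loss(x_j, f_j(X)). The sketch below specializes it to the linear case, f_j(X) = X w_j with a least-squares loss; the 1/(2n) scaling is an illustrative assumption and may differ from the paper's exact normalization.

```python
import torch

def score_linear(W: torch.Tensor, X: torch.Tensor) -> torch.Tensor:
    """Least-squares score Q(W; X) = (1/2n) * ||X - XW||_F^2 for a linear SEM."""
    n = X.shape[0]
    residual = X - X @ W  # column j is x_j - f_j(X) = x_j - X w_j
    return 0.5 / n * residual.pow(2).sum()
```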
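The Experiment Setup row lists the reported hyperparameters (T = 10000 iterations, µ(0) = 1, decay factor α = 0.5, ℓ1 parameter β1 = 0.01, log-det parameter s = 1.0, Adam with learning rate 0.001). The sketch below wires them into an illustrative central-path loop that minimizes µ·(score + β1·||W||_1) + h(W), reusing the h_logdet and score_linear sketches above. The number of outer iterations (n_outer = 4) and the simple µ-decay schedule are assumptions for illustration, not the authors' exact procedure.

```python
import torch

def fit_dagma_linear(X: torch.Tensor, n_outer: int = 4, T: int = 10_000,
                     mu0: float = 1.0, alpha: float = 0.5, beta1: float = 0.01,
                     s: float = 1.0, lr: float = 1e-3) -> torch.Tensor:
    """Illustrative central-path loop: minimize mu*(score + beta1*||W||_1) + h(W)."""
    d = X.shape[1]
    W = torch.zeros(d, d, requires_grad=True)
    mu = mu0
    for _ in range(n_outer):                 # outer central-path iterations (assumed count)
        opt = torch.optim.Adam([W], lr=lr)
        for _ in range(T):                   # T Adam steps per subproblem
            opt.zero_grad()
            # reuses score_linear and h_logdet from the sketches above
            obj = mu * (score_linear(W, X) + beta1 * W.abs().sum()) + h_logdet(W, s)
            obj.backward()
            opt.step()
        mu *= alpha                          # decay the central-path coefficient
    return W.detach()
```

A call such as `W_hat = fit_dagma_linear(torch.as_tensor(data, dtype=torch.float32))`, followed by thresholding small entries of `W_hat`, would yield a candidate DAG adjacency matrix under these assumptions.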