Hierarchical and Density-based Causal Clustering
Authors: Kwangho Kim, Jisu Kim, Larry Wasserman, Edward Kennedy
NeurIPS 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We explore finite sample properties via simulation, and illustrate the proposed methods in voting and employment projection datasets. |
| Researcher Affiliation | Academia | Kwangho Kim, Korea University (kwanghk@korea.ac.kr); Jisu Kim, Seoul National University (jkim82133@snu.ac.kr); Larry A. Wasserman, Carnegie Mellon University (larry@stat.cmu.edu); Edward H. Kennedy, Carnegie Mellon University (edward@stat.cmu.edu) |
| Pseudocode | No | The paper refers to external algorithms (e.g., 'Algorithm 2 in Balcan et al. [5]') but does not provide its own pseudocode or algorithm blocks. |
| Open Source Code | No | We plan to release a quick tutorial code on Github shortly. |
| Open Datasets | Yes | We explore finite sample properties via simulation, and illustrate the proposed methods in voting and employment projection datasets. ... Nie and Wager [46] considered a dataset on the voting study originally used by Arceneaux et al. [2]. ... The dataset, obtained from the US Bureau of Labor Statistics (BLS), provides projected employment by occupation. |
| Dataset Splits | No | We randomly chose a training set of size 13000 and a test set of size 10000 from the entire sample. |
| Hardware Specification | No | No specific hardware details (e.g., CPU/GPU models, memory) used for running the experiments are mentioned in the paper. |
| Software Dependencies | No | We use the cross-validation-based Super Learner ensemble [54] to combine regression splines, support vector machine regression, and random forests on the training sample, and perform the density-based causal clustering on the test sample using the DeBaCl function in the TDA R package [18]. |
| Experiment Setup | Yes | Letting n = 2500, we randomly pick 10 points in the bounded hypercube [0, 1]³: {c_1, ..., c_10}, and assign roughly n/10 points following a truncated normal distribution to each Voronoi cell associated with c_j; these are our {μ(i)}. Next, we let μ̂_a = μ_a + ξ with ξ ~ N(0, n^(−β)). ... We randomly chose a training set of size 13000 and a test set of size 10000 from the entire sample. Then we estimate {μ̂(i)} using the cross-validation-based Super Learner ensemble [54] to combine regression splines, support vector machine regression, and random forests on the training sample ... Next, letting h = 0.01, we compute p̃_h and p̂_h, and the corresponding level sets L_{h,t} and L̂_{h,t} for different values of t. |
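The simulation described in the Experiment Setup row can be sketched in Python (the paper itself uses R with the TDA package). This is a minimal sketch under stated assumptions: the value β = 0.5 and the 0.05 within-cluster spread are illustrative choices not given in the quote, plain Gaussians clipped to the cube stand in for the truncated normals on Voronoi cells, and a single density threshold replaces DeBaCl's full level-set tree.

```python
import numpy as np

rng = np.random.default_rng(0)
n, beta, h = 2500, 0.5, 0.01  # n and h from the paper; beta is an assumed value

# 10 cluster centers c_1, ..., c_10 in the unit cube [0, 1]^3
centers = rng.uniform(0.0, 1.0, size=(10, 3))

# roughly n/10 points per center; clipped Gaussians approximate the paper's
# truncated normals restricted to each Voronoi cell -- these play {mu(i)}
mu = np.vstack([
    np.clip(c + 0.05 * rng.standard_normal((n // 10, 3)), 0.0, 1.0)
    for c in centers
])

# noisy pseudo-outcomes: mu_hat_a = mu_a + xi, xi ~ N(0, n^(-beta))
mu_hat = mu + rng.normal(0.0, n ** (-beta), size=mu.shape)

def kde(points, query, h):
    """Unnormalized Gaussian kernel density estimate at the query points
    (the constant factor does not change which points exceed a quantile)."""
    sq = (query ** 2).sum(1)[:, None] + (points ** 2).sum(1)[None, :] \
        - 2.0 * query @ points.T
    d2 = np.maximum(sq, 0.0)  # guard against tiny negative round-off
    return np.exp(-d2 / (2.0 * h ** 2)).mean(axis=1)

# p_hat evaluated at the sample itself, then the upper level set
# L_{h,t} = {x : p_hat(x) >= t}; its connected components are the clusters
# (DeBaCl would build the whole level-set tree over all t instead)
p_hat = kde(mu_hat, mu_hat, h)
t = np.quantile(p_hat, 0.5)
level_set = mu_hat[p_hat >= t]
```

Varying `t` (or, equivalently, the quantile) reproduces the paper's idea of examining level sets at different density levels; a higher threshold isolates the densest cores of the effect clusters.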