Proximal Causal Inference With Text Data
Authors: Jacob Chen, Rohit Bhattacharya, Katherine Keith
NeurIPS 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We evaluate our method in synthetic and semi-synthetic settings, the latter with real-world clinical notes from MIMIC-III and open large language models for zero-shot prediction, and find that our method produces estimates with low bias. |
| Researcher Affiliation | Academia | Jacob M. Chen, Department of Computer Science, Johns Hopkins University, jchen459@jhu.edu; Rohit Bhattacharya, Department of Computer Science, Williams College, rb17@williams.edu; Katherine A. Keith, Department of Computer Science, Williams College, kak5@williams.edu |
| Pseudocode | Yes | Algorithm 1 for inferring two text-based proxies |
| Open Source Code | Yes | Supporting code is available at https://github.com/jacobmchen/proximal_w_text. |
| Open Datasets | Yes | For our semi-synthetic experiments, we use MIMIC-III, a deidentified dataset of patients admitted to critical care units at a large tertiary care hospital (Johnson et al., 2016). |
| Dataset Splits | Yes | Following sample splitting from the causal inference literature (Hansen, 2000), we start by splitting the semi-synthetic dataset into two splits, split 1 and split 2, where each split is 50% of the original dataset. |
| Hardware Specification | Yes | To run the experiments in this paper, we used a local server with 64 cores of CPUs and 4 x NVIDIA RTX A6000 48GB GPUs. |
| Software Dependencies | No | The paper mentions using 'scikit-learn library' and specific large language models (FLAN-T5 XXL, OLMo-7B-Instruct) but does not provide specific version numbers for these software components or any other ancillary software. |
| Experiment Setup | Yes | Whenever the positivity rate of W is less than 0.2 or greater than 0.8, i.e., there is a class imbalance, we set the hyperparameter `class_weight` to `balanced`. ... we set the hyperparameter `penalty` to `None` to turn off regularization. |
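
The hyperparameter rule quoted in the Experiment Setup row can be sketched roughly as follows. This is a minimal illustration, not the authors' code: the helper names `positivity_rate`, `logreg_params`, and `build_classifier` are hypothetical, and the use of scikit-learn's `LogisticRegression` is an assumption based on the `class_weight` and `penalty` hyperparameters named in the quote. Passing `penalty=None` assumes a recent scikit-learn release; older releases expected the string `"none"` instead.

```python
from typing import Any, Dict, Sequence


def positivity_rate(labels: Sequence[int]) -> float:
    """Fraction of binary labels equal to 1."""
    return sum(labels) / len(labels)


def logreg_params(
    w_labels: Sequence[int], low: float = 0.2, high: float = 0.8
) -> Dict[str, Any]:
    """Pick hyperparameters following the rule quoted in the table.

    The 0.2 / 0.8 thresholds come from the quoted setup; the function
    itself is a hypothetical helper written for illustration.
    """
    rate = positivity_rate(w_labels)
    imbalanced = rate < low or rate > high
    return {
        # None turns off regularization, as stated in the quoted setup.
        "penalty": None,
        # Reweight classes only when W is imbalanced; None is the default.
        "class_weight": "balanced" if imbalanced else None,
    }


def build_classifier(w_labels: Sequence[int]):
    """Construct an unfitted classifier (assumed to be logistic regression)."""
    # Imported here so the helpers above stay dependency-free.
    from sklearn.linear_model import LogisticRegression

    return LogisticRegression(**logreg_params(w_labels))
```

For example, a proxy W with one positive label in ten has a positivity rate of 0.1, so `logreg_params` would select `class_weight="balanced"`, while an evenly split W would keep the default of no reweighting.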