Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026).
Learning Causal Alignment for Reliable Disease Diagnosis
Authors: Mingzhou Liu, Ching-Wen Lee, Xinwei Sun, Xueqing Yu, Yu Qiao, Yizhou Wang
ICLR 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We demonstrate the effectiveness of our method on two medical diagnosis applications, showcasing faithful alignment to radiologists. Code is publicly available at https://github.com/lmz123321/Causal_alignment. [...] In this section, we evaluate our method on two medical diagnosis tasks: the benign/malignant classification of lung nodules and breast masses. [...] We repeat 3 different seeds to remove the effect of randomness. [...] Table 1: Comparison with baseline methods on LIDC-IDRI and CBIS-DDSM datasets. [...] Table 2: Ablation study on LIDC-IDRI and CBIS-DDSM datasets. [...] Figure 4: CAM visualization. Each row denotes different cases. |
| Researcher Affiliation | Academia | Mingzhou Liu¹, Ching-Wen Lee¹, Xinwei Sun², Xueqing Yu¹, Yu Qiao³, Yizhou Wang⁴˒¹˒⁵˒⁶˒⁷. ¹ School of Computer Science, Peking University; ² School of Data Science, Fudan University; ³ School of Automation and Intelligent Sensing, Shanghai Jiao Tong University; ⁴ Center on Frontiers of Computing Studies, Peking University; ⁵ Institute for Artificial Intelligence, Peking University; ⁶ Nat'l Eng. Research Center of Visual Technology, Peking University; ⁷ State Key Lab. of General Artificial Intelligence, Peking University. Corresponding authors: EMAIL EMAIL, EMAIL, EMAIL |
| Pseudocode | Yes | Algorithm 1 Causal alignment training. Input: Data D. Output: Decision model fθ. Hyperparameters: Sparsity regularization α, weight of alignment loss λ, learning rate η. 1: while not converged do 2: // Forward pass 3: Compute Lce. 4: Optimize (2) to obtain x and compute Lalign using (3). 5: Compute L ← Lce + λLalign. 6: // Back propagation 7: Estimate ∇θLalign with conjugate gradient. 8: Update θ: θ ← θ − η∇θL. // or Adam 9: end while |
| Open Source Code | Yes | Code is publicly available at https://github.com/lmz123321/Causal_alignment. |
| Open Datasets | Yes | We consider the LIDC-IDRI dataset Armato III et al. (2011) for lung nodule classification and the CBIS-DDSM dataset Lee et al. (2017) for breast mass classification. |
| Dataset Splits | Yes | We split the dataset into training (n = 731), validation (n = 238), and test (n = 244) sets. The CBIS-DDSM dataset ... We follow the official dataset split, with 691 masses in the training set and 200 masses in the test set. |
| Hardware Specification | No | The paper does not provide specific hardware details (e.g., exact GPU/CPU models, processor types with speeds, memory amounts, or detailed computer specifications) used for running its experiments. |
| Software Dependencies | No | We use the Adam optimizer and set the learning rate as 0.001. We adopt the TorchOpt Ren et al. (2022) package to implement the conjugate gradient estimator. |
| Experiment Setup | Yes | We use the Adam optimizer and set the learning rate as 0.001. We parameterize the attributes prediction network fθ1 with a seven-layer Convolutional Neural Network (CNN), and train it for 100 epochs with a batch size of 128 for each iteration. For the classification network fθ2, we parameterize it with a two-layer Multi-Layer Perceptron (MLP), and train it for 30 epochs with a batch size of 128. Please refer to Appx. B for details of the network architectures. For the hyperparameters α1 in (7) and α2 in (6), we set them to α1 = 0.01, α2 = 0.0005 for LIDC-IDRI and α1 = 0.07, α2 = 0.0005 for CBIS-DDSM, respectively. For both datasets, we set λ1 = λ2 = 1 in (5). |
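
Step 7 of Algorithm 1 estimates a gradient "with conjugate gradient", which the paper implements through the TorchOpt package. As a rough illustration of what such an estimator relies on, the sketch below shows a generic conjugate gradient solver for a linear system `A x = b`, where `A` is symmetric positive definite and is only accessed through matrix-vector products. It is not the authors' implementation and does not use TorchOpt; the names `conjugate_gradient`, `matvec`, and `example_matvec`, and the 2×2 example matrix, are hypothetical and chosen only for this example.

```python
from typing import Callable, List

Vector = List[float]


def dot(a: Vector, b: Vector) -> float:
    """Inner product of two equally sized vectors."""
    return sum(x * y for x, y in zip(a, b))


def conjugate_gradient(
    matvec: Callable[[Vector], Vector],
    b: Vector,
    max_iter: int = 50,
    tol: float = 1e-10,
) -> Vector:
    """Approximately solve A x = b for symmetric positive definite A.

    `matvec(v)` must return A @ v; A itself is never formed, which is
    why the method suits Hessian-vector products in large models.
    """
    x = [0.0] * len(b)          # initial guess x = 0
    r = list(b)                 # residual b - A x, equal to b at x = 0
    p = list(r)                 # first search direction
    rs_old = dot(r, r)
    for _ in range(max_iter):
        if rs_old <= tol:       # squared residual small enough: stop
            break
        Ap = matvec(p)
        alpha = rs_old / dot(p, Ap)                      # step size
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, Ap)]
        rs_new = dot(r, r)
        beta = rs_new / rs_old                           # direction update
        p = [ri + beta * pi for ri, pi in zip(r, p)]
        rs_old = rs_new
    return x


def example_matvec(v: Vector) -> Vector:
    """Hypothetical 2x2 SPD matrix [[4, 1], [1, 3]] applied to v."""
    return [4.0 * v[0] + 1.0 * v[1], 1.0 * v[0] + 3.0 * v[1]]


if __name__ == "__main__":
    solution = conjugate_gradient(example_matvec, [1.0, 2.0])
    print(solution)
```

In a training loop like Algorithm 1, `matvec` would presumably be a Hessian-vector product computed by automatic differentiation, so the solver never needs the full matrix; that mapping is an assumption here and is not taken from the paper's quoted text.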