Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].

Learning Causally Disentangled Representations for Fair Personality Detection

Authors: Yangfu Zhu, Meiling Li, Yuting Wei, Di Liu, Yuqing Li, Bin Wu

IJCAI 2025 | Venue PDF | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Extensive experiments conducted on three real-world datasets demonstrate that our IPDN outperforms state-of-the-art methods in personality detection.
Researcher Affiliation | Academia | 1 College of Information Engineering, Capital Normal University, Beijing, China 2 Beijing University of Posts and Telecommunications, Beijing, China EMAIL, EMAIL
Pseudocode | No | The paper describes the methodology using textual explanations and mathematical equations but does not include a clearly labeled pseudocode or algorithm block.
Open Source Code | No | The paper contains no explicit statement about releasing source code and provides no link to a code repository.
Open Datasets | Yes | Kaggle dataset is collected from Personality Cafe platform... Pandora consists of Reddit posts... [Gjurković et al., 2021]. Essays is a well-known stream-of-consciousness dataset [Pennebaker and King, 1999]
Dataset Splits | Yes | Following previous works [Yang et al., 2021b; Yang et al., 2023b], these datasets are randomly divided 6:2:2 into training, validation, and test sets.
Hardware Specification | Yes | We implement our IPDN in PyTorch 1.11.0 and train it on three NVIDIA GeForce RTX 2080 GPUs.
Software Dependencies | Yes | We implement our IPDN in PyTorch 1.11.0 and train it on three NVIDIA GeForce RTX 2080 GPUs. We utilized the Adam optimizer (Kingma et al., 2017).
Experiment Setup | Yes | We utilized the Adam optimizer (Kingma et al., 2017) and searched for the learning rate among {1e-2, 1e-3, 1e-4}. IPDN is trained for 80 and 120 epochs in single-dataset and cross-dataset experiments, respectively, with an early-stopping strategy. To initialize the post embeddings, we used the pre-trained language model BERT with the bert-base-cased architecture. The output dimensions of the mapping function are set to 200, 200, and 300 for the Kaggle, Pandora, and Essays datasets, respectively. The dimension of the confounder prototype matches the output dimension of the mapping function, which facilitates feature-level computation. The size K of the confounder dictionary C = [c1, c2, ..., cK] (i.e., the number of clusters) is set to 64, 128, and 64 for the three datasets, respectively. The trade-off parameter λ is searched in (0, 1) for each dataset.
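The reported random 6:2:2 division into training, validation, and test sets can be sketched as follows. This is a minimal reconstruction, not the authors' code; the helper name `split_indices` and the fixed seed are our assumptions.

```python
import random

def split_indices(n, ratios=(0.6, 0.2, 0.2), seed=42):
    """Shuffle n example indices and cut them into train/val/test by the given ratios."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)  # fixed seed is an assumption for reproducibility
    n_train = int(n * ratios[0])
    n_val = int(n * ratios[1])
    return idx[:n_train], idx[n_train:n_train + n_val], idx[n_train + n_val:]

train, val, test = split_indices(100)
# 60 / 20 / 20 disjoint index sets
```

The remainder after the train and validation cuts goes to the test set, so the three parts always cover every example exactly once.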
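The hyperparameter search described in the experiment setup can be organized as a small grid. The learning-rate grid and the per-dataset mapping dimensions and dictionary sizes K are taken from the reported setup; the concrete λ grid is our assumption, since the paper only states that λ is searched in (0, 1).

```python
from itertools import product

# Search space reported in the paper (λ grid values are our assumption).
search_space = {
    "lr": [1e-2, 1e-3, 1e-4],
    "lambda_": [0.1, 0.3, 0.5, 0.7, 0.9],
}

# Dataset-specific settings from the reported setup:
# mapping-function output dimension and confounder-dictionary size K.
dataset_config = {
    "Kaggle":  {"map_dim": 200, "K": 64},
    "Pandora": {"map_dim": 200, "K": 128},
    "Essays":  {"map_dim": 300, "K": 64},
}

def candidate_configs(dataset):
    """Yield one full config per (lr, λ) combination for the given dataset."""
    base = dataset_config[dataset]
    for lr, lam in product(search_space["lr"], search_space["lambda_"]):
        yield {**base, "lr": lr, "lambda_": lam}

print(len(list(candidate_configs("Kaggle"))))  # 3 learning rates x 5 λ values = 15
```

Each candidate would then be trained with Adam for the reported epoch budget and selected by validation performance under early stopping.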