Reproducibility Index

Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026).

Diverse Influence Component Analysis: A Geometric Approach to Nonlinear Mixture Identifiability

Authors: Hoang Son Nguyen, Xiao Fu

NeurIPS 2025

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	We tested the proposed approach on synthetic data and in a single-cell transcriptomics application. The results corroborate with our NMMI theory.
Researcher Affiliation	Academia	Hoang-Son Nguyen and Xiao Fu School of Electrical Engineering and Computer Science Oregon State University EMAIL
Pseudocode	No	The paper describes the proposed learning criterion and implementation details in Section 3.2 and Section 5, but does not provide structured pseudocode or algorithm blocks.
Open Source Code	Yes	Our implementation code is available here: https://github.com/hsnguyen24/dica
Open Datasets	Yes	We use the TRRUST dataset of TF-gene pairs of mouse [64]. It is a manually curated dataset that includes high-confidence interactions between TFs and their target genes.
Dataset Splits	Yes	For each of the simulation, we generate 30000 samples from the described synthetic data generation processes, of which 90% are for training and 10% are for evaluating the MCC and R2 scores.
Hardware Specification	Yes	All experiments use one NVIDIA A40 48GB GPU, hosted on a server using Intel Xeon Gold 6148 CPU @ 2.40GHz with 260GB of RAM.
Software Dependencies	No	The paper mentions the use of Adam method [75] and ReLU neural networks, but does not provide specific version numbers for programming languages, libraries, or frameworks used for implementation.
Experiment Setup	Yes	We use two fully-connected ReLU neural networks with one hidden layer of 64 neurons to represent fθ and gϕ. The autoencoder is trained via Adam method [75] with learning rate 10^-4 for 200 iterations, among which the first 20 epochs are for warm-up. The regularization hyperparameters are chosen by validating from {10^-2, 10^-3, 10^-4, 10^-5}, resulting in λvol = 10^-4, λnorm = 10^-4, λsp = 10^-4.
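
The Dataset Splits and Experiment Setup rows can be collected into one configuration object, which makes the reported numbers easier to check. The sketch below is not taken from the authors' repository: the class name `ExperimentConfig`, the function `split_indices`, the `seed` field, and the choice to shuffle before splitting are all assumptions made for illustration. Only the numeric values come from the quoted text.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class ExperimentConfig:
    """Values quoted in the table above; all field names are hypothetical."""
    n_samples: int = 30_000          # samples generated per simulation
    train_fraction: float = 0.9      # 90% training, 10% evaluation (MCC, R2)
    hidden_units: int = 64           # one hidden layer per ReLU network
    learning_rate: float = 1e-4      # Adam learning rate
    lambda_grid: tuple = (1e-2, 1e-3, 1e-4, 1e-5)  # validation grid
    lambda_vol: float = 1e-4         # selected regularization weights
    lambda_norm: float = 1e-4
    lambda_sp: float = 1e-4
    seed: int = 0                    # assumption: no seed is reported


def split_indices(cfg: ExperimentConfig) -> tuple[list[int], list[int]]:
    """Return (train, eval) index lists for a 90/10 split.

    Shuffling before the split is an assumption; the quoted text does not
    say how the samples are assigned to each part.
    """
    rng = random.Random(cfg.seed)
    indices = list(range(cfg.n_samples))
    rng.shuffle(indices)
    n_train = round(cfg.n_samples * cfg.train_fraction)
    return indices[:n_train], indices[n_train:]


if __name__ == "__main__":
    cfg = ExperimentConfig()
    train_idx, eval_idx = split_indices(cfg)
    print(len(train_idx), len(eval_idx))
```

With the default values, the split yields 27000 training and 3000 evaluation indices, matching the 90%/10% proportions in the Dataset Splits row.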