Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026).
Diverse Influence Component Analysis: A Geometric Approach to Nonlinear Mixture Identifiability
Authors: Hoang Son Nguyen, Xiao Fu
NeurIPS 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We tested the proposed approach on synthetic data and in a single-cell transcriptomics application. The results corroborate with our NMMI theory. |
| Researcher Affiliation | Academia | Hoang-Son Nguyen and Xiao Fu, School of Electrical Engineering and Computer Science, Oregon State University |
| Pseudocode | No | The paper describes the proposed learning criterion and implementation details in Section 3.2 and Section 5, but does not provide structured pseudocode or algorithm blocks. |
| Open Source Code | Yes | Our implementation code is available here: https://github.com/hsnguyen24/dica |
| Open Datasets | Yes | We use the TRRUST dataset of TF-gene pairs of mouse [64]. It is a manually curated dataset that includes high-confidence interactions between TFs and their target genes. |
| Dataset Splits | Yes | For each of the simulations, we generate 30000 samples from the described synthetic data generation processes, of which 90% are for training and 10% are for evaluating the MCC and R^2 scores. |
| Hardware Specification | Yes | All experiments use one NVIDIA A40 48GB GPU, hosted on a server using Intel Xeon Gold 6148 CPU @ 2.40GHz with 260GB of RAM. |
| Software Dependencies | No | The paper mentions the use of Adam method [75] and ReLU neural networks, but does not provide specific version numbers for programming languages, libraries, or frameworks used for implementation. |
| Experiment Setup | Yes | We use two fully-connected ReLU neural networks with one hidden layer of 64 neurons to represent fθ and gϕ. The autoencoder is trained via Adam method [75] with learning rate 10^-4 for 200 iterations, among which the first 20 epochs are for warm-up. The regularization hyperparameters are chosen by validating from {10^-2, 10^-3, 10^-4, 10^-5}, resulting in λvol = 10^-4, λnorm = 10^-4, λsp = 10^-4. |
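
The dataset split and experiment setup reported in the table can be gathered into a small configuration sketch. This is only an illustration of the quoted numbers, not the authors' code (see the linked repository for that). The names `ExperimentConfig` and `split_indices`, the `seed` field, and the use of a shuffled split are assumptions made here; the table does not say how the 90%/10% partition was drawn.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class ExperimentConfig:
    """Hypothetical container for the values quoted in the table."""
    n_samples: int = 30_000          # samples generated per simulation
    train_fraction: float = 0.9      # 90% training, 10% evaluation
    hidden_units: int = 64           # one hidden layer per network
    learning_rate: float = 1e-4      # Adam learning rate
    reg_grid: tuple = (1e-2, 1e-3, 1e-4, 1e-5)  # validated candidates
    lambda_vol: float = 1e-4         # selected regularization weights
    lambda_norm: float = 1e-4
    lambda_sp: float = 1e-4
    seed: int = 0                    # assumption: not stated in the table


def split_indices(cfg: ExperimentConfig):
    """Shuffle sample indices and cut them into train and evaluation sets.

    Shuffling is an assumption; the table only reports the proportions.
    """
    rng = random.Random(cfg.seed)
    indices = list(range(cfg.n_samples))
    rng.shuffle(indices)
    n_train = round(cfg.n_samples * cfg.train_fraction)
    return indices[:n_train], indices[n_train:]


cfg = ExperimentConfig()
train_idx, eval_idx = split_indices(cfg)
print(len(train_idx), len(eval_idx))  # 27000 3000
```

With the reported proportions, 27,000 samples go to training and 3,000 to evaluating the MCC and R^2 scores.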