Reproducibility Index

Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," Under Review, 2026).

Matchings Under Biased and Correlated Evaluations

Authors: Amit Kumar, Nisheeth K. Vishnoi

NeurIPS 2025 | Venue PDF | LLM Run Details

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	Our key technical contributions include: (i) a closed-form characterization of the equilibrium thresholds s_1 and s_2(γ) that govern the stable matching (Equation (4), Theorem 3.2); (ii) a piecewise expression for the representation ratio R(β, γ) (Theorem 3.3); and (iii) an analytic description of the structural γ-thresholds γ1, γ2, γ3 that govern transitions in selection (Theorem 3.1). In Appendix L.5, we numerically examine how thresholds, fairness metrics, and utilities evolve as p and the correlation parameter γ vary. A key contribution of this section lies in the comprehensive set of plots (Figures 8-14) that reveal monotonicity, phase transitions, and other emergent behaviors, offering insights that go beyond what closed-form analysis can capture. From the NeurIPS Paper Checklist: Question: Does the paper conduct EMPIRICAL STUDIES WITH DATA ANALYSIS...? Answer: [Yes]. Justification: The simulations are fully specified using closed-form formulas and all code used to generate figures is available in the supplementary material.
Researcher Affiliation	Academia	Amit Kumar IIT Delhi Nisheeth K. Vishnoi Yale University
Pseudocode	No	The paper primarily presents mathematical models, theorems, and proofs. While it discusses algorithms like the 'Deferred Acceptance algorithm' as related work, it does not provide its own specific pseudocode or algorithm blocks. A thorough review of the paper's sections and appendices did not reveal any explicitly labeled 'Algorithm' or 'Pseudocode' sections with structured steps.
Open Source Code	Yes	From the NeurIPS Paper Checklist: Question: Does the paper provide open access to the data and code...? Answer: [Yes]. Justification: Code is included in the supplemental materials and is sufficient to reproduce the key numerical experiments and heatmaps shown in the figures.
Open Datasets	No	Candidate attributes are drawn independently and uniformly from [0, 1], following standard assumptions in prior work on fairness in matching and screening [18, 29, 15]. While this enables clean derivations and actionable intervention design, it does not capture the full heterogeneity or noise present in real-world evaluations. From the NeurIPS Paper Checklist: Question: Does the paper describe safeguards that have been put in place for responsible release of data or models...? Answer: [NA]. Justification: The paper does not release any model or dataset with a high risk of misuse.
Dataset Splits	No	Candidate attributes are drawn independently and uniformly from [0, 1], following standard assumptions in prior work on fairness in matching and screening [18, 29, 15]. The paper does not use pre-existing datasets that would require specific training/test/validation splits. Instead, data is generated synthetically based on a uniform distribution, as described. Therefore, the concept of predefined dataset splits is not applicable to this work.
Hardware Specification	No	From the NeurIPS Paper Checklist: Question: For each experiment, does the paper provide sufficient information on the computer resources...? Answer: [NA]. Justification: No significant compute resources were used; all computations are analytical and run efficiently on standard personal machines.
Software Dependencies	No	From the NeurIPS Paper Checklist: Question: Does the paper provide open access to the data and code...? Answer: [Yes]. Justification: Code is included in the supplemental materials and is sufficient to reproduce the key numerical experiments and heatmaps shown in the figures. The paper confirms code availability but does not specify particular software dependencies with version numbers (e.g., Python 3.x, specific libraries with versions) that would be needed to replicate the numerical experiments. The justification in the checklist implies standard personal machine usage for analytical computations, but lacks version details for any ancillary software.
Experiment Setup	Yes	In Appendix L.5, we numerically examine how thresholds, fairness metrics, and utilities evolve as p and the correlation parameter γ vary. Figures 8-11 fix c = 0.2 and vary p for different values of β and γ. In Figure 8 (β = 0.8, γ = 0.9), increased preference alignment with Institution 1 leads to widening disparity: s_1 rises, s_2 falls (consistent with Theorem L.4, Proposition L.8, Proposition L.14), and both R and N drop significantly monotonically. Figure 8: Variation of thresholds, representation ratio, normalized representation ratio and utilities with p for a fixed value of c = 0.2, β = 0.8, and γ = 0.9.