Reproducibility Index

Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026).

Near-Optimal Smoothing of Structured Conditional Probability Matrices

Authors: Moein Falahatgar, Mesrob I. Ohannessian, Alon Orlitsky

NeurIPS 2016 | Venue PDF | LLM Run Details

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	7 Experiments Having expounded the theoretical merit of properly smoothing structured conditional probability matrices, we give a brief empirical study of its practical impact. We use both synthetic and real data.
Researcher Affiliation	Academia	Moein Falahatgar University of California, San Diego San Diego, CA, USA EMAIL Mesrob I. Ohannessian Toyota Technological Institute at Chicago Chicago, IL, USA EMAIL Alon Orlitsky University of California, San Diego San Diego, CA, USA EMAIL
Pseudocode	Yes	Algorithm: ADD-1/2-SMOOTHED LOW-RANK
Open Source Code	No	The paper does not provide any concrete access to source code (e.g., a specific repository link, an explicit code release statement, or code in supplementary materials) for the methodology described.
Open Datasets	Yes	tartuffe, a French text, train and test size: 9.3k words, vocabulary size: 2.8k words; genesis, English version, train and test size: 19k words, vocabulary size: 4.4k words; brown, shortened Brown corpus, train and test size: 20k words, vocabulary size: 10.5k words. All but the first one are readily available through the Python NLTK
Dataset Splits	Yes	In particular, half of the data was held out as a validation set, and for a range of different choices for m, the model was trained and its cross-entropy on the validation set was calculated.
Hardware Specification	No	The paper does not provide specific hardware details (e.g., GPU/CPU models, processor types, memory amounts, or detailed computer specifications) used for running its experiments.
Software Dependencies	No	The paper does not provide specific ancillary software details (e.g., library or solver names with version numbers) needed to replicate the experiment.
Experiment Setup	Yes	For all these experiments, m = 50 and 200 iterations were performed.
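
The validation procedure quoted under Dataset Splits (hold out half of the data, train, then compute cross-entropy on the held-out half) can be illustrated with a minimal sketch. This is not the paper's algorithm: it substitutes plain add-1/2 smoothing of bigram counts for the low-rank model, so there is no parameter m to sweep. The toy sentence and all function names are hypothetical, chosen only for illustration.

```python
import math
from collections import Counter


def split_half(tokens):
    """Hold out the second half of the token sequence as a validation set."""
    mid = len(tokens) // 2
    return tokens[:mid], tokens[mid:]


def bigram_counts(tokens):
    """Count (previous word, word) pairs."""
    return Counter(zip(tokens, tokens[1:]))


def context_totals(counts):
    """Total number of observed bigrams for each previous word."""
    totals = Counter()
    for (prev, _), c in counts.items():
        totals[prev] += c
    return totals


def add_half_prob(counts, totals, vocab_size, prev, word):
    """Add-1/2 smoothed estimate of P(word | prev).

    Assumption: plain add-1/2 smoothing, not the low-rank estimator.
    """
    return (counts[(prev, word)] + 0.5) / (totals[prev] + 0.5 * vocab_size)


def cross_entropy(train, valid, vocab):
    """Average negative log2-probability of validation bigrams."""
    counts = bigram_counts(train)
    totals = context_totals(counts)
    pairs = list(zip(valid, valid[1:]))
    log_loss = -sum(
        math.log2(add_half_prob(counts, totals, len(vocab), p, w))
        for p, w in pairs
    )
    return log_loss / len(pairs)


# Hypothetical toy data standing in for a real corpus.
tokens = "the cat sat on the mat the cat ate the rat on the mat".split()
train, valid = split_half(tokens)
vocab = sorted(set(tokens))
h = cross_entropy(train, valid, vocab)
print(f"validation cross-entropy: {h:.3f} bits per bigram")
```

In the setup the table describes, the same held-out cross-entropy would be computed once per candidate value of m and the values compared; the sketch has only a single model to evaluate.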