Reproducibility Index

Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. [K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026].

Targeted Maximum Likelihood Learning: An Optimization Perspective

Authors: Diyang Li, Kyra Gan

NeurIPS 2025 | Venue PDF | LLM Run Details

Reproducibility Variable	Result	LLM Response
Research Type	Theoretical	Targeted maximum likelihood estimation (TMLE) is a widely used debiasing algorithm for plug-in estimation. While its statistical guarantees, such as double robustness and asymptotic efficiency, are well-studied, the convergence properties of TMLE as an iterative optimization scheme have remained underexplored. To bridge this gap, we study TMLE's iterative updates through an optimization-theoretic lens, establishing global convergence under standard assumptions and regularity conditions. We begin by providing the first complete characterization of different stopping criteria and their relationship to convergence in TMLE. Next, we provide geometric insights. We show that each submodel induces a smooth, non-self-intersecting path (homotopy) through the probability simplex. We then analyze the solution space of the estimating equation and loss landscape. We show that all valid solutions form a submanifold of the statistical model, with the difference in dimension (i.e., codimension) exactly matching the dimension of the target parameter. Building on these geometric insights, we deliver the first strict proof of TMLE's convergence from an optimization viewpoint, as well as explicit sufficient criteria under which TMLE terminates in a single update.
Researcher Affiliation	Academia	Diyang Li (Cornell University), Kyra Gan (Cornell University)
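Pseudocode	Yes	Algorithm 1: Targeted Maximum Likelihood Estimator (TMLE). Input: data $\{O_i\}_{i=1}^n$, canonical gradient $D^*_\Psi(p)$ of the functional of interest $\Psi$, initial estimator $p_n^0$. 1: $k \leftarrow 0$, initialize $\epsilon_n^0 \neq 0$. 2: while $\epsilon_n^k \neq 0$ do 3: $\epsilon_n^k \leftarrow \arg\min_{\{\epsilon : p_n^k(\epsilon) \in \mathcal{M}\}} \int_{\mathcal{O}} L(p_n^k(\epsilon))(o)\, p_n\, d\nu(o)$ 4: $p_n^{k+1} \leftarrow p_n^k(\epsilon_n^k)$, $k \leftarrow k + 1$ (iterative debiasing) 5: end while 6: $p_n^* \leftarrow p_n^{k-1}$, $\hat{\psi}_n \leftarrow \Psi(p_n^*)$ (plug-in estimation). Output: targeted estimator $\hat{\psi}_n$.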
Open Source Code	No	Question: Does the paper provide open access to the data and code, with sufficient instructions to faithfully reproduce the main experimental results, as described in supplemental material? Answer: [NA] Justification: The paper does not include experiments.
Open Datasets	No	Question: Does the paper provide CONCRETE ACCESS INFORMATION (specific link, DOI, repository name, formal citation with authors/year, or reference to established benchmark datasets) for a publicly available or open dataset? Answer: [NA] Justification: The paper does not include experiments.
Dataset Splits	No	Question: Does the paper provide SPECIFIC DATASET SPLIT INFORMATION (exact percentages, sample counts, citations to predefined splits, or detailed splitting methodology) needed to reproduce the data partitioning? Answer: [NA] Justification: The paper does not include experiments.
Hardware Specification	No	Question: For each experiment, does the paper provide sufficient information on the computer resources (type of compute workers, memory, time of execution) needed to reproduce the experiments? Answer: [NA] Justification: The paper does not include experiments.
Software Dependencies	No	Question: Does the paper provide SPECIFIC ANCILLARY SOFTWARE DETAILS (e.g., library or solver names with version numbers like Python 3.8, CPLEX 12.4) needed to replicate the experiment? Answer: [NA] Justification: The paper does not include experiments.
Experiment Setup	No	Question: Does the paper specify all the training and test details (e.g., data splits, hyperparameters, how they were chosen, type of optimizer, etc.) necessary to understand the results? Answer: [NA] Justification: The paper does not include experiments.
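
Illustrative sketch (not from the paper, which reports no code or experiments): the loop structure of Algorithm 1 in the Pseudocode row can be tried out on a toy problem. Everything specific below is an assumption made for illustration: the target is the mean of a distribution on a small finite support, the submodel is an exponential tilt along the canonical gradient D*(p)(o) = o - Ψ(p), the loss is the negative log-likelihood, the one-dimensional minimization is done by bisection on the loss derivative, and the exact stopping test ε = 0 is replaced by a tolerance. The function names (`tilt`, `fit_eps`, `tmle_mean`) are hypothetical.

```python
import math


def mean(p, support):
    """Plug-in functional Psi(p): mean of a distribution on a finite support."""
    return sum(pi * o for pi, o in zip(p, support))


def tilt(p, support, eps, psi):
    """Assumed submodel: p_eps(o) proportional to p(o) * exp(eps * (o - psi))."""
    logits = [math.log(pi) + eps * (o - psi) for pi, o in zip(p, support)]
    top = max(logits)  # subtract the max for numerical stability
    weights = [math.exp(l - top) for l in logits]
    total = sum(weights)
    return [w / total for w in weights]


def fit_eps(p, support, xbar, lo=-50.0, hi=50.0, iters=200):
    """Minimize the empirical negative log-likelihood over eps.

    The loss is convex in eps and its derivative is
    mean(p_eps) - xbar, which is increasing, so bisection suffices.
    """
    psi = mean(p, support)

    def loss_derivative(eps):
        return mean(tilt(p, support, eps, psi), support) - xbar

    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if loss_derivative(mid) > 0:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)


def tmle_mean(data, support, p0, tol=1e-8, max_iter=50):
    """Iterate the targeting step until the fitted eps is (numerically) zero."""
    xbar = sum(data) / len(data)
    p = list(p0)
    history = []  # fitted eps at each iteration
    for _ in range(max_iter):
        eps = fit_eps(p, support, xbar)
        history.append(eps)
        if abs(eps) < tol:  # tolerance stands in for the exact test eps = 0
            break
        p = tilt(p, support, eps, mean(p, support))  # iterative debiasing
    return mean(p, support), history  # plug-in estimate and eps trace
```

For example, `tmle_mean([0, 1, 1, 2, 1, 0, 2, 1], [0, 1, 2, 3], [0.25] * 4)` starts from a uniform initial estimate and returns the plug-in estimate together with the sequence of fitted ε values. In this toy setup the tilt is a one-parameter exponential family in o, so the first update already matches the sample mean and the second fitted ε is numerically zero. This is only a property of the assumed example, not a restatement of the paper's single-update criteria.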