Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026).

Targeted Maximum Likelihood Learning: An Optimization Perspective

Authors: Diyang Li, Kyra Gan

NeurIPS 2025 | Venue PDF | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Theoretical | Targeted maximum likelihood estimation (TMLE) is a widely used debiasing algorithm for plug-in estimation. While its statistical guarantees, such as double robustness and asymptotic efficiency, are well-studied, the convergence properties of TMLE as an iterative optimization scheme have remained underexplored. To bridge this gap, we study TMLE's iterative updates through an optimization-theoretic lens, establishing global convergence under standard assumptions and regularity conditions. We begin by providing the first complete characterization of different stopping criteria and their relationship to convergence in TMLE. Next, we provide geometric insights. We show that each submodel induces a smooth, non-self-intersecting path (homotopy) through the probability simplex. We then analyze the solution space of the estimating equation and loss landscape. We show that all valid solutions form a submanifold of the statistical model, with the difference in dimension (i.e., codimension) exactly matching the dimension of the target parameter. Building on these geometric insights, we deliver the first strict proof of TMLE's convergence from an optimization viewpoint, as well as explicit sufficient criteria under which TMLE terminates in a single update.
Researcher Affiliation | Academia | Diyang Li (Cornell University), Kyra Gan (Cornell University)
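Pseudocode | Yes | Algorithm 1: Targeted Maximum Likelihood Estimator (TMLE). Input: Data $\{O_i\}_{i=1}^n$, canonical gradient $D^*_\Psi(p)$ of the interested functional $\Psi$, initial estimator $p_n^0$. 1: $k \leftarrow 0$, initialize $\epsilon_n^0 \neq 0$. 2: while $\epsilon_n^k \neq 0$ do 3: $\epsilon_n^k \leftarrow \arg\min_{\{\epsilon \,:\, p_n^k(\epsilon) \in \mathcal{M}\}} \int_{\mathcal{O}} L(p_n^k(\epsilon))(o)\, p_n\, d\nu(o)$ 4: $p_n^{k+1} \leftarrow p_n^k(\epsilon_n^k)$, $k \leftarrow k + 1$ (iterative debiasing) 5: end while 6: $p_n^* \leftarrow p_n^{k-1}$, $\hat{\psi}_n \leftarrow \Psi(p_n^*)$ (plug-in estimation) Output: Targeted estimator $\hat{\psi}_n$.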
Open Source Code | No | Question: Does the paper provide open access to the data and code, with sufficient instructions to faithfully reproduce the main experimental results, as described in supplemental material? Answer: [NA] Justification: The paper does not include experiments.
Open Datasets | No | Question: Does the paper provide CONCRETE ACCESS INFORMATION (specific link, DOI, repository name, formal citation with authors/year, or reference to established benchmark datasets) for a publicly available or open dataset? Answer: [NA] Justification: The paper does not include experiments.
Dataset Splits | No | Question: Does the paper provide SPECIFIC DATASET SPLIT INFORMATION (exact percentages, sample counts, citations to predefined splits, or detailed splitting methodology) needed to reproduce the data partitioning? Answer: [NA] Justification: The paper does not include experiments.
Hardware Specification | No | Question: For each experiment, does the paper provide sufficient information on the computer resources (type of compute workers, memory, time of execution) needed to reproduce the experiments? Answer: [NA] Justification: The paper does not include experiments.
Software Dependencies | No | Question: Does the paper provide SPECIFIC ANCILLARY SOFTWARE DETAILS (e.g., library or solver names with version numbers like Python 3.8, CPLEX 12.4) needed to replicate the experiment? Answer: [NA] Justification: The paper does not include experiments.
Experiment Setup | No | Question: Does the paper specify all the training and test details (e.g., data splits, hyperparameters, how they were chosen, type of optimizer, etc.) necessary to understand the results? Answer: [NA] Justification: The paper does not include experiments.
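
Illustrative sketch (not from the paper): the iterative loop of Algorithm 1 in the Pseudocode row can be made concrete on a toy problem. The code below is a minimal sketch under several assumptions that are not taken from the paper: a finite sample space, the mean functional Ψ(p) = Σ p(o) f(o) with canonical gradient f(o) − Ψ(p), an exponential-tilt submodel, the negative log-likelihood as loss, a plain Newton solver for ϵ, and a numerical tolerance in place of the exact test ϵ ≠ 0. All function names (`tilt`, `canonical_gradient`, `fit_epsilon`, `tmle`) are hypothetical.

```python
import math


def tilt(p, d, eps):
    """Assumed submodel: p_eps(o) proportional to p(o) * exp(eps * d(o))."""
    w = [pi * math.exp(eps * di) for pi, di in zip(p, d)]
    z = sum(w)
    return [wi / z for wi in w]


def canonical_gradient(p, f):
    """Gradient of the assumed mean functional: D(p)(o) = f(o) - Psi(p)."""
    psi = sum(pi * fi for pi, fi in zip(p, f))
    return [fi - psi for fi in f]


def fit_epsilon(p, d, p_emp, tol=1e-12, max_newton=100):
    """Minimise the empirical negative log-likelihood of tilt(p, d, eps) over eps.

    The objective is convex in eps; its derivative is E_{p_eps}[d] - E_{p_emp}[d]
    and its second derivative is Var_{p_eps}(d), so Newton's method applies.
    """
    eps = 0.0
    target = sum(ei * di for ei, di in zip(p_emp, d))
    for _ in range(max_newton):
        q = tilt(p, d, eps)
        mean_q = sum(qi * di for qi, di in zip(q, d))
        grad = mean_q - target
        hess = sum(qi * (di - mean_q) ** 2 for qi, di in zip(q, d))
        if abs(grad) < tol or hess == 0.0:
            break
        eps -= grad / hess
    return eps


def tmle(p0, f, p_emp, tol=1e-8, max_iter=50):
    """Repeat the fluctuation step until the fitted eps is numerically zero.

    Returns the plug-in estimate Psi(p) and the number of updates performed.
    """
    p, steps = list(p0), 0
    for _ in range(max_iter):
        d = canonical_gradient(p, f)
        eps = fit_epsilon(p, d, p_emp)
        if abs(eps) < tol:  # tolerance stands in for the exact test eps != 0
            break
        p = tilt(p, d, eps)  # debiasing update along the submodel
        steps += 1
    psi = sum(pi * fi for pi, fi in zip(p, f))
    return psi, steps


if __name__ == "__main__":
    # Toy inputs: initial estimate p0, values f(o), empirical distribution p_emp.
    psi_hat, n_updates = tmle([0.5, 0.3, 0.2], [0.0, 1.0, 2.0], [0.2, 0.3, 0.5])
    print(psi_hat, n_updates)
```

In this toy setting the first tilt already matches the empirical mean of f (1.3 for the inputs shown), so the loop stops after one update. This is a property of the assumed exponential-tilt model for a linear functional and should not be read as the paper's single-update criteria.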