Reproducibility Index

Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," Under Review, 2026).

Fréchet Geodesic Boosting

Authors: Yidong Zhou, Su Iao, Hans-Georg Müller

NeurIPS 2025 | Venue PDF | LLM Run Details

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	Through theoretical analysis, extensive simulations, and real-world applications, we demonstrate the strong performance and adaptability of FGBoost, showcasing its potential for modeling complex data. ... Simulation studies. Through extensive numerical experiments, we evaluate the performance of FGBoost across various types of non-Euclidean outputs, including distributions, networks, and compositional data. ... Experiments on real-world data. We validate the practical utility of FGBoost using real-world datasets from multiple domains.
Researcher Affiliation	Academia	Yidong Zhou Department of Statistics University of California, Davis Davis, CA 95616 EMAIL Su I Iao Department of Statistics University of California, Davis Davis, CA 95616 EMAIL Hans-Georg Müller Department of Statistics University of California, Davis Davis, CA 95616 EMAIL
Pseudocode	Yes	Algorithm 1 Fréchet Geodesic Boosting. Input: data $\{(X_i, Y_i)\}_{i=1}^n$, a new predictor level $X$ and a learning rate $\nu \in (0, 1)$. Initialize the model with the estimated Fréchet mean of $\{Y_i\}_{i=1}^n$: $\hat{F}_0(x) = \mathrm{id}_{Y_0}$, where $Y_0 = \arg\min_{\omega \in \mathcal{M}} \frac{1}{n} \sum_{i=1}^n d^2(Y_i, \omega)$. for $k = 1$ to $K$ do: 1. Fit a base learner (e.g. tree) $\hat{f}_k$ to approximate the geodesic from the current prediction to the actual observation using data $\{(X_i, \gamma_{\hat{Y}_i^{k-1}, Y_i})\}_{i=1}^n$, where $\hat{Y}_i^{k-1} = T_{\hat{F}_{k-1}(X_i)}(Y_0)$ denotes the current prediction. 2. Update the ensemble model: $\hat{F}_k(x) = \hat{F}_{k-1}(x) \oplus \{\nu \odot \hat{f}_k(x)\}$. end for. Output: prediction $\hat{Y} = T_{\hat{F}(X)}(Y_0)$, where $\hat{F}(X) := \hat{F}_K(X)$.
Open Source Code	Yes	Code for implementing FGBoost is available at https://github.com/SUIIAO/FGBoost.
Open Datasets	Yes	These include distributional data from human mortality studies, networks derived from New York City yellow taxi trip records, and compositional data from a survey of unemployed workers in New Jersey. ... This analysis uses age-at-death distributions from 162 countries in 2015 as distributional outputs. The life tables, sourced from the United Nations World Population Prospects 2024 (https://population.un.org/wpp/downloads), provide death counts grouped into five-year age intervals. ... We analyze transport dynamics in Manhattan, New York, using yellow taxi trip records obtained from https://www.nyc.gov/site/tlc/about/tlc-trip-record-data.page. ... A survey of unemployed workers in New Jersey [29] was conducted during the fall of 2009 and early 2010 ... K Additional real-world data application: National Health and Nutrition Examination Survey. ... We focused on modeling the distribution of physical activity intensity as a non-Euclidean response, using demographic and health-related variables as predictors. For each participant, activity values equal to zero or exceeding 1000 CPM were excluded ... These accelerometer data have been widely used to study the relationship between physical activity and health outcomes [31, 24].
Dataset Splits	Yes	Tuning these parameters can be accomplished through a grid search, assessing empirical risk with cross-validation. Additionally, 10% of the training set is reserved as the validation set in each run. ... To evaluate model performance, leave-one-out cross-validation is employed to compute the MSPE... Model performance is evaluated through five-fold cross-validation, with the MSPE averaged over 100 runs... Model performance is assessed using ten-fold cross-validation, with the MSPE averaged over 100 runs... we selected the 200 participants with the most valid observations and performed 10-fold cross-validation over 20 runs for model evaluation.
Hardware Specification	Yes	All experiments were conducted on a local machine equipped with an Apple M3 Max chip running macOS Sequoia.
Software Dependencies	No	The paper does not provide specific software versions for its own implementation. While it mentions the 'frechet package [14]' for data processing and 'R package version 0.3.0', it does not specify the versions of general programming languages (e.g., Python), machine learning frameworks (e.g., PyTorch, TensorFlow), or other key libraries used to implement FGBoost itself.
Experiment Setup	Yes	Common hyperparameters. In all simulations, the learning rate ν is set to 0.05, and the number of iterations K is fixed at 100. The depth of the tree is fixed at 3, with each leaf requiring a minimum of 10 samples. Tuning these parameters can be accomplished through a grid search, assessing empirical risk with cross-validation. Additionally, 10% of the training set is reserved as the validation set in each run. The training process halts when the empirical risk on the validation set no longer shows consistent improvement. ... Table 6: Hyperparameter settings. Learning rate 0.01 0.03 0.05 0.1 Number of iterations 50 70 90 100 Depth of each tree 2 3 4 5
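
The loop structure of Algorithm 1 (quoted in the Pseudocode row) can be illustrated in the simplest possible setting. The sketch below assumes real-valued (Euclidean) outputs, where the Fréchet mean is the arithmetic mean and the geodesic from the current prediction to the observation can be represented by the ordinary residual, so the procedure looks like least-squares boosting with shrinkage. It is not the authors' implementation (that is in the linked repository) and it does not handle non-Euclidean outputs. The names `fit_stump` and `boost` are hypothetical, and the base learner is a one-split stump on a scalar predictor rather than the depth-3 tree reported above. The defaults for the learning rate (0.05), the number of iterations (100), and the minimum leaf size (10) are taken from the Experiment Setup row.

```python
from statistics import fmean


def fit_stump(x, r, min_leaf=10):
    """Least-squares regression stump (one split) for targets r on scalar x."""
    n = len(x)
    min_leaf = max(1, min_leaf)  # each side of the split needs at least one point
    order = sorted(range(n), key=lambda i: x[i])
    best = None  # (sse, threshold, left_mean, right_mean)
    for cut in range(min_leaf, n - min_leaf + 1):
        left, right = order[:cut], order[cut:]
        if x[left[-1]] == x[right[0]]:
            continue  # tied predictor values cannot be separated
        lm = fmean([r[i] for i in left])
        rm = fmean([r[i] for i in right])
        sse = (sum((r[i] - lm) ** 2 for i in left)
               + sum((r[i] - rm) ** 2 for i in right))
        if best is None or sse < best[0]:
            best = (sse, (x[left[-1]] + x[right[0]]) / 2, lm, rm)
    if best is None:  # no admissible split: fall back to a constant fit
        const = fmean(r)
        return lambda t: const
    _, thr, lm, rm = best
    return lambda t: lm if t <= thr else rm


def boost(x, y, nu=0.05, K=100, min_leaf=10):
    """Euclidean analogue of the boosting loop: mean init, residual fits, shrinkage."""
    y0 = fmean(y)  # in Euclidean space the Frechet mean is the arithmetic mean
    pred = [y0] * len(y)
    learners = []
    for _ in range(K):
        # Assumption: the residual stands in for the geodesic from the
        # current prediction to the observation.
        resid = [yi - pi for yi, pi in zip(y, pred)]
        f = fit_stump(x, resid, min_leaf)
        learners.append(f)
        # Shrunken update of the ensemble, one step of size nu along the fit.
        pred = [pi + nu * f(xi) for pi, xi in zip(pred, x)]
    return lambda t: y0 + nu * sum(f(t) for f in learners)


# Toy usage: a step function that a single split can represent.
x = [i / 50 for i in range(50)]
y = [0.0 if xi < 0.5 else 1.0 for xi in x]
model = boost(x, y)
print(model(0.1), model(0.9))
```

With a small learning rate, each iteration closes only a fraction of the remaining gap between prediction and observation, which is why the number of iterations and the learning rate are tuned together in the grid reported in Table 6.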