Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].
NAFlora-1M: Continental-Scale High-Resolution Fine-Grained Plant Classification Dataset
Authors: John Park, Riccardo de Lutio, Brendan Rappazzo, Barbara Ambrose, Fabian Michelangeli, Kimberly Watson, Serge Belongie, Damon Little
DMLR 2024 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In this paper, we present a novel large-scale, fine-grained dataset, NAFlora-1M, which consists of 1,050,182 herbarium images covering 15,501 North American vascular plant species (90% of the known species). ... In addition, we present several baseline models, along with benchmarking results from a Kaggle competition: a total of 134 teams benchmarked the dataset in a total of 1,663 submissions; the leading team achieved an 87.66% macro-F1 score with a 1-billion-parameter ensemble model, leaving substantial room for future improvement in both performance and efficiency. |
| Researcher Affiliation | Collaboration | John Park EMAIL The New York Botanical Garden, Bronx, NY, USA; SCINet Program and ARS AI Center of Excellence, Office of National Programs, USDA Agricultural Research Service, Beltsville, MD, 20705, USA; Riccardo de Lutio EMAIL EcoVision Lab, Photogrammetry and Remote Sensing, ETH Zurich, 8092 Zürich, Switzerland; Brendan Rappazzo EMAIL Cornell University, Ithaca, NY, USA; Barbara Ambrose EMAIL Laboratory, The New York Botanical Garden, Bronx, NY, USA; Fabian Michelangeli EMAIL Institute of Systematic Botany, The New York Botanical Garden, Bronx, NY, USA; Kimberly Watson EMAIL Herbarium, The New York Botanical Garden, Bronx, NY, USA; Serge Belongie EMAIL Pioneer Centre for Artificial Intelligence, University of Copenhagen, 1350 Copenhagen, Denmark; Damon P. Little EMAIL Lewis B. and Dorothy Cullman Program for Molecular Systematics, The New York Botanical Garden, Bronx, NY, USA |
| Pseudocode | No | The paper describes methods and procedures in narrative text and tables, but does not include any explicitly labeled pseudocode or algorithm blocks. |
| Open Source Code | Yes | The dataset and training scripts are available at https://github.com/dpl10/NAFlora-1M. ... An example training and inference script can be found on GitHub: https://github.com/dpl10/NAFlora-1M/blob/main/src/naflora1m_train_and_infer.py. ... The details can be found in the Supplementary A.2 and in the GitHub repository https://github.com/dpl10/NAFlora-1M/blob/main/src/naflora1m_train_and_infer.py. |
| Open Datasets | Yes | In this paper, we present a novel large-scale, fine-grained dataset, NAFlora-1M, which consists of 1,050,182 herbarium images covering 15,501 North American vascular plant species (90% of the known species). ... The dataset and training scripts are available at https://github.com/dpl10/NAFlora-1M. |
| Dataset Splits | Yes | The dataset is partitioned with 80/20% split, resulting in 839,772 training images and 210,407 testing images, under the rule to include at least two images for each class in the test partition. |
| Hardware Specification | Yes | It took about 20 hours to train the network on Google TPU v3-8. ... Hardware Used: 4× A100; 8× A100; 8× A100; 8× RTX 3090; V100 |
| Software Dependencies | No | The paper mentions various neural network architectures (ResNet-50, MobileNetV3, EfficientNetV2, ViT), optimizers (SGDW, AdamW), and loss functions (Cross Entropy, ArcFace, Seesaw, Adaptive Mining Sample, Class balance loss) by name, but does not specify their version numbers or any other software dependencies such as Python or PyTorch versions. |
| Experiment Setup | Yes | All networks were pretrained on ImageNet-1k and fine-tuned with 256×256-pixel training images for 30 epochs. In terms of data augmentations we used only two: random horizontal flip and random rotation of 15 degrees. We utilize a cyclical learning rate scheduler (Smith, 2017), which has fast convergence with CNNs (Table 1); 20 epochs with a cyclical learning rate scheduler has been shown to produce faster convergence than 100 epochs of a constant learning rate (Smith and Topin, 2019). ... CNNs were optimized using Stochastic Gradient Descent with Weight decay (SGDW) and the Cross Entropy (CE) loss with label smoothing (0.1). ViT-B/16 and B/32 were fine-tuned with the AdamW optimizer, following Touvron et al. (2021) for learning rate and weight decay settings. We applied the same loss function with the same label smoothing threshold to ViT. ... The final neural network is fine-tuned with 400×400-pixel images for 60 epochs on NAFlora-1M (macro-F1 score = 80.47%), with a cyclical learning rate scheduler (Smith and Topin, 2019), and the maximal learning rate scaling linearly with the batch size as suggested by Goyal et al. (2018). |
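The split rule quoted in the Dataset Splits row (an 80/20% partition with at least two test images per class) can be sketched as below. This is an illustrative per-class splitter, not the authors' implementation from the NAFlora-1M repository; the function name `split_by_class` and its defaults are assumptions.

```python
import random
from collections import defaultdict

def split_by_class(image_ids, labels, test_frac=0.2, min_test=2, seed=0):
    """Per-class 80/20 split that reserves at least `min_test` images
    of every class for the test partition (Dataset Splits rule)."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for img, lab in zip(image_ids, labels):
        by_class[lab].append(img)
    train, test = [], []
    for imgs in by_class.values():
        rng.shuffle(imgs)
        # never fewer than min_test images per class in the test set
        n_test = max(min_test, round(len(imgs) * test_frac))
        test.extend(imgs[:n_test])
        train.extend(imgs[n_test:])
    return train, test
```

Note that a class with only `min_test` images ends up entirely in the test partition under this rule; how the authors handle such rare classes is not stated in the quoted text.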
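Two pieces of the Experiment Setup row are easy to make concrete: the triangular cyclical learning rate of Smith (2017) and the linear batch-size scaling of the maximal learning rate (Goyal et al., 2018). The sketch below is a generic rendering of those two published rules; the function names and the example hyperparameters are illustrative and are not taken from the NAFlora-1M training script.

```python
def cyclical_lr(step, step_size, base_lr, max_lr):
    """Triangular cyclical LR (Smith, 2017): the rate ramps linearly
    from base_lr to max_lr over step_size steps, then back down."""
    cycle = step // (2 * step_size)
    x = abs(step / step_size - 2 * cycle - 1)  # distance from peak, in [0, 1]
    return base_lr + (max_lr - base_lr) * (1 - x)

def scale_max_lr(reference_lr, reference_batch, batch_size):
    """Linear scaling rule (Goyal et al., 2018): the maximal learning
    rate grows in proportion to the batch size."""
    return reference_lr * batch_size / reference_batch
```

In a framework like PyTorch the same schedule is available off the shelf (e.g. `torch.optim.lr_scheduler.CyclicLR`); the paper does not state which implementation was used.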