Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].
Are Ensembles Getting Better All the Time?
Authors: Pierre-Alexandre Mattei, Damien Garreau
JMLR 2025 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We illustrate our results on a medical problem (diagnosing melanomas using neural nets) and a wisdom of crowds experiment (guessing the ratings of upcoming movies). |
| Researcher Affiliation | Academia | Pierre-Alexandre Mattei EMAIL Université Côte d'Azur, Inria, Maasai team, Laboratoire J.A. Dieudonné, CNRS, Nice, France; Damien Garreau EMAIL Julius-Maximilians-Universität Würzburg, Institute for Computer Science / CAIDAS, Würzburg, Germany |
| Pseudocode | No | The paper presents theoretical results, theorems, and proofs but does not include any explicitly labeled pseudocode or algorithm blocks. |
| Open Source Code | Yes | In Appendix G, we show similar curves for three more movies. It is also possible to produce such curves for all 20 movies using a Python notebook available at https://github.com/pamattei/Getting-Better-Ensembles. |
| Open Datasets | Yes | We use the Derma MNIST (Yang et al., 2023) data set, based on the HAM10000 collection (Tschandl et al., 2018)... based on data collected by Simoiu et al. (2019). |
| Dataset Splits | Yes | The training/validation/test split is the same as the one from Yang et al. (2023), and consists of 1,548 color images of resolution 28×28 for training, 221 similar images for validation, and 443 test images. |
| Hardware Specification | No | The paper does not explicitly describe the specific hardware (e.g., GPU model, CPU type) used for the experiments. |
| Software Dependencies | No | The paper mentions using a 'LeNet-like convolutional neural network' and 'dropout' but does not specify software versions for libraries, frameworks, or programming languages. |
| Experiment Setup | Yes | We use a simple LeNet-like convolutional network (LeCun et al., 1998) whose fully connected layers are regularised with a dropout rate of 50% (Srivastava et al., 2014). |
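The paper's wisdom-of-crowds experiment asks whether an averaged ensemble keeps improving as members are added. The following is a minimal pure-Python sketch, not code from the paper, illustrating how one might track the error of an averaged ensemble as its size grows; the true rating, noise level, and number of guessers are all hypothetical values chosen for illustration.

```python
import random

random.seed(0)

# Hypothetical setup: 20 "crowd members" each guess a movie's true rating.
TRUE_RATING = 7.2
guesses = [random.gauss(TRUE_RATING, 1.5) for _ in range(20)]

def ensemble_error(guesses, n, target):
    """Squared error of the average of the first n guesses."""
    avg = sum(guesses[:n]) / n
    return (avg - target) ** 2

# Error of the ensemble of size n, for n = 1, ..., 20. The sequence need
# not decrease monotonically in n, which is the question the paper studies.
errors = [ensemble_error(guesses, n, TRUE_RATING)
          for n in range(1, len(guesses) + 1)]
```

Plotting `errors` against ensemble size reproduces the kind of per-movie curve the authors describe generating with their Python notebook.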