Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].
Top Two Algorithms Revisited
Authors: Marc Jourdan, Rémy Degenne, Dorian Baudry, Rianne de Heide, Emilie Kaufmann
NeurIPS 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Finally, in Section 5 we report results from numerical experiments on a challenging non-parametric task using real-world data from a crop-management problem for various members of the Top Two family of algorithms. Most of them perform significantly better than the baselines. |
| Researcher Affiliation | Academia | (1) Univ. Lille, CNRS, Inria, Centrale Lille, UMR 9198-CRIStAL, F-59000 Lille, France; (2) Vrije Universiteit Amsterdam |
| Pseudocode | Yes | Figure 1: Generic β-Top Two sampling rule |
| Open Source Code | Yes | The code to reproduce the experiments can be found at https://github.com/mjourdan/TopTwoAlgorithmsRevisited (anonymous link during review). |
| Open Datasets | Yes | We benchmark our algorithms on the DSSAT simulator [22]. DSSAT is an Open-Source project maintained by the DSSAT Foundation, see https://dssat.net. |
| Dataset Splits | No | The paper describes generating Bernoulli instances and using a simulator, but does not provide specific train/validation/test splits for datasets. |
| Hardware Specification | No | We are using an internal cluster. As giving more details would break anonymity, we will include them in the camera-ready version. |
| Software Dependencies | Yes | All experiments are implemented in Python 3.8, using Numpy, Scipy and Matplotlib. |
| Experiment Setup | Yes | The stopping rule (2) is used with the threshold c(n, δ) defined in (4). As Top Two sampling rules, we present results for β-EB-TC, β-EB-TCI, β-TS-TC and β-TS-TCI with β = 0.5. For the Bernoulli instances, we run 1000 independent simulations for each configuration. For the DSSAT data, we only run 100 simulations as the computational cost for computing K_inf for nonparametric distribution is high. |
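
For readers unfamiliar with the β-Top Two family named in the table, the following is a minimal sketch of one sampling step on Bernoulli arms. It is an illustration under stated assumptions, not the authors' implementation: it pairs an empirical-best leader with a challenger that minimizes the standard transportation cost for Bernoulli distributions, which is how the EB-TC name is usually read. The function names (`kl_bernoulli`, `transport_cost`, `top_two_step`, `run`), the tie-breaking, and the fixed sampling budget are all hypothetical. The stopping rule (2) and threshold c(n, δ) from (4) are not reproduced in this summary, so the sketch does not implement them.

```python
import numpy as np


def kl_bernoulli(p, q, eps=1e-12):
    """KL divergence between Bernoulli(p) and Bernoulli(q), clamped away from 0 and 1."""
    p = min(max(p, eps), 1 - eps)
    q = min(max(q, eps), 1 - eps)
    return p * np.log(p / q) + (1 - p) * np.log((1 - p) / (1 - q))


def transport_cost(means, counts, i, j):
    """Empirical transportation cost between arms i and j (assumed Bernoulli form)."""
    if means[i] <= means[j]:
        return 0.0
    # Count-weighted average of the two empirical means.
    x = (counts[i] * means[i] + counts[j] * means[j]) / (counts[i] + counts[j])
    return counts[i] * kl_bernoulli(means[i], x) + counts[j] * kl_bernoulli(means[j], x)


def top_two_step(means, counts, beta, rng):
    """Return (leader, challenger, arm to pull) for one beta-Top Two step."""
    leader = int(np.argmax(means))  # empirical-best leader (ties: lowest index)
    others = [j for j in range(len(means)) if j != leader]
    costs = [transport_cost(means, counts, leader, j) for j in others]
    challenger = others[int(np.argmin(costs))]  # cheapest arm to confuse with the leader
    arm = leader if rng.random() < beta else challenger
    return leader, challenger, arm


def run(true_means, budget, beta=0.5, seed=0):
    """Fixed-budget loop (an assumption; the paper uses a stopping rule instead)."""
    rng = np.random.default_rng(seed)
    k = len(true_means)
    counts = np.zeros(k)
    sums = np.zeros(k)
    for t in range(budget):
        if t < k:
            arm = t  # pull every arm once to initialize the estimates
        else:
            _, _, arm = top_two_step(sums / counts, counts, beta, rng)
        reward = float(rng.random() < true_means[arm])  # Bernoulli draw
        counts[arm] += 1
        sums[arm] += reward
    return counts, sums / counts
```

Calling `run([0.3, 0.5, 0.8], budget=200)` returns the per-arm pull counts and empirical means. With β = 0.5, the value reported in the table, the leader is pulled on about half of the steps after initialization and the challenger on the rest.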