Active Learning-Based Species Range Estimation
Authors: Christian Lange, Elijah Cole, Grant Van Horn, Oisin Mac Aodha
NeurIPS 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We conduct a detailed evaluation of our approach and compare it to existing active learning methods using an evaluation dataset containing expert-derived ranges for one thousand species. Our results demonstrate that our method outperforms alternative active learning methods and approaches the performance of end-to-end trained models, even when only using a fraction of the data. |
| Researcher Affiliation | Collaboration | Christian Lange (University of Edinburgh), Elijah Cole (Altos Labs; Caltech), Grant Van Horn (UMass Amherst), Oisin Mac Aodha (University of Edinburgh) |
| Pseudocode | No | The paper does not contain any pseudocode blocks or algorithms. |
| Open Source Code | Yes | Code for reproducing the experiments in our paper can be found at https://github.com/Chris-lange/SDM_active_sampling. |
| Open Datasets | Yes | We use the same original set of publicly available data from iNaturalist [5] but retrain the model from scratch after removing species that are in our test sets. We make use of two expert-curated sources of range maps for evaluation and to generate labels for observations during the active learning process: International Union for Conservation of Nature (IUCN) [2] and eBird Status and Trends (S&T) [21]. |
| Dataset Splits | No | As we synthetically generate data from the underlying expert range maps provided, we do not have a specified train/test split and merely evaluate at each valid H3 cell centroid, measuring performance as the difference between a model's predictions and the expert-derived range map. |
| Hardware Specification | Yes | Training the feature extractor takes 2.5 hours on an NVIDIA RTX A6000 GPU with 48GB of RAM. A single experiment using our WA_HSS+ method involving 500 species across 50 time steps takes 4 hours using unoptimized code on an AMD EPYC 7513 32-Core Processor. |
| Software Dependencies | No | The logistic regression range estimation model h we optimize during active sampling is implemented using scikit-learn's linear_model.LogisticRegression class with default settings [33]. The paper mentions scikit-learn but does not specify its version number or any other software dependencies with version numbers. |
| Experiment Setup | Yes | This network uses the same architecture, training procedure, and hyperparameter settings as the AN-full model from [13]. This model is trained for 10 epochs using the Adam optimizer with a dropout probability of 0.5 and a batch size of 2048. |
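The per-species range model described under Software Dependencies can be sketched as follows. This is a minimal illustration, not the authors' code: the feature dimensionality, the synthetic features, and the labels are all placeholders (in the paper, features come from a pretrained spatial feature extractor and labels come from expert range maps queried during active learning); only the use of scikit-learn's LogisticRegression with default settings is taken from the quoted text.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Synthetic stand-ins: 200 locations with 256-d embeddings (hypothetical dim)
# and presence/absence labels derived from a toy rule.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 256))
y = (X[:, 0] > 0).astype(int)

# The range model h, with default settings as stated in the paper.
h = LogisticRegression()
h.fit(X, y)

# Predicted probability of presence at each location.
probs = h.predict_proba(X)[:, 1]
```

In the paper's setting, `h` would be refit after each active-learning query as new labeled observations arrive, which is cheap because only the small linear head is retrained.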
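The training hyperparameters quoted under Experiment Setup (Adam, dropout 0.5, batch size 2048, 10 epochs) can be illustrated with a minimal PyTorch sketch. The two-layer toy network, input dimension, loss function, and learning rate are assumptions for illustration; the paper's actual feature extractor follows the architecture and training procedure of [13].

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Placeholder model -- NOT the paper's architecture; it only demonstrates
# where the stated dropout probability of 0.5 would be applied.
model = nn.Sequential(
    nn.Linear(4, 64),
    nn.ReLU(),
    nn.Dropout(p=0.5),   # dropout probability 0.5, as stated
    nn.Linear(64, 1),
)

# Adam optimizer; the excerpt does not state a learning rate, so the
# PyTorch default is used here.
opt = torch.optim.Adam(model.parameters())
loss_fn = nn.BCEWithLogitsLoss()

# One synthetic batch of the stated size (2048); real training would
# iterate over many such batches per epoch.
X = torch.randn(2048, 4)
y = torch.rand(2048, 1)

for epoch in range(10):  # 10 epochs, as stated
    opt.zero_grad()
    loss = loss_fn(model(X), y)
    loss.backward()
    opt.step()
```

This isolates the four reported hyperparameters so a reader can see exactly which settings the paper does and does not pin down (notably, no learning rate is given in the excerpt).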