Spatial Implicit Neural Representations for Global-Scale Species Mapping
Authors: Elijah Cole, Grant Van Horn, Christian Lange, Alexander Shepard, Patrick Leary, Pietro Perona, Scott Loarie, Oisin Mac Aodha
ICML 2023 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We find that our approach scales gracefully, making increasingly better predictions as we increase the number of species and the amount of data per species when training. To make this problem accessible to machine learning researchers, we provide four new benchmarks that measure different aspects of species range estimation and spatial representation learning. Using these benchmarks, we demonstrate that noisy and biased crowdsourced data can be combined with implicit neural representations to approximate expert-developed range maps for many species. In this section we investigate the performance of SINR models on four species and environmental prediction tasks. |
| Researcher Affiliation | Collaboration | 1Caltech, 2Cornell, 3University of Edinburgh, 4iNaturalist. |
| Pseudocode | No | The paper does not contain any pseudocode or clearly labeled algorithm blocks. Figure A3 shows a 'Network diagram for the fully connected network (with residual connections)' which is an architecture diagram, not pseudocode. |
| Open Source Code | Yes | Training and evaluation code is available at: https://github.com/elijahcole/sinr |
| Open Datasets | Yes | We train our models on presence-only species observation data obtained from the community science platform iNaturalist (iNat). Our training data was sourced from the iNaturalist AWS Open Dataset in May 2022. iNaturalist. www.inaturalist.org, accessed 9 May 2023. |
| Dataset Splits | No | The paper explicitly states for the training data: 'Finally, we only included observations made prior to 2022. This will enable a temporal split from 2022 onward to be used as a validation set in the future.' This refers to a future possibility rather than a validation split used in the current experiments. For the 'Geo Feature' task, it mentions 'After splitting the locations into train and test data...', but no separate validation split details are provided for the main models to reproduce the experiments. |
| Hardware Specification | Yes | All models were trained on an Amazon AWS p3.2xlarge instance with a Tesla V100 GPU and 60 GB RAM. |
| Software Dependencies | Yes | The model training code was written in PyTorch (v1.7.0). |
| Experiment Setup | Yes | All models were trained for 10 epochs using a batch size of 2048 and a learning rate of 5e-4. We used the Adam optimizer with an exponential learning rate decay schedule, learning rate = initial learning rate × 0.98^epoch, where epoch ∈ {0, 1, . . . , 9}. For L_AN-full and L_GP we set = 2048. |
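The reported schedule decays the learning rate multiplicatively each epoch. A minimal sketch of that schedule, assuming the decay is applied per epoch as lr = 5e-4 × 0.98^epoch (the helper name `lr_for_epoch` and the standalone form are illustrative, not from the paper's code):

```python
# Sketch of the reported training hyperparameters: 10 epochs, batch size 2048,
# Adam with initial learning rate 5e-4 and exponential decay factor 0.98.
INITIAL_LR = 5e-4
NUM_EPOCHS = 10
DECAY = 0.98

def lr_for_epoch(epoch: int,
                 initial_lr: float = INITIAL_LR,
                 decay: float = DECAY) -> float:
    """Exponential decay: lr = initial_lr * decay**epoch, epoch in {0, ..., 9}."""
    return initial_lr * decay ** epoch

# Per-epoch learning rates for the full training run.
schedule = [lr_for_epoch(e) for e in range(NUM_EPOCHS)]
```

In PyTorch (the paper reports v1.7.0), the same behavior would typically be obtained by stepping `torch.optim.lr_scheduler.ExponentialLR(optimizer, gamma=0.98)` once per epoch on top of `torch.optim.Adam(model.parameters(), lr=5e-4)`.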