Bias Reduction via End-to-End Shift Learning: Application to Citizen Science

Authors: Di Chen, Carla P. Gomes

AAAI 2019

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "Applied to bird observational data from the citizen science project eBird, we demonstrate how SCN quantifies the data distribution shift and outperforms supervised learning models that do not address the data bias." From Experiments (Datasets and Implementation Details): "We worked with a crowd-sourced bird observation dataset from the successful citizen science project eBird (Sullivan et al. 2014), which is the world's largest biodiversity-related citizen science project, with more than 100 million bird sightings contributed each year by eBirders around the world."
Researcher Affiliation | Academia | Di Chen (di@cs.cornell.edu), Cornell University; Carla P. Gomes (gomes@cs.cornell.edu), Cornell University
Pseudocode | Yes | "Algorithm 1 shows the pseudocode of our two-step learning scheme for SCN."
Open Source Code | Yes | "Code to reproduce the experiments can be found at https://bitbucket.org/DiChen9412/aaai2019-scn/."
Open Datasets | Yes | "We worked with a crowd-sourced bird observation dataset from the successful citizen science project eBird (Sullivan et al. 2014), which is the world's largest biodiversity-related citizen science project, with more than 100 million bird sightings contributed each year by eBirders around the world." See also Table 1: Statistics of the eBird dataset.
Dataset Splits | Yes | "Table 1: Statistics of the eBird dataset." "We formed the unbiased test set and the unbiased validation set by overlaying a grid on the map and choosing observation records spatially uniformly." (A sketch of such grid-based spatially uniform sampling is given after the table.)
Hardware Specification | No | The paper does not provide specific hardware details (e.g., GPU/CPU models, memory amounts) used for running its experiments.
Software Dependencies | No | The paper mentions software components and techniques such as ReLU, DenseNet, the Adam optimizer, batch normalization, and dropout, but does not provide specific version numbers for these or for the programming environment.
Experiment Setup | Yes | "For all models in our experiments, the training process was done for 200 epochs, using a batch size of 128, cross-entropy loss, and an Adam optimizer (Kingma and Ba 2014) with a learning rate of 0.0001, and utilized batch normalization (Ioffe and Szegedy 2015), a 0.8 dropout rate (Srivastava et al. 2014), and early stopping to accelerate the training process and prevent overfitting." (A training-configuration sketch based on these settings follows the table.)
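
The dataset-split procedure quoted above (overlaying a grid on the map and choosing observation records spatially uniformly) can be illustrated with a short sketch. This is not the authors' released code: the column names (`latitude`, `longitude`), the 0.5-degree cell size, and the one-record-per-cell rule are illustrative assumptions.

```python
# Minimal sketch of grid-based spatially uniform sampling (assumed schema, not the paper's code).
import numpy as np
import pandas as pd

def spatially_uniform_sample(records: pd.DataFrame,
                             cell_deg: float = 0.5,
                             per_cell: int = 1,
                             seed: int = 0) -> pd.DataFrame:
    """Keep at most `per_cell` records from each lat/lon grid cell."""
    rng = np.random.default_rng(seed)
    # Assign each record to a grid cell by flooring its coordinates.
    lat_bin = np.floor(records["latitude"] / cell_deg).astype(int)
    lon_bin = np.floor(records["longitude"] / cell_deg).astype(int)
    cells = records.assign(_cell=list(zip(lat_bin, lon_bin)))
    # Sample within each cell so densely observed regions are not over-represented.
    sampled = (cells.groupby("_cell", group_keys=False)
                    .apply(lambda g: g.sample(min(per_cell, len(g)),
                                              random_state=int(rng.integers(1 << 31)))))
    return sampled.drop(columns="_cell")
```

The resulting records could then be held out as the unbiased validation and test sets, with the remaining (spatially biased) records used for training.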
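The experiment setup quoted above can likewise be summarized as a hedged PyTorch sketch: 200 epochs, cross-entropy loss, Adam with learning rate 1e-4, batch normalization, a 0.8 dropout rate, and early stopping. The model body is a generic stand-in rather than the paper's SCN architecture, the patience value is an assumption, and treating 0.8 as the drop probability (rather than the keep probability) is also an assumption.

```python
# Generic training-loop sketch reflecting the reported hyperparameters (not the SCN model itself).
import torch
import torch.nn as nn

def build_classifier(in_dim: int, n_classes: int) -> nn.Module:
    # Batch normalization and dropout as described; layer sizes are illustrative.
    return nn.Sequential(
        nn.Linear(in_dim, 256), nn.BatchNorm1d(256), nn.ReLU(), nn.Dropout(p=0.8),
        nn.Linear(256, n_classes),
    )

def train(model, train_loader, val_loader, epochs=200, lr=1e-4, patience=10):
    criterion = nn.CrossEntropyLoss()
    optimizer = torch.optim.Adam(model.parameters(), lr=lr)
    best_val, bad_epochs = float("inf"), 0
    for epoch in range(epochs):
        model.train()
        for x, y in train_loader:
            optimizer.zero_grad()
            loss = criterion(model(x), y)
            loss.backward()
            optimizer.step()
        # Early stopping on validation loss.
        model.eval()
        with torch.no_grad():
            val_loss = sum(criterion(model(x), y).item() for x, y in val_loader)
        if val_loss < best_val:
            best_val, bad_epochs = val_loss, 0
        else:
            bad_epochs += 1
            if bad_epochs >= patience:
                break
```

The reported batch size of 128 would be set where the data loaders are constructed, e.g. `DataLoader(train_ds, batch_size=128, shuffle=True)`.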