Three-quarter Sibling Regression for Denoising Observational Data

Authors: Shiv Shankar, Daniel Sheldon, Tao Sun, John Pickering, Thomas G. Dietterich

IJCAI 2019

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We provide theoretical justification of this approach, demonstrate its effectiveness on synthetic data, and show that it reduces systematic detection variability due to moon brightness in moth surveys.
Researcher Affiliation | Collaboration | (1) University of Massachusetts Amherst; (2) Mount Holyoke College; (3) Amazon; (4) University of Georgia; (5) Oregon State University
Pseudocode | No | No pseudocode or algorithm blocks were found in the paper.
Open Source Code | No | The paper mentions using third-party libraries ('scikit-learn package', 'pyGAM package') but does not state that the authors are releasing open-source code for their methodology. No links to code repositories are present.
Open Datasets | Yes | Our second set of experiments use moth survey data from the Discover Life project[4] for studying spatio-temporal variation in moth communities. This dataset consists of counts of moths of different species collected at regular intervals at different study sites by both citizen scientists and moth experts. (footnote 4: https://www.discoverlife.org/moth)
Dataset Splits | No | The paper describes a 'train-year / test-year' splitting strategy ('We hold out one year at a time for testing and make predictions using each other year, for a total of 20 train-year / test-year pairs') but does not mention a separate validation split.
Hardware Specification | No | The paper does not provide any specific hardware details (e.g., GPU/CPU models, memory, or cloud instance types) used for running the experiments.
Software Dependencies | No | The paper mentions the software packages 'scikit-learn package', 'pyGAM package', and 'Pyephem' but does not specify their version numbers.
Experiment Setup | No | The paper mentions using default settings for support-vector regression and gradient boosted regression trees from scikit-learn and pyGAM, but it does not provide specific hyperparameter values (e.g., learning rate, batch size, number of epochs) or other system-level training settings.
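
The 'train-year / test-year' strategy quoted under Dataset Splits can be illustrated with a minimal sketch. It assumes five survey years, which is only inferred from the quoted total of 20 pairs (5 x 4 = 20); the year labels are hypothetical placeholders, and this is not the authors' code.

```python
# Hypothetical year labels. Five years is an assumption inferred from the
# quoted "20 train-year / test-year pairs" (5 * 4 = 20).
years = ["year_1", "year_2", "year_3", "year_4", "year_5"]

# Hold out one year at a time for testing, and pair it with each other
# single year used for training.
pairs = [
    (train_year, test_year)
    for test_year in years
    for train_year in years
    if train_year != test_year
]

for train_year, test_year in pairs:
    print(f"train on {train_year}, test on {test_year}")
```

Under this reading, each pair corresponds to one fit on the train year and one evaluation on the held-out test year, with no year set aside for validation.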