Multi-Instance Multi-Label Class Discovery: A Computational Approach for Assessing Bird Biodiversity

Authors: Forrest Briggs, Xiaoli Fern, Raviv Raich, Matthew Betts

AAAI 2016 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | In a comparative study, we show that the proposed methods discover more species/classes than the current state of the art on a real-world dataset of 92,095 ten-second recordings collected in field conditions. We apply the proposed methods and baseline methods to this dataset, collected at 13 sites over a period of two months in a research forest. These recordings pose many challenges for automatic species discovery, including multiple simultaneously vocalizing birds of different species, non-bird sounds such as motors, and environmental noise, e.g., wind, rain, streams, and thunder. The results are presented as a graph of the number of species or classes discovered versus the number of recordings labeled (see the discovery-curve sketch after this table).
Researcher Affiliation | Collaboration | Forrest Briggs (Facebook, Inc., fbriggs@gmail.com); Xiaoli Z. Fern, Raviv Raich, and Matthew Betts (Oregon State University, {xfern,raich}@eecs.oregonstate.edu, matt.betts@oregonstate.edu).
Pseudocode | Yes | Algorithm 1: Multi-Instance Farthest First (MIFF) (see the MIFF sketch after this table).
Open Source Code | No | The paper does not provide an explicit statement about, or link to, open-source code for the described methodology.
Open Datasets | No | In this study, we collected audio data at 13 different sites in the H. J. Andrews Long Term Experimental Research Forest over a two-month period during the 2009 breeding season. We divided the full dataset into 920,956 ten-second intervals, then randomly subsampled 10% of this data to obtain a total of 92,095 ten-second recordings for our experiments. We annotated 150 randomly chosen ten-second recording spectrograms as examples for segmentation; Figure 1 in the paper shows an example of an annotated spectrogram used to train the segmentation algorithm. A further 1000 randomly chosen recordings were labeled as rain or non-rain to train the rain filter. However, there is no explicit link or statement indicating that this dataset is publicly available.
Dataset Splits | No | From the pool of 92,095 recordings, each method (dawn, cluster centers, MIFF, CCMIFF) is applied to select m = 100 recordings to be labeled; these 100 recordings are the target of the 'discovery' task. There is no explicit training/validation/test split for the discovery task, since the task is to select from an unlabeled pool. For the rain filter, the paper states it was "trained on 1000 ten-second recordings" but does not mention a validation set.
Hardware Specification | No | The paper does not provide specific hardware details (e.g., CPU or GPU models, or memory specifications) used for running the experiments.
Software Dependencies | No | The paper mentions algorithms such as k-means++ and a random forest classifier but does not specify any software dependencies (e.g., libraries or frameworks) with version numbers.
Experiment Setup | Yes | For MIFF and CCMIFF, we set the parameter p = 2 because we expect on average two classes per bag. For CCMIFF, we set the number of clusters k = 1000, based on the observation that with a smaller number of clusters (e.g., 100) the algorithm covers all clusters very early, before selecting m = 100 bags. We compare the species discovered by cluster centers, MIFF, and CCMIFF with a rain filter threshold of T ∈ {0.1, 0.01} or with no rain filter (see the parameter sketch after this table).
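
As a concrete illustration of the evaluation described under Research Type, below is a minimal sketch of computing the discovery curve (number of distinct species/classes discovered versus number of recordings labeled). The function and variable names are illustrative, not from the paper; it assumes each selected recording (bag) comes with its set of ground-truth species labels.

```python
from typing import Iterable, List, Set


def discovery_curve(selection_order: Iterable[Set[str]]) -> List[int]:
    """Cumulative count of distinct species/classes discovered as recordings
    are labeled in the given selection order.

    Each element of `selection_order` is the set of species present in one
    selected ten-second recording (a "bag" in MIML terms).
    """
    discovered: Set[str] = set()
    curve: List[int] = []
    for labels in selection_order:
        discovered |= labels          # species newly revealed by this recording
        curve.append(len(discovered))
    return curve


# Example: four recordings, three distinct species discovered in total.
print(discovery_curve([{"A", "B"}, {"B"}, {"C"}, {"A"}]))  # -> [2, 2, 3, 3]
```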
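Below is a minimal sketch of multi-instance farthest-first selection in the spirit of the paper's Algorithm 1 (MIFF). The bag-to-selected-set distance used here (the p-th largest of each instance's distance to its nearest already-selected instance) is an assumption for illustration; consult Algorithm 1 in the paper for the exact definition. All names are hypothetical.

```python
import numpy as np


def miff_select(bags, m, p=2, seed=None):
    """Greedy multi-instance farthest-first selection (illustrative sketch).

    bags : list of (n_i, d) arrays, one array of instance features per recording
    m    : number of bags (recordings) to select for labeling
    p    : expected number of classes per bag; the bag score is the p-th largest
           instance-to-selected-set distance (an assumption, not the paper's exact rule)
    """
    rng = np.random.default_rng(seed)
    selected = [int(rng.integers(len(bags)))]   # seed the selection with a random bag
    pool = np.array(bags[selected[0]])          # instances of all selected bags so far

    while len(selected) < m:
        best_bag, best_score = None, -np.inf
        for i, bag in enumerate(bags):
            if i in selected:
                continue
            # distance of each instance to its nearest already-selected instance
            d = np.linalg.norm(bag[:, None, :] - pool[None, :, :], axis=2).min(axis=1)
            # p-th largest such distance (falls back to the largest available
            # distance if the bag has fewer than p instances)
            score = np.sort(d)[-min(p, len(d))]
            if score > best_score:
                best_bag, best_score = i, score
        selected.append(best_bag)
        pool = np.vstack([pool, bags[best_bag]])
    return selected
```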
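Finally, below is a sketch of how the reported parameters (p = 2, k = 1000, m = 100, rain threshold T) might be wired together, assuming scikit-learn's RandomForestClassifier for the rain filter and KMeans with k-means++ initialization for the cluster-center methods. The feature representations, the exact rain-filter training protocol, and the coupling between clustering and MIFF in CCMIFF are not specified here; the scikit-learn choice and all function names are assumptions, not the authors' implementation.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.ensemble import RandomForestClassifier

# Parameters reported in the paper's experiment setup
P = 2          # expected classes per bag, used by MIFF / CCMIFF
K = 1000       # number of k-means clusters for the cluster-center methods
M = 100        # recordings selected for labeling
RAIN_T = 0.1   # rain-filter threshold; the paper compares T in {0.1, 0.01} and no filter


def filter_rain(recording_features, rain_train_X, rain_train_y, threshold=RAIN_T):
    """Drop recordings whose predicted rain probability exceeds `threshold`.

    Assumes a random forest rain classifier trained on 1000 labeled recordings,
    as described in the paper, with labels 0 = non-rain and 1 = rain; the audio
    feature representation is not specified here.
    """
    clf = RandomForestClassifier(n_estimators=100).fit(rain_train_X, rain_train_y)
    p_rain = clf.predict_proba(recording_features)[:, 1]
    return np.where(p_rain <= threshold)[0]      # indices of retained recordings


def cluster_instances(instance_features, k=K, seed=0):
    """Cluster all segmented instances with k-means++ (k = 1000 in the paper)."""
    km = KMeans(n_clusters=k, init="k-means++", random_state=seed)
    assignments = km.fit_predict(instance_features)
    return assignments, km.cluster_centers_
```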