Multi-Instance Multi-Label Class Discovery: A Computational Approach for Assessing Bird Biodiversity
Authors: Forrest Briggs, Xiaoli Fern, Raviv Raich, Matthew Betts
AAAI 2016 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In a comparative study, we show that the proposed methods discover more species/classes than the current state of the art on a real-world dataset of 92,095 ten-second recordings collected in field conditions. We apply our proposed methods, and baseline methods, to this dataset, which was collected at 13 sites over a period of two months in a research forest. These recordings pose many challenges for automatic species discovery, including multiple simultaneously vocalizing birds of different species, non-bird sounds such as motors, and environmental noise, e.g., wind, rain, streams, and thunder. The results of the experiment are presented as a graph of the number of species or classes discovered vs. the number of recordings labeled. |
| Researcher Affiliation | Collaboration | Forrest Briggs (Facebook, Inc., fbriggs@gmail.com); Xiaoli Z. Fern, Raviv Raich (Oregon State University, {xfern,raich}@eecs.oregonstate.edu); Matthew Betts (Oregon State University, matt.betts@oregonstate.edu) |
| Pseudocode | Yes | Algorithm 1: Multi-Instance Farthest First (MIFF). A hedged, illustrative sketch of a farthest-first selection loop is given after this table. |
| Open Source Code | No | The paper does not provide an explicit statement or link for open-source code for the methodology described. |
| Open Datasets | No | In this study, we collected audio data at 13 different sites in the H. J. Andrews Long Term Experimental Research Forest over a two-month period during the 2009 breeding season. We divided the full dataset into 920,956 ten-second intervals, then randomly subsampled 10% of this data to obtain a total of 92,095 ten-second recordings for our experiments. We annotated 150 randomly chosen ten-second recording spectrograms as examples for segmentation. Figure 1 shows an example of an annotated spectrogram for training the segmentation algorithm. A further 1000 randomly chosen recordings are labeled as rain or non-rain to train the rain filter. However, there is no explicit link or statement indicating that this dataset is publicly available. |
| Dataset Splits | No | From the pool of 92,095 recordings, we apply each of the methods (dawn, cluster centers, MIFF, CCMIFF) to select m = 100 recordings to be labeled. These 100 recordings are the target of the 'discovery' task. There is no explicit training, validation, or test split for the *discovery* task, as the task is to select from an unlabeled pool. For the rain filter, the paper states it is "trained on 1000 ten-second recordings" but does not mention a validation set. |
| Hardware Specification | No | The paper does not provide specific hardware details (e.g., CPU, GPU models, or memory specifications) used for running the experiments. |
| Software Dependencies | No | The paper mentions algorithms like 'k-means++' and 'random forest classifier' but does not specify any software dependencies (e.g., libraries, frameworks) with version numbers. |
| Experiment Setup | Yes | For MIFF and CCMIFF, we set the parameter p = 2 because we expect on average to have 2 classes per bag. For CCMIFF, we set the number of clusters k = 1000, based on the observation that with a smaller number of clusters (e.g., 100), the algorithm covers all clusters very early on, before selecting m = 100 bags. We compare the species discovered by cluster centers, MIFF, and CCMIFF with rain filter threshold T ∈ {0.1, 0.01} or no rain filter. A hedged sketch illustrating how these parameter choices might fit together follows the table. |
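
The paper's Algorithm 1 (MIFF) is only named in this report, not reproduced. As a rough illustration of what a multi-instance farthest-first selection loop could look like, the sketch below treats each recording as a bag of segment feature vectors and scores a candidate bag by the p-th largest distance from its instances to the nearest instance of an already-selected bag, with p = 2 as in the experiment setup. The function name, the distance measure, and the scoring rule are assumptions made for illustration, not the authors' specification.

```python
import numpy as np

def miff_select(bags, m, p=2, seed=0):
    """Greedy farthest-first selection over multi-instance bags (illustrative sketch).

    bags : list of (n_i, d) arrays -- one array of segment feature vectors per recording
    m    : number of recordings (bags) to select for labeling
    p    : expected number of classes per bag; a candidate bag is scored by the
           p-th largest instance-to-covered-set distance (assumed interpretation)
    """
    rng = np.random.default_rng(seed)
    selected = [int(rng.integers(len(bags)))]   # seed the traversal with a random bag
    covered = bags[selected[0]]                 # instances belonging to selected bags

    while len(selected) < m:
        best_bag, best_score = None, -np.inf
        for i, bag in enumerate(bags):
            if i in selected:
                continue
            # distance from each instance in this bag to its nearest covered instance
            dists = np.linalg.norm(bag[:, None, :] - covered[None, :, :], axis=2).min(axis=1)
            # p-th largest such distance (fall back to the max for very small bags)
            k = min(p, len(dists))
            score = np.sort(dists)[-k]
            if score > best_score:
                best_bag, best_score = i, score
        selected.append(best_bag)
        covered = np.vstack([covered, bags[best_bag]])
    return selected
```

At the scale of the paper's dataset (92,095 recordings), a naive double loop like this would be slow in pure Python; the sketch is only meant to make the bag-level selection criterion concrete.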
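
The experiment setup fixes p = 2 for MIFF/CCMIFF, k = 1000 clusters for CCMIFF, and rain-filter thresholds T ∈ {0.1, 0.01}. As a hedged illustration of how a cluster-centers baseline and a rain filter might be combined, the following sketch clusters per-recording feature vectors with k-means++ (one of the algorithms the paper names), skips recordings whose predicted rain probability exceeds T, and picks the recording nearest each cluster center. The feature representation, the nearest-to-center selection rule, and the rain-probability interface are assumptions for illustration; the paper's exact cluster-centers and CCMIFF procedures are not reproduced here.

```python
import numpy as np
from sklearn.cluster import KMeans

def cluster_center_select(bag_features, m, k=1000, rain_prob=None, T=0.1, seed=0):
    """Select up to m recordings near k-means++ cluster centers (illustrative sketch).

    bag_features : (N, d) array, one feature vector per ten-second recording
    rain_prob    : optional (N,) array of predicted rain probabilities; recordings
                   with rain_prob > T are skipped (assumed rain-filter behavior)
    """
    keep = np.arange(len(bag_features))
    if rain_prob is not None:
        keep = keep[rain_prob <= T]          # apply the rain filter at threshold T
    X = bag_features[keep]

    km = KMeans(n_clusters=min(k, len(X)), init="k-means++", n_init=10,
                random_state=seed).fit(X)

    # For each cluster center, pick the nearest not-yet-selected recording
    # until m recordings have been chosen.
    selected = []
    for center in km.cluster_centers_:
        if len(selected) >= m:
            break
        d = np.linalg.norm(X - center, axis=1)
        for j in np.argsort(d):
            idx = int(keep[j])
            if idx not in selected:
                selected.append(idx)
                break
    return selected
```

With k = 1000 centers and m = 100 selections, only a subset of clusters is ever visited, which is consistent with the paper's stated reason for choosing k much larger than 100.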