Attention-based Deep Multiple Instance Learning

Authors: Maximilian Ilse, Jakub Tomczak, Max Welling

Venue: ICML 2018

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We show empirically that our approach achieves comparable performance to the best MIL methods on benchmark MIL datasets and it outperforms other methods on a MNIST-based MIL dataset and two real-life histopathology datasets without sacrificing interpretability.
Researcher Affiliation | Academia | University of Amsterdam, the Netherlands.
Pseudocode | No | The paper does not contain any figures, blocks, or sections explicitly labeled 'Pseudocode' or 'Algorithm'.
Open Source Code | No | The paper mentions using code for a comparison method ('We use code provided with (Doran & Ray, 2014): https://github.com/garydoranjr/misvm'), but it does not provide a statement or link indicating that the authors' own code has been released.
Open Datasets | Yes | We evaluate our approach on a number of different MIL datasets: five MIL benchmark datasets (MUSK1, MUSK2, FOX, TIGER, ELEPHANT), an MNIST-based image dataset (MNIST-BAGS) and two real-life histopathology datasets (BREAST CANCER, COLON CANCER). The MNIST dataset is well-known, and the histopathology datasets are cited with proper bibliographic information (Gelasca et al., 2008; Sirinukunwattana et al., 2016).
Dataset Splits | Yes | In order to obtain a fair comparison we use a common evaluation methodology, i.e., 10-fold-cross-validation, and five repetitions per experiment. If an attention-based MIL pooling layer is used the number of parameters in V was determined using a validation set.
Hardware Specification | No | The paper does not provide any specific hardware details such as GPU or CPU models, memory specifications, or cloud computing resources used for running the experiments.
Software Dependencies | No | The paper mentions the use of the Adam optimization algorithm and refers to third-party code for a comparison method, but it does not specify any software dependencies with version numbers (e.g., Python 3.x, PyTorch 1.x, TensorFlow 2.x).
Experiment Setup | Yes | In order to obtain a fair comparison we use a common evaluation methodology, i.e., 10-fold-cross-validation, and five repetitions per experiment. If an attention-based MIL pooling layer is used the number of parameters in V was determined using a validation set. We tested the following dimensions (L): 64, 128 and 256. Finally, all layers were initialized according to Glorot & Bengio (2010) and biases were set to zero. The models are trained with the Adam optimization algorithm (Kingma & Ba, 2014). We keep the default parameters for β1 and β2, see Table 10 in the Appendix. (Illustrative sketches of the attention-based pooling layer and this training setup follow the table.)
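
For readers who want to see what the attention-based MIL pooling referenced in the rows above looks like in code, the following is a minimal sketch in PyTorch. It implements the bag-level pooling z = sum_k a_k * h_k with attention weights a_k = softmax_k(w^T tanh(V h_k^T)) described in the paper; the class name, the instance-embedding size of 500, and the attention dimension of 128 are illustrative assumptions, not the authors' released implementation (the paper does not link one).

    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class AttentionMILPooling(nn.Module):
        """Attention-based MIL pooling: z = sum_k a_k * h_k, where the weights
        a_k = softmax_k( w^T tanh(V h_k^T) ) are learned from the data."""

        def __init__(self, embed_dim=500, attn_dim=128):
            super().__init__()
            self.V = nn.Linear(embed_dim, attn_dim, bias=False)  # V in the paper
            self.w = nn.Linear(attn_dim, 1, bias=False)          # w in the paper

        def forward(self, h):
            # h: (num_instances, embed_dim), embeddings of one bag's instances
            scores = self.w(torch.tanh(self.V(h)))     # (num_instances, 1)
            a = F.softmax(scores, dim=0)               # attention over instances
            z = torch.sum(a * h, dim=0, keepdim=True)  # (1, embed_dim) bag embedding
            return z, a

A bag-level classifier would feed z into a final sigmoid layer; the attention weights a can be inspected per instance, which is what the paper means by not sacrificing interpretability.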
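Similarly, the evaluation and optimization details quoted in the Dataset Splits and Experiment Setup rows translate into the short, hedged configuration sketch below: Glorot (Xavier) initialization with zero biases, Adam with default beta parameters, and 10-fold cross-validation with five repetitions. The learning rate is a placeholder; the paper's exact hyperparameters are given in Table 10 of its appendix and are not reproduced here.

    import torch
    import torch.nn as nn
    from sklearn.model_selection import RepeatedKFold

    def init_weights(module):
        # Glorot & Bengio (2010) initialization, biases set to zero
        if isinstance(module, (nn.Linear, nn.Conv2d)):
            nn.init.xavier_uniform_(module.weight)
            if module.bias is not None:
                nn.init.zeros_(module.bias)

    model = AttentionMILPooling()                      # sketch from above
    model.apply(init_weights)
    optimizer = torch.optim.Adam(model.parameters(),
                                 lr=1e-4,              # placeholder, not from the paper
                                 betas=(0.9, 0.999))   # Adam defaults, as in the paper

    # 10-fold cross-validation with five repetitions per experiment
    cv = RepeatedKFold(n_splits=10, n_repeats=5, random_state=0)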