Attention-based Deep Multiple Instance Learning

Authors: Maximilian Ilse, Jakub Tomczak, Max Welling

Venue: ICML 2018

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We show empirically that our approach achieves comparable performance to the best MIL methods on benchmark MIL datasets and it outperforms other methods on a MNIST-based MIL dataset and two real-life histopathology datasets without sacrificing interpretability.
Researcher Affiliation | Academia | University of Amsterdam, the Netherlands.
Pseudocode | No | The paper does not contain any figures, blocks, or sections explicitly labeled 'Pseudocode' or 'Algorithm'.
Open Source Code | No | The paper mentions using code for a comparison method ('We use code provided with (Doran & Ray, 2014): https://github.com/garydoranjr/misvm'), but it does not provide a statement or link indicating that the authors' own code has been released.
Open Datasets | Yes | We evaluate our approach on a number of different MIL datasets: five MIL benchmark datasets (MUSK1, MUSK2, FOX, TIGER, ELEPHANT), an MNIST-based image dataset (MNIST-BAGS) and two real-life histopathology datasets (BREAST CANCER, COLON CANCER). The MNIST dataset is well-known, and the histopathology datasets are cited with proper bibliographic information (Gelasca et al., 2008; Sirinukunwattana et al., 2016).
Dataset Splits | Yes | In order to obtain a fair comparison we use a common evaluation methodology, i.e., 10-fold-cross-validation, and five repetitions per experiment. If an attention-based MIL pooling layer is used the number of parameters in V was determined using a validation set.
Hardware Specification | No | The paper does not provide any specific hardware details such as GPU or CPU models, memory specifications, or cloud computing resources used for running the experiments.
Software Dependencies | No | The paper mentions the use of the Adam optimization algorithm and refers to third-party code for a comparison method, but it does not specify any software dependencies with version numbers (e.g., Python 3.x, PyTorch 1.x, TensorFlow 2.x).
Experiment Setup | Yes | In order to obtain a fair comparison we use a common evaluation methodology, i.e., 10-fold-cross-validation, and five repetitions per experiment. If an attention-based MIL pooling layer is used the number of parameters in V was determined using a validation set. We tested the following dimensions (L): 64, 128 and 256. Finally, all layers were initialized according to Glorot & Bengio (2010) and biases were set to zero. The models are trained with the Adam optimization algorithm (Kingma & Ba, 2014). We keep the default parameters for β1 and β2, see Table 10 in the Appendix. (Illustrative sketches of the attention-based pooling layer and this training setup follow the table.)
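
For readers who want to see what the attention-based MIL pooling referenced in the rows above looks like in code, the following is a minimal sketch in PyTorch. It implements the bag-level pooling z = sum_k a_k * h_k with attention weights a_k = softmax_k(w^T tanh(V h_k^T)) described in the paper; the class name, the instance-embedding size of 500, and the attention dimension of 128 are illustrative assumptions, not the authors' released implementation (the paper does not link one).

    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class AttentionMILPooling(nn.Module):
        """Attention-based MIL pooling: z = sum_k a_k * h_k, where the weights
        a_k = softmax_k( w^T tanh(V h_k^T) ) are learned from the data."""

        def __init__(self, embed_dim=500, attn_dim=128):
            super().__init__()
            self.V = nn.Linear(embed_dim, attn_dim, bias=False)  # V in the paper
            self.w = nn.Linear(attn_dim, 1, bias=False)          # w in the paper

        def forward(self, h):
            # h: (num_instances, embed_dim), embeddings of one bag's instances
            scores = self.w(torch.tanh(self.V(h)))     # (num_instances, 1)
            a = F.softmax(scores, dim=0)               # attention over instances
            z = torch.sum(a * h, dim=0, keepdim=True)  # (1, embed_dim) bag embedding
            return z, a

A bag-level classifier would feed z into a final sigmoid layer; the attention weights a can be inspected per instance, which is what the paper means by not sacrificing interpretability.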
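Similarly, the evaluation and optimization details quoted in the Dataset Splits and Experiment Setup rows translate into the short, hedged configuration sketch below: Glorot (Xavier) initialization with zero biases, Adam with default beta parameters, and 10-fold cross-validation with five repetitions. The learning rate is a placeholder; the paper's exact hyperparameters are given in Table 10 of its appendix and are not reproduced here.

    import torch
    import torch.nn as nn
    from sklearn.model_selection import RepeatedKFold

    def init_weights(module):
        # Glorot & Bengio (2010) initialization, biases set to zero
        if isinstance(module, (nn.Linear, nn.Conv2d)):
            nn.init.xavier_uniform_(module.weight)
            if module.bias is not None:
                nn.init.zeros_(module.bias)

    model = AttentionMILPooling()                      # sketch from above
    model.apply(init_weights)
    optimizer = torch.optim.Adam(model.parameters(),
                                 lr=1e-4,              # placeholder, not from the paper
                                 betas=(0.9, 0.999))   # Adam defaults, as in the paper

    # 10-fold cross-validation with five repetitions per experiment
    cv = RepeatedKFold(n_splits=10, n_repeats=5, random_state=0)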