Sparse and Structured Hopfield Networks
Authors: Saul José Rodrigues dos Santos, Vlad Niculae, Daniel C. McNamee, André Martins
ICML 2024 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experiments on multiple instance learning and text rationalization demonstrate the usefulness of our approach. ... Experiments on synthetic and real-world tasks (multiple instance learning and text rationalization) showcase the usefulness of our proposed models using various kinds of sparse and structured transformations (§5). |
| Researcher Affiliation | Collaboration | 1Instituto Superior Técnico, Universidade de Lisboa, Lisbon, Portugal 2Instituto de Telecomunicações, Lisbon, Portugal 3Language Technology Lab, University of Amsterdam, The Netherlands 4Champalimaud Research, Lisbon, Portugal 5Unbabel, Lisbon, Portugal. |
| Pseudocode | Yes | Algorithm 1: Compute α-normmax by bisection. (Appendix B) A bisection sketch follows the table. |
| Open Source Code | Yes | Our code is available on https://github.com/deep-spin/SSHN (Footnote 1) |
| Open Datasets | Yes | We next investigate how often our Hopfield networks converge to metastable states, a crucial aspect for understanding the network's dynamics. To elucidate this, we examine $\hat{y}_\Omega(\beta X q^{(t)})$ for the MNIST dataset, probing the number of nonzeros of these vectors. ... We run these models for K-MIL problems in the MNIST dataset (choosing 9 as target) and in three MIL benchmarks: Elephant, Fox, and Tiger (Ilse et al., 2018). ... The MIL benchmark datasets (Fox, Tiger and Elephant) comprise preprocessed and segmented color images sourced from the Corel dataset (Ilse et al., 2018). An update sketch that probes these nonzero counts follows the table. |
| Dataset Splits | Yes | We use 500 bags for testing and 500 bags for validation. (Appendix F.1) ... Model validation was conducted through a 10-fold nested cross-validation, repeated five times with different data splits, where the first seed is used for hyperparameter tuning. (Appendix F.2) A cross-validation sketch follows the table. |
| Hardware Specification | No | The paper does not explicitly describe any specific hardware used for running the experiments (e.g., GPU/CPU models, memory specifications, or cloud computing instances). |
| Software Dependencies | No | The paper mentions software components like 'Adam optimizer' and 'transformer attention' but does not provide specific version numbers for any of them, which is required for reproducibility. |
| Experiment Setup | Yes | We train the models for 5 different random seeds, where the first one is used for tuning the hyperparameters. ... The hyperparameters are tuned via grid search, where the grid space is shown in Table 5. ... All models were trained for 50 epochs. We incorporated an early-stopping mechanism, with patience 5, that selects the optimal checkpoint based on performance on the validation set. (Appendix F.1) ... We used a head dimension of 200, ... and a head dropout of 0.5 ... We used a single attention head ... Additionally we use a transition score of 0.001 and a train temperature of 0.1. (Appendix F.3) An early-stopping sketch follows the table. |
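The bisection pattern behind the paper's Algorithm 1 is easy to illustrate. The α-normmax update itself is not reproduced here; instead, a minimal NumPy sketch applies the same bracketing-and-halving scheme to sparsemax, where the threshold τ must satisfy Σᵢ max(zᵢ − τ, 0) = 1 (the function name and iteration count are our choices, not the authors'):

```python
import numpy as np

def sparsemax_bisect(z, n_iter=50):
    """Find tau with sum(max(z - tau, 0)) == 1 by bisection.

    Illustrates the bracketing-and-halving pattern of the paper's
    Algorithm 1 (alpha-normmax); the normmax fixed-point equation
    itself differs, but the search loop has the same shape.
    """
    z = np.asarray(z, dtype=float)
    lo, hi = z.max() - 1.0, z.max()   # mass(lo) >= 1, mass(hi) == 0
    for _ in range(n_iter):
        tau = (lo + hi) / 2.0
        mass = np.maximum(z - tau, 0.0).sum()
        if mass < 1.0:
            hi = tau   # tau too high (mass below 1): lower the bracket
        else:
            lo = tau   # tau too low (mass at least 1): raise the bracket
    return np.maximum(z - (lo + hi) / 2.0, 0.0)

print(sparsemax_bisect([2.0, 1.0, -0.5]))  # ~[1. 0. 0.], a sparse vector
```

Sparsemax also has an exact sorting-based algorithm, but bisection generalizes to transformations, like normmax, whose thresholds have no closed form.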
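The metastable-state probe quoted in the Open Datasets row counts the nonzeros of $\hat{y}_\Omega(\beta X q^{(t)})$. Below is a minimal sketch of one update in this family of Hopfield networks, with exact sparsemax standing in for the general $\hat{y}_\Omega$; the toy memories and function names are ours, not the authors':

```python
import numpy as np

def sparsemax(z):
    """Exact sparsemax via sorting (Martins & Astudillo, 2016)."""
    z_sorted = np.sort(z)[::-1]
    k = np.arange(1, len(z) + 1)
    support = 1 + k * z_sorted > np.cumsum(z_sorted)
    k_max = k[support][-1]
    tau = (np.cumsum(z_sorted)[k_max - 1] - 1) / k_max
    return np.maximum(z - tau, 0.0)

def hopfield_update(X, q, beta=1.0):
    """One update q <- X^T p with p = sparsemax(beta * X q).

    The nonzero count of p is the probe used in the paper: a single
    nonzero means one memory is retrieved exactly, while several
    nonzeros indicate a metastable blend of memories.
    """
    p = sparsemax(beta * (X @ q))
    return X.T @ p, np.count_nonzero(p)

rng = np.random.default_rng(0)
X = rng.standard_normal((3, 8))        # three memory patterns (rows)
q, nnz = hopfield_update(X, X[0], beta=4.0)
print(nnz)                             # high beta: sharp retrieval, few nonzeros
```

Lowering β, or swapping sparsemax for softmax, spreads the weight over several memories, which is exactly how metastable states show up in this probe.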
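One plausible reading of the quoted evaluation protocol (10-fold nested cross-validation, five repetitions with different splits, hyperparameters tuned only on the first seed), expressed in scikit-learn terms; the SVC classifier, its grid, and the random data are placeholders, not the paper's Hopfield models:

```python
import numpy as np
from sklearn.model_selection import GridSearchCV, KFold, cross_val_score
from sklearn.svm import SVC

X, y = np.random.randn(200, 10), np.random.randint(0, 2, 200)  # stand-in data

best_params, scores = None, []
for seed in range(5):                        # five repetitions, new splits
    outer = KFold(n_splits=10, shuffle=True, random_state=seed)
    if seed == 0:                            # tune only on the first seed
        inner = KFold(n_splits=10, shuffle=True, random_state=seed)
        search = GridSearchCV(SVC(), {"C": [0.1, 1.0, 10.0]}, cv=inner)
        best_params = search.fit(X, y).best_params_
    model = SVC(**best_params)
    scores.append(cross_val_score(model, X, y, cv=outer).mean())

print(f"accuracy: {np.mean(scores):.3f} +/- {np.std(scores):.3f}")
```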
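The training recipe from Appendix F.1 (50 epochs, early stopping with patience 5, optimal checkpoint selected on the validation set) maps onto a standard loop. In this sketch, `train_one_epoch` and `validate` are hypothetical callables standing in for the paper's training code, and the `state_dict` interface assumes a PyTorch-style model:

```python
import copy

def train_with_early_stopping(model, train_one_epoch, validate,
                              max_epochs=50, patience=5):
    """Train up to max_epochs, keep the best-validation checkpoint,
    and stop once `patience` epochs pass without improvement."""
    best_score, best_state, stale = float("-inf"), None, 0
    for _ in range(max_epochs):
        train_one_epoch(model)
        score = validate(model)              # e.g. validation accuracy
        if score > best_score:
            best_score, stale = score, 0
            best_state = copy.deepcopy(model.state_dict())
        else:
            stale += 1
            if stale >= patience:            # patience exhausted: stop early
                break
    model.load_state_dict(best_state)        # restore the optimal checkpoint
    return model, best_score
```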