Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026).
Saliency-based Sequential Image Attention with Multiset Prediction
Authors: Sean Welleck, Jialin Mao, Kyunghyun Cho, Zheng Zhang
NeurIPS 2017 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We validate the classification performance, training process, and hierarchical attention with set-based and multiset-based classification experiments. To test the effectiveness of the permutation-invariant RL training, we compare against a baseline model that uses a cross-entropy loss on the probabilities $p_{t,i}$ and (randomly ordered) labels $y_i$ instead of the RL training, similar to training proposed in [42]. Datasets: Two synthetic datasets, MNIST Set and MNIST Multiset, as well as the real-world SVHN dataset, are used. Each dataset is split into 60,000 training examples and 10,000 testing examples, and metrics are reported for the testing set. |
| Researcher Affiliation | Academia | Sean Welleck New York University EMAIL Jialin Mao New York University EMAIL Kyunghyun Cho New York University kyunghyun.nyu.edu Zheng Zhang New York University EMAIL |
| Pseudocode | No | The paper describes the architecture and processes in prose and with diagrams (Figure 1), but it does not include any structured pseudocode or algorithm blocks. |
| Open Source Code | No | The paper does not provide any specific repository links or explicit statements about the release of source code for the described methodology. |
| Open Datasets | Yes | Two synthetic datasets, MNIST Set and MNIST Multiset, as well as the real-world SVHN dataset, are used. |
| Dataset Splits | Yes | For MNIST Set and Multiset, each 100x100 image in the dataset has a variable number (1-4) of digits, of varying sizes (20-50px) and positions, along with cluttering objects that introduce noise. Each dataset is split into 60,000 training examples and 10,000 testing examples, and metrics are reported for the testing set. |
| Hardware Specification | No | The paper does not explicitly describe the specific hardware (e.g., GPU models, CPU types, or cloud instance specifications) used for running its experiments. |
| Software Dependencies | No | The paper mentions using a 'ResNet-34 network pre-trained on ImageNet' but does not specify software dependencies with version numbers (e.g., Python, PyTorch, TensorFlow versions). |
| Experiment Setup | Yes | Images are resized to 224x224, and the final (4th) convolutional layer is used ($V \in \mathbb{R}^{512 \times 7 \times 7}$). Since the label sets vary in size, the model is trained with an extra 'stop' class, and during inference greedy argmax sampling is used until the 'stop' class is predicted. |
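
The inference procedure quoted under Experiment Setup (greedy argmax until the 'stop' class is predicted) can be illustrated with a minimal sketch. This is not the authors' code: the `STOP` index, the `max_steps` safety cap, and the toy `make_toy_step_fn` stand-in for the trained model are all assumptions introduced here for illustration.

```python
from typing import Callable, List, Sequence

# Assumption: digit classes occupy indices 0..9 and the extra 'stop' class is index 10.
STOP = 10


def greedy_decode(
    step_fn: Callable[[List[int]], Sequence[float]],
    max_steps: int = 10,  # hypothetical safety cap, not taken from the paper
) -> List[int]:
    """Collect argmax predictions until the 'stop' class is predicted."""
    labels: List[int] = []
    for _ in range(max_steps):
        probs = step_fn(labels)  # class probabilities for the next prediction
        pred = max(range(len(probs)), key=probs.__getitem__)  # greedy argmax
        if pred == STOP:
            break
        labels.append(pred)
    return labels


def make_toy_step_fn(targets: List[int], num_classes: int = 11):
    """Hypothetical stand-in for a trained model: emits targets in order, then 'stop'."""
    def step_fn(prefix: List[int]) -> List[float]:
        nxt = targets[len(prefix)] if len(prefix) < len(targets) else STOP
        probs = [0.01] * num_classes
        probs[nxt] = 0.9
        return probs
    return step_fn


if __name__ == "__main__":
    # A multiset with a repeated label: the decoder may emit the same class twice.
    print(greedy_decode(make_toy_step_fn([3, 3, 7])))
```

Because the output is a plain list, repeated labels are preserved, which is what a multiset prediction requires; the cap only guards against a model that never predicts 'stop'.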