Trainable Decoding of Sets of Sequences for Neural Sequence Models
Authors: Ashwin Kalyan, Peter Anderson, Stefan Lee, Dhruv Batra
ICML 2019
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Finally, we show results on the image captioning task and find that our model outperforms standard techniques and natural ablations. |
| Researcher Affiliation | Collaboration | ¹School of Interactive Computing, Georgia Tech, Atlanta, GA, USA; ²Facebook AI Research, Menlo Park, CA, USA. |
| Pseudocode | Yes | Algorithm 1 Sequential Subset Selection |
| Open Source Code | Yes | ¹pronounced diff-BS, code available at https://github.com/ashwinkalyan/diff-bs |
| Open Datasets | Yes | Datasets and Models. We show results on three captioning datasets of increasing size Flickr8k, Flickr30k (Young et al., 2014) and the large scale COCO dataset (Lin et al., 2014). |
| Dataset Splits | Yes | For the first two Flickr datasets, 1000 images each are used for validation and testing while using the rest (6000 and 28000 respectively) for training. For COCO, a similar split is used but the number of images used for validation and testing each is 5000. |
| Hardware Specification | No | The paper mentions training models but does not specify any hardware details such as GPU/CPU models, memory, or cloud computing instances used for experiments. |
| Software Dependencies | No | The paper mentions training with Adam and using an LSTM, but it does not provide specific version numbers for any software dependencies like Python, PyTorch, or CUDA. |
| Experiment Setup | Yes | Both the DSF and the LSTM (in the case of EE) are trained using Adam (Kingma & Ba, 2014) with a learning rate of 1e-4 and 1e-5 respectively. We set the beam size K = 5 in all our experiments. As mentioned in Section 2, we first do a coarse selection using a standard sequence model; inputting only the top-100 alternatives corresponding to each partial solution to the DSF. |
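
The coarse selection step in the Experiment Setup row can be illustrated with a minimal sketch. It is not taken from the authors' repository. The function name `coarse_candidates`, the data layout (token lists paired with cumulative log-probabilities), and the dictionary-based model output are all hypothetical choices made for illustration. Only the constants (K = 5, top-100, and the two Adam learning rates) come from the table above. The DSF-based selection that consumes these candidates is not shown.

```python
import heapq

# Values quoted in the Experiment Setup row.
BEAM_SIZE = 5                                  # K = 5
TOP_N = 100                                    # top-100 alternatives per partial solution
LEARNING_RATES = {"dsf": 1e-4, "lstm": 1e-5}   # Adam learning rates

def coarse_candidates(beams, next_token_logprobs, top_n=TOP_N):
    """Keep the top_n highest-scoring one-token extensions of each partial sequence.

    beams: list of (tokens, cumulative_logprob) pairs, one per partial solution.
    next_token_logprobs: list of dicts mapping token -> log-probability, aligned
        with beams (a hypothetical stand-in for the sequence model's output).
    Returns a list of (tokens, score) pairs with at most len(beams) * top_n entries.
    """
    candidates = []
    for (tokens, score), logprobs in zip(beams, next_token_logprobs):
        # nlargest returns the top_n (token, logprob) pairs, best first.
        best = heapq.nlargest(top_n, logprobs.items(), key=lambda kv: kv[1])
        for token, logprob in best:
            candidates.append((tokens + [token], score + logprob))
    return candidates

# Toy usage: 5 partial sequences and a 1000-token vocabulary.
toy_beams = [([i], 0.0) for i in range(BEAM_SIZE)]
toy_logprobs = [{t: -float(t) for t in range(1000)} for _ in range(BEAM_SIZE)]
toy_candidates = coarse_candidates(toy_beams, toy_logprobs)
```

With K = 5 and top-100 filtering, the selection stage sees at most 500 candidates per time step under this sketch, regardless of vocabulary size.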