Loss Functions for Multiset Prediction

Authors: Sean Welleck, Zixin Yao, Yu Gai, Jialin Mao, Zheng Zhang, Kyunghyun Cho

NeurIPS 2018

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | The proposed multiset loss function is empirically evaluated on two families of datasets, one synthetic and one real, with varying levels of difficulty, against baseline loss functions including reinforcement learning, sequence, and aggregated distribution matching losses. The experiments show that the proposed loss outperforms the baselines (a per-step loss sketch follows the table).
Researcher Affiliation | Academia | Sean Welleck (1,2), Zixin Yao (1), Yu Gai (1), Jialin Mao (1), Zheng Zhang (1), Kyunghyun Cho (2,3); 1: New York University Shanghai, 2: New York University, 3: CIFAR Azrieli Global Scholar. {wellecks,zixin.yao,yg1246,jialin.mao,zz,kyunghyun.cho}@nyu.edu
Pseudocode | No | The paper describes the proposed method and its algorithms in prose but does not include a dedicated pseudocode or algorithm block.
Open Source Code | No | The paper contains no explicit statement or link indicating that source code for the described method is publicly available.
Open Datasets | Yes | MNIST Multi: We vary the size of each digit and also add clutter. In the experiments, we consider the following variants of MNIST Multi: MNIST Multi (4): |Y| = 4, 20-50 px digits; MNIST Multi (1-4): |Y| ∈ {1, ..., 4}, 20-50 px digits; MNIST Multi (10): |Y| = 10, 20 px digits. Each dataset has a training set with 70,000 examples and a test set with 10,000 examples. We randomly sample 7,000 examples from the training set to use as a validation set, and train with the remaining 63,000 examples. MS COCO: As a real-world dataset, we use Microsoft COCO [13], which includes natural images with multiple objects. (A toy generator sketch follows the table.)
Dataset Splits | Yes | Each dataset has a training set of 70,000 examples and a test set of 10,000 examples. We randomly sample 7,000 examples from the training set as a validation set and train on the remaining 63,000 examples. For each variant, we hold out a randomly sampled 15% of the training examples as a validation set. (A split sketch follows the table.)
Hardware Specification | No | The paper describes the network architectures (e.g., convolutional layers, an LSTM, ResNet-34) and general training procedures but does not specify hardware such as CPU/GPU models or memory sizes used for the experiments.
Software Dependencies | No | The paper mentions ResNet-34 and convolutional LSTM layers, which implies a deep learning framework, but it does not name any software with version numbers (e.g., TensorFlow, PyTorch, or specific Python library versions).
Experiment Setup | No | The paper states that models were trained for 200 epochs (350 for MNIST Multi 10) and mentions a termination-policy approach and greedy decoding for evaluation, but it does not give concrete hyperparameters such as learning rate, batch size, or optimizer settings. (A decoding-loop sketch follows the table.)
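To make the loss comparison above concrete, here is a minimal sketch of the per-step term of the paper's multiset loss as I read it: cross-entropy against an oracle distribution that is uniform over the items still remaining in the target multiset (counts carry multiplicity), which equals the KL divergence from the oracle to the model up to a constant. The function name, tensor shapes, and PyTorch interface are my assumptions, not the authors' code.

```python
import torch
import torch.nn.functional as F

def multiset_loss_step(logits, remaining_counts):
    """Per-step multiset loss sketch (assumed interface).

    logits: (vocab,) unnormalized scores from the model at this step.
    remaining_counts: (vocab,) counts of each class still left in the
        target multiset; the oracle is proportional to these counts.
    """
    oracle = remaining_counts.float()
    oracle = oracle / oracle.sum()          # uniform over remaining items
    log_probs = F.log_softmax(logits, dim=-1)
    # Equals KL(oracle || model) up to the constant oracle entropy.
    return -(oracle * log_probs).sum()

# Toy usage: target multiset {0, 0, 3} over a 5-class vocabulary.
remaining = torch.tensor([2, 0, 0, 1, 0])
loss = multiset_loss_step(torch.randn(5), remaining)
```

During training the predicted (or oracle-sampled) item would be removed from `remaining_counts` after each step and the per-step terms summed; that outer loop is omitted here.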
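The MNIST Multi variants are built by compositing scaled MNIST digits onto a single image. Below is a hypothetical generator in that spirit; the 100x100 canvas, nearest-neighbour resize, and max-compositing paste are my assumptions (the paper's generation code is not available), and the clutter mentioned in the paper is omitted.

```python
import numpy as np

def make_mnist_multi(images, labels, n_items=4, canvas=100,
                     min_px=20, max_px=50, rng=None):
    """Toy MNIST Multi-style generator (assumed parameters).

    images: (N, 28, 28) array of MNIST digits; labels: (N,) classes.
    Returns a canvas x canvas image and the target multiset of classes.
    """
    rng = rng or np.random.default_rng()
    out = np.zeros((canvas, canvas), dtype=np.float32)
    target = []  # the target multiset of digit classes (order-free)
    for _ in range(n_items):
        i = int(rng.integers(len(images)))
        size = int(rng.integers(min_px, max_px + 1))
        # Nearest-neighbour resize of the 28x28 digit to size x size.
        idx = np.arange(size) * 28 // size
        digit = images[i][np.ix_(idx, idx)]
        r = int(rng.integers(0, canvas - size + 1))
        c = int(rng.integers(0, canvas - size + 1))
        out[r:r+size, c:c+size] = np.maximum(out[r:r+size, c:c+size], digit)
        target.append(int(labels[i]))
    return out, target
```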
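The reported MNIST Multi split (63,000 train / 7,000 validation / 10,000 test) is mechanical to reproduce; this sketch assumes an index-based split with a fixed seed of my choosing, since the paper does not specify one.

```python
import numpy as np

def train_val_split(n_train=70_000, n_val=7_000, seed=0):
    """Sample n_val validation indices from n_train training examples,
    matching the split sizes quoted above. Seed is an assumption."""
    rng = np.random.default_rng(seed)
    perm = rng.permutation(n_train)
    return perm[n_val:], perm[:n_val]   # train indices, val indices

train_idx, val_idx = train_val_split()
assert len(train_idx) == 63_000 and len(val_idx) == 7_000
```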
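Evaluation in the paper uses greedy decoding with a termination policy. The sketch below shows only the shape of such a decoding loop; `model.step`, `model.feedback`, and `stop_id` are hypothetical placeholders for whatever interface the actual architecture exposes, not the authors' API.

```python
import torch

@torch.no_grad()
def greedy_decode(model, state, max_steps=10, stop_id=None):
    """Greedy multiset decoding sketch (assumed model interface):
    take the argmax class at each step, feed it back, and stop when
    the termination class stop_id is emitted or max_steps is hit."""
    preds = []
    for _ in range(max_steps):
        logits, state = model.step(state)       # hypothetical call
        y = int(logits.argmax(dim=-1))
        if stop_id is not None and y == stop_id:
            break
        preds.append(y)
        state = model.feedback(state, y)        # hypothetical call
    return preds  # read as a multiset: order does not matter
```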