Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].
Structured Energy Network As a Loss
Authors: Jay Yoon Lee, Dhruvesh Patel, Purujit Goyal, Wenlong Zhao, Zhiyang Xu, Andrew McCallum
NeurIPS 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Through extensive evaluation on multi-label classification, semantic role labeling, and image segmentation, we demonstrate that SEAL provides various useful design choices, is faster at inference than GBI, and leads to significant performance gains over the baselines. |
| Researcher Affiliation | Academia | Graduate School of Data Science, Seoul National University; Manning College of Information & Computer Sciences, University of Massachusetts Amherst; Department of Computer Science, Virginia Tech |
| Pseudocode | Yes | Algorithm 1: SEAL-dynamic Algorithm |
| Open Source Code | Yes | The code we used to train and evaluate our models is available at https://github.com/iesl/seal-neurips-2022 |
| Open Datasets | Yes | We utilize 7 feature-based MLC datasets... including Bibtex and Delicious... We experiment with the Arxiv Academic Paper Dataset (AAPD) (Yang et al., 2018)... Semantic role labeling (SRL) (Palmer et al., 2010) using standard benchmark (CoNLL-12) (Pradhan et al., 2013)... We evaluate SEAL on binary image segmentation using the Weizmann Horse Image dataset (Borenstein & Ullman, 2004). |
| Dataset Splits | Yes | Did you specify all the training details (e.g., data splits, hyperparameters, how they were chosen)? [Yes] The details are in Appendix. The splits are 80/10/10 for train/dev/test sets. |
| Hardware Specification | Yes | All experiments were performed on a cluster equipped with NVIDIA RTX 2080 Ti GPUs. |
| Software Dependencies | No | The paper mentions software components such as the Adam optimizer and Weights and Biases, but does not specify version numbers for these or for other key libraries or programming languages used in the experiments. |
| Experiment Setup | Yes | We defer specific training details such as hyperparameters, gpu environment, and number of random seed runs to the Appendix C.1. For all experiments, we used a batch size of 32 for the task-net and 64 for the loss-net unless otherwise stated. We use Adam optimizer with a learning rate of 1e-4 for the task-net and 1e-5 for the loss-net. |
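The experiment-setup row above reports two Adam optimizers with different learning rates (1e-4 for the task-net, 1e-5 for the loss-net). As a minimal illustration of that configuration, the sketch below implements the standard scalar Adam update rule and instantiates it with the paper's reported learning rates; the `Adam` class and single-parameter treatment are illustrative assumptions, not the authors' actual implementation (which presumably uses a deep-learning framework's built-in optimizer).

```python
import math

class Adam:
    """Scalar Adam optimizer (Kingma & Ba, 2015) for illustration."""

    def __init__(self, lr=1e-4, beta1=0.9, beta2=0.999, eps=1e-8):
        self.lr, self.b1, self.b2, self.eps = lr, beta1, beta2, eps
        self.m = 0.0  # first-moment (mean) estimate
        self.v = 0.0  # second-moment (uncentered variance) estimate
        self.t = 0    # step counter for bias correction

    def step(self, param, grad):
        """Return the updated parameter after one Adam step."""
        self.t += 1
        self.m = self.b1 * self.m + (1 - self.b1) * grad
        self.v = self.b2 * self.v + (1 - self.b2) * grad * grad
        m_hat = self.m / (1 - self.b1 ** self.t)  # bias-corrected mean
        v_hat = self.v / (1 - self.b2 ** self.t)  # bias-corrected variance
        return param - self.lr * m_hat / (math.sqrt(v_hat) + self.eps)

# Learning rates as reported in the paper's experiment setup:
task_opt = Adam(lr=1e-4)  # task-net optimizer
loss_opt = Adam(lr=1e-5)  # loss-net (energy network) optimizer
```

Note that on the first step Adam's bias-corrected update reduces to roughly `lr * sign(grad)`, so the task-net parameter moves about ten times farther per step than the loss-net parameter, consistent with the 1e-4 vs. 1e-5 choice.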