Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026).
Guaranteed Generation from Large Language Models
Authors: Minbeom Kim, Thibaut Thonet, Jos Rozen, Hwaran Lee, Kyomin Jung, Marc Dymetman
ICLR 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | To validate these theoretical concepts, we conduct extensive experiments on two text generation settings with hard-to-satisfy constraints: a lexical constraint scenario and a sentiment reversal scenario. These experiments show that GUARD achieves perfect constraint satisfaction while almost preserving the ideal distribution with highly improved inference efficiency. |
| Researcher Affiliation | Collaboration | (1) Seoul National University; (2) NAVER Labs Europe; (3) NAVER AI Lab; (4) Sogang University; (5) Independent Researcher |
| Pseudocode | Yes | Algorithm 1 GUARD sampler |
| Open Source Code | Yes | The code is available at https://github.com/naver/guard. |
| Open Datasets | Yes | We then selected our set of openings X by collecting negative story openings from the ROCStories test set (Mostafazadeh et al., 2016) |
| Dataset Splits | No | The paper uses the ROCStories test set as a source for story openings, but it does not specify explicit training/test/validation splits for its own experimental methodology beyond referring to pre-existing dataset divisions. |
| Hardware Specification | No | The paper does not provide specific details regarding the hardware used to run its experiments, such as GPU/CPU models or memory specifications. |
| Software Dependencies | No | The paper mentions the 'disco toolkit' but does not provide specific version numbers for any software dependencies or libraries used in the experiments. |
| Experiment Setup | Yes | Detailed hyperparameters are provided in Table 3, and the list of constraint-aware prompts used in the experiments can be found in Table 4. |