Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026).
On Conditional and Compositional Language Model Differentiable Prompting
Authors: Jonathan Pilault, Can Liu, Mohit Bansal, Markus Dreyer
IJCAI 2023 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We present extensive empirical and theoretical analysis and show that PROPS consistently surpasses other PLM adaptation techniques, and often improves upon fully fine-tuned models, on compositional generalization tasks, controllable summarization and multilingual translation, while needing fewer trainable parameters. |
| Researcher Affiliation | Collaboration | Jonathan Pilault1, Can Liu2, Mohit Bansal3, Markus Dreyer2; 1Mila Québec AI Institute, Polytechnique Montréal; 2Amazon Alexa; 3University of North Carolina at Chapel Hill |
| Pseudocode | Yes | As summarized in Algorithm 1, each condition sequence $S_C = \{c_t \mid t \in \{1, \ldots, T_C\}\} \in \mathcal{C}$, where $T_C$ is the sequence length of condition $C \in \mathcal{C}$ and $\mathcal{C}$ is the set of conditions, is first encoded by a Condition Encoder $f(\cdot)$. |
| Open Source Code | No | The code and datasets will be made publicly available. |
| Open Datasets | Yes | We study four Conditional Natural Language Generation (CNLG) datasets SCAN [Lake and Baroni, 2018], Europarl [Koehn, 2005], XSum [Narayan et al., 2018] and Topic-CNN-DM [Mrini et al., 2021a]. |
| Dataset Splits | No | Each language pair direction has 1M training and 100K testing examples. |
| Hardware Specification | No | The paper does not provide specific hardware details such as GPU/CPU models, processor types, or memory amounts used for running experiments. |
| Software Dependencies | No | The paper does not provide specific software dependency details with version numbers (e.g., Python, PyTorch, or CUDA versions) needed to replicate the experiment. |
| Experiment Setup | Yes | We describe our datasets, training and evaluation setup in Appendix E. |