reproducibilityindex.ai

Genetic Prompt Search via Exploiting Language Model Probabilities

Authors: Jiangjiang Zhao, Zhuoran Wang, Fangchun Yang

IJCAI 2023

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	Experimental results on diverse benchmark datasets show that the proposed precondition-free method significantly outperforms the existing DFO-style counterparts that require preconditions, including black-box tuning, genetic prompt search and gradient-free instructional prompt search.
Researcher Affiliation	Collaboration	Jiangjiang Zhao (1, 2), Zhuoran Wang (3), Fangchun Yang (1). (1) Beijing University of Posts and Telecommunications, P.R. China; (2) China Mobile Online Services Co., Ltd., Beijing, P.R. China; (3) Clouchie Limited, London, United Kingdom
Pseudocode	Yes	Algorithm 1 gives the pseudo-code of the proposed GAP3, where hyperparameters and constant objects are denoted in italic type.
Open Source Code	Yes	Code and supplementary material available at: https://github.com/zjjhit/gap3
Open Datasets	Yes	The datasets used in the main experiments consist of 7 benchmark NLP tasks, which are the same as in [Sun et al., 2022b], including Yelp polarity, AG's News and DBPedia from [Zhang et al., 2015], SST-2, MRPC and RTE from the GLUE benchmarks [Wang et al., 2018], as well as SNLI [Bowman et al., 2015].
Dataset Splits	No	The paper describes the creation of k-shot training sets and the use of original test sets or development sets as test sets, but does not explicitly define a separate validation set for the main model training.
Hardware Specification	No	The paper mentions 'computing power' in the acknowledgements but does not provide specific details about the hardware (e.g., GPU/CPU models, memory) used for the experiments.
Software Dependencies	No	The paper mentions the use of various pretrained language models and optimizers but does not specify software dependencies with version numbers (e.g., Python, PyTorch, TensorFlow versions).
Experiment Setup	Yes	We set GAP3's population size N = 64 and iteration number M = 50, with crossover and mutation probabilities ρc = 0.5 and ρm = 0.75, respectively. For PT, with learning rate 5e-4 and batch size 16, it runs for 1000 epochs. For full-model FT, with the same batch size, but learning rate 1e-5, we run it for 200 epochs.
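
For readers who want to reuse the settings quoted in the Experiment Setup row, the sketch below records them as Python configuration objects and runs them through a generic genetic loop on a toy bit-string problem. Only the numeric values come from the row above. The names GeneticSearchConfig, TuningConfig, evolve, toy_mutate and toy_crossover are hypothetical, and the loop is a textbook genetic algorithm with elitism, assumed here for illustration only. It is not the paper's Algorithm 1 and does not use language model probabilities.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class GeneticSearchConfig:
    """Genetic-search settings quoted in the report (class name is hypothetical)."""
    population_size: int = 64    # N
    iterations: int = 50         # M
    crossover_prob: float = 0.5  # rho_c
    mutation_prob: float = 0.75  # rho_m


@dataclass(frozen=True)
class TuningConfig:
    """Baseline training settings quoted in the report (class name is hypothetical)."""
    learning_rate: float
    batch_size: int
    epochs: int


PT = TuningConfig(learning_rate=5e-4, batch_size=16, epochs=1000)
FT = TuningConfig(learning_rate=1e-5, batch_size=16, epochs=200)


def evolve(population, fitness, mutate, crossover, cfg, rng):
    """Generic genetic loop with elitism; an illustration, not the paper's method."""
    for _ in range(cfg.iterations):
        ranked = sorted(population, key=fitness, reverse=True)
        parents = ranked[: max(2, len(ranked) // 2)]  # keep the fitter half as parents
        children = []
        while len(children) < cfg.population_size - 1:
            child = rng.choice(parents)
            if rng.random() < cfg.crossover_prob:
                child = crossover(child, rng.choice(parents), rng)
            if rng.random() < cfg.mutation_prob:
                child = mutate(child, rng)
            children.append(child)
        # Elitism: the best individual always survives, so fitness never decreases.
        population = [ranked[0]] + children
    return max(population, key=fitness)


def toy_mutate(bits, rng):
    """Flip one randomly chosen bit (toy operator, an assumption)."""
    i = rng.randrange(len(bits))
    return bits[:i] + [1 - bits[i]] + bits[i + 1:]


def toy_crossover(a, b, rng):
    """Single-point crossover of two equal-length bit lists (toy operator)."""
    cut = rng.randrange(1, len(a))
    return a[:cut] + b[cut:]


rng = random.Random(0)
cfg = GeneticSearchConfig()
initial = [[rng.randint(0, 1) for _ in range(12)] for _ in range(cfg.population_size)]
best = evolve(initial, sum, toy_mutate, toy_crossover, cfg, rng)
print(cfg, PT, FT, best, sep="\n")
```

Keeping the values in frozen dataclasses makes a run's settings explicit and hard to change by accident, which helps when comparing a reproduction attempt against the numbers reported here.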