Understanding prompt engineering may not require rethinking generalization
Authors: Victor Akinwande, Yiding Jiang, Dylan Sam, J Zico Kolter
ICLR 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We demonstrate empirically that this holds for existing handcrafted prompts and prompts generated through simple greedy search. ... 5 EXPERIMENTS In this section, we evaluate the generalization of discrete prompts generated by Greedy on CIFAR-10, CIFAR-100, ImageNet as well as domain generalization datasets fMoW (Christie et al., 2018) and OfficeHome (Venkateswara et al., 2017), which is much less studied in the context of numerical generalization bounds. |
| Researcher Affiliation | Collaboration | Victor Akinwande¹, Yiding Jiang¹, Dylan Sam¹ & J. Zico Kolter¹,² (¹Carnegie Mellon University, ²Bosch Center for AI) |
| Pseudocode | Yes | A PSEUDOCODE Algorithm 1 Sequential Prompt Search |
| Open Source Code | No | The paper does not contain an explicit statement or link indicating that the source code for the methodology described is openly available. |
| Open Datasets | Yes | In this section, we evaluate the generalization of discrete prompts generated by Greedy on CIFAR-10, CIFAR-100, ImageNet as well as domain generalization datasets fMoW (Christie et al., 2018) and OfficeHome (Venkateswara et al., 2017) |
| Dataset Splits | No | The paper describes using a 'split portion of the dataset s ∈ {0.1, ..., 1.0}' for its experiments and mentions training and testing data, but it does not define a separate validation split, and it gives no percentages or counts that would allow the splits to be reproduced. |
| Hardware Specification | No | The paper does not provide specific hardware details such as GPU models, CPU types, or cloud computing instance specifications used for running its experiments. |
| Software Dependencies | No | The paper mentions software components like CLIP and LLaMA-7B (Touvron et al., 2023), but it does not provide specific version numbers for multiple key software libraries, frameworks, or programming languages used to run the experiments. |
| Experiment Setup | Yes | C EXPERIMENTAL DETAILS Hyperparameters We report the hyperparameters used in CLIP, LLaMA-7B, and the Greedy algorithm in Table 4. Table 4: Hyperparameters used in CLIP, LLaMA-7B and Greedy. Batch size: 100; CLIP vocabulary size: 49,408; LLaMA-7B vocabulary size: 32,000; Temperature: 1.0; Bound δ: 0.01; SRM β: 1.0 |
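
For anyone attempting a reproduction, the Table 4 values quoted above can be collected into a single configuration object. The sketch below is illustrative only: the class name, field names, and the helper `finite_class_bound` are hypothetical and do not come from the paper. The helper implements the standard textbook bound for a finite hypothesis class (Hoeffding plus a union bound over all `vocab_size ** prompt_length` discrete prompts), which is one plausible way the vocabulary size and δ could enter a numerical bound; it is not necessarily the exact bound the authors evaluate.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Table4Hyperparameters:
    """Values as quoted from Table 4; field names are hypothetical."""
    batch_size: int = 100
    clip_vocab_size: int = 49_408
    llama_vocab_size: int = 32_000
    temperature: float = 1.0
    bound_delta: float = 0.01   # confidence parameter delta of the bound
    srm_beta: float = 1.0       # SRM weighting term beta


def finite_class_bound(train_error: float, n_samples: int,
                       prompt_length: int, vocab_size: int,
                       delta: float) -> float:
    """Textbook finite-hypothesis-class bound (an assumption, not the
    paper's stated bound): with probability at least 1 - delta,
    test_error <= train_error + sqrt((log|H| + log(1/delta)) / (2n)),
    where |H| = vocab_size ** prompt_length."""
    log_num_hypotheses = prompt_length * math.log(vocab_size)
    complexity = math.sqrt(
        (log_num_hypotheses + math.log(1.0 / delta)) / (2.0 * n_samples)
    )
    return train_error + complexity


hp = Table4Hyperparameters()
# Illustrative inputs only: 10% training error, 50,000 samples, and a
# 10-token prompt are made-up numbers, not results from the paper.
example_bound = finite_class_bound(
    train_error=0.10, n_samples=50_000, prompt_length=10,
    vocab_size=hp.clip_vocab_size, delta=hp.bound_delta,
)
print(f"illustrative bound: {example_bound:.4f}")
```

The complexity term grows only with the logarithm of the vocabulary size and linearly in prompt length, which is why short discrete prompts can yield non-vacuous bounds under this kind of argument.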