Capturing Failures of Large Language Models via Human Cognitive Biases
Authors: Erik Jones, Jacob Steinhardt
NeurIPS 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Specifically, we use cognitive biases as motivation to (i) generate hypotheses for problems that models may have, and (ii) develop experiments that elicit these problems. Using code generation as a case study, we find that OpenAI's Codex errs predictably based on how the input prompt is framed, adjusts outputs towards anchors, and is biased towards outputs that mimic frequent training examples. We then use our framework to elicit high-impact errors such as incorrectly deleting files. Our results indicate that experimental methodology from cognitive science can help characterize how machine learning systems behave. |
| Researcher Affiliation | Academia | Erik Jones (UC Berkeley, erjones@berkeley.edu); Jacob Steinhardt (UC Berkeley, jsteinhardt@berkeley.edu) |
| Pseudocode | No | The paper contains figures and code examples, but no explicitly labeled pseudocode or algorithm blocks. |
| Open Source Code | Yes | Code for this paper is available at https://github.com/ejones313/codex-cog-biases. |
| Open Datasets | Yes | We use the HumanEval benchmark as a diverse source of normal prompts [Chen et al., 2021]. |
| Dataset Splits | No | The paper uses pre-trained models and evaluates them on prompts from benchmarks like HumanEval. While HumanEval provides test cases for verifying the correctness of generated code, the paper does not specify how the prompts themselves are split into training, validation, or test sets for its own experimental methodology. |
| Hardware Specification | No | The paper states that for Codex they use the OpenAI API, and for CodeGen they 'run inference locally', but it does not specify any details about the hardware (e.g., CPU/GPU models, memory) used for this local inference or for querying the API; a hedged local-inference sketch follows the table. |
| Software Dependencies | No | The paper mentions the use of OpenAI's Codex (davinci-001) and Salesforce's CodeGen, but it does not provide specific version numbers for any ancillary software dependencies or libraries used to run the experiments locally. |
| Experiment Setup | Yes | We use the OpenAI API to query the davinci-001 version of Codex, and use greedy decoding to generate solutions. We test five framing lines: `raise NotImplementedError`, `pass`, `assert False`, `return False`, and `print("Hello world!")`. We consider all 12 combinations of the three binary operations (sum, difference, product) with the four unary operations (square, cube, quadruple, square root). A hedged sketch of this setup follows the table. |
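
The experiment-setup row can be made concrete with a short sketch. This is a minimal reconstruction, not the authors' code: the model identifier `code-davinci-001` (Codex models have since been deprecated by OpenAI), the legacy pre-1.0 `openai.Completion` interface, the prompt template, and the helper names `make_prompt`/`complete` are all assumptions. Only the five framing lines, the 3 x 4 operation grid, and greedy decoding (`temperature=0`) come from the paper.

```python
# Hedged sketch of the framing experiment, assuming the legacy (<1.0)
# openai-python Completion API. Model name, prompt template, and helper
# names are illustrative, not taken from the paper's released code.
import itertools
import openai

openai.api_key = "YOUR_API_KEY"  # placeholder

FRAMING_LINES = [
    "raise NotImplementedError",
    "pass",
    "assert False",
    "return False",
    'print("Hello world!")',
]
BINARY_OPS = ["sum", "difference", "product"]               # 3 binary operations
UNARY_OPS = ["square", "cube", "quadruple", "square root"]  # 4 unary operations

def make_prompt(binary_op: str, unary_op: str, framing_line: str) -> str:
    """Prepend a distractor function whose body is the framing line to the
    docstring of the function we actually want completed."""
    return (
        "def distractor():\n"
        f"    {framing_line}\n"
        "\n"
        "def combine(a, b):\n"
        f'    """Return the {binary_op} of the {unary_op} of a and b."""\n'
    )

def complete(prompt: str) -> str:
    # temperature=0 gives greedy decoding, matching the paper's setup.
    response = openai.Completion.create(
        model="code-davinci-001",  # "davinci-001 version of Codex" per the paper
        prompt=prompt,
        temperature=0,
        max_tokens=128,
    )
    return response["choices"][0]["text"]

# 3 binary x 4 unary = 12 operation combinations, each tested under 5 framing lines.
for (binary_op, unary_op), line in itertools.product(
    itertools.product(BINARY_OPS, UNARY_OPS), FRAMING_LINES
):
    print(complete(make_prompt(binary_op, unary_op, line)))
```

With greedy decoding the completion is deterministic for a fixed prompt, so a single query per (operation pair, framing line) combination suffices to check whether the framing line leaks into the generated solution.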
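For the CodeGen rows, the paper says only that inference was run locally. A plausible reproduction, assuming Hugging Face `transformers` and an arbitrarily chosen checkpoint size (the paper names neither the library versions nor the model size):

```python
# Hedged sketch of local CodeGen inference via Hugging Face transformers.
# The checkpoint size (350M) and generation settings are assumptions; the
# paper does not specify hardware, dependency versions, or model size.
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_NAME = "Salesforce/codegen-350M-mono"  # assumed; larger variants exist

tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModelForCausalLM.from_pretrained(MODEL_NAME)

prompt = (
    "def combine(a, b):\n"
    '    """Return the sum of the squares of a and b."""\n'
)
inputs = tokenizer(prompt, return_tensors="pt")
# do_sample=False is greedy decoding, mirroring the Codex setting above.
outputs = model.generate(**inputs, max_new_tokens=64, do_sample=False)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```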