Does Writing with Language Models Reduce Content Diversity?
Authors: Vishakh Padmakumar, He He
ICLR 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In this work, we measure the impact of co-writing on diversity via a controlled experiment, where users write argumentative essays in three setups |
| Researcher Affiliation | Academia | Vishakh Padmakumar New York University vishakh@nyu.edu He He New York University hehe@cs.nyu.edu |
| Pseudocode | No | The paper does not contain any structured pseudocode or algorithm blocks. |
| Open Source Code | Yes | Code and data are available at https://github.com/vishakhpk/hai-diversity. |
| Open Datasets | Yes | For the sake of reproducibility, the essays with character-level logs as recorded by the interface along with all model suggestions presented to users will be released after the review period. [...] Code and data are available at https://github.com/vishakhpk/hai-diversity. |
| Dataset Splits | No | The paper describes a controlled experimental setup with different groups of participants and topics, stating 'In total, we obtained 10 essays on each of the 10 topics for each setting, resulting in 300 essays in total.' However, it does not specify a training, validation, or test split for a dataset in the context of model development or evaluation reproducibility. |
| Hardware Specification | No | The paper mentions using the 'Open AI API' (davinci, text-davinci-003, gpt-3.5-turbo) to obtain model continuations and summaries. However, it does not specify any hardware details (e.g., GPU models, CPU types, or server specifications) used by the authors for running their experiments or data processing. |
| Software Dependencies | No | The paper mentions software like 'Scikit-learn' and models such as 'davinci', 'text-davinci-003', 'gpt-3.5-turbo', and 'GPT2'. However, it does not provide specific version numbers for any of these software components, which is required for a reproducible description of ancillary software. |
| Experiment Setup | Yes | Specifically, we sample continuations from both models with a temperature of 0.9 and a frequency penalty of 0.5 (detailed parameters listed in Appendix A.2). Temperature: 0.9; Frequency penalty: 0.5; Presence penalty: 0.5 |
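
The sampling settings in the Experiment Setup row can be collected into a request payload. The sketch below is illustrative only and is not the authors' code: `build_completion_request` is a hypothetical helper, the prompt text is a placeholder, and the only values taken from the table are the temperature and the two penalties. The keys `temperature`, `frequency_penalty`, and `presence_penalty` follow the parameter names used by the OpenAI completions API. Any other setting (for example, a maximum token count) is not stated in the table and is deliberately left out.

```python
# Sampling settings quoted in the Experiment Setup row of the table.
SAMPLING_PARAMS = {
    "temperature": 0.9,
    "frequency_penalty": 0.5,
    "presence_penalty": 0.5,
}


def build_completion_request(model: str, prompt: str) -> dict:
    """Hypothetical helper: assemble keyword arguments for a completion request.

    It only builds a dictionary; no network call is made.
    """
    return {"model": model, "prompt": prompt, **SAMPLING_PARAMS}


if __name__ == "__main__":
    # "text-davinci-003" is one of the model names mentioned in the table;
    # the prompt is a placeholder, not text from the study.
    request = build_completion_request("text-davinci-003", "Essay so far ...")
    print(request)
```

Keeping the parameters in a single dictionary means the same settings are applied to every model queried, which matches the table's statement that both models were sampled with identical values.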