ReMoDetect: Reward Models Recognize Aligned LLM's Generations

Authors: Hyunseok Lee, Jihoon Tack, Jinwoo Shin

NeurIPS 2024

Reproducibility Variable Result LLM Response
Research Type Experimental We provide an extensive evaluation by considering six text domains across twelve aligned LLMs, where our method demonstrates state-of-the-art results. Code is available at https://github.com/hyunseoklee-ai/ReMoDetect. 4 Experiments We provide an empirical evaluation of ReMoDetect by investigating the following questions:
Researcher Affiliation Academia Hyunseok Lee1, Jihoon Tack1, Jinwoo Shin1 1Korea Advanced Institute of Science and Technology {hs.lee,jihoontack,jinwoos}@kaist.ac.kr
Pseudocode No The paper describes its methods using textual descriptions and mathematical equations but does not include explicit pseudocode or algorithm blocks.
Open Source Code Yes Code is available at https://github.com/hyunseoklee-ai/ReMoDetect.
Open Datasets Yes HC3. HC3 is a question-answering dataset that consists of answers written by humans and generated by ChatGPT corresponding to the same questions. The dataset is a collection of several domains: reddit_eli5, open_qa, wiki_csai, medicine, and finance. We used training samples of 2,200 and validation samples of 1,000, which is the same subset of HC3 as the prior work [6, 40].
Dataset Splits Yes We used training samples of 2,200 and validation samples of 1,000, which is the same subset of HC3 as the prior work [6, 40].
Hardware Specification Yes For the main development, we mainly use Intel(R) Xeon(R) Gold 6426Y CPU @ 2.50GHz and a single A6000 48GB GPU.
Software Dependencies No The paper mentions using the 'AdamW optimizer' and 'nltk framework' and specific reward models like 'OpenAssistant' and 'DeBERTa-v3-Large', but does not provide specific version numbers for software libraries or environments required for full reproducibility.
Experiment Setup Yes We use the AdamW optimizer with a learning rate of 2.0 × 10^-5 with 10% warmup and cosine decay and train for one epoch. For the λ constant for regularization using the replay buffer, we used λ = 0.01. For the β1, β2 parameters that control the contribution of the mixed data, we used 0.3 and 0.3.
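The quoted setup (AdamW, lr 2.0 × 10^-5, 10% linear warmup, cosine decay, λ = 0.01 on a replay-buffer regularizer) can be sketched in PyTorch. This is a minimal illustration, not the authors' code: the model, step count, and loss terms are hypothetical placeholders; only the optimizer and schedule hyperparameters come from the paper.

```python
import math
import torch

model = torch.nn.Linear(16, 1)      # stand-in for the reward-model head
total_steps = 100                   # hypothetical; one epoch in the paper
warmup_steps = int(0.1 * total_steps)  # 10% warmup

optimizer = torch.optim.AdamW(model.parameters(), lr=2.0e-5)

def lr_lambda(step: int) -> float:
    # Linear warmup to the base lr, then cosine decay to zero.
    if step < warmup_steps:
        return step / max(1, warmup_steps)
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return 0.5 * (1.0 + math.cos(math.pi * progress))

scheduler = torch.optim.lr_scheduler.LambdaLR(optimizer, lr_lambda)

lam = 0.01            # λ: weight on the replay-buffer regularization term
beta1 = beta2 = 0.3   # β1, β2: mixing weights for the mixed data

for step in range(total_steps):
    optimizer.zero_grad()
    # The paper's loss would be main_loss + lam * replay_loss; a dummy
    # loss is used here so the sketch runs end to end.
    loss = model(torch.randn(4, 16)).pow(2).mean()
    loss.backward()
    optimizer.step()
    scheduler.step()
```

The warmup/decay shape matches Hugging Face's `get_cosine_schedule_with_warmup`, which is a common way to realize "10% warmup and cosine decay" in practice.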