Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," Under Review, 2026).

Beyond Imitation: Leveraging Fine-grained Quality Signals for Alignment

Authors: Geyang Guo, Ranchi Zhao, Tianyi Tang, Xin Zhao, Ji-Rong Wen

ICLR 2024

| Reproducibility Variable | Result | LLM Response |
| --- | --- | --- |
| Research Type | Experimental | Extensive experiments have demonstrated the effectiveness of our approaches by comparing a number of competitive baselines. We release all the above-mentioned resources at https://github.com/RUCAIBox/FIGA. Section heading: 4 EXPERIMENT |
| Researcher Affiliation | Academia | (1) Gaoling School of Artificial Intelligence, Renmin University of China. (2) School of Information, Renmin University of China. |
| Pseudocode | Yes | Algorithm 1: FIGA Leveraging Fine-grained Quality Signals for Alignment |
| Open Source Code | Yes | We release all the above-mentioned resources at https://github.com/RUCAIBox/FIGA. |
| Open Datasets | Yes | For our SPA dataset mentioned in Section 3.1, we broadly select the following datasets as our initial instance pool: HH-RLHF (Bai et al., 2022a), ShareGPT (ShareGPT, 2023), InstructGPT-J Pairwise (Dahoas, 2023), SHP (Ethayarajh et al., 2022), and OpenOrca (Lian et al., 2023). |
| Dataset Splits | No | The paper mentions training on various datasets and evaluating on a test set, but it does not explicitly provide details about a validation dataset split (e.g., percentages or counts for validation). |
| Hardware Specification | No | The paper does not provide specific hardware details (e.g., GPU/CPU models, memory amounts) used for running its experiments. |
| Software Dependencies | No | The paper mentions 'OpenLLaMA2 (OpenLLMAI, 2023) library' and 'gpt-3.5-turbo', but does not provide specific version numbers for any software dependencies required to replicate the experiment. |
| Experiment Setup | Yes | For SFT, we set the learning rate to 1e-5 and the batch size to 128. We conduct 5 epochs of training... For FIGA, we set the parameters α = 1, β = 0.5, γ = 0 respectively. |
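
For anyone attempting a replication, the hyperparameters quoted in the Experiment Setup row could be recorded in a small configuration sketch like the one below. This is only an illustration: the class and field names (SFTSettings, FIGASettings, describe) are hypothetical and are not taken from the FIGA repository. The quoted passage is elided ("..."), so attributing the 5 epochs to the SFT stage is an assumption, and the roles of α, β, and γ are not described on this page.

```python
from dataclasses import dataclass, asdict


@dataclass(frozen=True)
class SFTSettings:
    """SFT values quoted in the Experiment Setup row (names are hypothetical)."""
    learning_rate: float = 1e-5   # "we set the learning rate to 1e-5"
    batch_size: int = 128         # "the batch size to 128"
    num_epochs: int = 5           # "5 epochs of training" (assumed to apply to SFT)


@dataclass(frozen=True)
class FIGASettings:
    """FIGA parameters quoted in the row; their roles are not stated on this page."""
    alpha: float = 1.0
    beta: float = 0.5
    gamma: float = 0.0


def describe(cfg) -> str:
    """Render a settings object as a comma-separated key=value string."""
    return ", ".join(f"{key}={value}" for key, value in asdict(cfg).items())


if __name__ == "__main__":
    print(describe(SFTSettings()))
    print(describe(FIGASettings()))
```

Settings the report flags as missing (hardware, software versions, validation split) are deliberately left out rather than guessed.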