Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al.: K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," Under Review, 2026.
Beyond Imitation: Leveraging Fine-grained Quality Signals for Alignment
Authors: Geyang Guo, Ranchi Zhao, Tianyi Tang, Xin Zhao, Ji-Rong Wen
ICLR 2024 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Extensive experiments have demonstrated the effectiveness of our approaches by comparing a number of competitive baselines. We release all the above-mentioned resources at https://github.com/RUCAIBox/FIGA. See also Section 4 (Experiment). |
| Researcher Affiliation | Academia | 1Gaoling School of Artificial Intelligence, Renmin University of China. 2School of Information, Renmin University of China. |
| Pseudocode | Yes | Algorithm 1: FIGA Leveraging Fine-grained Quality Signals for Alignment |
| Open Source Code | Yes | We release all the above-mentioned resources at https://github.com/RUCAIBox/FIGA. |
| Open Datasets | Yes | For our SPA dataset mentioned in Section 3.1, we broadly select the following datasets as our initial instance pool: HH-RLHF (Bai et al., 2022a), ShareGPT (ShareGPT, 2023), InstructGPT-J Pairwise (Dahoas, 2023), SHP (Ethayarajh et al., 2022), and OpenOrca (Lian et al., 2023). |
| Dataset Splits | No | The paper mentions training on various datasets and evaluating on a test set, but it does not explicitly provide details about a validation dataset split (e.g., percentages or counts for validation). |
| Hardware Specification | No | The paper does not provide specific hardware details (e.g., GPU/CPU models, memory amounts) used for running its experiments. |
| Software Dependencies | No | The paper mentions 'OpenLLaMA2 (OpenLLMAI, 2023) library' and 'gpt-3.5-turbo', but does not provide specific version numbers for any software dependencies required to replicate the experiment. |
| Experiment Setup | Yes | For SFT, we set the learning rate to 1e-5 and the batch size to 128. We conduct 5 epochs of training... For FIGA, we set the parameters α = 1, β = 0.5, γ = 0 respectively. |
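
For anyone attempting a replication, the hyperparameters quoted in the Experiment Setup row can be written down as a small configuration sketch. This is only an illustration: the class and field names (`SFTConfig`, `FIGAConfig`, `learning_rate`, and so on) are hypothetical and do not come from the paper or the FIGA repository, and the quoted excerpt is elided, so reading the 5 epochs as an SFT setting is an assumption. How α, β, and γ enter the FIGA objective is not described in the excerpt and is deliberately left out.

```python
from dataclasses import dataclass, asdict


@dataclass(frozen=True)
class SFTConfig:
    """Values quoted for SFT; field names are hypothetical."""
    learning_rate: float = 1e-5
    batch_size: int = 128
    # Assumption: the excerpt is elided, so the 5 epochs are read as applying to SFT.
    num_epochs: int = 5


@dataclass(frozen=True)
class FIGAConfig:
    """Values quoted for FIGA; their role in the loss is not covered here."""
    alpha: float = 1.0
    beta: float = 0.5
    gamma: float = 0.0


sft = SFTConfig()
figa = FIGAConfig()

# Print both configurations as plain dictionaries for logging or comparison.
print(asdict(sft))
print(asdict(figa))
```

Freezing the dataclasses keeps the recorded values from being changed by accident during a run; any setting not quoted in the table (optimizer, scheduler, sequence length) would still need to be taken from the paper or the released code.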