In-Context Learning State Vector with Inner and Momentum Optimization
Authors: Dongfang Li, Zhenyu Liu, Xinshuo Hu, Zetian Sun, Baotian Hu, Min Zhang
NeurIPS 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We conduct extensive experiments using Llama-2 and GPT-J in both zero-shot setting and few-shot setting. The experimental results show that our optimization method effectively enhances the state vector and achieves the state-of-the-art performance on diverse tasks. |
| Researcher Affiliation | Academia | Dongfang Li, Zhenyu Liu, Xinshuo Hu, Zetian Sun, Baotian Hu, Min Zhang. Harbin Institute of Technology (Shenzhen), Shenzhen, China. {crazyofapple, liuzhenyuhit}@gmail.com {hubaotian, zhangmin2021}@hit.edu.cn |
| Pseudocode | No | The paper describes algorithms and methods but does not include any explicitly labeled pseudocode blocks or algorithm figures. |
| Open Source Code | Yes | Code is available at https://github.com/HITsz-TMG/ICL-State-Vector |
| Open Datasets | Yes | Linguistics includes Antonym [Nguyen et al., 2017], Capitalize, Present-Past, and Singular Plural [Todd et al., 2023], focusing on transformations in the form or meaning of words. Translation is represented by the English-French [Lample et al., 2018] dataset, which involves translating English words into their French counterparts. Knowledge comprises Country-Capital [Todd et al., 2023], AG News [Zhang et al., 2015], Person-Sport, Person-Instrument, Person-Occupation, Product-Company, and Landmark Country [Hernandez et al., 2023], which are centred around question-to-answer mappings for commonsense knowledge queries. |
| Dataset Splits | Yes | The remaining instances are split into test and development sets with a 7:3 ratio. |
| Hardware Specification | Yes | We run all the experiments on a single NVIDIA A100 80G GPU. |
| Software Dependencies | No | The paper mentions using Llama-2 and GPT-J models but does not specify software versions for libraries like PyTorch, TensorFlow, or Python itself, which would be necessary for reproduction. |
| Experiment Setup | Yes | Each subset consists of 10 instances for demonstrations and one instance for a dummy query since we employ a 10-shot as the default ICL setting. ... We find the best layer for different tasks via the accuracy of the development set. For the inner optimization in 4.2, we choose the last seven state vectors to optimize. ... For the momentum optimization, we choose 0.5 as the retention rate for historical momentum from the options of 0.25, 0.5 and 0.75. |
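The momentum optimization quoted above (retention rate 0.5 for historical momentum) can be sketched as an exponential-moving-average update over per-step state vectors. This is a hedged illustration only: the function name, the NumPy representation, and the EMA form are assumptions for exposition, not the authors' released implementation.

```python
import numpy as np

def momentum_aggregate(state_vectors, beta=0.5):
    """Aggregate a sequence of ICL state vectors with momentum.

    state_vectors: iterable of same-shape arrays (one per optimization step).
    beta: retention rate for historical momentum; the paper selects 0.5
          from {0.25, 0.5, 0.75} on the development set.
    """
    momentum = np.zeros_like(state_vectors[0], dtype=float)
    for v in state_vectors:
        # Keep a beta fraction of the history, blend in the new vector.
        momentum = beta * momentum + (1.0 - beta) * v
    return momentum
```

For example, aggregating the last seven state vectors (as in the paper's inner-optimization setting) would call `momentum_aggregate(vectors[-7:])`; whether the released code combines the two steps this way is not stated in the table above.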