Natural Language Instruction-following with Task-related Language Development and Translation

Authors: Jing-Cheng Pang, Xin-Yu Yang, Si-Hang Yang, Xiong-Hui Chen, Yang Yu

NeurIPS 2023

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "We conduct a series of experiments to evaluate the effectiveness of TALAR and answer the following questions: (1) How does TALAR perform compared to existing NLC-RL approaches when learning an instruction-following policy? (Section 5.1) (2) Can TALAR learn effective task language? (Section 5.2) (3) Can TL acquire any compositional structure and serve as an abstraction for hierarchical RL? (Section 5.3) (4) What is the impact of each component on the overall performance of TALAR? (Section 5.4)"
Researcher Affiliation | Collaboration | National Key Laboratory of Novel Software Technology, Nanjing University; Polixir Technology. {pangjc,yangxy,yangsh,chenxh}@lamda.nju.edu.cn, yuy@nju.edu.cn
Pseudocode | Yes | Algorithm 1: Training procedure of the TL generator. Algorithm 2: Training procedure of the translator. Algorithm 3: Training procedure of the instruction-following policy.
Open Source Code | No | The paper mentions using "the open-sourced RL repository, stable-baselines3 [65]" for implementation, but provides no link to, or statement about releasing, its own research code.
Open Datasets | Yes | "We conduct experiments in Franka Kitchen [8] and CLEVR-Robot [9] environments, as shown in Fig. 3."
Dataset Splits | No | The paper states "We split the NL instructions into two tasks: training and the testing set" but specifies neither a validation set nor the exact split proportions for the datasets, which would be needed for reproduction.
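Because the split proportions are unreported, a reproduction must choose its own. A minimal sketch of such a split is below; the 80/20 ratio, the fixed seed, and the `split_instructions` helper are all assumptions for illustration, not details from the paper.

```python
import random


def split_instructions(instructions, test_ratio=0.2, seed=0):
    """Shuffle and split NL instructions into train/test sets.

    The 0.2 test ratio and the seed are assumed values; the paper
    does not report its exact split proportions.
    """
    rng = random.Random(seed)  # fixed seed so the split is reproducible
    shuffled = list(instructions)
    rng.shuffle(shuffled)
    n_test = int(len(shuffled) * test_ratio)
    return shuffled[n_test:], shuffled[:n_test]


# Usage with placeholder instructions:
train, test = split_instructions([f"instr-{i}" for i in range(100)])
print(len(train), len(test))  # 80 20
```

Logging the seed alongside the resulting split sizes is what would let a later run reconstruct the same partition.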
Hardware Specification | No | The paper does not report hardware details such as GPU or CPU models, or cloud computing instance types, used for running the experiments.
Software Dependencies | No | The paper mentions "we utilize the open-sourced RL repository, stable-baselines3 [65]" but does not give version numbers for it or for other software dependencies.
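Since no versions are pinned, anyone attempting a reproduction would have to record their own environment. A small stdlib-only sketch for doing so is below; the `environment_report` helper and the particular package names queried are illustrative assumptions, not part of the paper's tooling.

```python
import platform
from importlib import metadata


def environment_report(packages):
    """Return the Python version plus installed versions of the given
    packages, marking any that are absent as 'not installed'."""
    report = {"python": platform.python_version()}
    for name in packages:
        try:
            report[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            report[name] = "not installed"
    return report


# Usage: packages a TALAR reproduction would plausibly depend on.
print(environment_report(["stable-baselines3", "torch", "gym"]))
```

Committing such a report next to experiment results gives later readers the version information the paper itself omits.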
Experiment Setup | Yes | "The hyper-parameters for implementing TALAR are presented in Table 2. When implementing baseline methods, we use the same hyper-parameters of PPO for policy learning."