EcomGPT: Instruction-Tuning Large Language Models with Chain-of-Task Tasks for E-commerce

Authors: Yangning Li, Shirong Ma, Xiaobin Wang, Shen Huang, Chengyue Jiang, Hai-Tao Zheng, Pengjun Xie, Fei Huang, Yong Jiang

AAAI 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Extensive experiments and human evaluations demonstrate that EcomGPT outperforms ChatGPT in terms of cross-dataset/task generalization on E-commerce tasks.
Researcher Affiliation | Collaboration | SIGS, Tsinghua University; ShanghaiTech University; DAMO Academy, Alibaba Group; Peng Cheng Laboratory
Pseudocode | No | The paper does not contain structured pseudocode or algorithm blocks (clearly labeled algorithm sections or code-like formatted procedures).
Open Source Code | Yes | The EcomGPT will be public at https://github.com/Alibaba-NLP/EcomGPT.
Open Datasets | Yes | We manually collected a wide range of E-commerce natural language processing (NLP) datasets from open data sources, such as academic websites and data competition platforms.
Dataset Splits | No | "The EcomInstruct dataset is divided into two partitions, namely training and testing." The paper specifies training and testing splits but does not explicitly describe a separate validation split.
Hardware Specification | Yes | All experiments are run on 4 NVIDIA A100 SXM4 80GB GPUs.
Software Dependencies | No | The paper mentions the AdamW optimizer, models such as BLOOMZ and ChatGPT, and a tool named 'Alpaca Garbage Collector' (with a URL), but it does not give version numbers for the programming languages, libraries, or frameworks (e.g., Python, PyTorch/TensorFlow, CUDA) needed for replication.
Experiment Setup | Yes | The AdamW (Loshchilov and Hutter 2017) optimizer is employed for model training, with the learning rate set to 2e-5 and weight decay set to 0. We utilize a cosine learning rate schedule, warming up over 3% of the training steps. The model is fine-tuned for 3 epochs, with the per-device batch size set to 4 and the gradient accumulation step set to 8. The maximum sequence length is 1024.
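
For reference, the reported hyperparameters map directly onto a standard Hugging Face TrainingArguments configuration. The sketch below is a minimal illustration under the assumption that a Transformers-style trainer is used; the output directory and the commented-out precision setting are placeholders or assumptions, since the paper reports only the hyperparameter values themselves.

```python
# Minimal sketch, assuming a Hugging Face Transformers training setup.
# The paper reports only the hyperparameter values; the output directory and
# anything marked "assumption" below is illustrative, not from the paper.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="ecomgpt-sft",          # hypothetical output directory
    num_train_epochs=3,                # fine-tuned for 3 epochs
    per_device_train_batch_size=4,     # batch size per device = 4
    gradient_accumulation_steps=8,     # gradient accumulation step = 8
    learning_rate=2e-5,                # AdamW learning rate
    weight_decay=0.0,                  # weight decay of 0
    lr_scheduler_type="cosine",        # cosine learning rate schedule
    warmup_ratio=0.03,                 # warm up over 3% of the training steps
    optim="adamw_torch",               # AdamW optimizer
    # bf16=True,                       # assumption: mixed precision on the reported A100 GPUs
)

# The 1024-token maximum sequence length is enforced at tokenization time,
# e.g. tokenizer(text, truncation=True, max_length=1024), not via TrainingArguments.
```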