reproducibilityindex.ai

CulturePark: Boosting Cross-cultural Understanding in Large Language Models

Authors: Cheng Li, Damien Teney, Linyi Yang, Qingsong Wen, Xing Xie, Jindong Wang

NeurIPS 2024

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	We evaluated these models across three downstream tasks: content moderation, cultural alignment, and cultural education.
Researcher Affiliation	Collaboration	Cheng Li (Institute of Software, CAS); Damien Teney (Idiap Research Institute, contact@damienteney.info); Linyi Yang (Westlake University, yanglinyi@westlake.edu.cn); Qingsong Wen (Squirrel AI, qingsongedu@gmail.com); Xing Xie (Microsoft Research, xing.xie@microsoft.com); Jindong Wang (William & Mary, jwang80@wm.edu)
Pseudocode	Yes	Figure 9: Pipeline of data refinement.
Open Source Code	Yes	Code is released at https://github.com/Scarelette/CulturePark.
Open Datasets	Yes	The seed questions initiating the communication have two sources: World Values Survey (WVS) [Survey, 2022b] and Global Attitudes surveys (GAS) from Pew Research Center [Survey, 2022a].
Dataset Splits	No	The paper mentions 41k samples used for fine-tuning and a test set for evaluation, but does not explicitly provide training/validation/test splits for the fine-tuning data or the source datasets.
Hardware Specification	No	The paper mentions using GPT-3.5-Turbo and fine-tuning Llama-2-70b models but does not provide specific hardware details such as GPU/CPU models, memory, or processor types used for these operations.
Software Dependencies	No	The paper mentions using "text-embedding-3-small" and "K-means" but does not provide specific version numbers for key software components or libraries required for replication, nor a comprehensive list of dependencies.
Experiment Setup	Yes	Hyperparameters are shown in Table 6. Table 6: Details on Fine-tuning GPT-3.5-turbo using OpenAI API. Model [various] Epochs [various numbers].