SocioDojo: Building Lifelong Analytical Agents with Real-world Text and Time Series
Authors: Junyan Cheng, Peter Chin
ICLR 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We perform experiments and ablation studies to explore the factors that impact performance. The results show that our proposed method achieves improvements of 32.4% and 30.4% compared to the state-of-the-art method in the two experimental settings. [...] We also perform experiments and ablation studies to explore factors that impact performance. [...] 5 EXPERIMENT: In Section 5.1, we present our experimental setup. We then evaluate our proposed H&P prompting and compare it with other state-of-the-art prompting techniques in Section 5.2. Finally, we discuss the results of the ablation studies in Section 5.3. |
| Researcher Affiliation | Academia | Junyan Cheng, Thayer School of Engineering, Dartmouth College, Hanover, NH 03755, USA, jc.th@dartmouth.edu; Peter Chin, Thayer School of Engineering, Dartmouth College, Hanover, NH 03755, USA, pc@dartmouth.edu |
| Pseudocode | No | The paper describes the agent's architecture and prompting process in detail but does not include any sections explicitly labeled 'Pseudocode' or 'Algorithm'. |
| Open Source Code | Yes | Our code and data are available at https://github.com/chengjunyan1/SocioDojo. |
| Open Datasets | Yes | Our code and data are available at https://github.com/chengjunyan1/SocioDojo. [...] SocioDojo uses three components: Information Sources, Time series, and Knowledge base & Tools, based on 30 GB of high-quality real-world data that we have collected [...] We collect time series on a variety of topics for a comprehensive probe of the world state, including financial data from Yahoo Finance, economic time series from the St. Louis Federal Reserve Economic Data Database (FRED) with a popularity rating of more than 50%, Google trends of free trending keywords from Exploding Topics, a society trend tracking service, political polls from FiveThirtyEight, a famous political analysis website, and public opinion poll trackers from YouGov, an online survey platform. |
| Dataset Splits | No | The paper describes a lifelong learning environment where agents are 'constantly updated with the latest messages' and evaluated based on performance 'over time' (e.g., 'The game begins on 2021-10-01 and ends on 2023-08-01'). This indicates a continuous, time-based evaluation rather than traditional, fixed training/validation/test dataset splits. |
| Hardware Specification | No | The paper does not specify the hardware (e.g., specific GPU or CPU models, memory, or cloud instances) used to run the experiments. |
| Software Dependencies | No | The paper mentions 'ChromaDB' and 'Instructor-XL (Su et al., 2023)' as components and 'GPT-3.5-Turbo series' and 'GPT-4' as foundation models. However, it does not provide specific version numbers for these software dependencies (e.g., ChromaDB vX.Y.Z or Instructor-XL vA.B). |
| Experiment Setup | Yes | In our experiment, we set this number [max steps for analyst] to 4. [...] It can call one of the query interfaces to find resources at each step. The query is handled iteratively through a multi-round dialog response loop with a max step of 3 in our experiment. [...] It initiates a multi-round dialog action loop with a max step of 5 in our experiment when an analysis report is received. [...] All methods use AAA architecture, one news channel as the information source, GPT-3.5-Turbo series as foundation models with a low temperature=0.2 for a more deterministic experiment result. [...] We chose a bound of 5, as it is possible to achieve a return of 5 times for a single asset without leverage in around 2 years, which is the time span of SocioDojo. [...] Therefore, we selected overnight rates of FRD: 0.10, WEB: 0.05, and a bound of 5 as our experimental setting. |
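
For readers attempting a reproduction, the settings quoted in the Experiment Setup and Dataset Splits rows can be gathered into one configuration object. The following is a minimal sketch, not the authors' code: the class name `ExperimentSetup` and every field name are hypothetical, and only the values themselves come from the quoted passages. The units of the overnight rates are not stated in the quotes, so they are stored as given.

```python
from dataclasses import dataclass, field
from datetime import date


@dataclass(frozen=True)
class ExperimentSetup:
    """Values quoted from the paper; all names here are hypothetical."""

    analyst_max_steps: int = 4        # max steps for the analyst
    query_loop_max_steps: int = 3     # multi-round dialog response loop
    action_loop_max_steps: int = 5    # multi-round dialog action loop
    foundation_model: str = "GPT-3.5-Turbo series"
    temperature: float = 0.2          # low, for more deterministic results
    return_bound: float = 5.0         # "a bound of 5"
    # Overnight rates as quoted; units are not specified in the quotes.
    overnight_rates: dict = field(
        default_factory=lambda: {"FRD": 0.10, "WEB": 0.05}
    )
    game_start: date = date(2021, 10, 1)
    game_end: date = date(2023, 8, 1)

    def span_days(self) -> int:
        """Length of the evaluation window in days."""
        return (self.game_end - self.game_start).days


setup = ExperimentSetup()
print(setup.span_days(), "days, about",
      round(setup.span_days() / 365.25, 2), "years")
```

The evaluation window works out to 669 days, which is consistent with the "around 2 years" time span cited as the rationale for the bound of 5.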