Get Rid of Isolation: A Continuous Multi-task Spatio-Temporal Learning Framework

Authors: Zhongchao Yi, Zhengyang Zhou, Qihe Huang, Yanjiang Chen, Liheng Yu, Xu Wang, Yang Wang

NeurIPS 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We further establish a benchmark of three cities for multi-task spatiotemporal learning, and empirically demonstrate the superiority of CMuST via extensive evaluations on these datasets.
Researcher Affiliation | Academia | Zhongchao Yi (1), Zhengyang Zhou (1,2,3), Qihe Huang (1), Yanjiang Chen (1), Liheng Yu (1), Xu Wang (1,2), Yang Wang (1,2); (1) University of Science and Technology of China (USTC), Hefei, China; (2) Suzhou Institute for Advanced Research, USTC, Suzhou, China; (3) State Key Laboratory of Resources and Environmental Information System, Beijing, China
Pseudocode | Yes | Algorithm 1: Rolling Adaptation Process
Open Source Code | Yes | Code is available at https://github.com/DILab-USTCSZ/CMuST.
Open Datasets | Yes | NYC: includes three months of crowd flow and taxi hailing data from Manhattan and its surrounding areas in New York City, encompassing four tasks: Crowd In, Crowd Out, Taxi Pick, and Taxi Drop (https://www.nyc.gov/site/tlc/about/tlc-trip-record-data.page). Chicago: comprises traffic data collected in the second half of 2023 from Chicago, including three tasks: Taxi Pick, Taxi Drop, and Risk (https://data.cityofchicago.org/browse).
Dataset Splits | Yes | Datasets were partitioned into training, validation, and testing sets with a 7:1:2 ratio (see the split sketch after this table).
Hardware Specification | Yes | The model was implemented with PyTorch on a Linux system equipped with a Tesla V100 (16 GB).
Software Dependencies | No | The paper mentions PyTorch as the implementation framework but does not specify its version or any other software dependencies with their respective version numbers.
Experiment Setup | Yes | For MSTI, the embedding dimensions were d_obs = 24, d_s = 12, and d_t = 60, and the prompt dimension was 72. The self-attention and cross-attention dimensions were 168 and 24, respectively, each with 4 heads, and the FFN hidden dimension was 256. The Adam optimizer was adopted with an initial learning rate of 1×10⁻³ and a weight decay of 3×10⁻⁴, with early stopping applied. For RoAda, the threshold was δ = 10⁻⁶ (see the configuration sketch after this table).
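
The paper reports only the 7:1:2 ratio for the dataset splits. A chronological, non-shuffled split along the time axis is the usual convention for spatio-temporal forecasting; that ordering, the helper function, and the example array shapes below are assumptions, not details from the paper.

```python
import numpy as np

def chronological_split(data: np.ndarray, ratios=(0.7, 0.1, 0.2)):
    """Split a time-ordered array into train/val/test along the time axis.

    The 7:1:2 ratio comes from the paper; the chronological (non-shuffled)
    ordering is an assumption, chosen because shuffled splits would leak
    future observations into training for forecasting tasks.
    """
    n = data.shape[0]
    n_train = int(n * ratios[0])
    n_val = int(n * ratios[1])
    train = data[:n_train]
    val = data[n_train:n_train + n_val]
    test = data[n_train + n_val:]
    return train, val, test

# Hypothetical example: 90 days of hourly observations, 100 regions, 4 tasks.
series = np.random.rand(90 * 24, 100, 4)
train, val, test = chronological_split(series)
print(train.shape, val.shape, test.shape)  # (1512, 100, 4) (216, 100, 4) (432, 100, 4)
```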
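The reported hyperparameters can be gathered into a single configuration; a minimal PyTorch sketch follows. Only the numeric values come from the paper: the dataclass layout, field names, stand-in model, and the early-stopping patience of 10 epochs are illustrative assumptions.

```python
from dataclasses import dataclass

import torch

# Numeric values are those reported in the paper's setup; the field names
# and this dataclass layout are assumptions for readability.
@dataclass
class CMuSTConfig:
    d_obs: int = 24            # observation embedding dimension
    d_s: int = 12              # spatial embedding dimension
    d_t: int = 60              # temporal embedding dimension
    d_prompt: int = 72         # prompt dimension
    d_self_attn: int = 168     # self-attention dimension
    d_cross_attn: int = 24     # cross-attention dimension
    n_heads: int = 4           # heads per attention block
    d_ffn: int = 256           # FFN hidden dimension
    lr: float = 1e-3           # initial learning rate for Adam
    weight_decay: float = 3e-4
    roada_delta: float = 1e-6  # RoAda convergence threshold

cfg = CMuSTConfig()
model = torch.nn.Linear(cfg.d_obs, 1)  # stand-in module; the real model is CMuST
optimizer = torch.optim.Adam(
    model.parameters(), lr=cfg.lr, weight_decay=cfg.weight_decay
)

# Early stopping on validation loss, as mentioned in the setup; the
# patience value is an assumption (the paper does not report one).
best_val, bad_epochs, patience = float("inf"), 0, 10
for epoch in range(200):
    # train_one_epoch(model, optimizer, train_loader)  # hypothetical helper
    val_loss = 1.0 / (epoch + 1)  # placeholder for a real validation pass
    if val_loss < best_val:
        best_val, bad_epochs = val_loss, 0
    else:
        bad_epochs += 1
        if bad_epochs >= patience:
            break
```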