Reproducibility Index

Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026).

SPOT-Trip: Dual-Preference Driven Out-of-Town Trip Recommendation

Authors: Yinghui Liu, Hao Miao, Guojiang Shen, Yan Zhao, Xiangjie Kong, Ivan Lee

NeurIPS 2025 | Venue PDF | LLM Run Details

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	Extensive experiments on real data offer insight into the effectiveness of the proposed solutions, showing that SPOT-Trip achieves performance improvement by up to 17.01%. We report on extensive experiments using real data, offering evidence of the effectiveness of the proposals.
Researcher Affiliation	Academia	Yinghui Liu¹, Hao Miao², Guojiang Shen¹, Yan Zhao³, Xiangjie Kong¹, Ivan Lee⁴. ¹Zhejiang Key Laboratory of Visual Information Intelligent Processing, College of Computer Science and Technology, Zhejiang University of Technology, Hangzhou, China; ²Department of Computing, Hong Kong Polytechnic University, Hong Kong, China; ³Shenzhen Institute for Advanced Study, University of Electronic Science and Technology of China, Shenzhen, China; ⁴STEM, University of South Australia, Adelaide, Australia
Pseudocode	Yes	The detailed algorithm for the optimization and recommendation phases of SPOT-Trip are summarized in Algorithm 1 and 2, to facilitate understanding and implementation.
Open Source Code	Yes	The code of SPOT-Trip can be found at https://github.com/Yinghui-Liu/SPOT-Trip.
Open Datasets	Yes	The experiments are carried out on two widely-used travel behavior datasets: Foursquare and Yelp. We conduct experiments on two public real-world datasets: Foursquare and Yelp. Footnotes: https://sites.google.com/site/yangdingqi/home/foursquare-dataset, https://www.yelp.com.tw/dataset
Dataset Splits	Yes	Subsequently, the datasets were randomly partitioned by user into training, validation, and testing sets with a ratio of 80%, 10%, and 10%, respectively.
Hardware Specification	Yes	We implement our model using the Pytorch framework on NVIDIA GeForce RTX 4090 GPU.
Software Dependencies	No	We implement our model using the Pytorch framework on NVIDIA GeForce RTX 4090 GPU. Similarly, we use a differentiable dopri5 ODE solver with rtol = atol = 10⁻⁵ from torchdiffeq package [9]. The paper mentions software components such as PyTorch and the torchdiffeq package but does not provide specific version numbers for them, which are required for a reproducible description of software dependencies.
Experiment Setup	Yes	The learning rate is set to 0.001 with Adam optimizer. Additionally, the batch size and training epochs are set to 32 and 1000, respectively... The optimizer is uniformly chosen as Adam with an initial learning rate of 0.001 and L2 regularization with a weight of 10⁻⁵. To avoid overfitting, we adopt the early stop strategy with an 8-epoch patience... the number of Transformer layers in module ODPL is set to 4, while both f(·) and λ(·) are implemented as 3-layer MLPs. The latent dimensions of these MLPs are searched from {16, 32, 64, 128, 256}, with 128 selected as the optimal value... For the static-dynamic preference fusion (Sec. 3.3), the number of Transformer layers is set to 1... All Transformer layers employ 4 attention heads. The hyper-parameter σ_vo is tuned separately for each dataset, with the optimal value set to 0.6 for Foursquare and 0.4 for Yelp. In the optimization stage, the weights of the loss terms β1, β2 and β3 for two datasets are set as 1.
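
The user-level 80%/10%/10% partition quoted in the Dataset Splits row can be illustrated with a minimal sketch. It is not taken from the SPOT-Trip repository: the function names `split_users` and `split_records`, the seed, and the (user_id, payload) record layout are assumptions made for illustration, and the paper's own shuffling and rounding may differ.

```python
import random


def split_users(user_ids, ratios=(0.8, 0.1, 0.1), seed=0):
    """Randomly partition unique user IDs into train/validation/test sets.

    The seed is an assumption; the quoted passage does not state one.
    """
    users = sorted(set(user_ids))   # fixed starting order before shuffling
    rng = random.Random(seed)
    rng.shuffle(users)
    n_train = int(ratios[0] * len(users))
    n_val = int(ratios[1] * len(users))
    train = set(users[:n_train])
    val = set(users[n_train:n_train + n_val])
    test = set(users[n_train + n_val:])  # remainder goes to the test set
    return train, val, test


def split_records(records, seed=0):
    """Route each (user_id, payload) record to the split that owns its user."""
    train_u, val_u, test_u = split_users([u for u, _ in records], seed=seed)
    splits = {"train": [], "val": [], "test": []}
    for user, payload in records:
        if user in train_u:
            splits["train"].append((user, payload))
        elif user in val_u:
            splits["val"].append((user, payload))
        else:
            splits["test"].append((user, payload))
    return splits
```

Splitting by user rather than by individual record keeps all of one user's records in a single partition, so no user appears in both the training and the evaluation data.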