Evaluation of Trajectory Distribution Predictions with Energy Score

Authors: Novin Shahroudi, Mihkel Lepson, Meelis Kull

ICML 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We conduct a series of experiments highlighting the importance of adopting proper scoring rules as a distribution-aware evaluation of trajectory distribution predictions. We empirically demonstrate the consequence of adopting an improper score for evaluation and how it can go wrong in Section 6.1 through a showcase of propriety. We also empirically demonstrate the effect of the trajectory size K in Section 6.2. To see the energy score in action, we perform a real data experiment on the ETH/UCY dataset (Ess et al., 2007) in Section 6.3.
Researcher Affiliation | Academia | Institute of Computer Science, University of Tartu, Tartu, Tartu County, Estonia. Correspondence to: Novin Shahroudi <novin.shahroudi@ut.ee>, Mihkel Lepson <mihkel.lepson@ut.ee>, Meelis Kull <meelis.kull@ut.ee>.
Pseudocode | No | The paper provides mathematical definitions and descriptions of metrics but does not include any pseudocode or algorithm blocks.
Open Source Code | Yes | The code for our experiments is available at https://github.com/novinsh/trajectoryprediction-eval-with-energy-score.
Open Datasets | Yes | To see the energy score in action, we perform a real data experiment on the ETH/UCY dataset (Ess et al., 2007) in Section 6.3.
Dataset Splits | No | The paper does not explicitly provide details about training, validation, or test dataset splits for its own experiments. It focuses on evaluating pre-trained models on the ETH/UCY dataset.
Hardware Specification | No | The paper does not provide any specific details about the hardware (e.g., GPU models, CPU types, or cloud configurations) used to run the experiments.
Software Dependencies | No | The paper does not specify any software dependencies with version numbers.
Experiment Setup | Yes | We set the ground truth parameters to be µ_t = 1, σ_t = 0.2, a_t = 0, and b_t = 0 for t = {1, 2, 3}. Then, we generate N = 5000 observations and consider K = {10, 20, 50, 100, 300} to generate predictions from the same process.
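
The synthetic setup quoted in the Experiment Setup row can be sketched roughly as follows. This is an illustration under stated assumptions, not the authors' code (their implementation is in the repository linked above). The summary does not define a_t and b_t, so the sketch assumes that with both set to 0 the process reduces to independent Gaussian draws with mean 1 and standard deviation 0.2 at each of the three time steps. It also assumes a fresh set of K predicted trajectories is drawn for every observation, and it uses the standard sample-based estimate of the energy score, E||X - y|| - 0.5 E||X - X'||. The function names `energy_score`, `sample_process`, and `mean_energy_score` are hypothetical.

```python
import numpy as np

# Values quoted in the Experiment Setup row.
MU, SIGMA, T = 1.0, 0.2, 3
N_OBS = 5000
K_VALUES = (10, 20, 50, 100, 300)


def energy_score(samples, y):
    """Sample-based energy score of K predicted trajectories for one observation.

    samples: array of shape (K, T); y: array of shape (T,).
    Estimates E||X - y|| - 0.5 * E||X - X'||, where the second term is
    averaged over all K*K sample pairs (zero diagonal included).
    Lower values are better.
    """
    samples = np.asarray(samples, dtype=float)
    y = np.asarray(y, dtype=float)
    dist_to_obs = np.linalg.norm(samples - y, axis=1).mean()
    diffs = samples[:, None, :] - samples[None, :, :]  # shape (K, K, T)
    pairwise = np.linalg.norm(diffs, axis=2).mean()
    return dist_to_obs - 0.5 * pairwise


def sample_process(n, rng):
    """Draw n trajectories of length T.

    ASSUMPTION: with a_t = b_t = 0 the ground-truth process is taken to be
    independent N(MU, SIGMA^2) at every time step; the summary does not
    say what a_t and b_t control.
    """
    return rng.normal(MU, SIGMA, size=(n, T))


def mean_energy_score(k, n_obs=N_OBS, seed=0):
    """Average energy score over n_obs observations with k predictions each.

    ASSUMPTION: a fresh set of k predictions is drawn per observation.
    """
    rng = np.random.default_rng(seed)
    observations = sample_process(n_obs, rng)
    scores = [energy_score(sample_process(k, rng), y) for y in observations]
    return float(np.mean(scores))
```

Calling `mean_energy_score(k)` for each `k` in `K_VALUES` gives one average score per trajectory-set size, which is one way to look at how K affects the estimate. Passing a smaller `n_obs` makes a quick check faster, since the pairwise term costs on the order of K squared per observation.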