When Hashing Met Matching: Efficient Spatio-Temporal Search for Ridesharing

Authors: Chinmoy Dutta90-98

AAAI 2021 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable Result LLM Response
Research Type Experimental Experiments with large real-world datasets show that our algorithm consistently outperforms state-of-the-art heuristic methods thereby proving its practical applicability.
Researcher Affiliation Industry Chinmoy Dutta Turing Research chinmoy@turingresearch.ai
Pseudocode Yes Algorithm 1: Algorithm for approximate OMS.
Open Source Code Yes 1Code available at https://github.com/chinmoy-dutta/ridematch_lsh
Open Datasets Yes We used the publicly available New York city yellow taxi trip datasets for our experiments (available from https:// www1.nyc.gov/site/tlc/about/tlc-trip-record-data.page).
Dataset Splits No The paper describes using sub-sampled rates of the datasets for varying load, and evaluating against an optimal solution, but does not specify explicit training, validation, or test dataset splits for model development or evaluation in a reproducible manner.
Hardware Specification Yes Our experiments were carried out on a machine with a 2.4 GHz Intel i9 processor and 64 GB RAM.
Software Dependencies No The paper mentions using 'Open Source Routing Machine (OSRM)' and the 'Ball-Tree algorithm' for near-neighbor search, but does not provide specific version numbers for these or any other software dependencies.
Experiment Setup Yes To evaluate our algorithm under varying traffic patterns, we constructed two experiment scenarios: (i) Morning commute, consisting of 21310 rides requested on 2016-06-08 (Wed) between 8:00 AM and 9:00 AM local time, and (ii) Evening commute, consisting of 19610 rides requested on the same day between 6:00 and 7:00 PM local time. For varying load, we sub-sampled the rides in each scenario at sub-sampling rates of 20%, 40%, 60%, 80% and 100% (full dataset). For our experiments, we chose travel duration to be the cost function. Match feasibility was obtained by enforcing a matching constraint for maximum allowable pickup delay of 10 minutes. We allowed k = 10 potential matches per ride to construct the sparse shareability network. For the LSH algorithm, a space discretization of geohash-7 and time discretization of 20 minutes were chosen.