Enhancing Preference-based Linear Bandits via Human Response Time
Authors: Shen Li, Yuyang Zhang, Zhaolin Ren, Claire Liang, Na Li, Julie A Shah
NeurIPS 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Simulations on three real-world datasets demonstrate that using response times significantly accelerates preference learning compared to choice-only approaches. Additional materials, such as code, slides, and talk video, are available at https://shenlirobot.github.io/pages/NeurIPS24.html. |
| Researcher Affiliation | Academia | 1Massachusetts Institute of Technology 2Harvard University {shenli,cyl48}@mit.edu, julie_a_shah@csail.mit.edu {yuyangzhang,zhaolinren}@g.harvard.edu, nali@seas.harvard.edu |
| Pseudocode | Yes | The pseudocode for GSE is shown in Algorithm 1. |
| Open Source Code | Yes | Additional materials, such as code, slides, and talk video, are available at https://shenlirobot.github.io/pages/NeurIPS24.html. |
| Open Datasets | Yes | Simulations using three real-world datasets [16, 39, 57]... We accessed the food-risk dataset with choices (-1 or 1) [57] through Yang and Krajbich [76]'s repository (https://osf.io/d7s6c/). ... We accessed the snack dataset with choices (yes or no) [16] through the supplementary material provided by Alós-Ferrer et al. [2] at https://www.journals.uchicago.edu/doi/abs/10.1086/713732. ... We accessed the snack dataset with choices (-1 or 1) [39] via Fudenberg et al. [27]'s replication package at https://www.aeaweb.org/articles?id=10.1257/aer.20150742. |
| Dataset Splits | No | The paper discusses training and testing data for its experiments (e.g., 'The training data was collected from a YN task...' and 'The testing data was collected using a two-alternative forced-choice task...'), but does not explicitly mention a separate validation split or subset used for hyperparameter tuning or model selection. |
| Hardware Specification | Yes | Our empirical experiments (Sec. 5) were conducted on a MacBook Pro (M3 Pro, Nov 2023) with 36 GB of memory. |
| Software Dependencies | No | The code is written in Julia and builds on the implementation by Tirinzoni and Degenne [63]... Simulations and Bayesian inference for the DDM are implemented using the Julia package SequentialSamplingModels.jl... The paper mentions specific software (Julia, SequentialSamplingModels.jl) but does not provide specific version numbers for these dependencies. |
| Experiment Setup | Yes | The algorithm uses a hyperparameter η to control the number of phases... We manually tuned the buffer size Bbuff in Algorithm 1 to 20, 30, or 50 seconds... We considered η ∈ {2, 3, 4, 5, 6, 7, 8, 9}... After tuning η, we manually set the buffer size Bbuff in Algorithm 1 to 10 seconds... |