Predicting Choice with Set-Dependent Aggregation
Authors: Nir Rosenfeld, Kojin Oshiba, Yaron Singer
ICML 2020
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experiments on three large choice datasets demonstrate the utility of our approach. We now present our experimental evaluation on real choice data. Our goal here is to show that aggregation performs well and at scale, and to support our results from Sec. 3. |
| Researcher Affiliation | Academia | 1School of Engineering and Applied Sciences, Harvard University. Correspondence to: Nir Rosenfeld <nirr@seas.harvard.edu>. |
| Pseudocode | No | The paper does not contain a clearly labeled 'Pseudocode' or 'Algorithm' block. |
| Open Source Code | No | The paper does not provide a specific link to source code for the described methodology, nor does it contain an explicit statement about the code being released or available in supplementary materials. |
| Open Datasets | Yes | Datasets. We evaluate our method on three large datasets: flight itineraries from Amadeus (see Mottini & Acuna-Agost, 2017), hotel reservations from Expedia (www.kaggle.com/c/expedia-personalized-sort), and news recommendations from Outbrain (www.kaggle.com/c/outbrain-click-prediction). |
| Dataset Splits | Yes | Results are based on averaging 10 random 50:25:25 train-validation-test splits. (See the split sketch after this table.) |
| Hardware Specification | No | The paper does not provide specific details about the hardware (e.g., GPU/CPU models, memory amounts) used for running its experiments. |
| Software Dependencies | No | The paper mentions using 'Adam' for optimization but does not provide specific version numbers for any software dependencies or libraries used in the experiments. |
| Experiment Setup | Yes | Both set functions w and r are small permutation-invariant neural networks (Zaheer et al., 2017), each having 2 hidden layers of 16 units each with tanh activations and mean pooling. Our main results use ℓ = 24... For all methods we tuned regularization, dropout, and learning rate (when applicable) using Bayesian optimization. Default values were used for other hyper-parameters. For optimization we used Adam with step-wise exponential decay. (See the architecture sketch after this table.) |
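
For concreteness, below is a minimal PyTorch sketch of the set functions described in the Experiment Setup row. Only the details quoted from the paper are fixed (two hidden layers of 16 tanh units, mean pooling, Adam with step-wise exponential decay); the class name, input dimension, learning rate, and decay constants are illustrative assumptions, not values reported by the authors.

```python
import torch
import torch.nn as nn

class SetFunction(nn.Module):
    """Permutation-invariant set network (DeepSets-style, Zaheer et al., 2017):
    2 hidden layers of 16 tanh units applied per item, then mean pooling over the set."""
    def __init__(self, item_dim):  # item_dim is an assumption; the paper does not state it
        super().__init__()
        self.phi = nn.Sequential(          # applied independently to every item in the set
            nn.Linear(item_dim, 16), nn.Tanh(),
            nn.Linear(16, 16), nn.Tanh(),
        )

    def forward(self, items):              # items: (set_size, item_dim)
        return self.phi(items).mean(dim=0)  # mean pooling gives permutation invariance

# Two such networks, w and r, as in the paper's setup (item_dim=32 is illustrative).
w, r = SetFunction(item_dim=32), SetFunction(item_dim=32)

# Adam with step-wise exponential decay, as stated; lr, step_size, and gamma are assumptions.
params = list(w.parameters()) + list(r.parameters())
optimizer = torch.optim.Adam(params, lr=1e-3)
scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=10, gamma=0.9)
```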
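
The evaluation protocol from the Dataset Splits row (10 random 50:25:25 train-validation-test splits, results averaged) could be reproduced along the following lines. The `evaluate` callback and the use of seeds 0-9 are hypothetical placeholders; the paper does not specify how the splits were seeded.

```python
import numpy as np

def split_indices(n, seed):
    """One random 50:25:25 train-validation-test split over n examples."""
    rng = np.random.default_rng(seed)
    perm = rng.permutation(n)
    n_train, n_val = int(0.50 * n), int(0.25 * n)
    return perm[:n_train], perm[n_train:n_train + n_val], perm[n_train + n_val:]

def averaged_score(n_examples, evaluate, n_splits=10):
    """Average test performance over n_splits random splits; `evaluate` is a
    hypothetical function standing in for training plus scoring on one split."""
    scores = []
    for seed in range(n_splits):
        train_idx, val_idx, test_idx = split_indices(n_examples, seed)
        scores.append(evaluate(train_idx, val_idx, test_idx))
    return float(np.mean(scores))
```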