PCMC-Net: Feature-based Pairwise Choice Markov Chains
Authors: Alix Lhéritier
ICLR 2020
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experiments show our network significantly outperforming, in terms of prediction accuracy and logarithmic loss, feature engineered standard and latent class Multinomial Logit models as well as recent machine learning approaches. |
| Researcher Affiliation | Industry | Alix Lhéritier Amadeus SAS F-06902 Sophia-Antipolis, France alix.lheritier@amadeus.com |
| Pseudocode | No | The paper includes an architecture diagram in Figure 1, but no explicit pseudocode or algorithm blocks. |
| Open Source Code | Yes | Code is available at https://github.com/alherit/PCMC-Net. |
| Open Datasets | Yes | We used the dataset from Mottini & Acuna-Agost (2017) consisting of flight bookings sessions on a set of European origins and destinations. |
| Dataset Splits | Yes | Early stopping is performed during training if no significant improvement (greater than 0.01 with respect to the best log loss obtained so far) is made on a validation set (a random sample consisting of 10% of the choice sessions from the training set) during 5 epochs. |
| Hardware Specification | No | The paper does not explicitly describe the specific hardware (e.g., CPU, GPU models, or cloud computing instances with their specifications) used for running its experiments. |
| Software Dependencies | No | PCMC-Net was implemented in PyTorch (Paszke et al., 2017). Stochastic gradient optimization is performed with Adam (Kingma & Ba, 2015). |
| Experiment Setup | Yes | We instantiate PCMC-Net with an identity representation layer with da = 4 and d0 = 0 and a transition rate layer with h ∈ {1, 2, 3} hidden layers of ν = 16 nodes with Leaky ReLU activation (slope = 0.01) and ϵ = 0.5. We trained it with the Adam optimizer with one choice set per iteration, a learning rate of 0.001 and no dropout, for 100 epochs. Table 2 (hyperparameters optimized with Bayesian optimization, format: parameter — range — best value): learning rate — {10^-i}, i = 1...6 — 0.001; batch size (in sessions) — {2^i}, i = 0...4 — 16; hidden layers in fwq — {1, 2, 3} — 2; nodes per layer in fwq — {2^i}, i = 5...9 — 512; activation — {ReLU, Sigmoid, Tanh, Leaky ReLU} — Leaky ReLU. |
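The transition rate layer described above can be sketched in PyTorch as a small MLP over pairwise alternative features. This is a hedged illustration only, not the authors' code: the class name `TransitionRateLayer`, the use of softplus plus ϵ to keep rates strictly positive, and the input dimension are assumptions; the hidden-layer count, width, LeakyReLU slope, and ϵ = 0.5 follow the quoted setup.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class TransitionRateLayer(nn.Module):
    """Hypothetical sketch of PCMC-Net's transition rate network:
    h hidden layers of `width` nodes with LeakyReLU (slope 0.01),
    producing a strictly positive rate q(i, j) for a pair of alternatives."""

    def __init__(self, in_dim, hidden_layers=2, width=512, eps=0.5):
        super().__init__()
        layers, d = [], in_dim
        for _ in range(hidden_layers):
            layers += [nn.Linear(d, width), nn.LeakyReLU(0.01)]
            d = width
        layers.append(nn.Linear(d, 1))  # scalar rate per alternative pair
        self.net = nn.Sequential(*layers)
        self.eps = eps

    def forward(self, pair_features):
        # softplus keeps the output non-negative; adding eps bounds the
        # rate away from zero (the exact positivity device is an assumption).
        return F.softplus(self.net(pair_features)) + self.eps
```

A usage sketch: feeding a batch of 4 pairwise feature vectors of dimension 8 yields 4 scalar rates, each at least ϵ = 0.5.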