Learning with Posterior Sampling for Revenue Management under Time-varying Demand

Authors: Kazuma Shimizu, Junya Honda, Shinji Ito, Shinji Nakadai

IJCAI 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Our empirical study shows that the proposed algorithm performs better than other benchmark algorithms and comparably to the optimal policy in hindsight. We also propose a heuristic modification of the proposed algorithm, which further efficiently learns the pricing policy in the experiments.
Researcher Affiliation | Collaboration | (1) NEC Corporation; (2) Kyoto University; (3) RIKEN AIP; (4) Intent Exchange, Inc.
Pseudocode | Yes | Algorithm 1: TS-episodic; Algorithm 2: TS-dynamic
Open Source Code | Yes | The code of the experiments is available at: https://github.com/NECDSresearch2007/RM-TSepisodic-and-dynamic.
Open Datasets | No | The paper uses simulated demand distributions based on specified parameters (e.g., Poisson distribution with mean parameters λ(t, p) = 50 exp(-(p + t)/5)). It does not use a publicly available dataset or provide a link to a generated dataset.
Dataset Splits | No | The paper describes the number of episodes and independent trials for its simulations (e.g., S = 5000 episodes, 100 trials) but does not specify traditional train/validation/test dataset splits.
Hardware Specification | No | The paper does not provide any specific hardware details such as GPU/CPU models, memory, or cloud instance types used for running the experiments.
Software Dependencies | No | The paper mentions using linear programming (LP) and 'off-the-shelf solvers' but does not specify any software names with version numbers (e.g., Python, specific LP solvers, or libraries).
Experiment Setup | Yes | Experimental Settings: We consider the set of K = 9 prices P = {1, 2, ..., 9} with a shut-off price p_∞. The selling horizon is set to T = 10. The true demand distribution is set to Poisson distributions with mean demand parameters λ(t, p) = 50 exp(-(p + t)/5), depending on the time t and price p. ... The initial inventory is set to n_0 = 1000 and 50... Independent Gamma Prior: For Example 1 in Section 2.1, we set prior gamma distributions with shape α = 10 and scale β = 1 for all k ∈ [K] and t ∈ [T]. Gaussian Process (GP) Prior: For Example 2 in Section 2.1, we took the mean function µ as a zero function and the kernel function as an anisotropic radial basis function kernel defined as K((p, t), (p', t')) = exp(-(t - t')^2/σ_t^2 - (p - p')^2/σ_p^2), where σ_t = 3 and σ_p = 2.5.
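
The quantities in the experiment setup can be sketched in a few lines of Python. This is an illustrative reading of the setup, not the authors' released code. It assumes the exponent of the mean demand is negative, so that demand falls with price and time, and that the gamma prior is updated with the standard conjugate Gamma-Poisson rule. The function names (mean_demand, rbf_kernel, gamma_posterior, sample_mean_demand) are hypothetical.

```python
import math
import random

PRICES = list(range(1, 10))   # K = 9 prices, P = {1, ..., 9}
T = 10                        # selling horizon
SIGMA_T, SIGMA_P = 3.0, 2.5   # kernel length scales from the setup


def mean_demand(t, p):
    """Mean demand lambda(t, p) = 50 * exp(-(p + t) / 5).

    The negative sign in the exponent is an assumption of this sketch.
    """
    return 50.0 * math.exp(-(p + t) / 5.0)


def rbf_kernel(p, t, p2, t2):
    """Anisotropic RBF kernel between the points (p, t) and (p2, t2)."""
    return math.exp(-((t - t2) ** 2) / SIGMA_T ** 2
                    - ((p - p2) ** 2) / SIGMA_P ** 2)


def gamma_posterior(observations, shape=10.0, scale=1.0):
    """Conjugate Gamma-Poisson update (assumed, standard form).

    Prior Gamma(shape, scale) on a Poisson mean; after observing counts
    d_1..d_n the posterior has shape + sum(d_i) and rate 1/scale + n.
    Returns (posterior_shape, posterior_scale).
    """
    post_shape = shape + sum(observations)
    post_scale = 1.0 / (1.0 / scale + len(observations))
    return post_shape, post_scale


def sample_mean_demand(observations, rng):
    """Draw one posterior sample of the mean demand for a (t, p) cell."""
    shape, scale = gamma_posterior(observations)
    # random.gammavariate takes (shape, scale) in that order.
    return rng.gammavariate(shape, scale)


if __name__ == "__main__":
    rng = random.Random(0)
    print("lambda(1, 1) =", mean_demand(1, 1))
    print("k((1,1),(2,3)) =", rbf_kernel(1, 1, 2, 3))
    print("posterior sample:", sample_mean_demand([30, 35, 28], rng))
```

A posterior-sampling policy would draw one such sample per (t, p) cell and plug the sampled means into its pricing optimization. That step is omitted here because the summary does not describe it in enough detail.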