A Robust Test for the Stationarity Assumption in Sequential Decision Making
Authors: Jitao Wang, Chengchun Shi, Zhenke Wu
ICML 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Extensive comparative simulations and a real-world interventional mobile health example illustrate the advantages of our method in detecting change points and optimizing long-term rewards in high-dimensional, non-stationary environments. |
| Researcher Affiliation | Academia | ¹Department of Biostatistics, University of Michigan, Ann Arbor; ²Department of Statistics, London School of Economics and Political Science. |
| Pseudocode | Yes | Algorithm 1 Proposed testing procedure |
| Open Source Code | Yes | Code is available at https://github.com/jtwang95/Double_CUSUM_RL. |
| Open Datasets | Yes | In this section, we apply the proposed testing procedure to a real-world mobile health dataset collected from a micro-randomized trial (MRT) aimed at improving the health outcomes of medical interns in the United States by sending push notifications through a mobile app to induce and maintain healthy behaviors related to physical activity, sleep and mood (NeCamp et al., 2020). |
| Dataset Splits | No | The paper describes a random, even division of subject indices into two disjoint sets I1 and I2 for cross-fitting (see the cross-fitting sketch after the table), but does not provide explicit train/validation/test splits with percentages or sample counts for model-evaluation purposes. |
| Hardware Specification | No | No specific hardware details (exact GPU/CPU models, processor types, memory amounts, or detailed computer specifications) were provided for running the experiments. |
| Software Dependencies | No | The paper mentions using neural networks, Gaussian mixture models (GMM), logistic regression (LR), and double deep Q network (double DQN) algorithms, and specifies architectural details and activation functions. However, it does not provide specific version numbers for any of the software libraries or frameworks used (e.g., PyTorch, TensorFlow, scikit-learn). |
| Experiment Setup | Yes | To implement the proposed test, the boundary removal parameter ϵ is set to 0.1. 5000 bootstrap samples are generated to compute p-values. ... In the context of continuous state-space MDP with binary actions, H is set to be the class of feed-forward neural networks that contain a single hidden layer with 32 neurons and the sigmoid function as the activation function. ... The loss function is set to be the Gaussian negative log likelihood. ... We set the neural network used to learn (p[0,t], p[t,T]) to have two hidden layers with 128 nodes in each layer along with the ReLU activation function. The corresponding learning rate is set to 0.001. ... The neural net with structure [32, 64, 128, 64, 32] serves as the backbone of the Q network and the discount factor is set to 0.9. (These parameters are illustrated in the sketches after the table.) |
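
To make the role of the quoted test parameters concrete, the sketch below shows a generic max-type CUSUM change-point test with a multiplier bootstrap, illustrating where the boundary-removal parameter ϵ = 0.1 and the 5000 bootstrap samples enter. This is only an illustration of the general pattern, not the paper's doubly robust statistic from Algorithm 1; the function name and the mean-shift statistic are placeholders.

```python
import numpy as np

def cusum_pvalue(x, eps=0.1, n_boot=5000, seed=0):
    """Generic max-type CUSUM test for a mean shift in a 1-D series.
    Placeholder illustration only; NOT the paper's test statistic."""
    rng = np.random.default_rng(seed)
    x = np.asarray(x, dtype=float)
    T = len(x)
    # Boundary removal: only consider candidate change points in
    # [eps * T, (1 - eps) * T].
    lo = max(int(eps * T), 1)
    hi = min(int((1 - eps) * T), T - 1)

    def max_stat(z):
        best = 0.0
        for t in range(lo, hi + 1):
            w = np.sqrt(t * (T - t) / T)
            best = max(best, w * abs(z[:t].mean() - z[t:].mean()))
        return best

    observed = max_stat(x)
    centered = x - x.mean()
    # Multiplier (wild) bootstrap: perturb the centered series with
    # i.i.d. standard normal weights and recompute the statistic.
    boots = np.array([max_stat(rng.standard_normal(T) * centered)
                      for _ in range(n_boot)])
    return (1 + np.sum(boots >= observed)) / (1 + n_boot)
```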
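
For the cross-fitting step mentioned under Dataset Splits, here is a minimal sketch of an even random division of subject indices into two disjoint sets I1 and I2, assuming NumPy; the function name and the subject count are illustrative, not taken from the paper's repository.

```python
import numpy as np

def crossfit_split(n_subjects, seed=0):
    # Randomly and evenly divide subject indices into two disjoint
    # index sets I1 and I2 for cross-fitting.
    rng = np.random.default_rng(seed)
    perm = rng.permutation(n_subjects)
    half = n_subjects // 2
    return np.sort(perm[:half]), np.sort(perm[half:])

I1, I2 = crossfit_split(100)  # e.g., 50 subjects in each fold
```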
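
The architectural details quoted under Experiment Setup map directly onto standard PyTorch modules. The sketch below reproduces only the stated hyperparameters (layer widths, activations, learning rate, discount factor); everything else, including the function names, input/output dimensions, and the ReLU activations inside the Q backbone, is an assumption rather than the authors' released code.

```python
import torch.nn as nn
import torch.optim as optim

# Function class H: feed-forward net with a single hidden layer of
# 32 neurons and sigmoid activation; per the paper it is trained
# with the Gaussian negative log-likelihood (nn.GaussianNLLLoss).
def make_h_network(state_dim, out_dim=1):
    return nn.Sequential(
        nn.Linear(state_dim, 32),
        nn.Sigmoid(),
        nn.Linear(32, out_dim),
    )

# Network for learning (p[0,t], p[t,T]): two hidden layers of 128
# nodes with ReLU activations, trained with learning rate 0.001.
def make_density_network(in_dim, out_dim):
    return nn.Sequential(
        nn.Linear(in_dim, 128), nn.ReLU(),
        nn.Linear(128, 128), nn.ReLU(),
        nn.Linear(128, out_dim),
    )

# Q-network backbone for the double DQN with hidden widths
# [32, 64, 128, 64, 32]; ReLU between layers is assumed.
def make_q_network(state_dim, n_actions=2):
    widths, layers, prev = [32, 64, 128, 64, 32], [], state_dim
    for w in widths:
        layers += [nn.Linear(prev, w), nn.ReLU()]
        prev = w
    layers.append(nn.Linear(prev, n_actions))
    return nn.Sequential(*layers)

gamma = 0.9  # discount factor quoted above
density_net = make_density_network(in_dim=4, out_dim=2)  # dims illustrative
optimizer = optim.Adam(density_net.parameters(), lr=0.001)
```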