reproducibilityindex.ai

Anytime-Competitive Reinforcement Learning with Policy Prior

Authors: Jianyi Yang, Pengfei Li, Tongxin Li, Adam Wierman, Shaolei Ren

NeurIPS 2023

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	Experiments on the application of carbon-intelligent computing verify the reward performance and cost constraint guarantee of ACRL. We experiment with the application of resource management for carbon-aware computing [49] to empirically show the benefits of ACRL. Figure 1(a) gives the regret over the first 500 episodes. Figure 1(b) shows the regret with different λ and b, demonstrating the trade-off between reward optimization and the satisfaction of anytime competitive constraints. Figure 1(c) shows the probability of the violation of the anytime competitive constraints by RL and constrained RL.
Researcher Affiliation	Academia	Jianyi Yang UC Riverside Riverside, CA, USA jyang239@ucr.edu Pengfei Li UC Riverside Riverside, CA, USA pli081@ucr.edu Tongxin Li CUHK Shenzhen Shenzhen, Guangdong, China litongxin@cuhk.edu.cn Adam Wierman Caltech Pasadena, CA, USA adamw@caltech.edu Shaolei Ren UC Riverside Riverside, CA, USA shaolei@ucr.edu
Pseudocode	Yes	Algorithm 1 Anytime-Competitive Decision-making (ACD) Algorithm 2 Anytime-Competitive Reinforcement Learning (ACRL)
Open Source Code	No	The paper does not provide any statement about releasing source code or a link to a code repository for its methodology.
Open Datasets	Yes	We experiment with the application of resource management for carbon-aware computing [49]... The concrete settings can be found in Appendix A. Appendix A: Experiment Setup ...electricity price data from California ISO [47] and carbon intensity data provided by WattTime. [47] California Independent System Operator. California renewable datasets. https://www.caiso.com/Pages/default.aspx, 2023.
Dataset Splits	No	The paper mentions using datasets for experiments but does not specify any training, validation, or test dataset splits.
Hardware Specification	No	The paper discusses applications in cloud workload scheduling and datacenters, implying computational resources. However, it does not provide specific hardware details (e.g., GPU/CPU models, memory) used for running its experiments.
Software Dependencies	No	The paper describes its algorithms but does not list specific software dependencies with version numbers (e.g., Python, PyTorch, TensorFlow versions).
Experiment Setup	Yes	The concrete settings can be found in Appendix A. Appendix A: Experiment Setup... The time horizon is set to H = 24. For a time step h, the state x_h = (carbon price, electricity price). The action a_h represents the workload scheduling decision... The reward r_h is the sum of revenues... The cost c_h is the computing latency. ...The workload processing rate is set to 2...
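
The rows above mention a violation probability for the anytime competitive constraints (Figure 1(c)) and a horizon of H = 24, but the table does not spell out the constraint itself. The following is a minimal sketch of how such a violation rate could be estimated from logged per-step costs. It assumes the constraint has the form "cumulative policy cost up to step h is at most (1 + λ) times the cumulative cost of the policy prior, plus h·b, for every h"; that form is our reading and is not stated in this record. The function names and the synthetic cost sequences are hypothetical, not taken from the paper or its data.

```python
from typing import List, Optional, Sequence, Tuple

H = 24  # time horizon quoted from Appendix A in the table above


def first_violation(
    policy_costs: Sequence[float],
    prior_costs: Sequence[float],
    lam: float,
    b: float,
) -> Optional[int]:
    """Return the first step h (1-indexed) at which the ASSUMED constraint
    cum_policy(h) <= (1 + lam) * cum_prior(h) + h * b fails, else None."""
    cum_policy = 0.0
    cum_prior = 0.0
    for h, (c, c_prior) in enumerate(zip(policy_costs, prior_costs), start=1):
        cum_policy += c
        cum_prior += c_prior
        if cum_policy > (1.0 + lam) * cum_prior + h * b:
            return h
    return None


def violation_rate(
    episodes: List[Tuple[Sequence[float], Sequence[float]]],
    lam: float,
    b: float,
) -> float:
    """Fraction of episodes that break the assumed constraint at any step."""
    violated = sum(
        1 for policy_costs, prior_costs in episodes
        if first_violation(policy_costs, prior_costs, lam, b) is not None
    )
    return violated / len(episodes)


# Synthetic, illustrative cost sequences (not data from the paper).
prior = [1.0] * H          # hypothetical per-step cost of the policy prior
same = [1.0] * H           # a policy that matches the prior exactly
spiky = [1.0] * H
spiky[2] = 5.0             # one expensive step at h = 3

print(first_violation(spiky, prior, lam=0.0, b=0.0))
print(violation_rate([(same, prior), (spiky, prior)], lam=0.0, b=0.0))
```

Under this assumed form, larger λ or b loosens the bound, which is consistent with the reward-versus-constraint trade-off the table attributes to Figure 1(b).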