Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].

Thompson Sampling Efficiently Learns to Control Diffusion Processes

Authors: Mohamad Kazem Shirani Faradonbeh, Mohamad Sadegh Shirani Faradonbeh, Mohsen Bayati

NeurIPS 2022

Reproducibility Variable Result LLM Response
Research Type Experimental We validate our theoretical results through empirical simulations with real matrices.
Researcher Affiliation Academia Mohamad Kazem Shirani Faradonbeh, Department of Statistics, University of Georgia, Athens, GA 30602; Mohamad Sadegh Shirani Faradonbeh, Graduate School of Business, Stanford University, Stanford, CA 94305; Mohsen Bayati, Graduate School of Business, Stanford University, Stanford, CA 94305
Pseudocode Yes Algorithm 1: Stabilization under Uncertainty (...) Algorithm 2: Thompson Sampling for Efficient Control of Diffusion Processes
Open Source Code No The paper states in its checklist that code is included to reproduce the results (3a: 'Yes; See Section 6'). However, Section 6, 'Numerical Analysis', provides neither a link to a code repository nor instructions on where to find the source code. It only references a 'longer version of the paper [54]', which is an arXiv preprint.
Open Datasets Yes We empirically evaluate the theoretical results of Theorems 1 and 2 for the flight control of X-29A airplane at 2000 ft [49].
Dataset Splits No The paper does not explicitly provide training, validation, or test dataset splits. It discusses simulations for a flight control problem using 'true drift matrices' and episodic learning, rather than traditional dataset splitting.
Hardware Specification No The paper does not provide specific details about the hardware used to run the experiments, such as GPU or CPU models. The ethics checklist also states 'No' for including compute resources.
Software Dependencies No The paper does not list specific software dependencies with version numbers used for the experiments. It describes algorithms and theoretical foundations, and presents numerical analysis without specifying the software environment.
Experiment Setup Yes Further, we let $\Sigma_W = 0.25 I_p$, $Q_x = I_p$, and $Q_u = 0.1 I_q$, where $I_n$ is the $n \times n$ identity matrix. To update the diffusion process $x_t$ in (1), time-steps of length $10^{-3}$ are employed. Then, in Algorithm 1, we let $\sigma_w = 5$, $\kappa = \tau^{3/2}$, while $\tau$ varies from 4 to 20 seconds. The initial feedback $K$ is generated randomly. (...) On the right hand side of Figure 1, Algorithm 2 is executed for 600 seconds, for $\tau_n = 20 \times 1.1^n$. We compare TS with the Randomized Estimate algorithm [2] for 100 different repetitions.
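
The experiment setup quoted above can be illustrated with a minimal Python sketch. It is not the authors' code. It assumes the diffusion has the linear form dx_t = (A x_t + B u_t) dt + dW_t with noise covariance 0.25 I and a quadratic running cost with weights Q_x = I and Q_u = 0.1 I, discretized by an Euler-Maruyama scheme with step 10^{-3}. It also assumes that the episode index n starts at 0 and that the values tau_n = 20 * 1.1^n mark update times within the 600-second horizon. The matrices A, B, and K below are hypothetical placeholders, not the X-29A drift matrices from [49], and the function names are illustrative only.

```python
import numpy as np


def episode_end_times(horizon=600.0, base=20.0, growth=1.1):
    """List tau_n = base * growth**n for n = 0, 1, ... while tau_n <= horizon.

    Assumption: n starts at 0 and tau_n are time points, not episode lengths.
    """
    times = []
    n = 0
    while base * growth**n <= horizon:
        times.append(base * growth**n)
        n += 1
    return times


def simulate(A, B, K, T=1.0, dt=1e-3, noise_var=0.25, q_x=1.0, q_u=0.1, seed=0):
    """Euler-Maruyama simulation of dx = (A x + B u) dt + dW with u = K x.

    Assumes noise covariance noise_var * I and running cost
    q_x * |x|^2 + q_u * |u|^2 (i.e. Q_x = q_x * I, Q_u = q_u * I).
    Returns the final state and the accumulated cost over [0, T].
    """
    rng = np.random.default_rng(seed)
    p = A.shape[0]
    x = np.zeros(p)
    cost = 0.0
    for _ in range(int(round(T / dt))):
        u = K @ x
        cost += (q_x * (x @ x) + q_u * (u @ u)) * dt
        # Wiener increment over dt has covariance noise_var * dt * I.
        dw = rng.normal(scale=np.sqrt(noise_var * dt), size=p)
        x = x + (A @ x + B @ u) * dt + dw
    return x, cost


# Hypothetical 2-state, 1-input system (placeholder values only).
A = np.array([[-1.0, 0.5], [0.0, -2.0]])
B = np.array([[1.0], [0.5]])
K = np.zeros((1, 2))  # zero feedback; A is already stable in this toy example

times = episode_end_times()
x_final, total_cost = simulate(A, B, K)
```

A Thompson sampling loop in the spirit of Algorithm 2 would resample the drift parameters and recompute the feedback K at each time in `times`; that step is omitted here because it depends on details not given in this summary.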