Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].
Thompson Sampling Efficiently Learns to Control Diffusion Processes
Authors: Mohamad Kazem Shirani Faradonbeh, Mohamad Sadegh Shirani Faradonbeh, Mohsen Bayati
NeurIPS 2022 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We validate our theoretical results through empirical simulations with real matrices. |
| Researcher Affiliation | Academia | Mohamad Kazem Shirani Faradonbeh, Department of Statistics, University of Georgia, Athens, GA 30602; Mohamad Sadegh Shirani Faradonbeh, Graduate School of Business, Stanford University, Stanford, CA 94305; Mohsen Bayati, Graduate School of Business, Stanford University, Stanford, CA 94305 |
| Pseudocode | Yes | Algorithm 1: Stabilization under Uncertainty (...) Algorithm 2: Thompson Sampling for Efficient Control of Diffusion Processes |
| Open Source Code | No | The paper states in its ethics checklist that code is included to reproduce results (3a: 'Yes; See Section 6'). However, Section 6, 'Numerical Analysis', does not provide a direct link to a code repository or explicit instructions on where to find the source code. It only references a 'longer version of the paper [54]', which is another arXiv preprint. |
| Open Datasets | Yes | We empirically evaluate the theoretical results of Theorems 1 and 2 for the flight control of X-29A airplane at 2000 ft [49]. |
| Dataset Splits | No | The paper does not explicitly provide training, validation, or test dataset splits. It discusses simulations for a flight control problem using 'true drift matrices' and episodic learning, rather than traditional dataset splitting. |
| Hardware Specification | No | The paper does not provide specific details about the hardware used to run the experiments, such as GPU or CPU models. The ethics checklist also states 'No' for including compute resources. |
| Software Dependencies | No | The paper does not list specific software dependencies with version numbers used for the experiments. It describes algorithms and theoretical foundations, and presents numerical analysis without specifying the software environment. |
| Experiment Setup | Yes | Further, we let $\Sigma_W = 0.25 I_p$, $Q_x = I_p$, and $Q_u = 0.1 I_q$, where $I_n$ is the n by n identity matrix. To update the diffusion process $x_t$ in (1), time-steps of length $10^{-3}$ are employed. Then, in Algorithm 1, we let $\sigma_w = 5$, $\kappa = \tau^{3/2}$, while $\tau$ varies from 4 to 20 seconds. The initial feedback $K$ is generated randomly. (...) On the right hand side of Figure 1, Algorithm 2 is executed for 600 seconds, for $\tau_n = 20 \times 1.1^n$. We compare TS with the Randomized Estimate algorithm [2] for 100 different repetitions. |
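
The setup quoted in the Experiment Setup row can be illustrated with a short simulation sketch. This is not the authors' code, which is not linked from the paper. It assumes that equation (1) has the linear form dx = (A x + B u) dt + dW with feedback u = K x, and it uses an Euler-Maruyama discretization with the quoted step length, noise covariance, and cost weights. The function names, the placeholder matrices a caller would pass in (not the X-29A values), and the choice to start the episode index at n = 0 are all assumptions made for illustration.

```python
import numpy as np


def simulate_feedback(A, B, K, x0, duration, dt=1e-3, rng=None):
    """Euler-Maruyama simulation of dx = (A x + B u) dt + dW with u = K x.

    Assumed form of the diffusion process; A, B, K are caller-supplied
    placeholders, not the matrices used in the paper.
    Returns the final state and the accumulated quadratic cost.
    """
    rng = np.random.default_rng() if rng is None else rng
    p, q = B.shape

    # Values quoted in the Experiment Setup row.
    noise_cov = 0.25 * np.eye(p)   # Sigma_W = 0.25 I_p
    Qx = np.eye(p)                 # Q_x = I_p
    Qu = 0.1 * np.eye(q)           # Q_u = 0.1 I_q

    chol = np.linalg.cholesky(noise_cov)
    n_steps = int(round(duration / dt))

    x = np.asarray(x0, dtype=float).copy()
    cost = 0.0
    for _ in range(n_steps):
        u = K @ x
        # Riemann-sum approximation of the running quadratic cost.
        cost += (x @ Qx @ x + u @ Qu @ u) * dt
        # Brownian increment with covariance Sigma_W * dt.
        dW = chol @ rng.standard_normal(p) * np.sqrt(dt)
        x = x + (A @ x + B @ u) * dt + dW
    return x, cost


def episode_times(horizon=600.0, base=20.0, growth=1.1):
    """Episode times tau_n = base * growth**n that fit within the horizon.

    Starting at n = 0 is an assumption; the quoted text does not say.
    """
    times = []
    n = 0
    while base * growth ** n <= horizon:
        times.append(base * growth ** n)
        n += 1
    return times
```

A caller would supply its own drift matrices and feedback gain, for example `simulate_feedback(A, B, K, x0, duration=4.0)` for the shortest stabilization window quoted above, and could average the returned cost over repeated runs with different seeds to mimic the 100 repetitions.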