Improving Sample Complexity Bounds for (Natural) Actor-Critic Algorithms
Authors: Tengyu Xu, Zhe Wang, Yingbin Liang
NeurIPS 2020
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Theoretical | This is the first theoretical study establishing that AC and NAC attain an order-wise performance improvement over PG and NPG in the infinite-horizon setting due to the incorporation of the critic. |
| Researcher Affiliation | Academia | Department of ECE, The Ohio State University |
| Pseudocode | Yes | Algorithm 1: Actor-critic (AC) and natural actor-critic (NAC) online algorithms; Algorithm 2: Minibatch-TD(s_ini, π, φ, β, T_c, M). A hedged sketch of the minibatch-TD critic step follows the table. |
| Open Source Code | No | The paper does not provide any explicit statements about releasing source code for the described methodology or links to a code repository. |
| Open Datasets | No | The paper is theoretical and does not conduct empirical experiments on a specific dataset; thus, it does not describe any training dataset or its availability. |
| Dataset Splits | No | The paper is theoretical and does not report on empirical experiments; thus, it does not specify any dataset splits for training, validation, or testing. |
| Hardware Specification | No | The paper is a theoretical study and does not report on empirical experiments; therefore, it does not specify any hardware used. |
| Software Dependencies | No | The paper is a theoretical study and does not report on empirical experiments; therefore, it does not specify any software dependencies with version numbers for replication. |
| Experiment Setup | No | The paper describes algorithms with general parameters such as 'actor stepsize α, critic stepsize β, regularization λ' but does not provide specific hyperparameter values or system-level training settings for an empirical experimental setup. |
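To make the Algorithm 2 signature listed above concrete, here is a minimal Python sketch of a minibatch TD(0) critic with linear features whose arguments mirror Minibatch-TD(s_ini, π, φ, β, T_c, M): initial state, policy, feature map, critic stepsize, number of critic iterations, and minibatch size. This is an illustrative reconstruction, not the paper's pseudocode; the `env` interface, the `radius` argument, and the projection step are assumptions added for a self-contained example.

```python
import numpy as np

def minibatch_td(env, s_init, pi, phi, beta, T_c, M, radius=10.0, seed=0):
    """Hypothetical minibatch TD(0) critic with linear function approximation.

    Sketch only: parameter names follow the signature
    Minibatch-TD(s_ini, pi, phi, beta, T_c, M) quoted in the table above.
    `env`, `radius`, and the projection step are assumptions.
    """
    rng = np.random.default_rng(seed)
    d = phi(s_init).shape[0]          # feature dimension of the critic
    theta = np.zeros(d)               # value-function weights
    s = s_init
    for _ in range(T_c):              # T_c critic iterations
        g = np.zeros(d)
        for _ in range(M):            # average the TD update over a minibatch of M samples
            a = pi(s, rng)            # sample an action from policy pi
            s_next, r = env.step(s, a)
            delta = r + env.gamma * phi(s_next) @ theta - phi(s) @ theta
            g += delta * phi(s)
            s = s_next
        theta = theta + beta * g / M  # minibatch TD(0) step with stepsize beta
        norm = np.linalg.norm(theta)  # projection onto a norm ball: a common
        if norm > radius:             # stabilization device in TD analyses,
            theta *= radius / norm    # assumed here rather than taken from the paper
    return theta
```

A caller would supply an `env` with a `step(s, a) -> (s_next, r)` method and a `gamma` attribute, a stochastic policy `pi(s, rng)`, and a feature map `phi(s)` returning a NumPy vector; these interfaces are placeholders chosen for the sketch.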