Interaction-Grounded Learning with Action-Inclusive Feedback
Authors: Tengyang Xie, Akanksha Saran, Dylan J. Foster, Lekan Molu, Ida Momennejad, Nan Jiang, Paul Mineiro, John Langford
NeurIPS 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We provide theoretical guarantees and large-scale experiments based on supervised datasets to demonstrate the effectiveness of the new approach. |
| Researcher Affiliation | Collaboration | Tengyang Xie (UIUC, tx10@illinois.edu); Akanksha Saran (Microsoft Research, NYC, akanksha.saran@microsoft.com); Dylan J. Foster (Microsoft Research, New England, dylanfoster@microsoft.com); Lekan Molu (Microsoft Research, NYC, lekanmolu@microsoft.com); Ida Momennejad (Microsoft Research, NYC, idamo@microsoft.com); Nan Jiang (UIUC, nanjiang@illinois.edu); Paul Mineiro (Microsoft Research, NYC, pmineiro@microsoft.com); John Langford (Microsoft Research, NYC, jcl@microsoft.com) |
| Pseudocode | Yes | Algorithm 1 Action-inclusive IGL (AI-IGL) |
| Open Source Code | No | The paper does not contain an explicit statement about the release of its own source code, nor does it provide a link to a repository for the described methodology. It only mentions using existing open-source datasets and simulators. |
| Open Datasets | Yes | To verify that our proposed algorithm scales to a variety of tasks, we evaluate performance on more than 200 datasets from the publicly available OpenML Curated Classification Benchmarking Suite [Vanschoren et al., 2015; Casalicchio et al., 2019; Feurer et al., 2021; Bischl et al., 2021]. OpenML CC-18 datasets are licensed under the CC-BY license, and the platform and library are licensed under the BSD (3-Clause) license. |
| Dataset Splits | Yes | We use 90% of the data for training and the remaining 10% for evaluation. |
| Hardware Specification | No | The paper does not specify any particular hardware details such as GPU models, CPU types, or memory amounts used for running the experiments. It only mentions general simulation environments. |
| Software Dependencies | No | The paper mentions using 'logistic regression with a linear representation' and the OpenML platform, but it does not provide specific version numbers for any software dependencies, libraries, or frameworks used in the experiments. |
| Experiment Setup | Yes | All methods use logistic regression with a linear representation. At test time, each method takes the argmax of the policy. |
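
The split and test-time rule quoted in the table can be illustrated with a minimal sketch. It is not the authors' code (the paper releases none), and it does not implement AI-IGL training. It only shows a 90%/10% split and the argmax of a linear softmax policy. The function names, the seed, and the weight values are all hypothetical.

```python
import random


def split_90_10(rows, seed=0):
    """Shuffle rows and split them into 90% training / 10% evaluation."""
    rng = random.Random(seed)  # the seed is an assumption, not from the paper
    shuffled = list(rows)
    rng.shuffle(shuffled)
    cut = (9 * len(shuffled)) // 10
    return shuffled[:cut], shuffled[cut:]


def softmax_policy(weights, x):
    """Action probabilities of a linear (multinomial logistic) policy.

    weights holds one weight vector per action; x is the feature vector.
    """
    import math

    scores = [sum(w_i * x_i for w_i, x_i in zip(w, x)) for w in weights]
    top = max(scores)  # subtract the max score for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def greedy_action(weights, x):
    """Test-time rule: pick the action with the highest policy probability."""
    probs = softmax_policy(weights, x)
    return max(range(len(probs)), key=probs.__getitem__)


train, evaluation = split_90_10(range(100))
weights = [[1.0, 0.0], [0.0, 1.0], [-1.0, -1.0]]  # hypothetical, untrained
action = greedy_action(weights, [0.2, 0.9])
```

Here `train` and `evaluation` partition the 100 placeholder rows, and `action` is the index of the highest-scoring action for the example feature vector. In a real replication the weights would come from whichever training procedure is being compared.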