Interaction-Grounded Learning with Action-Inclusive Feedback

Authors: Tengyang Xie, Akanksha Saran, Dylan J Foster, Lekan Molu, Ida Momennejad, Nan Jiang, Paul Mineiro, John Langford

NeurIPS 2022

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We provide theoretical guarantees and large-scale experiments based on supervised datasets to demonstrate the effectiveness of the new approach.
Researcher Affiliation | Collaboration | Tengyang Xie (UIUC, tx10@illinois.edu); Akanksha Saran (Microsoft Research, NYC, akanksha.saran@microsoft.com); Dylan J. Foster (Microsoft Research, New England, dylanfoster@microsoft.com); Lekan Molu (Microsoft Research, NYC, lekanmolu@microsoft.com); Ida Momennejad (Microsoft Research, NYC, idamo@microsoft.com); Nan Jiang (UIUC, nanjiang@illinois.edu); Paul Mineiro (Microsoft Research, NYC, pmineiro@microsoft.com); John Langford (Microsoft Research, NYC, jcl@microsoft.com)
Pseudocode | Yes | Algorithm 1 Action-inclusive IGL (AI-IGL)
Open Source Code | No | The paper does not contain an explicit statement about the release of its own source code, nor does it provide a link to a repository for the described methodology. It only mentions using existing open-source datasets and simulators.
Open Datasets | Yes | To verify that our proposed algorithm scales to a variety of tasks, we evaluate performance on more than 200 datasets from the publicly available OpenML Curated Classification Benchmarking Suite [Vanschoren et al., 2015; Casalicchio et al., 2019; Feurer et al., 2021; Bischl et al., 2021]. OpenML CC-18 datasets are licensed under CC-BY license and the platform and library are licensed under the BSD (3-Clause) license.
Dataset Splits | Yes | We use 90% of the data for training and the remaining 10% for evaluation.
Hardware Specification | No | The paper does not specify any particular hardware details such as GPU models, CPU types, or memory amounts used for running the experiments. It only mentions general simulation environments.
Software Dependencies | No | The paper mentions using 'logistic regression with a linear representation' and the OpenML platform, but it does not provide specific version numbers for any software dependencies, libraries, or frameworks used in the experiments.
Experiment Setup | Yes | All methods use logistic regression with a linear representation. At test time, each method takes the argmax of the policy.
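The Dataset Splits and Experiment Setup entries above describe a 90%/10% train/evaluation split, a logistic-regression policy on a linear representation, and argmax action selection at test time. The following is a minimal sketch of that evaluation protocol only. It is not Algorithm 1 (AI-IGL): as a stand-in, the policy here is trained with true labels on synthetic data, which the interaction-grounded setting does not provide. All function names, the synthetic dataset, the learning rate, and the number of epochs are assumptions made for illustration.

```python
import numpy as np


def split_90_10(X, y, seed=0):
    """Shuffle, then use 90% of the rows for training and 10% for evaluation."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(X))
    cut = int(0.9 * len(X))
    return X[idx[:cut]], y[idx[:cut]], X[idx[cut:]], y[idx[cut:]]


def fit_linear_softmax(X, y, n_actions, lr=0.1, epochs=200):
    """Multinomial logistic regression on linear features (full-batch gradient descent).

    Supervised stand-in: uses true labels y, unlike the IGL setting.
    """
    n, d = X.shape
    W = np.zeros((d, n_actions))
    Y = np.eye(n_actions)[y]  # one-hot targets
    for _ in range(epochs):
        logits = X @ W
        logits -= logits.max(axis=1, keepdims=True)  # numerical stability
        P = np.exp(logits)
        P /= P.sum(axis=1, keepdims=True)
        W -= lr * X.T @ (P - Y) / n  # gradient of the mean cross-entropy
    return W


def greedy_action(W, X):
    """Test-time rule: take the argmax of the policy's scores."""
    return np.argmax(X @ W, axis=1)


# Hypothetical synthetic data: three well-separated Gaussian blobs plus a bias column.
rng = np.random.default_rng(1)
centers = np.array([[4.0, 0.0], [-4.0, 0.0], [0.0, 4.0]])
y_all = rng.integers(0, 3, size=1000)
X_all = centers[y_all] + rng.normal(size=(1000, 2))
X_all = np.hstack([X_all, np.ones((1000, 1))])

X_tr, y_tr, X_te, y_te = split_90_10(X_all, y_all)
W = fit_linear_softmax(X_tr, y_tr, n_actions=3)
accuracy = float(np.mean(greedy_action(W, X_te) == y_te))
print(f"train={len(X_tr)} eval={len(X_te)} eval accuracy={accuracy:.3f}")
```

Because softmax is monotone in the logits, taking the argmax of `X @ W` selects the same action as taking the argmax of the policy's probabilities, so the sketch skips normalization at test time.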