The Non-linear $F$-Design and Applications to Interactive Learning

Authors: Alekh Agarwal, Jian Qian, Alexander Rakhlin, Tong Zhang

ICML 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Finally, we employ the F-design in a variety of interactive machine learning tasks, where the design is naturally useful for data collection or exploration. We show that in four diverse settings of confidence band construction, contextual bandits, model-free reinforcement learning, and active learning, F-design can be combined with existing approaches in a black-box manner to yield state-of-the-art results in known problem settings as well as to generalize to novel ones.
Researcher Affiliation | Collaboration | 1 Google, 2 MIT, 3 UIUC. Correspondence to: Alekh Agarwal <alekhagarwal@google.com>, Jian Qian <jianqian@mit.edu>.
Pseudocode | Yes | Algorithm 1: Greedy optimization of F-condition number; Algorithm 2: Bucketing; Algorithm 3: FW algorithm for F-design; Algorithm 4: Subgradient descent for non-linear G-optimal design; Algorithm 5: Contextual Explorative E2D; Algorithm 6: Two Time-Scale Thompson Sampling with Non-linear Design (TS2-ND). An illustrative design-computation sketch is given after the table.
Open Source Code | No | The paper does not provide any explicit statements or links indicating that open-source code for the described methodology is available.
Open Datasets | No | The paper discusses various application settings like 'Simultaneous Confidence Bands', 'Contextual Bandits', 'Model-free RL', and 'Active Learning', mentioning 'regression data sampled from ρ' or 'rewards are in [0, 1]'. However, it does not specify or provide access information (links, citations, or names) for any concrete publicly available datasets used for training.
Dataset Splits | No | The paper focuses on theoretical bounds and applications to problem settings, rather than detailing experimental setups with specific train/validation/test dataset splits. Therefore, it does not provide specific dataset split information for reproducibility.
Hardware Specification | No | The paper does not provide specific hardware details such as GPU/CPU models, processor types, or memory amounts used for running its experiments or computations.
Software Dependencies | No | The paper does not provide specific ancillary software details, such as library or solver names with version numbers, needed to replicate any experimental work.
Experiment Setup | No | The paper defines parameters within its theoretical algorithms (e.g., time horizon T, regularization ϵ0), but it does not specify concrete hyperparameter values, training configurations, or system-level settings that would be used in an empirical experimental setup.
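
Since no open-source code accompanies the paper, the non-linear F-design routines in Algorithms 1-6 cannot be quoted here. As a point of reference only, below is a minimal sketch of the classical Frank-Wolfe (Fedorov-Wynn) iteration for linear G-optimal design over a finite candidate set, the special case that the paper's non-linear design framework generalizes. The function name `g_optimal_design_fw`, the 1/(t+2) step-size schedule, and the `reg` regularizer are illustrative assumptions, not the authors' Algorithm 3.

```python
import numpy as np

def g_optimal_design_fw(X, n_iters=1000, reg=1e-6):
    """Frank-Wolfe (Fedorov-Wynn) sketch for linear G-optimal design.

    Given candidate feature vectors as rows of X, returns a design
    distribution `lam` that approximately minimizes
    max_i x_i^T A(lam)^{-1} x_i, where A(lam) = sum_i lam_i x_i x_i^T.
    """
    n, d = X.shape
    lam = np.full(n, 1.0 / n)                             # start from the uniform design
    for t in range(n_iters):
        A = X.T @ (lam[:, None] * X) + reg * np.eye(d)    # design covariance matrix
        A_inv = np.linalg.inv(A)
        # leverage-type scores x_i^T A(lam)^{-1} x_i for every candidate
        scores = np.einsum('ij,jk,ik->i', X, A_inv, X)
        i_star = int(np.argmax(scores))                   # currently worst-covered point
        gamma = 1.0 / (t + 2)                             # standard Frank-Wolfe step size
        lam = (1.0 - gamma) * lam
        lam[i_star] += gamma                              # shift mass toward that point
    return lam

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = rng.standard_normal((50, 5))
    lam = g_optimal_design_fw(X)
    A_inv = np.linalg.inv(X.T @ (lam[:, None] * X) + 1e-6 * np.eye(5))
    worst = np.max(np.einsum('ij,jk,ik->i', X, A_inv, X))
    print(f"max leverage under design: {worst:.3f} (Kiefer-Wolfowitz bound: d = {X.shape[1]})")
```

By the Kiefer-Wolfowitz theorem, the worst-case leverage under the optimal design equals the feature dimension d in this linear setting, which gives a simple sanity check on the output. The paper's F-design and the algorithms listed above extend this kind of exploratory design computation beyond the linear case; the sketch is only meant to indicate the general shape of such a routine, not to reproduce the authors' method.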