The Non-linear $F$-Design and Applications to Interactive Learning
Authors: Alekh Agarwal, Jian Qian, Alexander Rakhlin, Tong Zhang
ICML 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Finally, we employ the F-design in a variety of interactive machine learning tasks, where the design is naturally useful for data collection or exploration. We show that in four diverse settings of confidence band construction, contextual bandits, model-free reinforcement learning, and active learning, F-design can be combined with existing approaches in a black-box manner to yield state-of-the-art results in known problem settings as well as to generalize to novel ones. |
| Researcher Affiliation | Collaboration | ¹Google, ²MIT, ³UIUC. Correspondence to: Alekh Agarwal <alekhagarwal@google.com>, Jian Qian <jianqian@mit.edu>. |
| Pseudocode | Yes | Algorithm 1 Greedy optimization of F-condition number; Algorithm 2 Bucketing; Algorithm 3 FW algorithm for F-design; Algorithm 4 Subgradient descent for non-linear G-optimal design; Algorithm 5 Contextual Explorative E2D; Algorithm 6 Two time Scale Thompson Sampling with Non-linear Design (TS2-ND). |
| Open Source Code | No | The paper does not provide any explicit statements or links indicating that open-source code for the described methodology is available. |
| Open Datasets | No | The paper discusses various application settings like 'Simultaneous Confidence Bands', 'Contextual Bandits', 'Model-free RL', and 'Active Learning', mentioning 'regression data sampled from ρ' or 'rewards are in [0, 1]'. However, it does not specify or provide access information (links, citations, or names) for any concrete publicly available datasets used for training. |
| Dataset Splits | No | The paper focuses on theoretical bounds and applications to problem settings, rather than detailing experimental setups with specific train/validation/test dataset splits. Therefore, it does not provide specific dataset split information for reproducibility. |
| Hardware Specification | No | The paper does not provide specific hardware details such as GPU/CPU models, processor types, or memory amounts used for running its experiments or computations. |
| Software Dependencies | No | The paper does not provide specific ancillary software details, such as library or solver names with version numbers, needed to replicate any experimental work. |
| Experiment Setup | No | The paper defines parameters within its theoretical algorithms (e.g., time horizon T, regularization ϵ0), but it does not specify concrete hyperparameter values, training configurations, or system-level settings that would be used in an empirical experimental setup. |