Noise-Tolerant Interactive Learning Using Pairwise Comparisons
Authors: Yichong Xu, Hongyang Zhang, Kyle Miller, Aarti Singh, Artur Dubrawski
NeurIPS 2017
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Theoretical | Our work addresses the aforementioned issues by presenting a new algorithm, Active Data Generation with Adversarial Comparisons (ADGAC)... We analyze ADGAC under the Tsybakov noise condition (TNC) [30] and adversarial noise conditions for the labeling oracle, along with the adversarial noise condition for the comparison oracle... We propose the A2-ADGAC algorithm, which can learn an arbitrary hypothesis class... We derive Margin-ADGAC to learn the class of halfspaces. We present lower bounds on total query complexity for any algorithm that can access both labeling and comparison oracles, and a noise tolerance lower bound for our algorithms. These lower bounds demonstrate that our analysis is nearly optimal. |
| Researcher Affiliation | Academia | Yichong Xu*, Hongyang Zhang*, Kyle Miller, Aarti Singh*, and Artur Dubrawski. *Machine Learning Department, Carnegie Mellon University, USA; Auton Lab, Carnegie Mellon University, USA. {yichongx, hongyanz, aarti, awd}@cs.cmu.edu, mille856@andrew.cmu.edu |
| Pseudocode | Yes | Algorithm 1 Active Learning Framework. Input: ε, δ, a sequence of n_i, functions f_V, f_U. 1: Initialize U ← U_0 = X, V ← V_0 = C. 2: for i = 1, 2, ..., log(1/ε) do 3: Sample unlabeled dataset S′ of size n_i; let S ← {x : x ∈ S′, x ∈ U}. 4: Request the labels of x ∈ S and obtain W ← {(x_i, y_i) : x_i ∈ S}. 5: Update V ← f_V(U, V, W, i), U ← f_U(U, V, W, i). Output: any classifier ĥ ∈ V. |
| Open Source Code | No | The paper does not provide any statement about releasing source code or a link to a code repository for the described methodology. |
| Open Datasets | No | The paper discusses theoretical settings, such as 'a set S with | S| = n is sampled i.i.d. from PX'. It does not refer to any specific, named, or publicly accessible dataset with concrete access information. |
| Dataset Splits | No | The paper focuses on theoretical analysis and algorithms, discussing concepts like 'sample unlabeled dataset S' and 'labels generated by ADGAC', but does not specify practical dataset splits (e.g., training, validation, test percentages or sample counts) for empirical evaluation. |
| Hardware Specification | No | The paper is theoretical and focuses on algorithms and their complexity bounds. It does not describe any specific hardware (e.g., CPU, GPU models, memory, or cloud instances) used for running experiments. |
| Software Dependencies | No | The paper describes theoretical algorithms and their analysis. It does not mention any specific software components or their version numbers (e.g., Python, PyTorch, specific libraries or solvers) that would be needed to replicate any experimental setup. |
| Experiment Setup | No | The paper is theoretical, presenting algorithms and their theoretical guarantees. It does not provide specific details about experimental setups, such as hyperparameter values, optimization settings, or training configurations that would be used in an empirical study. |
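The Algorithm 1 row above describes a generic active learning loop: maintain an uncertainty region U and a version space V, repeatedly sample unlabeled data, query labels only inside U, and shrink V and U via caller-supplied update rules f_V and f_U. The following is a minimal, hedged Python sketch of that loop, not the paper's ADGAC instantiation: the function and variable names (`active_learning_framework`, `sample_pool`, `label_oracle`) are illustrative, and the demo plugs in the simple consistency-based f_V/f_U of a noiseless 1-D threshold class rather than the paper's noise-tolerant comparison-based rules.

```python
import math
import random

def active_learning_framework(eps, sample_sizes, f_V, f_U,
                              sample_pool, label_oracle, X, C):
    """Sketch of the generic Algorithm 1 loop (names are illustrative).

    U: region of uncertainty, initialized to the instance space X.
    V: version space of candidate classifiers, initialized to the class C.
    f_V, f_U: caller-supplied update rules, as in the paper's framework.
    """
    U, V = X, C
    for i in range(1, math.ceil(math.log(1 / eps)) + 1):
        S_prime = sample_pool(sample_sizes[i - 1])   # unlabeled draw of size n_i
        S = [x for x in S_prime if x in U]           # keep only still-uncertain points
        W = [(x, label_oracle(x)) for x in S]        # query the labeling oracle
        V = f_V(U, V, W, i)                          # shrink the version space
        U = f_U(U, V, W, i)                          # shrink the uncertainty region
    return next(iter(V))                             # output any classifier in V

# Toy demo: noiseless 1-D thresholds on a finite pool (assumed setup, not the paper's).
random.seed(0)
X = [j / 100 for j in range(101)]                    # instance pool in [0, 1]
thresholds = [j / 100 for j in range(101)]           # class C: threshold classifiers
true_t = 0.5
label = lambda x: 1 if x >= true_t else -1           # noiseless labeling oracle
pred = lambda t, x: 1 if x >= t else -1

# Realizable-case updates: keep consistent thresholds; keep disagreement points.
f_V = lambda U, V, W, i: [t for t in V if all(pred(t, x) == y for x, y in W)]
f_U = lambda U, V, W, i: [x for x in X if len({pred(t, x) for t in V}) > 1]

h = active_learning_framework(
    eps=0.05,
    sample_sizes=[30] * 10,
    f_V=f_V, f_U=f_U,
    sample_pool=lambda n: random.sample(X, n),
    label_oracle=label,
    X=X, C=thresholds,
)
```

The returned `h` is a threshold consistent with every queried label; the paper's contribution is replacing most of these label queries with cheaper pairwise comparisons via ADGAC, which this sketch does not attempt.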