Provably Neural Active Learning Succeeds via Prioritizing Perplexing Samples

Authors: Dake Bu, Wei Huang, Taiji Suzuki, Ji Cheng, Qingfu Zhang, Zhiqiang Xu, Hau-San Wong

ICML 2024

| Reproducibility Variable | Result | LLM Response |
| --- | --- | --- |
| Research Type | Experimental | "Experimental results validate our findings." |
| Researcher Affiliation | Academia | ¹Department of Computer Science, City University of Hong Kong, Hong Kong SAR; ²Center for Advanced Intelligence Project, RIKEN, Tokyo, Japan; ³Department of Mathematical Informatics, The University of Tokyo, Tokyo, Japan; ⁴Mohamed bin Zayed University of Artificial Intelligence, Masdar, United Arab Emirates. |
| Pseudocode | Yes | "Algorithm 1: Querying Algorithms" |
| Open Source Code | No | The paper contains no explicit statement that code for the described methodology is released, nor does it link to a code repository. |
| Open Datasets | No | "Here we generate synthetic data exactly following Definition 2.1." The paper describes the data-generation process but provides no link or access information for the generated dataset itself. |
| Dataset Splits | No | The paper specifies the initial labeled-set size, querying size, and pool size (e.g., n₀ = 10, n = 30, and \|P\| = 190) and discusses training and testing, but it does not describe conventional train/validation/test splits or a separate validation set for hyperparameter tuning. |
| Hardware Specification | No | The paper mentions software such as PyTorch but does not specify any hardware (e.g., GPU/CPU models, memory, or cloud instance types) used to run the experiments. |
| Software Dependencies | No | "The parameters are initialized using the default method in PyTorch..." PyTorch is mentioned, but no version number or other versioned software dependencies are given. |
| Experiment Setup | Yes | "The parameters are initialized using the default method in PyTorch, and the models are trained using gradient descent with a learning rate of 0.1 for 200 iterations at the initial stage and the stage after sampling." |
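The Experiment Setup and Dataset Splits rows are the only concrete training details the report quotes: default PyTorch initialization, full-batch gradient descent with learning rate 0.1 for 200 iterations at the initial stage and again after querying, with sizes n₀ = 10, n = 30, and |P| = 190. Below is a minimal sketch of that setup under stated assumptions: the network width, input dimension, synthetic data generator, and the margin-based acquisition rule are placeholders chosen for illustration, not the paper's Definition 2.1 or Algorithm 1.

```python
# Hedged sketch of the quoted training setup: default PyTorch init,
# full-batch gradient descent (lr = 0.1, 200 iterations) at the initial
# stage and again after querying. The data generator, network width, and
# acquisition rule are placeholders, NOT the paper's Definition 2.1 or
# Algorithm 1.
import torch
import torch.nn as nn

D_INPUT, WIDTH = 20, 100          # assumed dimensions, not from the paper
N0, N_QUERY, POOL = 10, 30, 190   # sizes quoted in the Dataset Splits row
LR, ITERS = 0.1, 200              # quoted in the Experiment Setup row


def make_synthetic(n):
    """Placeholder generator; the paper samples data per its Definition 2.1."""
    x = torch.randn(n, D_INPUT)
    y = torch.sign(x[:, 0]).clamp(min=0.0)  # arbitrary labeling rule for the sketch
    return x, y


def train(model, x, y):
    """Full-batch gradient descent, lr 0.1, 200 iterations, as quoted."""
    opt = torch.optim.SGD(model.parameters(), lr=LR)
    loss_fn = nn.BCEWithLogitsLoss()
    for _ in range(ITERS):
        opt.zero_grad()
        loss = loss_fn(model(x).squeeze(-1), y)
        loss.backward()
        opt.step()


# Two-layer ReLU network with PyTorch's default initialization.
model = nn.Sequential(nn.Linear(D_INPUT, WIDTH), nn.ReLU(), nn.Linear(WIDTH, 1))

x_lab, y_lab = make_synthetic(N0)        # initial labeled set, n0 = 10
x_pool, y_pool = make_synthetic(POOL)    # unlabeled pool, |P| = 190

train(model, x_lab, y_lab)               # initial stage

# Placeholder acquisition: query the n = 30 pool points the model is least
# confident about (logit margin closest to zero); the paper's Algorithm 1
# may use a different criterion.
with torch.no_grad():
    margins = model(x_pool).squeeze(-1).abs()
idx = torch.argsort(margins)[:N_QUERY]

x_lab = torch.cat([x_lab, x_pool[idx]])
y_lab = torch.cat([y_lab, y_pool[idx]])

train(model, x_lab, y_lab)               # stage after sampling
```

The sketch keeps the two training stages separate (initial labeled set, then the augmented set after one query round) to mirror the quoted wording "at the initial stage and the stage after sampling"; everything else is an assumption.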