Preference Based Adaptation for Learning Objectives

Authors: Yao-Xiang Ding, Zhi-Hua Zhou

NeurIPS 2018 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

| Reproducibility Variable | Result | LLM Response |
| --- | --- | --- |
| Research Type | Experimental | We apply the overall approach to multi-label learning, and show that the proposed approach achieves significant performance under various multi-label performance measures. |
| Researcher Affiliation | Academia | Yao-Xiang Ding, Zhi-Hua Zhou, National Key Laboratory for Novel Software Technology, Nanjing University, Nanjing, 210023, China. {dingyx, zhouzh}@lamda.nju.edu.cn |
| Pseudocode | Yes | Algorithm 1: Dueling Bandit Learning for Logit Model (DL2M) |
| Open Source Code | No | The paper does not provide an explicit statement about the release of source code or a link to a code repository for the described methodology. |
| Open Datasets | Yes | The experiments are conducted on six benchmark multi-label datasets: emotions, CAL500, enron, Corel5k, medical and bibtex. (Footnote: http://mulan.sourceforge.net/datasets-mlc.html) |
| Dataset Splits | Yes | To implement DL2M, each dataset is randomly split into training, validation and testing sets, with a size ratio of 3:1:1. (See the split sketch after this table.) |
| Hardware Specification | No | The paper does not specify any hardware details (e.g., CPU or GPU models, or memory) used for running the experiments. |
| Software Dependencies | No | The paper does not list specific software dependencies with version numbers (e.g., programming languages, libraries, or frameworks). |
| Experiment Setup | Yes | During the learning process, the preference feedback is generated by testing the learned hypothesis on the validation set, and DL2M is utilized to update the objective for 20 iterations, with c = 0.05, λ = 1. (See the preference-feedback sketch after this table.) |
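
The 3:1:1 train/validation/test protocol quoted in the Dataset Splits row is simple to reproduce. The following is a minimal sketch of such a random split; the function name and the use of NumPy are assumptions for illustration, not code from the paper.

```python
import numpy as np

def split_3_1_1(n_samples, seed=0):
    """Randomly partition sample indices into training, validation and
    test sets with a 3:1:1 size ratio, as described in the paper."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(n_samples)
    n_train = (3 * n_samples) // 5
    n_val = n_samples // 5
    train_idx = idx[:n_train]
    val_idx = idx[n_train:n_train + n_val]
    test_idx = idx[n_train + n_val:]
    return train_idx, val_idx, test_idx

# Example: split a dataset with 1000 instances.
train_idx, val_idx, test_idx = split_3_1_1(1000)
print(len(train_idx), len(val_idx), len(test_idx))  # 600 200 200
```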
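The Experiment Setup row states that preference feedback is generated by testing the learned hypothesis on the validation set. The sketch below only illustrates a hypothetical feedback step of that kind, comparing two candidate hypotheses under a chosen multi-label measure; it is not the paper's DL2M algorithm, the choice of macro-F1 is an assumption, and the quoted parameters c and λ are not modeled here.

```python
from sklearn.metrics import f1_score

def preference_feedback(h_a, h_b, X_val, Y_val):
    """Return 1 if hypothesis h_a is preferred to h_b on the validation
    set under macro-F1, and 0 otherwise. Hypothetical feedback step:
    the paper's actual preference generation may differ."""
    score_a = f1_score(Y_val, h_a.predict(X_val), average="macro")
    score_b = f1_score(Y_val, h_b.predict(X_val), average="macro")
    return int(score_a >= score_b)

# In a dueling-bandit-style loop (e.g., the quoted 20 iterations), this
# binary outcome would be the only signal used to update the objective.
```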