Machine Teaching of Active Sequential Learners

Authors: Tomi Peltola, Mustafa Mert Çelikok, Pedram Daee, Samuel Kaski

NeurIPS 2019

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We test the formulation with multi-armed bandit learners in simulated experiments and a user study. The results show that learning is improved by (i) planning teaching and (ii) the learner having a model of the teacher.
Researcher Affiliation | Academia | Tomi Peltola (tomi.peltola@aalto.fi), Mustafa Mert Çelikok (mustafa.celikok@aalto.fi), Pedram Daee (pedram.daee@aalto.fi), Samuel Kaski (samuel.kaski@aalto.fi), Helsinki Institute for Information Technology HIIT, Department of Computer Science, Aalto University, Helsinki, Finland
Pseudocode | No | The paper describes its algorithms and models (e.g., Thompson sampling, the teacher model) in prose and mathematical formulations, but it does not include any explicitly labeled pseudocode or algorithm blocks. (For orientation, a minimal Thompson sampling sketch follows the table.)
Open Source Code | Yes | Source code is available at https://github.com/AaltoPML/machine-teaching-of-active-sequential-learners.
Open Datasets | Yes | We use a word relevance dataset for simulating an information retrieval task... The Word dataset is a random selection of 10,000 words from Google's Word2Vec vectors, pre-trained on the Google News dataset [42]. (A hypothetical loading sketch follows the table.)
Dataset Splits | No | The paper describes the setup of simulation experiments and a user study, but it does not specify explicit training, validation, or test dataset splits in the traditional machine learning sense.
Hardware Specification | No | The paper states: "We acknowledge the computational resources provided by the Aalto Science-IT Project." However, this does not give specific hardware details such as CPU/GPU models, memory, or cloud instance types.
Software Dependencies | Yes | We implemented the models in the probabilistic programming language Pyro (version 0.3, under PyTorch v1.0) [40] and approximate the posterior distributions with Laplace approximations [41, Section 4.1]. (A minimal Pyro/Laplace sketch follows the table.)
Experiment Setup | Yes | The ground-truth relevance profile is generated by first setting θ̂ = [c, d·x̂] ∈ ℝ^(M+1), where c = 4 is a weight for an intercept term (a constant element of 1 is added to the x's) and d = 8 is a scaling factor. [...] We use β̂ = 20 as the planning teacher's optimality parameter and also set the β of the learner's teacher model to the same value. For multi-step models, we set γ_t = 1/T, so that they plan to maximise the average return up to horizon T. (A parameter-generation sketch follows the table.)
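
The paper's learner is based on Thompson sampling but is described only in prose. As a generic orientation sketch (not the authors' model, which is a logistic regression bandit coupled with a teacher model), a minimal Bernoulli Thompson sampling loop with Beta(1, 1) priors looks like this:

```python
import numpy as np

def thompson_sampling(true_probs, n_rounds=1000, seed=0):
    """Bernoulli Thompson sampling with Beta(1, 1) priors.

    A generic stand-in for the paper's bandit learner; the paper's own
    learner is a logistic regression bandit with a teacher model on top.
    """
    rng = np.random.default_rng(seed)
    k = len(true_probs)
    alpha = np.ones(k)   # per-arm successes + 1
    beta = np.ones(k)    # per-arm failures + 1
    rewards = []
    for _ in range(n_rounds):
        theta = rng.beta(alpha, beta)        # sample one plausible reward model
        arm = int(np.argmax(theta))          # act greedily w.r.t. the sample
        r = float(rng.random() < true_probs[arm])
        alpha[arm] += r                      # conjugate posterior update
        beta[arm] += 1.0 - r
        rewards.append(r)
    return np.mean(rewards)

print(thompson_sampling([0.1, 0.4, 0.7]))   # average reward approaches 0.7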
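
The Word dataset is described only as 10,000 randomly selected words with pre-trained Google News Word2Vec features; the paper does not say which tooling produced the selection. One hypothetical way to build a comparable feature matrix, assuming gensim 4.x and the standard GoogleNews-vectors-negative300.bin file:

```python
import numpy as np
from gensim.models import KeyedVectors

# Assumption: gensim is one common way to load the pre-trained Google News
# vectors; the paper does not specify its tooling.
wv = KeyedVectors.load_word2vec_format(
    "GoogleNews-vectors-negative300.bin", binary=True
)

rng = np.random.default_rng(0)
words = rng.choice(wv.index_to_key, size=10_000, replace=False)
X = np.stack([wv[w] for w in words])   # (10000, 300) word feature matrix
```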
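
The models were implemented in Pyro 0.3 with Laplace approximations of the posteriors, but no code excerpt is given. The following is a minimal sketch of that combination for a generic Bayesian logistic regression, written against a recent Pyro where AutoLaplaceApproximation lives in pyro.infer.autoguide (in Pyro 0.3 it was under pyro.contrib.autoguide); it is not the authors' model.

```python
import torch
import pyro
import pyro.distributions as dist
from pyro.infer import SVI, Trace_ELBO
from pyro.infer.autoguide import AutoLaplaceApproximation

def model(x, y=None):
    # Bayesian logistic regression: Gaussian prior on weights, Bernoulli likelihood.
    w = pyro.sample("w", dist.Normal(torch.zeros(x.shape[1]), 1.0).to_event(1))
    with pyro.plate("data", x.shape[0]):
        pyro.sample("y", dist.Bernoulli(logits=x @ w), obs=y)

pyro.clear_param_store()
guide = AutoLaplaceApproximation(model)
svi = SVI(model, guide, pyro.optim.Adam({"lr": 0.05}), loss=Trace_ELBO())

# Toy data standing in for the paper's relevance-feedback observations.
x = torch.randn(100, 3)
y = (x @ torch.tensor([1.0, -2.0, 0.5]) > 0).float()

for _ in range(1000):
    svi.step(x, y)

# Build the Gaussian (Laplace) posterior approximation around the MAP estimate.
posterior = guide.laplace_approximation(x, y)
```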
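
The reported setup constants (c = 4, d = 8, β̂ = 20, γ_t = 1/T) translate directly into code. The sketch below assembles θ̂ and a soft-maximizing teacher choice model, assuming the usual Boltzmann form for an optimality parameter β; the dimensionality M and target features x̂ are hypothetical stand-ins, since in the paper x̂ comes from the dataset, and the teacher's actual value function is not reproduced here.

```python
import numpy as np

rng = np.random.default_rng(0)

# Constants reported in the paper's experiment setup.
c = 4.0          # weight of the intercept term
d = 8.0          # scaling factor for the target features
beta_hat = 20.0  # teacher's optimality parameter (learner's teacher model uses the same value)

# Hypothetical stand-ins: the paper derives x_hat from the dataset
# (e.g., a target word's Word2Vec features).
M = 300
x_hat = rng.standard_normal(M)
x_hat /= np.linalg.norm(x_hat)

# Ground-truth parameter vector theta_hat = [c, d * x_hat] in R^(M+1);
# the leading entry pairs with the constant 1 appended to each feature vector.
theta_hat = np.concatenate(([c], d * x_hat))

def teacher_probs(values, beta=beta_hat):
    # Boltzmann-rational choice: P(feedback) ∝ exp(beta * value).
    # Subtracting the max keeps the exponentials numerically stable.
    z = beta * (values - values.max())
    p = np.exp(z)
    return p / p.sum()

T = 10
gamma = np.full(T, 1.0 / T)   # gamma_t = 1/T: maximise average return up to horizon T

print(teacher_probs(np.array([0.1, 0.5, 0.4])))  # near-deterministic at beta = 20
```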