Locality Sensitive Teaching
Authors: Zhaozhuo Xu, Beidi Chen, Chaojian Li, Weiyang Liu, Le Song, Yingyan Lin, Anshumali Shrivastava
NeurIPS 2021
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Through extensive experiments in real-world teaching scenarios, we demonstrate that LST performs exponential teachability that matches or even exceeds IMT while achieving at most 425.12× speedups and 99.76% energy savings on IoT devices. |
| Researcher Affiliation | Collaboration | Zhaozhuo Xu (Rice University) zx22@rice.edu; Beidi Chen (Stanford University) beidic@stanford.edu; Chaojian Li (Rice University) chaojian.li@rice.edu; Weiyang Liu (University of Cambridge and MPI-IS Tübingen) wl396@cam.ac.uk; Le Song (BioMap and MBZUAI) dasongle@gmail.com; Yingyan Lin (Rice University) yingyan.lin@rice.edu; Anshumali Shrivastava (Rice University and ThirdAI Corp.) anshumali@rice.edu |
| Pseudocode | Yes | Algorithm 1: Locality Sensitive Teaching (LST). Result: model w. Input: D = {(x_i, y_i)}, w*, w_t, η. Preprocessing: for each example i, compute h_1, ..., h_L ← H_1(f(x_i, y_i)), ..., H_L(f(x_i, y_i)) and insert id i into the L hash tables. Teaching loop: while not converged, j ← LSS(f(w*, w_t)) (Algorithm 2); w ← w − η∇L(x_j, y_j). Return w. Algorithm 2: Locality Sensitive Sampling (LSS). Result: sample id. Input: query q. Set l ← 0 and π ← permute(1, ..., L). For i in π: compute H_i(q) for hash table i; if the bucket B for H_i(q) is not empty, set S ← elements in B and id ← random(S); else set l ← l + 1 and probe the next table. Return id. (A minimal Python sketch of both algorithms follows this table.) |
| Open Source Code | No | The paper does not provide an unambiguous statement or link indicating that the source code for the described methodology is publicly available. |
| Open Datasets | Yes | In the experiments, we use four regression datasets to demonstrate the performance of our LST. First, we use abalone, space_ga [61] and mg dataset from LIBSVM [61] and UCI dataset [62]. ...We use slice dataset from UCI dataset [62]. slice dataset contains 53500 training samples and 42800 testing samples. Each sample is a 74-dimensional vector. We use slice only for algorithm level evaluation as it causes memory exhaustion on IoT devices. We randomly split 30% of samples in each dataset as a test set while others are training set. All datasets are under MIT license. |
| Dataset Splits | Yes | We randomly split 30% of samples in each dataset as a test set while others are training set. |
| Hardware Specification | Yes | The evaluation is on a server with 1 Nvidia Tesla V100 GPU and two 20-core/40-thread processors (Intel Xeon(R) E5-2698 v4 @ 2.20GHz). ... In this section, we compare LST and IMT on Nvidia TX2 devices. |
| Software Dependencies | Yes | We implement LSH by separating the random projection and hash table lookups into GPU and CPU. We first generate hash codes of data vectors by GPU-based random matrix multiplication via CuPy [60] and compiled CUDA kernels. ... We provide Cython wrapping for the implementation to make it PyTorch friendly. (A hedged sketch of this GPU/CPU split follows this table.) |
| Experiment Setup | No | The paper mentions using AdaGrad as the learner's optimizer and early stopping, but it does not specify concrete hyperparameter values (e.g., learning rate, batch size) for the experiments. It refers to Appendix G for detailed settings on TX2, but the main text lacks these specifics. |
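
The following is a minimal, self-contained Python sketch of the LST teaching loop and LSS sampling summarized in the Pseudocode row above. It is an illustration, not the authors' implementation: the names (`SRPHash`, `build_tables`, `lss_sample`, `lst_teach`), the choice of a linear least-squares learner, and the surrogate teaching feature `X * y` and query `w_star - w` are assumptions made for the sketch.

```python
# Hedged sketch of Algorithms 1-2 above; names and the teaching-feature
# surrogates are illustrative assumptions, not the authors' code.
import numpy as np

class SRPHash:
    """One signed-random-projection hash table over feature vectors."""
    def __init__(self, dim, n_bits, seed):
        rng = np.random.default_rng(seed)
        self.planes = rng.standard_normal((n_bits, dim))  # random hyperplanes
        self.buckets = {}                                  # hash code -> list of ids

    def code(self, v):
        return tuple((self.planes @ v > 0).astype(np.int8))

    def insert(self, idx, v):
        self.buckets.setdefault(self.code(v), []).append(idx)

    def query(self, q):
        return self.buckets.get(self.code(q), [])

def build_tables(feats, n_tables=8, n_bits=6):
    """Preprocessing in Algorithm 1: index every example id in L hash tables."""
    tables = [SRPHash(feats.shape[1], n_bits, seed=t) for t in range(n_tables)]
    for i, v in enumerate(feats):
        for tab in tables:
            tab.insert(i, v)
    return tables

def lss_sample(tables, query, n, rng):
    """Algorithm 2: probe tables in random order and return a random id from
    the first non-empty bucket; fall back to a uniform id if all are empty."""
    for t in rng.permutation(len(tables)):
        hits = tables[t].query(query)
        if hits:
            return int(rng.choice(hits))
    return int(rng.integers(n))

def lst_teach(X, y, w_star, w0, eta=0.05, steps=500, seed=0):
    """Algorithm 1 for a linear least-squares learner (illustrative choice).
    feats and query below stand in for the paper's f(x_i, y_i) and f(w*, w_t)."""
    rng = np.random.default_rng(seed)
    feats = X * y[:, None]          # assumed surrogate teaching feature
    tables = build_tables(feats)
    w = w0.astype(float).copy()
    for _ in range(steps):
        query = w_star - w          # assumed surrogate teaching query
        j = lss_sample(tables, query, len(X), rng)
        grad = (X[j] @ w - y[j]) * X[j]   # squared-loss gradient on example j
        w -= eta * grad
    return w
```

As a usage example, one could call `lst_teach(X, y, w_star, np.zeros(X.shape[1]))` on a toy regression problem and compare its convergence against uniformly sampled SGD, which is the comparison the teachability claims revolve around.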
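The Software Dependencies row describes splitting LSH between GPU-based random projections (via CuPy) and CPU-side hash table lookups. The sketch below illustrates that division of labor under stated assumptions: the function names (`gpu_hash_codes`, `build_cpu_tables`) and the bit-packing scheme are illustrative, and the Cython/PyTorch wrapping mentioned in the paper is omitted.

```python
# Hedged sketch of the GPU/CPU split described in the Software Dependencies row:
# hash codes come from one batched random projection on the GPU (CuPy), while
# bucket lookups live in plain Python dicts on the CPU. Names are illustrative.
import cupy as cp

def gpu_hash_codes(feats_host, n_tables=8, n_bits=6, seed=0):
    """Compute signed random-projection codes for all examples and all tables
    with a single GPU matmul; returns (n_samples, n_tables) integer bucket keys."""
    rng = cp.random.default_rng(seed)
    feats = cp.asarray(feats_host, dtype=cp.float32)              # host -> device
    planes = rng.standard_normal((feats.shape[1], n_tables * n_bits),
                                 dtype=cp.float32)
    bits = (feats @ planes > 0).reshape(feats.shape[0], n_tables, n_bits)
    weights = (2 ** cp.arange(n_bits)).astype(cp.uint32)          # pack bits -> key
    return cp.asnumpy((bits * weights).sum(axis=2))               # device -> host

def build_cpu_tables(codes):
    """CPU side: one dict per hash table mapping bucket key -> list of example ids."""
    tables = [dict() for _ in range(codes.shape[1])]
    for i, row in enumerate(codes):
        for t, key in enumerate(row):
            tables[t].setdefault(int(key), []).append(i)
    return tables
```

Batching all projections into one matrix multiplication keeps the GPU busy with dense arithmetic, while the irregular, memory-bound dictionary lookups stay on the CPU, matching the split the authors describe.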