Locality Sensitive Teaching
Authors: Zhaozhuo Xu, Beidi Chen, Chaojian Li, Weiyang Liu, Le Song, Yingyan Lin, Anshumali Shrivastava
NeurIPS 2021
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Through extensive experiments in real-world teaching scenarios, we demonstrate that LST performs exponential teachability that matches or even exceeds IMT while achieving at most 425.12× speedups and 99.76% energy savings on IoT devices. |
| Researcher Affiliation | Collaboration | Zhaozhuo Xu (Rice University) zx22@rice.edu; Beidi Chen (Stanford University) beidic@stanford.edu; Chaojian Li (Rice University) chaojian.li@rice.edu; Weiyang Liu (University of Cambridge and MPI-IS Tübingen) wl396@cam.ac.uk; Le Song (BioMap and MBZUAI) dasongle@gmail.com; Yingyan Lin (Rice University) yingyan.lin@rice.edu; Anshumali Shrivastava (Rice University and ThirdAI Corp.) anshumali@rice.edu |
| Pseudocode | Yes | Algorithm 1: Locality Sensitive Teaching (LST). Result: model w. Input: D = {(x_i, y_i)}, w*, w_t, η. Preprocessing: for each example i, compute h_1, ..., h_L ← H_1(f(x_i, y_i)), ..., H_L(f(x_i, y_i)) and insert id i into the L hash tables. Teaching loop: while not converged, j ← LSS(f(w*, w_t)) (Algorithm 2); w ← w − η∇L(x_j, y_j). Return w. Algorithm 2: Locality Sensitive Sampling (LSS). Result: sample id. Input: query q. Set l ← 0 and π ← permute(1, ..., L). For i in π: compute H_i(q) for hash table i; if the bucket B for H_i(q) is not empty, set S ← elements in B and id ← random(S); else set l ← l + 1 and probe the next table. Return id. (A minimal Python sketch of both algorithms follows this table.) |
| Open Source Code | No | The paper does not provide an unambiguous statement or link indicating that the source code for the described methodology is publicly available. |
| Open Datasets | Yes | In the experiments, we use four regression datasets to demonstrate the performance of our LST. First, we use abalone, space_ga [61] and mg dataset from LIBSVM [61] and UCI dataset [62]. ...We use slice dataset from UCI dataset [62]. slice dataset contains 53500 training samples and 42800 testing samples. Each sample is a 74-dimensional vector. We use slice only for algorithm level evaluation as it causes memory exhaustion on IoT devices. We randomly split 30% of samples in each dataset as a test set while others are training set. All datasets are under MIT license. |
| Dataset Splits | Yes | We randomly split 30% of samples in each dataset as a test set while others are training set. |
| Hardware Specification | Yes | The evaluation is on a server with 1 Nvidia Tesla V100 GPU and two 20-core/40-thread processors (Intel Xeon(R) E5-2698 v4 @ 2.20GHz). ... In this section, we compare LST and IMT on Nvidia TX2 devices. |
| Software Dependencies | Yes | We implement LSH by separating the random projection and hash table lookups into GPU and CPU. We first generate hash codes of data vectors by GPU-based random matrix multiplication via CuPy [60] and compiled CUDA kernels. ... We provide Cython wrapping for the implementation to make it PyTorch friendly. (A hedged sketch of this GPU/CPU split follows this table.) |
| Experiment Setup | No | The paper mentions using AdaGrad as the learner's optimizer and early stopping, but it does not specify concrete hyperparameter values (e.g., learning rate, batch size) for the experiments. It refers to Appendix G for detailed settings on TX2, but the main text lacks these specifics. |
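
The following is a minimal, self-contained Python sketch of the LST teaching loop and LSS sampling summarized in the Pseudocode row above. It is an illustration, not the authors' implementation: the names (`SRPHash`, `build_tables`, `lss_sample`, `lst_teach`), the choice of a linear least-squares learner, and the surrogate teaching feature `X * y` and query `w_star - w` are assumptions made for the sketch.

```python
# Hedged sketch of Algorithms 1-2 above; names and the teaching-feature
# surrogates are illustrative assumptions, not the authors' code.
import numpy as np

class SRPHash:
    """One signed-random-projection hash table over feature vectors."""
    def __init__(self, dim, n_bits, seed):
        rng = np.random.default_rng(seed)
        self.planes = rng.standard_normal((n_bits, dim))  # random hyperplanes
        self.buckets = {}                                  # hash code -> list of ids

    def code(self, v):
        return tuple((self.planes @ v > 0).astype(np.int8))

    def insert(self, idx, v):
        self.buckets.setdefault(self.code(v), []).append(idx)

    def query(self, q):
        return self.buckets.get(self.code(q), [])

def build_tables(feats, n_tables=8, n_bits=6):
    """Preprocessing in Algorithm 1: index every example id in L hash tables."""
    tables = [SRPHash(feats.shape[1], n_bits, seed=t) for t in range(n_tables)]
    for i, v in enumerate(feats):
        for tab in tables:
            tab.insert(i, v)
    return tables

def lss_sample(tables, query, n, rng):
    """Algorithm 2: probe tables in random order and return a random id from
    the first non-empty bucket; fall back to a uniform id if all are empty."""
    for t in rng.permutation(len(tables)):
        hits = tables[t].query(query)
        if hits:
            return int(rng.choice(hits))
    return int(rng.integers(n))

def lst_teach(X, y, w_star, w0, eta=0.05, steps=500, seed=0):
    """Algorithm 1 for a linear least-squares learner (illustrative choice).
    feats and query below stand in for the paper's f(x_i, y_i) and f(w*, w_t)."""
    rng = np.random.default_rng(seed)
    feats = X * y[:, None]          # assumed surrogate teaching feature
    tables = build_tables(feats)
    w = w0.astype(float).copy()
    for _ in range(steps):
        query = w_star - w          # assumed surrogate teaching query
        j = lss_sample(tables, query, len(X), rng)
        grad = (X[j] @ w - y[j]) * X[j]   # squared-loss gradient on example j
        w -= eta * grad
    return w
```

As a usage example, one could call `lst_teach(X, y, w_star, np.zeros(X.shape[1]))` on a toy regression problem and compare its convergence against uniformly sampled SGD, which is the comparison the teachability claims revolve around.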
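The Software Dependencies row describes splitting LSH between GPU-based random projections (via CuPy) and CPU-side hash table lookups. The sketch below illustrates that division of labor under stated assumptions: the function names (`gpu_hash_codes`, `build_cpu_tables`) and the bit-packing scheme are illustrative, and the Cython/PyTorch wrapping mentioned in the paper is omitted.

```python
# Hedged sketch of the GPU/CPU split described in the Software Dependencies row:
# hash codes come from one batched random projection on the GPU (CuPy), while
# bucket lookups live in plain Python dicts on the CPU. Names are illustrative.
import cupy as cp

def gpu_hash_codes(feats_host, n_tables=8, n_bits=6, seed=0):
    """Compute signed random-projection codes for all examples and all tables
    with a single GPU matmul; returns (n_samples, n_tables) integer bucket keys."""
    rng = cp.random.default_rng(seed)
    feats = cp.asarray(feats_host, dtype=cp.float32)              # host -> device
    planes = rng.standard_normal((feats.shape[1], n_tables * n_bits),
                                 dtype=cp.float32)
    bits = (feats @ planes > 0).reshape(feats.shape[0], n_tables, n_bits)
    weights = (2 ** cp.arange(n_bits)).astype(cp.uint32)          # pack bits -> key
    return cp.asnumpy((bits * weights).sum(axis=2))               # device -> host

def build_cpu_tables(codes):
    """CPU side: one dict per hash table mapping bucket key -> list of example ids."""
    tables = [dict() for _ in range(codes.shape[1])]
    for i, row in enumerate(codes):
        for t, key in enumerate(row):
            tables[t].setdefault(int(key), []).append(i)
    return tables
```

Batching all projections into one matrix multiplication keeps the GPU busy with dense arithmetic, while the irregular, memory-bound dictionary lookups stay on the CPU, matching the split the authors describe.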