Large-Scale Distributed Learning via Private On-Device LSH

Authors: Tahseen Rabbani, Marco Bornstein, Furong Huang

NeurIPS 2023

Reproducibility variables, with the assessed result and the supporting LLM response for each:

Research Type: Experimental
LLM Response: "In this section, we (1) gauge the sensitivity of PGHash and (2) analyze the performance of PGHash and our own DWTA variant (PGHash-D) in training large-scale recommender networks."

Researcher Affiliation: Academia
LLM Response: "Tahseen Rabbani, Department of Computer Science, University of Maryland, trabbani@umd.edu; Marco Bornstein, Department of Computer Science, University of Maryland, marcob@umd.edu; Furong Huang, Department of Computer Science, University of Maryland, furongh@umd.edu"

Pseudocode: Yes
LLM Response: "Algorithm 1 Distributed PGHash"

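Because Algorithm 1 is only referenced here and no implementation is quoted, the sketch below shows a generic sign-random-projection (SimHash-style) construction of LSH tables over a layer's weight columns and a query for active neurons. The function names, table layout, and defaults (8-bit codes, 50 tables) are illustrative assumptions; this is not the paper's Algorithm 1 or its PGHash sketching step.

```python
import numpy as np
from collections import defaultdict

def build_lsh_tables(W, k=8, n_tables=50, seed=0):
    """Hash each neuron (a column of W, shape d x n_neurons) into k-bit sign-code buckets."""
    rng = np.random.default_rng(seed)
    d, n_neurons = W.shape
    projections, tables = [], []
    for _ in range(n_tables):
        P = rng.standard_normal((k, d))           # random Gaussian projection
        codes = (P @ W > 0).astype(np.uint8)      # k x n_neurons sign bits
        keys = np.packbits(codes, axis=0)[0]      # one uint8 key per neuron (valid for k <= 8)
        table = defaultdict(list)
        for neuron_id, key in enumerate(keys):
            table[int(key)].append(neuron_id)
        projections.append(P)
        tables.append(table)
    return projections, tables

def query_active_neurons(x, projections, tables):
    """Union of neurons whose hash code matches the input's code in at least one table."""
    active = set()
    for P, table in zip(projections, tables):
        key = int(np.packbits((P @ x > 0).astype(np.uint8))[0])
        active.update(table.get(key, []))
    return sorted(active)
```
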
Open Source Code: No
LLM Response: "The paper does not provide an explicit statement or link to open-source code for the described methodology."

Open Datasets: Yes
LLM Response: "We use three extreme multi-label datasets for training recommender networks: Delicious-200K, Amazon-670K, and Wiki LSHTC-325K. These datasets come from the Extreme Classification Repository [4]." Citation [4]: Kush Bhatia, Kunal Dahiya, Himanshu Jain, Anshul Mittal, Yashoteja Prabhu, and Manik Varma. The Extreme Classification Repository: Multi-label Datasets and Code. http://manikvarma.org/downloads/XC/XMLRepository.html, 2016.

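Since the datasets come from the Extreme Classification Repository, a minimal reader sketch may be useful. It assumes the repository's usual sparse text layout (a header line with point/feature/label counts, then comma-separated labels followed by feature:value pairs); this layout is an assumption and should be checked against the downloaded files.

```python
import numpy as np
from scipy.sparse import csr_matrix

def load_xc_dataset(path):
    """Read a sparse multi-label dataset in the assumed XC Repository text format."""
    with open(path) as f:
        n_points, n_features, n_labels = map(int, f.readline().split())
        data, indices, indptr, labels = [], [], [0], []
        for line in f:
            head, _, rest = line.strip().partition(" ")
            if ":" in head:              # line with no labels, only features
                head, rest = "", line.strip()
            labels.append([int(l) for l in head.split(",") if l])
            for tok in rest.split():
                idx, val = tok.split(":")
                indices.append(int(idx))
                data.append(float(val))
            indptr.append(len(indices))
    X = csr_matrix((data, indices, indptr), shape=(n_points, n_features))
    return X, labels, n_labels
```
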
Dataset Splits: No
LLM Response: "The paper mentions 'test accuracy' and 'test sets' but does not provide explicit training, validation, and test splits or a specific methodology for partitioning the data into these subsets."

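Since no split methodology is given, one common fallback is to hold out a validation set from the provided training portion. The sketch below does this with an arbitrary 90/10 ratio and a fixed seed; both are assumptions, not values from the paper.

```python
import numpy as np

def make_val_split(n_train_points, val_fraction=0.1, seed=0):
    # Hypothetical hold-out split; the ratio and seed are arbitrary assumptions.
    rng = np.random.default_rng(seed)
    perm = rng.permutation(n_train_points)
    n_val = int(val_fraction * n_train_points)
    return perm[n_val:], perm[:n_val]   # (train indices, validation indices)
```
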
Hardware Specification: Yes
LLM Response: "These experiments are run on a cloud cluster using Intel Xeon Silver 4216 processors with 128 GB of total memory."

Software Dependencies: No
LLM Response: "Finally, we train our neural network using TensorFlow."

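TensorFlow is the only framework detail given, with no version pinned. As a hedged illustration, the sketch below builds the kind of single-hidden-layer, wide-output network used in SLIDE/PGHash-style recommender experiments; the hidden width of 128, the sigmoid loss, and the metric are assumptions, and the dense forward pass ignores the LSH-based neuron selection that PGHash adds on top.

```python
import tensorflow as tf

def build_recommender(n_features, n_labels, hidden=128, lr=1e-4):
    # One hidden ReLU layer feeding a very wide output layer of logits,
    # trained with a multi-label (sigmoid) loss.
    model = tf.keras.Sequential([
        tf.keras.layers.Dense(hidden, activation="relu", input_shape=(n_features,)),
        tf.keras.layers.Dense(n_labels),  # logits over the full label space
    ])
    model.compile(
        optimizer=tf.keras.optimizers.Adam(learning_rate=lr),
        loss=tf.keras.losses.BinaryCrossentropy(from_logits=True),
        metrics=[tf.keras.metrics.TopKCategoricalAccuracy(k=1)],  # rough stand-in for P@1
    )
    return model
```
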
Experiment Setup: Yes
LLM Response: "Table 1: Hyper-parameters for Federated Experiments (PGHash and Federated SLIDE)." For the Delicious-200K row with the PGHash algorithm and PGHash hash type: LR 1e-4, batch size 128, 1 step per LSH, k = 8, c = 8, 50 tables, CR = 1.

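For convenience, the quoted Table 1 row can be read as a configuration record. The transcription below is a sketch: only the values come from the table, while the key names are assumptions, and the exact definitions of k, c, Tables, and CR are given in the paper.

```python
# Table 1, Delicious-200K / PGHash row, transcribed for illustration.
PGHASH_DELICIOUS_200K = {
    "dataset": "Delicious-200K",
    "algorithm": "PGHash",
    "hash_type": "PGHash",
    "learning_rate": 1e-4,
    "batch_size": 128,
    "steps_per_lsh": 1,   # interpreted as: rebuild hash tables every step (assumption)
    "k": 8,
    "c": 8,
    "tables": 50,
    "cr": 1,
}
```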