Momentum Aggregation for Private Non-convex ERM

Authors: Hoang Tran, Ashok Cutkosky

NeurIPS 2022

Reproducibility Variable | Result | LLM Response
Research Type | Theoretical | We introduce new algorithms and convergence guarantees for privacy-preserving non-convex Empirical Risk Minimization (ERM) on smooth d-dimensional objectives. ... By combining this new approach with a recent analysis of momentum with private aggregation techniques, we provide an (ϵ, δ)-differentially private algorithm that finds a gradient of norm O(d^{1/3}/(ϵN)^{2/3}) in O(N^{7/3}ϵ^{4/3}/d^{2/3}) gradient evaluations, improving the previous best gradient bound of O(d^{1/4}/(ϵN)^{1/2}).
Researcher Affiliation | Academia | Hoang Tran (Boston University, tranhp@bu.edu); Ashok Cutkosky (Boston University, ashok@cutkosky.com)
Pseudocode | Yes | Algorithm 1: Differentially private Normalized SGD with momentum (DP-NSGD) ... Algorithm 2: Sensitivity-reduced normalized SGD
Open Source Code | No | The paper does not provide any explicit statement about releasing source code or a link to a code repository. The checklist under '3. If you ran experiments...' has all N/A entries, implying there is no experimental code.
Open Datasets | No | The paper is theoretical: its problem setting assumes 'N i.i.d samples x1, ..., xN ∈ X from some unknown distribution P', but it does not use, or provide access information for, any specific public dataset.
Dataset Splits | No | The paper is theoretical and does not conduct experiments; therefore, it provides no dataset split information.
Hardware Specification | No | The paper is theoretical and describes no experiments, so it specifies no hardware details. The checklist confirms this with all N/A entries for the experimental questions.
Software Dependencies | No | The paper is theoretical and does not list software dependencies with version numbers for its own work. Any mentions of software are generic or appear only in references, without the versioning needed for reproducibility.
Experiment Setup | No | The paper defines parameters for its algorithms within its mathematical analysis and utility bounds (e.g., 'η = 1/(NT)', 'α = N^{3/4}ϵT^{3/4}/d'), but these are analytical choices, not concrete experimental setup details such as specific hyperparameter values or training configurations for an actual experiment.
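The pseudocode row above names Algorithm 1, DP-NSGD: clip each gradient to bound sensitivity, privatize it with Gaussian noise, average the noised gradients into a momentum buffer, and take a fixed-length step in the momentum's direction. A minimal Python sketch of that loop, on a toy quadratic objective — the clipping threshold, noise scale, step size, and momentum parameter below are illustrative placeholders, not the paper's calibrated privacy settings:

```python
import numpy as np

def dp_nsgd(grad_fn, x0, steps=200, lr=0.1, beta=0.9,
            clip=1.0, noise_std=0.01, seed=0):
    """Sketch of differentially private normalized SGD with momentum:
    clip the gradient, add Gaussian noise, fold it into a momentum
    buffer, and step a fixed distance lr in the momentum direction."""
    rng = np.random.default_rng(seed)
    x = np.asarray(x0, dtype=float)
    m = np.zeros_like(x)
    for _ in range(steps):
        g = grad_fn(x)
        # Clip to bound sensitivity, then privatize with Gaussian noise.
        g = g * min(1.0, clip / (np.linalg.norm(g) + 1e-12))
        g = g + noise_std * rng.normal(size=g.shape)
        # Exponential moving average of the privatized gradients.
        m = beta * m + (1.0 - beta) * g
        # Normalized update: direction of m, fixed step length lr.
        x = x - lr * m / (np.linalg.norm(m) + 1e-12)
    return x

# Toy objective f(x) = 0.5 * ||x||^2, so grad f(x) = x; minimum at the origin.
x_final = dp_nsgd(lambda x: x, x0=np.array([5.0, -3.0]))
print(np.linalg.norm(x_final))
```

Because the step length is fixed, the iterate walks toward the minimum and then hovers within roughly lr of it, which is why the paper's analysis pairs normalization with carefully scheduled step sizes rather than a constant lr.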