Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," Under Review, 2026).

GeoClip: Geometry-Aware Clipping for Differentially Private SGD

Authors: Atefeh Gilani, Naima Tasnim, Lalitha Sankar, Oliver Kosut

NeurIPS 2025 | Venue PDF | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Experiments on both tabular and image datasets demonstrate that GeoClip consistently outperforms existing adaptive clipping methods under the same privacy budget. [...] 4 Experimental Results
Researcher Affiliation | Academia | Atefeh Gilani, Arizona State University, Tempe, AZ, USA, EMAIL; Naima Tasnim, Arizona State University, Tempe, AZ, USA, EMAIL; Lalitha Sankar, Arizona State University, Tempe, AZ, USA, EMAIL; Oliver Kosut, Arizona State University, Tempe, AZ, USA, EMAIL
Pseudocode | Yes | Algorithm 1 outlines our proposed GeoClip method. [...] Algorithm 2 Streaming Rank-k PCA [...] The complete version of this variant is provided as Algorithm 3 in Appendix C.
Open Source Code | Yes | The code implementation is available on GitHub [16]. [16] GeoClip Implementation. https://github.com/atefeh-gilani/GeoClip, May 2025.
Open Datasets | Yes | Experiments on both tabular and image datasets [...] Diabetes [17], Breast Cancer [18], and Android Malware [19]. [...] MNIST [20] [...] Fashion-MNIST [21] [...] USPS dataset [22]. All datasets used in the paper are publicly available and have been properly cited with their original sources.
Dataset Splits | Yes | All datasets are split into 80-10-10 train-validation-test sets for consistent evaluation.
Hardware Specification | Yes | All experiments were conducted on Google Colab using CPU resources. The computational demands are modest, and no specialized hardware (e.g., GPU or TPU) is required to reproduce the reported results.
Software Dependencies | No | We either use custom optimizer updates as outlined in our proposed algorithms, or specify it otherwise. [...] The CNN consists of two convolutional and pooling layers followed by a linear compression layer that reduces the feature size to 50, resulting in a total of only 510 trainable parameters. We fine-tune this layer using different methods under varying privacy budgets [...] first trained on MNIST [20] using the Adam optimizer
Experiment Setup | Yes | All training and test details including data splits and hyperparameters are discussed in the Experimental Results section. [...] We train a linear regression model using various private training methods for 10 epochs with a batch size of 1024, tuning the learning rate for each method to ensure stable convergence. [...] We train for 5 epochs using 20 random seeds [...] batch size = 32 [...] batch size = 64 [...] batch size = 512. [...] β1 = 0.99 and β2 = 0.999. The parameter h1 only needs to be a small positive constant (e.g., 10^-15). [...] For h2, we have observed that the values 1 and 10 perform consistently well across datasets. [...] A similar setup is used in Algorithm 2 with β3 (set to 0.99).
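
For readers attempting a reproduction, the 80-10-10 train-validation-test split reported under Dataset Splits could be sketched roughly as below. This is an illustrative sketch only, not the authors' code: the function name `split_80_10_10`, the use of a shuffled permutation, and the `seed` argument are all assumptions, since the quoted text does not say how the split was drawn or seeded.

```python
import numpy as np


def split_80_10_10(n_samples, seed=0):
    """Return index arrays for an 80-10-10 train/validation/test split.

    Hypothetical helper: shuffling and seeding are assumptions, not
    details taken from the paper.
    """
    rng = np.random.default_rng(seed)
    perm = rng.permutation(n_samples)

    # Integer arithmetic avoids floating-point rounding in the split sizes.
    n_train = (8 * n_samples) // 10
    n_val = n_samples // 10

    train_idx = perm[:n_train]
    val_idx = perm[n_train:n_train + n_val]
    # The test set takes whatever remains, so no sample is dropped.
    test_idx = perm[n_train + n_val:]
    return train_idx, val_idx, test_idx


train_idx, val_idx, test_idx = split_80_10_10(1000, seed=0)
print(len(train_idx), len(val_idx), len(test_idx))
```

Because the summary reports 20 random seeds for some experiments, one plausible usage is to call the helper once per seed so that each run draws its own split; whether the authors varied the split or only the training randomness across seeds is not stated here.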