On the Computational Complexity of Private High-dimensional Model Selection

Authors: Saptarshi Roy, Zehua Wang, Ambuj Tewari

NeurIPS 2024 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable Result LLM Response
Research Type Experimental Finally, we perform some illustrative experiments on simulated data showing that our algorithm can quickly identify active features under reasonable privacy budget constraints. 5 Numerical experiments In this section, we will conduct some illustrative simulations.
Researcher Affiliation Academia Saptarshi Roy Zehua Wang Ambuj Tewari Department of Statistics University of Michigan, Ann Arbor {roysapta, wangzeh, tewaria}@umich.edu
Pseudocode No The paper describes the Metropolis-Hastings algorithm and its double swap update scheme in detail with formulas, but it does not present this information in a formally structured pseudocode or algorithm block.
Open Source Code Yes All codes are available at https://github.com/roysaptaumich/DP-BSS.
Open Datasets No The paper uses simulated data generated according to specified distributions (Uniform and Gaussian design) and parameters (n, p, s, noise distribution), rather than pre-existing publicly available datasets.
Dataset Splits No The paper uses simulated data and evaluates model recovery, but does not specify explicit train/validation/test dataset splits.
Hardware Specification Yes All the experiments were performed in the Great Lakes cluster with 16 cores and 10 GB RAM.
Software Dependencies No The paper mentions using the ABESS algorithm and the CVXPY package, but does not specify their version numbers.
Experiment Setup Yes In detail, we set n = 900, p = 2000, and the sparsity level s = 4. We generate the entries of the noise w independently from Uniform( 0.1, 0.1), and consider the linear model (1). We choose the design vector β with true sparsity s = 4 and the support set γ = {j : 1 j 4}. We set all the signal strength to be equal, taking the following two forms: (i) Strong signal: βj = 2{(s log p)/n}1/2, and (ii) Weak signal: βj = 2{(log p)/n}1/2 for all j γ . Under these setups, we consider the privacy parameter ε {0.5, 1, 3, 5, 10} which are acceptable choices of ε [35]. Moreover, similar (or larger) choices of ε are common in various applications including US census study [14], socio-economic study [38], and industrial applications [3, 52]. For the Metropolis-Hastings random walk, we vary K {0.5, 2, 3, 3.5} and initialize 10 independent Markov chains from random initializations and record the F-score of the last iteration.