Stochastic Optimization Schemes for Performative Prediction with Nonconvex Loss

Authors: Qiang Li, Hoi-To Wai

NeurIPS 2024 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility assessment: each variable below is listed with its result and the supporting LLM response.
Research Type: Experimental
LLM Response: "Numerical experiments corroborate our theories. We consider two examples of performative prediction with non-convex loss based on synthetic data and real data. All simulations are performed with PyTorch on a server using an Intel Xeon 6318 CPU."

Researcher Affiliation: Academia
LLM Response: "Qiang Li, Hoi-To Wai. Department of Systems Engineering and Engineering Management, The Chinese University of Hong Kong, Shatin, Hong Kong SAR of China. {liqiang, htwai}@se.cuhk.edu.hk"

Pseudocode: No
LLM Response: The paper describes its algorithms verbally and with mathematical equations (e.g., equation (2) for SGD-GD and equation (23) for lazy deployment), but it does not include explicit pseudocode or algorithm blocks labeled as such.

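For orientation, the SGD-GD scheme in equation (2) is the greedy-deploy stochastic update θ_{t+1} = θ_t − γ_t ∇ℓ(θ_t; z_t), where z_t is drawn from the distribution induced by the currently deployed model, while lazy deployment (equation (23)) redeploys only once every K steps. The following is a minimal PyTorch sketch of that loop, not the authors' implementation; `loss_fn` and `sample_from_D` are hypothetical placeholders for the nonconvex loss ℓ(θ; z) and the performative distribution D(θ).

```python
import torch

def sgd_gd(theta0, loss_fn, sample_from_D, gamma, T, K=1):
    """Minimal sketch of SGD-GD (eq. (2)) with optional lazy deployment (eq. (23)).

    theta0        : initial parameter tensor
    loss_fn       : callable (theta, z) -> scalar loss ell(theta; z)  [placeholder]
    sample_from_D : callable theta -> one sample z ~ D(theta)         [placeholder]
    gamma         : constant stepsize gamma_t = gamma
    T             : number of stochastic gradient steps
    K             : deploy the model every K steps (K = 1 is greedy deploy)
    """
    theta = theta0.detach().clone().requires_grad_(True)
    deployed = theta.detach().clone()          # the distribution reacts to this model
    for t in range(T):
        z = sample_from_D(deployed)            # z_t ~ D(theta_deployed)
        loss = loss_fn(theta, z)
        (grad,) = torch.autograd.grad(loss, theta)
        with torch.no_grad():
            theta -= gamma * grad              # theta_{t+1} = theta_t - gamma * grad
        if (t + 1) % K == 0:                   # lazy deployment: redeploy every K steps
            deployed = theta.detach().clone()
    return theta.detach()
```

With K = 1 this reduces to greedy deployment; the b = 1 setting quoted in the experiment setup below corresponds to sampling a single z per step.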
Open Source Code: Yes
LLM Response: "The paper will provide open access to the source code, ensuring that the main experimental results can be faithfully reproduced."

Open Datasets: Yes
LLM Response: "Our second example deals with the task of training a neural network (NN) on the spambase Hopkins et al. [1999] dataset with m = 4601 samples, each with d = 57 features."

Dataset Splits: No
LLM Response: "We split the training/test sets as 8 : 2." Although a split ratio is quoted, the paper does not specify the exact partition (e.g., random seed or sample indices).

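For context, an 8:2 split like the one quoted can be reproduced only up to this unstated randomness; the sketch below is one plausible reading, where the file path `spambase.data` and the fixed seed are assumptions rather than details from the paper.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# spambase.data: 4601 comma-separated rows; the last column is the 0/1 spam label.
# The path and random seed are assumptions; the paper only states the 8:2 ratio.
data = np.loadtxt("spambase.data", delimiter=",")
X, y = data[:, :-1], data[:, -1]
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)
```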
Hardware Specification: Yes
LLM Response: "All simulations are performed with PyTorch on a server using an Intel Xeon 6318 CPU."

Software Dependencies: No
LLM Response: "All simulations are performed with PyTorch on a server using an Intel Xeon 6318 CPU." PyTorch is named, but no version numbers or a fuller dependency list are given.

Experiment Setup: Yes
LLM Response: "For (2), the batch size is b = 1 and the stepsize is γ_t = γ = 1/√T with T = 10^6. In our experiment, we set ϵ_NN ∈ {0, 10, 100} and the batch size as b = 8. For SGD-GD, we use γ_t = γ = 200/√T, and for lazy deployment, we use γ = 200/(K√T) with T = 10^5. The NN encoded in f_θ(x) consists of three fully-connected layers with tanh activation and a sigmoid output layer, i.e., f_θ(x) = Sigmoid(θ^(1) tanh(θ^(2) tanh(θ^(3) x))), where θ^(i) := [w^(i); b^(i)] concatenates the weight and bias for each layer, with d_1 = 10, d_2 = 50, d_3 = 57 neurons, making a total of d = 3421 parameters for θ."
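
Reading d_i as the input width of layer θ^(i) (an assumption, but one consistent with the quoted total: 57·50 + 50 + 50·10 + 10 + 10·1 + 1 = 3421 parameters), the architecture can be sketched in PyTorch as follows; the class name `SpamNN` is hypothetical.

```python
import torch
import torch.nn as nn

class SpamNN(nn.Module):
    """Sketch of f_θ(x) = Sigmoid(θ^(1) tanh(θ^(2) tanh(θ^(3) x)))."""

    def __init__(self, d3=57, d2=50, d1=10):
        super().__init__()
        self.theta3 = nn.Linear(d3, d2)  # θ^(3): 57 -> 50, weight w^(3) and bias b^(3)
        self.theta2 = nn.Linear(d2, d1)  # θ^(2): 50 -> 10
        self.theta1 = nn.Linear(d1, 1)   # θ^(1): 10 -> 1 (scalar output)

    def forward(self, x):
        h = torch.tanh(self.theta3(x))
        h = torch.tanh(self.theta2(h))
        return torch.sigmoid(self.theta1(h))

model = SpamNN()
n_params = sum(p.numel() for p in model.parameters())
assert n_params == 3421, n_params  # matches d = 3421 quoted above
```

The final assertion checks that the assumed layer dimensions reproduce the parameter count stated in the quote.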