Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].

Robust Logistic Regression and Classification

Authors: Jiashi Feng, Huan Xu, Shie Mannor, Shuicheng Yan

NeurIPS 2014 | Venue PDF | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "In this section, we conduct simulations to verify the robustness of RoLR along with its applicability for robust binary classification. We compare RoLR with standard logistic regression, which estimates the model parameter through maximizing the log-likelihood function."
Researcher Affiliation | Academia | "Jiashi Feng, EECS Department & ICSI, UC Berkeley; Huan Xu, ME Department, National University of Singapore; Shie Mannor, EE Department, Technion; Shuicheng Yan, ECE Department, National University of Singapore"
Pseudocode | Yes | "Algorithm 1 RoLR. Input: contaminated training samples {(x_1, y_1), ..., (x_{n+n_1}, y_{n+n_1})}, an upper bound n_1 on the number of outliers, the number of inliers n, and the sample dimension p. Initialization: set T = 4√(p log p/n + log n/n). Preprocessing: remove samples (x_i, y_i) whose magnitude satisfies ‖x_i‖ ≥ T. Solve the following linear programming problem (see Eqn. (3)): β̂ = arg max_{β ∈ B₂^p} Σ_{i=1}^{n} [y⟨β, x⟩]_(i). Output: β̂."
Open Source Code | No | "The paper does not provide any concrete statement or link regarding the availability of source code for the described methodology."
Open Datasets | No | "We randomly generated the samples according to the model in Eqn. (1) for the logistic regression problem."
Dataset Splits | No | "The paper mentions 'training samples' and states that 'additional n_t = 1,000 authentic samples are generated for testing', but it does not specify explicit training/validation/test splits, or their percentages or counts, for the main dataset used for training and validation."
Hardware Specification | No | "The paper does not provide specific details about the hardware used to run the experiments, such as CPU or GPU models, or memory specifications."
Software Dependencies | No | "The paper does not provide specific software dependencies with version numbers needed to replicate the experiment."
Experiment Setup | Yes | "In particular, we first sample the model parameter β ~ N(0, I_p) and normalize it as β := β/‖β‖₂. Here p is the dimension of the parameter, which is also the dimension of the samples. The samples are drawn i.i.d. from x_i ~ N(0, Σ_x) with Σ_x = I_p, and the Gaussian noise is sampled as v_i ~ N(0, σ_e). Then the sample label y_i is generated according to P{y_i = +1} = τ(⟨β, x_i⟩ + v_i) for the LR case. For the classification case, the sample labels are generated by y_i = sign(⟨β, x_i⟩ + v_i), and an additional n_t = 1,000 authentic samples are generated for testing. The entries of the outliers x_o are i.i.d. random variables drawn from the uniform distribution on [−σ_o, σ_o] with σ_o = 10. ... We fix n = 1,000 and λ varies from 0 to 1.2 with a step of 0.1."
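Combining the data-generation protocol quoted in the Experiment Setup row with the trimming-plus-order-statistics structure of Algorithm 1, a minimal NumPy sketch is given below. The sample sizes, noise scale, adversarial outlier labels, trimming quantile, and the projected-subgradient solver are all simplifications chosen here for illustration; the paper instead uses a fixed norm threshold T and solves a linear program.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic data following the paper's classification setup (hypothetical
# small sizes; the paper uses n = 1,000 inliers).
p, n, n1 = 10, 400, 40                      # dimension, inliers, outliers
beta_star = rng.standard_normal(p)
beta_star /= np.linalg.norm(beta_star)      # beta := beta / ||beta||_2

X_in = rng.standard_normal((n, p))          # x_i ~ N(0, I_p)
v = 0.1 * rng.standard_normal(n)            # Gaussian noise (scale assumed here)
y_in = np.sign(X_in @ beta_star + v)        # y_i = sign(<beta, x_i> + v_i)

sigma_o = 10.0
X_out = rng.uniform(-sigma_o, sigma_o, (n1, p))  # outlier entries ~ U[-sigma_o, sigma_o]
y_out = -np.sign(X_out @ beta_star)              # adversarially flipped labels (an assumption)

X = np.vstack([X_in, X_out])
y = np.concatenate([y_in, y_out])

# Preprocessing: discard samples with unusually large norm. The paper uses a
# fixed threshold T; an empirical quantile is substituted here for simplicity.
norms = np.linalg.norm(X, axis=1)
keep = norms <= np.quantile(norms, 0.9)
Xk, yk = X[keep], y[keep]

def rolr(Xk, yk, n, steps=200, lr=0.01):
    """Maximize the sum of the n smallest margins y_i <beta, x_i> over the
    unit l2 ball by projected subgradient ascent (the paper solves an LP)."""
    beta = np.zeros(Xk.shape[1])
    for t in range(steps):
        margins = yk * (Xk @ beta)
        idx = np.argsort(margins)[:min(n, len(margins))]  # n smallest margins
        g = (yk[idx, None] * Xk[idx]).sum(axis=0)         # subgradient of the sum
        beta = beta + lr * g / np.sqrt(t + 1.0)
        nb = np.linalg.norm(beta)
        if nb > 1.0:
            beta /= nb                                    # project back onto the unit ball
    return beta

beta_hat = rolr(Xk, yk, n)
cos = float(beta_hat @ beta_star / np.linalg.norm(beta_hat))
print(f"cosine similarity with the true parameter: {cos:.2f}")
```

Because the sum of the n smallest of a collection of linear functions is concave in β, projected subgradient ascent over the unit ball is a valid, if slower, stand-in for the paper's linear-programming formulation.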