reproducibilityindex.ai

Feature Cross-Substitution in Adversarial Classification

Authors: Bo Li, Yevgeniy Vorobeychik

NeurIPS 2014

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	We support our insight through extensive experiments, exhibiting potential perils of traditional means for feature selection. Our evaluation uses three data sets: Enron email data [21], Ling-spam data [22], and internet advertisement dataset from the UCI repository [23].
Researcher Affiliation	Academia	Bo Li and Yevgeniy Vorobeychik Electrical Engineering and Computer Science Vanderbilt University {bo.li.2,yevgeniy.vorobeychik}@vanderbilt.edu
Pseudocode	Yes	Figure 3: Left: MILP to compute solution to (4). Right: SMA iterative algorithm using clustering and constraint generation. (Algorithm 1 SMA(X) is presented in Figure 3 (right)).
Open Source Code	No	The paper does not provide any explicit statement or link indicating that the source code for the described methodology is publicly available.
Open Datasets	Yes	Our evaluation uses three data sets: Enron email data [21], Ling-spam data [22], and internet advertisement dataset from the UCI repository [23].
Dataset Splits	Yes	The Enron data set was divided into training set of 3172 and a test set of 2000 emails in each of 5 folds of cross-validation, with an equal number of spam and non-spam instances [21]. The Ling-spam data set was divided into 1158 instances for training and 289 for test in cross-validation with five times as much non-spam as spam, and contains 1000 features from which between 5 and 500 were sub-selected for the experiments. Finally, the UCI data set was divided into 476 training and 119 test instances in five-fold cross validation, with four times as many advertisement as non-advertisement instances.
Hardware Specification	No	The paper does not provide specific details about the hardware used for running the experiments, such as GPU models, CPU types, or memory specifications.
Software Dependencies	No	The paper mentions solving mixed-integer linear programs but does not specify any particular software dependencies, libraries, or solvers with version numbers that would be needed for replication.
Experiment Setup	No	The paper describes the overall model and algorithms but does not provide specific experimental setup details such as hyperparameter values (e.g., learning rates, batch sizes, number of epochs) or other detailed training configurations in the main text.