Unified Methods for Exploiting Piecewise Linear Structure in Convex Optimization

Authors: Tyler B. Johnson, Carlos Guestrin

NeurIPS 2016

Reproducibility assessment: each entry below gives the variable, the result, and the supporting LLM response.
Research Type: Experimental. "We include empirical evaluations that compare the scalability of screening and working set methods on real-world problems. ... While many screening tests have been proposed for large-scale optimization, we have not seen the scalability of screening studied in prior literature. Surprisingly, although our screening test significantly improves upon many prior results, we find that screening scales poorly as the size of the problem increases. In fact, in many cases, screening has negligible effect on overall convergence times. In contrast, our working set algorithm improves convergence times considerably in a number of cases. This result suggests that compared to screening, working set algorithms are significantly more useful for scaling optimization to large problems."
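For context on what a screening test does, below is a minimal sketch of a duality-gap-based ("gap safe") screening rule for the Lasso. This is a standard illustration of safe feature elimination, not the paper's unified piecewise-linear test, and all variable names are assumptions.

```python
import numpy as np

def gap_safe_screen_lasso(A, b, x, lam):
    """Return a boolean mask of features that can be safely discarded.

    Illustrative gap-safe screening rule for the Lasso
        min_x 0.5 * ||A x - b||^2 + lam * ||x||_1
    (a standard screening test; NOT the paper's piecewise-linear test).
    """
    residual = b - A @ x
    # Feasible dual point obtained by rescaling the residual.
    scale = max(lam, np.max(np.abs(A.T @ residual)))
    theta = residual / scale
    # Primal and dual objectives give the duality gap.
    primal = 0.5 * residual @ residual + lam * np.abs(x).sum()
    dual = 0.5 * (b @ b) - 0.5 * np.sum((b - lam * theta) ** 2)
    gap = max(primal - dual, 0.0)
    # Radius of a ball that is guaranteed to contain the dual optimum.
    radius = np.sqrt(2.0 * gap) / lam
    col_norms = np.linalg.norm(A, axis=0)
    # Feature j is provably zero at the optimum if this test passes.
    return np.abs(A.T @ theta) + radius * col_norms < 1.0
```

Any feature flagged by the mask can be removed from the problem without changing the solution, which is what makes such tests attractive in principle; the result quoted above is that this pruning alone did not help much at scale.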
Researcher Affiliation Academia Tyler B. Johnson University of Washington, Seattle tbjohns@washington.edu Carlos Guestrin University of Washington, Seattle guestrin@cs.washington.edu
Pseudocode: Yes. The paper presents Algorithm 1, PW-BLITZ.
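As a rough illustration of the working set pattern that algorithms of this family follow, a generic sketch is given below. It is not the paper's Algorithm 1; `priority` and `solve_subproblem` are hypothetical placeholders standing in for problem-specific components.

```python
def working_set_solver(n_constraints, priority, solve_subproblem,
                       initial_size=100, tol=1e-6, max_iters=50):
    """Generic working set loop: repeatedly solve a small subproblem
    restricted to the constraints judged most relevant, then expand.

    `priority(i, solution)` scores how likely constraint i is to be active
    at the optimum; `solve_subproblem(indices)` returns the solution of the
    restricted problem together with a duality-gap estimate for the full
    problem evaluated at that solution.  Both are hypothetical placeholders.
    """
    solution = None
    working_set = set(range(min(initial_size, n_constraints)))
    for _ in range(max_iters):
        solution, gap = solve_subproblem(sorted(working_set))
        if gap <= tol:
            return solution  # gap certifies (near-)optimality on the full problem
        # Grow the working set with the highest-priority excluded constraints.
        excluded = [i for i in range(n_constraints) if i not in working_set]
        excluded.sort(key=lambda i: priority(i, solution), reverse=True)
        working_set.update(excluded[:max(1, len(working_set))])
    return solution
```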
Open Source Code: No. The paper does not contain an unambiguous statement of source code release for the described methodology or a direct link to a code repository.
Open Datasets: Yes. "We train an SVM model on the Higgs boson dataset [2]. This dataset was generated by a team of particle physicists. The classification task is to determine whether an event corresponds to the Higgs boson. In order to learn an accurate model, we performed feature engineering on this dataset, resulting in 8010 features. In this experiment, we consider subsets of examples with size m = 10^4, 10^5, and 10^6." Footnotes: [1] https://www.kaggle.com/c/ClaimPredictionChallenge and [2] https://archive.ics.uci.edu/ml/datasets/HIGGS
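To make the subset sizes concrete, a sketch of loading the UCI HIGGS data and drawing subsets of m = 10^4, 10^5, and 10^6 examples might look like the following. The file name and column layout are assumptions, and the paper's feature engineering (8010 features) is omitted.

```python
import numpy as np
import pandas as pd

# HIGGS.csv.gz downloaded from https://archive.ics.uci.edu/ml/datasets/HIGGS
# (assumed layout: first column is the binary label, remaining columns are features).
data = pd.read_csv("HIGGS.csv.gz", header=None)
y = data.iloc[:, 0].to_numpy()
X = data.iloc[:, 1:].to_numpy()

rng = np.random.default_rng(0)
subsets = {}
for m in (10**4, 10**5, 10**6):
    idx = rng.choice(len(y), size=m, replace=False)
    subsets[m] = (X[idx], y[idx])  # the paper additionally engineers 8010 features; not shown here
```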
Dataset Splits: No. The paper mentions '250,000 training instances' and 'subsets of examples with size m = 10^4, 10^5, and 10^6', but does not provide specific percentages or counts for training, validation, or test splits. It hints at a validation set by mentioning 'minimizes validation loss' but lacks details on its size or how it was separated.
Hardware Specification: No. The paper does not provide specific hardware details (exact GPU/CPU models, processor types with speeds, memory amounts, or detailed computer specifications) used for running its experiments.
Software Dependencies: No. The paper mentions implementing 'dual coordinate ascent (DCA)' and referencing the 'LIBLINEAR library [12]', but it does not provide specific version numbers for any software, libraries, or solvers used in the experiments.
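For readers unfamiliar with the referenced solver, below is a minimal dual coordinate ascent sketch for an L1-loss linear SVM in the style popularized by LIBLINEAR. It is a simplified illustration under assumed settings (no shrinking, fixed epoch count), not the paper's implementation.

```python
import numpy as np

def dca_linear_svm(X, y, C=1.0, epochs=10):
    """Dual coordinate ascent for the L1-loss linear SVM dual:
        max_alpha  sum_i alpha_i - 0.5 * ||sum_i alpha_i y_i x_i||^2
        s.t.       0 <= alpha_i <= C
    Labels y must be in {-1, +1}.  Simplified: no shrinking, fixed epochs.
    """
    n, d = X.shape
    alpha = np.zeros(n)
    w = np.zeros(d)
    q = np.einsum("ij,ij->i", X, X)  # diagonal of the Gram matrix, q_i = ||x_i||^2
    for _ in range(epochs):
        for i in np.random.permutation(n):
            if q[i] == 0.0:
                continue
            # Gradient of 0.5*||w||^2 - sum(alpha) with respect to alpha_i.
            grad = y[i] * (w @ X[i]) - 1.0
            # Projected single-coordinate Newton step, clipped to [0, C].
            new_alpha = np.clip(alpha[i] - grad / q[i], 0.0, C)
            # Maintain the invariant w = sum_i alpha_i y_i x_i.
            w += (new_alpha - alpha[i]) * y[i] * X[i]
            alpha[i] = new_alpha
    return w, alpha
```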
Experiment Setup: No. The paper mentions setting 'λ so that exactly 5% of groups have nonzero weight', notes that 'C is a tuning parameter', and applies screening 'After every five DCA epochs'. However, it does not provide specific concrete hyperparameter values (e.g., learning rate, batch size, number of epochs, optimizer settings) or detailed system-level training settings.
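As an illustration of the "λ chosen so that 5% of groups have nonzero weight" criterion, one way to realize it is a bisection search on λ between 0 and λ_max. The `fit_group_lasso` solver below is a hypothetical placeholder, not code from the paper, and the search procedure itself is an assumption about how such a target could be met.

```python
import numpy as np

def lambda_for_target_sparsity(fit_group_lasso, lam_max, n_groups,
                               target_frac=0.05, iters=30):
    """Bisect on lambda so that roughly `target_frac` of groups are nonzero.

    `fit_group_lasso(lam)` is a hypothetical solver returning a list of
    per-group weight vectors; `lam_max` is the smallest lambda at which all
    groups are zero.  Larger lambda produces fewer nonzero groups.
    """
    lo, hi = 0.0, lam_max
    for _ in range(iters):
        lam = 0.5 * (lo + hi)
        groups = fit_group_lasso(lam)
        frac_nonzero = sum(np.any(g != 0) for g in groups) / n_groups
        if frac_nonzero > target_frac:
            lo = lam  # too many active groups: raise the midpoint (more regularization)
        else:
            hi = lam  # too few active groups: lower the midpoint (less regularization)
    return 0.5 * (lo + hi)
```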