A Multi-Batch L-BFGS Method for Machine Learning

Authors: Albert S. Berahas, Jorge Nocedal, Martin Takáč

NeurIPS 2016 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "In this Section, we present numerical results that evaluate the proposed robust multi-batch L-BFGS scheme (Algorithm 1) on logistic regression problems. Figure 2 shows the performance on the webspam dataset"
Researcher Affiliation | Academia | "Albert S. Berahas, Northwestern University, Evanston, IL (albertberahas@u.northwestern.edu); Jorge Nocedal, Northwestern University, Evanston, IL (j-nocedal@northwestern.edu); Martin Takáč, Lehigh University, Bethlehem, PA (takac.mt@gmail.com)"
Pseudocode | Yes | "A pseudo-code of the proposed method is given below, and depends on several parameters... Algorithm 1 Multi-Batch L-BFGS"
Open Source Code | No | No explicit statement providing concrete access to source code (a specific repository link, an explicit code-release statement, or code in supplementary materials) for the methodology described in this paper was found.
Open Datasets | Yes | "Figure 2 shows the performance on the webspam dataset [1], where we compare it against three methods: ..." Footnote 1: "LIBSVM: https://www.csie.ntu.edu.tw/~cjlin/libsvmtools/datasets/binary.html"
Dataset Splits | No | The paper mentions training examples and a training set but does not specify how the dataset was split into training, validation, and test sets (e.g., exact percentages, sample counts, or citations to predefined splits) as needed for reproduction.
Hardware Specification | No | The paper reports running experiments on a distributed computing platform with "K = 16 MPI processes" but provides no specific hardware details such as CPU/GPU models, memory, or cluster configuration.
Software Dependencies | No | No specific software dependencies with version numbers (e.g., library or solver names with their versions) needed to replicate the experiments were found.
Experiment Setup | Yes | "Figure 2: webspam dataset. Comparison of Robust L-BFGS, L-BFGS (multi-batch L-BFGS without enforcing sample consistency), Gradient Descent (multi-batch Gradient method) and SGD for various batch (r) and overlap (o) sizes. Solid lines show average performance, and dashed lines show worst and best performance, over 10 runs (per algorithm). K = 16 MPI processes." ... "Figure 3: webspam dataset. Comparison of Robust L-BFGS and L-BFGS (multi-batch L-BFGS without enforcing sample consistency), for various node failure probabilities p. Solid lines show average performance, and dashed lines show worst and best performance, over 10 runs (per algorithm). K = 16 MPI processes." Reported settings for webspam: α = 1, r = 1%, o = 20%, K = 16.