Large-scale L-BFGS using MapReduce

Authors: Weizhu Chen, Zhenghao Wang, Jingren Zhou

NeurIPS 2014

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "We prove the mathematical equivalence of the new Vector-free L-BFGS and demonstrate its excellent performance and scalability using real-world machine learning problems with billions of variables in production clusters." See also Section 6, Experiment and Discussion.
Researcher Affiliation | Industry | Weizhu Chen, Zhenghao Wang, Jingren Zhou; Microsoft; {wzchen,zhwang,jrzhou}@microsoft.com
Pseudocode | Yes | Algorithm 1: L-BFGS Algorithm Outline; Algorithm 2: L-BFGS two-loop recursion; Algorithm 3: Vector-free L-BFGS two-loop recursion. (Hedged sketches of Algorithms 2 and 3 follow this table.)
Open Source Code | No | The paper provides no information about open-source code availability and no link to a repository.
Open Datasets | No | "The dataset we used is from an Ads Click-through Rate (CTR) prediction problem [1] collected from an industrial search engine." No link or access information for this dataset is provided.
Dataset Splits | No | "We collect 30 days of data and split them into training and test set chronologically. The data from the first 20 days are used as the training set and rest 10 days are used as test set." No explicit validation split is mentioned.
Hardware Specification | No | "We run the experiment in a shared cluster with tens of thousands of machines. Each machine has up to 12 concurrent vertices. A vertex is generally a map or reduce step with an allocation of 2 cores and 6 GB memory." No specific CPU or GPU model numbers are given.
Software Dependencies | No | The paper mentions a "map-reduce environment" but does not name any software package with a version number.
Experiment Setup | Yes | "We set the historical state length m = 10 and enforce an L1 [20] regularizer to avoid overfitting and achieve sparsity. The regularizer parameter is tuned following the approach in [18]." (The sketches below show where m enters the two-loop recursion.)
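
For reference, here is a minimal single-machine NumPy sketch of the paper's Algorithm 2, which is the standard L-BFGS two-loop recursion. The function and variable names are mine, not the paper's, and the line search and state updates around it (Algorithm 1) are omitted.

```python
import numpy as np

def two_loop_recursion(grad, s_list, y_list):
    """Standard L-BFGS two-loop recursion (the form of the paper's
    Algorithm 2): returns r = H_k @ grad, where H_k is the implicit
    inverse-Hessian approximation built from the stored correction
    pairs s_i = x_{i+1} - x_i and y_i = grad_{i+1} - grad_i."""
    if not s_list:                      # no curvature pairs yet: H_0 = I
        return grad.copy()
    q = grad.copy()
    rhos = [1.0 / np.dot(y, s) for s, y in zip(s_list, y_list)]
    alphas = []
    # First loop: newest pair to oldest.
    for s, y, rho in zip(reversed(s_list), reversed(y_list), reversed(rhos)):
        alpha = rho * np.dot(s, q)
        alphas.append(alpha)
        q -= alpha * y
    # Initial scaling gamma = (s_last . y_last) / (y_last . y_last).
    r = (np.dot(s_list[-1], y_list[-1]) / np.dot(y_list[-1], y_list[-1])) * q
    # Second loop: oldest pair to newest.
    for (s, y, rho), alpha in zip(zip(s_list, y_list, rhos), reversed(alphas)):
        beta = rho * np.dot(y, r)
        r += (alpha - beta) * s
    return r  # the search direction is -r
```

Every step of this recursion touches length-d vectors, which is exactly what becomes expensive when d runs to billions of variables.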
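
The paper's Algorithm 3 reformulates this recursion so that it manipulates only scalar coefficients over 2m+1 fixed base vectors (the m stored s vectors, the m stored y vectors, and the current gradient), given their pairwise dot products. The sketch below is a reconstruction of that idea under stated assumptions (0-based indexing; the (2m+1) x (2m+1) dot-product matrix D is assumed to have been computed already, in the paper's setting via MapReduce); it is not the paper's exact listing.

```python
import numpy as np

def vector_free_two_loop(D, m):
    """Coefficient-only two-loop recursion in the spirit of the paper's
    Algorithm 3 (Vector-free L-BFGS).  D[i, j] holds b_i . b_j for the
    base vectors
        b_0 .. b_{m-1}  = s_{k-m} .. s_{k-1}   (oldest to newest),
        b_m .. b_{2m-1} = y_{k-m} .. y_{k-1},
        b_{2m}          = grad f(x_k).
    Returns delta with H_k @ grad == sum_j delta[j] * b_j, so the
    direction can be assembled distributedly; no full-dimensional
    vector arithmetic happens inside this recursion."""
    delta = np.zeros(2 * m + 1)
    delta[2 * m] = 1.0                      # invariant starts as q = grad
    alphas = np.zeros(m)
    for i in range(m - 1, -1, -1):          # newest pair to oldest
        rho = 1.0 / D[i, m + i]             # 1 / (s_i . y_i)
        alphas[i] = rho * (delta @ D[:, i])     # rho * (s_i . q)
        delta[m + i] -= alphas[i]               # q -= alpha_i * y_i
    # Initial scaling gamma = (s_last . y_last) / (y_last . y_last).
    delta *= D[m - 1, 2 * m - 1] / D[2 * m - 1, 2 * m - 1]
    for i in range(m):                      # oldest pair to newest
        rho = 1.0 / D[i, m + i]
        beta = rho * (delta @ D[:, m + i])      # rho * (y_i . r)
        delta[i] += alphas[i] - beta            # r += (alpha_i - beta) * s_i
    return delta
```

With the paper's setting m = 10, D is only 21 x 21, so the recursion costs O(m^2) scalar operations instead of O(md) vector operations on billion-dimensional data; expanding sum_j delta[j] * b_j should reproduce the r of the dense recursion above, which is the mathematical equivalence the paper proves.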