Projection-Free Online Optimization with Stochastic Gradient: From Convexity to Submodularity

Authors: Lin Chen, Christopher Harshaw, Hamed Hassani, Amin Karbasi

ICML 2018 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | In this section, we test our online algorithms for monotone continuous DR-submodular and convex optimization on both real-world and synthetic data sets. We find that our algorithms outperform most baselines, including projected gradient descent, when supplied with stochastic gradient estimates.
Researcher Affiliation | Academia | (1) Yale Institute for Network Science, Yale University, New Haven, CT, USA; (2) Department of Electrical Engineering, Yale University; (3) Department of Computer Science, Yale University; (4) Department of Electrical and Systems Engineering, University of Pennsylvania, Philadelphia, PA, USA. Correspondence to: Lin Chen <lin.chen@yale.edu>.
Pseudocode | Yes | Algorithm 1 (Meta-Frank-Wolfe); Algorithm 2 (One-Shot Frank-Wolfe); Algorithm 3 (Meta-Frank-Wolfe for online discrete submodular maximization). A brief Frank-Wolfe-style sketch appears after this table.
Open Source Code | No | All code was written in the Julia programming language and tested on a Macintosh desktop with an Intel Processor i7 with 16 GB of RAM. No parts of the code were optimized past basic Julia usage. (The paper states that code was written but does not provide access or explicitly state it's open source.)
Open Datasets | Yes | Joke Recommendations (Continuous): The first set of experiments is to optimize a sequence of continuous facility location objectives on the Jester dataset (Goldberg et al., 2001). ... Topic Summarization: We applied latent Dirichlet allocation to the corpus of Reuters-21578, Distribution 1.0, set the number of topics to 10, and extracted the topic distribution of each news document. (A topic-extraction sketch follows this table.)
Dataset Splits | No | The paper mentions splitting users into 'disjoint batches B1, B2, ..., BT' and sampling 'T batches of news documents', which implies sequential processing suited to online optimization. However, it does not specify traditional train/validation/test splits with percentages or sample counts for model training and evaluation, as is typical in offline machine learning.
Hardware Specification | Yes | All code was written in the Julia programming language and tested on a Macintosh desktop with an Intel Processor i7 with 16 GB of RAM.
Software Dependencies | No | All code was written in the Julia programming language and tested on a Macintosh desktop with an Intel Processor i7 with 16 GB of RAM. No parts of the code were optimized past basic Julia usage. (Only 'Julia programming language' is mentioned, without a specific version number or other software dependencies with versions.)
Experiment Setup | Yes | We set the constraint set to {x ∈ [0, 1]^J : 1ᵀx ≤ 1} and choose B = 5. ... We set the batch size B to 40 and we recommend 10 jokes for users. ... The constraint set is {x ∈ [0, 1]^50 : 1ᵀx ≤ 45}. ... We set all edge capacities to 1 and cost functions are of the form f(x) = ∑_{e∈E} w_e·x(e)² where w_e ∼ Unif[100, 120]. ... M is a rank-10 matrix with m = n = 50, and B = 100. (An instance-generation sketch for the quadratic-cost setup follows this table.)
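
The pseudocode row above lists Meta-Frank-Wolfe and One-Shot Frank-Wolfe, but the table does not reproduce them. The following is a minimal Python sketch of the kind of projection-free, stochastic-gradient Frank-Wolfe loop those algorithms are built around; it is not the paper's exact Algorithm 1. In particular, the `lmo_cardinality` oracle, the averaging constant `eta`, and the toy noisy-linear objective are illustrative assumptions, and where Algorithm 1 queries a separate no-regret online linear optimizer at each inner step k, this sketch simply maximizes a running gradient estimate over the constraint set.

```python
import numpy as np

def lmo_cardinality(g, b):
    """Linear maximization oracle over {x in [0,1]^d : 1'x <= b}:
    puts weight 1 on the b coordinates with the largest positive entries of g."""
    v = np.zeros_like(g)
    top = np.argsort(-g)[:b]
    v[top] = (g[top] > 0).astype(float)
    return v

def meta_frank_wolfe_sketch(T, K, d, b, stoch_grad, eta=0.5):
    """Hedged sketch of a Meta-Frank-Wolfe-style loop: K projection-free steps
    per round, each driven by an averaged stochastic gradient estimate.
    The paper's Algorithm 1 instead lets a separate online linear optimization
    oracle choose the direction at each inner step k."""
    d_bar = [np.zeros(d) for _ in range(K)]   # running gradient estimate per inner step
    plays = []
    for t in range(T):
        x = np.zeros(d)
        for k in range(K):
            g = stoch_grad(t, x)                # unbiased stochastic gradient of f_t at x
            d_bar[k] = (1 - eta) * d_bar[k] + eta * g
            v = lmo_cardinality(d_bar[k], b)    # linear step; no projection needed
            x = x + v / K                       # Frank-Wolfe averaging update
        plays.append(x)
    return plays

# Toy usage: noisy linear rewards f_t(x) = <w, x>, which are monotone DR-submodular.
rng = np.random.default_rng(0)
w = rng.uniform(0.0, 1.0, size=20)
noisy_grad = lambda t, x: w + 0.1 * rng.standard_normal(20)
xs = meta_frank_wolfe_sketch(T=50, K=10, d=20, b=5, stoch_grad=noisy_grad)
print(np.round(xs[-1], 2))
```

The property the sketch is meant to illustrate is the projection-free structure: each update is a single linear maximization over the constraint set followed by an averaging step, so no projection is ever computed.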
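For the topic summarization dataset, the paper reports running latent Dirichlet allocation with 10 topics on Reuters-21578 and using the per-document topic distributions. The snippet below is a rough sketch of that preprocessing step using scikit-learn on placeholder documents; the original experiments were implemented in Julia, and the toy corpus and vectorizer settings here are assumptions for illustration only.

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.decomposition import LatentDirichletAllocation

# Placeholder corpus; the paper uses the Reuters-21578, Distribution 1.0 news documents.
docs = [
    "oil prices rose sharply after the opec report",
    "the central bank raised interest rates again this quarter",
    "grain exports fell amid ongoing trade disputes",
]

# Bag-of-words counts, then a 10-topic LDA model as described in the table row above.
counts = CountVectorizer(stop_words="english").fit_transform(docs)
lda = LatentDirichletAllocation(n_components=10, random_state=0)
topic_dist = lda.fit_transform(counts)   # one topic distribution per document
print(topic_dist.shape)                  # (num_documents, 10)
```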
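The experiment-setup row describes a synthetic instance with unit edge capacities and quadratic costs f(x) = ∑_{e∈E} w_e·x(e)² with w_e ∼ Unif[100, 120]. A minimal sketch of generating such an instance is shown below; the number of edges and the sampled feasible point are illustrative choices, not values taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical instance matching the description above: unit edge capacities and
# quadratic edge costs f(x) = sum_e w_e * x_e^2 with w_e ~ Unif[100, 120].
# The number of edges (50) is an illustrative choice, not the paper's value.
num_edges = 50
w = rng.uniform(100.0, 120.0, size=num_edges)
capacity = np.ones(num_edges)

def cost(x):
    return float(np.sum(w * x ** 2))

def grad(x):
    # Exact gradient; the paper's online setting would feed noisy estimates of this.
    return 2.0 * w * x

x = np.minimum(rng.uniform(0.0, 1.0, size=num_edges), capacity)  # respects the unit capacities
print(round(cost(x), 2), grad(x)[:3])
```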