Projection-Free Online Optimization with Stochastic Gradient: From Convexity to Submodularity
Authors: Lin Chen, Christopher Harshaw, Hamed Hassani, Amin Karbasi
ICML 2018 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In this section, we test our online algorithms for monotone continuous DR-submodular and convex optimization on both real-world and synthetic data sets. We find that our algorithms outperform most baselines, including projected gradient descent, when supplied with stochastic gradient estimates. |
| Researcher Affiliation | Academia | Yale Institute for Network Science, Yale University, New Haven, CT, USA; Department of Electrical Engineering, Yale University; Department of Computer Science, Yale University; Department of Electrical and Systems Engineering, University of Pennsylvania, Philadelphia, PA, USA. Correspondence to: Lin Chen <lin.chen@yale.edu>. |
| Pseudocode | Yes | Algorithm 1 Meta-Frank-Wolfe; Algorithm 2 One-Shot Frank-Wolfe; Algorithm 3 Meta-Frank-Wolfe for online discrete submodular maximization |
| Open Source Code | No | All code was written in the Julia programming language and tested on a Macintosh desktop with an Intel Processor i7 with 16 GB of RAM. No parts of the code were optimized past basic Julia usage. (The paper states that code was written but does not provide access or explicitly state it's open source.) |
| Open Datasets | Yes | Joke Recommendations (Continuous) The first set of experiments is to optimize a sequence of continuous facility location objectives on the Jester dataset (Goldberg et al., 2001). ... Topic Summarization We applied the latent Dirichlet allocation to the corpus of Reuters-21578, Distribution 1.0, set the number of topics to 10, and extracted the topic distribution of each news document. |
| Dataset Splits | No | The paper mentions splitting users into 'disjoint batches B1, B2, ..., BT' and sampling 'T batches of news documents', which implies sequential processing suitable for online optimization. However, it does not specify traditional train/validation/test splits with percentages or sample counts for model training and evaluation, as is typical for offline machine learning. |
| Hardware Specification | Yes | All code was written in the Julia programming language and tested on a Macintosh desktop with an Intel Processor i7 with 16 GB of RAM. |
| Software Dependencies | No | All code was written in the Julia programming language and tested on a Macintosh desktop with an Intel Processor i7 with 16 GB of RAM. No parts of the code were optimized past basic Julia usage. (Only 'Julia programming language' is mentioned, without a specific version number or other software dependencies with versions.) |
| Experiment Setup | Yes | We set the constraint set to {x ∈ [0, 1]^J : 1ᵀx ≤ 1} and choose B = 5. ... We set the batch size B to 40 and we recommend 10 jokes for users. ... The constraint set is {x ∈ [0, 1]^50 : 1ᵀx ≤ 45}. ... We set all edge capacities to 1 and cost functions are of the form f(x) = Σ_{e∈E} w_e x(e)² where w_e ∼ Unif[100, 120]. ... M is a rank 10 matrix with m = n = 50, and B = 100. |
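
For context on the projection-free updates referenced in the Pseudocode and Experiment Setup rows, below is a minimal sketch of a single Frank-Wolfe step over a constraint set of the form {x ∈ [0, 1]^d : 1ᵀx ≤ b}, as used in the joke-recommendation and topic-summarization experiments. This is not the authors' Julia code: the helper names (`lmo_box_budget`, `frank_wolfe_step`), the step size `eta`, and the toy gradient are illustrative assumptions, and the step shown is the convex-combination update of the convex case rather than the full Meta-Frank-Wolfe loop.

```julia
# Minimal sketch (assumed, not the authors' released code) of one Frank-Wolfe
# step over the constraint set {x in [0,1]^d : sum(x) <= b}.
using Random

# Linear maximization oracle: argmax_{v in [0,1]^d, sum(v) <= b} <g, v>.
# The maximizer places weight 1 on the (at most) b largest positive coordinates of g.
function lmo_box_budget(g::AbstractVector{<:Real}, b::Integer)
    v = zeros(length(g))
    for j in sortperm(g; rev=true)[1:min(b, length(g))]
        g[j] > 0 || break          # remaining coordinates cannot improve <g, v>
        v[j] = 1.0
    end
    return v
end

# One Frank-Wolfe step: move from x toward the oracle vertex v with step size eta.
# The iterate stays feasible because it is a convex combination of feasible points.
function frank_wolfe_step(x, g, b, eta)
    v = lmo_box_budget(g, b)
    return x .+ eta .* (v .- x)
end

# Toy usage: d = 50 coordinates with budget b = 45, mirroring the topic
# summarization constraint set quoted above; `g` stands in for a stochastic
# gradient estimate of the objective.
Random.seed!(0)
x = zeros(50)
g = randn(50)
x = frank_wolfe_step(x, g, 45, 0.1)
```

The linear maximization oracle is what makes the method projection-free: each iterate is a convex combination of feasible points, so feasibility is maintained without ever projecting onto the polytope.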