Mixed Robust/Average Submodular Partitioning: Fast Algorithms, Guarantees, and Applications
Authors: Kai Wei, Rishabh K. Iyer, Shengjie Wang, Wenruo Bai, Jeff A. Bilmes
NeurIPS 2015
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We empirically demonstrate the efficacy of our algorithms on real-world problems involving data partitioning for distributed optimization (of convex and deep neural network objectives), and also purely unsupervised image segmentation. |
| Researcher Affiliation | Academia | ¹Department of Electrical Engineering, University of Washington; ²Department of Computer Science, University of Washington. {kaiwei, rkiyer, wangsj, wrbai, bilmes}@u.washington.edu |
| Pseudocode | Yes | Algorithm 1: GREEDMAX, Algorithm 2: GREEDSAT, Algorithm 3: LOVASZROUND, Algorithm 4: GREEDMIN, Algorithm 5: MMIN, Algorithm 6: MMAX. |
| Open Source Code | No | The paper does not provide any explicit statement about making source code available or a link to a code repository. |
| Open Datasets | Yes | The evaluation task is text categorization on the 20 Newsgroups data set, which consists of 18,774 articles... We test on two tasks: 1) handwritten digit recognition on the MNIST database, which consists of 60,000 training and 10,000 test samples; 2) phone classification on the TIMIT data, which has 1,124,823 training and 112,487 test samples. We test the efficacy of Problem 2 on unsupervised image segmentation over the GrabCut data set (30 color images and their ground truth foreground/background labels). |
| Dataset Splits | No | The paper specifies training and test set sizes for some datasets (e.g., MNIST and TIMIT) but does not mention any explicit validation splits or their sizes. |
| Hardware Specification | No | The paper mentions 'distributed machine learning' and 'distributed deep neural network training' but does not provide any specific details about the hardware used, such as CPU or GPU models, or memory specifications. |
| Software Dependencies | No | The paper mentions 'ADMM implemented as [3]' and refers to an 'averaging stochastic gradient descent scheme, similar to the one in [24]', but does not list any specific software dependencies with version numbers (e.g., programming languages, libraries, or frameworks). |
| Experiment Setup | Yes | We run 10 instances of random partitioning on the training data as a baseline. A 4-layer DNN model is applied to the MNIST experiment, and we train a 5-layer DNN for TIMIT. For both experiments the submodular partitioning is obtained by solving the homogeneous case of Problem 1 (λ = 0) using GREEDMAX on a form of clustered facility location... The submodular partitioning for each image is obtained by solving the homogeneous case of Problem 2 (λ = 0.8) using a modified variant of GREEDMIN on the facility location function. |
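To make the Pseudocode and Experiment Setup rows concrete, here is a minimal sketch of a GREEDMAX-style greedy on a facility-location objective, matching the homogeneous (λ = 0) setting the paper reports for its data-partitioning experiments: repeatedly hand the currently worst-off block the unassigned element with the largest marginal gain. The function names, the RBF kernel, the toy data, and the naive recomputation of marginal gains are illustrative assumptions of this sketch, not the authors' code (the report notes no implementation was released); the paper's GREEDMAX also carries approximation guarantees and admits lazy-evaluation speedups not shown here. GREEDMIN for the min-max objective of Problem 2 is analogous in spirit but favors small marginal gains to balance block loads.

```python
import numpy as np

def facility_location(sim, block):
    """Facility-location value f(A) = sum_v max_{a in A} sim[v, a], with f(empty) = 0."""
    if not block:
        return 0.0
    return float(sim[:, block].max(axis=1).sum())

def greedmax_partition(sim, m):
    """GREEDMAX-style greedy for the homogeneous max-min partition:
    at each step, give the block with the minimum current value the
    unassigned element whose marginal gain for that block is largest."""
    n = sim.shape[0]
    blocks = [[] for _ in range(m)]
    values = np.zeros(m)                  # invariant: values[j] == f(blocks[j])
    unassigned = set(range(n))
    while unassigned:
        j = int(values.argmin())          # currently worst-off block
        best_v, best_gain = None, -np.inf
        for v in unassigned:              # exact marginal gain f(A | {v}) - f(A)
            gain = facility_location(sim, blocks[j] + [v]) - values[j]
            if gain > best_gain:
                best_v, best_gain = v, gain
        blocks[j].append(best_v)
        values[j] += best_gain
        unassigned.remove(best_v)
    return blocks

if __name__ == "__main__":
    # Toy run: partition 12 random 2-D points into 3 blocks
    # under an RBF similarity kernel (an assumed choice for illustration).
    rng = np.random.default_rng(0)
    X = rng.normal(size=(12, 2))
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
    print(greedmax_partition(np.exp(-d2), m=3))
```

Because facility location is monotone submodular, every marginal gain is nonnegative and the `values[j]` bookkeeping stays exact, so the inner loop never needs to re-evaluate whole blocks; in practice one would cache gains lazily, as the paper's fast variants do.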