Divide-and-Conquer Learning by Anchoring a Conical Hull

Authors: Tianyi Zhou, Jeff A. Bilmes, Carlos Guestrin

NeurIPS 2014

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We apply our method to GMM, HMM, LDA, NMF and subspace clustering, then show its competitive performance and scalability over other methods on large datasets. Comprehensive experiments and comparison can be found in 5. [...] 5 Experiments [...] DCA for Non-negative Matrix Factorization on Synthetic Data. The experimental comparison results are shown in Figure 3. [...] DCA for Gaussian Mixture Model on CMU-PIE and YALE Face Dataset. The experimental comparison results are shown in Figure 4. [...] DCA for Hidden Markov Model on Stock Price and Motion Capture Data. The experimental comparison results for stock price modeling and motion segmentation are shown in Figure 5 and Table 2, respectively.
Researcher Affiliation | Academia | Computer Science & Engineering, Electrical Engineering, University of Washington, Seattle {tianyizh, bilmes, guestrin}@u.washington.edu
Pseudocode | Yes | Algorithm 1 DCA(X, Y, k, M). Input: two sets of points (rows) X ∈ ℝ^{n×p} and Y ∈ ℝ^{m×p} in matrix form (ref. Table 1 to see X and Y for different models), number of latent factors/variables k, random matrix ensemble M. Output: anchor set A ⊆ [m] such that ∀i ∈ [n], X_i ∈ cone(Y_A). Divide step (in parallel): for t = 1, …, s := O(k log k) do: randomly draw a matrix Φ ∈ ℝ^{p×d} from M; solve a sub-problem such as A_t = MCH(XΦ, YΦ) by any solver, e.g., (10); end for. Conquer step: ∀i ∈ [m], compute ĝ(Y_i) = (1/s) Σ_{t=1}^{s} 1_{A_t}(Y_i); return A as the index set of the k points with the largest ĝ(Y_i).
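The divide-and-conquer structure of the quoted pseudocode can be sketched as follows. Everything here is illustrative, not the paper's implementation: the sub-solver MCH (the paper's problem (10)) is replaced by a hypothetical extremal-angle heuristic in a 2-D projection that ignores the projected X, so only the random-projection "divide" and voting "conquer" steps are faithfully reproduced.

```python
import numpy as np

def sub_anchor_2d(X2, Y2):
    # Hypothetical stand-in for the paper's sub-solver MCH (their eq. (10)):
    # in a 2-D projection, return indices of the Y points whose rays have
    # extremal angles, i.e. lie on the boundary of the projected cone.
    # (A real MCH solver would also use X2; this simplification does not.)
    ang = np.arctan2(Y2[:, 1], Y2[:, 0])
    return {int(np.argmin(ang)), int(np.argmax(ang))}

def dca(X, Y, k, s=None, d=2, seed=0):
    # Sketch of Algorithm 1 (DCA): solve s random-projection sub-problems
    # (divide), then vote and keep the top-k anchor candidates (conquer).
    rng = np.random.default_rng(seed)
    p, m = X.shape[1], Y.shape[0]
    if s is None:
        s = int(np.ceil(k * np.log(max(k, 2)))) + 1   # s = O(k log k)
    votes = np.zeros(m)
    for _ in range(s):
        Phi = rng.standard_normal((p, d))             # draw Φ from Gaussian ensemble M
        for i in sub_anchor_2d(X @ Phi, Y @ Phi):     # A_t = MCH(XΦ, YΦ)
            votes[i] += 1.0
    g_hat = votes / s                                 # ĝ(Y_i) = (1/s) Σ_t 1_{A_t}(Y_i)
    return np.argsort(-g_hat)[:k]                     # k points with largest ĝ(Y_i)
```

The sub-problems in the loop are independent, which is what makes the divide step embarrassingly parallel in the paper.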
Open Source Code | No | The paper does not provide a statement or a link indicating that the source code for the described methodology is open-source or publicly available.
Open Datasets | Yes | We apply our method to GMM, HMM, LDA, NMF and subspace clustering, then show its competitive performance and scalability over other methods on large datasets. [...] DCA for Gaussian Mixture Model on CMU-PIE and YALE Face Dataset. [...] DCA for Latent Dirichlet Allocation on NIPS1-17 Dataset. [...] DCA for Subspace Clustering on COIL-100 Dataset.
Dataset Splits | Yes | We randomly split the raw pixel features into 3 groups, each associates to a view in our multi-view model. [...] s13s29(39/63) means that we split sequence 29 of subject 13 into sub-sequences, each has 63 frames, in which the first 39 ones are for training and the rest are for test. [...] we randomly selected 70% documents for training and the rest 30% is used for test.
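The random 70/30 document split quoted above can be reproduced in a few lines; the function name and seed below are illustrative choices, not specified by the paper.

```python
import random

def split_docs(doc_ids, train_frac=0.7, seed=0):
    # Shuffle document ids deterministically, then take the first 70%
    # for training and the remaining 30% for test, as described for the
    # NIPS1-17 LDA experiment. Seed/name are hypothetical.
    ids = list(doc_ids)
    random.Random(seed).shuffle(ids)
    cut = int(round(train_frac * len(ids)))
    return ids[:cut], ids[cut:]
```

Fixing a seed makes the split reproducible across runs, which the paper itself does not state it did.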
Hardware Specification | No | The paper does not specify any particular hardware (CPU, GPU, memory, etc.) used for running the experiments.
Software Dependencies | No | The paper does not list any specific software dependencies with version numbers (e.g., Python 3.x, PyTorch 1.x, specific libraries or solvers with versions).
Experiment Setup | Yes | DCA with different number of sub-problems shows slightly less accuracy than greedy algorithms, but the difference is acceptable. Considering its significant acceleration, DCA offers an advantageous trade-off. [...] By increasing the number of sub-problems, the accuracy of DCA improves. [...] DCA-HMM (s=9), DCA-HMM (s=26), DCA-HMM (s=52), DCA-HMM (s=78) [...] DCA LDA(s=801) DCA LDA(s=2001) DCA LDA(s=3336) DCA LDA(s=5070)