Learning Mixtures of Linear Classifiers
Authors: Yuekai Sun, Stratis Ioannidis, Andrea Montanari
ICML 2014 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We conduct computational experiments to validate the performance of SPECTRALMIRROR on subspace estimation, prediction, and clustering. We generate synthetic data using k = 2, with profiles u_ℓ ~ N(0, I), ℓ = 1, 2, and mixture weights p_ℓ sampled uniformly at random from the k-dimensional simplex. Features are also Gaussian: X_i ~ N(0, I), i = 1, ..., n; labels generated by the ℓ-th classifier are given by y_i = sgn(u_ℓ^T X_i), i = 1, ..., n. Figures 2(a) and 2(b) show that, despite the error in Û, using K-NN on this subspace outperforms K-NN on the ambient space. Figure 4(a) measures the statistical efficiency of EM over the estimated subspace versus EM over the ambient space, when EM is initialized close to the true profiles. The second set of experiments, illustrated in Figure 4(b), aims to capture the additional improvement due to the reduction in the number of local minima in the reduced space. |
| Researcher Affiliation | Collaboration | Yuekai Sun (YUEKAI@STANFORD.EDU), Institute for Computational and Mathematical Engineering, Stanford University, 475 Via Ortega, Stanford, CA 94305; Stratis Ioannidis (STRATIS.IOANNIDIS@TECHNICOLOR.COM), Technicolor, 175 S San Antonio Rd, Los Altos, CA 94022; Andrea Montanari (MONTANARI@STANFORD.EDU), Dept. of Electrical Engineering & Dept. of Statistics, Stanford University, 350 Serra Mall, Stanford, CA 94305 |
| Pseudocode | Yes | Algorithm 1 SPECTRALMIRROR. Input: pairs (X_i, Y_i), i ∈ [n]. Output: subspace estimate Û. 1: μ̂ ← (1/⌊n/2⌋) Σ_{i=1}^{⌊n/2⌋} X_i; 2: Σ̂ ← (1/⌊n/2⌋) Σ_{i=1}^{⌊n/2⌋} (X_i − μ̂)(X_i − μ̂)^T; 3: r̂ ← (1/⌊n/2⌋) Σ_{i=1}^{⌊n/2⌋} Y_i Σ̂^{-1}(X_i − μ̂); 4: for each i ∈ {⌊n/2⌋+1, ..., n}: Z_i ← Y_i sgn⟨r̂, X_i⟩; 5: Q̂ ← (1/⌈n/2⌉) Σ_{i=⌊n/2⌋+1}^{n} Z_i Σ̂^{-1/2}(X_i − μ̂)(X_i − μ̂)^T Σ̂^{-1/2}; 6: find the eigendecomposition Σ_{ℓ=1}^{d} λ_ℓ w_ℓ w_ℓ^T of Q̂; 7: let λ_(1), ..., λ_(k) be the k eigenvalues furthest from the median; 8: Û ← span(Σ̂^{-1/2} w_(1), ..., Σ̂^{-1/2} w_(k)). (A runnable NumPy sketch of this algorithm follows the table.) |
| Open Source Code | No | The paper does not provide an explicit statement or link for open-source code availability for the methodology described. |
| Open Datasets | No | The paper states: "We generate synthetic data using k = 2, with profiles u_ℓ ~ N(0, I), ℓ = 1, 2, and mixture weights p_ℓ sampled uniformly at random from the k-dimensional simplex. Features are also Gaussian: X_i ~ N(0, I), i = 1, ..., n; labels generated by the ℓ-th classifier are given by y_i = sgn(u_ℓ^T X_i), i = 1, ..., n." This indicates synthetic data generation rather than the use of a publicly available dataset with concrete access information. |
| Dataset Splits | No | The paper describes splitting the dataset into two halves for the algorithm's internal processing ("We split the dataset into two halves. Using the first half (i.e., all X_i with 1 ≤ i ≤ ⌊n/2⌋), we construct estimates..."). However, it does not provide training/validation/test splits in the usual machine-learning sense for model evaluation, nor does it explicitly mention a validation set with specific proportions or usage. |
| Hardware Specification | No | The paper does not provide any specific hardware details (e.g., GPU/CPU models, memory specifications) used for running its experiments. It focuses on computational complexity in terms of operations (e.g., "O(d^3) operations") rather than on physical hardware. |
| Software Dependencies | No | The paper does not provide specific software names with version numbers. It mentions methods such as the EM algorithm, K-NN, and a logistic model, but not the specific software implementations or versions used. |
| Experiment Setup | Yes | The paper specifies details for the K-NN experiments: "We record the average root mean squared error between predicted and expected labels over the 25 runs. Figures 3(a) and 3(b) show that, despite the error in Û, using K-NN on this subspace outperforms K-NN on the ambient space. For each n, we repeat this procedure 25 times with K = √n and K = log n." For EM, it specifies initialization: "(a) initialize EM close to the true profiles u_ℓ, ℓ ∈ [k], and (b) randomly initialize EM and choose the best set of profiles from 30 runs." (A K-NN comparison sketch follows the table.) |
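
Below is a minimal NumPy sketch of the synthetic-data generation and of Algorithm 1 (SPECTRALMIRROR) as quoted in the table above. The paper does not release code, so this is an illustrative reconstruction only: the function names (`generate_mixture`, `spectral_mirror`), the RNG seed, and numerical details such as how the whitening matrix Σ̂^{-1/2} is formed (and the assumption d < n/2 so Σ̂ is invertible) are our choices, not the authors' implementation.

```python
import numpy as np

rng = np.random.default_rng(0)  # seed chosen for reproducibility of the sketch

def generate_mixture(n, d, k=2):
    """Synthetic data as described in the paper: Gaussian profiles and features,
    mixture weights drawn uniformly from the k-simplex, labels y_i = sgn(u_l^T X_i)."""
    U = rng.standard_normal((k, d))                   # profiles u_1, ..., u_k ~ N(0, I)
    p = rng.dirichlet(np.ones(k))                     # mixture weights on the simplex
    X = rng.standard_normal((n, d))                   # features X_i ~ N(0, I)
    labels = rng.choice(k, size=n, p=p)               # which classifier labels each point
    Y = np.sign(np.einsum("ij,ij->i", X, U[labels]))  # y_i = sgn(u_l^T X_i)
    return X, Y, U

def spectral_mirror(X, Y, k=2):
    """Sketch of Algorithm 1 (SpectralMirror); step numbers refer to the
    pseudocode row in the table. Linear-algebra details are assumptions."""
    n, d = X.shape
    m = n // 2
    X1, Y1, X2, Y2 = X[:m], Y[:m], X[m:], Y[m:]
    mu = X1.mean(axis=0)                              # step 1: empirical mean
    Sigma = np.cov(X1.T, bias=True)                   # step 2: empirical covariance
    Sigma_inv = np.linalg.inv(Sigma)
    r = (Y1[:, None] * ((X1 - mu) @ Sigma_inv)).mean(axis=0)  # step 3: mirror direction
    Z = Y2 * np.sign(X2 @ r)                          # step 4: reflected labels Z_i
    evals, evecs = np.linalg.eigh(Sigma)              # whitening matrix Sigma^{-1/2}
    Sigma_inv_sqrt = evecs @ np.diag(evals ** -0.5) @ evecs.T
    W = (X2 - mu) @ Sigma_inv_sqrt
    Q = (W * Z[:, None]).T @ W / (n - m)              # step 5: weighted second moment
    lam, V = np.linalg.eigh(Q)                        # step 6: eigendecomposition
    idx = np.argsort(np.abs(lam - np.median(lam)))[-k:]  # step 7: furthest from median
    U_hat = Sigma_inv_sqrt @ V[:, idx]                # step 8: un-whitened eigenvectors
    return np.linalg.qr(U_hat)[0]                     # orthonormal basis for the subspace
```

On data drawn as above, the column span of the returned matrix should approximate span(u_1, ..., u_k) up to the estimation error the paper analyzes.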
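
The experiment-setup row compares K-NN on the ambient space with K-NN on the estimated subspace. The following continuation of the sketch above (it reuses `generate_mixture` and `spectral_mirror`) shows one way such a comparison could be run; the RMSE-style scoring and the choice K ≈ log n follow the quoted description, while the sample sizes, the train/test split, and the odd-K tie-breaking are assumptions.

```python
def knn_predict(X_train, y_train, X_test, K):
    """Plain K-NN by majority vote over Euclidean neighbours (illustrative)."""
    d2 = ((X_test ** 2).sum(1)[:, None] + (X_train ** 2).sum(1)[None, :]
          - 2.0 * X_test @ X_train.T)                 # squared pairwise distances
    nearest = np.argsort(d2, axis=1)[:, :K]           # indices of the K closest points
    return np.sign(y_train[nearest].sum(axis=1))      # majority vote over +/-1 labels

X, Y, _ = generate_mixture(n=2000, d=50, k=2)
X_train, y_train, X_test, y_test = X[:1000], Y[:1000], X[1000:], Y[1000:]
U_hat = spectral_mirror(X_train, y_train, k=2)        # estimated k-dimensional subspace
K = max(1, int(np.log(len(X_train))) | 1)             # K ~ log n, forced odd to avoid ties

rmse = lambda a, b: float(np.sqrt(np.mean((a - b) ** 2)))
pred_ambient = knn_predict(X_train, y_train, X_test, K)
pred_subspace = knn_predict(X_train @ U_hat, y_train, X_test @ U_hat, K)
print("ambient RMSE: ", rmse(pred_ambient, y_test))
print("subspace RMSE:", rmse(pred_subspace, y_test))
```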