Provable Tensor Factorization with Missing Data
Authors: Prateek Jain, Sewoong Oh
NeurIPS 2014
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Simulations suggest that the dependence of the sample size on the dimensionality n is indeed tight. Empirical results: Theorem 1.1 guarantees exact recovery when p ≥ C r^5 (log n)^4 / n^{3/2}. Numerical experiments in Figure 1 show that the average recovery rate converges to a universal curve over α, where p = α r^{1/2} ln n / ((1 − ρ) n^{3/2}). Our bound is tight in its dependence on n up to a poly-logarithmic factor, but is loose in its dependence on the rank r. |
| Researcher Affiliation | Collaboration | Prateek Jain Microsoft Research Bangalore, India prajain@microsoft.com Sewoong Oh Dept. of Industrial and Enterprise Systems Engineering University of Illinois at Urbana-Champaign Urbana, IL 61801 swoh@illinois.edu |
| Pseudocode | Yes | Algorithm 1 (Alternating Minimization for Tensor Completion). 1: Input: P_Ω(T), Ω, r, τ, µ. 2: Initialize [(u^0_1, σ_1), (u^0_2, σ_2), ..., (u^0_r, σ_r)] = RTPM(P_Ω(T), r) (RTPM of [1]). 3: [u_1, u_2, ..., u_r] = Threshold([u^0_1, u^0_2, ..., u^0_r], µ) (clipping scheme of [4]). 4: for t = 1, 2, ..., τ do 5: /* outer loop */ 6: for q = 1, 2, ..., r do 7: /* inner loop */ 8: û^{t+1}_q = arg min_u ‖P_Ω(T − u ⊗ u_q ⊗ u_q − Σ_{ℓ≠q} σ_ℓ u_ℓ ⊗ u_ℓ ⊗ u_ℓ)‖²_F. 9: σ^{t+1}_q = ‖û^{t+1}_q‖_2. 10: u^{t+1}_q = û^{t+1}_q / ‖û^{t+1}_q‖_2. 11: end for. 12: [u_1, u_2, ..., u_r] ← [u^{t+1}_1, u^{t+1}_2, ..., u^{t+1}_r]. 13: [σ_1, σ_2, ..., σ_r] ← [σ^{t+1}_1, σ^{t+1}_2, ..., σ^{t+1}_r]. 14: end for. 15: Output: T̂ = Σ_{q∈[r]} σ_q (u_q ⊗ u_q ⊗ u_q). |
| Open Source Code | Yes | A MATLAB implementation of Algorithm 1 used to run the experiments is available at http://web.engr.illinois.edu/~swoh/software/optspace. |
| Open Datasets | No | The paper describes generating synthetic data for experiments ("We generate orthogonal matrices U = [u_1, ..., u_r] ∈ R^{n×r} uniformly at random...") but does not provide access information (link, citation, or repository) for a publicly available or open dataset that was used. It only describes the data-generation process. |
| Dataset Splits | No | The paper does not explicitly provide details about training, validation, and test splits. It mentions random sampling of entries for the tensor generation but not standard dataset splits for reproducibility. |
| Hardware Specification | No | The paper does not provide specific hardware details (e.g., GPU/CPU models, memory amounts) used for running its experiments. |
| Software Dependencies | No | The paper mentions a MATLAB implementation but does not specify any software dependencies with version numbers. |
| Experiment Setup | No | The paper describes how the tensors are generated and the error metric, but it does not specify concrete experimental setup details such as hyperparameters (e.g., learning rate, batch size, number of epochs) or system-level training settings. |
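The alternating minimization step quoted above admits a closed-form coordinatewise least-squares update: fixing the other r − 1 components and two of the three modes, each entry of the new factor is a ratio of masked inner products. The following is a minimal NumPy sketch of that update loop, not the authors' MATLAB code; it uses random initialization instead of the paper's RTPM-plus-thresholding initialization, and all names are illustrative.

```python
import numpy as np

def altmin_tensor_completion(T, mask, r, n_iters=50, rng=None):
    """Sketch of Algorithm 1's alternating minimization for symmetric
    rank-r tensor completion (random init stands in for RTPM + Threshold)."""
    n = T.shape[0]
    rng = np.random.default_rng(rng)
    U = rng.standard_normal((n, r))
    U /= np.linalg.norm(U, axis=0)          # unit-norm columns u_1, ..., u_r
    sigma = np.ones(r)
    for _ in range(n_iters):
        for q in range(r):
            # Residual after subtracting the other r-1 rank-1 components.
            low = np.einsum('l,il,jl,kl->ijk',
                            np.delete(sigma, q),
                            np.delete(U, q, axis=1),
                            np.delete(U, q, axis=1),
                            np.delete(U, q, axis=1))
            R = (T - low) * mask
            # Least squares over the q-th factor, one coordinate at a time:
            # u_i = sum_{jk in Omega} R_ijk u_qj u_qk / sum_{jk in Omega} (u_qj u_qk)^2
            w = np.outer(U[:, q], U[:, q])
            num = np.einsum('ijk,jk->i', R, w)
            den = np.einsum('ijk,jk->i', mask, w * w)
            u_hat = num / np.maximum(den, 1e-12)
            sigma[q] = np.linalg.norm(u_hat)           # line 9 of Algorithm 1
            U[:, q] = u_hat / max(sigma[q], 1e-12)     # line 10 of Algorithm 1
    T_hat = np.einsum('l,il,jl,kl->ijk', sigma, U, U, U)
    return T_hat, U, sigma
```

On a fully observed rank-1 tensor the update recovers the planted factor after a couple of sweeps, which is a quick sanity check before moving to partial observations.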
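The synthetic setup the report quotes (random orthogonal factors and Bernoulli-sampled entries at rate p) can be sketched as follows. The QR construction for a uniformly random orthonormal U and the unit weights are assumptions for illustration; the paper only states that orthogonal matrices are generated uniformly at random.

```python
import numpy as np

def make_synthetic_instance(n, r, p, rng=None):
    """Sketch of the paper's synthetic experiments: a random orthonormal
    U in R^{n x r}, a symmetric rank-r tensor, and entries observed
    independently with probability p (names/defaults are illustrative)."""
    rng = np.random.default_rng(rng)
    U, _ = np.linalg.qr(rng.standard_normal((n, r)))   # orthonormal columns
    sigma = np.ones(r)                                 # assumed unit weights
    T = np.einsum('l,il,jl,kl->ijk', sigma, U, U, U)   # sum_q u_q x u_q x u_q
    mask = (rng.random((n, n, n)) < p).astype(float)   # Bernoulli(p) sampling
    return T, mask, U
```

Sweeping p = α r^{1/2} ln n / n^{3/2} over a grid of α and recording the fraction of exact recoveries would reproduce the kind of universal curve the report describes for Figure 1.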