Understanding Deflation Process in Over-parametrized Tensor Decomposition

Authors: Rong Ge, Yunwei Ren, Xiang Wang, Mo Zhou

NeurIPS 2021

Reproducibility Variable | Result | LLM Response
Research Type | Theoretical | "We prove that for orthogonally decomposable tensor, a slightly modified version of gradient flow would follow a tensor deflation process and recover all the tensor components. Our proof suggests that for orthogonal tensors, gradient flow dynamics works similarly as greedy low-rank learning in the matrix setting, which is a first step towards understanding the implicit regularization effect of over-parametrized models for low-rank tensors."
Researcher Affiliation | Academia | Rong Ge (Duke University, rongge@cs.duke.edu); Yunwei Ren* (Shanghai Jiao Tong University, 2016renyunwei@sjtu.edu.cn); Xiang Wang* (Duke University, xwang@cs.duke.edu); Mo Zhou* (Duke University, mozhou@cs.duke.edu)
Pseudocode | Yes | Algorithm 1: Tensor Deflation Process (a code sketch of the deflation idea is given after this table).
Open Source Code | No | The paper does not contain any explicit statements or links indicating that source code for the described methodology is publicly available.
Open Datasets | No | The paper constructs a synthetic tensor (T = Σ_{i∈[5]} a_i e_i^{⊗4}) for illustrative purposes in Figure 1 (see the construction sketch after this table), but does not use or provide access information for any publicly available or open datasets.
Dataset Splits | No | The paper is theoretical and does not conduct experiments that require specifying training, validation, or test dataset splits.
Hardware Specification | No | The paper does not specify any hardware used for experiments, such as specific GPU or CPU models.
Software Dependencies | No | The paper does not mention any specific software dependencies with version numbers required to replicate the work.
Experiment Setup | Yes | "Input: Number of components m, initialization scale δ0, re-initialization threshold δ1, increasing rate of epoch length γ, target accuracy ϵ, regularization coefficient λ" and "Theorem 1. For any ϵ ≥ exp(−o(d/log d)), there exists γ = Θ(1), m = poly(d), λ = min{O(log d/d), O(ϵ/d^{1/2})}, α = min{O(λ/d^{3/2}), O(λ^2), O(ϵ^2/d^4)}, δ1 = O(α^{3/2}/m^{1/2}), δ0 = Θ(δ1 α/log^{1/2}(d)) such that with probability 1 − 1/poly(d) in the (re)-initializations, Algorithm 2 terminates in O(log(d/ϵ)) epochs and returns a tensor T such that [...]" (a simplified gradient-descent sketch of this setup follows this table).
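
To make the deflation idea concrete, here is a minimal NumPy sketch of deflation for a symmetric fourth-order tensor: repeatedly fit a single rank-1 component and subtract it from the residual. The rank-1 fitting below uses tensor power iteration with random restarts, which is our own choice for illustration; the paper's Algorithm 1 instead describes the idealized process that the modified gradient flow is shown to follow.

```python
import numpy as np

def deflate(T, rank, restarts=10, iters=100, seed=0):
    """Deflation sketch for a symmetric 4th-order tensor T (shape d x d x d x d).

    Each round estimates one rank-1 component weight * u^{(x)4} via tensor
    power iteration (keeping the restart with the largest fitted weight)
    and subtracts it from the residual.
    """
    rng = np.random.default_rng(seed)
    d = T.shape[0]
    residual = T.copy()
    components = []
    for _ in range(rank):
        best_weight, best_u = 0.0, None
        for _ in range(restarts):
            u = rng.standard_normal(d)
            u /= np.linalg.norm(u)
            for _ in range(iters):
                # v_i = sum_{j,k,l} residual[i,j,k,l] * u_j * u_k * u_l
                v = np.einsum('ijkl,j,k,l->i', residual, u, u, u)
                u = v / np.linalg.norm(v)
            weight = np.einsum('ijkl,i,j,k,l->', residual, u, u, u, u)
            if abs(weight) > abs(best_weight):
                best_weight, best_u = weight, u
        components.append((best_weight, best_u))
        # Remove the fitted rank-1 term best_weight * u (x) u (x) u (x) u.
        residual -= best_weight * np.einsum('i,j,k,l->ijkl', best_u, best_u, best_u, best_u)
    return components, residual
```

Applied to an orthogonally decomposable tensor such as the one constructed in the next sketch, each round should peel off roughly one term a_i e_i^{⊗4}, leaving a small residual after `rank` rounds.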
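
The synthetic tensor used for illustration in Figure 1 has the form T = Σ_{i∈[5]} a_i e_i^{⊗4}. A construction along these lines is shown below; the dimension d and the weights a_i are placeholders, since the values used in the figure are not recorded here.

```python
import numpy as np

d, rank = 10, 5                            # ambient dimension and number of components (our choice)
a = np.array([2.0, 1.6, 1.2, 0.8, 0.4])    # hypothetical ground-truth weights a_1 > ... > a_5 > 0
E = np.eye(d)[:rank]                       # orthonormal directions e_1, ..., e_5 as rows
# T[i, j, k, l] = sum_r a_r * E[r, i] * E[r, j] * E[r, k] * E[r, l]
T = np.einsum('r,ri,rj,rk,rl->ijkl', a, E, E, E, E)
```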
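
Algorithm 2 is only summarized above through its inputs and the guarantee of Theorem 1. As a rough illustration of the objective it optimizes, the sketch below runs plain gradient descent over m over-parameterized rank-1 components on an ℓ2-regularized square loss. This is our own simplified stand-in: it omits the re-initialization step and the epochs of increasing length γ, and the step size, iteration count, and initialization scale (far larger than the theorem's δ0) are all our choices, so it illustrates the update rule rather than the regime covered by the theorem.

```python
import numpy as np

rng = np.random.default_rng(0)

# Target tensor T = sum_i a_i e_i^{(x)4}, as in the previous sketch.
d, rank = 10, 5
a = np.array([2.0, 1.6, 1.2, 0.8, 0.4])
E = np.eye(d)[:rank]
T = np.einsum('r,ri,rj,rk,rl->ijkl', a, E, E, E, E)

# Over-parameterized model sum_{p=1}^m w_p^{(x)4} with m > rank components.
m, lam = 50, 1e-3                        # number of components and regularization coefficient
lr, steps = 0.02, 8000                   # plain gradient descent instead of Algorithm 2's epochs
W = 0.1 * rng.standard_normal((m, d))    # init scale chosen much larger than the theorem's δ0

for step in range(steps + 1):
    T_hat = np.einsum('pi,pj,pk,pl->ijkl', W, W, W, W, optimize=True)
    R = T_hat - T
    # Gradient of 0.5 * ||T_hat - T||_F^2 + (lam / 2) * sum_p ||w_p||^2 with respect to W.
    grad = 4.0 * np.einsum('ijkl,pj,pk,pl->pi', R, W, W, W, optimize=True) + lam * W
    W -= lr * grad
    if step % 2000 == 0:
        print(f"step {step:5d}   ||T_hat - T||_F = {np.linalg.norm(R):.4f}")
```

Tracking the norms of the individual components w_p over training (rather than only the residual printed above) is what makes the deflation behavior visible: in the small-initialization regime the paper analyzes, they are expected to grow roughly one component at a time, mirroring the deflation process.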