Evolutionary Topology Search for Tensor Network Decomposition

Authors: Chao Li, Zhun Sun

ICML 2020

| Reproducibility Variable | Result | LLM Response |
| --- | --- | --- |
| Research Type | Experimental | The experimental results by both synthetic and real-world data demonstrate that our method can effectively discover the ground-truth topology or even better structures with a small number of generations, and significantly boost the representational power of TN decomposition compared with well-known tensor-train (TT) or tensor-ring (TR) models. |
| Researcher Affiliation | Academia | Center for Advanced Intelligence Project, RIKEN Institute, Tokyo, Japan. Correspondence to: Chao Li <chao.li@riken.jp>. |
| Pseudocode | Yes | Algorithm 1: Genetic meta-algorithm for topology search of TN |
| Open Source Code | Yes | Our code is available at https://github.com/minogame/icml2020-TNGA. |
| Open Datasets | Yes | In the experiment, we randomly select 10 natural images from the LIVE dataset (Sheikh et al., 2006) and apply TN decomposition to the data approximation task. |
| Dataset Splits | No | The paper does not provide specific training/validation/test dataset splits or mention cross-validation; it only describes the overall setup for synthetic and real-world data. |
| Hardware Specification | Yes | In the experiments, we implement our GA on graphics processing unit (GPU, Nvidia V100) clusters following a central processing unit (CPU, Intel Xeon E5-2690) node. |
| Software Dependencies | No | The paper mentions using the Adam optimizer (Kingma & Ba, 2014) but does not provide specific version numbers for any software or libraries. |
| Experiment Setup | Yes | We specify the maximum number of generations for all runs to be 30. The population in each generation is set to 50 for the ground-truth orders {4, 5, 6, 7, 8}, respectively. To balance the scale between the compression ratio and RSE, we adjust the trade-off parameter λ in computing the fitness score to be 50. During each generation of the GA, the 20% of individuals with the worst fitness scores are eliminated. Meanwhile, to calculate the selection probability described in Eq. (8), we choose the hyper-parameters α = 200 and β = 5. Moreover, we deploy a 5% chance for each connection to mutate to the opposite state after recombination is finished. We follow the differentiable programming approach (Liao et al., 2019) for computation of the RSE in Eq. (7). Concretely, for each individual, we initialize the core tensors with a Gaussian distribution of zero mean and 0.1 standard deviation, and apply the Adam optimizer (Kingma & Ba, 2014) with a learning rate of 0.001 to carry out the gradient descent steps. To avoid local minima during the TN decomposition, we repeat the decomposition 4 times for each individual under different initializations, and select the smallest RSE for fitness evaluation. |
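The multi-restart fitness evaluation described in the Experiment Setup row can be sketched as follows. This is a minimal illustrative sketch only: a toy rank-r matrix factorization stands in for the actual TN decomposition, plain gradient descent replaces the Adam optimizer used in the paper, and the `compression` term is a hypothetical placeholder for the paper's compression ratio (whose exact form is not quoted here). The Gaussian initialization (mean 0, std 0.1), 4 restarts with the smallest RSE kept, learning rate 0.001, and λ = 50 follow the reported setup.

```python
import numpy as np

def rse(approx, target):
    # Relative squared error between the approximation and the target tensor.
    return np.linalg.norm(approx - target) / np.linalg.norm(target)

def decompose_once(target, rank, lr=0.001, steps=500, rng=None):
    # Toy stand-in for TN decomposition: factorize a matrix as A @ B.
    # The paper decomposes higher-order tensors with Adam; plain gradient
    # descent is used here only to keep the sketch self-contained.
    if rng is None:
        rng = np.random.default_rng()
    m, n = target.shape
    # Gaussian initialization with zero mean and 0.1 standard deviation,
    # as reported in the experiment setup.
    A = rng.normal(0.0, 0.1, (m, rank))
    B = rng.normal(0.0, 0.1, (rank, n))
    for _ in range(steps):
        resid = A @ B - target
        A -= lr * (resid @ B.T)
        B -= lr * (A.T @ resid)
    return rse(A @ B, target)

def fitness(target, rank, lam=50.0, restarts=4, seed=0):
    # Repeat the decomposition under different initializations and keep the
    # smallest RSE, then combine it with a (hypothetical) compression term
    # weighted by the trade-off parameter lambda = 50.
    rng = np.random.default_rng(seed)
    best_rse = min(decompose_once(target, rank, rng=rng)
                   for _ in range(restarts))
    m, n = target.shape
    compression = rank * (m + n) / (m * n)  # placeholder compression ratio
    return best_rse + lam * compression
```

In the GA itself, this score would then drive selection (Eq. (8), with α = 200 and β = 5), elimination of the worst 20% per generation, and a 5% per-connection mutation chance.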