Performance Gaps in Multi-view Clustering under the Nested Matrix-Tensor Model

Authors: Hugo Lebeau, Mohamed El Amine Seddik, José Henrique De Morais Goulart

ICLR 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | The paper states: "in the context of multi-view clustering, we compare the performance of the tensor and unfolding approaches to the optimal one and specify the gap between them thanks to our theoretical findings, supported by empirical results" and, under "Proofs and simulations": "All proofs are deferred to the appendix. Python codes to reproduce simulations are available in the following GitHub repository: https://github.com/HugoLebeau/nested_matrix-tensor." (Illustrative sketches of the unfolding and tensor approaches follow the table.)
Researcher Affiliation | Collaboration | Hugo Lebeau (1), Mohamed El Amine Seddik (2), José Henrique de Morais Goulart (3); (1) Univ. Grenoble Alpes, CNRS, Inria, Grenoble INP, LIG; (2) Technology Innovation Institute; (3) Univ. Toulouse, INP-ENSEEIHT, IRIT, CNRS
Pseudocode | No | The paper does not contain any pseudocode or algorithm blocks. It mentions "Proofs and simulations. All proofs are deferred to the appendix." but provides no algorithm or pseudocode.
Open Source Code | Yes | Python code to reproduce the simulations is available in the GitHub repository https://github.com/HugoLebeau/nested_matrix-tensor.
Open Datasets | No | The paper generates synthetic data from the nested matrix-tensor model (Equation 5), parameterized by dimensions such as n1, n2, n3 and (p, n, m). For example: "Empirical Spectral Distribution (ESD) and Limiting Spectral Distribution (LSD) of T(2)T(2)⊤ (left) and T(3)T(3)⊤ (right) with n1 = 600, n2 = 400 and n3 = 200" and "Empirical versus theoretical multi-view clustering performance with parameters (p, n, m) = (150, 300, 60)". No public dataset is used; all data are synthetically generated (a hedged data-generation sketch follows the table).
Dataset Splits | No | The paper describes simulations with specific parameter sizes (e.g., n1 = 600, n2 = 400, n3 = 200 and (p, n, m) = (150, 300, 60)) for its theoretical analysis, but does not mention explicit train/validation/test splits; the data are synthetically generated for specific scenarios rather than drawn from a pre-existing dataset that would typically come with such splits.
Hardware Specification | No | The paper mentions running simulations but does not provide any specific details about the hardware used, such as GPU or CPU models, memory, or cloud computing specifications.
Software Dependencies | No | The paper states that "Python codes to reproduce simulations are available..." but does not specify version numbers for Python or for any libraries the code depends on.
Experiment Setup | No | The paper defines parameters for its statistical model and simulations, such as the dimensions (n1, n2, n3) or (p, n, m) for multi-view clustering, together with signal- and noise-strength parameters (e.g., ρT, βM, µ, h). However, it does not report experimental setup details such as hyperparameters (e.g., learning rate, batch size, number of epochs) or optimizer settings typically associated with training machine learning models.
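
To make the quoted setup concrete, here is a minimal, hedged sketch of how synthetic multi-view data in the spirit of the nested matrix-tensor model could be generated and clustered with an unfolding-based spectral estimate. The exact Equation 5, its normalizations, and the estimators analyzed in the paper are in the paper and the authors' repository; the assumed signal structure (a rank-one matrix µyᵀ weighted across views by h), the stand-in constants snr_matrix and snr_tensor (playing the roles of βM and ρT), and the noise scalings below are illustrative assumptions, not the authors' code.

```python
# Hedged sketch: toy multi-view data generator and unfolding-based clustering.
# The exact model (Equation 5), its normalizations, and the estimators are
# defined in the paper; shapes and constants here are illustrative assumptions.
import numpy as np

rng = np.random.default_rng(0)

# Dimensions matching the order of magnitude quoted in the report.
p, n, m = 150, 300, 60          # feature dim, number of samples, number of views

# Assumed signal structure: class labels y in {-1, +1}, class-mean direction mu,
# and view weights h. snr_matrix / snr_tensor stand in for the beta_M / rho_T
# signal-strength parameters mentioned in the report (exact scalings differ).
y = rng.choice([-1.0, 1.0], size=n)
mu = rng.standard_normal(p)
mu /= np.linalg.norm(mu)
h = np.ones(m) / np.sqrt(m)
snr_matrix, snr_tensor = 2.0, 2.0

# Noisy signal matrix M = snr_matrix * mu y^T + Gaussian noise (the "nested" matrix).
M = snr_matrix * np.outer(mu, y) + rng.standard_normal((p, n)) / np.sqrt(p)

# Tensor of views: T[:, :, k] = snr_tensor * h_k * M + Gaussian noise.
T = snr_tensor * M[:, :, None] * h[None, None, :] \
    + rng.standard_normal((p, n, m)) / np.sqrt(p * m)

# Unfolding approach: flatten the tensor along the sample mode and use the
# leading left singular vector of the n x (p*m) unfolding as a label estimate.
T_unfold = T.transpose(1, 0, 2).reshape(n, p * m)
u, s, vt = np.linalg.svd(T_unfold, full_matrices=False)
y_hat = np.sign(u[:, 0])

# Clustering accuracy, up to the global sign ambiguity of the estimate.
acc = max(np.mean(y_hat == y), np.mean(-y_hat == y))
print(f"unfolding-based clustering accuracy: {acc:.3f}")
```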
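
For comparison, a generic rank-one power-iteration routine (HOPM-style) can stand in for the "tensor approach" whose performance gap the paper characterizes; this is again a sketch under assumptions, not the paper's exact estimator. It reuses the tensor T and labels y from the previous sketch, and the helper name rank_one_power_iteration is hypothetical.

```python
# Hedged sketch of the "tensor approach": estimate the labels from a rank-one
# approximation of T obtained by alternating power iterations. This is a
# generic HOPM-style routine, not necessarily the exact estimator analyzed
# in the paper; it reuses T and y from the previous sketch.
import numpy as np

def rank_one_power_iteration(T, n_iter=50, seed=0):
    """Alternating power iterations for a rank-one approximation of a 3-way tensor."""
    rng = np.random.default_rng(seed)
    p, n, m = T.shape
    a = rng.standard_normal(p); a /= np.linalg.norm(a)
    b = rng.standard_normal(n); b /= np.linalg.norm(b)
    c = rng.standard_normal(m); c /= np.linalg.norm(c)
    for _ in range(n_iter):
        # Contract the tensor against two factors to update the third one.
        a = np.einsum('ijk,j,k->i', T, b, c); a /= np.linalg.norm(a)
        b = np.einsum('ijk,i,k->j', T, a, c); b /= np.linalg.norm(b)
        c = np.einsum('ijk,i,j->k', T, a, b); c /= np.linalg.norm(c)
    return a, b, c

a_hat, b_hat, c_hat = rank_one_power_iteration(T)
y_hat_tensor = np.sign(b_hat)                  # the mode-2 factor carries the labels
acc_tensor = max(np.mean(y_hat_tensor == y), np.mean(-y_hat_tensor == y))
print(f"tensor-approach clustering accuracy: {acc_tensor:.3f}")
```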