An Information Theory based Approach to Multisource Clustering

Authors: Pierre-Alexandre Murena, Jérémie Sublime, Basarab Matei, Antoine Cornuéjols

IJCAI 2018

Reproducibility Variable Result LLM Response
Research Type Experimental we propose a new algorithm based on solid theoretical basis, and test it on several real and artificial data sets.
Researcher Affiliation Academia 1 LTCI, Télécom ParisTech, Paris, France; 2 UMR MIA-Paris, AgroParisTech, INRA, Université Paris-Saclay, Paris, France; 3 LISITE laboratory, RDI team, ISEP, 10 rue de Vanves, Issy-les-Moulineaux, France; 4 Université Paris 13, Sorbonne Paris Cité, LIPN CNRS UMR 7030, Villetaneuse, France
Pseudocode No The paper describes the algorithm steps in paragraph form but does not include a formally structured pseudocode or algorithm block.
Open Source Code No The paper does not provide any concrete access information for open-source code.
Open Datasets Yes The Wisconsin Data Breast Cancer (UCI): this data set contains 569 instances with 30 parameters and 2 classes. These 30 parameters contain 10 descriptors for 3 different cells (10 each) of the same patient. This data set can easily be split into 3 views: one for each cell. The Spam Base data set (UCI): The Spam Base data set contains 4601 observations described by 57 attributes and a label column: Spam or not Spam (1 or 0). The different attributes can be split into views containing word frequencies, letter frequencies and capital run sequences. The VHR Strasbourg data set [Rougier and Puissant, 2014]: it contains the description of 187058 segments extracted from a very high resolution satellite image of the French city of Strasbourg.
Dataset Splits No The paper does not provide specific dataset split information (exact percentages, sample counts, citations to predefined splits, or detailed splitting methodology) needed to reproduce the data partitioning.
Hardware Specification No The paper mentions runtime and parallel computing but does not provide specific hardware details (like exact GPU/CPU models or memory amounts) used for running its experiments.
Software Dependencies No The paper mentions various algorithms and models but does not provide specific version numbers for any software dependencies.
Experiment Setup No The paper describes the comparison setup and algorithm choices but does not provide specific experimental setup details such as hyperparameter values or detailed training configurations.