reproducibilityindex.ai

An Information Theory based Approach to Multisource Clustering

Authors: Pierre-Alexandre Murena, Jérémie Sublime, Basarab Matei, Antoine Cornuéjols

IJCAI 2018

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	we propose a new algorithm based on solid theoretical basis, and test it on several real and artificial data sets.
Researcher Affiliation	Academia	1 LTCI, Télécom ParisTech, Paris, France; 2 UMR MIA-Paris, AgroParisTech, INRA, Université Paris-Saclay, Paris, France; 3 LISITE laboratory, RDI team, ISEP, 10 rue de Vanves, Issy-les-Moulineaux, France; 4 Université Paris 13, Sorbonne Paris Cité, LIPN, CNRS UMR 7030, Villetaneuse, France
Pseudocode	No	The paper describes the algorithm steps in paragraph form but does not include a formally structured pseudocode or algorithm block.
Open Source Code	No	The paper does not provide any concrete access information for open-source code.
Open Datasets	Yes	The Wisconsin Data Breast Cancer (UCI): this data set contains 569 instances with 30 parameters and 2 classes. These 30 parameters contain 10 descriptors for 3 different cells (10 each) of the same patient. This data set can easily be split into 3 views: one for each cell. The Spam Base data set (UCI): The Spam Base data set contains 4601 observations described by 57 attributes and a label column: Spam or not Spam (1 or 0). The different attributes can be split into views containing word frequencies, letter frequencies and capital run sequences. The VHR Strasbourg data set [Rougier and Puissant, 2014]: it contains the description of 187058 segments extracted from a very high resolution satellite image of the French city of Strasbourg.
Dataset Splits	No	The paper does not provide specific dataset split information (exact percentages, sample counts, citations to predefined splits, or detailed splitting methodology) needed to reproduce the data partitioning.
Hardware Specification	No	The paper mentions runtime and parallel computing but does not provide specific hardware details (like exact GPU/CPU models or memory amounts) used for running its experiments.
Software Dependencies	No	The paper mentions various algorithms and models but does not provide specific version numbers for any software dependencies.
Experiment Setup	No	The paper describes the comparison setup and algorithm choices but does not provide specific experimental setup details such as hyperparameter values or detailed training configurations.
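
The Open Datasets row notes that the breast cancer data can be split into 3 views of 10 descriptors each, and that the Spam Base attributes can be split into word-frequency, letter-frequency and capital-run views. Since the paper gives no code or split details, the following is only a minimal sketch of such a column-wise split. The function name `split_into_views`, the view names, and the column ranges are assumptions: they presume the 30 breast cancer features are stored as three contiguous blocks of 10, and that the 57 Spam Base attributes follow the UCI ordering (48 word frequencies, 6 character frequencies, 3 capital-run statistics), with the label column already removed.

```python
from typing import Dict, List, Sequence, Tuple

# Assumed column layouts (not taken from the paper): half-open (start, stop)
# ranges over the feature columns, with the label column already removed.
WDBC_VIEWS: Dict[str, Tuple[int, int]] = {
    "view_1": (0, 10),
    "view_2": (10, 20),
    "view_3": (20, 30),
}
SPAMBASE_VIEWS: Dict[str, Tuple[int, int]] = {
    "word_freq": (0, 48),
    "char_freq": (48, 54),
    "capital_run": (54, 57),
}


def split_into_views(
    rows: Sequence[Sequence[float]],
    views: Dict[str, Tuple[int, int]],
) -> Dict[str, List[List[float]]]:
    """Slice every feature row into named column blocks, one list of rows per view."""
    width = max(stop for _, stop in views.values())
    result: Dict[str, List[List[float]]] = {name: [] for name in views}
    for index, row in enumerate(rows):
        if len(row) < width:
            raise ValueError(
                f"row {index} has {len(row)} columns, expected at least {width}"
            )
        for name, (start, stop) in views.items():
            result[name].append(list(row[start:stop]))
    return result
```

Each view keeps the same row order, so the i-th row of every view describes the same instance, which is what a multi-view clustering method needs as input.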