Hierarchical Linear Disentanglement of Data-Driven Conceptual Spaces
Authors: Rana Alshaikh, Zied Bouraoui, Steven Schockaert
IJCAI 2020
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | To evaluate whether the discovered features are semantically meaningful, we test how similar they are to natural categories, by training depth-1 decision trees (meaning that only a single feature can be used for prediction) on our feature-based representations. For instance, in the movie domain, we should expect to see common movie genres among the features. Depth-1 decision trees should thus be able to predict these genres well. Following [Ager et al., 2018], we also evaluate how well natural categories can be characterized using a small set of features, based on the performance of depth-3 decision trees. |
| Researcher Affiliation | Academia | Rana Alshaikh (Cardiff University, UK), Zied Bouraoui (CRIL, Univ Artois & CNRS, France) and Steven Schockaert (Cardiff University, UK); {alshaikhr,schockaerts1}@cardiff.ac.uk, zied.bouraoui@cril.fr |
| Pseudocode | No | The paper describes its methods in prose but does not include any formal pseudocode or algorithm blocks. |
| Open Source Code | Yes | The datasets and source code are available online at https://github.com/rana-alshaikh/Hierarchical_Linear_Disentanglement. |
| Open Datasets | Yes | The datasets and source code are available online at https://github.com/rana-alshaikh/Hierarchical_Linear_Disentanglement. For the movies and place type domains, we used the embeddings that were shared by [Derrac and Schockaert, 2015]. |
| Dataset Splits | Yes | The datasets are divided into 70% training and 30% testing splits. To tune the parameters, we used 5-fold cross-validation on the training split. Since the movies dataset is substantially larger, in that case we instead used a fixed 60% training, 20% testing and 20% tuning split. |
| Hardware Specification | No | The paper does not provide specific details regarding the hardware specifications (e.g., CPU, GPU models) used for conducting the experiments. |
| Software Dependencies | No | The paper mentions several algorithms and models (e.g., Doc2Vec, logistic regression, affinity propagation) but does not specify the versions of any underlying software libraries or dependencies (e.g., Python, PyTorch, scikit-learn versions). |
| Experiment Setup | Yes | For the methods which use affinity propagation, we can only influence the number of clusters indirectly, by changing the so-called preference parameter of this clustering algorithm. As is usual, this parameter is chosen relative to the median µ of the affinity scores. For the methods Sub and Ortho, we considered values from {0.7µ, 0.9µ, µ, 1.1µ, 1.3µ}. ... To obtain the feature directions, we used logistic regression and only considered words for which the corresponding Kappa score is at least 0.3. To reduce the computation time, for datasets where this led to more than 5000 features, only the 5000 top-scoring words are retained. When learning directions for the sub-features, we use a lower Kappa score of 0.1... |
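The depth-limited decision-tree evaluation described above can be sketched with scikit-learn. The synthetic data, feature count, and variable names here are illustrative stand-ins; only the tree depths (1 and 3) and the 70%/30% train/test split come from the report:

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Hypothetical stand-in data: rows are entities (e.g. movies) described by
# learned feature values; labels are one natural category (e.g. a genre).
X, y = make_classification(n_samples=500, n_features=20, random_state=0)

# 70% training / 30% testing, as in the reported splits.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0)

# Depth-1 tree (a decision stump): prediction may use only a single feature,
# so high accuracy means one feature alone captures the category.
stump = DecisionTreeClassifier(max_depth=1).fit(X_train, y_train)

# Depth-3 tree: tests whether a small set of features characterizes it.
small_tree = DecisionTreeClassifier(max_depth=3).fit(X_train, y_train)

print(stump.score(X_test, y_test), small_tree.score(X_test, y_test))
```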
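The experiment setup notes that affinity propagation only controls the number of clusters indirectly, through a preference parameter chosen relative to the median µ of the affinity scores. A minimal sketch of that tuning loop, assuming cosine similarity as the affinity and random vectors as stand-in embeddings (the grid {0.7µ, 0.9µ, µ, 1.1µ, 1.3µ} is from the report; everything else is illustrative):

```python
import numpy as np
from sklearn.cluster import AffinityPropagation
from sklearn.metrics.pairwise import cosine_similarity

rng = np.random.default_rng(0)
vectors = rng.normal(size=(60, 10))  # hypothetical entity embeddings

# Affinity propagation takes no cluster count; the number of clusters is
# steered via `preference`, set relative to the median affinity score mu.
affinities = cosine_similarity(vectors)
mu = np.median(affinities)

for factor in (0.7, 0.9, 1.0, 1.1, 1.3):
    ap = AffinityPropagation(affinity="precomputed",
                             preference=factor * mu,
                             random_state=0).fit(affinities)
    print(factor, "clusters:", len(ap.cluster_centers_indices_))
```

Larger preference values generally yield more clusters, which is why the grid is searched rather than a cluster count being set directly.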
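The setup also describes obtaining feature directions with logistic regression and retaining only words whose Cohen's kappa score reaches a threshold (0.3 for features, 0.1 for sub-features). A hedged sketch of that filter, with synthetic stand-in data and an assumed cross-validated kappa estimate (the thresholds are from the report; the data, the `word_occurs` label, and the 5-fold setup here are illustrative):

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import cohen_kappa_score
from sklearn.model_selection import cross_val_predict

rng = np.random.default_rng(1)
X = rng.normal(size=(200, 10))  # hypothetical entity embeddings

# Binary label: does the word occur in the entity's associated text?
word_occurs = (X[:, 0] + 0.5 * rng.normal(size=200)) > 0

# The feature direction is the weight vector of a logistic regression
# classifier predicting the word's occurrence; the word is kept only if
# the classifier's Cohen's kappa reaches the threshold.
clf = LogisticRegression()
predicted = cross_val_predict(clf, X, word_occurs, cv=5)
kappa = cohen_kappa_score(word_occurs, predicted)

KAPPA_THRESHOLD = 0.3  # 0.1 when learning directions for sub-features
if kappa >= KAPPA_THRESHOLD:
    direction = clf.fit(X, word_occurs).coef_[0]  # the feature direction
```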