Hierarchical Linear Disentanglement of Data-Driven Conceptual Spaces
Authors: Rana Alshaikh, Zied Bouraoui, Steven Schockaert
IJCAI 2020
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | To evaluate whether the discovered features are semantically meaningful, we test how similar they are to natural categories, by training depth-1 decision trees (meaning that only a single feature can be used for prediction) on our feature-based representations. For instance, in the movie domain, we should expect to see common movie genres among the features. Depth-1 decision trees should thus be able to predict these genres well. Following [Ager et al., 2018], we also evaluate how well natural categories can be characterized using a small set of features, based on the performance of depth-3 decision trees. |
| Researcher Affiliation | Academia | Rana Alshaikh (Cardiff University, UK), Zied Bouraoui (CRIL, Univ Artois & CNRS, France) and Steven Schockaert (Cardiff University, UK); {alshaikhr,schockaerts1}@cardiff.ac.uk, zied.bouraoui@cril.fr |
| Pseudocode | No | The paper describes its methods in prose but does not include any formal pseudocode or algorithm blocks. |
| Open Source Code | Yes | The datasets and source code are available online at https://github.com/rana-alshaikh/Hierarchical_Linear_Disentanglement. |
| Open Datasets | Yes | The datasets and source code are available online at https://github.com/rana-alshaikh/Hierarchical_Linear_Disentanglement. For the movies and place type domains, we used the embeddings that were shared by [Derrac and Schockaert, 2015]. |
| Dataset Splits | Yes | The datasets are divided into 70% training and 30% testing splits. To tune the parameters, we used 5-fold cross-validation on the training split. Since the movies dataset is substantially larger, in that case we instead used a fixed 60% training, 20% testing and 20% tuning split. |
| Hardware Specification | No | The paper does not provide specific details regarding the hardware specifications (e.g., CPU, GPU models) used for conducting the experiments. |
| Software Dependencies | No | The paper mentions several algorithms and models (e.g., Doc2Vec, logistic regression, affinity propagation) but does not specify the versions of any underlying software libraries or dependencies (e.g., Python, PyTorch, scikit-learn versions). |
| Experiment Setup | Yes | For the methods which use affinity propagation, we can only influence the number of clusters indirectly, by changing the so-called preference parameter of this clustering algorithm. As is usual, this parameter is chosen relative to the median µ of the affinity scores. For the methods Sub and Ortho, we considered values from {0.7µ, 0.9µ, µ, 1.1µ, 1.3µ}. ... To obtain the feature directions, we used logistic regression and only considered words for which the corresponding Kappa score is at least 0.3. To reduce the computation time, for datasets where this led to more than 5000 features, only the 5000 top-scoring words are retained. When learning directions for the sub-features, we use a lower Kappa score of 0.1... |
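The depth-limited decision-tree evaluation described above can be sketched with scikit-learn. The synthetic data, feature count, and variable names here are illustrative stand-ins; only the tree depths (1 and 3) and the 70%/30% train/test split come from the report:

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Hypothetical stand-in data: rows are entities (e.g. movies) described by
# learned feature values; labels are one natural category (e.g. a genre).
X, y = make_classification(n_samples=500, n_features=20, random_state=0)

# 70% training / 30% testing, as in the reported splits.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0)

# Depth-1 tree (a decision stump): prediction may use only a single feature,
# so high accuracy means one feature alone captures the category.
stump = DecisionTreeClassifier(max_depth=1).fit(X_train, y_train)

# Depth-3 tree: tests whether a small set of features characterizes it.
small_tree = DecisionTreeClassifier(max_depth=3).fit(X_train, y_train)

print(stump.score(X_test, y_test), small_tree.score(X_test, y_test))
```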
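The experiment setup notes that affinity propagation only controls the number of clusters indirectly, through a preference parameter chosen relative to the median µ of the affinity scores. A minimal sketch of that tuning loop, assuming cosine similarity as the affinity and random vectors as stand-in embeddings (the grid {0.7µ, 0.9µ, µ, 1.1µ, 1.3µ} is from the report; everything else is illustrative):

```python
import numpy as np
from sklearn.cluster import AffinityPropagation
from sklearn.metrics.pairwise import cosine_similarity

rng = np.random.default_rng(0)
vectors = rng.normal(size=(60, 10))  # hypothetical entity embeddings

# Affinity propagation takes no cluster count; the number of clusters is
# steered via `preference`, set relative to the median affinity score mu.
affinities = cosine_similarity(vectors)
mu = np.median(affinities)

for factor in (0.7, 0.9, 1.0, 1.1, 1.3):
    ap = AffinityPropagation(affinity="precomputed",
                             preference=factor * mu,
                             random_state=0).fit(affinities)
    print(factor, "clusters:", len(ap.cluster_centers_indices_))
```

Larger preference values generally yield more clusters, which is why the grid is searched rather than a cluster count being set directly.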
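The setup also describes obtaining feature directions with logistic regression and retaining only words whose Cohen's kappa score reaches a threshold (0.3 for features, 0.1 for sub-features). A hedged sketch of that filter, with synthetic stand-in data and an assumed cross-validated kappa estimate (the thresholds are from the report; the data, the `word_occurs` label, and the 5-fold setup here are illustrative):

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import cohen_kappa_score
from sklearn.model_selection import cross_val_predict

rng = np.random.default_rng(1)
X = rng.normal(size=(200, 10))  # hypothetical entity embeddings

# Binary label: does the word occur in the entity's associated text?
word_occurs = (X[:, 0] + 0.5 * rng.normal(size=200)) > 0

# The feature direction is the weight vector of a logistic regression
# classifier predicting the word's occurrence; the word is kept only if
# the classifier's Cohen's kappa reaches the threshold.
clf = LogisticRegression()
predicted = cross_val_predict(clf, X, word_occurs, cv=5)
kappa = cohen_kappa_score(word_occurs, predicted)

KAPPA_THRESHOLD = 0.3  # 0.1 when learning directions for sub-features
if kappa >= KAPPA_THRESHOLD:
    direction = clf.fit(X, word_occurs).coef_[0]  # the feature direction
```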