Deep Descriptive Clustering

Authors: Hongjing Zhang, Ian Davidson

IJCAI 2021

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | In this section, we conduct experiments to evaluate our approach empirically. Based on our experiments, we aim to answer the following questions: Can our proposed approach generate better explanations compared to existing methods? (see Sec 4.2) Can it generate more complex explanations such as ontologies? (see Sec 4.3) How does our proposed approach perform in terms of clustering quality? (see Sec 4.4) How does simultaneously clustering and explaining improve our model's performance? (see Sec 4.5)
Researcher Affiliation | Academia | Hongjing Zhang, Ian Davidson, University of California, Davis. hjzzhang@ucdavis.edu, davidson@cs.ucdavis.edu
Pseudocode | Yes | Algorithm 1 presents the training algorithm for deep descriptive clustering.
Open Source Code | No | The paper does not provide any explicit statement about releasing source code or a link to a code repository.
Open Datasets | Yes | We evaluate the performance of our proposed model on two visual datasets with annotated semantic attributes. We first use Attribute Pascal and Yahoo (aPY) [Farhadi et al., 2009], a small-scale coarse-grained dataset with 64 semantic attributes and 5274 instances. Further, we have studied Animals with Attributes (AwA) [Lampert et al., 2013], which is a medium-scale dataset in terms of the number of images.
Dataset Splits | No | The paper mentions evaluating performance under different tag annotated ratios (r% ∈ {10, 30, 50}), averaged over 10 trials, but does not specify exact train/validation/test splits, percentages, or absolute sample counts for each split. The term "validation" is not used in the context of data splits.
Hardware Specification | No | The paper does not provide specific hardware details such as GPU/CPU models, memory, or cloud computing instance types used for running the experiments. It only mentions using "pre-trained ResNet-101 features", which implies a certain computing capability but does not specify the hardware used for their own training.
Software Dependencies | No | The paper mentions the use of "ReLU" as the activation function and "Adam [Kingma and Ba, 2015]" as the optimizer with default parameters. However, it does not provide specific version numbers for these, or for any underlying machine learning frameworks (e.g., PyTorch, TensorFlow) or programming languages (e.g., Python).
Experiment Setup | Yes | For a fair comparison with all the baseline approaches, we use pre-trained ResNet-101 [He et al., 2016] features for all the clustering tasks, and the encoder networks of the deep descriptive clustering model are stacked from three fully connected layers with sizes [1200, 1200, K], where K is the desired number of clusters. We set the expected number of tags for each cluster as 8 and hyper-parameters l, λ, γ as 1, 1, 100 respectively. The tag annotated ratio r is set as 0.5 by default to simulate a challenging setting. The activation function is ReLU, and the optimizer is Adam [Kingma and Ba, 2015] with default parameters.
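To make the reported architecture concrete, the encoder described above can be sketched as a three-layer MLP of sizes [1200, 1200, K] over pre-extracted features, with a softmax head giving soft cluster assignments. This is a minimal NumPy sketch, not the authors' implementation: the 2048-dimensional input (the usual ResNet-101 feature size), K = 10, the weight initialization, and the forward-only (untrained) setup are all assumptions for illustration; the paper's loss terms and training loop are omitted.

```python
import numpy as np

def relu(x):
    # ReLU activation, as stated in the paper's setup
    return np.maximum(x, 0.0)

def softmax(x):
    # Numerically stable row-wise softmax over the K cluster logits
    e = np.exp(x - x.max(axis=1, keepdims=True))
    return e / e.sum(axis=1, keepdims=True)

def make_encoder(in_dim=2048, hidden=1200, k=10, seed=0):
    """Build weights for three fully connected layers [1200, 1200, K].

    in_dim=2048 assumes ResNet-101 pooled features; k=10 is a placeholder
    for the desired number of clusters. He-style init is an assumption.
    """
    rng = np.random.default_rng(seed)
    dims = [in_dim, hidden, hidden, k]
    return [(rng.standard_normal((a, b)) * np.sqrt(2.0 / a), np.zeros(b))
            for a, b in zip(dims[:-1], dims[1:])]

def encode(params, x):
    # Hidden layers use ReLU; the final layer yields soft cluster assignments
    for w, b in params[:-1]:
        x = relu(x @ w + b)
    w, b = params[-1]
    return softmax(x @ w + b)

params = make_encoder(k=10)
feats = np.random.default_rng(1).standard_normal((5, 2048))  # 5 feature vectors
q = encode(params, feats)
print(q.shape)  # one K-dim assignment distribution per instance
```

With random weights this only demonstrates shapes and the assignment head; in the paper these layers would be trained jointly with the explanation objectives under the hyper-parameters quoted above.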