Learning Structured Representations by Embedding Class Hierarchy
Authors: Siqi Zeng, Remi Tachet des Combes, Han Zhao
ICLR 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Empirically, we demonstrate that this approach can help to learn more interpretable representations due to the preservation of the tree metric, and leads to better generalization in-distribution as well as under sub-population shifts over multiple datasets. |
| Researcher Affiliation | Collaboration | Siqi Zeng, Department of Mathematical Sciences, Carnegie Mellon University, Pittsburgh, PA 15213, USA, siqiz@andrew.cmu.edu; Han Zhao, Department of Computer Science, University of Illinois, Urbana-Champaign, Urbana, IL 61801, USA, hanzhao@illinois.edu; Work done while at MSR Montreal. |
| Pseudocode | No | The paper does not contain any clearly labeled pseudocode or algorithm blocks. |
| Open Source Code | No | The paper does not provide any explicit statements about releasing source code or links to a code repository. |
| Open Datasets | Yes | We conduct our experiments on MNIST (Lecun et al., 1998), CIFAR100 (Krizhevsky, 2009), and BREEDS (Santurkar et al., 2020). |
| Dataset Splits | No | The paper describes dataset hierarchies and source/target splits for some datasets (e.g., BREEDS, MNIST, CIFAR) but does not provide explicit percentages, counts, or citations to predefined training, validation, and test splits, which would be needed to fully reproduce the data partitioning. |
| Hardware Specification | No | The paper does not provide specific details about the hardware used to run the experiments, such as GPU models, CPU types, or memory specifications. |
| Software Dependencies | No | The paper does not list specific software dependencies with version numbers (e.g., Python, PyTorch, CUDA versions, or other libraries/solvers). |
| Experiment Setup | Yes | In Appendix C, the paper states: 'For all experiments, models are trained for 200 epochs using SGD optimizer with momentum 0.9 and initial learning rate 0.01. We use a step scheduler that drops the learning rate by a factor of 10 at epoch 100 and 150.' A minimal sketch of this setup appears below the table. |
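
The quoted training recipe maps onto a standard training loop. The sketch below is only an illustration under that assumption: the model, dataset, and loss shown are hypothetical placeholders, since the summary above does not report the framework or architecture used; only the optimizer, learning-rate schedule, and epoch count come from the quoted Appendix C text.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

# Hypothetical stand-ins: the paper summary does not specify the model or data pipeline.
model = torch.nn.Linear(512, 100)
train_loader = DataLoader(
    TensorDataset(torch.randn(64, 512), torch.randint(0, 100, (64,))),
    batch_size=32,
)
criterion = torch.nn.CrossEntropyLoss()

# Settings quoted from Appendix C: SGD, momentum 0.9, initial learning rate 0.01.
optimizer = torch.optim.SGD(model.parameters(), lr=0.01, momentum=0.9)
# Step scheduler that drops the learning rate by a factor of 10 at epochs 100 and 150.
scheduler = torch.optim.lr_scheduler.MultiStepLR(optimizer, milestones=[100, 150], gamma=0.1)

for epoch in range(200):  # 200 training epochs
    for inputs, targets in train_loader:
        optimizer.zero_grad()
        loss = criterion(model(inputs), targets)
        loss.backward()
        optimizer.step()
    scheduler.step()  # advance the schedule once per epoch
```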