Learning (from) Deep Hierarchical Structure among Features

Authors: Yu Zhang, Lei Han (pp. 5837-5844)

AAAI 2019 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Experiments on synthetic and real-world datasets show the effectiveness of the proposed methods.
Researcher Affiliation | Collaboration | Yu Zhang (HKUST) and Lei Han (Tencent AI Lab).
Pseudocode | No | The paper describes its optimization algorithms (FISTA, GIST, ADMM, and a QP solver) but does not present them in a structured pseudocode block or algorithm listing. (For illustration, a generic FISTA sketch appears after this table.)
Open Source Code | No | The paper does not provide any statement or link indicating that the source code for their methods is publicly available.
Open Datasets | Yes | We experiment on three real-world datasets including the traffic volume data (Han and Zhang 2015b), the breast cancer data (Jacob, Obozinski, and Vert 2009) and the Covtype data. In these datasets, the hierarchical structure over the features is not available.
Dataset Splits | Yes | We randomly generate 100 samples for testing and use another 100 random samples for validation to choose the regularization parameters of all the methods. The regularization parameters are chosen from the same candidate set as used in the synthetic setting via 5-fold cross-validation. By following (Yang et al. 2012; Han and Zhang 2015a), 50%, 30% and 20% of the data are randomly chosen for training, validation, and testing, respectively. (A split sketch is given after this table.)
Hardware Specification | No | The paper does not specify any hardware details (e.g., CPU or GPU models) used for running the experiments.
Software Dependencies | No | The paper mentions optimization algorithms like FISTA, GIST, and ADMM, and problem types like QP, but it does not specify any software versions for libraries, frameworks, or programming languages used (e.g., Python, PyTorch, TensorFlow, specific solver versions).
Experiment Setup | Yes | All of the regularization parameters in different models are chosen from the set {10^-5, 10^-4, ..., 1}, except the η_i's in the proposed LDHS method. As discussed before, we set η_{i+1} = η_i/υ for i < m, where we choose η_1 and υ from {10^-5, 10^-4, ..., 1} and {1.1, 2, 10}, respectively. In all the settings, we generate n = 100 training samples and set ξ = 2. (A grid-construction sketch follows this table.)
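
Because the paper provides no pseudocode, the following is a minimal, generic FISTA sketch for an l1-regularized least-squares problem, written in Python with NumPy. It is an illustration only: the paper's actual objectives involve hierarchical regularizers over features, and the function names (`fista_lasso`, `soft_threshold`) and the l1 penalty here are assumptions made for this sketch, not the authors' formulation.

```python
import numpy as np

def soft_threshold(z, tau):
    """Proximal operator of tau * ||.||_1 (elementwise soft-thresholding)."""
    return np.sign(z) * np.maximum(np.abs(z) - tau, 0.0)

def fista_lasso(A, b, lam, n_iter=500):
    """Generic FISTA for min_x 0.5 * ||Ax - b||^2 + lam * ||x||_1.

    Illustrative sketch only; the paper's objective and regularizer differ.
    """
    # Lipschitz constant of the gradient of the smooth part: ||A||_2^2.
    L = np.linalg.norm(A, 2) ** 2
    x = np.zeros(A.shape[1])
    y = x.copy()
    t = 1.0
    for _ in range(n_iter):
        # Proximal gradient step at the extrapolated point y.
        x_new = soft_threshold(y - A.T @ (A @ y - b) / L, lam / L)
        # Momentum update (Beck and Teboulle's schedule).
        t_new = (1.0 + np.sqrt(1.0 + 4.0 * t * t)) / 2.0
        y = x_new + ((t - 1.0) / t_new) * (x_new - x)
        x, t = x_new, t_new
    return x
```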
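
The 50%/30%/20% train/validation/test protocol quoted in the Dataset Splits row can be reproduced with a simple random permutation. The sketch below assumes a feature matrix `X` and labels `y`, which are placeholders rather than names taken from the paper.

```python
import numpy as np

def random_split(X, y, seed=0):
    """Randomly split data 50%/30%/20% into train/validation/test,
    following the protocol quoted in the Dataset Splits row."""
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    idx = rng.permutation(n)
    n_tr, n_va = int(0.5 * n), int(0.3 * n)
    tr, va, te = idx[:n_tr], idx[n_tr:n_tr + n_va], idx[n_tr + n_va:]
    return (X[tr], y[tr]), (X[va], y[va]), (X[te], y[te])
```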
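
The Experiment Setup row describes a logarithmic candidate set for the regularization parameters and a geometric schedule η_{i+1} = η_i/υ for LDHS. The sketch below reproduces that grid construction, assuming the candidate set is {10^-5, 10^-4, ..., 1} as reconstructed above; the number of layers m = 3 is a placeholder, not a value from the paper.

```python
import itertools

# Candidate set {10^-5, 10^-4, ..., 1}, as reconstructed from the quoted setup.
reg_candidates = [10.0 ** k for k in range(-5, 1)]

def eta_schedule(eta1, upsilon, m):
    """Geometric schedule eta_{i+1} = eta_i / upsilon for i < m."""
    return [eta1 / upsilon ** i for i in range(m)]

# Grid over (eta_1, upsilon): eta_1 from the candidate set,
# upsilon from {1.1, 2, 10}, as stated in the Experiment Setup row.
for eta1, upsilon in itertools.product(reg_candidates, [1.1, 2, 10]):
    etas = eta_schedule(eta1, upsilon, m=3)  # m = 3 is a placeholder
    # ... train LDHS with this eta schedule and validate (not shown)
```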