Data Representation and Compression Using Linear-Programming Approximations

Authors: Hristo Paskov, John Mitchell, Trevor Hastie

Venue: ICLR 2016

Reproducibility assessment: each variable below is listed with its assessed result, followed by the LLM response supporting that assessment (quotations are from the paper).

Research Type: Experimental
Evidence: "Finally, section 4 provides empirical evidence that deep compression finds hierarchical structure in data that is useful for learning and compression, and section 5 concludes." The entirety of Section 4, titled "EXPERIMENTS", details data analysis, metrics, and comparisons.

Researcher Affiliation: Academia
Evidence: Hristo S. Paskov, Computer Science Department, Stanford University (hpaskov@stanford.edu); John C. Mitchell, Computer Science Department, Stanford University (jcm@stanford.edu); Trevor J. Hastie, Statistics Department, Stanford University (hastie@stanford.edu).

Pseudocode: No
Evidence: The paper describes its algorithms and formulations mathematically but does not include a figure, block, or section explicitly labeled "Pseudocode" or "Algorithm".

Open Source Code: No
Evidence: There is no explicit statement or link indicating the release of open-source code for the Dracula framework described in the paper.

Open Datasets: Yes
Evidence: "We ran Dracula using 7-grams and λ = 1 on 131 protein sequences that are labeled with the kingdom and phylum of their organism of origin." Bacterial proteins (73) dominate this dataset, 68 of which come evenly from Actinobacteria (A) and Firmicutes (F). [Reference: Protein Classification Benchmark Collection. http://hydra.icgeb.trieste.it/benchmark/index.php?page=00] "We use a dataset of 10,662 movie review sentences (Pang & Lee (2005)) labeled as having positive or negative sentiment." [Reference: Pang, Bo and Lee, Lillian. Seeing stars: Exploiting class relationships for sentiment categorization with respect to rating scales. In Proceedings of the 43rd Annual Meeting of the Association for Computational Linguistics, pp. 115-124, 2005.]

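The quoted setup restricts Dracula's candidate dictionary to character n-grams that appear at least twice in the corpus. A minimal sketch of that counting step, assuming nothing beyond the quoted description (the function name and toy sequences are our own illustration; the paper's pipeline is not released):

```python
from collections import Counter

def candidate_ngrams(sequences, n=7, min_count=2):
    """Return character n-grams occurring at least min_count times
    across the corpus; these form the candidate dictionary."""
    counts = Counter()
    for seq in sequences:
        for i in range(len(seq) - n + 1):
            counts[seq[i:i + n]] += 1
    return {g for g, c in counts.items() if c >= min_count}

# Toy stand-ins for the 131 labeled protein sequences.
seqs = ["MKTAYIAKQRQISFVKSHFSRQ", "MKTAYIAKQLESFVKSHFSRQ"]
print(sorted(candidate_ngrams(seqs, n=7)))
```
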
Dataset Splits: Yes
Evidence: "We extracted 100 sentences from each of the training and testing splits of the Reuters dataset (Liu) for 10 authors, i.e. 2,000 total sentences..." and "Table 3 compares the 10-fold CV accuracy of a multinomial naïve-Bayes (NB) classifier..."

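The 10-fold CV protocol with a multinomial naïve-Bayes classifier is standard; a hedged sketch with scikit-learn (the toy sentences and labels below are placeholders, not the paper's 2,000 Reuters author sentences):

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.model_selection import cross_val_score
from sklearn.naive_bayes import MultinomialNB

# Placeholder corpus; duplicated toy sentences only make the 10 folds
# feasible and carry no statistical meaning.
sentences = ["rates rose sharply", "the central bank cut rates",
             "shares fell in london", "oil prices climbed again"] * 5
labels = [0, 0, 1, 1] * 5

X = CountVectorizer().fit_transform(sentences)
scores = cross_val_score(MultinomialNB(), X, labels, cv=10)
print("10-fold CV accuracy: %.3f" % scores.mean())
```
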
Hardware Specification: No
Evidence: The paper mentions using Gurobi as a solver but does not provide specific hardware details such as GPU models, CPU types, or memory used for the experiments.

Software Dependencies: Yes
Evidence: "We used Gurobi (Gurobi Optimization (2015)) to solve the refined LP relaxation of Dracula for all of our experiments."

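Since only the solver is named, a reader reproducing this step would likely call Gurobi through its Python interface, gurobipy. The sketch below solves a toy LP with box-constrained (relaxed 0/1) variables; the costs and constraints are illustrative, not Dracula's actual relaxation, and a Gurobi license is required:

```python
import gurobipy as gp
from gurobipy import GRB

# Toy LP: binary "use this dictionary n-gram" indicators relaxed to [0, 1].
m = gp.Model("toy_lp_relaxation")
x = m.addVars(3, lb=0.0, ub=1.0, name="use")   # relaxed indicators
cost = [1.0, 2.0, 1.5]                          # hypothetical pointer costs
m.setObjective(gp.quicksum(cost[i] * x[i] for i in range(3)), GRB.MINIMIZE)
m.addConstr(x[0] + x[1] >= 1, name="cover_a")   # toy coverage constraints
m.addConstr(x[1] + x[2] >= 1, name="cover_b")
m.optimize()
print([x[i].X for i in range(3)])
```
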
Experiment Setup Yes We limited our parameter tuning to the dictionary pointer cost λ (discussed in the solution path section) as this had the largest effect on performance. Experiments were performed with τ = 0, α = 1, a maximum n-gram length, and only on n-grams that appear at least twice in each corpus. We ran Dracula using 7-grams and λ = 1 on 131 protein sequences... We compare All features to Top features from Dracula and CFL using an ℓ2-regularized SVM with C = 1. We ran Dracula on this representation with 10-grams... We ran Dracula using 5-grams to highlight the utility of Flat features...
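The classifier named in the setup, an ℓ2-regularized SVM with C = 1, can be approximated with scikit-learn's LinearSVC (note LinearSVC uses squared hinge loss by default, so this is a stand-in, not the paper's exact setup). A minimal sketch of the All-versus-Top feature comparison on synthetic data (the arrays are placeholders, not the paper's features):

```python
import numpy as np
from sklearn.model_selection import cross_val_score
from sklearn.svm import LinearSVC

rng = np.random.default_rng(0)
X_all = rng.normal(size=(200, 50))              # stand-in "All" features
y = (X_all[:, :5].sum(axis=1) > 0).astype(int)  # synthetic labels
X_top = X_all[:, :10]                           # stand-in "Top" features

for name, X in (("All", X_all), ("Top", X_top)):
    acc = cross_val_score(LinearSVC(C=1.0), X, y, cv=10).mean()
    print(f"{name} features: 10-fold CV accuracy = {acc:.3f}")
```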