Efficient Learning by Directed Acyclic Graph For Resource Constrained Prediction

Authors: Joseph Wang, Kirill Trapeznikov, Venkatesh Saligrama

NeurIPS 2015 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | To demonstrate the performance of our DAG sensor acquisition system, we provide experimental results on data sets previously used in budgeted learning. Three data sets previously used for budget cascades [19, 23] are tested. In these data sets, examples are composed of a small number of sensors (under 4 sensors). To compare performance, we apply the LP approach to learning sensor trees [20] and construct trees containing all subsets of sensors as opposed to fixed-order cascades [19, 23]. Next, we examine performance of the DAG system using 3 higher-dimensional sets of data previously used to compare budgeted learning performance [11]. In these cases, the dimensionality of the data (between 50 and 400 features) makes exhaustive subset construction computationally infeasible. We greedily construct sensor subsets using Alg. 2, then learn a DAG over all unions of these sensor subsets. We compare performance with CSTC [25] and ASTC [11].
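As an illustrative aside (not part of the paper), the sketch below shows why exhaustive subset construction is feasible for the first group of data sets (under 4 sensors) but not for the 50-400 feature sets: a complete tree or DAG needs one state per sensor subset (2^M of them), whereas a fixed-order cascade only visits the M + 1 prefix subsets. The function names are hypothetical.

    # Minimal sketch: state counts for exhaustive subsets vs. a fixed-order cascade.
    from itertools import combinations

    def all_sensor_subsets(num_sensors):
        # Every subset of {0, ..., num_sensors - 1}: 2^num_sensors states.
        sensors = range(num_sensors)
        return [frozenset(c) for r in range(num_sensors + 1)
                for c in combinations(sensors, r)]

    def cascade_prefixes(num_sensors):
        # The num_sensors + 1 prefix subsets a fixed-order cascade can reach.
        return [frozenset(range(k)) for k in range(num_sensors + 1)]

    print(len(all_sensor_subsets(4)))  # 16 states: exhaustive construction is easy
    print(len(cascade_prefixes(4)))    # 5 states for a fixed-order cascade
    print(2 ** 400)                    # state count if all subsets of 400 features were enumerated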
Researcher Affiliation | Collaboration | Joseph Wang, Department of Electrical & Computer Engineering, Boston University, Boston, MA 02215, joewang@bu.edu; Kirill Trapeznikov, Systems & Technology Research, Woburn, MA 01801, kirill.trapeznikov@stresearch.com; Venkatesh Saligrama, Department of Electrical & Computer Engineering, Boston University, Boston, MA 02215, srv@bu.edu
Pseudocode | Yes | Algorithm 1: Graph Reduce Algorithm
Open Source Code | No | The paper does not provide concrete access to source code for the described methodology.
Open Datasets | Yes | Three data sets previously used for budget cascades [19, 23] are tested. Next, we examine performance of the DAG system using 3 higher-dimensional sets of data previously used to compare budgeted learning performance [11]. In these cases, the dimensionality of the data (between 50 and 400 features) makes exhaustive subset construction computationally infeasible. We greedily construct sensor subsets using Alg. 2, then learn a DAG over all unions of these sensor subsets. We compare performance with CSTC [25] and ASTC [11]. For all experiments, we use cost-sensitive filter trees [2], where each binary classifier in the tree is learned using logistic regression. Homogeneous polynomials are used as decision functions in the filter trees. For all experiments, uniform sensor costs were varied in the range [0, M] to achieve systems with different budgets. Performance between the systems is compared by plotting the average number of features acquired during test-time vs. the average test error. We compare performance of our trained DAG with that of a complete tree trained using an LP surrogate [20] on the landsat, pima, and letter datasets. Next, we compare performance of our trained DAG with that of CSTC [25] and ASTC [11] for the MiniBooNE, Forest, and CIFAR datasets.
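As an illustrative aside (not the authors' code), the sketch below mirrors the evaluation protocol described above: sweep a uniform per-sensor cost over [0, M], retrain the acquisition system at each cost, and record the (average number of features acquired, average test error) pairs that form the budget/accuracy trade-off curve. The `train_system` and `evaluate` callables are hypothetical stand-ins for the unspecified training and test harness.

    # Minimal sketch of the budget sweep; train_system(cost) and evaluate(model)
    # are hypothetical stand-ins for the paper's (unreleased) training/evaluation code.
    import numpy as np

    def budget_sweep(train_system, evaluate, num_sensors, num_points=10):
        # train_system(cost) -> model; evaluate(model) -> (features_acquired, errors),
        # each an array with one entry per test example.
        curve = []
        for cost in np.linspace(0.0, num_sensors, num_points):
            model = train_system(cost)
            acquired, errors = evaluate(model)
            curve.append((float(np.mean(acquired)), float(np.mean(errors))))
        return curve  # points for the "features acquired vs. test error" plot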
Dataset Splits | Yes | We use the validation data to find the homogeneous polynomial that gives the best classification performance using all features (MiniBooNE: linear, Forest: 2nd order, CIFAR: 3rd order).
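As an illustrative aside, a validation-based selection of the homogeneous-polynomial order might look like the sketch below. It assumes standard scikit-learn logistic regression rather than the authors' cost-sensitive filter-tree implementation, and builds only the degree-d monomials (no lower-order terms), which is what "homogeneous" implies.

    # Minimal sketch: pick the homogeneous polynomial order on a validation split.
    # Uses scikit-learn logistic regression as a stand-in classifier (an assumption).
    import numpy as np
    from itertools import combinations_with_replacement
    from sklearn.linear_model import LogisticRegression

    def homogeneous_poly_features(X, degree):
        # Only the degree-`degree` monomials of the raw features, no lower-order terms.
        cols = [np.prod(X[:, list(idx)], axis=1)
                for idx in combinations_with_replacement(range(X.shape[1]), degree)]
        return np.stack(cols, axis=1)

    def select_order(X_train, y_train, X_val, y_val, degrees=(1, 2, 3)):
        # Return the polynomial order with the best validation accuracy.
        best_degree, best_acc = None, -1.0
        for d in degrees:
            clf = LogisticRegression(max_iter=1000)
            clf.fit(homogeneous_poly_features(X_train, d), y_train)
            acc = clf.score(homogeneous_poly_features(X_val, d), y_val)
            if acc > best_acc:
                best_degree, best_acc = d, acc
        return best_degree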
Hardware Specification | No | The paper does not provide specific hardware details (e.g., GPU/CPU models, memory) used for running experiments.
Software Dependencies | No | The paper mentions using 'logistic regression' and 'cost sensitive filter trees' but does not specify software dependencies with version numbers.
Experiment Setup | Yes | 3rd-order homogeneous polynomials are used for both the classification and system functions in the LP and DAG. For each data set, Alg. 2 was used to find 7 subsets, with an 8th subset of all features added. An exhaustive DAG was trained over all unions of these 8 subsets. We use the validation data to find the homogeneous polynomial that gives the best classification performance using all features (MiniBooNE: linear, Forest: 2nd order, CIFAR: 3rd order). These polynomial functions are then used for all classification and policy functions.
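As an illustrative aside (assumed, not the authors' code), the state space of the exhaustive DAG described above can be enumerated as all unions of the 8 base subsets; with 8 subsets there are at most 2^8 = 256 distinct unions, so the DAG stays small even when each subset covers many of the 50-400 raw features.

    # Minimal sketch: enumerate every union of the (greedily found) base subsets.
    from itertools import combinations

    def all_unions(base_subsets):
        # All distinct unions of the given subsets, including the empty union.
        states = set()
        for r in range(len(base_subsets) + 1):
            for combo in combinations(base_subsets, r):
                states.add(frozenset().union(*combo))
        return states

    # Example with 3 toy sensor subsets over 6 features: 8 distinct union states.
    print(len(all_unions([frozenset({0, 1}), frozenset({2, 3}), frozenset({4, 5})])))  # 8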