Efficient Loss-Based Decoding on Graphs for Extreme Classification
Authors: Itay Evron, Edward Moroshko, Koby Crammer
NeurIPS 2018
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Our experimental study demonstrates the validity of our assumptions and claims, and shows that our method is competitive with state-of-the-art algorithms. |
| Researcher Affiliation | Academia | Itay Evron, Computer Science Dept., The Technion, Israel (evron.itay@gmail.com); Edward Moroshko, Electrical Engineering Dept., The Technion, Israel (edward.moroshko@gmail.com); Koby Crammer, Electrical Engineering Dept., The Technion, Israel (koby@ee.technion.ac.il) |
| Pseudocode | No | The paper describes algorithms in narrative text and mathematical equations but does not include any formal pseudocode blocks or algorithm listings. |
| Open Source Code | Yes | Code is available online at https://github.com/ievron/wltls/ |
| Open Datasets | Yes | We test our algorithms on 5 extreme multiclass datasets previously used in [15], having approximately 10^2, 10^3, and 10^4 classes (see Table 1 in Appendix E.1). |
| Dataset Splits | No | We tune the threshold λ so that the degradation in the multiclass validation accuracy is at most 1% (tuning the threshold is done after the cumbersome learning of the weights, and does not require much time). The paper mentions "multiclass validation accuracy" but does not specify how the data was split for validation (e.g., percentages, sample counts, or a reference to standard splits); a hedged sketch of this kind of threshold tuning appears below the table. |
| Hardware Specification | No | The paper does not provide any specific hardware details such as CPU/GPU models, memory specifications, or cloud computing instance types used for running the experiments. |
| Software Dependencies | No | We use AROW [10] to train the binary functions {f_j}_{j=1}^ℓ of W-LTLS. The paper mentions AROW and programming languages (Python, C++) but does not provide specific version numbers for any software dependencies or libraries, which are necessary for reproducibility; a hedged sketch of the AROW update appears below the table. |
| Experiment Setup | No | For each dataset, we build wide graphs with multiple slice widths. For each configuration (dataset and graph) we perform five runs, using random sample shuffling on every epoch and a random path assignment (as explained in Section 4.1, unlike the greedy policy used in [19]), and report averages over these five runs. The paper describes general aspects of the experimental setup but does not provide specific hyperparameter values (e.g., learning rate, batch size, number of epochs, optimizer settings) for the models trained. |
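
The threshold tuning quoted under Dataset Splits can be made concrete with a short sketch. This is a reconstruction under stated assumptions, not the authors' code: the `validate` callback, the candidate grid of λ values, and the dense weight layout are all hypothetical, and only the acceptance rule (at most 1% degradation in validation accuracy) comes from the paper.

```python
import numpy as np

def tune_threshold(weights, validate, base_accuracy, max_drop=0.01):
    """Pick the largest pruning threshold lambda whose validation-accuracy
    degradation stays within max_drop (1% in the paper).

    `weights` (learned weight matrix) and `validate` (maps a pruned weight
    matrix to validation accuracy) are hypothetical names, as is the
    linearly spaced candidate grid.
    """
    best_lam = 0.0
    for lam in np.linspace(0.0, np.abs(weights).max(), num=100):
        pruned = np.where(np.abs(weights) >= lam, weights, 0.0)
        if base_accuracy - validate(pruned) <= max_drop:
            best_lam = lam   # still within the 1% budget; try a larger lambda
        else:
            break            # degradation exceeded the budget; stop searching
    return best_lam
```

Because larger thresholds prune more weights, scanning the grid upward and stopping at the first violation is a reasonable heuristic, though accuracy need not be strictly monotone in λ.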
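The AROW learner cited under Software Dependencies is the adaptive-regularization-of-weights algorithm of Crammer et al. [10]. Below is a minimal full-covariance sketch for training a single binary function f_j; the regularization parameter `r`, the single-epoch default, and the dense NumPy layout are assumptions, since the excerpt does not pin these down (high-dimensional extreme-classification data would more plausibly use a diagonal or sparse variant).

```python
import numpy as np

def arow_fit(X, y, r=1.0, epochs=1, seed=0):
    """Full-covariance AROW for one binary classifier; y has labels in {-1, +1}.

    Keeps a mean weight vector w and a confidence matrix Sigma, updated on
    every example that suffers hinge loss. `r`, `epochs`, and `seed` are
    assumed defaults, not values from the paper.
    """
    rng = np.random.default_rng(seed)
    n, d = X.shape
    w = np.zeros(d)        # mean of the weight distribution
    sigma = np.eye(d)      # confidence (covariance) over the weights
    for _ in range(epochs):
        for i in rng.permutation(n):          # random sample shuffling per epoch
            x, label = X[i], y[i]
            margin = label * (w @ x)
            if margin < 1.0:                  # non-zero hinge loss: update
                sigma_x = sigma @ x
                beta = 1.0 / (x @ sigma_x + r)
                alpha = (1.0 - margin) * beta
                w += alpha * label * sigma_x                 # mean update
                sigma -= beta * np.outer(sigma_x, sigma_x)   # shrink confidence
    return w
```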