CAWA: An Attention-Network for Credit Attribution
Authors: Saurav Manchanda, George Karypis
AAAI 2020, pp. 8472-8479
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experiments on the credit attribution task on a variety of datasets show that the sentence class labels generated by CAWA outperform the competing approaches. Additionally, on the multilabel text classification task, CAWA performs better than the competing credit attribution approaches. |
| Researcher Affiliation | Academia | Saurav Manchanda, George Karypis University of Minnesota, Twin Cities, USA {manch043, karypis}@umn.edu |
| Pseudocode | No | The paper does not include a pseudocode block or algorithm. |
| Open Source Code | Yes | Our code and data are available at https://github.com/gurdaspuriya/cawa. |
| Open Datasets | Yes | We performed experiments on five multilabel text datasets from different domains: Movies (Bamman, O'Connor, and Smith 2014), Ohsumed (Hersh et al. 1994), TMC2007, Patents, Delicious (Zubiaga et al. 2009). |
| Dataset Splits | Yes | For both the credit attribution and multilabel classification tasks, we used the same training and test dataset split as used in (Manchanda and Karypis 2018). For the credit attribution, the test dataset is synthetic, and each test document corresponds to multiple single-label documents concatenated together (thus giving us ground-truth segment labels for a document). Additionally, we use a validation dataset, created in the same manner as the test dataset, for hyperparameter selection. (A sketch of this concatenation procedure follows the table.) |
| Hardware Specification | No | The paper does not specify any particular hardware (e.g., GPU models, CPU models, memory) used for running the experiments. |
| Software Dependencies | No | The paper mentions 'scikit-multilearn' and 'ADAM' as software used, but does not provide specific version numbers for these or any other key software components. |
| Experiment Setup | Yes | For CAWA, DNN+A, and DNN-A, the number of nodes in each of the hidden layers, the length of all representations, as well as the batch size for training CAWA were set to 256. For regularization, we used a dropout (Srivastava et al. 2014) of 0.5 between all layers, except the output layer. For optimization, we used the ADAM (Kingma and Ba 2014) optimizer. We trained all the models for 100 epochs, with the learning rate set to 0.001. ... For average pooling in CAWA, we fixed the kernel size to three. (A configuration sketch based on these values follows the table.) |
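
The Dataset Splits row describes test documents built by concatenating single-label documents so that ground-truth segment labels are known. Below is a minimal sketch of such a construction, assuming a simple (text, label) corpus representation; the function name, field layout, and `docs_per_sample` parameter are illustrative assumptions, not taken from the authors' code.

```python
# Minimal sketch: build a synthetic multi-label test document by concatenating
# single-label documents, keeping each segment's label as ground truth.
# The (text, label) corpus format and all names here are assumptions.
import random
from typing import List, Tuple

def make_synthetic_document(
    corpus: List[Tuple[str, str]],   # (document text, single class label)
    docs_per_sample: int = 3,        # how many single-label docs to concatenate
) -> Tuple[str, List[Tuple[int, int, str]]]:
    """Return the concatenated text and (start, end, label) spans per segment."""
    chosen = random.sample(corpus, docs_per_sample)
    parts, spans, offset = [], [], 0
    for text, label in chosen:
        parts.append(text)
        spans.append((offset, offset + len(text), label))
        offset += len(text) + 1      # account for the joining space
    return " ".join(parts), spans
```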
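
The Experiment Setup row reports concrete hyperparameter values. The following sketch wires those values into a PyTorch-style setup; the framework choice, the placeholder model, and `num_labels` are assumptions (only the hyperparameter values come from the paper).

```python
# Sketch of the reported hyperparameters; the PyTorch framework and the
# placeholder model are assumptions, only the values come from the paper.
import torch
import torch.nn as nn

config = {
    "hidden_size": 256,     # nodes per hidden layer / representation length
    "batch_size": 256,
    "dropout": 0.5,         # applied between all layers except the output layer
    "epochs": 100,
    "learning_rate": 1e-3,  # ADAM optimizer
    "avg_pool_kernel": 3,   # average-pooling kernel size in CAWA
}

num_labels = 20  # illustrative placeholder; depends on the dataset

# Placeholder feed-forward stack standing in for the CAWA architecture.
model = nn.Sequential(
    nn.Linear(config["hidden_size"], config["hidden_size"]),
    nn.ReLU(),
    nn.Dropout(config["dropout"]),
    nn.Linear(config["hidden_size"], num_labels),
)
avg_pool = nn.AvgPool1d(kernel_size=config["avg_pool_kernel"], stride=1, padding=1)
optimizer = torch.optim.Adam(model.parameters(), lr=config["learning_rate"])

# Training loop outline (data loading omitted):
# for epoch in range(config["epochs"]):
#     for batch in loader:            # batches of size config["batch_size"]
#         ...
```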