Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].
AffectiveTweets: a Weka Package for Analyzing Affect in Tweets
Authors: Felipe Bravo-Marquez, Eibe Frank, Bernhard Pfahringer, Saif M. Mohammad
JMLR 2019
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | For demonstration, we benchmark AffectiveTweets against similar and equivalent models built using the NLTK sentiment analysis module and Scikit-learn (Pedregosa et al., 2011) on the dataset from the SemEval-2013 Sentiment Analysis in Twitter Message Polarity Classification task (Nakov et al., 2013). Classification results on the testing partition and execution times are shown in Table 1. |
| Researcher Affiliation | Academia | Felipe Bravo-Marquez, Department of Computer Science, University of Chile & IMFD, Santiago, Chile; Eibe Frank and Bernhard Pfahringer, Department of Computer Science, University of Waikato, Hamilton, New Zealand; Saif M. Mohammad, National Research Council Canada, Ottawa, ON, Canada |
| Pseudocode | No | The paper describes the functionalities of the package as Weka filters and explains how they work (e.g., 'The TweetToSparseFeatureVector filter calculates several sparse features for every tweet'), but it does not include any structured pseudocode or algorithm blocks. |
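The TweetToSparseFeatureVector filter quoted above maps each tweet to a sparse vector of token-based features. The package itself is implemented in Java for Weka; as a purely illustrative sketch of the general idea (not the package's actual code, and omitting its richer feature types such as character n-grams, POS tags, and Brown clusters), a minimal word n-gram extractor might look like:

```python
from collections import Counter

def ngram_features(tweet, max_n=2):
    """Map a tweet to a sparse dict of word n-gram counts.

    Illustrative only: the real TweetToSparseFeatureVector filter
    (Java/Weka) supports many more feature types and tweet-aware
    tokenization; this sketch just splits on whitespace.
    """
    tokens = tweet.lower().split()
    feats = Counter()
    for n in range(1, max_n + 1):
        for i in range(len(tokens) - n + 1):
            feats["NGRAM=" + "_".join(tokens[i:i + n])] += 1
    return dict(feats)

# e.g. unigrams and bigrams for a short tweet
features = ngram_features("I love this phone")
```

Keeping the features as a dict of nonzero entries mirrors the sparse representation Weka uses, which matters for the large vocabularies typical of Twitter data.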
| Open Source Code | Yes | The software is implemented as a Weka package that can be installed with the Weka package manager. It can be accessed through Weka's GUIs or the command-line interface. It is licensed under the GNU General Public License, Version 3, and hosted on GitHub. |
| Open Datasets | Yes | The package was used by several teams in the shared tasks: EmoInt 2017 and Affect in Tweets, SemEval-2018 Task 1. ... This list includes AFINN (Årup Nielsen, 2011), the Sentiment140 lexicon (Kiritchenko et al., 2014), and others. ... For demonstration, we benchmark AffectiveTweets against similar and equivalent models built using the NLTK sentiment analysis module and Scikit-learn (Pedregosa et al., 2011) on the dataset from the SemEval-2013 Sentiment Analysis in Twitter Message Polarity Classification task (Nakov et al., 2013). |
| Dataset Splits | Yes | For demonstration, we benchmark AffectiveTweets against similar and equivalent models built using the NLTK sentiment analysis module and Scikit-learn (Pedregosa et al., 2011) on the dataset from the SemEval-2013 Sentiment Analysis in Twitter Message Polarity Classification task (Nakov et al., 2013). Classification results on the testing partition and execution times are shown in Table 1. |
| Hardware Specification | No | The paper reports execution times in Table 1, but it does not provide any specific hardware details such as CPU, GPU, or memory specifications used for running the experiments. |
| Software Dependencies | No | The paper mentions several software packages and libraries used, such as 'Weka (Hall et al., 2009)', 'Weka Deeplearning4j package (Lang et al., 2019)', 'Tweet NLP library (Gimpel et al., 2011)', 'NLTK (Bird and Loper, 2004)', and 'Scikit-learn (Pedregosa et al., 2011)'. However, it does not provide specific version numbers for these components. |
| Experiment Setup | No | Table 1 states 'Each model consists of a logistic regression trained on the corresponding features.' However, the paper does not provide specific hyperparameters for the logistic regression or other detailed training configurations such as learning rates, batch sizes, or number of epochs. |
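The paper specifies only that each benchmark model is a logistic regression trained on the corresponding features, without reporting hyperparameters. A minimal stdlib-only sketch of such a model over sparse feature dicts is shown below; the learning rate and epoch count are placeholder assumptions, not values from the paper, and the original benchmarks would instead use Weka's or scikit-learn's built-in solvers.

```python
import math

def train_logistic_regression(X, y, lr=0.1, epochs=200):
    """Binary logistic regression via stochastic gradient descent.

    X: list of sparse feature dicts (feature name -> value),
    y: list of 0/1 labels. lr and epochs are illustrative
    placeholders; the paper does not report its hyperparameters.
    """
    w = {}
    b = 0.0
    for _ in range(epochs):
        for feats, label in zip(X, y):
            z = b + sum(w.get(f, 0.0) * v for f, v in feats.items())
            p = 1.0 / (1.0 + math.exp(-z))
            g = p - label  # gradient of the log loss w.r.t. z
            b -= lr * g
            for f, v in feats.items():
                w[f] = w.get(f, 0.0) - lr * g * v
    return w, b

def predict(w, b, feats):
    """Return the 0/1 class with a decision threshold at z = 0."""
    z = b + sum(w.get(f, 0.0) * v for f, v in feats.items())
    return 1 if z > 0 else 0
```

Storing the weight vector as a dict keeps the model sparse, matching the sparse n-gram and lexicon features the paper's models are trained on.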