Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].

AffectiveTweets: a Weka Package for Analyzing Affect in Tweets

Authors: Felipe Bravo-Marquez, Eibe Frank, Bernhard Pfahringer, Saif M. Mohammad

JMLR 2019

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "For demonstration, we benchmark AffectiveTweets against similar and equivalent models built using the NLTK sentiment analysis module and Scikit-learn (Pedregosa et al., 2011) on the dataset from the SemEval-2013 Sentiment Analysis in Twitter Message Polarity Classification task (Nakov et al., 2013). Classification results on the testing partition and execution times are shown in Table 1."
Researcher Affiliation | Academia | Felipe Bravo-Marquez (EMAIL), Department of Computer Science, University of Chile & IMFD, Santiago, Chile; Eibe Frank (EMAIL) and Bernhard Pfahringer (EMAIL), Department of Computer Science, University of Waikato, Hamilton, New Zealand; Saif M. Mohammad (EMAIL), National Research Council Canada, Ottawa, ON, Canada
Pseudocode | No | The paper describes the functionalities of the package as Weka filters and explains how they work (e.g., 'The TweetToSparseFeatureVector filter calculates several sparse features for every tweet'), but it does not include any structured pseudocode or algorithm blocks.
Open Source Code | Yes | The software is implemented as a Weka package that can be installed with the Weka package manager. It can be accessed through Weka's GUIs or the command-line interface. It is licensed under the GNU General Public License, Version 3, and hosted on GitHub.
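As a sketch of the installation route described above, Weka's standard package manager can typically be invoked from the command line as follows (the package name AffectiveTweets and the weka.jar classpath are assumed here, not quoted from the paper):

```shell
# Install the AffectiveTweets package using Weka's built-in package
# manager; adjust the path to weka.jar for your local installation.
java -cp weka.jar weka.core.WekaPackageManager -install-package AffectiveTweets
```

After installation, the package's filters become available in Weka's GUIs and command-line interface like any other Weka package.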
Open Datasets | Yes | The package was used by several teams in the shared tasks: EmoInt 2017 and Affect in Tweets SemEval-2018 Task 1. ... This list includes AFINN (Årup Nielsen, 2011), the Sentiment140 lexicon (Kiritchenko et al., 2014), and others. ... "For demonstration, we benchmark AffectiveTweets against similar and equivalent models built using the NLTK sentiment analysis module and Scikit-learn (Pedregosa et al., 2011) on the dataset from the SemEval-2013 Sentiment Analysis in Twitter Message Polarity Classification task (Nakov et al., 2013)."
Dataset Splits | Yes | "For demonstration, we benchmark AffectiveTweets against similar and equivalent models built using the NLTK sentiment analysis module and Scikit-learn (Pedregosa et al., 2011) on the dataset from the SemEval-2013 Sentiment Analysis in Twitter Message Polarity Classification task (Nakov et al., 2013). Classification results on the testing partition and execution times are shown in Table 1."
Hardware Specification | No | The paper reports execution times in Table 1, but it does not provide any specific hardware details such as CPU, GPU, or memory specifications used for running the experiments.
Software Dependencies | No | The paper mentions several software packages and libraries used, such as 'Weka (Hall et al., 2009)', 'Weka Deeplearning4j package (Lang et al., 2019)', 'Tweet NLP library (Gimpel et al., 2011)', 'NLTK (Bird and Loper, 2004)', and 'Scikit-learn (Pedregosa et al., 2011)'. However, it does not provide specific version numbers for these components.
Experiment Setup | No | Table 1 states 'Each model consists of a logistic regression trained on the corresponding features.' However, the paper does not provide specific hyperparameters for the logistic regression or other detailed training configurations such as learning rates, batch sizes, or number of epochs.
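To make the kind of setup the table row refers to concrete, here is a minimal scikit-learn sketch of a "logistic regression trained on the corresponding features" baseline. The toy tweets, labels, and feature choice (word n-gram counts) are illustrative assumptions, not the paper's SemEval-2013 data or the exact features compared in Table 1:

```python
# Hedged sketch of a logistic-regression text-polarity baseline,
# analogous in shape to the paper's scikit-learn comparison models.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Illustrative toy data standing in for a tweet polarity corpus.
train_tweets = ["I love this phone", "great game tonight",
                "worst service ever", "this is terrible"]
train_labels = ["positive", "positive", "negative", "negative"]

# Word unigram/bigram counts feeding a logistic regression classifier.
model = make_pipeline(CountVectorizer(ngram_range=(1, 2)),
                      LogisticRegression(max_iter=1000))
model.fit(train_tweets, train_labels)

print(model.predict(["love the great phone"]))
```

This also illustrates the reproducibility gap the row flags: choices such as the regularization strength of `LogisticRegression` are left at library defaults here, exactly the kind of hyperparameter detail the paper does not report.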