Discovering User Attribute Stylistic Differences via Paraphrasing

Authors: Daniel Preotiuc-Pietro, Wei Xu, Lyle Ungar

AAAI 2016

Reproducibility assessment: each variable below lists the result and the LLM's supporting response.

Research Type: Experimental
LLM response: "We first assess the predictive performance of user attributes using paraphrase pairs, differentiating between topical and stylistic influences. Results show significant stylistic differences between user traits, and these still hold significant predictive power, albeit to a lesser extent than topical content. We then measure how well our method captures meaningful stylistic differences between all author traits by comparing it with human judgements, showing that our scores match with human perception."

Researcher Affiliation: Academia
LLM response: "Computer & Information Science, University of Pennsylvania. danielpr@sas.upenn.edu, {xwe,ungar}@cis.upenn.edu"

Pseudocode: No
LLM response: "The paper does not contain any structured pseudocode or algorithm blocks."

Open Source Code: No
LLM response: "The paper states 'All paraphrase user trait scores described in this study are openly available' with a footnoted link, but does not explicitly state that the source code for the methodology is available."

Open Datasets: Yes
LLM response: "We use a data set consisting of 104,500,740 posts from 67,337 Twitter users... The gold gender labels are obtained from the users' self-reported information in their linked accounts on other networks such as Facebook or MySpace, a method used in (Burger et al. 2011; Volkova, Wilson, and Yarowsky 2013). This data set... was introduced in (Preotiuc-Pietro, Lampos, and Aletras 2015). The age data set... is identified by mining posts... (Volkova and Bachrach 2015)."

Dataset Splits: No
LLM response: "The paper mentions 'We randomly select 80% of the users to build probabilities for each phrase and keep 20% for use in evaluating prediction accuracy,' but does not describe a separate validation split or its size."

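The quoted split is user-level rather than post-level. A minimal sketch of such an 80/20 user split, assuming only what the quote states; the function name, seed, and variables here are illustrative, not from the paper:

```python
import random

def split_users(user_ids, train_frac=0.8, seed=0):
    """Randomly split users (not individual posts) into train and test sets.

    The paper reports an 80/20 user-level split and describes no separate
    validation set, so none is created here.
    """
    rng = random.Random(seed)
    users = list(user_ids)
    rng.shuffle(users)
    cut = int(train_frac * len(users))
    return users[:cut], users[cut:]

# Phrase probabilities are estimated from the training users' posts;
# prediction accuracy is then evaluated on the held-out test users.
train_users, test_users = split_users(range(67337))  # 67,337 users in the paper
```
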
Hardware Specification: No
LLM response: "The paper does not provide any specific hardware details such as GPU models, CPU types, or cloud resources used for running experiments."

Software Dependencies: No
LLM response: "The paper mentions software tools like 'langid.py', the 'Stanford tagger', the 'Paraphrase Database (PPDB) 2.0', and 'Moses', but does not provide specific version numbers for these software dependencies."

Experiment Setup: Yes
LLM response: "We use paraphrase pairs that have an equivalence probability of at least 0.2. Given a paraphrase pair, we use phrase (here, 1-3 grams) occurrence statistics... To score which user attribute... we compute the scores Male(w) and Female(w)... We use the Naïve Bayes classifier to assign a score to each user... We compare the performance of the full model using all phrase scores (with a relative frequency of over 10^-5 in each dataset)..."
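
The setup quoted above implies, for a binary trait such as gender, a relative-frequency score per phrase plus a Naive-Bayes-style combination over each user's phrases. Below is a minimal sketch under those assumptions; the paper's exact score definitions, smoothing, and n-gram extraction are not given in this excerpt, so the formulation Male(w) = p(w|male) / (p(w|male) + p(w|female)) and the names phrase_scores and user_score are illustrative:

```python
from collections import Counter
import math

def phrase_scores(male_posts, female_posts, min_rel_freq=1e-5):
    """Score each phrase by how strongly its usage skews male vs. female.

    Assumed form: Male(w) = p(w|male) / (p(w|male) + p(w|female)), where
    p(w|.) is the phrase's relative frequency in that group's posts.
    Phrases below the 10^-5 relative-frequency threshold in either
    dataset are dropped, mirroring the filter quoted above.
    """
    male_counts = Counter(w for post in male_posts for w in post)
    female_counts = Counter(w for post in female_posts for w in post)
    m_total = sum(male_counts.values())
    f_total = sum(female_counts.values())
    scores = {}
    for w in set(male_counts) | set(female_counts):
        p_m = male_counts[w] / m_total
        p_f = female_counts[w] / f_total
        if p_m < min_rel_freq or p_f < min_rel_freq:
            continue  # too rare in at least one dataset
        scores[w] = p_m / (p_m + p_f)  # Female(w) = 1 - Male(w)
    return scores

def user_score(user_phrases, scores):
    """Naive-Bayes-style log-odds score aggregated over one user's phrases."""
    log_odds = 0.0
    for w in user_phrases:
        if w in scores:  # kept scores lie strictly between 0 and 1
            log_odds += math.log(scores[w]) - math.log(1.0 - scores[w])
    return log_odds  # positive leans male under these assumptions

```

Prediction accuracy would then be measured by thresholding user_score at zero on the held-out 20% of users; the equivalence-probability filter on paraphrase pairs (at least 0.2) would be applied upstream, when selecting which phrases to score.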