Predicting the Quality of Short Narratives from Social Media
Authors: Tong Wang, Ping Chen, Boyang Li
IJCAI 2017
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We collected 54,484 answers from a crowd-powered question-and-answer website, Quora, and then used active learning to build a classifier that labeled 28,320 answers as stories. To predict the number of upvotes without the use of social network features, we create neural networks that model textual regions and the interdependence among regions, which serve as strong benchmarks for future research. Our best model achieves an 18.10% reduction in mean square error relative to a strong random forest baseline. (The relative-reduction formula is sketched below the table.) |
| Researcher Affiliation | Collaboration | Tong Wang1,2, Ping Chen1, and Boyang Li2 1University of Massachusetts Boston 2Disney Research |
| Pseudocode | No | The paper describes its models using block diagrams and mathematical equations but does not include any pseudocode or algorithm blocks. |
| Open Source Code | No | The paper does not contain any explicit statements about releasing source code or links to a code repository. |
| Open Datasets | No | We collected 54,484 answers from a crowd-powered question-and-answer website, Quora, and then used active learning to build a classifier that labeled 28,320 answers as stories. The paper describes the creation of their dataset from Quora but does not provide concrete access information (e.g., a link or citation for public availability) for this created dataset. |
| Dataset Splits | Yes | We partition the 28,320 story texts into a training set of 21,230 (75%) stories, a validation set of 2,832 (10%) stories, and a test set of 4,258 (15%) stories. (A minimal sketch of this partition appears below the table.) |
| Hardware Specification | No | The paper mentions neural networks and computational models but does not specify any hardware used for running the experiments (e.g., GPU, CPU models, memory). |
| Software Dependencies | No | The paper mentions using the 'word2vec algorithm' and 'logistic classification', but does not provide specific version numbers for any software, libraries, or frameworks used (e.g., Python version, TensorFlow/PyTorch version, scikit-learn version). |
| Experiment Setup | Yes | The first layer contains 32 3-by-5 filters with a horizontal stride of 3, followed by 32 2-by-3 filters with a horizontal stride of 2, and 16 1-by-3 filters with strides of 1. Each filter layer is followed by a ReLU activation function and a max-pooling layer with a 2-by-2 kernel. The two fully connected layers applied to the document embedding contain 128 units each. As the average length of stories is 369.3, we set the region size k_tok to 36 and the number of regions k_reg to 10. The dimension of regional embeddings r_t is set to 10. The dimension of hidden states is set to 100; the magnitude of gradients is clipped at 1. (A hedged PyTorch sketch of this architecture appears below the table.) |
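
The headline number in the Research Type row reads most naturally as a relative reduction, (baseline − model) / baseline. The sketch below assumes that definition; the function name and the example MSE values are illustrative, not taken from the paper.

```python
def mse_reduction(mse_baseline: float, mse_model: float) -> float:
    # Relative reduction in mean square error against the baseline.
    return (mse_baseline - mse_model) / mse_baseline

# Illustrative values only: a model MSE of 0.819 against a baseline MSE of 1.0
# reproduces the quoted 18.10% figure.
print(f"{mse_reduction(1.0, 0.819):.2%}")  # -> 18.10%
```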
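
The Dataset Splits row quotes a 75/10/15 partition. A minimal sketch of such a split, assuming a simple shuffle-and-slice scheme (the paper does not say how the split was drawn, and plain rounding yields 21,240 training stories rather than the quoted 21,230):

```python
import random

def split_stories(stories, seed=0):
    # Shuffle, then slice into 75% train / 10% validation / 15% test.
    # The shuffle and the rounding are assumptions; the published counts
    # are 21,230 / 2,832 / 4,258 for 28,320 stories.
    rng = random.Random(seed)
    shuffled = list(stories)
    rng.shuffle(shuffled)
    n = len(shuffled)
    n_train = round(0.75 * n)
    n_val = round(0.10 * n)
    return (shuffled[:n_train],
            shuffled[n_train:n_train + n_val],
            shuffled[n_train + n_val:])
```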
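
The Experiment Setup row pins the convolutional stack down well enough to sketch. Below is a minimal PyTorch rendering under stated assumptions: the input is treated as a 1-channel 2-D grid of shape (embedding_dim × tokens), vertical strides default to 1 where only a horizontal stride is quoted, and the scalar regression head on top of the two 128-unit layers is an assumption (the paper predicts upvote counts, but its exact output layer is not quoted here).

```python
import torch
import torch.nn as nn

class RegionCNN(nn.Module):
    """Sketch of the convolutional stack quoted in the Experiment Setup row.

    Kernel shapes, filter counts, and horizontal strides follow the quote;
    the 1-channel (embedding_dim x tokens) input layout, the vertical
    strides of 1, and the final regression head are assumptions.
    """

    def __init__(self):
        super().__init__()
        self.features = nn.Sequential(
            # 32 filters of size 3x5, horizontal stride 3 (vertical stride 1 assumed)
            nn.Conv2d(1, 32, kernel_size=(3, 5), stride=(1, 3)),
            nn.ReLU(),
            nn.MaxPool2d(kernel_size=(2, 2)),
            # 32 filters of size 2x3, horizontal stride 2
            nn.Conv2d(32, 32, kernel_size=(2, 3), stride=(1, 2)),
            nn.ReLU(),
            nn.MaxPool2d(kernel_size=(2, 2)),
            # 16 filters of size 1x3, stride 1
            nn.Conv2d(32, 16, kernel_size=(1, 3), stride=1),
            nn.ReLU(),
            nn.MaxPool2d(kernel_size=(2, 2)),
        )
        self.regressor = nn.Sequential(
            nn.Flatten(),
            # Two fully connected layers of 128 units each, as quoted;
            # LazyLinear infers the flattened size on the first forward pass.
            nn.LazyLinear(128),
            nn.ReLU(),
            nn.Linear(128, 128),
            nn.ReLU(),
            # Scalar output for the upvote-count regression (assumed head).
            nn.Linear(128, 1),
        )

    def forward(self, x):
        return self.regressor(self.features(x))
```

With k_tok = 36 and k_reg = 10, a document covers 360 token positions, so `RegionCNN()(torch.randn(4, 1, 300, 360))` returns a `(4, 1)` tensor for a batch of four documents with (assumed) 300-dimensional word embeddings.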