Predicting the Quality of Short Narratives from Social Media
Authors: Tong Wang, Ping Chen, Boyang Li
IJCAI 2017
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We collected 54,484 answers from a crowd-powered question-and-answer website, Quora, and then used active learning to build a classifier that labeled 28,320 answers as stories. To predict the number of upvotes without the use of social network features, we create neural networks that model textual regions and the interdependence among regions, which serve as strong benchmarks for future research. Our best model achieves an 18.10% reduction in mean square error relative to a strong random forest baseline. (The relative-reduction formula is sketched below the table.) |
| Researcher Affiliation | Collaboration | Tong Wang1,2, Ping Chen1, and Boyang Li2 1University of Massachusetts Boston 2Disney Research |
| Pseudocode | No | The paper describes its models using block diagrams and mathematical equations but does not include any pseudocode or algorithm blocks. |
| Open Source Code | No | The paper does not contain any explicit statements about releasing source code or links to a code repository. |
| Open Datasets | No | We collected 54,484 answers from a crowd-powered question-and-answer website, Quora, and then used active learning to build a classifier that labeled 28,320 answers as stories. The paper describes the creation of their dataset from Quora but does not provide concrete access information (e.g., a link or citation for public availability) for this created dataset. |
| Dataset Splits | Yes | We partition the 28,320 story texts into a training set of 21,230 (75%) stories, a validation set of 2,832 (10%) stories, and a test set of 4,258 (15%) stories. (A minimal sketch of this partition appears below the table.) |
| Hardware Specification | No | The paper mentions neural networks and computational models but does not specify any hardware used for running the experiments (e.g., GPU, CPU models, memory). |
| Software Dependencies | No | The paper mentions using the 'word2vec algorithm' and 'logistic classification', but does not provide specific version numbers for any software, libraries, or frameworks used (e.g., Python version, TensorFlow/PyTorch version, scikit-learn version). |
| Experiment Setup | Yes | The first layer contains 32 3-by-5 filters with a horizontal stride of 3, followed by 32 2-by-3 filters with a horizontal stride of 2, and 16 1-by-3 filters with strides of 1. Each filter layer is followed by a ReLU activation function and a max-pooling layer with a 2-by-2 kernel. The two fully connected layers applied to the document embedding contain 128 units each. As the average length of stories is 369.3, we set the region size k_tok to 36 and the number of regions k_reg to 10. The dimension of regional embeddings r_t is set to 10. The dimension of hidden states is set to 100; the magnitude of gradients is clipped at 1. (A hedged PyTorch sketch of this architecture appears below the table.) |
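
The headline number in the Research Type row reads most naturally as a relative reduction, (baseline − model) / baseline. The sketch below assumes that definition; the function name and the example MSE values are illustrative, not taken from the paper.

```python
def mse_reduction(mse_baseline: float, mse_model: float) -> float:
    # Relative reduction in mean square error against the baseline.
    return (mse_baseline - mse_model) / mse_baseline

# Illustrative values only: a model MSE of 0.819 against a baseline MSE of 1.0
# reproduces the quoted 18.10% figure.
print(f"{mse_reduction(1.0, 0.819):.2%}")  # -> 18.10%
```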
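
The Dataset Splits row quotes a 75/10/15 partition. A minimal sketch of such a split, assuming a simple shuffle-and-slice scheme (the paper does not say how the split was drawn, and plain rounding yields 21,240 training stories rather than the quoted 21,230):

```python
import random

def split_stories(stories, seed=0):
    # Shuffle, then slice into 75% train / 10% validation / 15% test.
    # The shuffle and the rounding are assumptions; the published counts
    # are 21,230 / 2,832 / 4,258 for 28,320 stories.
    rng = random.Random(seed)
    shuffled = list(stories)
    rng.shuffle(shuffled)
    n = len(shuffled)
    n_train = round(0.75 * n)
    n_val = round(0.10 * n)
    return (shuffled[:n_train],
            shuffled[n_train:n_train + n_val],
            shuffled[n_train + n_val:])
```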
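
The Experiment Setup row pins the convolutional stack down well enough to sketch. Below is a minimal PyTorch rendering under stated assumptions: the input is treated as a 1-channel 2-D grid of shape (embedding_dim × tokens), vertical strides default to 1 where only a horizontal stride is quoted, and the scalar regression head on top of the two 128-unit layers is an assumption (the paper predicts upvote counts, but its exact output layer is not quoted here).

```python
import torch
import torch.nn as nn

class RegionCNN(nn.Module):
    """Sketch of the convolutional stack quoted in the Experiment Setup row.

    Kernel shapes, filter counts, and horizontal strides follow the quote;
    the 1-channel (embedding_dim x tokens) input layout, the vertical
    strides of 1, and the final regression head are assumptions.
    """

    def __init__(self):
        super().__init__()
        self.features = nn.Sequential(
            # 32 filters of size 3x5, horizontal stride 3 (vertical stride 1 assumed)
            nn.Conv2d(1, 32, kernel_size=(3, 5), stride=(1, 3)),
            nn.ReLU(),
            nn.MaxPool2d(kernel_size=(2, 2)),
            # 32 filters of size 2x3, horizontal stride 2
            nn.Conv2d(32, 32, kernel_size=(2, 3), stride=(1, 2)),
            nn.ReLU(),
            nn.MaxPool2d(kernel_size=(2, 2)),
            # 16 filters of size 1x3, stride 1
            nn.Conv2d(32, 16, kernel_size=(1, 3), stride=1),
            nn.ReLU(),
            nn.MaxPool2d(kernel_size=(2, 2)),
        )
        self.regressor = nn.Sequential(
            nn.Flatten(),
            # Two fully connected layers of 128 units each, as quoted;
            # LazyLinear infers the flattened size on the first forward pass.
            nn.LazyLinear(128),
            nn.ReLU(),
            nn.Linear(128, 128),
            nn.ReLU(),
            # Scalar output for the upvote-count regression (assumed head).
            nn.Linear(128, 1),
        )

    def forward(self, x):
        return self.regressor(self.features(x))
```

With k_tok = 36 and k_reg = 10, a document covers 360 token positions, so `RegionCNN()(torch.randn(4, 1, 300, 360))` returns a `(4, 1)` tensor for a batch of four documents with (assumed) 300-dimensional word embeddings.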