Linguistic Fingerprints of Internet Censorship: The Case of Sina Weibo
Authors: Kei Yin Ng, Anna Feldman, Jing Peng (pp. 446-453)
AAAI 2020
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We build a classifier that significantly outperforms non-expert humans in predicting whether a blogpost will be censored. Our best results are over 30% higher than the baseline and about 60% higher than the human baseline obtained through crowdsourcing, which shows that our classifier has a greater censorship predictive ability compared to human judgments. |
| Researcher Affiliation | Academia | Kei Yin Ng, Anna Feldman, Jing Peng Montclair State University Montclair, New Jersey, USA |
| Pseudocode | No | The paper describes the methodologies in prose but does not include any explicit pseudocode or algorithm blocks. |
| Open Source Code | No | The paper does not provide an explicit statement about releasing open-source code for the described methodology, nor does it include a link to a code repository. |
| Open Datasets | Yes | Using Zhu et al. (2013)'s corpus; Zhu et al. (2013) collected over 2 million posts published by a set of around 3,500 sensitive users during a 2-month period in 2012. |
| Dataset Splits | Yes | Each experiment is validated with 10-fold cross validation. |
| Hardware Specification | No | The paper does not provide specific details about the hardware (e.g., GPU models, CPU types, memory) used for running the experiments. |
| Software Dependencies | No | The paper mentions several tools and libraries used (e.g., LIWC, Baidu AI, CRIE, word2vec, Jieba Part-of-speech tagger), but it does not specify version numbers for these software components, which is required for reproducibility. |
| Experiment Setup | Yes | The rest of the parameters are set to their defaults: learning rate of 0.3, momentum of 0.2, batch size of 100, and validation threshold of 20. |
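The 10-fold cross-validation protocol reported above can be sketched without any ML library; this is an illustrative stand-alone implementation of the fold-splitting logic (function and variable names are ours, not from the paper), useful for checking that every sample lands in exactly one test fold:

```python
def k_fold_indices(n_samples, k=10):
    """Yield (train, test) index lists for k-fold cross validation.

    Folds are contiguous and differ in size by at most one sample,
    so all n_samples are used exactly once as test data.
    """
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0)
                  for i in range(k)]
    start = 0
    for size in fold_sizes:
        test = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n_samples))
        yield train, test
        start += size


# Example: 10 folds over 105 samples -> first 5 folds get 11 samples each.
folds = list(k_fold_indices(105, k=10))
all_test = sorted(i for _, test in folds for i in test)
assert len(folds) == 10
assert all_test == list(range(105))  # every sample tested exactly once
```

In the paper's setting, each fold's training split would be used to fit the censorship classifier with the stated defaults (learning rate 0.3, momentum 0.2, batch size 100, validation threshold 20), and the held-out fold scored; the reported figures average over the 10 folds.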