reproducibilityindex.ai

Unsupervised Detection of Contextualized Embedding Bias with Application to Ideology

Authors: Valentin Hofmann, Janet Pierrehumbert, Hinrich Schütze

ICML 2022

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	Our experiments suggest that the ideological subspace encodes abstract evaluative semantics and reflects changes in the political left-right spectrum during the presidency of Donald Trump. Table 2: Performance on link prediction (MAUC).
Researcher Affiliation	Academia	Valentin Hofmann (1, 2), Janet B. Pierrehumbert (3, 1), Hinrich Schütze (2). (1) Faculty of Linguistics, University of Oxford; (2) Center for Information and Language Processing, LMU Munich; (3) Department of Engineering Science, University of Oxford.
Pseudocode	No	The paper describes its method using textual descriptions and mathematical equations but does not include a formally labeled 'Pseudocode' or 'Algorithm' block.
Open Source Code	Yes	We make our code available at https://github.com/valentinhofmann/unsupervised_bias.
Open Datasets	Yes	We base our study on the Reddit Politosphere (Hofmann et al., 2022b), a dataset covering the political discourse on the social media platform Reddit from 2008 to 2019.
Dataset Splits	Yes	We split concepts and edges for each year into train (60%), dev (20%), and test (20%).
Hardware Specification	Yes	Experiments are performed on a GeForce GTX 1080 Ti GPU (11GB).
Software Dependencies	No	The paper mentions pretrained BERT and Adam optimizer, but does not provide specific version numbers for software dependencies like Python, PyTorch, TensorFlow, or other libraries.
Experiment Setup	Yes	We perform grid search for the learning rate r ∈ {1 × 10⁻⁴, 3 × 10⁻⁴, 1 × 10⁻³}. For the model used to find X, we further perform grid search for the orthogonality constant λ_o ∈ {1 × 10⁻³, 3 × 10⁻³, 1 × 10⁻²} as well as the sparsity constant λ_s ∈ {1 × 10⁻², 3 × 10⁻², 1 × 10⁻¹}. In total, there are 3 hyperparameter search trials for X and 27 for X per year. We use Adam (Kingma & Ba, 2015) as the optimizer. Both hidden layers of the graph auto-encoder have 10 dimensions.
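
The trial counts quoted in the Experiment Setup row follow from the grid sizes: three learning rates give 3 trials, and crossing them with three orthogonality constants and three sparsity constants gives 3 × 3 × 3 = 27. The sketch below only enumerates those grids to make the arithmetic explicit. The variable names and dictionary keys are illustrative assumptions, not taken from the authors' repository.

```python
from itertools import product

# Grid values as reported in the Experiment Setup row.
learning_rates = [1e-4, 3e-4, 1e-3]
orthogonality_constants = [1e-3, 3e-3, 1e-2]  # lambda_o
sparsity_constants = [1e-2, 3e-2, 1e-1]       # lambda_s

# Model tuned over the learning rate only (names are hypothetical).
lr_only_trials = [{"lr": lr} for lr in learning_rates]

# Model tuned over the learning rate, lambda_o, and lambda_s jointly.
full_trials = [
    {"lr": lr, "lambda_o": lam_o, "lambda_s": lam_s}
    for lr, lam_o, lam_s in product(
        learning_rates, orthogonality_constants, sparsity_constants
    )
]

print(len(lr_only_trials), len(full_trials))  # 3 27
```

Because the search is repeated per year, the total number of training runs scales with the number of yearly snapshots in the dataset.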
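
The Dataset Splits row reports a 60% / 20% / 20% train, dev, and test split of concepts and edges for each year. How the items are assigned to splits is not stated in this summary, so the sketch below assumes a seeded random shuffle. The function name `split_60_20_20` and the example concept list are hypothetical.

```python
import random


def split_60_20_20(items, seed=0):
    """Shuffle items with a fixed seed and split into train/dev/test (60/20/20)."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    # Integer arithmetic avoids floating-point rounding in the split sizes.
    n_train = len(shuffled) * 6 // 10
    n_dev = len(shuffled) * 2 // 10
    train = shuffled[:n_train]
    dev = shuffled[n_train:n_train + n_dev]
    test = shuffled[n_train + n_dev:]  # remainder goes to test
    return train, dev, test


# Hypothetical stand-in for the concepts of one year.
concepts = [f"concept_{i}" for i in range(10)]
train, dev, test = split_60_20_20(concepts)
print(len(train), len(dev), len(test))  # 6 2 2
```

Applying the same function separately to each year's concepts and edges would mirror the per-year split described in the table, under the random-assignment assumption stated above.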