Deep Clustering of Text Representations for Supervision-Free Probing of Syntax
Authors: Vikram Gupta, Haoyue Shi, Kevin Gimpel, Mrinmaya Sachan
AAAI 2022, pages 10720-10728
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We report competitive performance of our probe on 45-tag English POSI, state-of-the-art performance on 12-tag POSI across 10 languages, and competitive results on CoLab. We also perform zero-shot syntax induction on resource-impoverished languages and report strong results. |
| Researcher Affiliation | Collaboration | Vikram Gupta¹, Haoyue Shi², Kevin Gimpel², Mrinmaya Sachan³; ¹ShareChat, India; ²Toyota Technological Institute at Chicago; ³Department of Computer Science, ETH Zurich |
| Pseudocode | No | The paper provides detailed descriptions of the proposed method and various algorithms within the text and figures, but it does not include a formally labeled 'Pseudocode' or 'Algorithm' block. |
| Open Source Code | No | The paper does not contain any explicit statement about releasing source code for the described methodology, nor does it provide a link to a code repository. |
| Open Datasets | Yes | We evaluate our approach for POSI on two datasets: 45-tag Penn Treebank Wall Street Journal (WSJ) dataset (Marcus, Santorini, and Marcinkiewicz 1993) and multilingual 12-tag datasets drawn from the universal dependencies project (Nivre et al. 2016). |
| Dataset Splits | Yes | For POSI, as per the standard practice (Stratos 2019), we use the complete dataset (train + val + test) for training as well as evaluation. However, for CoLab, we use the train set to train our model and the test set for reporting results, following Drozdov et al. (2019a). |
| Hardware Specification | No | The paper does not provide specific details about the hardware used for running the experiments, such as GPU or CPU models, memory, or cloud instance types. |
| Software Dependencies | No | The paper describes the software components used (e.g., BERT, fastText, K-Means) and frameworks (e.g., autoencoders, deep clustering), but it does not specify version numbers for any of these components, which is necessary for reproducibility. |
| Experiment Setup | Yes | When augmenting the mBERT embeddings with morphological features (SyntDEC_Morph)... We concatenate fastText embeddings of the trailing trigram of each word with contextualized representations before passing them as input to SyntDEC. ...where ν is set to 1 in all experiments. ...L_total = L_KL + λ L_rec (Eq. 1) |
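
For readers reconstructing the setup from the quoted fragments: the loss in Eq. 1 follows the standard DEC recipe, with soft cluster assignments from a Student's t kernel (ν = 1), a KL term against a sharpened target distribution, and an autoencoder reconstruction term weighted by λ. The sketch below is a minimal PyTorch rendering under those assumptions; the class name, layer sizes, and the `build_input` helper are illustrative guesses, not the authors' code (none is released).

```python
# Minimal sketch of a DEC-style objective with reconstruction, matching
# L_total = L_KL + lambda * L_rec (Eq. 1). Architecture details are assumed.
import torch
import torch.nn as nn
import torch.nn.functional as F


class SyntDECSketch(nn.Module):
    def __init__(self, input_dim, latent_dim, n_clusters, nu=1.0):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(input_dim, 512), nn.ReLU(),
                                     nn.Linear(512, latent_dim))
        self.decoder = nn.Sequential(nn.Linear(latent_dim, 512), nn.ReLU(),
                                     nn.Linear(512, input_dim))
        # Cluster centroids; in DEC these are initialized with K-Means
        # on the pretrained latent space.
        self.centroids = nn.Parameter(torch.randn(n_clusters, latent_dim))
        self.nu = nu  # Student's t degrees of freedom; the paper sets nu = 1

    def soft_assign(self, z):
        # q_ij: Student's t similarity between latent z_i and centroid mu_j.
        dist_sq = torch.cdist(z, self.centroids).pow(2)
        q = (1.0 + dist_sq / self.nu).pow(-(self.nu + 1.0) / 2.0)
        return q / q.sum(dim=1, keepdim=True)


def target_distribution(q):
    # Sharpened target p_ij used in the KL term (standard DEC construction).
    weight = q.pow(2) / q.sum(dim=0)
    return weight / weight.sum(dim=1, keepdim=True)


def build_input(bert_vec, trigram_vec):
    # SyntDEC_Morph input: contextualized (mBERT) embedding concatenated with
    # the fastText embedding of the word's trailing character trigram.
    return torch.cat([bert_vec, trigram_vec], dim=-1)


def total_loss(model, x, lam=1.0):
    z = model.encoder(x)
    q = model.soft_assign(z)
    p = target_distribution(q).detach()
    l_kl = F.kl_div(q.log(), p, reduction="batchmean")  # L_KL
    l_rec = F.mse_loss(model.decoder(z), x)             # L_rec
    return l_kl + lam * l_rec                           # Eq. 1
```

As in DEC, one would pretrain the autoencoder and initialize the centroids with K-Means over the latent vectors before optimizing the joint loss; this is consistent with the paper's listed components (autoencoders, K-Means, deep clustering), though the exact training schedule is not specified.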