Zipfian Whitening

Authors: Sho Yokoi, Han Bao, Hiroto Kurita, Hidetoshi Shimodaira

NeurIPS 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Empirical evaluation: We confirm the effectiveness of Zipfian whitening (Algorithm 1) by measuring performance on standard sentence-level downstream tasks using post-processed word vectors. We employed the most standard word embeddings, GloVe [43], word2vec [37], and fastText [11], and utilized widely adopted evaluation tasks, including STS-B [15] and related benchmarks.
Researcher Affiliation | Academia | Sho Yokoi (Tohoku University / RIKEN, yokoi@tohoku.ac.jp); Han Bao (Kyoto University, bao@i.kyoto-u.ac.jp); Hiroto Kurita (Tohoku University, hiroto.kurita@dc.tohoku.ac.jp); Hidetoshi Shimodaira (Kyoto University / RIKEN, shimo@i.kyoto-u.ac.jp)
Pseudocode | Yes | The algorithm is given as Algorithm 1, "Zipfian whitening; a post-processing algorithm on word embeddings." A hedged sketch of the frequency-weighted whitening it describes follows the table.
Open Source Code | Yes | https://github.com/cl-tohoku/zipfian-whitening
Open Datasets | Yes | We employed the most standard word embeddings, GloVe [43], word2vec [37], and fastText [11], and utilized widely adopted evaluation tasks, including STS-B [15] and related benchmarks.
Dataset Splits | Yes | We used the MTEB [40] implementation (https://github.com/embeddings-benchmark/mteb) to evaluate the static word embeddings in Tables 2, 8, and 9. To evaluate the dynamic word embeddings in Tables 5 and 12, we used the implementation from the SimCSE paper [22] (https://github.com/princeton-nlp/SimCSE) to match its experimental setting. A minimal MTEB usage sketch follows the table.
Hardware Specification | Yes | We conducted all experiments using a single NVIDIA RTX 6000 Ada GPU with 48 GB of VRAM.
Software Dependencies | No | The paper mentions software tools such as NLTK, MTEB, and the SimCSE implementation, but does not give specific version numbers for these or other key software components used in the experiments.
Experiment Setup | Yes | We followed the hyperparameter choices of the original papers, with the dimensionality-reduction parameter for ABTT set to D := 3 and the weighting parameter for SIF set to a := 10^-3. Hedged sketches of these two baselines follow the table.
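The Pseudocode row above refers to Algorithm 1, which whitens word embeddings with expectations taken under the empirical unigram (Zipfian) word-frequency distribution rather than the uniform distribution. Below is a minimal NumPy sketch of that frequency-weighted whitening step, assuming a matrix W of word vectors and a probability vector p summing to one; the function name and implementation details are illustrative and are not the authors' code.

```python
import numpy as np

def zipfian_whitening(W: np.ndarray, p: np.ndarray) -> np.ndarray:
    """Center and whiten word embeddings under the unigram distribution p.

    W: (n_words, dim) word vectors; p: (n_words,) empirical word frequencies
    summing to 1. Illustrative sketch, not the authors' implementation.
    """
    # 1. Frequency-weighted mean (expectation under p, not the uniform mean).
    mu = p @ W                                   # shape: (dim,)
    Wc = W - mu                                  # centered embeddings

    # 2. Frequency-weighted covariance: sum_w p(w) (w - mu)(w - mu)^T.
    cov = (Wc * p[:, None]).T @ Wc               # shape: (dim, dim)

    # 3. Whitening transform Sigma^{-1/2} via eigendecomposition.
    eigval, eigvec = np.linalg.eigh(cov)
    inv_sqrt = eigvec @ np.diag(1.0 / np.sqrt(np.clip(eigval, 1e-12, None))) @ eigvec.T

    return Wc @ inv_sqrt
```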
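The Dataset Splits row cites the MTEB toolkit for evaluating static embeddings. The sketch below shows one way such an evaluation can be wired up, assuming the MTEB interface that accepts task names and any object exposing an encode() method; the StaticEmbeddingModel wrapper and the mean-pooling choice are assumptions for illustration, not the authors' evaluation code.

```python
import numpy as np
from mteb import MTEB

class StaticEmbeddingModel:
    """Wraps post-processed static word vectors so MTEB can call encode()."""

    def __init__(self, vectors, dim):
        self.vectors = vectors   # dict: word -> post-processed np.ndarray
        self.dim = dim

    def encode(self, sentences, **kwargs):
        # Mean-pool the word vectors of each sentence (illustrative choice).
        out = []
        for s in sentences:
            vecs = [self.vectors[w] for w in s.lower().split() if w in self.vectors]
            out.append(np.mean(vecs, axis=0) if vecs else np.zeros(self.dim))
        return np.vstack(out)

# Example usage, given a loaded `vectors` dict of post-processed embeddings:
# evaluation = MTEB(tasks=["STSBenchmark"])
# evaluation.run(StaticEmbeddingModel(vectors, dim=300), output_folder="results")
```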
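The Experiment Setup row fixes D := 3 for ABTT (all-but-the-top post-processing) and a := 10^-3 for SIF (smooth inverse frequency weighting). The sketch below illustrates what these two hyperparameters control under the standard formulations of the baselines; the function names are illustrative and not taken from the paper.

```python
import numpy as np

def abtt(W: np.ndarray, D: int = 3) -> np.ndarray:
    """All-But-The-Top: subtract the mean, then remove the top-D principal directions."""
    Wc = W - W.mean(axis=0)
    # Top-D principal directions from the SVD of the centered embedding matrix.
    _, _, Vt = np.linalg.svd(Wc, full_matrices=False)
    U = Vt[:D]                                   # shape: (D, dim)
    return Wc - Wc @ U.T @ U

def sif_weights(p: np.ndarray, a: float = 1e-3) -> np.ndarray:
    """Smooth inverse frequency weights a / (a + p(w)) used to average word vectors.

    The full SIF method additionally removes the first principal component of
    the resulting sentence embeddings; that step is omitted here for brevity.
    """
    return a / (a + p)
```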