P-SIF: Document Embeddings Using Partition Averaging

Authors: Vivek Gupta, Ankit Saw, Pegah Nokhiz, Praneeth Netrapalli, Piyush Rai, Partha Talukdar
Pages: 7863-7870

AAAI 2020

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Through a comprehensive set of experiments, we demonstrate P-SIF's effectiveness compared to simple weighted averaging and many other baselines. We perform a comprehensive set of experiments on several text similarity and multiclass or multilabel text classification tasks.
Researcher Affiliation | Collaboration | (1) School of Computing, University of Utah; (2) Info Edge (India) Limited; (3) Microsoft Research Lab, Bangalore; (4) Computer Science Department, IIT Kanpur; (5) Indian Institute of Science, Bangalore
Pseudocode | Yes | Algorithm 1: P-SIF Embedding
Open Source Code | Yes | We have released the source code for P-SIF embeddings.
Open Datasets | Yes | We perform our experiments on the SemEval dataset (2012-2017). We run multi-class experiments on 20NewsGroup dataset, and multi-label classification experiments on Reuters-21578 dataset.
Dataset Splits | Yes | We use 5-fold cross-validation on the F1 score to tune hyperparameters.
Hardware Specification | No | The paper does not provide specific details about the hardware (e.g., GPU/CPU models, memory specifications) used for running the experiments.
Software Dependencies | No | The paper mentions methods like 'Linear SVM' and 'Logistic regression' but does not specify any software names with version numbers for implementation details or dependencies.
Experiment Setup | Yes | We use the fixed weighting parameter a value of 10^-3, and the word frequencies p(w) are estimated from the commoncrawl dataset.
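
The Experiment Setup row names a weighting parameter a and word frequencies p(w) but does not spell out how they combine. The sketch below is a minimal illustration, not the authors' released code. It assumes the usual smooth inverse frequency weight a / (a + p(w)) and assumes that partition averaging concatenates, across K topics, the weighted sum of topic-scaled word vectors. The names `sif_weight`, `partition_average`, and `topic_weights` are hypothetical. The per-word topic weights are taken as given (for example from a sparse dictionary learning step). Dividing by the token count and treating words with unknown frequency as p(w) = 0 are further assumptions, and any post-processing step is omitted.

```python
from typing import Dict, List, Sequence

A = 1e-3  # weighting parameter a, the value quoted in the Experiment Setup row


def sif_weight(p_w: float, a: float = A) -> float:
    """Assumed smooth inverse frequency weight a / (a + p(w))."""
    return a / (a + p_w)


def partition_average(
    tokens: Sequence[str],
    vectors: Dict[str, List[float]],
    topic_weights: Dict[str, List[float]],
    word_freq: Dict[str, float],
    a: float = A,
) -> List[float]:
    """Return a K * d document vector: one weighted average per topic, concatenated."""
    # Keep only tokens that have both a word vector and topic weights.
    known = [t for t in tokens if t in vectors and t in topic_weights]
    if not known:
        return []
    dim = len(vectors[known[0]])
    num_topics = len(topic_weights[known[0]])
    doc = [0.0] * (num_topics * dim)
    for token in known:
        # Assumption: a word with no frequency estimate gets p(w) = 0, so weight 1.
        weight = sif_weight(word_freq.get(token, 0.0), a)
        for topic, alpha in enumerate(topic_weights[token]):
            if alpha == 0.0:
                continue  # sparse topic weights: skip topics the word is not in
            base = topic * dim
            for i, x in enumerate(vectors[token]):
                doc[base + i] += weight * alpha * x
    # Assumption: normalise by the number of tokens that contributed.
    return [x / len(known) for x in doc]


# Toy inputs, invented purely for illustration.
toy_vectors = {"data": [1.0, 0.0], "the": [0.0, 1.0]}
toy_topics = {"data": [1.0, 0.0], "the": [0.5, 0.5]}
toy_freq = {"data": 1e-3, "the": 9e-3}

toy_doc = partition_average(["the", "data"], toy_vectors, toy_topics, toy_freq)
print(len(toy_doc), toy_doc)
```

With two topics and two-dimensional word vectors, the result has four components. The frequent word "the" is down-weighted relative to "data", which is the intended effect of the frequency-based weight.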