P-SIF: Document Embeddings Using Partition Averaging
Authors: Vivek Gupta, Ankit Saw, Pegah Nokhiz, Praneeth Netrapalli, Piyush Rai, Partha Talukdar
Pages: 7863-7870
AAAI 2020
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Through a comprehensive set of experiments, we demonstrate P-SIF's effectiveness compared to simple weighted averaging and many other baselines. We perform a comprehensive set of experiments on several text similarity and multiclass or multilabel text classification tasks. |
| Researcher Affiliation | Collaboration | ¹School of Computing, University of Utah; ²Info Edge (India) Limited; ³Microsoft Research Lab, Bangalore; ⁴Computer Science Department, IIT Kanpur; ⁵Indian Institute of Science, Bangalore |
| Pseudocode | Yes | Algorithm 1: P-SIF Embedding |
| Open Source Code | Yes | We have released the source code for P-SIF embeddings. |
| Open Datasets | Yes | We perform our experiments on the SemEval dataset (2012-2017). We run multi-class experiments on the 20NewsGroup dataset, and multi-label classification experiments on the Reuters-21578 dataset. |
| Dataset Splits | Yes | We use 5-fold cross-validation on the F1 score to tune hyperparameters. |
| Hardware Specification | No | The paper does not provide specific details about the hardware (e.g., GPU/CPU models, memory specifications) used for running the experiments. |
| Software Dependencies | No | The paper mentions methods like 'Linear SVM' and 'Logistic regression' but does not specify any software names with version numbers for implementation details or dependencies. |
| Experiment Setup | Yes | We use a fixed value of 10⁻³ for the weighting parameter a, and the word frequencies p(w) are estimated from the commoncrawl dataset. |
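
The Experiment Setup row names a weighting parameter a and word frequencies p(w) but does not spell out how they combine. The sketch below assumes the usual SIF-style weight a / (a + p(w)) and shows only that weighting step, not the full P-SIF procedure (topic partitioning is omitted). The function names, toy vectors, and toy frequencies are hypothetical and chosen purely for illustration.

```python
from typing import Dict, List

A = 1e-3  # weighting parameter a, the value quoted in the table above


def sif_weight(p_w: float, a: float = A) -> float:
    """Assumed SIF-style weight a / (a + p(w)): rare words get weights near 1."""
    return a / (a + p_w)


def weighted_average(
    tokens: List[str],
    vectors: Dict[str, List[float]],
    freqs: Dict[str, float],
    a: float = A,
) -> List[float]:
    """Average the word vectors of `tokens`, each scaled by its assumed weight.

    Tokens missing from `vectors` or `freqs` are skipped.
    """
    known = [t for t in tokens if t in vectors and t in freqs]
    if not known:
        raise ValueError("no token has both a vector and a frequency estimate")
    dim = len(vectors[known[0]])
    total = [0.0] * dim
    for t in known:
        w = sif_weight(freqs[t], a)
        for i, x in enumerate(vectors[t]):
            total[i] += w * x
    return [x / len(known) for x in total]


# Hypothetical toy data: a frequent word and a rare word.
toy_vectors = {"the": [1.0, 0.0], "embedding": [0.0, 1.0]}
toy_freqs = {"the": 0.05, "embedding": 1e-5}
doc_vector = weighted_average(["the", "embedding"], toy_vectors, toy_freqs)
```

With a = 10⁻³, the frequent word "the" contributes far less to `doc_vector` than the rare word "embedding", which is the intended effect of this kind of frequency-based down-weighting.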