Cuckoo Feature Hashing: Dynamic Weight Sharing for Sparse Analytics
Authors: Jinyang Gao, Beng Chin Ooi, Yanyan Shen, Wang-Chien Lee
IJCAI 2018
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experimental results on prediction tasks with hundreds of millions of features demonstrate that CCFH can achieve the same level of performance using only 15%-25% of the parameters required by conventional feature hashing. Experimental results on the public CTR benchmark dataset Avazu and a malicious URL detection dataset show that, compared with feature hashing and multiple hashing, CCFH can further reduce the number of parameters by around 4x to 8x while achieving the same model performance. |
| Researcher Affiliation | Academia | Jinyang Gao (1), Beng Chin Ooi (1), Yanyan Shen (2), Wang-Chien Lee (3); (1) National University of Singapore; (2) Shanghai Jiao Tong University; (3) The Pennsylvania State University |
| Pseudocode | No | The paper does not contain any clearly labeled pseudocode or algorithm blocks. |
| Open Source Code | No | The paper does not provide a statement or link indicating that the source code for the described methodology is publicly available. |
| Open Datasets | Yes | URL [Ma et al., 2009] is a dataset for malicious URL detection. Avazu [Juan et al., 2016] is a dataset for mobile Ads CTR prediction from a Kaggle competition. |
| Dataset Splits | No | The paper mentions 'test error rate' and 'log loss' for evaluation, implying a test set, but it does not specify explicit training, validation, or test dataset splits (e.g., percentages or sample counts) needed for reproduction. |
| Hardware Specification | No | Table 3 shows 'A Typical CPU Specification' for an Intel Xeon CPU, but it is presented for illustrative purposes only; the paper does not state that this hardware was used for its experiments, and no actual hardware specifications are provided. |
| Software Dependencies | No | The paper mentions using 'logistic regression' as the model and 'Adam' for learning rate adjustment with a citation, but it does not provide specific version numbers for any software libraries or dependencies (e.g., Python, PyTorch, TensorFlow, scikit-learn versions). |
| Experiment Setup | Yes | All models are trained using mini-batch stochastic gradient descent (SGD). The batch size is set to 256, and the learning rate is adjusted based on Adam [Kingma and Ba, 2014] with a momentum of 0.9. An L1 penalty is applied to the model parameters, as in [Weinberger et al., 2009; Zhou et al., 2015], to introduce model sparsity (i.e., feature selection). For CCFH, the parameter space is split into two parts: 80% for the feature weights v and 20% for the weight indicators q. (A minimal sketch of this setup follows the table.) |
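
The 'Experiment Setup' row above outlines the training configuration. Below is a minimal sketch of that baseline in PyTorch: logistic regression over hashed features, mini-batch size 256, Adam (whose default beta1 = 0.9 corresponds to the stated momentum), and an L1 penalty for sparsity. The bucket count, learning rate, L1 strength, and synthetic data are illustrative assumptions, not values from the paper, and the CCFH-specific 80%/20% split between feature weights v and indicators q is not modeled here.

```python
# Hedged sketch of the baseline training setup described in the paper's
# experiment section; hyperparameters marked "assumed" are not from the paper.
import torch

NUM_BUCKETS = 2 ** 20   # hashed parameter space size (assumed)
BATCH_SIZE = 256        # as stated in the paper
L1_LAMBDA = 1e-6        # L1 regularization strength (assumed; not reported)

def hash_features(raw_ids: torch.Tensor) -> torch.Tensor:
    """Map raw feature ids into a fixed bucket range (plain feature hashing,
    the baseline that CCFH is compared against)."""
    return raw_ids % NUM_BUCKETS  # stand-in for a proper hash function

weights = torch.zeros(NUM_BUCKETS, requires_grad=True)
bias = torch.zeros(1, requires_grad=True)
# Adam's beta1 = 0.9 matches the "momentum of 0.9" quoted from the paper.
optimizer = torch.optim.Adam([weights, bias], lr=1e-3, betas=(0.9, 0.999))

def train_step(raw_ids: torch.Tensor, labels: torch.Tensor) -> float:
    """One mini-batch update. raw_ids: (B, k) active feature ids per example;
    labels: (B,) in {0, 1}."""
    buckets = hash_features(raw_ids)
    # Sparse logistic regression: sum the hashed weights of active features.
    logits = weights[buckets].sum(dim=1) + bias
    loss = torch.nn.functional.binary_cross_entropy_with_logits(logits, labels)
    loss = loss + L1_LAMBDA * weights.abs().sum()  # L1 penalty for sparsity
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()

# Synthetic usage example (hypothetical data; 10 active features per example,
# drawn from a large raw id space to mimic a sparse high-dimensional task):
raw_ids = torch.randint(0, 10**8, (BATCH_SIZE, 10))
labels = torch.randint(0, 2, (BATCH_SIZE,)).float()
print(train_step(raw_ids, labels))
```

The sketch uses a single dense weight vector over hash buckets; CCFH's contribution, dynamically relocating colliding features between candidate buckets cuckoo-hashing style, would replace the fixed `hash_features` mapping and add the indicator parameters q, which this baseline omits.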