Cuckoo Feature Hashing: Dynamic Weight Sharing for Sparse Analytics

Authors: Jinyang Gao, Beng Chin Ooi, Yanyan Shen, Wang-Chien Lee

IJCAI 2018

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Experimental results on prediction tasks with hundreds of millions of features demonstrate that CCFH can achieve the same level of performance using only 15%-25% of the parameters required by conventional feature hashing. Experiments on the public CTR benchmark dataset Avazu and a malicious URL detection dataset show that, compared with feature hashing and multiple hashing, CCFH can further reduce the number of parameters by around 4x to 8x while achieving the same model performance.
Researcher Affiliation | Academia | Jinyang Gao (National University of Singapore), Beng Chin Ooi (National University of Singapore), Yanyan Shen (Shanghai Jiao Tong University), Wang-Chien Lee (The Pennsylvania State University)
Pseudocode | No | The paper does not contain any clearly labeled pseudocode or algorithm blocks.
Open Source Code | No | The paper does not provide a statement or link indicating that the source code for the described methodology is publicly available.
Open Datasets | Yes | URL [Ma et al., 2009] is a dataset for malicious URL detection. Avazu [Juan et al., 2016] is a dataset for mobile ads CTR prediction from a Kaggle competition.
Dataset Splits | No | The paper mentions 'test error rate' and 'log loss' for evaluation, implying a test set, but it does not specify explicit training, validation, or test splits (e.g., percentages or sample counts) needed for reproduction.
Hardware Specification | No | Table 3 shows a typical specification for an Intel Xeon CPU, but it is presented as 'A Typical CPU Specification' for illustrative purposes; the paper does not report the actual hardware used for its experiments.
Software Dependencies | No | The paper mentions using 'logistic regression' as the model and 'Adam' for learning-rate adjustment with a citation, but it does not provide version numbers for any software libraries or dependencies (e.g., Python, PyTorch, TensorFlow, scikit-learn).
Experiment Setup | Yes | All models are trained using mini-batch stochastic gradient descent (SGD). The batch size is set to 256, and the learning rate is adjusted with Adam [Kingma and Ba, 2014] using a momentum of 0.9. An L1 penalty is applied to the model parameters, as in [Weinberger et al., 2009; Zhou et al., 2015], to introduce model sparsity (i.e., feature selection). For CCFH, the parameter space is split into two parts: 80% for the feature weights v and 20% for the weight indicators q.
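
The reported setup can be sketched roughly as below. This is a minimal illustration of the feature-hashing baseline training loop, not the authors' code: the hashed parameter-space size, learning rate, and L1-penalty strength are assumptions, and only the batch size (256), the Adam momentum (beta1 = 0.9), the L1 penalty, and the 80%/20% split between feature weights v and weight indicators q come from the paper. The cuckoo-based dynamic weight sharing of CCFH itself is not implemented here.

import torch

# Assumed sizes and hyperparameters (not reported in the paper).
NUM_BUCKETS = 2 ** 20                 # hashed parameter-space size (assumption)
BATCH_SIZE = 256                      # from the paper
L1_LAMBDA = 1e-6                      # L1-penalty strength (assumption)
LEARNING_RATE = 1e-3                  # learning rate (assumption)

# CCFH splits the parameter space 80% / 20% (from the paper); the split is only
# illustrated by the bucket counts here and not otherwise used in this sketch.
V_BUCKETS = int(0.8 * NUM_BUCKETS)    # buckets for feature weights v
Q_BUCKETS = NUM_BUCKETS - V_BUCKETS   # buckets for weight indicators q

# Plain hashed logistic regression (the feature-hashing baseline).
weights = torch.zeros(NUM_BUCKETS, requires_grad=True)
bias = torch.zeros(1, requires_grad=True)
optimizer = torch.optim.Adam([weights, bias], lr=LEARNING_RATE, betas=(0.9, 0.999))

def score(batch_bucket_ids):
    # batch_bucket_ids: list of LongTensors, one per instance, holding the
    # hashed bucket ids of that instance's active (binary) features.
    logits = torch.stack([weights[ids].sum() for ids in batch_bucket_ids])
    return logits + bias

def train_step(batch_bucket_ids, labels):
    # labels: float tensor of shape (batch_size,) with values in {0, 1}.
    optimizer.zero_grad()
    logits = score(batch_bucket_ids)
    loss = torch.nn.functional.binary_cross_entropy_with_logits(logits, labels)
    loss = loss + L1_LAMBDA * weights.abs().sum()   # L1 penalty for sparsity
    loss.backward()
    optimizer.step()
    return loss.item()

A full replication would additionally need the cuckoo-hashing lookup that lets features dynamically share buckets (the mechanism the paper's title refers to), which this sketch omits.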