Image Feature Learning for Cold Start Problem in Display Advertising
Authors: Kaixiang Mo, Bo Liu, Lei Xiao, Yong Li, Jie Jiang
IJCAI 2015 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Extensive experiments on a real world dataset with 47 billion records show that our feature learning method outperforms existing handcrafted features significantly, and it can extract discriminative and meaningful features. |
| Researcher Affiliation | Collaboration | Hong Kong University of Science and Technology, Hong Kong, China; Tencent Inc., Shenzhen, China. {kxmo, bliuab}@cse.ust.hk; {shawnxiao, nickyyli, zeus}@tencent.com |
| Pseudocode | No | The paper describes the architecture and training details of the convolutional neural network but does not include any explicit pseudocode or algorithm blocks. |
| Open Source Code | No | The paper mentions modifying Caffe ('We modified Caffe [Jia, 2013] to train our ad image feature extractor network.') but does not provide concrete access to their modified source code for the described methodology. |
| Open Datasets | No | Our dataset is sampled from the Tencent online display advertising system for a period of 19 days, with approximately 47 billion records... The records from the first 15 days are used as the training set, and the records from the last 4 days are used as the testing set. The training set has 45 billion records on 220,000 ads. No access information is provided for this proprietary dataset. |
| Dataset Splits | No | The paper describes using a 'training set' and 'testing set' split (45 billion records for training, 2.4 billion for testing new ads) but does not explicitly mention a distinct validation set split or its use for hyperparameter tuning. |
| Hardware Specification | Yes | Training the feature extractor takes about 2 days on a NVIDIA TESLA M2090 6GB GPU. |
| Software Dependencies | No | The paper mentions using 'Caffe', 'OpenCV', and 'liblinear' but does not provide specific version numbers for these software dependencies. |
| Experiment Setup | Yes | We used a batch size of 256, weight decay of 0.0005. In order to speed up the convergence, we double the batch size after every 5000 iterations. Learning rate is adjusted dynamically... The basic learning rate L is set to be 0.01, γ is set to be 0.0001 and p is set to be 1.5. Momentum is adjusted dynamically... We set the max momentum to be 0.9, M̂ to be 500. We also used ReLU [Nair and Hinton, 2010] as activation function... Local response normalization (LRN) [Krizhevsky et al., 2012]... All local response normalization layers use α = 0.0001, β = 0.75 and have a receptive field of size 5. (A hedged sketch of these schedules follows the table.) |
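
For readers reconstructing the optimizer settings from the Experiment Setup row, the sketch below assembles the reported constants into runnable form. The decay formula (Caffe's "inv" policy), the linear momentum ramp, and the helper names are assumptions on my part; the summary only states that the learning rate and momentum are "adjusted dynamically", and only the numeric values (L = 0.01, γ = 0.0001, p = 1.5, max momentum 0.9, M̂ = 500, batch size 256 doubled every 5000 iterations) come from the quoted setup.

```python
# Hedged sketch of the reported training schedule, not the authors' code.
# Assumption: the dynamic learning rate follows Caffe's "inv" policy,
# base_lr * (1 + gamma * iter) ** (-power), which fits the reported
# L = 0.01, gamma = 0.0001, p = 1.5. The momentum ramp is purely
# illustrative: a linear rise to the reported maximum of 0.9 over the
# first M_hat = 500 iterations.

BASE_LR = 0.01       # "basic learning rate L"
GAMMA = 0.0001
POWER = 1.5
MAX_MOMENTUM = 0.9
M_HAT = 500
INITIAL_BATCH = 256

def learning_rate(iteration: int) -> float:
    """Dynamic learning rate, assuming a Caffe-style 'inv' decay."""
    return BASE_LR * (1.0 + GAMMA * iteration) ** (-POWER)

def momentum(iteration: int) -> float:
    """Dynamic momentum, assumed to ramp linearly up to the reported maximum."""
    return MAX_MOMENTUM * min(1.0, iteration / M_HAT)

def batch_size(iteration: int) -> int:
    """Batch size doubles every 5000 iterations, as stated in the setup."""
    return INITIAL_BATCH * (2 ** (iteration // 5000))

if __name__ == "__main__":
    for it in (0, 500, 5000, 20000):
        print(f"iter={it:5d}  lr={learning_rate(it):.6f}  "
              f"momentum={momentum(it):.2f}  batch={batch_size(it)}")
```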