A Context-Integrated Transformer-Based Neural Network for Auction Design

Authors: Zhijian Duan, Jingwu Tang, Yutong Yin, Zhe Feng, Xiang Yan, Manzil Zaheer, Xiaotie Deng

ICML 2022

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "In this section, we conduct empirical experiments to show the effectiveness of CITransNet in different contextual auctions... We present the experimental results of Setting A, B and C in Table 1."
Researcher Affiliation | Collaboration | "¹Peking University, Beijing, China; ²Google Research, Mountain View, US; ³Shanghai Jiao Tong University, Shanghai, China; ⁴Google DeepMind, Mountain View, US."
Pseudocode | Yes | "Algorithm 1 describes the training procedure of CITransNet." (the training-loop bookkeeping is sketched below)
Open Source Code | Yes | "Our implementation is available at https://github.com/zjduan/CITransNet."
Open Datasets | No | "For all the settings (Setting A-I), we generate the training set of each setting with size in {50000, 100000, 200000} and test set of size 5000."
Dataset Splits | No | "For all the settings (Setting A-I), we generate the training set of each setting with size in {50000, 100000, 200000} and test set of size 5000."
Hardware Specification | No | "Our experiments are run on a Linux machine with NVIDIA Graphics Processing Unit (GPU) cores."
Software Dependencies | No | "All the models and regret are optimized through Adam (Kingma & Ba, 2014) optimizer."
Experiment Setup | Yes | "We set the embedding size in settings with discrete context (Setting A, B, D, E, F) as 16. The value of ρ in the augmented Lagrangian (Equation (22)) was set as 1.0 at the beginning and incremented by 5 every two epochs. The value of λ in Equation (22) was set as 5.0 initially and incremented every certain number (selected from {2, 10}) of epochs... For our proposed CITransNet, the output channels of the first 1×1 convolution in both the input layer and the interaction layers are set to 64. We set d = 64 for the 1×1 convolution with residual connection in the input layer, and d_h = 64 for the final 1×1 convolution in each interaction layer. We tune the number of interaction layers from {2, 3}, and in each interaction layer we adopt a transformer with 4 heads and 64 hidden nodes." (both the ρ/λ schedule and the interaction-layer shapes are sketched below)
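The quoted ρ/λ schedule is concrete enough to sketch. Below is a minimal reconstruction of the training-loop bookkeeping, assuming a RegretNet-style augmented Lagrangian with one multiplier per bidder and a standard multiplier-ascent update (λ ← λ + ρ · regret); the update rule and the per-bidder shape are assumptions, since the quotes only give the initial values and increment periods, and the placeholder regret values are illustrative only.

```python
import torch

def augmented_lagrangian_loss(revenue, regret, lam, rho):
    # Minimize -revenue subject to (near-)zero regret:
    #   L = -revenue + lam^T regret + (rho / 2) * ||regret||^2
    # The exact per-bidder form of the paper's Equation (22) is assumed here.
    return -revenue + (lam * regret).sum() + 0.5 * rho * (regret ** 2).sum()

n_bidders = 3
rho = 1.0                              # quoted: starts at 1.0
lam = torch.full((n_bidders,), 5.0)    # quoted: starts at 5.0
lam_period = 2                         # quoted: tuned from {2, 10}

for epoch in range(10):
    # ... one epoch of Adam updates on augmented_lagrangian_loss ...
    regret = torch.rand(n_bidders) * 1e-2   # placeholder for estimated regret
    if (epoch + 1) % 2 == 0:
        rho += 5.0                          # quoted: +5 every two epochs
    if (epoch + 1) % lam_period == 0:
        lam = lam + rho * regret            # multiplier ascent (assumption)
```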
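The setup quote also pins down the interaction-layer shapes: 1×1 convolutions with 64 output channels and, inside each interaction layer, a 4-head transformer with 64 hidden nodes. The sketch below assembles those pieces into one plausible layer over a (bidders, items, d) tensor. The row/column attention split and the concatenate-then-1×1-conv mixing are assumptions rather than the paper's exact layer, and the names InteractionLayer, row_attn, col_attn, and mix are hypothetical.

```python
import torch
import torch.nn as nn

class InteractionLayer(nn.Module):
    """One CITransNet-style interaction layer (hedged sketch)."""

    def __init__(self, d=64, d_hidden=64, n_heads=4):
        super().__init__()
        # 4-head transformer with 64 hidden nodes, applied across items
        # for each bidder (rows) and across bidders for each item (columns).
        self.row_attn = nn.TransformerEncoderLayer(
            d_model=d, nhead=n_heads, dim_feedforward=d_hidden, batch_first=True)
        self.col_attn = nn.TransformerEncoderLayer(
            d_model=d, nhead=n_heads, dim_feedforward=d_hidden, batch_first=True)
        # Final 1x1 convolution of the layer, with d_h = 64 output channels.
        self.mix = nn.Conv2d(2 * d, d, kernel_size=1)

    def forward(self, x):                          # x: (batch, n, m, d)
        b, n, m, d = x.shape
        rows = self.row_attn(x.reshape(b * n, m, d)).reshape(b, n, m, d)
        cols = self.col_attn(
            x.permute(0, 2, 1, 3).reshape(b * m, n, d)
        ).reshape(b, m, n, d).permute(0, 2, 1, 3)
        h = torch.cat([rows, cols], dim=-1)        # (b, n, m, 2d)
        h = h.permute(0, 3, 1, 2)                  # channels-first for Conv2d
        return self.mix(h).permute(0, 2, 3, 1)     # back to (b, n, m, d)

x = torch.randn(8, 3, 5, 64)        # 8 samples, 3 bidders, 5 items, d = 64
print(InteractionLayer()(x).shape)  # torch.Size([8, 3, 5, 64])
```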