Hyperparameter Learning via Distributional Transfer
Authors: Ho Chung Leon Law, Peilin Zhao, Lucian Chan, Junzhou Huang, Dino Sejdinovic
NeurIPS 2019
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Empirically, across a range of regression and classification tasks, our methodology performs favourably at initialisation and has faster convergence compared to existing baselines; in some cases, the optimal accuracy is achieved in just a few evaluations. |
| Researcher Affiliation | Collaboration | Ho Chung Leon Law (University of Oxford, ho.law@stats.ox.ac.uk); Peilin Zhao (Tencent AI Lab, masonzhao@tencent.com); Lucian Chan (University of Oxford, leung.chan@stats.ox.ac.uk); Junzhou Huang (Tencent AI Lab, joehhuang@tencent.com); Dino Sejdinovic (University of Oxford, dino.sejdinovic@stats.ox.ac.uk) |
| Pseudocode | No | The paper does not contain structured pseudocode or algorithm blocks. |
| Open Source Code | No | The paper does not provide concrete access to source code for the methodology described. It mentions using TensorFlow for implementation but does not share its own code. |
| Open Datasets | Yes | In particular, the Protein dataset consists of 7 different proteins extracted from [9]: ADAM17, AKT1, BRAF, COX1, FXA, GR, VEGFR2. |
| Dataset Splits | Yes | For testing, we use the same number of samples s_i for toy data, while using a 60-40 train-test split for real data. (A minimal sketch of this split appears after the table.) |
| Hardware Specification | Yes | Training time is less than 2 minutes on a standard 2.60GHz single-core CPU in all experiments. |
| Software Dependencies | No | The paper mentions using 'TensorFlow [1] for implementation' and 'SciPy [14]', but does not provide specific version numbers for these or other software dependencies. |
| Experiment Setup | Yes | For φx and φy we will use a single hidden layer NN with tanh activation (with 20 hidden and 10 output units), except for classification tasks, where we use a one-hot encoding for φy. [...] For BLR, we will follow [26] and take feature map υ to be a NN with three 50-unit layers and tanh activation. [...] We take the embedding batch-size b = 1000, and learning rate for ADAM to be 0.005. (These settings are sketched in code after the table.) |
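
For the Dataset Splits row above, here is a minimal sketch of the reported 60-40 train-test split in Python. The array names, dimensions, and random seed are illustrative assumptions, not taken from the paper; only the 60-40 ratio comes from the quoted text.

```python
# Minimal sketch of the paper's 60-40 train-test split on real data.
# X and y are hypothetical placeholders for one task's features/targets;
# the random seed is an assumption (the paper does not report one).
import numpy as np
from sklearn.model_selection import train_test_split

X = np.random.randn(500, 8)   # hypothetical feature matrix
y = np.random.randn(500)      # hypothetical regression targets

X_train, X_test, y_train, y_test = train_test_split(
    X, y, train_size=0.6, test_size=0.4, random_state=0
)
print(X_train.shape, X_test.shape)  # (300, 8) (200, 8)
```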
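Since the paper reports a TensorFlow implementation, the Experiment Setup row can also be read as the following tf.keras sketch. Only the layer widths, activations, batch size, and learning rate come from the quoted text; the input dimensions and variable names (`x_dim`, `phi_x`, `upsilon`) are assumptions for illustration, not the authors' code.

```python
# Sketch of the network shapes quoted above, using tf.keras.
# Input dimensions are assumed; layer widths/activations follow the paper.
import tensorflow as tf

x_dim = 5  # hypothetical input dimension

# phi_x: single hidden layer, tanh activation, 20 hidden and 10 output units
phi_x = tf.keras.Sequential([
    tf.keras.Input(shape=(x_dim,)),
    tf.keras.layers.Dense(20, activation="tanh"),
    tf.keras.layers.Dense(10),
])

# BLR feature map upsilon: three 50-unit tanh layers, following [26]
upsilon = tf.keras.Sequential([
    tf.keras.Input(shape=(x_dim,)),
    tf.keras.layers.Dense(50, activation="tanh"),
    tf.keras.layers.Dense(50, activation="tanh"),
    tf.keras.layers.Dense(50, activation="tanh"),
])

# ADAM optimiser with the quoted learning rate; embedding batch size b = 1000
optimizer = tf.keras.optimizers.Adam(learning_rate=0.005)
batch_size = 1000
```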