Optimal Activation Functions for the Random Features Regression Model
Authors: Jianxin Wang, José Bento
ICLR 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Nonetheless, here we test some of our more general conclusions on real data. This appendix is referenced in the main text in Section 3.3. In this section, the data, and the fact that we do not work with infinite dimensions, are the only deviations from our theoretical setup. In particular, we work with an RFR model. We use the MNIST data (Deng, 2012) to train an RFR model that approximates a function $f$, our ground truth object, defined as follows. For a given digit image $x$ with class $c \in \{0, 1, \dots, 9\}$, we define $f(x) = 5 + c/9$. |
| Researcher Affiliation | Academia | Jianxin Wang, Department of Electrical and Computer Engineering, Rice University, jw162@rice.edu; José Bento, Department of Computer Science, Boston College, bentoayr@bc.edu |
| Pseudocode | No | The paper does not include any explicitly labeled pseudocode or algorithm blocks. |
| Open Source Code | Yes | We include code to generate Figure 1 at the following GitHub link: https://github.com/Jeffwang87/RFR_AF. This code is also available in the supplementary zip file provided. |
| Open Datasets | Yes | We use the MNIST data (Deng, 2012) to train an RFR model that approximates a function $f$, our ground truth object, defined as follows. |
| Dataset Splits | No | The paper mentions using 4000 training samples and 10000 test samples, but it does not specify a separate validation dataset split. |
| Hardware Specification | Yes | We ran it using a MacBook Pro with 2.6 GHz 6-Core Intel Core i7 and 32 GB 2667 MHz DDR4. |
| Software Dependencies | Yes | It runs using Wolfram Mathematica V12. ... It runs using Matlab 2020b. |
| Experiment Setup | Yes | Training is done with $\lambda = 10^{-7}$. ... For the test set we use 10000 random samples. ... in Figure 4 we plot the test error E as a function of $\psi_1/\psi_2 = N/n$ when we have $n = 4000$ train samples and when the number of features $N$ ranges from 1 to 14250. ... In Figure 5 we plot the test error E as a function of $\lambda$ when $\psi_2 = 10$, when we have $n = \psi_2 d$ train samples, and when the number of features is very large, namely, $N = 10000$. |
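
The Research Type and Experiment Setup rows describe fitting a random features regression (RFR) model with a ridge penalty to the target 5 + c/9 and reporting test error against the ratio N/n. The following is a minimal Python sketch of that kind of pipeline, not the authors' code (which, per the table, runs in Mathematica and MATLAB). Several things here are assumptions made for illustration: the synthetic class-prototype data standing in for MNIST, the `tanh` activation, the sphere normalization of inputs and weights, the scaling of the ridge objective, the small problem sizes, and the value `lam=1e-3`. All function names are hypothetical.

```python
import numpy as np

def target(c):
    """Ground-truth value quoted in the table: 5 + c/9 for digit class c."""
    return 5.0 + np.asarray(c) / 9.0

def random_features(X, W, act=np.tanh):
    """Feature map act(<w_i, x> / sqrt(d)). The tanh default is an assumption."""
    d = X.shape[1]
    return act(X @ W.T / np.sqrt(d))

def rfr_fit(X, y, W, lam, act=np.tanh):
    """Ridge regression on random features.

    Minimizes (1/n) * ||Z a - y||^2 + lam * ||a||^2; this scaling of the
    objective is an assumption, not taken from the source.
    """
    Z = random_features(X, W, act)
    n, N = Z.shape
    return np.linalg.solve(Z.T @ Z / n + lam * np.eye(N), Z.T @ y / n)

def rfr_predict(X, W, a, act=np.tanh):
    """Predict with fitted second-layer coefficients a."""
    return random_features(X, W, act) @ a

def make_toy_data(rng, n, d, prototypes):
    """Hypothetical stand-in for MNIST: noisy copies of 10 class prototypes."""
    c = rng.integers(0, 10, size=n)
    X = prototypes[c] + 0.5 * rng.standard_normal((n, d))
    # Rescale every input to norm sqrt(d) (assumed normalization).
    X *= np.sqrt(d) / np.linalg.norm(X, axis=1, keepdims=True)
    return X, target(c)

rng = np.random.default_rng(0)
d, n_train, n_test, N = 20, 200, 500, 50   # illustrative sizes only
prototypes = rng.standard_normal((10, d))
X_train, y_train = make_toy_data(rng, n_train, d, prototypes)
X_test, y_test = make_toy_data(rng, n_test, d, prototypes)

# Random first-layer weights, fixed after sampling, rows rescaled to norm sqrt(d).
W = rng.standard_normal((N, d))
W *= np.sqrt(d) / np.linalg.norm(W, axis=1, keepdims=True)

a = rfr_fit(X_train, y_train, W, lam=1e-3)
test_mse = np.mean((rfr_predict(X_test, W, a) - y_test) ** 2)
print(f"N/n = {N / n_train:.2f}, test MSE = {test_mse:.4f}")
```

Sweeping `N` (with `n_train` fixed) or `lam` (with `N` large) and recording `test_mse` would give curves analogous in spirit to the sweeps the table attributes to Figures 4 and 5, though the numbers from this toy setup are not comparable to the paper's.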