SplineNets: Continuous Neural Decision Graphs
Authors: Cem Keskin, Shahram Izadi
NeurIPS 2018
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experiments show that our approach can significantly increase the accuracy of ResNets with negligible cost in speed, matching the precision of a 110 level ResNet with a 32 level SplineNet. We compared against 32 and 110 level ResNets on CIFAR-10, which was augmented with random shifts, crops and horizontal flips, followed by per-image whitening. Figure 8 shows how the accuracy, model size and runtime complexity for a single sample (measured in FLOPS) are affected when going from two to five knots. |
| Researcher Affiliation | Industry | Cem Keskin cemkeskin@google.com Shahram Izadi shahrami@google.com |
| Pseudocode | No | The paper does not contain any structured pseudocode or algorithm blocks. It describes methods in prose and uses mathematical equations, but no code-like formatting. |
| Open Source Code | No | The paper does not provide any explicit statements about releasing source code or links to a code repository for the methodology described. |
| Open Datasets | Yes | We compared against 32 and 110 level ResNets on CIFAR-10, which was augmented with random shifts, crops and horizontal flips, followed by per-image whitening. We trained LeNet-32, LeNet-64 and LeNet-128 as baseline models on CIFAR-10. Finally, we experimented with Spline-LeNets on MNIST. (A sketch of this preprocessing pipeline appears after the table.) |
| Dataset Splits | No | The paper mentions using CIFAR-10 and MNIST datasets but does not explicitly provide details about training, validation, or test dataset splits (e.g., specific percentages, sample counts, or citations to predefined splits). |
| Hardware Specification | No | The paper does not provide specific details about the hardware used for running the experiments, such as CPU/GPU models, memory, or cloud instance specifications. |
| Software Dependencies | No | The paper mentions using 'Tensorflow [22]' for implementation but does not specify a version number for Tensorflow or any other software dependencies, which is required for reproducibility. |
| Experiment Setup | Yes | A batch size of 250 and learning rate 0.3 were found to be optimal for CIFAR-10. Initializer variance constant c was set to 0.05, number of bins B to 50 and quantization slope υ to 100. We found it useful to add a multiplicative slope parameter inside the decision sigmoid functions, which was set to 0.4. The diffusion parameter had almost no effect on the shallow LeNet, so α was set to 1. The regularizer had a more significant effect on the shallow LeNet, with w_s and w_u set to 0.2 giving the best results. On ResNet, the effect was less significant after fine tuning the initialization and sigmoid slope parameters. Setting w_s and w_u to 0.05 gave slightly better results. (These settings are collected into a configuration sketch after the table.) |
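
The CIFAR-10 preprocessing quoted in the Research Type and Open Datasets rows (random shifts, crops, horizontal flips, per-image whitening) is described only in prose, and the paper's TensorFlow implementation is not released. A minimal sketch, assuming TensorFlow 2's `tf.image` utilities and a pad-then-crop realization of the random shift, could look like this; the 4-pixel pad is an assumption, not a value reported by the authors.

```python
import tensorflow as tf

def augment_cifar10(image):
    """Illustrative CIFAR-10 preprocessing: random shift/crop, horizontal
    flip, and per-image whitening, as described (but not released) in the
    paper. The 4-pixel padding is an assumption, not a reported value."""
    # Random shift via zero-padding followed by a random 32x32 crop.
    image = tf.pad(image, [[4, 4], [4, 4], [0, 0]])
    image = tf.image.random_crop(image, size=[32, 32, 3])
    # Random horizontal flip.
    image = tf.image.random_flip_left_right(image)
    # Per-image whitening: zero mean, unit (adjusted) standard deviation.
    return tf.image.per_image_standardization(image)
```

In a `tf.data` pipeline this would be applied per training image, e.g. `train_dataset.map(augment_cifar10)`.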
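The Experiment Setup row reports all hyperparameters in prose. Collected into a single reference configuration (the key names below are descriptive labels chosen here, not identifiers from the authors' unreleased code):

```python
# Reported SplineNet hyperparameters for CIFAR-10; names are descriptive
# labels chosen for this summary, not identifiers from the authors' code.
SPLINENET_CIFAR10_CONFIG = {
    "batch_size": 250,
    "learning_rate": 0.3,
    "initializer_variance_c": 0.05,
    "num_bins_B": 50,
    "quantization_slope_upsilon": 100,
    "decision_sigmoid_slope": 0.4,
    "diffusion_alpha": 1.0,              # had almost no effect on the shallow LeNet
    "regularizer_w_s_w_u_lenet": 0.2,    # best results for Spline-LeNets
    "regularizer_w_s_w_u_resnet": 0.05,  # slightly better for Spline-ResNets
}
```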