Machines of Finite Depth: Towards a Formalization of Neural Networks
Authors: Pietro Vertechi, Mattia G. Bergomi
AAAI 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | 'In addition to the theoretical framework, we implement the input-output computations of parametric machines, as well as their derivatives, in the Julia programming language (Bezanson et al. 2017) (both on CPU and GPU.' 'Section 3 is devoted to practical applications: we discuss in detail an implementation of machines that extends classical dense, convolutional, and recurrent networks with a rich shortcut structure.' Figure 2 caption: 'Ratio of runtime of backward pass over forward pass. The runtimes of backward and forward pass are comparable, across different models, problem sizes, and devices.' |
| Researcher Affiliation | Academia | The paper lists authors' emails as 'pietro.vertechi@protonmail.com' and 'mattiagbergomi@gmail.com'. No institutional affiliations (university or company names) are provided in the paper text. |
| Pseudocode | Yes | Algorithm 1: Computation of non-equivariant machine. Forward pass: 1: initialize arrays y, z of size n_c with values y = y_0, z = z_0; 2: for i = 0 to n do; 3: y[I_i] += W[I_i, :] z, eq. (9); 4: z[I_i] += σ(y[I_i]), eq. (8); 5: end for. Backward pass: 1: initialize arrays u, v of size n_c with values u = u_0, v = v_0; 2: for i = n down to 0 do; 3: u[I_i] += (W[:, I_i])ᵀ v, eq. (11); 4: v[I_i] += σ′(y[I_i]) ⊙ u[I_i], eq. (10); 5: end for; 6: initialize Q = v zᵀ, eq. (12); 7: set Q[I_j, I_i] = 0 for all j ≤ i, eq. (12). (A runnable sketch of this algorithm is given after the table.) |
| Open Source Code | Yes | The implementation is open source and available at https://github.com/LimenResearch/ParametricMachinesDemos.jl. |
| Open Datasets | No | The paper discusses different model architectures (dense, convolutional, recurrent networks) and their performance (runtime), but does not explicitly name any specific public datasets used for training or evaluation in the main text. |
| Dataset Splits | No | The paper does not specify any training/validation/test dataset splits, as it does not name any specific datasets used for its experiments. |
| Hardware Specification | No | The paper mentions implementation 'both on CPU and GPU' but does not provide specific hardware details such as CPU/GPU models, memory, or processor types. |
| Software Dependencies | No | The paper mentions using the 'Julia programming language' but does not specify a version number for Julia or any other key software dependencies or libraries with their versions. |
| Experiment Setup | No | The paper describes the mathematical framework and general implementation strategy, but does not provide specific experimental setup details such as hyperparameter values (e.g., learning rate, batch size, number of epochs) or training configurations. |
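
For concreteness, below is a minimal sketch of the forward and backward passes of Algorithm 1, written in Python/NumPy rather than the authors' Julia implementation. The function names (`forward`, `backward`), the toy filtration and sizes, the `tanh` activation, and the use of Wᵀ in the backward linear step (reconstructed for consistency with the forward pass, since the extracted pseudocode is ambiguous there) are all illustrative assumptions, not code from the paper.

```python
import numpy as np

def forward(W, y0, z0, blocks, sigma):
    """Forward pass (Algorithm 1): resolve the machine block by block.

    blocks is the filtration I_0, ..., I_n (disjoint index arrays);
    W is assumed strictly block-triangular: W[I_i, I_j] == 0 unless j < i.
    """
    y, z = y0.astype(float).copy(), z0.astype(float).copy()
    for I in blocks:
        y[I] += W[I, :] @ z        # eq. (9): linear part, earlier blocks only
        z[I] += sigma(y[I])        # eq. (8): nonlinearity
    return y, z

def backward(W, y, z, u0, v0, blocks, sigma_prime):
    """Backward pass (Algorithm 1): adjoint recursion plus weight gradient."""
    u, v = u0.astype(float).copy(), v0.astype(float).copy()
    for I in reversed(blocks):
        u[I] += W[:, I].T @ v                # eq. (11): transposed linear part
        v[I] += sigma_prime(y[I]) * u[I]     # eq. (10): elementwise chain rule
    Q = np.outer(v, z)                       # eq. (12): candidate gradient of W
    for i, Ii in enumerate(blocks):          # eq. (12): zero the blocks where W
        for j, Ij in enumerate(blocks):      # is structurally zero (j <= i)
            if j <= i:
                Q[np.ix_(Ij, Ii)] = 0.0
    return u, v, Q

# Toy usage: 6 coordinates split into three blocks I_0, I_1, I_2 (assumed sizes).
rng = np.random.default_rng(0)
blocks = [np.arange(0, 2), np.arange(2, 4), np.arange(4, 6)]
W = rng.normal(size=(6, 6))
for i, Ii in enumerate(blocks):              # enforce W[I_i, I_j] = 0 for j >= i
    for j, Ij in enumerate(blocks):
        if j >= i:
            W[np.ix_(Ii, Ij)] = 0.0
y, z = forward(W, np.zeros(6), rng.normal(size=6), blocks, np.tanh)
u, v, Q = backward(W, y, z, np.zeros(6), rng.normal(size=6),
                   blocks, lambda t: 1.0 - np.tanh(t) ** 2)
```

The backward pass visits the blocks in reverse order because it computes the adjoint of the forward recursion, and the final masking of Q mirrors the strict block-triangularity imposed on W, so the gradient is kept only where W may be nonzero. This mirrors the paper's observation that the two passes have comparable structure, consistent with the runtime ratios reported in Figure 2.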