Machines of Finite Depth: Towards a Formalization of Neural Networks

Authors: Pietro Vertechi, Mattia G. Bergomi

AAAI 2023

Reproducibility Variable | Result | LLM Response
Research Type | Experimental
Evidence from the paper: "In addition to the theoretical framework, we implement the input-output computations of parametric machines, as well as their derivatives, in the Julia programming language (Bezanson et al. 2017) (both on CPU and GPU)" and "Section 3 is devoted to practical applications: we discuss in detail an implementation of machines that extends classical dense, convolutional, and recurrent networks with a rich shortcut structure." The caption of Figure 2 further documents the experiments: "Ratio of runtime of backward pass over forward pass. The runtimes of backward and forward pass are comparable, across different models, problem sizes, and devices."
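As an illustration of the runtime comparison quoted above, here is a minimal sketch of how one might measure the backward/forward ratio in Julia. The stand-in model, the sizes, and the use of Flux, Zygote, and BenchmarkTools are assumptions made for illustration; this is not the authors' benchmark code.

```julia
# Hypothetical sketch of the backward/forward runtime ratio reported in
# Figure 2; model and sizes are illustrative, not the paper's benchmark.
using Flux, Zygote, BenchmarkTools

model = Chain(Dense(128 => 256, relu), Dense(256 => 10))  # stand-in model
x = randn(Float32, 128, 64)                               # batch of 64

loss(m, x) = sum(abs2, m(x))

t_forward  = @belapsed loss($model, $x)                           # forward only
t_backward = @belapsed Zygote.gradient(m -> loss(m, $x), $model)  # fwd + bwd

@show t_backward / t_forward   # the ratio plotted in Figure 2
```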
Researcher Affiliation | Academia
The paper lists the authors' emails as 'pietro.vertechi@protonmail.com' and 'mattiagbergomi@gmail.com'. No institutional affiliations (university or company names) are provided in the paper text.
Pseudocode | Yes
The paper provides Algorithm 1 (computation of a non-equivariant machine):

Forward pass:
1: Initialize arrays y, z of size nc with values y = y0, z = z0
2: for i = 0 to n do
3:     y[Iᵢ] += W[Iᵢ, :] z, eq. (9)
4:     z[Iᵢ] += σ(y[Iᵢ]), eq. (8)
5: end for

Backward pass:
1: Initialize arrays u, v of size nc with values u = u0, v = v0
2: for i = n to 0 do
3:     u[Iᵢ] += (L[:, Iᵢ])ᵀ v, eq. (11)
4:     v[Iᵢ] += σ′(y[Iᵢ]) ⊙ u[Iᵢ], eq. (10)
5: end for
6: Initialize Q = v zᵀ, eq. (12)
7: Set Q[Iⱼ, Iᵢ] = 0 for all j ≤ i, eq. (12)
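To make the pseudocode concrete, the following is a hedged Julia sketch of Algorithm 1. It assumes the blocks Iᵢ are index ranges partitioning a vector of length nc, picks tanh as an example nonlinearity, and identifies the backward-pass operator L with the weight matrix W; it is a reading of the algorithm as printed, not the authors' implementation.

```julia
# Hedged sketch of Algorithm 1; the block ranges `I`, the activation `σ`,
# and taking L = W in the backward pass are illustrative assumptions.
using LinearAlgebra

σ(x)  = tanh(x)            # example nonlinearity
dσ(x) = 1 - tanh(x)^2      # its derivative

function machine_forward(W, I, y0, z0)
    y, z = copy(y0), copy(z0)
    for Ii in I                        # blocks I_0, …, I_n in order
        y[Ii] .+= W[Ii, :] * z         # eq. (9): accumulate pre-activations
        z[Ii] .+= σ.(y[Ii])            # eq. (8): accumulate activations
    end
    return y, z
end

function machine_backward(W, I, y, z, u0, v0)
    u, v = copy(u0), copy(v0)
    for Ii in reverse(I)               # blocks in reverse order
        u[Ii] .+= W[:, Ii]' * v        # eq. (11), with L taken to be W
        v[Ii] .+= dσ.(y[Ii]) .* u[Ii]  # eq. (10): chain rule through σ
    end
    Q = v * z'                         # eq. (12): outer-product gradient
    for (i, Ii) in pairs(I), (j, Ij) in pairs(I)
        j <= i && (Q[Ij, Ii] .= 0)     # zero the blocks excluded by eq. (12)
    end
    return u, v, Q
end

# Toy usage: three blocks of size 4, strictly lower-triangular weights.
I  = [1:4, 5:8, 9:12]
W  = tril(randn(12, 12), -1)
y0, z0 = zeros(12), zeros(12)
z0[1:4] .= randn(4)                    # input enters through the first block
y, z = machine_forward(W, I, y0, z0)
u, v, Q = machine_backward(W, I, y, z, zeros(12), ones(12))  # arbitrary seed
```

In this reading, the strictly lower-triangular structure of W is what makes the single forward sweep over the blocks sufficient: each block only ever reads activations of earlier blocks.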
Open Source Code | Yes
The implementation is open source and available at https://github.com/LimenResearch/ParametricMachinesDemos.jl.
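For reference, the repository can be added as an unregistered Julia package; the module name below is inferred from the repository URL and is an assumption.

```julia
using Pkg
Pkg.add(url = "https://github.com/LimenResearch/ParametricMachinesDemos.jl")  # unregistered package
using ParametricMachinesDemos  # module name inferred from the repo; assumption
```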
Open Datasets | No
The paper discusses different model architectures (dense, convolutional, and recurrent networks) and their runtime performance, but does not explicitly name any specific public datasets used for training or evaluation in the main text.
Dataset Splits | No
The paper does not specify any training/validation/test dataset splits, as it does not name any specific datasets used for its experiments.
Hardware Specification | No
The paper mentions implementation 'both on CPU and GPU' but does not provide specific hardware details such as CPU/GPU models, memory, or processor types.
Software Dependencies | No
The paper mentions using the Julia programming language but does not specify a version number for Julia, nor does it list any other key software dependencies or libraries with their versions.
Experiment Setup | No
The paper describes the mathematical framework and the general implementation strategy, but does not provide specific experimental setup details such as hyperparameter values (e.g., learning rate, batch size, number of epochs) or training configurations.