Boosting Dilated Convolutional Networks with Mixed Tensor Decompositions

Authors: Nadav Cohen, Ronen Tamari, Amnon Shashua

ICLR 2018

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Empirical evaluation demonstrates how the expressive efficiency of connectivity, similarly to that of depth, translates into gains in accuracy. An experiment on the TIMIT speech corpus (Garofolo et al. (1993)) evaluates the dilated convolutional network architectures covered by the analysis.
Researcher Affiliation | Collaboration | Nadav Cohen (Institute for Advanced Study, cohennadav@ias.edu); Ronen Tamari (The Hebrew University of Jerusalem, ronent@cs.huji.ac.il); Amnon Shashua (The Hebrew University of Jerusalem, shashua@cs.huji.ac.il)
Pseudocode | No | The paper provides formal mathematical definitions and equations (e.g., eq. 4) but does not include structured pseudocode or algorithm blocks.
Open Source Code | No | The paper names the framework used for the experiments ("The framework chosen for running the experiment was Caffe toolbox (Jia et al. (2014))"), but provides no statement or link indicating that the authors' own code is open-sourced.
Open Datasets | Yes | We trained a baseline dilated convolutional network N... to classify individual phonemes in the TIMIT acoustic speech corpus (Garofolo et al. (1993)).
Dataset Splits | Yes | We split the data into train and validation sets in accordance with Halberstadt (1998).
Hardware Specification | No | The paper does not provide specific hardware details (e.g., exact GPU/CPU models, memory amounts, or detailed machine specifications) used for running its experiments.
Software Dependencies | No | The paper mentions software such as the Caffe toolbox (Jia et al. (2014)) and the Adam optimizer (Kingma and Ba (2014)) but does not provide specific version numbers for these components.
Experiment Setup | Yes | In accordance with WaveNet, the baseline dilated convolutional network had ReLU activation (g(a, b) = max{a + b, 0}; see sec. 3.1), 32 channels per layer, and input vectors of dimension 256 holding one-hot quantizations of the audio signal. The number of layers L was set to 12, corresponding to an input window of N = 2^L = 4096 samples, spanning 250ms of audio signal, standard practice with the TIMIT dataset. The framework chosen for running the experiment was the Caffe toolbox (Jia et al. (2014)), and the Adam optimizer (Kingma and Ba (2014)) was used for training (with default hyper-parameters: moment decay rates β1 = 0.9, β2 = 0.999; learning rate α = 0.001). Weight decay and batch size were set to 10^-5 and 128 respectively. Models were trained for 35000 iterations, with the learning rate decreased by a factor of 10 after 80% of the iterations took place.
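For concreteness, the arithmetic behind this setup can be sketched in Python. This is a hypothetical illustration, not the authors' code: the function names are invented here, the 16 kHz sampling rate is an assumption (standard for TIMIT), and the step schedule mirrors the "drop by 10× after 80% of iterations" description above.

```python
# Hypothetical sketch of the reported setup (not the authors' code).

def receptive_field(num_layers: int) -> int:
    """Input window of a binary-tree dilated stack: N = 2^L (L = 12 -> 4096)."""
    return 2 ** num_layers

def relu_pool(a: float, b: float) -> float:
    """The paper's activation g(a, b) = max{a + b, 0} (sec. 3.1)."""
    return max(a + b, 0.0)

def learning_rate(iteration: int, base_lr: float = 0.001,
                  total_iters: int = 35000, drop_frac: float = 0.8,
                  factor: float = 10.0) -> float:
    """Step schedule: divide base_lr by `factor` after 80% of iterations."""
    return base_lr if iteration < drop_frac * total_iters else base_lr / factor

window = receptive_field(12)
print(window)                      # → 4096 samples
print(window / 16000.0)            # ~0.25 s of audio, assuming 16 kHz sampling
print(learning_rate(1000))         # base rate before the drop
print(learning_rate(30000))        # rate after the 80% mark (28000 iterations)
```

The 4096-sample window matching roughly 250ms confirms the paper's claim under the assumed 16 kHz TIMIT sampling rate.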