Short-Term Memory Convolutions
Authors: Grzegorz Stefański, Krzysztof Arendt, Paweł Daniluk, Bartłomiej Jasik, Artur Szumaczuk
ICLR 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In this study we demonstrate an application of this solution to a U-Net model for a speech separation task and Ghost Net model in acoustic scene classification (ASC) task. In case of speech separation we achieved a 5-fold reduction in inference time and a 2-fold reduction in latency without affecting the output quality. The inference time for ASC task was up to 4 times faster while preserving the original accuracy. |
| Researcher Affiliation | Industry | Grzegorz Stefański, Krzysztof Arendt, Paweł Daniluk, Bartłomiej Jasik, Artur Szumaczuk Samsung R&D Institute Poland {g.stefanski, k.arendt, p.daniluk, b.jasik, a.szumaczuk}@samsung.com |
| Pseudocode | No | The paper includes conceptual diagrams and mathematical equations but no pseudocode or clearly labeled algorithm blocks. |
| Open Source Code | No | The paper does not provide any links to open-source code or explicitly state that code for the methodology is being released. |
| Open Datasets | Yes | The Deep Noise Suppression (DNS) Challenge Interspeech 2020 dataset (Reddy et al., 2020), licensed under CC-BY 4.0, was adopted for both training and validation of networks as it can be easily processed for speech separation task. The TAU Urban Acoustic Scene 2020 Mobile dataset (Heittola et al., 2020), was adopted for both training and validation of networks as our dataset for ASC task. |
| Dataset Splits | No | The paper states that datasets were adopted for 'training and validation' but does not specify the explicit split percentages or sample counts for the validation set, nor does it refer to a predefined validation split. |
| Hardware Specification | Yes | Each model was tested in 1,000 independent runs on 8 cores of the Intel(R) Xeon(R) Gold 6246R CPU with 3.40 GHz clock... We used Samsung Fold 3 SM-F925U equipped with Qualcomm Snapdragon 888... All models (U-Net, Causal U-Net and Causal U-Net + STMC) were trained on Nvidia A100... |
| Software Dependencies | No | The paper mentions 'TFLite Model Benchmark Tool' and 'TFLite representation' but does not specify version numbers for TFLite or any other software dependencies. |
| Experiment Setup | Yes | All models (U-Net, Causal U-Net and Causal U-Net + STMC) were trained on Nvidia A100 with batch size of 128 and Adam optimizer with learning rate of 1e-3. Each model was trained for 100 epochs which took about 90 hours per model. |
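The reported setup names the Adam optimizer with a learning rate of 1e-3. As a minimal sketch of that update rule (not the authors' code; the beta1, beta2, and eps values below are Adam's common defaults, which the paper does not state), a single Adam step can be written as:

```python
import numpy as np

def adam_step(theta, grad, m, v, t, lr=1e-3, beta1=0.9, beta2=0.999, eps=1e-8):
    """One Adam update with bias correction; lr=1e-3 matches the paper's setup."""
    m = beta1 * m + (1 - beta1) * grad          # first-moment (mean) estimate
    v = beta2 * v + (1 - beta2) * grad ** 2     # second-moment (uncentered variance) estimate
    m_hat = m / (1 - beta1 ** t)                # bias-corrected first moment
    v_hat = v / (1 - beta2 ** t)                # bias-corrected second moment
    theta = theta - lr * m_hat / (np.sqrt(v_hat) + eps)
    return theta, m, v

# Toy check: minimise f(x) = (x - 3)^2 with the update above.
x = np.float64(0.0)
m = v = np.float64(0.0)
for t in range(1, 10001):
    grad = 2 * (x - 3)
    x, m, v = adam_step(x, grad, m, v, t)
```

With a constant-sign gradient, Adam moves roughly `lr` per step, so about 3,000 of the 10,000 steps are spent reaching the minimum at x = 3 and the rest settle the oscillation around it.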