A Foundation Model for Error Correction Codes

Authors: Yoni Choukroun, Lior Wolf

ICLR 2024 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Section 5 (Experiments): "To evaluate our method, we train one model of the proposed architecture with four classes of linear codes: Low-Density Parity Check (LDPC) codes (Gallager, 1962), Polar codes (Arikan, 2008), Reed Solomon codes (Reed & Solomon, 1960) and Bose Chaudhuri Hocquenghem (BCH) codes (Bose & Ray-Chaudhuri, 1960)."
Researcher Affiliation | Academia | "Yoni Choukroun, The Blavatnik School of Computer Science, Tel Aviv University, choukroun.yoni@gmail.com; Lior Wolf, The Blavatnik School of Computer Science, Tel Aviv University, wolf@cs.tau.ac.il"
Pseudocode | No | No pseudocode or algorithm blocks are explicitly presented or labeled in the paper.
Open Source Code | No | The paper contains no explicit statement about releasing the source code for the method and no direct link to a code repository.
Open Datasets | Yes | "The code database from Helmling et al. (2019) was web-scraped in order to extract all possible binary codes."
Dataset Splits | No | The paper mentions training data ("512 samples per minibatch") and test data ("at least 10^5 random codewords are decoded"), but gives neither percentages nor absolute counts for training, validation, and test splits, and does not cite predefined splits.
Hardware Specification | Yes | "Training and experiments are performed on four 12GB GeForce RTX 2080 Ti GPUs."
Software Dependencies | No | The paper mentions the use of the Adam optimizer and GEGLU layers but does not provide specific version numbers for software dependencies such as deep learning frameworks (e.g., PyTorch, TensorFlow) or other libraries.
Experiment Setup | Yes | "The decoder is defined as a concatenation of N = 6 decoding layers composed of self-attention and feed-forward layers interleaved with normalization layers, with d = 128. The Adam optimizer (Kingma & Ba, 2014) is used with 512 samples per minibatch, for 3000 epochs, with 1000 minibatches per epoch. We initialized the learning rate to 10^-4, coupled with a cosine decay scheduler down to 10^-6 at the end of the training."
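
The experiment-setup excerpt above fixes the architecture shape (N = 6 decoding layers with self-attention, feed-forward, and normalization at d = 128; GEGLU feed-forward per the software-dependencies row) and the optimization schedule (Adam at 10^-4 with cosine decay to 10^-6 over 3000 epochs of 1000 minibatches of 512 samples), but no code is released. The following is a minimal PyTorch sketch of that reported configuration, not the authors' implementation; the head count, feed-forward width, and module names (GEGLU, DecoderLayer) are illustrative assumptions.

```python
# Minimal sketch (not the authors' code) of the reported setup: 6 decoding
# layers with self-attention, GEGLU feed-forward, and layer normalization at
# d = 128, trained with Adam at 1e-4 cosine-decayed to 1e-6.
import torch
import torch.nn as nn
import torch.nn.functional as F


class GEGLU(nn.Module):
    """Gated-GELU feed-forward block; the width d_ff is an assumption, not from the paper."""

    def __init__(self, d_model: int, d_ff: int):
        super().__init__()
        self.proj = nn.Linear(d_model, 2 * d_ff)
        self.out = nn.Linear(d_ff, d_model)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        a, b = self.proj(x).chunk(2, dim=-1)
        return self.out(a * F.gelu(b))


class DecoderLayer(nn.Module):
    """One decoding layer: self-attention and feed-forward, each preceded by LayerNorm."""

    def __init__(self, d_model: int = 128, n_heads: int = 8, d_ff: int = 512):
        super().__init__()
        self.norm1 = nn.LayerNorm(d_model)
        self.attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.norm2 = nn.LayerNorm(d_model)
        self.ff = GEGLU(d_model, d_ff)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        h = self.norm1(x)
        x = x + self.attn(h, h, h, need_weights=False)[0]
        return x + self.ff(self.norm2(x))


# N = 6 decoding layers with embedding dimension d = 128, as stated in the paper.
decoder = nn.Sequential(*[DecoderLayer(d_model=128) for _ in range(6)])

# Adam at 1e-4, cosine-decayed to 1e-6 over 3000 epochs x 1000 minibatches.
optimizer = torch.optim.Adam(decoder.parameters(), lr=1e-4)
scheduler = torch.optim.lr_scheduler.CosineAnnealingLR(
    optimizer, T_max=3000 * 1000, eta_min=1e-6
)
```

In a training loop matching the quoted schedule, scheduler.step() would be called once per minibatch, so the learning rate reaches 10^-6 after the 3000 x 1000 steps.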