CRISP: Curriculum based Sequential neural decoders for Polar code family

Authors: S Ashwin Hebbar, Viraj Vivek Nadkarni, Ashok Vardhan Makkuva, Suma Bhat, Sewoong Oh, Pramod Viswanath

ICML 2023

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We design a principled curriculum, guided by information-theoretic insights, to train CRISP and show that it outperforms the successive-cancellation (SC) decoder and attains near-optimal reliability performance on the Polar(32, 16) and Polar(64, 22) codes. The choice of the proposed curriculum is critical in achieving the accuracy gains of CRISP, as we show by comparing against other curricula.
Researcher Affiliation | Academia | Princeton University, EPFL, University of Washington.
Pseudocode | No | Appendix A is titled 'Successive Cancellation decoder' and provides mathematical expressions and step-by-step descriptions for decoding, but it is presented as descriptive text and formulas rather than a formal pseudocode block or algorithm.
Open Source Code | Yes | Source code available at the following link. We provide our code at the following link.
Open Datasets | No | Data generation. The input message u ∈ {0, 1}^k is drawn uniformly at random from the Boolean hypercube and encoded as a polar codeword x ∈ {±1}^n. The classical additive white Gaussian noise (AWGN) channel, y = x + z, z ∼ N(0, σ²I_n), generates the training/test data (y, u) for the decoder. (A data-generation sketch follows the table.)
Dataset Splits | No | The paper mentions 'training/test data' and 'validation BER' in figures, but it does not specify explicit percentages or counts for training, validation, or test splits.
Hardware Specification | Yes | To quantitatively compare the complexities of these decoders, we evaluate their throughput on a single GTX 1080 Ti GPU as well as a CPU (Intel i7-6850K, 12 threads). This training schedule required 13-15 hours of training on a GTX 1080 Ti GPU.
Software Dependencies | No | For training our models (both sequential and block decoders), we use the AdamW optimizer (Loshchilov & Hutter, 2017) with a learning rate of 10^-3. The paper mentions the AdamW optimizer and the 'aff3ct toolbox' but does not provide specific version numbers for these or other relevant software libraries/frameworks.
Experiment Setup | Yes | For training our models (both sequential and block decoders), we use the AdamW optimizer (Loshchilov & Hutter, 2017) with a learning rate of 10^-3. We use a batch size of 4096 or 8192. We use a 2-layer GRU with a hidden state size of 512. To train CRISP for Polar(64, 22), we use the following curriculum schedule: train each subcode for 2000 iterations, and finally train the full code until convergence with a decaying learning rate. (A training-setup sketch follows the table.)
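
The Open Datasets row above describes on-the-fly data generation rather than a fixed dataset. The NumPy sketch below shows one way that pipeline can be realized; the blocklength, the choice of information positions, and the noise level are illustrative placeholders rather than values taken from the paper's released code.

```python
import numpy as np

def polar_transform(m):
    """m-fold Kronecker power of the polar kernel F = [[1, 0], [1, 1]]."""
    F = np.array([[1, 0], [1, 1]], dtype=np.int64)
    G = np.array([[1]], dtype=np.int64)
    for _ in range(m):
        G = np.kron(G, F)
    return G

def generate_batch(batch_size, n, info_positions, sigma, rng):
    """Sample (y, u) pairs: message -> polar codeword -> BPSK -> AWGN channel."""
    k = len(info_positions)
    G = polar_transform(int(np.log2(n)))
    # u in {0, 1}^k, drawn uniformly from the Boolean hypercube
    u = rng.integers(0, 2, size=(batch_size, k))
    # Place the message bits at the information positions; frozen positions stay 0
    v = np.zeros((batch_size, n), dtype=np.int64)
    v[:, info_positions] = u
    # Polar encoding (mod 2) followed by BPSK mapping to x in {+1, -1}^n
    x = 1 - 2 * ((v @ G) % 2)
    # AWGN channel: y = x + z, z ~ N(0, sigma^2 I_n)
    y = x + sigma * rng.standard_normal((batch_size, n))
    return y, u

# Example: Polar(32, 16) with a placeholder information set (not the paper's choice)
rng = np.random.default_rng(0)
y, u = generate_batch(batch_size=4096, n=32,
                      info_positions=np.arange(16, 32), sigma=1.0, rng=rng)
```

Because every (y, u) pair is sampled fresh from the channel model, there is no finite dataset to split, which is consistent with the Dataset Splits row above.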
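
The Experiment Setup row lists the reported hyperparameters (2-layer GRU with hidden size 512, AdamW at 10^-3, batch size 4096, 2000 iterations per subcode, then the full code with a decaying learning rate). The PyTorch sketch below wires these together under stated assumptions: the decoder interface, the subcode schedule, and the decay factor are illustrative, and generate_batch is reused from the sketch above; this is not the paper's released implementation.

```python
import numpy as np
import torch
import torch.nn as nn

class GRUDecoder(nn.Module):
    """2-layer GRU decoder with hidden size 512, one logit per message bit."""
    def __init__(self, n, hidden_size=512, num_layers=2):
        super().__init__()
        self.gru = nn.GRU(input_size=n, hidden_size=hidden_size,
                          num_layers=num_layers, batch_first=True)
        self.head = nn.Linear(hidden_size, 1)

    def forward(self, y, k):
        # Repeat the channel output at each of the k decoding steps
        # (a simplification of the sequential decoder's per-step input).
        out, _ = self.gru(y.unsqueeze(1).repeat(1, k, 1))
        return self.head(out).squeeze(-1)          # (batch, k) logits

n = 64
decoder = GRUDecoder(n)
opt = torch.optim.AdamW(decoder.parameters(), lr=1e-3)
loss_fn = nn.BCEWithLogitsLoss()
rng = np.random.default_rng(0)

def train_step(info_positions, batch_size=4096, sigma=1.0):
    # Reuses generate_batch from the data-generation sketch above.
    y_np, u_np = generate_batch(batch_size, n, info_positions, sigma, rng)
    y, u = torch.from_numpy(y_np).float(), torch.from_numpy(u_np).float()
    loss = loss_fn(decoder(y, len(info_positions)), u)
    opt.zero_grad(); loss.backward(); opt.step()
    return loss.item()

# Placeholder curriculum: nested information sets of growing size, ending with a
# 22-bit set standing in for Polar(64, 22). Index choices are illustrative only.
subcodes = [np.arange(n - k, n) for k in (6, 12)]
full_code = np.arange(n - 22, n)

for info_positions in subcodes:          # train each subcode for 2000 iterations
    for _ in range(2000):
        train_step(info_positions)

sched = torch.optim.lr_scheduler.ExponentialLR(opt, gamma=0.99)
for _ in range(2000):                    # "until convergence" in the paper
    train_step(full_code)
    sched.step()
```

In the paper the subcode progression is chosen with information-theoretic guidance; the nested index sets here merely stand in for that schedule.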