CRISP: Curriculum based Sequential neural decoders for Polar code family
Authors: S Ashwin Hebbar, Viraj Vivek Nadkarni, Ashok Vardhan Makkuva, Suma Bhat, Sewoong Oh, Pramod Viswanath
ICML 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We design a principled curriculum, guided by information-theoretic insights, to train CRISP and show that it outperforms the successive-cancellation (SC) decoder and attains near-optimal reliability performance on the Polar(32, 16) and Polar(64, 22) codes. The choice of the proposed curriculum is critical in achieving the accuracy gains of CRISP, as we show by comparing against other curricula. |
| Researcher Affiliation | Academia | 1Princeton University 2EPFL 3University of Washington. |
| Pseudocode | No | Appendix A is titled 'Successive Cancellation decoder' and provides mathematical expressions and step-by-step descriptions for decoding, but it is presented as descriptive text and formulas rather than a formal pseudocode block or algorithm. |
| Open Source Code | Yes | Source code available at the following link. We provide our code at the following link. |
| Open Datasets | No | Data generation. The input message u ∈ {0, 1}^k is drawn uniformly at random from the Boolean hypercube and encoded as a polar codeword x ∈ {±1}^n. The classical additive white Gaussian noise (AWGN) channel, y = x + z, z ∼ N(0, σ²Iₙ), generates the training/test data (y, u) for the decoder. (A data-generation sketch appears after this table.) |
| Dataset Splits | No | The paper mentions 'training/test data' and 'validation BER' in figures, but it does not specify explicit percentages or counts for training, validation, or test dataset splits. |
| Hardware Specification | Yes | To quantitatively compare the complexities of these decoders, we evaluate their throughput on a single GTX 1080 Ti GPU as well as a CPU (Intel i7-6850K, 12 threads). This training schedule required 13-15 hours of training on a GTX 1080Ti GPU. |
| Software Dependencies | No | For training our models (both sequential and block decoders), we use the AdamW optimizer (Loshchilov & Hutter, 2017) with a learning rate of 10⁻³. The paper mentions the AdamW optimizer and the 'aff3ct toolbox' but does not provide specific version numbers for these or other relevant software libraries/frameworks. |
| Experiment Setup | Yes | For training our models (both sequential and block decoders), we use the AdamW optimizer (Loshchilov & Hutter, 2017) with a learning rate of 10⁻³. We use a batch size of 4096 or 8192. We use a 2-layer GRU with a hidden state size of 512. To train CRISP for Polar(64, 22), we use the following curriculum schedule: train each subcode for 2000 iterations, and finally train the full code until convergence with a decaying learning rate. (A training-loop sketch follows the data-generation sketch below.) |
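
To make the data-generation row concrete, here is a minimal NumPy sketch of producing (y, u) pairs for an (n, k) polar code under the stated model: messages drawn uniformly from {0, 1}^k, polar-encoded, BPSK-mapped to ±1, and passed through an AWGN channel. The function names, the omission of the bit-reversal permutation, the choice of information positions, and the Eb/N0-based noise convention are illustrative assumptions, not details taken from the paper's released code.

```python
import numpy as np

def polar_transform(n):
    """Polar transform G_n: the log2(n)-fold Kronecker power of F = [[1, 0], [1, 1]] over GF(2).
    The bit-reversal permutation is omitted here for simplicity (an assumption)."""
    G = np.array([[1]], dtype=np.int64)
    F = np.array([[1, 0], [1, 1]], dtype=np.int64)
    while G.shape[0] < n:
        G = np.kron(G, F)
    return G % 2

def generate_batch(batch_size, n, info_positions, snr_db, rate):
    """Draw u ~ Unif{0,1}^k, polar-encode, BPSK-map to +/-1, add AWGN noise."""
    k = len(info_positions)
    u = np.random.randint(0, 2, size=(batch_size, k))
    v = np.zeros((batch_size, n), dtype=np.int64)
    v[:, info_positions] = u                    # message bits on info positions, frozen bits = 0
    c = (v @ polar_transform(n)) % 2            # polar codeword in {0, 1}^n
    x = 1 - 2 * c                               # BPSK: 0 -> +1, 1 -> -1
    sigma = np.sqrt(1.0 / (2 * rate * 10 ** (snr_db / 10)))  # Eb/N0 noise convention (assumed)
    y = x + sigma * np.random.randn(batch_size, n)            # AWGN channel: y = x + z
    return y, u
```

For Polar(32, 16), for instance, `info_positions` would hold the 16 most reliable bit-channel indices, and `generate_batch(4096, 32, info_positions, snr_db=0.0, rate=0.5)` would return one training batch of the size reported in the table.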
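
The experiment-setup row reports a 2-layer GRU with hidden size 512, AdamW at a learning rate of 10⁻³, batches of 4096 or 8192, and a curriculum that trains each subcode for 2000 iterations before the full code. The PyTorch sketch below illustrates such a loop under those settings; the decoder architecture, the `curriculum` ordering of subcodes, and the loss masking are assumptions for illustration and do not reproduce the paper's CRISP implementation.

```python
import torch
import torch.nn as nn

class GRUDecoder(nn.Module):
    """Sequential decoder sketch: 2-layer GRU, hidden size 512, one bit logit per position."""
    def __init__(self, hidden_size=512, num_layers=2):
        super().__init__()
        self.gru = nn.GRU(input_size=1, hidden_size=hidden_size,
                          num_layers=num_layers, batch_first=True)
        self.head = nn.Linear(hidden_size, 1)

    def forward(self, y):                       # y: (batch, n) channel outputs
        out, _ = self.gru(y.unsqueeze(-1))      # (batch, n, hidden)
        return self.head(out).squeeze(-1)       # (batch, n) per-position bit logits

def train_curriculum(model, curriculum, make_batch, iters_per_stage=2000, lr=1e-3):
    """Train on each subcode (a subset of info positions) for a fixed number of iterations."""
    opt = torch.optim.AdamW(model.parameters(), lr=lr)
    bce = nn.BCEWithLogitsLoss()
    for info_positions in curriculum:           # progressively larger subcodes (ordering assumed)
        for _ in range(iters_per_stage):
            y, u = make_batch(info_positions)   # tensors: y (batch, n), u (batch, k) in {0, 1}
            logits = model(y)[:, info_positions]
            loss = bce(logits, u.float())
            opt.zero_grad()
            loss.backward()
            opt.step()
```

The final stage reported in the paper, training the full code until convergence with a decaying learning rate, is not shown; a standard `torch.optim.lr_scheduler` (e.g., `StepLR`) could supply the decay.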