W-CTC: a Connectionist Temporal Classification Loss with Wild Cards
Authors: Xingyu Cai, Jiahong Yuan, Yuchen Bian, Guangxu Xun, Jiaji Huang, Kenneth Church
ICLR 2022 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Evaluations on a number of tasks in the speech and vision domains show that the proposed W-CTC consistently outperforms the standard CTC by a large margin when labels are incomplete. The effectiveness of the proposed method is further confirmed in an ablation study. |
| Researcher Affiliation | Industry | Xingyu Cai, Jiahong Yuan, Yuchen Bian, Guangxu Xun, Jiaji Huang, Kenneth Church; Baidu Research, 1195 Bordeaux Dr, Sunnyvale, CA 94089, USA; xingyucai@baidu.com |
| Pseudocode | No | The paper describes the algorithm steps in text and equations but does not include a formally labeled pseudocode or algorithm block. |
| Open Source Code | Yes | All the code can be found at https://github.com/TideDancer/iclr22-wctc. |
| Open Datasets | Yes | We use the TIMIT (Garofolo, 1993) dataset in this experiment... Two standard collections were used for training: (a) MJSynth (MJ, 9 million images) (Jaderberg et al., 2014) and (b) SynthText (ST, 800k images) (Gupta et al., 2016)... The dataset is PHOENIX14T (Camgoz et al., 2018) |
| Dataset Splits | No | The paper mentions training and test sets but does not explicitly detail validation dataset splits or how they were derived, beyond general statements about evaluation. |
| Hardware Specification | No | The paper does not provide specific hardware details (e.g., exact GPU/CPU models, memory) used for running its experiments, only general statements about backbone models. |
| Software Dependencies | No | The paper lists external code repositories and references for its implementations (e.g., huggingface/transformers, Media-Smart/vedastr, neccam/slt), but does not provide specific version numbers for the underlying software libraries or programming languages (e.g., Python, PyTorch). |
| Experiment Setup | Yes | Table 3 lists the key training hyper-parameters (batch size, optimizer, learning rate, steps). ASR: batch size 32, AdamW, LR 1e-4, 7k steps (50 epochs); PR: batch size 32, AdamW, LR 1e-4, 7k steps (50 epochs); OCR: batch size 500, AdaDelta, LR 1, 150k steps; CSLR: batch size 32, Adam, LR 1e-3, stop if no improvement for 800 steps. A configuration sketch follows the table. |
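
The hyper-parameters quoted from Table 3 can be gathered into a small configuration sketch for reference. This is a minimal illustration, not the paper's training script: only the batch sizes, optimizer choices, learning rates, and stopping rules come from the table; the `build_optimizer` helper, the `TRAIN_CONFIG` name, and the toy model are assumptions made here for demonstration.

```python
# Hypothetical sketch of the Table 3 training hyper-parameters.
# Only batch size, optimizer, learning rate, and stopping rule are taken from the paper;
# everything else (build_optimizer, TRAIN_CONFIG, the toy model) is illustrative.
import torch

TRAIN_CONFIG = {
    "ASR":  {"batch_size": 32,  "optimizer": torch.optim.AdamW,    "lr": 1e-4, "steps": "7k (50 epochs)"},
    "PR":   {"batch_size": 32,  "optimizer": torch.optim.AdamW,    "lr": 1e-4, "steps": "7k (50 epochs)"},
    "OCR":  {"batch_size": 500, "optimizer": torch.optim.Adadelta, "lr": 1.0,  "steps": "150k"},
    "CSLR": {"batch_size": 32,  "optimizer": torch.optim.Adam,     "lr": 1e-3,
             "steps": "stop if no improvement for 800 steps"},
}

def build_optimizer(task: str, model: torch.nn.Module) -> torch.optim.Optimizer:
    """Instantiate the optimizer for one of the four tasks using the Table 3 settings."""
    cfg = TRAIN_CONFIG[task]
    return cfg["optimizer"](model.parameters(), lr=cfg["lr"])

if __name__ == "__main__":
    # Toy model standing in for the real task backbone (wav2vec 2.0, vedastr, slt, etc.).
    toy_model = torch.nn.Linear(80, 40)
    opt = build_optimizer("ASR", toy_model)
    print(type(opt).__name__, opt.defaults["lr"])  # AdamW 0.0001
```
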