Discovery of Natural Language Concepts in Individual Units of CNNs
Authors: Seil Na, Yo Joong Choe, Dong-Hyun Lee, Gunhee Kim
ICLR 2019
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We conduct analyses with different architectures on multiple datasets for classification and translation tasks and provide new insights into how deep models understand natural language. We use VDCNN (Conneau et al., 2017) for sentiment and topic classification tasks on Yelp Reviews, AG News (Zhang et al., 2015), and DBpedia ontology dataset (Lehmann et al., 2015) and ByteNet (Kalchbrenner et al., 2016) for translation tasks on Europarl (Koehn, 2005) and News Commentary (Tiedemann, 2012) datasets. |
| Researcher Affiliation | Collaboration | Seoul National University, Kakao, Kakao Brain |
| Pseudocode | No | The paper describes its concept alignment method in Section 3.2 with text and an equation, but it does not include a formal pseudocode block or algorithm figure. (A hedged sketch of the alignment scoring appears after this table.) |
| Open Source Code | Yes | https://github.com/seilna/CNN-Units-in-NLP |
| Open Datasets | Yes | We use VDCNN (Conneau et al., 2017) for sentiment and topic classification tasks on Yelp Reviews, AG News (Zhang et al., 2015), and DBpedia ontology dataset (Lehmann et al., 2015) and ByteNet (Kalchbrenner et al., 2016) for translation tasks on Europarl (Koehn, 2005) and News Commentary (Tiedemann, 2012) datasets. |
| Dataset Splits | Yes | We record their BLEU scores on the validation data as shown in Figure 6. The parameters were optimized with Adam (Kingma & Ba, 2015) for 5 epochs, and early stopping was actively used for finding parameters that generalize well. |
| Hardware Specification | No | The paper does not provide specific details about the hardware used, such as GPU/CPU models, memory specifications, or cloud instance types. It only mentions using TensorFlow, a software framework. |
| Software Dependencies | No | The paper mentions using "TensorFlow" and the "Adam" optimizer, but it does not specify any version numbers for these or other software components. For example, "Our code is based on a TensorFlow..." and "The parameters were optimized with Adam..." |
| Experiment Setup | Yes | The paper reports two training configurations: "We set the batch size to 8 and the learning rate to 0.001. The parameters were optimized with Adam (Kingma & Ba, 2015) for 5 epochs, and early stopping was actively used for finding parameters that generalize well." and "We set the batch size to 64 and the learning rate to 0.01. The parameters are optimized using SGD optimizer for 50 epochs, and early stopping is actively used." (A hedged configuration sketch follows this table.) |
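
To make the concept-alignment row concrete: Section 3.2 of the paper scores how strongly an individual unit aligns with candidate concepts (words, morphemes, phrases) drawn from the unit's top-activated sentences. The sketch below is a minimal illustration of that idea, not the paper's implementation; the function names, the mean-pooling of per-position activations, and the substring-based concept matching are all assumptions.

```python
import numpy as np

def sentence_score(unit_acts):
    """Aggregate a unit's per-position activations over one sentence.
    Mean pooling is an assumption; the paper may aggregate differently."""
    return float(np.mean(unit_acts))

def top_k_sentences(all_unit_acts, k=10):
    """Indices of the k sentences that most strongly activate the unit."""
    scores = np.array([sentence_score(a) for a in all_unit_acts])
    return np.argsort(scores)[::-1][:k]

def degree_of_alignment(concept, sentences, all_unit_acts):
    """Hypothetical alignment score: the unit's mean activation over
    sentences containing the candidate concept (a word or n-gram)."""
    hits = [i for i, s in enumerate(sentences) if concept in s]
    if not hits:
        return 0.0
    return float(np.mean([sentence_score(all_unit_acts[i]) for i in hits]))

# Usage: given sentences and all_unit_acts[i] = the unit's activations over
# sentence i, rank candidate concepts by degree_of_alignment and keep the top few.
```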
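And to ground the experiment-setup row: the paper quotes two optimizer configurations (Adam with batch size 8 and learning rate 0.001 for 5 epochs; SGD with batch size 64 and learning rate 0.01 for 50 epochs), both with early stopping. A minimal Keras-style sketch of those settings follows; the framework objects and the early-stopping criterion (validation loss) are assumptions, since the paper only names TensorFlow and the optimizers.

```python
import tensorflow as tf

# Setup A as quoted: batch size 8, Adam, lr 0.001, 5 epochs, early stopping.
adam = tf.keras.optimizers.Adam(learning_rate=1e-3)

# Setup B as quoted: batch size 64, SGD, lr 0.01, 50 epochs, early stopping.
sgd = tf.keras.optimizers.SGD(learning_rate=1e-2)

# Early stopping; monitoring validation loss is an assumption.
early_stop = tf.keras.callbacks.EarlyStopping(
    monitor="val_loss", restore_best_weights=True)

# model.compile(optimizer=adam, loss=...)
# model.fit(train_ds.batch(8), validation_data=val_ds.batch(8),
#           epochs=5, callbacks=[early_stop])
```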