Generate Your Counterfactuals: Towards Controlled Counterfactual Generation for Text
Authors: Nishtha Madaan, Inkit Padhi, Naveen Panwar, Diptikalyan Saha
AAAI 2021, pp. 13516-13524
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Our experimental results on various domains show that GYC generates counterfactual text samples exhibiting the above four properties. |
| Researcher Affiliation | Industry | Nishtha Madaan, Inkit Padhi, Naveen Panwar, Diptikalyan Saha 1IBM Research AI {nishthamadaan, naveen.panwar, diptsaha}@in.ibm.com, inkpad@ibm.com |
| Pseudocode | No | No pseudocode or algorithm blocks were found in the paper. |
| Open Source Code | No | The paper does not provide an explicit statement or link for open-source code for the methodology described. |
| Open Datasets | Yes | We use the publicly available DBpedia dataset by (Zhang, Zhao, and Le Cun 2015) containing 14 classes; Ag News, a dataset that focuses on real-world data...; and Yelp, a dataset that focuses on informal text containing reviews. The original YELP Polarity dataset has been filtered in (Shen et al. 2017)... |
| Dataset Splits | No | The paper specifies 'training samples' and 'test samples' for DBpedia and Ag News (e.g., 'DBpedia...containing 560K training samples and 70K test samples'), but does not explicitly mention a validation split. |
| Hardware Specification | No | The paper does not provide specific hardware details (like CPU/GPU models, memory, or cloud instance types) used for running its experiments. |
| Software Dependencies | No | The paper mentions various models (e.g., 'pre-trained GPT-2 decoder', 'BERT-NER based model', 'XL-Net', 'RoBERTa') but does not specify the version numbers for general software dependencies or libraries used for implementation. |
| Experiment Setup | Yes | The combined objective which we maximize can be stated as follows: L = λ_r·L_r + λ_H·L_H + λ_p·L_p (Eq. 18), where λ_r, λ_H, and λ_p are hyper-parameters that can be tuned. To facilitate training we perform annealing on these hyper-parameters. In early training, we keep λ_r, λ_H = 0. After reconstruction completes, λ_p is lowered and λ_r and λ_H are set to non-zero. See more details in Appendix A in supplementary document. |
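The annealed combined objective quoted above can be illustrated with a minimal sketch. The function name `combined_loss`, the warm-up length, and the post-warm-up weight values are assumptions made for illustration; the paper defers its exact settings to Appendix A of the supplementary document.

```python
# Minimal sketch of the annealed combined objective from Eq. 18:
#   L = λ_r·L_r + λ_H·L_H + λ_p·L_p
# Warm-up length and post-warm-up weights are illustrative assumptions,
# not the authors' released settings.

def combined_loss(L_r: float, L_H: float, L_p: float,
                  step: int, warmup_steps: int = 1000) -> float:
    """Combine the three loss terms with the described annealing schedule."""
    if step < warmup_steps:
        # Early training: λ_r = λ_H = 0, leaving only the λ_p-weighted term
        # active (the reconstruction phase described in the paper).
        lambda_r, lambda_H, lambda_p = 0.0, 0.0, 1.0
    else:
        # After reconstruction completes: λ_p is lowered and λ_r, λ_H are
        # set to non-zero values (the specific values here are assumed).
        lambda_r, lambda_H, lambda_p = 1.0, 0.1, 0.5
    return lambda_r * L_r + lambda_H * L_H + lambda_p * L_p
```

The sketch switches the schedule at a fixed step count; the paper only states that the switch happens "after reconstruction completes", so a convergence-based criterion would be an equally plausible reading.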