Generate Your Counterfactuals: Towards Controlled Counterfactual Generation for Text

Authors: Nishtha Madaan, Inkit Padhi, Naveen Panwar, Diptikalyan Saha

AAAI 2021, pp. 13516-13524

Reproducibility

Variable | Result | LLM Response
Research Type | Experimental | "Our experimental results on various domains show that GYC generates counterfactual text samples exhibiting the above four properties."
Researcher Affiliation | Industry | Nishtha Madaan, Inkit Padhi, Naveen Panwar, Diptikalyan Saha (IBM Research AI); {nishthamadaan, naveen.panwar, diptsaha}@in.ibm.com, inkpad@ibm.com
Pseudocode | No | No pseudocode or algorithm blocks were found in the paper.
Open Source Code | No | The paper provides no explicit statement of, or link to, open-source code for the described methodology.
Open Datasets | Yes | "We use publicly available DBpedia dataset by (Zhang, Zhao, and Le Cun 2015) containing 14 classes." The paper also uses AG News ("This dataset focuses on real-world data...") and Yelp ("This dataset focuses on informal text containing reviews. The original YELP Polarity dataset has been filtered in (Shen et al. 2017)..."). See the dataset-loading sketch after the table.
Dataset Splits | No | The paper specifies training and test samples for DBpedia and AG News (e.g., "DBpedia... containing 560K training samples and 70K test samples"), but does not explicitly mention a validation split.
Hardware Specification | No | The paper does not provide specific hardware details (CPU/GPU models, memory, or cloud instance types) used to run its experiments.
Software Dependencies | No | The paper mentions various models (e.g., "pre-trained GPT-2 decoder", "BERT-NER based model", "XL-Net", "RoBERTa") but does not specify version numbers for the software dependencies or libraries used in the implementation.
Experiment Setup | Yes | "The combined objective which we maximize can be stated as follows: L = λ_r L_r + λ_H L_H + λ_p L_p (Eq. 18), where λ_r, λ_H, and λ_p are hyper-parameters that can be tuned. To facilitate training we perform annealing on these hyper-parameters. In early training, we keep λ_r, λ_H = 0. After reconstruction completes, λ_p is lowered and λ_r and λ_H are set to non-zero. See more details in Appendix A in the supplementary document." A hedged sketch of this schedule follows the table.
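
The quoted annealing schedule maps directly onto a small scheduling function. The sketch below is an illustration, not the authors' released code: the loss-term variable names, the phase boundary recon_steps, and the non-zero weight values are assumptions; only the shape of the schedule (λ_r = λ_H = 0 during early training, λ_p lowered once reconstruction completes) comes from the paper.

    # Sketch of the combined objective L = λ_r*L_r + λ_H*L_H + λ_p*L_p (Eq. 18)
    # under the two-phase annealing schedule described in the paper. Phase
    # length and weight magnitudes are illustrative assumptions.

    def annealed_weights(step, recon_steps=1000):
        """Return (lam_r, lam_h, lam_p) for the current training step."""
        if step < recon_steps:
            # Early training: lambda_r = lambda_H = 0, so only the
            # lambda_p-weighted term drives training (reconstruction phase).
            return 0.0, 0.0, 1.0
        # After reconstruction completes: lambda_p is lowered and
        # lambda_r, lambda_H are set to non-zero values.
        return 1.0, 0.1, 0.1

    def combined_objective(loss_r, loss_h, loss_p, step):
        """Weighted sum that the paper maximizes; with a gradient-based
        optimizer one would minimize its negative."""
        lam_r, lam_h, lam_p = annealed_weights(step)
        return lam_r * loss_r + lam_h * loss_h + lam_p * loss_p

Re-evaluating annealed_weights at every step reproduces the two-phase behaviour quoted above without any manual intervention at the phase boundary.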
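
For completeness, the three corpora the paper uses are publicly available; below is a minimal loading sketch assuming the Hugging Face datasets library. The Hub identifiers dbpedia_14, ag_news, and yelp_polarity are assumptions about where equivalent copies live (the Hub's Yelp Polarity is the original, unfiltered release, not the Shen et al. 2017 filtered version the paper uses), and the 90/10 validation carve-out is illustrative, since the paper reports no validation split.

    # Minimal sketch: loading public equivalents of the paper's datasets
    # with the Hugging Face `datasets` library (pip install datasets).
    # The Hub IDs below are assumptions; they are not named in the paper.
    from datasets import load_dataset

    dbpedia = load_dataset("dbpedia_14")    # 14 classes; 560K train / 70K test
    ag_news = load_dataset("ag_news")       # real-world news text
    yelp = load_dataset("yelp_polarity")    # unfiltered original, not the
                                            # Shen et al. (2017) filtered set

    # The paper reports no validation split; an illustrative 90/10 carve-out
    # from the training set could be made like this:
    splits = dbpedia["train"].train_test_split(test_size=0.1, seed=42)
    train_set, valid_set = splits["train"], splits["test"]

    print(dbpedia["train"].num_rows, dbpedia["test"].num_rows)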