A Deep Generative Model for Code Switched Text

Authors: Bidisha Samanta, Sharmila Reddy Nangi, Hussain Jagirdar, Niloy Ganguly, Soumen Chakrabarti

IJCAI 2019 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Extensive experiments show that using synthetic code-switched text with natural monolingual data results in a significant (33.06%) drop in perplexity. ... 4 Experimental Setup ... 5 Results and Analysis
Researcher Affiliation | Academia | (1) Indian Institute of Technology, Kharagpur; (2) Indian Institute of Technology, Bombay. {bidisha, sharmilanangi, hussainjagirdar.hj}@iitkgp.ac.in, niloy@cse.iitkgp.ac.in, soumen@cse.iitb.ac.in
Pseudocode | No | The paper does not contain any pseudocode or clearly labeled algorithm blocks.
Open Source Code | Yes | https://github.com/bidishasamantakgp/VACS
Open Datasets | Yes | To train the generative models, we use a subset of the (real) Hindi-English tweets collected by [Patro et al., 2017] and automatically language-tagged by [Rijhwani et al., 2017] with reasonable accuracy.
Dataset Splits | Yes | From this set we sample 6K tweets where code-switching is present, which we collect into folds rCS-train and rCS-valid. ... We sample 7K instances from the original real code-switched pool for validation and 7K for testing.
Hardware Specification | No | The paper does not provide any specific details about the hardware used for running the experiments (e.g., GPU models, CPU types, memory).
Software Dependencies | No | The paper mentions using 'Adam optimiser' but does not specify version numbers for any software libraries, programming languages, or other dependencies.
Experiment Setup | No | The paper mentions the use of the 'Adam optimiser and KL cost annealing technique [Bowman et al., 2015b]' but does not provide specific hyperparameter values such as learning rates, batch sizes, or number of epochs (a hedged training-loop sketch follows the table).
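
Since the paper names only the Adam optimiser and KL cost annealing [Bowman et al., 2015b] without reporting concrete hyperparameters, the snippet below is a minimal sketch of what such a setup could look like for a text VAE. The learning rate, batch size, annealing horizon, and the assumption that the model returns a reconstruction loss and a KL divergence are illustrative guesses, not the authors' settings.

```python
# Minimal sketch of VAE training with Adam and linear KL cost annealing
# (Bowman et al., 2015b). All hyperparameters are illustrative assumptions,
# not values reported in the paper.
import torch


def kl_weight(step, anneal_steps=10_000):
    """Linearly anneal the KL term's weight from 0 to 1 over `anneal_steps` updates."""
    return min(1.0, step / anneal_steps)


def train(model, data_loader, epochs=10, lr=1e-3, device="cpu"):
    optimizer = torch.optim.Adam(model.parameters(), lr=lr)  # paper: "Adam optimiser"
    step = 0
    for _ in range(epochs):
        for batch in data_loader:
            batch = batch.to(device)
            # Assumption: the model returns the reconstruction loss and the
            # KL divergence between the approximate posterior and the prior.
            recon_loss, kl_div = model(batch)
            loss = recon_loss + kl_weight(step) * kl_div  # annealed ELBO objective
            optimizer.zero_grad()
            loss.backward()
            optimizer.step()
            step += 1
```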