ABSent: Cross-Lingual Sentence Representation Mapping with Bidirectional GANs

Authors: Zuohui Fu, Yikun Xian, Shijie Geng, Yingqiang Ge, Yuting Wang, Xin Dong, Guang Wang, Gerard de Melo

AAAI 2020, pp. 7756-7763

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | The experiments show that our method outperforms several technically more powerful approaches, especially under challenging low-resource circumstances. The source code is available from https://github.com/zuohuif/ABSent along with relevant datasets. In this section, we extensively evaluate the effectiveness of our ABSent method compared with state-of-the-art approaches on two heterogeneous real-world corpora.
Researcher Affiliation | Academia | Department of Computer Science, Rutgers University, New Brunswick, NJ, USA; {zuohui.fu, sg1309, yingqiang.ge, xd48}@rutgers.edu, siriusxyk@gmail.com, {yw632, gw255}@cs.rutgers.edu, gdm@demelo.org
Pseudocode | No | The paper provides mathematical formulations and a diagram of the framework but no structured pseudocode or algorithm blocks.
Open Source Code | Yes | The source code is available from https://github.com/zuohuif/ABSent along with relevant datasets.
Open Datasets | Yes | We evaluate the precision of our approach on the Europarl parallel corpus and on data extracted from the Tatoeba service (http://tatoeba.org), which provides translations of commonly used phrases that might be useful to language learners.
Dataset Splits | No | The paper specifies training and test set sizes, e.g., "160k pairs as the training set and 1,600 pairs as the test set", but does not explicitly mention or detail a validation set split.
Hardware Specification | No | The paper does not provide specific hardware details (exact GPU/CPU models, processor speeds, memory amounts, or other machine specifications) used for running its experiments.
Software Dependencies | No | The paper mentions optimizers (Adam) and word vector tools (fastText) but does not provide specific version numbers for any software dependencies such as programming languages, frameworks, or libraries.
Experiment Setup | Yes | Both generators GX and GY consist of three fully connected layers with hidden sizes of 512, 1024, and 512, respectively. Each hidden layer is followed by a BatchNorm layer and the ReLU activation function; the final activation function is tanh. Both discriminators Dreal and Ddom take as input the concatenation of two embeddings, followed by three fully connected layers of sizes 512, 1024, and 512. Each hidden layer uses a leaky ReLU activation (negative slope 0.2), while the output is activated by a sigmoid function. We rely on Adam optimization with an initial learning rate of 0.002 and a batch size of 128.
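
To make the quoted setup concrete, here is a minimal PyTorch sketch of the described generators and discriminators. The layer sizes, activations, learning rate, and batch size follow the paper's description; the class names, the 512-dimensional input embedding size (EMB_DIM), the final scalar projection inside the discriminators, and the use of separate Adam optimizers for the generator and discriminator sides are assumptions not stated in the quote.

```python
# Minimal sketch of the ABSent experiment setup as quoted above.
# Assumptions (not from the paper): EMB_DIM = 512 input embeddings,
# a final scalar projection in the discriminators, and separate
# optimizers for generators and discriminators.
import torch
import torch.nn as nn

EMB_DIM = 512  # assumed sentence-embedding dimensionality


class Generator(nn.Module):
    """Three FC layers (512, 1024, 512); each hidden layer is followed by
    BatchNorm and ReLU, and the output activation is tanh (per the paper)."""

    def __init__(self, dim: int = EMB_DIM):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(dim, 512), nn.BatchNorm1d(512), nn.ReLU(),
            nn.Linear(512, 1024), nn.BatchNorm1d(1024), nn.ReLU(),
            nn.Linear(1024, 512), nn.Tanh(),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.net(x)


class Discriminator(nn.Module):
    """Concatenates two embeddings, then three FC layers (512, 1024, 512)
    with leaky ReLU (slope 0.2) and a sigmoid output. The final projection
    to a scalar score is an assumption."""

    def __init__(self, dim: int = EMB_DIM):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(2 * dim, 512), nn.LeakyReLU(0.2),
            nn.Linear(512, 1024), nn.LeakyReLU(0.2),
            nn.Linear(1024, 512), nn.LeakyReLU(0.2),
            nn.Linear(512, 1), nn.Sigmoid(),  # assumed scalar head
        )

    def forward(self, a: torch.Tensor, b: torch.Tensor) -> torch.Tensor:
        return self.net(torch.cat([a, b], dim=-1))


# Optimizer settings quoted in the paper: Adam, lr 0.002, batch size 128.
BATCH_SIZE = 128
G_x, G_y = Generator(), Generator()
D_real, D_dom = Discriminator(), Discriminator()
opt_G = torch.optim.Adam(
    list(G_x.parameters()) + list(G_y.parameters()), lr=0.002)
opt_D = torch.optim.Adam(
    list(D_real.parameters()) + list(D_dom.parameters()), lr=0.002)

# Example shapes: a batch of source-language sentence embeddings.
x = torch.randn(BATCH_SIZE, EMB_DIM)
mapped = G_x(x)           # (128, 512), values in (-1, 1) from tanh
score = D_dom(mapped, x)  # (128, 1) probabilities
```

Splitting the parameters across two optimizers mirrors the standard GAN practice of alternating generator and discriminator updates; the paper's setup description does not specify this detail.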