Semi-Supervised Matrix Completion for Cross-Lingual Text Classification

Authors: Min Xiao, Yuhong Guo

AAAI 2014 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | To evaluate the proposed learning technique, we conduct extensive experiments on eighteen cross-language sentiment classification tasks with four different languages. The empirical results demonstrate the efficacy of the proposed approach and show that it outperforms a number of related cross-lingual learning methods.
Researcher Affiliation | Academia | Min Xiao and Yuhong Guo, Department of Computer and Information Sciences, Temple University, Philadelphia, PA 19122, USA; {minxiao, yuhong}@temple.edu
Pseudocode | Yes | Algorithm 1 (a runnable sketch of this loop follows the table):
Input: M^0, γ > 0, β ≥ 1, step size τ with 0 < τ < min(2, 2/β), µ.
Initialize M as the nonnegative projection of the rank-1 approximation of M^0; initialize z as zeros.
while not converged do
1. Gradient descent: [M, z] = [M, z] − τ∇g(M, z).
2. Shrinkage operation: [M, z] = S_{τγ}([M, z]).
3. Project M onto the feasible set: M = max(M, 0).
end while
Open Source Code | No | The paper does not provide access to source code for the described methodology.
Open Datasets | No | We used the multilingual Amazon product review dataset in our experiments for cross-lingual sentiment classification, which contains reviews in three different categories (Books (B), DVD (D), and Music (M)), written in four different languages (English (E), French (F), German (G), and Japanese (J)). The paper does not provide a specific link, DOI, or a formal author/year citation for accessing this dataset. (The resulting task enumeration is sketched after the table.)
Dataset Splits | Yes | For each of the eighteen cross-language sentiment classification tasks, in addition to the 2000 unlabeled parallel reviews, which we used only for representation learning, we used all documents in the source language as labeled data (4000 English reviews or 2000 non-English reviews), randomly chose 100 reviews in the target language as labeled data, and kept the remaining reviews in the target language as unlabeled data. We conducted parameter selection based on three runs over the first task, EFB, with different random selections of the 100 labeled training reviews in the target language. (A sketch of one such split follows the table.)
Hardware Specification | No | No specific hardware details (such as CPU/GPU models or machine configurations) used to run the experiments are mentioned in the paper.
Software Dependencies | No | We used the LIBSVM package (Chang and Lin 2011) with linear kernels and the default parameter setting. However, no specific version number for LIBSVM or for any other software dependency is provided. (A minimal linear-kernel usage sketch follows the table.)
Experiment Setup | Yes | For SSMC, we chose γ from {0.01, 0.1, 1, 10, 100}, β from {1, 2, 5, 10, 100}, µ from {10^-6, 10^-5, 10^-4, 10^-3, 10^-2, 10^-1}, and the reduced dimension size k from {20, 50, 100, 200, 500}. This leads to the following setting: γ = 10, β = 1, µ = 10^-4, k = 50. We used τ = 1. For TSL, we set µ = 10^-6, τ = 1, and chose γ from {0.01, 0.1, 1, 10, 100}, λ from {10^-5, 10^-4, 10^-3, 10^-2, 10^-1, 1}, and the reduced dimension size k from {20, 50, 100, 200, 500}. This leads to the setting γ = 0.1, λ = 10^-4, and k = 50. (The SSMC grid is enumerated in a sketch after the table.)
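
The pseudocode above is a proximal-gradient (fixed-point continuation) loop: a gradient step on the smooth loss g, a shrinkage step for the nuclear-norm regularizer, and a projection onto the nonnegative orthant. Below is a minimal numpy sketch under stated assumptions: grad_g stands in for the gradient of the paper's smooth loss, the shrinkage soft-thresholds the singular values of M, and z is treated as an unshrunk bias vector; none of these names come from the authors' code.

    import numpy as np

    def shrink(M, nu):
        # Singular-value soft-thresholding: the proximal operator of nu * ||M||_*.
        U, s, Vt = np.linalg.svd(M, full_matrices=False)
        return U @ np.diag(np.maximum(s - nu, 0.0)) @ Vt

    def ssmc(M0, grad_g, gamma, tau, max_iter=500, tol=1e-6):
        # Initialize M as the nonnegative projection of the rank-1
        # approximation of M0; initialize z as zeros.
        U, s, Vt = np.linalg.svd(M0, full_matrices=False)
        M = np.maximum(s[0] * np.outer(U[:, 0], Vt[0]), 0.0)
        z = np.zeros(M0.shape[1])  # assumed: one bias entry per column
        for _ in range(max_iter):
            # 1. Gradient descent on the smooth part g (grad_g is assumed to
            #    return the gradients with respect to M and z).
            gM, gz = grad_g(M, z)
            M_new, z_new = M - tau * gM, z - tau * gz
            # 2. Shrinkage (nuclear-norm proximal step) on M.
            M_new = shrink(M_new, tau * gamma)
            # 3. Project M onto the feasible (nonnegative) set.
            M_new = np.maximum(M_new, 0.0)
            done = np.linalg.norm(M_new - M) < tol * max(1.0, np.linalg.norm(M))
            M, z = M_new, z_new
            if done:
                break
        return M, z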
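
The eighteen tasks arise from crossing the three categories with language pairs. Assuming, as the task name EFB (English source, French target, Books) suggests, that each task pairs English with one of the other three languages in either direction, the tasks can be enumerated as follows; this is a hypothetical sketch, not an enumeration given in the paper.

    # Hypothetical enumeration of the 18 cross-language tasks, assuming each
    # task pairs English (E) with one of F, G, J in either direction, per the
    # "EFB" (source, target, category) naming used in the paper.
    categories = ["B", "D", "M"]   # Books, DVD, Music
    others = ["F", "G", "J"]       # French, German, Japanese
    pairs = [("E", t) for t in others] + [(s, "E") for s in others]
    tasks = [src + tgt + cat for (src, tgt) in pairs for cat in categories]
    print(len(tasks), tasks[0])    # 18 EFB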
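
A sketch of one random split for a single task, following the quoted protocol; the array names and the seed are illustrative.

    import numpy as np

    rng = np.random.default_rng(seed=0)  # one of the three random runs

    n_target = 2000                      # reviews in the target language
    perm = rng.permutation(n_target)
    labeled_target = perm[:100]          # 100 randomly chosen labeled reviews
    unlabeled_target = perm[100:]        # the remaining reviews stay unlabeled
    # All source-language documents (4000 English or 2000 non-English reviews)
    # serve as labeled data; the 2000 unlabeled parallel reviews are used only
    # for representation learning.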
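
For reference, LIBSVM's Python bindings (distributed today as the libsvm-official package; the paper predates this packaging) select a linear kernel with '-t 0' while leaving every other option at its default, matching the quoted setting. The toy labels and features below are illustrative only.

    # pip install libsvm-official
    from libsvm.svmutil import svm_train, svm_predict

    # Toy data: y holds +1/-1 sentiment labels, X holds sparse feature dicts.
    y = [1, -1, 1, -1]
    X = [{1: 0.3, 4: 1.0}, {2: 0.7}, {1: 0.1, 3: 0.5}, {2: 0.9, 4: 0.2}]

    # '-t 0' selects the linear kernel; all other LIBSVM parameters keep
    # their defaults, as in the paper.
    model = svm_train(y, X, "-t 0")
    predicted_labels, accuracy, values = svm_predict(y, X, model)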
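
The quoted SSMC grids amount to 750 candidate configurations; a small sketch of the enumeration (the evaluation of each setting is left as a placeholder).

    from itertools import product

    # SSMC candidate grids as quoted; the step size tau is fixed at 1.
    grid = {
        "gamma": [0.01, 0.1, 1, 10, 100],
        "beta":  [1, 2, 5, 10, 100],
        "mu":    [1e-6, 1e-5, 1e-4, 1e-3, 1e-2, 1e-1],
        "k":     [20, 50, 100, 200, 500],
    }
    settings = [dict(zip(grid, values)) for values in product(*grid.values())]
    print(len(settings))  # 750 configurations; the paper selects
                          # gamma=10, beta=1, mu=1e-4, k=50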