Mixture of Expert/Imitator Networks: Scalable Semi-Supervised Learning Framework
Authors: Shun Kiyono, Jun Suzuki, Kentaro Inui (pp. 4073-4081)
AAAI 2019
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Our experiments demonstrate that the proposed method consistently improves the performance of several types of baseline DNNs. |
| Researcher Affiliation | Academia | Shun Kiyono,¹ Jun Suzuki,¹² Kentaro Inui¹² (¹Tohoku University, ²RIKEN Center for Advanced Intelligence Project) |
| Pseudocode | Yes | Algorithm 1: Training framework of MEIN. Data: labeled data D_s and unlabeled data D_u. Result: trained parameters Θ̂, Φ̂, Λ̂. Step 1: Θ* ← argmin_Θ L_s(Θ \| D_s) (train EXN, Eq. 3). Step 2: Φ̂ ← argmin_Φ L_u(Φ \| Θ*, D_u) (train IMN(s), Eq. 11). Step 3: Θ̂, Λ̂ ← argmin_{Θ,Λ} L′_s(Θ, Λ \| Φ̂, D_s) (fine-tune EXN, Eq. 13). A hedged sketch of this procedure follows the table. |
| Open Source Code | No | The paper mentions using a third-party tool, SentencePiece, and provides its GitHub link, but it does not state that the authors' own source code for the proposed method is available. |
| Open Datasets | Yes | For SEC, we selected the following widely used benchmark datasets: IMDB (Maas et al. 2011), Elec (Johnson and Zhang 2015), and Rotten Tomatoes (Rotten) (Pang and Lee 2005). For the Rotten dataset, we used the Amazon Reviews dataset (McAuley and Leskovec 2013) as unlabeled data, following previous studies (Dai and Le 2015; Miyato, Dai, and Goodfellow 2017; Sato et al. 2018). For CAC, we used the RCV1 dataset (Lewis et al. 2004). |
| Dataset Splits | Yes | Table 1: Summary of datasets; each value is the number of instances per split, given as classes / train / dev / test / unlabeled. Elec: 2 / 22,500 / 2,500 / 25,000 / 200,000. IMDB: 2 / 21,246 / 3,754 / 25,000 / 50,000. Rotten: 2 / 8,636 / 960 / 1,066 / 7,911,684. RCV1: 55 / 14,007 / 1,557 / 49,838 / 668,640. |
| Hardware Specification | Yes | We used identical hardware for each measurement, namely, a single NVIDIA Tesla V100 GPU. |
| Software Dependencies | No | The paper mentions a "cuDNN implementation" and SentencePiece but does not provide specific version numbers for these or any other software dependencies required for replication. |
| Experiment Setup | Yes | Table 2 summarizes the hyperparameters and network configurations of our experiments. We carefully selected the settings commonly used in the previous studies (Dai and Le 2015; Miyato, Dai, and Goodfellow 2017; Sato et al. 2018). ... Table 2: Summary of hyperparameters (includes Word Embedding Dim., Embedding Dropout Rate, LSTM Hidden State Dim., MLP Dim., Activation Function, CNN Kernel Dim., Number of IMNs, Optimization Algorithm, Mini-Batch Size, Initial Learning Rate, Fine-tune Learning Rate, Decay Rate, Baseline Max Epoch, Fine-tune Max Epoch) |
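
For reference, the three-stage procedure in Algorithm 1 can be sketched in PyTorch as below. This is a minimal illustration under stated assumptions, not the authors' implementation (the paper releases no code): the model classes, the imitation loss (KL divergence to the EXN's output distribution), and the way IMN outputs are mixed via the learned weights Λ are all assumptions made here for concreteness.

```python
"""Minimal sketch of MEIN's three-stage training (Algorithm 1).

Model architectures, loss details, and the logit-mixing scheme are
illustrative assumptions; the authors do not release code.
"""
import torch
import torch.nn.functional as F


def train(params, loss_fn, loader, epochs=1, lr=1e-3):
    # Generic optimization loop shared by all three stages.
    opt = torch.optim.Adam(params, lr=lr)
    for _ in range(epochs):
        for batch in loader:
            opt.zero_grad()
            loss_fn(batch).backward()
            opt.step()


def train_mein(exn, imns, labeled_loader, unlabeled_loader):
    # Stage 1 (Eq. 3): train the Expert Network (EXN) on labeled data.
    #   Θ* ← argmin_Θ L_s(Θ | D_s)
    train(exn.parameters(),
          lambda b: F.cross_entropy(exn(b["x"]), b["y"]),
          labeled_loader)

    # Stage 2 (Eq. 11): with the EXN fixed, train each Imitator
    # Network (IMN) on unlabeled data to imitate the EXN's
    # predictive distribution.
    #   Φ̂ ← argmin_Φ L_u(Φ | Θ*, D_u)
    for imn in imns:
        def imitation_loss(b):
            with torch.no_grad():
                teacher = F.softmax(exn(b["x"]), dim=-1)
            student = F.log_softmax(imn(b["x"]), dim=-1)
            return F.kl_div(student, teacher, reduction="batchmean")
        train(imn.parameters(), imitation_loss, unlabeled_loader)

    # Stage 3 (Eq. 13): with the IMNs fixed, fine-tune the EXN
    # together with the mixing weights Λ on labeled data.
    #   Θ̂, Λ̂ ← argmin_{Θ,Λ} L′_s(Θ, Λ | Φ̂, D_s)
    lam = torch.nn.Parameter(torch.zeros(len(imns) + 1))
    def mixture_loss(b):
        logits = [exn(b["x"])] + [imn(b["x"]).detach() for imn in imns]
        weights = F.softmax(lam, dim=0)  # assumed mixing scheme
        mixed = sum(w * l for w, l in zip(weights, logits))
        return F.cross_entropy(mixed, b["y"])
    train(list(exn.parameters()) + [lam], mixture_loss, labeled_loader)

    return exn, imns, lam
```

The staging mirrors the algorithm's structure: only Stage 2 touches unlabeled data, and only through the frozen EXN's predictions, which is what makes the framework scalable; the exact form of the combined loss L′_s in Stage 3 should be taken from Equation 13 of the paper rather than from this sketch.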