reproducibilityindex.ai

Do Not Have Enough Data? Deep Learning to the Rescue!

Authors: Ateret Anaby-Tavor, Boaz Carmeli, Esther Goldbraich, Amir Kantor, George Kour, Segev Shlomov, Naama Tepper, Naama Zwerdling
Pages: 7383-7390

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	In a series of experiments, we show that LAMBADA improves classifiers' performance on a variety of datasets.
Researcher Affiliation	Collaboration	(1) IBM Research AI, (2) University of Haifa, Israel, (3) Technion Israel Institute of Technology
Pseudocode	Yes	We define the method in Algorithm 1 and elaborate on its steps in the following section.
Open Source Code	No	The paper does not provide any statement or link indicating that the source code for the described methodology is publicly available.
Open Datasets	Yes	ATIS: flight reservations, 17 classes, 4.2k samples (www.kaggle.com/siddhadev/atis-dataset-from-ms-cntk); TREC: open-domain questions, 50 classes, 6k samples (https://cogcomp.seas.upenn.edu/Data/QA/QC/)
Dataset Splits	Yes	We randomly split each dataset into train, validation, and test sets (80%, 10%, 10%).
Hardware Specification	No	The paper does not specify the exact hardware (e.g., GPU models, CPU types) used for running the experiments.
Software Dependencies	No	The paper mentions software components like BERT, SVM, LSTM, and GloVe but does not provide specific version numbers for these or other software dependencies.
Experiment Setup	No	The paper describes model architectures and general settings (e.g., GloVe 100 dimensions) but lacks specific hyperparameter values (e.g., learning rate, batch size, number of epochs, optimizer settings) or detailed system-level training configurations to reproduce the experiment setup.
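
The Dataset Splits row quotes a random 80%/10%/10% train/validation/test split. A minimal sketch of such a split follows, using only the Python standard library. The function name `split_dataset`, the `seed` parameter, and the placeholder data are assumptions for illustration. The paper excerpt above does not state a random seed or whether the split was stratified by class.

```python
import random


def split_dataset(samples, seed=0, fractions=(0.8, 0.1, 0.1)):
    """Randomly split samples into train, validation, and test lists.

    Hypothetical helper: the seed and the unstratified shuffle are
    assumptions, not details reported in the paper.
    """
    if abs(sum(fractions) - 1.0) > 1e-9:
        raise ValueError("fractions must sum to 1")
    shuffled = list(samples)
    # A dedicated Random instance keeps the shuffle reproducible
    # without touching the global random state.
    random.Random(seed).shuffle(shuffled)
    n_train = int(len(shuffled) * fractions[0])
    n_val = int(len(shuffled) * fractions[1])
    train = shuffled[:n_train]
    val = shuffled[n_train:n_train + n_val]
    # The test set takes the remainder, so no sample is dropped.
    test = shuffled[n_train + n_val:]
    return train, val, test


# Placeholder data: 1000 integer IDs standing in for labeled sentences.
train, val, test = split_dataset(range(1000))
print(len(train), len(val), len(test))
```

Fixing the seed makes the split repeatable across runs, which is one of the details a reader would need in order to reproduce reported numbers exactly.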