Sequential Predictive Two-Sample and Independence Testing

Authors: Aleksandr Podkopaev, Aaditya Ramdas

NeurIPS 2023 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "We empirically demonstrate the superiority of our tests over kernel-based approaches under structured settings. Our tests can be applied beyond the case of independent and identically distributed data, remaining valid and powerful even when the data distribution drifts over time." (Section 3, Experiments)
Researcher Affiliation | Collaboration | Aleksandr Podkopaev (Walmart Global Tech, sasha.podkopaev@walmart.com); Aaditya Ramdas (Carnegie Mellon University, aramdas@cmu.edu)
Pseudocode | Yes | "Algorithm 1: Online Newton step (ONS) strategy for selecting betting fractions" (a hedged ONS sketch appears after this table)
Open Source Code | No | The paper does not contain an explicit statement about the availability of its source code or a link to a code repository for the methodology described.
Open Datasets | Yes | "First, we compare sequential classification-based and kernelized 2STs using the Karolinska Directed Emotional Faces (KDEF) dataset [Lundqvist et al., 1998], which contains images of actors and actresses expressing different emotions: afraid (AF), angry (AN), disgusted (DI), happy (HA), neutral (NE), sad (SA), and surprised (SU). Following earlier works [Lopez-Paz and Oquab, 2017, Jitkrittum et al., 2016], we focus on the straight profile only and assign the HA, NE, SU emotions to the positive class (instances from P) and the AF, AN, DI emotions to the negative class (instances from Q); see Figure 3a." (a label-mapping sketch appears after this table)
Dataset Splits | Yes | "Dropout (p = 0.5) and early stopping (with patience equal to ten epochs and 20% of the data used in the validation set) are used for regularization." (an early-stopping sketch appears after this table)
Hardware Specification | No | The paper does not provide specific details about the hardware used to run the experiments, such as GPU or CPU models or memory specifications.
Software Dependencies | No | The paper mentions software components such as the Adam optimizer and logistic regression, but it does not specify version numbers for these or other key software libraries and dependencies.
Experiment Setup | Yes | "We use a CNN with 4 convolutional layers (kernel size 3 × 3) with 16, 32, 32, and 64 filters, respectively. Each convolutional layer is followed by a max-pooling layer (2 × 2). After flattening, these layers are followed by one fully connected layer with 128 neurons. Dropout (p = 0.5) and early stopping (with patience equal to ten epochs and 20% of the data used in the validation set) are used for regularization. ReLU activation functions are used in each layer. The Adam optimizer is used for training the network. We start training after processing twenty observations and update the model parameters after processing every next ten observations. The maximum number of epochs is set to 25 for each training iteration. The batch size is set to 32." (a CNN sketch appears after this table)
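
Algorithm 1 in the paper selects betting fractions via an online Newton step (ONS). Below is a minimal sketch of an ONS-style ascent on the log-wealth of a betting game, assuming payoffs bounded below by -1, a clipping interval of [0, 0.5], and the step-size constant 2/(2 - ln 3) that is standard in the testing-by-betting literature; the function name, constants, and initialization are assumptions rather than a transcription of Algorithm 1.

```python
import numpy as np

def ons_betting_fractions(payoffs, lam_max=0.5):
    """ONS-style betting-fraction updates (a sketch, not the paper's Algorithm 1).

    `payoffs` holds the per-round payoffs g_t (assumed >= -1) of a betting game
    with wealth K_t = K_{t-1} * (1 + lam_t * g_t). Each round takes a
    Newton-style ascent step on log-wealth with an adaptive step size.
    """
    lam, a, wealth = 0.0, 1.0, 1.0       # initial fraction, curvature proxy, wealth
    lams, wealths = [], []
    for g in payoffs:
        lams.append(lam)
        wealth *= 1.0 + lam * g          # bet the current fraction
        wealths.append(wealth)
        grad = g / (1.0 + lam * g)       # d/d(lam) of log(1 + lam * g)
        a += grad ** 2                   # accumulate squared gradients
        lam = lam + (2.0 / (2.0 - np.log(3.0))) * grad / a
        lam = float(np.clip(lam, 0.0, lam_max))   # keep the bet admissible
    return np.array(lams), np.array(wealths)
```

By Ville's inequality, rejecting the null once the wealth process crosses 1/α yields a sequential test with type I error at most α.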
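
In the KDEF two-sample experiment, the HA, NE, and SU emotions form the positive class (instances from P) and AF, AN, and DI form the negative class (instances from Q). A small illustrative mapping follows; the function name and any surrounding data handling are hypothetical, not taken from the paper.

```python
# HA / NE / SU -> positive class (sample from P); AF / AN / DI -> negative class (sample from Q).
POSITIVE = {"HA", "NE", "SU"}   # happy, neutral, surprised
NEGATIVE = {"AF", "AN", "DI"}   # afraid, angry, disgusted

def emotion_to_label(code: str) -> int:
    """Return 1 for an instance from P, 0 for an instance from Q."""
    if code in POSITIVE:
        return 1
    if code in NEGATIVE:
        return 0
    raise ValueError(f"emotion {code!r} is not used in this two-sample test")
```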
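
The dataset-splits row pins down the regularization: 20% of the data held out for validation and early stopping with a patience of ten epochs. A sketch of how such a loop could be wired up is below; `train_one_epoch` and `validation_loss` are placeholder callables, and the split logic is an assumption.

```python
def fit_with_early_stopping(train_one_epoch, validation_loss, data,
                            val_frac=0.2, patience=10, max_epochs=25):
    """Early stopping on a held-out validation split (a sketch, not the paper's code).

    `val_frac` of `data` is held out for validation; training stops once the
    validation loss has not improved for `patience` consecutive epochs, or
    after `max_epochs` epochs, whichever comes first.
    """
    n_val = int(len(data) * val_frac)
    val_data, train_data = data[:n_val], data[n_val:]
    best_loss, epochs_since_best = float("inf"), 0
    for _ in range(max_epochs):
        train_one_epoch(train_data)
        loss = validation_loss(val_data)
        if loss < best_loss:
            best_loss, epochs_since_best = loss, 0
        else:
            epochs_since_best += 1
            if epochs_since_best >= patience:
                break
    return best_loss
```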
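
The architecture in the experiment-setup row could be written, under a few assumptions, as the following PyTorch sketch: four 3 × 3 convolutional layers with 16, 32, 32, and 64 filters, each followed by ReLU and 2 × 2 max-pooling, then a flatten, a 128-unit fully connected layer, and dropout (p = 0.5). The input channel count, padding, and the single-logit output head are assumptions not stated in the excerpt.

```python
import torch
import torch.nn as nn

class KdefCnn(nn.Module):
    """Sketch of the CNN described in the experiment setup (not the paper's code)."""

    def __init__(self, in_channels: int = 1):
        super().__init__()
        filters = [16, 32, 32, 64]
        layers, prev = [], in_channels
        for f in filters:
            layers += [nn.Conv2d(prev, f, kernel_size=3, padding=1),
                       nn.ReLU(),
                       nn.MaxPool2d(2)]          # 2 x 2 max-pooling after each conv
            prev = f
        self.features = nn.Sequential(*layers)
        self.head = nn.Sequential(
            nn.Flatten(),
            nn.LazyLinear(128),   # infers the flattened size on the first forward pass
            nn.ReLU(),
            nn.Dropout(p=0.5),
            nn.Linear(128, 1),    # assumed binary-classification logit
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.head(self.features(x))
```

Training with the Adam optimizer and batch size 32, starting after twenty observations and refitting after every ten new ones, would live in the surrounding sequential-testing loop and is not shown here.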