Learning from Streaming Data when Users Choose

Authors: Jinyan Su, Sarah Dean

ICML 2024

Reproducibility

Variable | Result | LLM Response
Research Type | Experimental | "We also experimentally demonstrate the utility of our algorithm with real-world data."
Researcher Affiliation | Academia | "1Department of Computer Science, Cornell University. Correspondence to: Jinyan Su <js3673@cornell.edu>, Sarah Dean <sdean@cornell.edu>."
Pseudocode | Yes | "Algorithm 1 Multi-learner Streaming Gradient Descent (MSGD)"
Open Source Code | Yes | "Code for reproducing these results can be found at https://github.com/sdean-group/MSGD."
Open Datasets | Yes | "Our first experimental setting is based on a widely used movie recommendation dataset, Movielens10M (Harper & Konstan, 2015)... Our second setting is based on census data made available by folktables (Ding et al., 2021)."
Dataset Splits | No | The paper mentions holding out data for testing ("0.2 of them for testing") but does not describe a validation set or a full train/validation/test split.
Hardware Specification | No | The paper does not specify the hardware used for the experiments (e.g., GPU/CPU models, memory, or cloud instance types).
Software Dependencies | No | The paper mentions the Python toolkit Surprise and folktables but does not give version numbers for these packages or for Python itself.
Experiment Setup | Yes | "Then the selected service updates their parameter with the gradient of the loss on the user's data with step size ηt = 1/t. At each time step, we sample a user x at random from the data described above. We assign this user to one of k services according to bounded rationality with parameter ζ. For MSGD, we illustrate results for different total numbers of services k = 2, 4, 6. For each k, we compute the average accuracy after T = 2000k total timesteps and plot the average over 3 trials."
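The Experiment Setup row describes MSGD's loop: at each step a user is sampled, assigned to one of k services by a bounded-rational choice rule with parameter ζ, and only the chosen service takes a gradient step with step size ηt = 1/t. The following is a minimal sketch of that loop, assuming a linear model with squared loss and a softmax ("bounded rational") choice rule; the function names, loss, and choice model are illustrative assumptions, not the paper's exact implementation (see the authors' repository for that).

```python
import numpy as np

def bounded_rational_choice(losses, zeta, rng):
    # Softmax over negative losses: lower loss -> higher selection
    # probability; zeta controls how rational the choice is.
    # (Assumed choice model; the paper's definition may differ in detail.)
    probs = np.exp(-zeta * np.asarray(losses))
    probs /= probs.sum()
    return rng.choice(len(losses), p=probs)

def msgd(users, labels, k, zeta, T, dim, rng):
    """Sketch of Multi-learner Streaming Gradient Descent (MSGD).

    Each of k services holds a linear parameter vector. At each step a
    randomly sampled user picks a service bounded-rationally, and only
    that service updates with step size eta_t = 1/t.
    """
    thetas = [np.zeros(dim) for _ in range(k)]
    for t in range(1, T + 1):
        i = rng.integers(len(users))                 # sample a user at random
        x, y = users[i], labels[i]
        # Per-service squared loss on this user (illustrative loss choice).
        losses = [0.5 * (theta @ x - y) ** 2 for theta in thetas]
        j = bounded_rational_choice(losses, zeta, rng)
        grad = (thetas[j] @ x - y) * x               # gradient of squared loss
        thetas[j] -= (1.0 / t) * grad                # step size eta_t = 1/t
    return thetas

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    users = rng.normal(size=(200, 5))
    labels = users @ rng.normal(size=5)
    thetas = msgd(users, labels, k=2, zeta=1.0, T=1000, dim=5, rng=rng)
    print([theta.shape for theta in thetas])
```

In this sketch, only the selected service learns from a user, which is the key coupling between user choice and learning dynamics that the paper studies.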