Infinity Learning: Learning Markov Chains from Aggregate Steady-State Observations
Authors: Jianfei Gao, Mohamed A. Zahran, Amit Sheoran, Sonia Fahmy, Bruno Ribeiro (pp. 3922–3929)
AAAI 2020
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We apply ∞-SGD to a real-world testbed and synthetic experiments showcasing its accuracy, its ability to extrapolate the steady state distribution to unobserved states under unobserved conditions (heavy loads, when training under light loads), and its success in difficult scenarios where even a tailor-made extension of existing methods fails. |
| Researcher Affiliation | Academia | Jianfei Gao, Mohamed A. Zahran, Amit Sheoran, Sonia Fahmy, Bruno Ribeiro; Department of Computer Science, Purdue University, 305 N. University St, West Lafayette, IN 47907; {gao462, mzahran, asheoran}@purdue.edu, {fahmy, ribeiro}@cs.purdue.edu |
| Pseudocode | No | The paper describes mathematical derivations and theoretical concepts, but does not include any pseudocode or algorithm blocks. |
| Open Source Code | No | The paper does not provide an explicit statement or link for the open-sourcing of the code for the described methodology. |
| Open Datasets | No | The training data consists of custom-generated data: 'The training data (86 time windows) is generated under light loads...' for the testbed, and 'Structures used to simulate data are provided in Supplementary Material A3' for synthetic experiments. No concrete access information (link, DOI, or formal citation) to these datasets is provided for public availability. |
| Dataset Splits | No | The paper mentions 'training data' and 'test data' but does not explicitly provide details about a separate validation split or the percentages/counts for the data partitioning. |
| Hardware Specification | No | The paper describes the setup of the testbed system being emulated ('single server with a waiting queue of size K = 20'), but it does not provide specific details about the hardware (e.g., GPU/CPU models, memory) used to run the experiments or simulations. |
| Software Dependencies | No | The paper mentions 'PyTorch's autodiff function' but does not specify a version number for PyTorch or any other software dependencies. |
| Experiment Setup | Yes | In most of our simulations, we set t = 128 = 2^7 and p_events^(0) = 1^T/\|S\| throughout all our experiments. We also tested BPTT without divide-and-conquer but found the optimization unstable due to the long backpropagation paths. We tested t ∈ {16, 128} and found that smaller values of t are easier to optimize but, as expected, generally produce worse approximations of the steady state for heavy loads. In most of our experiments, p = 0.1; that is, on average we consider only the first ten terms in the sum of Equation (10). Contrast ∞-SGD's ten summation terms, with matrix powers that need no chain rule, with the baseline DC-BPTT approach (Definition 2), where P(Q(x, θ))^128 needs to be computed together with a chain rule to compute gradients over the matrix multiplications. It is no surprise that ∞-SGD is a more stable optimization method (no vanishing or exploding gradients); interestingly, ∞-SGD also works well on the tested slow-mixing CTMCs, while baseline methods like DC-BPTT fail in these scenarios (see Supplementary Material C5). |
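The quoted setup describes a randomly truncated infinite sum: with p = 0.1, on average only the first ten terms of an infinite series are evaluated, yet the estimate remains unbiased. Below is a minimal sketch of that general technique (a Russian-roulette / geometric-truncation estimator), not the paper's exact Equation (10); the function names and the geometric-series example are illustrative assumptions.

```python
import numpy as np

def truncated_series_estimate(term, p=0.1, rng=None):
    """Unbiased estimate of sum_{n>=0} term(n) via random truncation.

    Sample N ~ Geometric(p) (number of failures before first success),
    keep terms n = 0..N, and reweight each by 1 / P(N >= n) = (1-p)^-n
    so the estimator's expectation equals the full infinite sum.
    With p = 0.1, E[N + 1] = 10 terms are evaluated on average,
    matching the 'first ten terms on average' in the quoted setup.
    """
    rng = np.random.default_rng() if rng is None else rng
    N = rng.geometric(p) - 1  # NumPy counts trials; convert to failures
    return sum(term(n) / (1.0 - p) ** n for n in range(N + 1))

# Sanity check on a series with a known value: sum_{n>=0} 0.5^n = 2.
rng = np.random.default_rng(0)
estimates = [truncated_series_estimate(lambda n: 0.5 ** n, p=0.1, rng=rng)
             for _ in range(20000)]
print(np.mean(estimates))  # close to 2.0
```

The appeal over the DC-BPTT baseline noted in the row above is that each sampled term is a plain matrix power with no long backpropagation chain, avoiding the vanishing/exploding gradients that arise when differentiating through 128 chained matrix multiplications.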