A Deep Reinforcement Learning Perspective on Internet Congestion Control

Authors: Nathan Jay, Noga Rotman, Brighten Godfrey, Michael Schapira, Aviv Tamar

ICML 2019

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We show that casting congestion control as RL enables training deep network policies that capture intricate patterns in data traffic and network conditions, and leverage this to outperform the state-of-the-art. We present a test suite for RL-guided congestion control based on the OpenAI Gym interface. Our preliminary evaluation results suggest that training Aurora in simple, simulated environments is sufficient to generate congestion control policies that also perform well in very different network domains and are comparable to, or outperform, recent state-of-the-art handcrafted protocols. The evaluation supports two key conclusions: first, Aurora is surprisingly robust to environments outside the scope of its training; second, Aurora is comparable to or outperforms the state-of-the-art in congestion control (BBR, PCC-Vivace, RemyCC, and Copa). (A sketch of the paper's action-to-rate mapping appears after the table.)
Researcher Affiliation | Academia | (1) University of Illinois at Urbana-Champaign, (2) Hebrew University of Jerusalem, (3) Technion. Correspondence to: Nathan Jay <njay2@illinois.edu>, Noga H. Rotman <nogar02@cs.huji.ac.il>.
Pseudocode | No | The paper does not contain structured pseudocode or algorithm blocks; methods are described in prose.
Open Source Code | Yes | Our code is open-sourced as an OpenAI Gym environment and an accompanying testing module, to be used by RL researchers and practitioners to evaluate their algorithms. The code is available at https://github.com/PCCproject/PCC-RL. (See the usage sketch after the table.)
Open Datasets | No | The paper uses a custom simulated environment for training, described in Section 5 and Table 1, in which data is generated dynamically. It does not refer to a pre-existing, static, publicly available dataset with a direct link, DOI, or formal citation.
Dataset Splits | No | The paper does not provide dataset split information (e.g., percentages or sample counts) for training, validation, and testing. It describes training in a simulated environment and evaluating in an emulated one, with no explicit data partitioning.
Hardware Specification | No | The paper mentions using Mininet and Pantheon for emulation, which implies a computing environment, but it does not specify CPU models, GPU models, memory amounts, or cloud instance types used to run the experiments.
Software Dependencies | No | The paper mentions the stable-baselines Python package and the OpenAI Gym environment but does not specify version numbers for these or any other software components (e.g., Python version, or specific libraries such as PyTorch or TensorFlow).
Experiment Setup | Yes | We chose an architecture with two hidden layers of 32 and 16 neurons and tanh nonlinearity. We use α = 0.025. We trained Aurora with a linear reward function that rewards throughput while penalizing loss and latency: 10 * throughput - 1000 * latency - 2000 * loss. We trained models with history lengths k ranging from 1 to 10 MIs (monitor intervals). We examined three different values of γ and determined that γ = 0.99 gave the best results quickly. Table 1 provides the training-framework variable ranges for bandwidth, latency, queue size, and loss rate. (See the training sketch after the table.)
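
For the Research Type row: as we read the paper, the agent emits a scalar action a_t once per monitor interval, and that action adjusts the sending rate multiplicatively, dampened by the factor α = 0.025 quoted in the Experiment Setup row. A minimal Python sketch of that rate-update rule (our reading of the paper, not the authors' code):

    ALPHA = 0.025  # dampening factor reported in the paper

    def next_sending_rate(rate, action, alpha=ALPHA):
        # Positive actions scale the sending rate up; negative actions
        # scale it down symmetrically, so an action and its negation
        # roughly cancel rather than compounding.
        if action >= 0:
            return rate * (1.0 + alpha * action)
        return rate / (1.0 - alpha * action)

The multiply-up/divide-down form means equal and opposite actions return the rate to where it started, which a purely multiplicative rule would not.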
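
For the Open Source Code row: a minimal sketch of driving the released Gym environment with a random policy. The module name (network_sim) and the environment id ('PccNs-v0') follow our reading of the PCC-RL repository and should be treated as assumptions; check the repository's README for the exact names.

    import gym
    import network_sim  # assumption: PCC-RL module that registers the env with Gym

    env = gym.make("PccNs-v0")  # assumption: env id registered by the repo
    obs = env.reset()
    done = False
    while not done:
        action = env.action_space.sample()  # random placeholder policy
        obs, reward, done, info = env.step(action)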
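
For the Experiment Setup row: a sketch wiring the reported hyperparameters (hidden layers of 32 and 16 tanh units, γ = 0.99, and the linear reward above) into stable-baselines PPO, which the paper names as its training framework. The PPO1 variant, the policy kwargs, and the timestep budget are our assumptions; the simulator applies the reward internally, so aurora_reward is shown only for reference.

    import gym
    import network_sim  # assumption: registers 'PccNs-v0' (see the previous sketch)
    import tensorflow as tf
    from stable_baselines import PPO1  # assumption: PPO1 rather than PPO2
    from stable_baselines.common.policies import MlpPolicy

    def aurora_reward(throughput, latency, loss):
        # Linear reward quoted in the Experiment Setup row; for reference only.
        return 10.0 * throughput - 1000.0 * latency - 2000.0 * loss

    env = gym.make("PccNs-v0")
    model = PPO1(
        MlpPolicy,
        env,
        gamma=0.99,  # the discount factor the paper found best
        # Two hidden layers of 32 and 16 units with tanh nonlinearity,
        # matching the architecture reported above.
        policy_kwargs=dict(net_arch=[32, 16], act_fun=tf.nn.tanh),
        verbose=1,
    )
    model.learn(total_timesteps=1_000_000)  # placeholder training budget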