Knowledge-Guided Agent-Tactic-Aware Learning for StarCraft Micromanagement

Authors: Yue Hu, Juntao Li, Xi Li, Gang Pan, Mingliang Xu

IJCAI 2018

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Experimental results demonstrate the effectiveness of the proposed scheme against the state-of-the-art approaches in several benchmark combat scenarios.
Researcher Affiliation | Academia | (1) College of Computer Science and Technology, Zhejiang University, Hangzhou, China; (2) Zhengzhou University, Zhengzhou, China
Pseudocode | Yes | Algorithm 1: Opponent-Guided Tactic Learning
Open Source Code | No | The paper does not provide any explicit statement about releasing source code or a link to a code repository for the described methodology.
Open Datasets | No | The paper states, "we conducted experiments on the Star Craft platform" and "we build several micromanagement scenarios, like m5v5, w15v17 and w18v20." However, it does not provide concrete access information (e.g., a link, DOI, or specific citation for a public dataset) for these scenarios or for the StarCraft environment beyond general references to the game platform.
Dataset Splits | No | The paper describes how training and evaluation are conducted (e.g., "We train 100 models for testing", "evaluate the learned model for 100 episodes"), but it does not specify traditional train/validation/test dataset splits with percentages or sample counts for a predefined dataset. The data is generated through interaction with the environment.
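Because the data is generated by environment interaction rather than drawn from fixed splits, the quoted protocol (100 independently trained models, each evaluated for 100 episodes) is the reproducibility-relevant detail. Below is a minimal Python sketch of that protocol; the two episode/model counts come from the paper, while the rollout function and everything else are illustrative placeholders (the paper specifies no evaluation API).

```python
# Minimal sketch of the evaluation protocol quoted above: train 100 models,
# evaluate each for 100 episodes, and average the per-model win rates.
# run_episode() is a dummy placeholder standing in for a StarCraft rollout
# of a trained policy; it is NOT an API from the paper.
import random

N_MODELS = 100    # "We train 100 models for testing"
N_EPISODES = 100  # "evaluate the learned model for 100 episodes"

def run_episode(rng: random.Random) -> bool:
    """Placeholder combat episode; returns True on a win."""
    return rng.random() < 0.5  # dummy outcome instead of a real rollout

def mean_win_rate() -> float:
    rates = []
    for model_seed in range(N_MODELS):
        rng = random.Random(model_seed)  # stands in for one trained model
        wins = sum(run_episode(rng) for _ in range(N_EPISODES))
        rates.append(wins / N_EPISODES)
    return sum(rates) / len(rates)

if __name__ == "__main__":
    print(f"mean win rate over {N_MODELS} models: {mean_win_rate():.3f}")
```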
Hardware Specification | No | The paper does not provide specific details about the hardware used for the experiments, such as GPU models, CPU types, or memory specifications.
Software Dependencies | No | The paper mentions using "deep reinforcement learning (DRL)", "deep Q-network (DQN)", and "deep neural network (DNN)" with ReLU and Softmax activation functions. However, it does not specify version numbers for any programming languages, libraries (e.g., TensorFlow, PyTorch), or other software components used.
Experiment Setup | Yes | Algorithm 1 explicitly lists hyperparameter values: "Set learning-rate η = 0.001, collection number M = 4 Set train epochs E = 10, minibatch size B = 64 Set discount factor γ = 1, C = 100 Set T1 = 200, T2 = 500000, N = 10000". The paper also details the feature map size ("map the whole game into a map, sized 85x85"), the reward definition ("kill bonus is 100"), and the network architecture (convolutional layers, fully-connected layers, activation functions).
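To make the quoted setup concrete, the sketch below collects the hyperparameter values from Algorithm 1 alongside a Q-network over the 85x85 map. Only the constants, the 85x85 input size, the ReLU/Softmax activations, and the conv + fully-connected structure come from the paper; the layer counts, channel sizes, 9-action output, optimizer choice, and the use of PyTorch are illustrative assumptions.

```python
# Hyperparameter values quoted from Algorithm 1, plus an illustrative Q-network
# over the 85x85 feature map. Layer counts, channel sizes, the 9-action output,
# the Adam optimizer, and PyTorch itself are assumptions, not from the paper.
import torch
import torch.nn as nn

LEARNING_RATE = 1e-3   # η = 0.001
COLLECTION_M = 4       # collection number M = 4
TRAIN_EPOCHS = 10      # E = 10
MINIBATCH_SIZE = 64    # B = 64
DISCOUNT_GAMMA = 1.0   # γ = 1
PARAM_C = 100          # C = 100 (as listed in Algorithm 1)
T1, T2, PARAM_N = 200, 500_000, 10_000  # T1, T2, N as listed in Algorithm 1
KILL_BONUS = 100       # "kill bonus is 100"
MAP_SIZE = 85          # "map the whole game into a map, sized 85x85"

class QNetwork(nn.Module):
    """Conv + fully-connected network with ReLU and a Softmax output,
    following the architecture outline mentioned in the paper."""
    def __init__(self, in_channels: int = 3, n_actions: int = 9):
        super().__init__()
        self.conv = nn.Sequential(
            nn.Conv2d(in_channels, 32, kernel_size=8, stride=4), nn.ReLU(),
            nn.Conv2d(32, 64, kernel_size=4, stride=2), nn.ReLU(),
        )
        with torch.no_grad():
            flat = self.conv(torch.zeros(1, in_channels, MAP_SIZE, MAP_SIZE)).numel()
        self.head = nn.Sequential(
            nn.Linear(flat, 256), nn.ReLU(),
            nn.Linear(256, n_actions), nn.Softmax(dim=-1),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.head(self.conv(x).flatten(start_dim=1))

net = QNetwork()
optimizer = torch.optim.Adam(net.parameters(), lr=LEARNING_RATE)
```

The Softmax output mirrors the paper's mention of a Softmax activation; whether it is applied to Q-values or to a separate policy head is not stated in the quoted text.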