GeoMAN: Multi-level Attention Networks for Geo-sensory Time Series Prediction
Authors: Yuxuan Liang, Songyu Ke, Junbo Zhang, Xiuwen Yi, Yu Zheng
IJCAI 2018 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experiments on two types of real-world datasets, viz., air quality data and water quality data, demonstrate that our method outperforms nine baseline methods. |
| Researcher Affiliation | Collaboration | 1 School of Computer Science and Technology, Xidian University, Xi'an, China; 2 Urban Computing Business Unit, JD Finance, Beijing, China; 3 Zhiyuan College, Shanghai Jiao Tong University, Shanghai, China; 4 School of Information Science and Technology, Southwest Jiaotong University, Chengdu, China |
| Pseudocode | No | The paper does not contain structured pseudocode or algorithm blocks. |
| Open Source Code | No | The paper does not provide concrete access to source code for the methodology described in this paper. |
| Open Datasets | Yes | Air quality: Scratched from a public website (http://zx.bjmemc.com.cn/), this dataset includes the concentration of many different pollutants (e.g., PM2.5, SO2 and NO), together with some meteorological readings (e.g., temperature and wind speed) collected by totally 35 sensors every hour in Beijing. Among them, the primary pollutant of air quality is PM2.5 in most cases, thus we employ its reading as the target series. We briefly use the inverse of geospatial distance to denote the similarity between two sensors. Water quality: ... We use the metric proposed by [Liu et al., 2016a] as the similarity matrix in this dataset. (An inverse-distance similarity sketch follows the table.) |
| Dataset Splits | Yes | In the experiment with respect to the water quality, we partition the data into non-overlapped training, validation and test data by a ratio of 4:1:1, i.e., we use the first two-year data as the training set, the first half of the last year as the validation set, and the second half of the last year as the test set. Unfortunately, we cannot obtain such big data in the second dataset. Hence, we use a ratio of 8:1:1 to overcome it. (A chronological-split sketch follows the table.) |
| Hardware Specification | Yes | Our model, as well as the baselines, are implemented with TensorFlow [Abadi et al., 2016] on the server with one Tesla K40m and Intel Xeon E5. |
| Software Dependencies | No | The paper mentions 'TensorFlow' as the implementation framework but does not provide specific version numbers for it or any other software dependencies. |
| Experiment Setup | Yes | During the training phase, the batch size is 256 and the learning rate is 0.001. In the external factor fusion module, we embed the sensor ID into R^6 and the time features into R^10. In total, there are 4 hyperparameters in our model, of which the trade-off parameter λ is empirically fixed from 0.1 to 0.5. For the length of the window size T, we set T ∈ {6, 12, 24, 36, 48}. For simplicity, we use the same hidden dimensionality at the encoder and the decoder, and conduct a grid search over {32, 64, 128, 256}. Moreover, we use stacked LSTMs (the number of layers is denoted as q) as the unit of the encoder and decoder to enhance our performance. The setting in which q = 2, m = n = 64 and λ = 0.2 outperforms the others on the validation set. (A grid-enumeration sketch follows the table.) |
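
The air-quality row quotes the paper as using "the inverse of geospatial distance" as the sensor similarity. Below is a minimal sketch of how such a similarity matrix could be built; the haversine distance, the `eps` guard against co-located sensors, and the example coordinates are assumptions, since the quoted text does not specify how the distance is computed.

```python
import numpy as np

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two (lat, lon) points."""
    lat1, lon1, lat2, lon2 = map(np.radians, (lat1, lon1, lat2, lon2))
    dlat, dlon = lat2 - lat1, lon2 - lon1
    a = np.sin(dlat / 2) ** 2 + np.cos(lat1) * np.cos(lat2) * np.sin(dlon / 2) ** 2
    return 2 * 6371.0 * np.arcsin(np.sqrt(a))

def inverse_distance_similarity(coords, eps=1e-6):
    """Pairwise similarity = 1 / geospatial distance (diagonal left at 0)."""
    n = len(coords)
    sim = np.zeros((n, n))
    for i in range(n):
        for j in range(n):
            if i != j:
                d = haversine_km(*coords[i], *coords[j])
                sim[i, j] = 1.0 / (d + eps)  # eps is an assumed guard, not from the paper
    return sim

# Example: three hypothetical Beijing sensor locations (lat, lon)
coords = [(39.95, 116.35), (39.93, 116.42), (40.00, 116.30)]
S = inverse_distance_similarity(coords)
```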
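
The splits row quotes chronological, non-overlapping train/validation/test partitions at ratios of 4:1:1 and 8:1:1. A minimal sketch of such a split, assuming a simple index-based cut on a time-ordered array (the function name and the example data are illustrative, not from the paper):

```python
import numpy as np

def chronological_split(series, ratio=(8, 1, 1)):
    """Split a time-ordered array into non-overlapping train/val/test chunks.

    ratio=(4, 1, 1) mirrors the water-quality setting quoted above
    (two years training, half a year validation, half a year test);
    ratio=(8, 1, 1) mirrors the other dataset.
    """
    total = sum(ratio)
    n = len(series)
    n_train = n * ratio[0] // total
    n_val = n * ratio[1] // total
    train = series[:n_train]
    val = series[n_train:n_train + n_val]
    test = series[n_train + n_val:]
    return train, val, test

# Example: one year of hourly readings from a single sensor
hourly = np.random.rand(365 * 24)
train, val, test = chronological_split(hourly, ratio=(8, 1, 1))
```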
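
The experiment-setup row lists the fixed training settings and the hyperparameter grid (window size T, shared hidden dimensionality m = n, trade-off parameter λ, and number of stacked LSTM layers q). The sketch below only enumerates candidate configurations; the 0.1 step for λ and the {1, 2, 3} range for q are assumptions, since the quoted text gives no explicit values for them.

```python
from itertools import product

# Fixed training settings reported in the paper
config = {
    "batch_size": 256,
    "learning_rate": 1e-3,
    "sensor_id_embed_dim": 6,   # sensor ID embedded into R^6
    "time_embed_dim": 10,       # time features embedded into R^10
}

# Search space; λ step and q range are assumptions
grid = {
    "T": [6, 12, 24, 36, 48],        # window size
    "hidden": [32, 64, 128, 256],    # m = n, shared by encoder and decoder
    "lambda": [0.1, 0.2, 0.3, 0.4, 0.5],
    "q": [1, 2, 3],                  # stacked LSTM layers (assumed range)
}

# Each candidate would be trained and scored on the validation set;
# the paper reports q = 2, m = n = 64 and λ = 0.2 performing best.
candidates = [dict(zip(grid, values), **config) for values in product(*grid.values())]
```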