reproducibilityindex.ai

Probabilistic Programming Bots in Intuitive Physics Game Play

Authors: Fahad Alhasoun, Sarah Alneghiemish (pp. 778-783)

AAAI 2021

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	We discuss a case study showing empirical results of the performance of the model on the game of Flappy Bird. The approach was tested on the game of Flappy Bird.
Researcher Affiliation	Academia	Fahad Alhasoun, Sarah Alneghiemish Massachusetts Institute of Technology fha@mit.edu, smish@mit.edu
Pseudocode	Yes	Algorithm 1: Probabilistic Intuitive Physics Process. procedure GETACTION(s, t, m) (inputs are the state s, current time t, and previously sampled actions m): initialize the simulation horizon h to constant c; α ← M(φ(s)) (estimate the starting α from the CNN); while elapsed time < ϵ do (keep sampling for a duration of ϵ seconds): a, ch ← SAMPLEACTIONS(s, t, α, h); if ch is False then (observe when an unwanted collision didn't occur): add a to m (store samples resulting in no unwanted collisions); α_i ← α_i + Σ_{j=1}^{h} 1[a_j = i] (update α parameters); h ← h + δ (expand the horizon of the simulation); end if; end while; p(a_t = x | ch = False) ∝ Σ_{i=1}^{|m|} 1[m_{i,t} = x] (estimate the conditional from samples in m); a_t ← argmax_{a_t} p(a_t | ch = False); return a_t; end procedure. procedure SAMPLEACTIONS(s, t, α, h): θ ∼ Dirichlet(α) (sample θ, the probability of actions); a_t, ..., a_{t+(h−1)} ∼ Categorical(θ) (sample actions for timesteps t to t + (h − 1)); γ_{a_i} ∼ Gaussian(μ_{a_i}, σ_{a_i}) (sample impact on the velocity); ch ← PHYSICSSIMULATION(s, [γ_{a_t}, ..., γ_{a_{t+(h−1)}}], h) (simulate sampled plan for next h steps); return a, ch; end procedure.
Open Source Code	No	The paper mentions that "Several replica implementations of the game is available on github along with an implementation of DQN and A3C proposed by Mnih et al.", but this refers to third-party code for the game itself and for the baselines, not the authors' own source code for their proposed method.
Open Datasets	Yes	Human data are gathered through players on the web page in flappy.io. The average score for humans was 11.27 in 47 million games played, 95% of them had a score of 6 points or lower according to flappybird.io.
Dataset Splits	No	The paper states "After training the CNN part of PBCNN of the model on 10000 frames of game play" and "Average scores are calculated after running each trained model for 10 times", but it does not specify explicit training, validation, and test dataset splits with percentages or counts.
Hardware Specification	No	No specific hardware details (e.g., GPU/CPU models, memory, or cloud instances) used for running the experiments were provided in the paper.
Software Dependencies	No	The paper does not provide specific software dependencies with version numbers (e.g., Python, TensorFlow, PyTorch versions). It mentions using a convolutional neural network and probabilistic programs but no specific library details.
Experiment Setup	Yes	The CNN takes the last 4 frames as inputs and outputs α parameterizing the prior for the distribution of actions probabilities. The input of the neural network consists of an 80 × 80 × 4 frames of pixel data... The first hidden layer convolves 32 filter of size 8 × 8 with stride 4 on the input frames and applies rectifier nonlinearity. The second hidden layer convolves 64 filters with sizes of 4 × 4 with stride 2 then applies rectifier nonlinearity. The third hidden layer convolves 64 filters with sizes of 2 × 2 with stride 1 then applies rectifier nonlinearity. The fourth layer is a fully connected 512 nodes with rectifier nonlinearity. Then the output layer is fully connected and has as many nodes as there are actions in the game. The CNN is fit through applying stochastic gradient descent on the following cost function: L(α, φ, s, κ) = ‖α − M(φ(s), κ)‖²
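
The sampling loop in the Pseudocode row (Algorithm 1) can be sketched roughly as follows. This is a minimal illustration and not the authors' implementation: the toy one-dimensional physics, the constants `GRAVITY`, `CEILING`, and `IMPULSE`, the action encoding, and the random fallback when no plan survives are all assumptions made for the example. The starting α is passed in directly instead of being estimated by the CNN, and each plan is assumed to start at the current timestep, so the first action of a stored plan plays the role of m_{i,t}.

```python
import random
import time

ACTIONS = (0, 1)  # hypothetical encoding: 0 = do nothing, 1 = flap
# Hypothetical (mean, std) of each action's impact on velocity.
IMPULSE = {0: (0.0, 0.05), 1: (2.5, 0.3)}
GRAVITY, CEILING = -1.0, 20.0  # toy physics constants, not from the paper


def physics_simulation(state, impacts, h):
    """Toy stand-in for PHYSICSSIMULATION: True if the plan collides."""
    y, v = state
    for gamma in impacts[:h]:
        v = v + GRAVITY + gamma
        y = y + v
        if y <= 0.0 or y >= CEILING:
            return True
    return False


def sample_actions(state, alpha, h, rng):
    """Sample one h-step plan and report whether it collides."""
    # theta ~ Dirichlet(alpha), drawn as normalized Gamma variates.
    draws = [rng.gammavariate(a, 1.0) for a in alpha]
    total = sum(draws)
    theta = [d / total for d in draws]
    # a_t, ..., a_{t+h-1} ~ Categorical(theta)
    actions = rng.choices(ACTIONS, weights=theta, k=h)
    # gamma ~ Gaussian(mu_a, sigma_a) for each sampled action
    impacts = [rng.gauss(*IMPULSE[a]) for a in actions]
    return actions, physics_simulation(state, impacts, h)


def get_action(state, alpha, epsilon=0.02, c=3, delta=1, rng=None):
    """Keep sampling plans for epsilon seconds, then pick the most
    frequent first action among the collision-free plans."""
    rng = rng or random.Random()
    alpha = list(alpha)
    h = c
    kept = []  # the set m of collision-free plans
    deadline = time.monotonic() + epsilon
    while time.monotonic() < deadline:
        actions, collided = sample_actions(state, alpha, h, rng)
        if not collided:
            kept.append(actions)
            for a in actions:  # alpha_i += number of times action i was used
                alpha[a] += 1.0
            h += delta  # expand the simulation horizon
    if not kept:
        return rng.choice(ACTIONS)  # fallback, not part of the pseudocode
    counts = [sum(1 for plan in kept if plan[0] == x) for x in ACTIONS]
    return max(ACTIONS, key=lambda x: counts[x])
```

For example, `get_action((10.0, 0.0), [1.0, 1.0])` returns one of the two actions for a bird at height 10 with zero velocity. Which one it returns depends on the random draws and on how many plans fit into the time budget.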
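
The layer sizes quoted in the Experiment Setup row imply specific feature-map shapes, which the short calculation below works out. It assumes unpadded ("valid") convolutions, which the quoted text does not state, and it treats the cost function as a squared Euclidean distance between the target α and the network output. The helper names are hypothetical.

```python
# (filters, kernel size, stride) for the three convolutional layers
LAYERS = [(32, 8, 4), (64, 4, 2), (64, 2, 1)]


def conv_out(size, kernel, stride):
    """Output width of an unpadded convolution (assumption: no padding)."""
    return (size - kernel) // stride + 1


def feature_shapes(size=80):
    """Shape (height, width, channels) after each convolutional layer."""
    shapes = []
    for filters, kernel, stride in LAYERS:
        size = conv_out(size, kernel, stride)
        shapes.append((size, size, filters))
    return shapes


def squared_error(alpha, predicted):
    """Squared Euclidean distance between target alpha and CNN output."""
    return sum((a - p) ** 2 for a, p in zip(alpha, predicted))


print(feature_shapes())  # [(19, 19, 32), (8, 8, 64), (7, 7, 64)]
```

Under these assumptions the 512-node fully connected layer receives 7 × 7 × 64 = 3136 inputs. A different padding choice would change these numbers.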