Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al.: K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026.
OrbitZoo: Real Orbital Systems Challenges for Reinforcement Learning
Authors: Alexandre Oliveira, Katarina Dyreby, Francisco Caldas, Cláudia Soares
NeurIPS 2025 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | The environment is validated against various real satellite constellations, including Starlink, achieving a Mean Absolute Percentage Error (MAPE) of 0.16% compared to real-world data. This validation ensures reliability for generating high-fidelity simulations and enabling autonomous and independent satellite operations. |
| Researcher Affiliation | Academia | Alexandre Oliveira, NOVA University of Lisbon, Caparica, Almada, Portugal; Katarina Dyreby, NOVA University of Lisbon, Caparica, Almada, Portugal; Francisco Caldas, NOVA University of Lisbon, Caparica, Almada, Portugal; Cláudia Soares, NOVA University of Lisbon, Caparica, Almada, Portugal |
| Pseudocode | No | Formal Model Definitions: For completeness, we provide the mathematical details of the methods referred to in the main paper. C.1 Double Deep Q-Network (DDQN)... C.2 Deep Deterministic Policy Gradient (DDPG)... C.3 Proximal Policy Optimization (PPO)... The paper provides mathematical formulas for algorithms but no pseudocode blocks. |
| Open Source Code | Yes | This project is open source [1] and has a dedicated project page [2]. [1] https://github.com/orbitzoo/orbit_zoo |
| Open Datasets | Yes | To evaluate OrbitZoo's ability to bridge the reality gap, we configured the environment to simulate Starlink satellites and compared its output to publicly available ephemeris data in Space-Track. |
| Dataset Splits | No | The paper uses simulated environments for training RL agents and compares its simulation results with real-world ephemeris data. It does not describe explicit train/test/validation splits for a fixed dataset in the traditional machine learning sense. |
| Hardware Specification | Yes | For the following experiments and evaluations, the hardware used is detailed in Table 4, with the GPU utilized solely for training RL agents. Table 4: Hardware specifications. CPU: Intel(R) Core(TM) i3-8100 CPU @ 3.60 GHz, 3600 MHz, 4 cores, 4 logical processors; GPU: NVIDIA GeForce GTX 1050 Ti; RAM: 16.0 GB, 2933 MHz. |
| Software Dependencies | Yes | Data Generation: Built on Python and with a robust space dynamics library on its background... Orekit [45] is one of the most comprehensive open-source libraries for astrodynamics... [45] Maisonobe... Cs-si/orekit: 12.2.1, December 2024. |
| Experiment Setup | Yes | The overall structure of the actor and critic networks used throughout the experiments is similar, with two hidden layers and Tanh activation functions, as represented in Table 5. No extensive research was made to find optimal hyperparameters. ... Table 8: OrbitZoo vs. Kolosa: Training hyperparameters. Actor learning rate: 0.00001; Critic learning rate: 0.0001; Epochs: 1; Discount factor (γ): 0.99; τ: 0.01; µ: 0; σ: 0.2; θ: 0.15; t: 0.01; Memory capacity: 10 000; Initial standard deviation: 0.5; Batch size: 256. |
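
The Research Type row quotes a Mean Absolute Percentage Error (MAPE) of 0.16% between simulated output and real-world data. As a reminder of what that metric measures, here is a minimal sketch of the standard MAPE formula. The function name `mape` and the sample values are hypothetical and are not taken from the paper or its repository; the paper may compute the metric over different quantities or with different handling of edge cases.

```python
def mape(reference, simulated):
    """Return the mean absolute percentage error, in percent.

    `reference` holds the ground-truth values and `simulated` holds the
    values being evaluated. Both are sequences of equal, non-zero length.
    """
    if len(reference) != len(simulated) or len(reference) == 0:
        raise ValueError("inputs must be non-empty and of equal length")

    total = 0.0
    for ref, sim in zip(reference, simulated):
        if ref == 0:
            # The relative error is undefined for a zero reference value.
            raise ValueError("MAPE is undefined when a reference value is zero")
        total += abs((sim - ref) / ref)

    return 100.0 * total / len(reference)


if __name__ == "__main__":
    # Hypothetical illustration values, not data from the paper.
    reference_values = [6921.0, 6921.5, 6922.0]
    simulated_values = [6931.0, 6911.5, 6922.0]
    print(f"MAPE: {mape(reference_values, simulated_values):.4f}%")
```

Because each error is divided by the reference value, MAPE is scale-free, which makes a single percentage comparable across quantities with different magnitudes.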