Interferobot: aligning an optical interferometer by a reinforcement learning agent

Authors: Dmitry Sorokin, Alexander Ulanov, Ekaterina Sazhina, Alexander Lvovsky

NeurIPS 2020

| Reproducibility Variable | Result | LLM Response |
| --- | --- | --- |
| Research Type | Experimental | Here we train an RL agent to align a Mach-Zehnder interferometer, which is an essential part of many optical experiments, based on images of interference fringes acquired by a monocular camera. The agent is trained in a simulated environment, without any hand-coded features or a priori information about the physics, and subsequently transferred to a physical interferometer. (See the environment sketch below the table.) |
| Researcher Affiliation | Academia | Russian Quantum Center, Moscow, Russia; University of Oxford, United Kingdom; P. N. Lebedev Physics Institute, Moscow, Russia; Moscow Institute of Physics and Technology |
| Pseudocode | No | The paper describes the method in prose and mathematical equations but does not include structured pseudocode or an algorithm block. |
| Open Source Code | Yes | Videos of the interferometer alignment and software are available via the link https://github.com/dmitrySorokin/interferobotProject |
| Open Datasets | No | The paper uses data generated by its own simulator for training but does not provide concrete access information (link, DOI, etc.) to a pre-existing or archived public dataset. |
| Dataset Splits | No | The paper describes training and evaluation episodes but does not specify traditional train/validation/test splits with percentages or sample counts for a static dataset; data are generated dynamically by the simulated environment during training. |
| Hardware Specification | Yes | The whole training on a NVidia GTX 2060 GPU took about 10 hours. |
| Software Dependencies | No | The paper mentions software components such as a gym-like interface and parallel C++ code but does not provide specific version numbers for these or other software dependencies. |
| Experiment Setup | Yes | We train the agent in the simulated environment using the double dueling DQN algorithm [31] with a discount factor γ = 0.99, total number of steps 5 × 10^6, and replay buffer size 3 × 10^4. Updates were performed every four steps. (See the configuration sketch below the table.) |
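
The Research Type and Software Dependencies rows note that the agent observes camera images of interference fringes through a gym-like interface. The sketch below is a minimal, illustrative interaction loop only: the class name `SimulatedInterferometerEnv`, the observation shape, and the action space are assumptions made for this example and are not the authors' actual API; the real implementation is in the linked repository.

```python
# Minimal sketch of a gym-like interaction loop, assuming a hypothetical
# environment whose observations are stacks of fringe-camera frames and whose
# actions are discrete mirror tilts. Shapes and sizes are placeholders.
import numpy as np
import gym


class SimulatedInterferometerEnv(gym.Env):
    """Hypothetical stand-in for a simulated Mach-Zehnder alignment task."""

    def __init__(self, frame_size=64, n_frames=16, n_actions=9):
        self.observation_space = gym.spaces.Box(
            low=0.0, high=1.0, shape=(n_frames, frame_size, frame_size), dtype=np.float32
        )
        self.action_space = gym.spaces.Discrete(n_actions)
        self._frame_shape = (n_frames, frame_size, frame_size)

    def reset(self):
        # A real implementation would randomize the mirror misalignment and
        # render fringe images; random noise is used here as a placeholder.
        return np.random.rand(*self._frame_shape).astype(np.float32)

    def step(self, action):
        obs = np.random.rand(*self._frame_shape).astype(np.float32)
        reward = 0.0  # e.g. a quantity derived from fringe visibility
        done = False
        return obs, reward, done, {}


# Standard gym-style rollout loop.
env = SimulatedInterferometerEnv()
obs = env.reset()
for _ in range(100):
    action = env.action_space.sample()  # an agent's policy would go here
    obs, reward, done, info = env.step(action)
    if done:
        obs = env.reset()
```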
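
The Experiment Setup row quotes the reported training hyperparameters. The following is a hedged configuration sketch showing how those numbers could be organized in code; only the discount factor, total step count, replay buffer size, and four-step update interval come from the paper, while the batch size, learning rate, target-update period, and the `DQNConfig` name itself are placeholder assumptions.

```python
# Hedged sketch of a double dueling DQN training configuration. Values marked
# "from the paper" are the quoted ones; everything else is an assumption.
from dataclasses import dataclass


@dataclass
class DQNConfig:
    gamma: float = 0.99               # discount factor (from the paper)
    total_steps: int = 5_000_000      # total environment steps (from the paper)
    replay_buffer_size: int = 30_000  # replay buffer capacity (from the paper)
    update_every: int = 4             # one update every four steps (from the paper)
    # Illustrative assumptions, not reported values:
    batch_size: int = 32
    learning_rate: float = 1e-4
    target_update_period: int = 10_000
    double_q: bool = True             # double DQN target computation
    dueling: bool = True              # dueling value/advantage network head


cfg = DQNConfig()
# Schematic update schedule implied by "updates were performed every four steps":
for step in range(cfg.total_steps):
    # ... collect one environment transition into the replay buffer ...
    if step % cfg.update_every == 0:
        pass  # sample a batch of cfg.batch_size transitions, take one gradient step
```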