Parallel Droplet Control in MEDA Biochips using Multi-Agent Reinforcement Learning

Authors: Tung-Che Liang, Jin Zhou, Yun-Sheng Chan, Tsung-Yi Ho, Krishnendu Chakrabarty, Cy Lee

ICML 2021

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We evaluate the performance of the models in a realistic simulation setting, where microelectrodes degrade over time. We designed and executed a real-life bioassay, namely serial dilution, on a fabricated MEDA biochip. We evaluate three RL algorithms, i.e., double DQN, PPO, and ACER, described in Section 2 using three training schemes, namely centralized, concurrent, and parameter sharing.
Researcher Affiliation | Academia | (1) Department of Electrical and Computer Engineering, Duke University, Durham, NC, USA; (2) Department of Electronics Engineering, National Yang Ming Chiao Tung University, Hsinchu, Taiwan; (3) Department of Computer Science, National Tsing Hua University, Hsinchu, Taiwan.
Pseudocode | No | The paper describes algorithms conceptually but does not present them in pseudocode or a clearly labeled algorithm block.
Open Source Code | Yes | We open-source the simulator to the RL community for future research: https://github.com/tcliang-tw/medaenv.git
Open Datasets | No | The paper uses a custom simulation environment (MEDA-Env) to generate training data on the fly and does not mention using a pre-existing, publicly available fixed dataset with access information.
Dataset Splits | No | The paper describes training processes and convergence but does not explicitly specify train/validation/test dataset splits with percentages or sample counts.
Hardware Specification | Yes | The training was executed on a Linux platform integrated with an 11 GB-memory GPU (Nvidia GeForce RTX 2080 Ti). The chip has an area of 17.2 mm², and it was fabricated using a 0.35 µm standard CMOS process. The chip was operated under 3.3 V at a 1 kHz frequency.
Software Dependencies | No | The paper mentions using a 'PettingZoo Gym environment' and various RL algorithms (Double DQN, PPO, ACER) but does not provide specific version numbers for these software components or libraries.
Experiment Setup | Yes | For each RL algorithm, we ran 18 simulations with random seeds; the average performance of each algorithm is plotted as a solid line, and the similar-color region shows the interval between its best performance and its worst performance. For each training game of MEDA-Env, n_rt random routing tasks are generated, where 1 < n_rt ≤ 3. A training epoch contains 20,000 timesteps.
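
The sketches below illustrate three of the rows above; they are illustrative only, and none of the code is taken from the paper or its repository.

The Research Type row quotes three training schemes: centralized, concurrent, and parameter sharing. The following is a minimal Python sketch of how those schemes differ in assigning policy networks to droplet agents, assuming a placeholder Policy class and invented agent names; the paper's actual networks and MEDA-Env interfaces are not reproduced here.

# Minimal sketch (not the paper's implementation) of how the three training
# schemes assign policy networks to droplet agents. The Policy class and the
# agent names are placeholders.
from dataclasses import dataclass, field


@dataclass
class Policy:
    """Placeholder for a Double DQN / PPO / ACER policy network."""
    name: str
    params: dict = field(default_factory=dict)


def build_policies(scheme, agents):
    """Return a mapping from decision maker to its policy network."""
    if scheme == "centralized":
        # A single controller observes the joint state of all droplets and
        # outputs a joint action.
        return {"central_controller": Policy("central_controller")}
    if scheme == "concurrent":
        # Every droplet trains its own independent network.
        return {agent: Policy(agent) for agent in agents}
    if scheme == "parameter_sharing":
        # Every droplet acts from its own observation, but all droplets share
        # one set of network parameters.
        shared = Policy("shared")
        return {agent: shared for agent in agents}
    raise ValueError(f"unknown training scheme: {scheme!r}")


if __name__ == "__main__":
    agents = ["droplet_0", "droplet_1", "droplet_2"]
    for scheme in ("centralized", "concurrent", "parameter_sharing"):
        policies = build_policies(scheme, agents)
        distinct = len({id(p) for p in policies.values()})
        print(f"{scheme}: {distinct} distinct network(s) "
              f"for {len(agents)} droplet agents")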
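
The Open Source Code and Software Dependencies rows point to the MEDA-Env simulator, described as a PettingZoo (multi-agent Gym) environment. Below is a generic random-action rollout against a PettingZoo-style parallel environment; the `meda_env` import path, the `MedaEnv` constructor, and its arguments are assumptions made for illustration, and the reset/step return signatures vary across PettingZoo versions, so the linked repository should be consulted for the real API.

# Hypothetical rollout against a PettingZoo-style parallel environment. The
# import path, class name, and constructor arguments are assumptions made for
# illustration; check the medaenv repository for the actual API.
def random_rollout(env, max_steps=200):
    """Drive a parallel multi-agent environment with random actions."""
    observations = env.reset()
    for _ in range(max_steps):
        if not env.agents:  # every droplet has reached its destination
            break
        actions = {agent: env.action_spaces[agent].sample()
                   for agent in env.agents}
        # Older PettingZoo releases return (obs, rewards, dones, infos);
        # newer ones split dones into terminations and truncations.
        observations, rewards, dones, infos = env.step(actions)
    return observations


# Usage (all names hypothetical):
# from meda_env import MedaEnv
# env = MedaEnv(n_droplets=2)
# random_rollout(env)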
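
The Experiment Setup row reports 18 seeded runs per algorithm, training epochs of 20,000 timesteps, and curves plotted as a mean with a band between the best and worst run. The sketch below reproduces only that aggregation step, using synthetic placeholder curves; the epoch count and curve shapes are invented for illustration and are not results from the paper.

# Sketch of the reported aggregation: 18 seeded runs per algorithm,
# summarized by the mean curve and a band between the best and worst run.
# The curves below are synthetic placeholders, not results from the paper.
import numpy as np
import matplotlib.pyplot as plt

N_SEEDS = 18              # "we ran 18 simulations with random seeds"
STEPS_PER_EPOCH = 20_000  # "a training epoch contains 20,000 timesteps"
N_EPOCHS = 50             # illustrative value, not taken from the paper

rng = np.random.default_rng(0)
epochs = np.arange(1, N_EPOCHS + 1)

# Synthetic per-seed learning curves, shape (N_SEEDS, N_EPOCHS).
curves = (1.0 - np.exp(-epochs / 15.0)
          + 0.05 * rng.standard_normal((N_SEEDS, N_EPOCHS)))

mean_curve = curves.mean(axis=0)
best_curve = curves.max(axis=0)
worst_curve = curves.min(axis=0)

timesteps = epochs * STEPS_PER_EPOCH
plt.plot(timesteps, mean_curve, label="mean over 18 seeds")
plt.fill_between(timesteps, worst_curve, best_curve, alpha=0.3,
                 label="best-to-worst interval")
plt.xlabel("training timesteps")
plt.ylabel("episode return (synthetic)")
plt.legend()
plt.show()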