Reproducibility Index

Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al.: K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," Under Review, 2026.

$\epsilon$-Seg: Sparsely Supervised Semantic Segmentation of Microscopy Data

Authors: Sheida Rahnamai Kordasiabi, Damian Nogare, Florian Jug

NeurIPS 2025 | Venue PDF | LLM Run Details

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	We show empirical results of ϵ-Seg and baseline methods on 2 dense EM datasets of biological tissues and demonstrate the applicability of our method also on fluorescence microscopy data. Our results show that ϵ-Seg is capable of achieving competitive sparsely-supervised segmentation results on complex biological image data, even if only limited amounts of training labels are available. Code available at https://github.com/juglab/eps-Seg. 4 Experiments and Results Datasets. We used the Beta Seg [22] dataset, which was made publicly available by the authors. This Focused Ion Beam Scanning Electron Microscopy (FIB-SEM) dataset captured primary mouse pancreatic islet β cells at a 16 nm isotropic resolution. The final dataset consists of two groups of high and low glucose cells and also provides human curated binary segmentation masks for seven subcellular structures, i.e. centrioles, nucleus, plasma membrane, microtubules, golgi body, granules, and mitochondria. Consistent with [34], we also chose the 4 high glucose cells for this work. For evaluation, cells 1, 2, and 3 from four cell volumes of high glucose were used for training, while cell 4 served as an independent test set.
Researcher Affiliation	Academia	Sheida Rahnamai Kordasiabi (1, 2), Damian Dalle Nogare (1), Florian Jug (1). (1) Human Technopole, Milan, Italy; (2) Technical University of Dresden, Germany
Pseudocode	No	The paper describes the method using diagrams (Figure 1 shows a pipeline) and mathematical formulations, but it does not include any explicitly labeled pseudocode or algorithm blocks with structured steps.
Open Source Code	Yes	Code available at https://github.com/juglab/eps-Seg.
Open Datasets	Yes	Datasets. We used the Beta Seg [22] dataset, which was made publicly available by the authors. This Focused Ion Beam Scanning Electron Microscopy (FIB-SEM) dataset captured primary mouse pancreatic islet β cells at a 16 nm isotropic resolution. The final dataset consists of two groups of high and low glucose cells and also provides human curated binary segmentation masks for seven subcellular structures, i.e. centrioles, nucleus, plasma membrane, microtubules, golgi body, granules, and mitochondria. Consistent with [34], we also chose the 4 high glucose cells for this work. For evaluation, cells 1, 2, and 3 from four cell volumes of high glucose were used for training, while cell 4 served as an independent test set. Next, we used the liver FIB-SEM dataset, which consists of samples that were fresh needle biopsies fixed with 4% PFA and 2% GA in phosphate buffer. High contrast staining was performed with reduced osmium and Walton's lead aspartate stain [33] and embedded in Epon. Sample preparation and imaging were done on a ZEISS Gemini SEM according to prior reports [35]. The final dataset consists of one cell volume with 11 crops that have been extracted from a cell volume, annotated manually, and used for training, validation, and testing. The segmentation masks consist of six subcellular structures, mitochondria, peroxisomes, lipofuscin, basolateral membrane, open bile canaliculi, and closed bile canaliculi, along with an additional background category. Furthermore, we conducted an experiment on the overlapping subset of two datasets Aitslab-bioimaging1 [1] and Aitslab-bioimaging2 [25].
Dataset Splits	Yes	For evaluation, cells 1, 2, and 3 from four cell volumes of high glucose were used for training, while cell 4 served as an independent test set. The final dataset consists of one cell volume with 11 crops that have been extracted from a cell volume, annotated manually, and used for training, validation, and testing. The overlapping subset of them contains 30, 2-channel images for training and 10 for testing. While our main experiment (Table 1) includes 5-fold cross-validation to mitigate variability due to data splits, we did not report error bars or perform statistical significance tests.
Hardware Specification	No	The paper does not provide specific details about the hardware used for running the experiments. It mentions that Transformer-based architectures require powerful computing setups, but does not specify the hardware used for their own model.
Software Dependencies	No	The paper does not explicitly list specific software dependencies with their version numbers, such as Python or specific library versions.
Experiment Setup	Yes	For all hyperparameters we have introduced, we used grid searches to find a good balance between performance and stability. The temperature parameter $\tau$ in the Gumbel-Softmax distribution plays a crucial role in controlling the degree of discreteness in the sampled values. During training, $\tau$ is often annealed from a higher value to a lower one, gradually transitioning from a smooth approximation to a discrete categorical distribution. In ϵ-Seg, we use a typical annealing schedule $\tau = \max(\tau_{\min}, \exp(-rt))$, where $r = 0.999$ is the decay rate, $\tau_{\min} = 0.5$, and $t$ is the training step. $L = L_I + \alpha_1 L_{CE} + \alpha_2 L_{KL} + \alpha_3 L_{CL}$ (17), where $\alpha_i$ are hyperparameters to adjust the contribution of each loss to one another. We tuned those hyperparameters using grid search and manual tuning.
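
The annealing schedule and the weighted loss quoted in the Experiment Setup row can be illustrated with a minimal Python sketch. It is not taken from the eps-Seg repository. The function names, the default weights of 1.0, and the negative sign in the exponent are assumptions made for illustration; the sign is assumed because the schedule is described as a decay from a higher to a lower temperature. The decay rate is left as a parameter so other values can be tried.

```python
import math


def gumbel_temperature(step, rate=0.999, tau_min=0.5):
    """Annealed Gumbel-Softmax temperature: max(tau_min, exp(-rate * step)).

    The negative exponent is an assumption (see the lead-in); `rate` and
    `tau_min` default to the values quoted in the Experiment Setup row.
    """
    return max(tau_min, math.exp(-rate * step))


def total_loss(l_i, l_ce, l_kl, l_cl, alphas=(1.0, 1.0, 1.0)):
    """Weighted sum L = L_I + a1 * L_CE + a2 * L_KL + a3 * L_CL.

    The default weights are placeholders; the source only states that the
    alphas were tuned by grid search and manual tuning.
    """
    a1, a2, a3 = alphas
    return l_i + a1 * l_ce + a2 * l_kl + a3 * l_cl


if __name__ == "__main__":
    # Print the temperature at a few training steps for a slower,
    # purely illustrative decay rate.
    for step in (0, 100, 1000, 10000):
        print(step, gumbel_temperature(step, rate=1e-3))
    # Combine four example loss values with hypothetical weights.
    print(total_loss(1.0, 2.0, 3.0, 4.0, alphas=(0.5, 0.5, 0.5)))
```

Under this reading the temperature starts at 1.0 at step 0 and never drops below the floor of 0.5, and the individual loss terms enter the objective only through the weighted sum.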