Benchmarking RGB Image Segmentation of \textit{Arabidopsis thaliana} Using Encoder–Decoder Models
High-throughput plant phenotyping with RGB imaging offers a scalable, non-invasive way to monitor plant growth and extract phenotypic traits. However, accurate segmentation across experiments remains challenging because of image variability, most often caused by shifts in pot positions between captures. This study introduces a customized image stabilization method that aligns pots consistently across time-series images of \textit{Arabidopsis thaliana}, enhancing spatial consistency. A large-scale RGB dataset was collected and prepared, of which 4,000 manually annotated images were used to train multiple encoder–decoder deep learning models. Various CNN-based encoders were paired with well-known decoders, including U-Net, $\mathbf{U}^{2}$-Net, PANet, and DeepLabv3. Stabilization significantly improved model performance, with the EffNetB1 + $\mathbf{U}^{2}$-Net encoder–decoder combination achieving the highest precision of 0.95 and an Intersection over Union (IoU) of 0.96. These results demonstrate the value of spatial consistency and yield a robust, scalable pipeline for automated plant segmentation in indoor phenotyping systems.
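To make the stabilization step concrete, the sketch below shows one plausible way to register each time-series frame to a fixed reference image using phase correlation, so pots remain at consistent pixel positions before annotation and training. This is a minimal illustration, not the authors' exact method: the pure-translation motion model, the choice of the first capture as reference, and the file paths are all assumptions introduced here.

```python
# Illustrative sketch (not the paper's exact pipeline): align each frame of a
# tray's time series to a reference image so pot positions stay consistent.
import cv2
import numpy as np

def stabilize_to_reference(ref_bgr: np.ndarray, frame_bgr: np.ndarray) -> np.ndarray:
    """Translate `frame_bgr` so its content lines up with `ref_bgr`."""
    ref_g = np.float32(cv2.cvtColor(ref_bgr, cv2.COLOR_BGR2GRAY))
    frm_g = np.float32(cv2.cvtColor(frame_bgr, cv2.COLOR_BGR2GRAY))
    # Sub-pixel translation between the two frames. A pure-translation model
    # is assumed here, which suits variability dominated by pot-position shifts.
    (dx, dy), _response = cv2.phaseCorrelate(ref_g, frm_g)
    # Shift the frame back by the detected offset. Validate the sign convention
    # on a frame with a known shift; it varies across OpenCV usage examples.
    M = np.float32([[1, 0, -dx], [0, 1, -dy]])
    h, w = frame_bgr.shape[:2]
    return cv2.warpAffine(frame_bgr, M, (w, h), flags=cv2.INTER_LINEAR)

# Usage (hypothetical paths): register every capture to the first image.
ref = cv2.imread("tray01/day01.png")
for path in ["tray01/day02.png", "tray01/day03.png"]:
    aligned = stabilize_to_reference(ref, cv2.imread(path))
    cv2.imwrite(path.replace(".png", "_stab.png"), aligned)
```

A translation-only warp keeps the registration rigid and artifact-free; if the imaging rig also introduced rotation or scale changes, a similarity or affine model would be the natural extension.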