Interactive semantic segmentation in laparoscopic images with deep neural networks

  1. Monasterio Expósito, Leticia
Supervised by:
  1. Daniel Pizarro Pérez Director
  2. Javier Macías Guarasa Co-director

Defence university: Universidad de Alcalá

Defence date: 24 November 2023

Committee:
  1. José Miguel Buenaposada Biencinto Chair
  2. Cristina Losada Gutiérrez Secretary
  3. David Jiménez Cabello Committee member

Type: Thesis

Abstract

Semantic segmentation is a crucial prerequisite for many Computer Vision tasks, such as image processing, tracking, image registration, and reconstruction. It is applied in diverse fields, including industrial image processing, robotic vision, automated driving, and, in particular, medical imaging. This thesis focuses on semantic segmentation in medical imaging, with a specific emphasis on Augmented Reality (AR) guided Minimally Invasive Surgery systems, for which semantic segmentation is a key preprocessing step. The major challenges in this domain are the lack of large and varied datasets and intra-patient variability. Addressing these issues by creating suitable datasets is difficult due to limited patient availability and the time-consuming nature of manual annotation by experts. To overcome these challenges, this thesis investigates a semantic segmentation training methodology that improves generalization through a new strategy for labeled data augmentation. The proposed method uses synthetic label deterioration to train a discriminator network. The discriminator is then used during training of the segmentation model and helps improve segmentation results on unseen data. Our method can be applied to state-of-the-art segmentation models and undergoes rigorous experimentation to validate its effectiveness. Additionally, we explore interactive segmentation to leverage the expert's knowledge and improve results. We devise an interactive click-based correction method that encodes information from previous interactions as accumulated guidance maps, which are then used as an additional input to the segmentation model. Our approach is extensively evaluated on medical imaging datasets, demonstrating its efficacy and potential for advancing semantic segmentation, particularly in the context of Minimally Invasive Surgery images.
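The accumulated guidance maps mentioned in the abstract can be illustrated with a minimal sketch. This is an assumption of one common construction (a Gaussian disk per click, merged by elementwise maximum and stacked as an extra input channel); the function names, the `sigma` parameter, and the exact encoding are hypothetical and may differ from the thesis's actual method.

```python
import numpy as np

def update_guidance(guidance, click, sigma=10.0):
    """Accumulate a new user click into a guidance map.

    guidance: (H, W) float map holding previous interactions.
    click: (row, col) of the latest correction click.
    A Gaussian disk centred on the click is merged by elementwise
    maximum, so earlier interactions are preserved in the map.
    """
    h, w = guidance.shape
    rows, cols = np.mgrid[0:h, 0:w]
    d2 = (rows - click[0]) ** 2 + (cols - click[1]) ** 2
    disk = np.exp(-d2 / (2.0 * sigma ** 2))
    return np.maximum(guidance, disk)

# Two positive clicks accumulated into one map, then concatenated
# with the RGB image as an additional channel of the network input.
h, w = 64, 64
pos = np.zeros((h, w), dtype=np.float32)
for c in [(16, 16), (48, 40)]:
    pos = update_guidance(pos, c)
image = np.random.rand(h, w, 3).astype(np.float32)
net_input = np.concatenate([image, pos[..., None]], axis=-1)  # (H, W, 4)
```

In practice, interactive methods often keep separate positive and negative guidance channels (clicks inside vs. outside the target structure), yielding a 5-channel input; the single-channel version above is kept deliberately small.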