Quantitative analysis of large-scale medical imaging datasets can be streamlined using automated segmentation. However, the growing number of AI-based methods for anatomical segmentation raises a central challenge: choosing among functionally similar models is difficult because ground truth data are absent for representative samples, and because comparing segmentation results involves practical obstacles (inconsistent structure naming, non-uniform file formats, and the complexity of visualization). Our work alleviates these issues by evaluating six open-source segmentation models (TotalSegmentator 1.5 and 2.6, Auto3DSeg, Moose, MultiTalent, and OMAS) on a sample of CT scans from the publicly available National Lung Screening Trial (NLST) dataset. We analyzed 31 anatomical structures (lungs, vertebrae, ribs, and heart) after harmonizing the segmentation results to a consistent representation. To support visual comparison, we developed open-source tools in 3D Slicer that automate loading, structure-wise inspection, and comparison across models. For quantitative comparison, we derived consensus segmentations per structure and assessed model agreement using Dice similarity and volume differences. Preliminary results show excellent agreement for some structures (e.g., the lungs) but not for all (e.g., some models produce invalid vertebra or rib segmentations). Only one model, Moose, segmented the costovertebral joints, i.e., the joints connecting the ribs to the spine. Overall, this work assists model evaluation in the absence of ground truth, ultimately enabling informed model selection.
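As a concrete reference for the agreement metrics mentioned above, here is a minimal sketch of per-structure Dice similarity and volume difference. It assumes each model's output for a structure is available as a binary NumPy mask on a common voxel grid; the function names and the voxel-volume parameter are illustrative, not the project's actual code:

```python
import numpy as np

def dice_similarity(mask_a: np.ndarray, mask_b: np.ndarray) -> float:
    """Dice similarity coefficient between two binary masks on the same grid."""
    a, b = mask_a.astype(bool), mask_b.astype(bool)
    denom = a.sum() + b.sum()
    if denom == 0:
        return float("nan")  # structure missing from both segmentations
    return 2.0 * np.logical_and(a, b).sum() / denom

def volume_difference_ml(mask_a: np.ndarray, mask_b: np.ndarray,
                         voxel_volume_mm3: float) -> float:
    """Absolute volume difference in milliliters (1000 mm^3 = 1 mL)."""
    return abs(int(mask_a.sum()) - int(mask_b.sum())) * voxel_volume_mm3 / 1000.0
```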
This project builds on the previous work “Review of segmentation results quality across various multi-organ segmentation models”, conducted during the last Project Week in Gran Canaria. The goal is to systematically evaluate and compare the segmentations produced by six publicly available multi-organ segmentation models by identifying areas of agreement and disagreement across anatomical structures in our dataset, for which ground truth segmentations are unavailable.
During this Project Week, we will improve and extend the previous analysis by broadening the scope of the comparison and by engaging with users of the evaluated models as well as with the model developers.
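One simple way to localize agreement and disagreement in the absence of ground truth is a per-structure majority-vote consensus across models. The sketch below is illustrative rather than the project's actual procedure, and assumes the harmonized per-model masks share a common voxel grid:

```python
import numpy as np

def majority_vote_consensus(masks: list[np.ndarray]) -> np.ndarray:
    """Consensus mask: voxels labeled by more than half of the models."""
    votes = np.stack([m.astype(np.uint8) for m in masks]).sum(axis=0)
    return votes > len(masks) / 2.0

def disagreement_map(masks: list[np.ndarray]) -> np.ndarray:
    """Fraction of models labeling each voxel; values near 0.5 mark disagreement."""
    votes = np.stack([m.astype(np.float32) for m in masks]).sum(axis=0)
    return votes / len(masks)
```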
Current Status of our Analysis:

An interactive poster with a summary of the results and the current state of the project can be found at the following link: https://www.dropbox.com/scl/fi/c84sm9djytyi80jk2ixfa/giebeler.lena.pptx?rlkey=g3sf82zuv5fgmuog0an3dsy96&dl=0

Current Status of the Slicer Segmentation Verification Module Extension:
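For reference, the kind of scripted, structure-wise comparison the extension automates can be sketched in 3D Slicer's Python console. The file paths and segment name below are hypothetical placeholders, and the snippet assumes the segment names have already been harmonized across models:

```python
# Run inside 3D Slicer's Python console; paths and segment names are placeholders.
import numpy as np
import slicer

refVolume = slicer.util.loadVolume("/path/to/ct.nrrd")
segNodeA = slicer.util.loadSegmentation("/path/to/modelA.seg.nrrd")
segNodeB = slicer.util.loadSegmentation("/path/to/modelB.seg.nrrd")

structure = "lung_left"  # assumes harmonized structure names across models
idA = segNodeA.GetSegmentation().GetSegmentIdBySegmentName(structure)
idB = segNodeB.GetSegmentation().GetSegmentIdBySegmentName(structure)

# Rasterize both segments onto the CT grid so the arrays are directly comparable
maskA = slicer.util.arrayFromSegmentBinaryLabelmap(segNodeA, idA, refVolume).astype(bool)
maskB = slicer.util.arrayFromSegmentBinaryLabelmap(segNodeB, idB, refVolume).astype(bool)

dice = 2.0 * np.logical_and(maskA, maskB).sum() / (maskA.sum() + maskB.sum())
print(f"{structure}: Dice = {dice:.3f}")
```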