Assistant Professor, Department of Computer Engineering and IT, Payame Noor University, Tehran, Iran
Abstract:
Semantic segmentation assigns a suitable label to each set of pixels depicting an object in an image, based on appearance and semantic characteristics. Although it remains one of the most difficult problems in image processing and computer vision, the task has attracted considerable interest in recent years.
The availability of RGBD sensors has introduced new possibilities for segmentation by incorporating depth information alongside color. However, effectively combining these modalities presents challenges due to misalignments and depth inaccuracies. This paper proposes CRFCut, a novel unsupervised segmentation method that utilizes a Conditional Random Field (CRF) model optimized with graph cuts to segment RGBD images into coherent regions. The method recursively divides regions into foreground and background layers, employing superpixel-based appearance segmentation for the RGB component and integrating depth cues to refine results. This approach enables robust segmentation, even in the presence of noisy or incomplete depth information.
The CRFCut algorithm begins by separating the depth image into foreground and background regions using a median depth threshold. This initial step requires no preprocessing and provides the basis for further segmentation. Simultaneously, the RGB image is segmented into superpixels using an appearance-based approach, such as the mean-shift algorithm. These superpixels and the depth regions are combined within a CRF model, where labels are assigned by minimizing the energy function using the graph-cut α-expansion algorithm. The algorithm is applied recursively to subdivided regions, allowing finer segmentation in a parallelizable manner.
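The depth-only part of this procedure can be illustrated with a minimal sketch. It covers only the median-threshold split and the recursion; the superpixel step, the CRF energy, and the graph-cut optimization are deliberately omitted. The function names, the `max_levels` and `min_pixels` parameters, and the convention that pixels nearer than the median count as foreground are all assumptions made for illustration, not details taken from the paper.

```python
import numpy as np


def median_depth_split(depth, mask):
    """Split the pixels selected by `mask` at their median depth.

    Assumption: pixels strictly nearer than the median are treated as
    foreground and the rest as background.
    """
    threshold = np.median(depth[mask])
    foreground = mask & (depth < threshold)
    background = mask & (depth >= threshold)
    return foreground, background


def recursive_split(depth, mask=None, max_levels=2, min_pixels=4):
    """Recursively split a depth map into regions (hypothetical helper).

    Returns a list of boolean masks that together partition the pixels
    selected by `mask`. The CRF refinement described in the abstract is
    not modeled here.
    """
    if mask is None:
        mask = np.ones(depth.shape, dtype=bool)
    # Stop when the recursion budget is spent or the region is too small.
    if max_levels == 0 or mask.sum() < min_pixels:
        return [mask]
    foreground, background = median_depth_split(depth, mask)
    # A region of constant depth cannot be split any further.
    if not foreground.any() or not background.any():
        return [mask]
    # The two halves are independent, so they could be processed in parallel.
    return (recursive_split(depth, foreground, max_levels - 1, min_pixels)
            + recursive_split(depth, background, max_levels - 1, min_pixels))
```

Because each half is processed independently of the other, the recursion reflects the parallelizable structure the abstract mentions.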
The proposed method was evaluated on two datasets: the NYUv2 dataset and the MIT dataset. On the NYUv2 dataset, which includes 1449 RGBD images with annotated object classes, CRFCut outperformed five state-of-the-art segmentation techniques (Table 1). On the MIT dataset, which provides human-labeled sequences of indoor and outdoor scenes, CRFCut achieved comparable or better results, even with depth maps estimated from 2D images using existing methods (Table 2). Segmentation accuracy was measured with the Rand index, and the qualitative results in Figures 3 and 4 highlight CRFCut’s robustness, particularly with noisy or imprecise depth data.
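For readers unfamiliar with the evaluation metric, the plain Rand index between two segmentations is the fraction of pixel pairs on which both agree, either by grouping the pair together or by separating it. The sketch below computes it from a contingency table. It assumes the standard, unadjusted definition; the abstract does not say which variant of the metric the paper uses, and the function name is hypothetical.

```python
import numpy as np


def rand_index(labels_a, labels_b):
    """Plain Rand index between two label maps of equal size."""
    a = np.asarray(labels_a).ravel()
    b = np.asarray(labels_b).ravel()
    if a.shape != b.shape:
        raise ValueError("label maps must contain the same number of pixels")
    n = a.size
    if n < 2:
        raise ValueError("at least two pixels are required")

    # Map arbitrary label values to consecutive integer ids.
    _, a_ids = np.unique(a, return_inverse=True)
    _, b_ids = np.unique(b, return_inverse=True)

    # contingency[i, j] counts pixels with id i in A and id j in B.
    contingency = np.zeros((a_ids.max() + 1, b_ids.max() + 1), dtype=np.int64)
    np.add.at(contingency, (a_ids, b_ids), 1)

    def pairs(x):
        # Number of unordered pairs that can be drawn from x items.
        return x * (x - 1) // 2

    total = pairs(n)
    same_in_both = pairs(contingency).sum()
    same_in_a = pairs(contingency.sum(axis=1)).sum()
    same_in_b = pairs(contingency.sum(axis=0)).sum()

    # Agreements are pairs grouped together in both segmentations plus
    # pairs separated in both segmentations.
    agreements = total + 2 * same_in_both - same_in_a - same_in_b
    return agreements / total
```

The score depends only on how pixels are grouped, not on the label values themselves, so a segmentation compared with a relabeled copy of itself scores 1.0.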
In summary, CRFCut introduces an unsupervised CRF-based approach that integrates RGB and depth information for accurate scene segmentation. By leveraging graph-cut optimization and a recursive structure, the method achieves high-quality segmentation results with minimal preprocessing. Despite some limitations, such as challenges in distinguishing adjacent objects with similar features, CRFCut offers a promising framework for real-time segmentation of RGBD images. Future work will address these limitations by incorporating supervised techniques and improving depth data quality for enhanced performance.
Article number: 3
Type of Study: Research
Subject:
Paper Received: 2024/11/27 | Accepted: 2025/07/21 | Published: 2026/03/20 | ePublished: 2026/03/20