1. D. Comaniciu and P. Meer, "Mean shift: A robust approach toward feature space analysis," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 24, no. 5, pp. 603-619, 2002. [
DOI:10.1109/34.1000236]
2. Z. Wu, Z. Zhou, G. Allibert, C. Stolz, C. Demonceaux, and C. Ma, "Transformer fusion for indoor RGB-D semantic segmentation," Computer Vision and Image Understanding, vol. 249, p. 104174, 2024. [
DOI:10.1016/j.cviu.2024.104174]
3. C. Liu, W. T. Freeman, E. H. Adelson, and Y. Weiss, "Human-assisted motion annotation," in 2008 IEEE Conference on Computer Vision and Pattern Recognition, 2008: IEEE, pp. 1-8. [
DOI:10.1109/CVPR.2008.4587845]
4. P. F. Felzenszwalb and D. P. Huttenlocher, "Efficient graph-based image segmentation," International Journal of Computer Vision, vol. 59, pp. 167-181, 2004. [
DOI:10.1023/B:VISI.0000022288.19776.77]
5. D. Sun, E. B. Sudderth, and M. J. Black, "Layered segmentation and optical flow estimation over time," in 2012 IEEE Conference on Computer Vision and Pattern Recognition, 2012: IEEE, pp. 1768-1775. [
DOI:10.1109/CVPR.2012.6247873]
6. L. Ladický, C. Russell, P. Kohli, and P. H. Torr, "Associative hierarchical CRFs for object class image segmentation," in 2009 IEEE 12th International Conference on Computer Vision, 2009: IEEE, pp. 739-746. [
DOI:10.1109/ICCV.2009.5459248]
7. A. Criminisi, G. Cross, A. Blake, and V. Kolmogorov, "Bilayer segmentation of live video," in 2006 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'06), 2006, vol. 1: IEEE, pp. 53-60. [
DOI:10.1109/CVPR.2006.69]
8. M. Szummer, P. Kohli, and D. Hoiem, "Learning CRFs using graph cuts," in Computer Vision-ECCV 2008: 10th European Conference on Computer Vision, Marseille, France, October 12-18, 2008, Proceedings, Part II 10, 2008: Springer, pp. 582-595. [
DOI:10.1007/978-3-540-88688-4_43]
9. Y. Boykov, O. Veksler, and R. Zabih, "Fast approximate energy minimization via graph cuts," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 23, no. 11, pp. 1222-1239, 2001. [
DOI:10.1109/34.969114]
10. C. Rother, V. Kolmogorov, and A. Blake, "'GrabCut': Interactive foreground extraction using iterated graph cuts," ACM Transactions on Graphics (TOG), vol. 23, no. 3, pp. 309-314, 2004. [
DOI:10.1145/1015706.1015720]
11. S. Mirkamali and P. Nagabhushan, "Depth-wise image inpainting," in Proceedings of the 21st International Conference on Pattern Recognition (ICPR2012), 2012: IEEE, pp. 141-144.
12. M. M. Haji-Esmaeili and G. Montazer, "A critical survey on content-based and semantic image retrieval," (in Persian), Signal and Data Processing, vol. 22, no. 1, pp. 113-141, 2025. [
DOI:10.61186/jsdp.22.1.113]
13. J. Shi and J. Malik, "Normalized cuts and image segmentation," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 22, no. 8, pp. 888-905, 2000. [
DOI:10.1109/34.868688]
14. S. Du, W. Wang, R. Guo, R. Wang, and S. Tang, "AsymFormer: Asymmetrical cross-modal representation learning for mobile platform real-time RGB-D semantic segmentation," in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, 2024, pp. 7608-7615. [
DOI:10.1109/CVPRW63382.2024.00756]
15. X. He, R. S. Zemel, and D. Ray, "Learning and incorporating top-down cues in image segmentation," in Computer Vision-ECCV 2006: 9th European Conference on Computer Vision, Graz, Austria, May 7-13, 2006, Proceedings, Part I 9, 2006: Springer, pp. 338-351. [
DOI:10.1007/11744023_27]
16. X. Ren and J. Malik, "Learning a classification model for segmentation," in Proceedings Ninth IEEE International Conference on Computer Vision, 2003, vol. 1: IEEE, pp. 10-17. [
DOI:10.1109/ICCV.2003.1238308]
17. A. Jepson and M. J. Black, "Mixture models for optical flow computation," in Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, 1993: IEEE, pp. 760-761. [
DOI:10.1109/CVPR.1993.341161]
18. N. Jojic and B. J. Frey, "Learning flexible sprites in video layers," in Proceedings of the 2001 IEEE Computer Society Conference on Computer Vision and Pattern Recognition. CVPR 2001, 2001, vol. 1: IEEE, pp. I-I. [
DOI:10.1109/CVPR.2001.990476]
19. D. Sun, E. Sudderth, and M. Black, "Layered image motion with explicit occlusions, temporal consistency, and depth ordering," Advances in Neural Information Processing Systems, vol. 23, 2010.
20. M. Bleyer, C. Rother, P. Kohli, D. Scharstein, and S. Sinha, "Object stereo-joint stereo matching and object segmentation," in CVPR 2011, 2011: IEEE, pp. 3081-3088. [
DOI:10.1109/CVPR.2011.5995581]
21. N. Silberman, D. Hoiem, P. Kohli, and R. Fergus, "Indoor segmentation and support inference from RGBD images," in Computer Vision-ECCV 2012: 12th European Conference on Computer Vision, Florence, Italy, October 7-13, 2012, Proceedings, Part V 12, 2012: Springer, pp. 746-760. [
DOI:10.1007/978-3-642-33715-4_54]
22. L. Wang, C. Zhang, R. Yang, and C. Zhang, "TofCut: Towards robust real-time foreground extraction using a time-of-flight camera," in Proc. of 3DPVT, 2010, pp. 1-8.
23. A. D. Jepson, D. J. Fleet, and M. J. Black, "A layered motion representation with occlusion and compact spatial support," in Computer Vision-ECCV 2002: 7th European Conference on Computer Vision, Copenhagen, Denmark, May 28-31, 2002, Proceedings, Part I 7, 2002: Springer, pp. 692-706. [
DOI:10.1007/3-540-47969-4_46]
24. Y. Weiss and E. H. Adelson, "A unified mixture framework for motion segmentation: Incorporating spatial coherence and estimating the number of models," in Proceedings CVPR IEEE Computer Society Conference on Computer Vision and Pattern Recognition, 1996: IEEE, pp. 321-326. [
DOI:10.1109/CVPR.1996.517092]
25. J. Wills, S. Agarwal, and S. Belongie, "What went where," in CVPR, 2003, vol. 1, pp. 37-44.
26. J. Xiao and M. Shah, "Motion layer extraction in the presence of occlusion using graph cuts," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 27, no. 10, pp. 1644-1659, 2005. [
DOI:10.1109/TPAMI.2005.202]
27. P. Kohli, L. Ladický, and P. H. Torr, "Robust higher order potentials for enforcing label consistency," International Journal of Computer Vision, vol. 82, pp. 302-324, 2009. [
DOI:10.1007/s11263-008-0202-0]
28. B. Yin, X. Zhang, Z. Li, L. Liu, M.-M. Cheng, and Q. Hou, "DFormer: Rethinking RGBD representation learning for semantic segmentation," arXiv preprint arXiv:2309.09668, 2023.
29. L. Zhong, C. Guo, J. Zhan, and J. Deng, "Attention-based fusion network for RGB-D semantic segmentation," Neurocomputing, vol. 608, p. 128371, 2024. [
DOI:10.1016/j.neucom.2024.128371]
30. Z. Li, C. Lang, G. Li, T. Wang, and Y. Li, "Depth guided feature selection for RGBD salient object detection," Neurocomputing, vol. 519, pp. 57-68, 2023. [
DOI:10.1016/j.neucom.2022.11.030]
31. Y. Tong, J. Chen, and Y. Wang, "Geometry-guided multilevel RGBD fusion for surface normal estimation," Computer Communications, vol. 206, pp. 73-84, 2023. [
DOI:10.1016/j.comcom.2023.04.014]
32. B. Xiong, Y. Peng, J. Zhu, J. Gu, Z. Chen, and W. Qin, "AGWNet: Attention-guided adaptive shuffle channel gate warped feature network for indoor scene RGB-D semantic segmentation," Displays, p. 102730, 2024. [
DOI:10.1016/j.displa.2024.102730]
33. N. Komodakis, G. Tziritas, and N. Paragios, "Fast, approximately optimal solutions for single and dynamic MRFs," in 2007 IEEE Conference on Computer Vision and Pattern Recognition, 2007: IEEE, pp. 1-8. [
DOI:10.1109/CVPR.2007.383095]
34. S. Gupta, P. Arbelaez, and J. Malik, "Perceptual organization and recognition of indoor scenes from RGB-D images," in Proceedings of the IEEE conference on computer vision and pattern recognition, 2013, pp. 564-571. [
DOI:10.1109/CVPR.2013.79]
35. S. Gupta, R. Girshick, P. Arbeláez, and J. Malik, "Learning rich features from RGB-D images for object detection and segmentation," in Computer Vision-ECCV 2014: 13th European Conference, Zurich, Switzerland, September 6-12, 2014, Proceedings, Part VII 13, 2014: Springer, pp. 345-360. [
DOI:10.1007/978-3-319-10584-0_23]
36. Y. Liu, O. Yoshie, and H. Watanabe, "Application of multi-modal fusion attention mechanism in semantic segmentation," in Proceedings of the Asian conference on computer vision, 2022, pp. 1245-1264. [
DOI:10.1007/978-3-031-26293-7_23]
37. Y. Zhang, C. Xiong, J. Liu, X. Ye, and G. Sun, "Spatial-information guided adaptive context-aware network for efficient RGB-D semantic segmentation," IEEE Sensors Journal, 2023. [
DOI:10.1109/JSEN.2023.3304637]
38. G. Zhang, J. Jia, T.-T. Wong, and H. Bao, "Consistent depth maps recovery from a video sequence," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 31, no. 6, pp. 974-988, 2009. [
DOI:10.1109/TPAMI.2009.52]
39. W. M. Rand, "Objective criteria for the evaluation of clustering methods," Journal of the American Statistical Association, vol. 66, no. 336, pp. 846-850, 1971. [
DOI:10.1080/01621459.1971.10482356]