Volume 14, Issue 4 (3-2018)                   JSDP 2018, 14(4): 97-116 | Back to browse issues page

Zare mehrjardi M, Rezaeian M. Using of Model Based Hand Poses Estimation for Imitation of User's Arm Movements by Robot Arm. JSDP. 2018; 14 (4) :97-116
URL: http://jsdp.rcisp.ac.ir/article-1-522-en.html
MSc, Yazd University
Abstract:

Pose estimation is the process of identifying how a human body and/or individual limbs are configured in a given scene. Hand pose estimation is an important research topic with a variety of applications in human-computer interaction (HCI) scenarios, such as gesture recognition, animation synthesis and robot control. However, capturing hand motion is quite a challenging task because of the hand's high flexibility. Many sensor-based and vision-based methods have been proposed to fulfill this task.
In sensor-based systems, specialized hardware is used for hand motion capture. Vision-based hand pose estimation methods, in contrast, can generally be divided into two categories: appearance-based methods and model-based methods. In appearance-based approaches, various features are extracted from the input images to estimate the hand pose; usually a large number of training samples is used to learn, in advance, a mapping function from the features to the hand poses. Given the learned mapping function, the hand pose can be estimated efficiently. In model-based approaches, the hand pose is estimated by aligning a projected 3D hand model to the hand features extracted from the inputs. These methods therefore provide the full hand state at every time step, but they require a large amount of computation, which makes real-time implementation difficult in practice.
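To make the contrast concrete, the toy sketch below sets an appearance-based nearest-neighbour lookup against model-based alignment by iterative error minimization. The training data, the 16-dimensional features, and the caller-supplied `project` function are all illustrative assumptions, not the methods of this paper.

```python
import numpy as np

# --- Appearance-based: learn a direct mapping from image features to poses.
# Hypothetical toy data: feature vectors with their known joint-angle poses.
rng = np.random.default_rng(0)
train_features = rng.random((100, 16))   # 100 samples, 16-dim features
train_poses = rng.random((100, 5))       # 5 joint angles per sample

def estimate_pose_appearance(features):
    """Nearest-neighbour lookup: return the pose of the closest training sample."""
    dists = np.linalg.norm(train_features - features, axis=1)
    return train_poses[np.argmin(dists)]

# --- Model-based: adjust model parameters until the projected model
# matches the observed features.
def estimate_pose_model(observed, project, init, steps=200, lr=0.05):
    """Minimise the alignment error between the projected model and the
    observation by simple finite-difference gradient descent."""
    pose = init.copy()
    for _ in range(steps):
        base = np.linalg.norm(project(pose) - observed)
        grad = np.zeros_like(pose)
        for i in range(len(pose)):
            p = pose.copy()
            p[i] += 1e-4
            grad[i] = (np.linalg.norm(project(p) - observed) - base) / 1e-4
        pose -= lr * grad
    return pose
```

The inner optimization loop is where the computational cost of model-based methods shows up: every frame requires many evaluations of the projection, which is why real-time use is difficult.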
Hand pose estimation using (color/depth) images consists of three steps:

  1. Hand detection and segmentation
  2. Feature extraction
  3. Setting the model parameters using the extracted features and updating the model

Depending on the model used and the intended hand-gesture analysis, features such as fingertip positions, the number of extended fingers, palm position and joint angles are extracted for pose estimation.
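The three steps above can be sketched per frame roughly as follows. The depth band-pass thresholds, the synthetic 8x8 depth frame, and the exponential-smoothing update are illustrative assumptions standing in for the paper's actual processing.

```python
import numpy as np

def segment_hand(depth, near=400, far=700):
    """Step 1: hand detection and segmentation -- assume the hand is the
    nearest object and keep pixels inside a depth band (mm)."""
    return (depth > near) & (depth < far)

def extract_features(mask):
    """Step 2: feature extraction -- palm centre as the mask centroid,
    fingertip candidate as the mask pixel farthest from the centroid."""
    ys, xs = np.nonzero(mask)
    palm = np.array([ys.mean(), xs.mean()])
    d = (ys - palm[0]) ** 2 + (xs - palm[1]) ** 2
    tip = np.array([ys[d.argmax()], xs[d.argmax()]])
    return palm, tip

def update_model(state, palm, tip, alpha=0.5):
    """Step 3: parameter update -- exponential smoothing of the model state
    with the newly extracted features (a stand-in for full model fitting)."""
    obs = np.concatenate([palm, tip])
    return obs if state is None else alpha * obs + (1 - alpha) * state

# Synthetic 8x8 depth frame: a "hand" blob at 500 mm on a 1000 mm background.
depth = np.full((8, 8), 1000)
depth[2:6, 2:5] = 500
mask = segment_hand(depth)
palm, tip = extract_features(mask)
state = update_model(None, palm, tip)
```

In a real system, step 2 would use contour analysis or a convex hull rather than a single farthest point, but the per-frame flow of segment, extract, update is the same.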
In this paper, a model-based markerless dynamic hand pose estimation scheme is presented. Motion capture is the process of recording a live motion event and translating it into usable mathematical terms by tracking a number of key points in space over time and combining them into a single 3D representation of the performance. The sequences of depth images, color images and skeleton data obtained from Kinect (a tool for markerless motion capture) at 30 frames per second serve as inputs to this scheme. The proposed scheme exploits both temporal and spatial features of the input sequences, and focuses on localizing the index and thumb fingertips and on the joint angles of the robot arm so that it mimics the user's arm movements in 3D space in an uncontrolled environment. The RoboTECH II ST240 is used as the real robot arm. Depth and skeleton data are used to determine the angles of the robot joints. Three approaches for locating the thumb and index fingertips from the available data are presented, each with its own limitations; they rely on concepts such as thresholding, edge detection, convex-hull construction, skin modeling and background subtraction. Finally, comparing the tracked trajectories of the user's wrist and the robot end effector shows an average error of about 0.43 degrees, which is an appropriate performance for this application.
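As a minimal sketch of how joint angles can be derived from Kinect skeleton data, the following computes the angle at a joint from three 3D skeleton points; the shoulder, elbow and wrist coordinates are hypothetical values in Kinect camera space, not data from the paper.

```python
import numpy as np

def joint_angle(a, b, c):
    """Angle (degrees) at joint b, formed by 3D skeleton points a-b-c,
    e.g. shoulder-elbow-wrist from the Kinect skeleton stream."""
    u, v = np.asarray(a) - b, np.asarray(c) - b
    cos = np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v))
    return np.degrees(np.arccos(np.clip(cos, -1.0, 1.0)))

# Hypothetical skeleton positions (metres, Kinect camera space):
shoulder = [0.0, 0.4, 2.0]
elbow    = [0.0, 0.1, 2.0]
wrist    = [0.3, 0.1, 2.0]
print(joint_angle(shoulder, elbow, wrist))  # forearm perpendicular to upper arm -> 90.0
```

Evaluating such angles on every frame for both the user's skeleton and the robot's end effector is what allows the trajectory comparison reported above.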
The key contribution of this work is estimating the hand pose for every input frame and updating the robot arm according to the estimated pose. The thumb and index fingertip locations, obtained with the presented approaches, form part of the feature vector. User movements are translated into the corresponding Move instruction for the robot; the features required for a Move instruction are the rotation values around the joints in different directions and the opening between the index finger and the thumb.
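A hedged sketch of how such a Move instruction might be assembled from the estimated features: the function name, the `max_open` normalization constant, and the dictionary layout are all illustrative assumptions, not the robot's actual command API.

```python
import numpy as np

def make_move_instruction(joint_angles_deg, thumb_tip, index_tip, max_open=0.10):
    """Build the parameters of a hypothetical Move instruction:
    one rotation per robot joint, plus a gripper opening in [0, 1]
    derived from the thumb-index fingertip distance (metres)."""
    dist = float(np.linalg.norm(np.asarray(thumb_tip) - np.asarray(index_tip)))
    opening = min(dist / max_open, 1.0)   # normalise; clamp at fully open
    return {"rotations_deg": list(joint_angles_deg), "gripper": opening}

# Hypothetical per-frame features: joint angles plus fingertip positions.
cmd = make_move_instruction([45.0, 90.0, 10.0],
                            thumb_tip=[0.02, 0.0, 0.5],
                            index_tip=[0.07, 0.0, 0.5])
```

Issuing one such instruction per input frame is what keeps the robot arm synchronized with the user's movements.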

Full-Text [PDF 8424 kb]
Type of Study: Research | Subject: Paper
Received: 2016/05/22 | Accepted: 2017/03/05 | Published: 2018/03/13 | ePublished: 2018/03/13



© 2015 All Rights Reserved | Signal and Data Processing