Abstract
Video object tracking is an important problem in computer vision. In this paper, we propose a perceptual hashing-based template-matching method that tracks objects efficiently in challenging video sequences. We first apply three existing basic perceptual hashing techniques to visual tracking: average hash (aHash), perceptive hash (pHash) and difference hash (dHash). Compared with previous tracking methods such as mean-shift and compressive tracking (CT), perceptual hashing-based tracking is both more efficient and more accurate. To further improve the accuracy of object localization and the robustness of tracking, we propose Laplace-based Hash (LHash) and Laplace-based Difference Hash (LDHash). Qualitative and quantitative comparisons with representative tracking algorithms show that our improved perceptual hashing-based trackers perform favorably against state-of-the-art algorithms in time cost, accuracy and robustness under various challenging conditions. Because the improved perceptual hash is a compact and efficient representation of an object, it can also be fused with depth information for more robust RGB-D video tracking.
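To make the idea of hash-based template matching concrete, the sketch below implements aHash and dHash as they are commonly described, together with a Hamming-distance window search. It is an illustration under stated assumptions, not the authors' implementation: the function names, the block-averaging downsampler, the exhaustive sliding-window search and the `step` parameter are all hypothetical, and the paper's LHash and LDHash are not reproduced because the abstract does not define them.

```python
def block_mean_resize(gray, out_h, out_w):
    """Downsample a 2-D list of grey values by averaging rectangular blocks.

    Assumes the input is at least out_h x out_w so every block is non-empty.
    """
    in_h, in_w = len(gray), len(gray[0])
    small = []
    for i in range(out_h):
        r0, r1 = i * in_h // out_h, (i + 1) * in_h // out_h
        row = []
        for j in range(out_w):
            c0, c1 = j * in_w // out_w, (j + 1) * in_w // out_w
            block = [gray[r][c] for r in range(r0, r1) for c in range(c0, c1)]
            row.append(sum(block) / len(block))
        small.append(row)
    return small


def average_hash(gray, size=8):
    """aHash: one bit per cell, set when the cell is brighter than the mean."""
    small = block_mean_resize(gray, size, size)
    pixels = [v for row in small for v in row]
    mean = sum(pixels) / len(pixels)
    return [1 if v > mean else 0 for v in pixels]


def difference_hash(gray, size=8):
    """dHash: one bit per horizontal neighbour pair, set when left > right."""
    small = block_mean_resize(gray, size, size + 1)
    return [1 if row[j] > row[j + 1] else 0 for row in small for j in range(size)]


def hamming(a, b):
    """Number of positions at which two equal-length bit lists differ."""
    return sum(x != y for x, y in zip(a, b))


def best_match(frame, template_hash, win_h, win_w, hash_fn=average_hash, step=4):
    """Slide a window over the frame and return (distance, top, left) of the
    window whose hash is closest to the template hash (first one on ties)."""
    best = None
    for top in range(0, len(frame) - win_h + 1, step):
        for left in range(0, len(frame[0]) - win_w + 1, step):
            patch = [row[left:left + win_w] for row in frame[top:top + win_h]]
            d = hamming(hash_fn(patch), template_hash)
            if best is None or d < best[0]:
                best = (d, top, left)
    return best
```

In a tracker built along these lines, the template hash would be computed once from the object region in the first frame, and `best_match` would be called on a search region around the previous location in each new frame. Restricting the search region and choosing `step` trade localization accuracy against time cost.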
Acknowledgments
This work is supported by the National Natural Science Foundation of China (61463032, 61563035, and 81501560) and the Scientific Research Foundation for Returned Scholars, Ministry of Education of China.
Cite this article
Fei, M., Ju, Z., Zhen, X. et al. Real-time visual tracking based on improved perceptual hashing. Multimed Tools Appl 76, 4617–4634 (2017). https://doi.org/10.1007/s11042-016-3723-5
DOI: https://doi.org/10.1007/s11042-016-3723-5