3D Scene and Object Classification Based on Information Complexity of Depth Data

Authors

Industrial Control Center of Excellence (ICCE), Advanced Robotics and Automated Systems (ARAS), Faculty of Electrical Engineering, K. N. Toosi University of Technology, Tehran, Iran, P. O. Box 16315-1355

Abstract

This paper addresses the problem of classifying 3D scenes and objects from depth data. In contrast to high-dimensional feature-based representations, the depth data is described in a low-dimensional space: to mitigate the curse of dimensionality, it is represented by a sparse model over a learned dictionary. Drawing on algorithmic information theory, a new definition of Kolmogorov complexity based on the Earth Mover’s Distance (EMD) is presented. Classification of 3D scenes and objects is then performed by means of a normalized complexity distance, whose practical applicability is demonstrated by experiments on publicly available datasets. The experimental results are also compared with state-of-the-art 3D object classification methods. Furthermore, the proposed method is shown to outperform FAB-MAP 2.0 in detecting loop closures, in terms of precision and recall.
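The idea of classifying by a normalized complexity distance can be illustrated with a minimal sketch. The abstract does not give the EMD-based complexity definition or the sparse coding step, so the sketch below swaps in a stand-in: the zlib-compressed length serves as the complexity estimate, in the style of the well-known normalized compression distance. The function names, the toy byte strings, and the nearest-neighbour rule are all assumptions made for illustration, not the method of the paper.

```python
import zlib


def complexity(data: bytes) -> int:
    """Stand-in complexity estimate: length of the zlib-compressed input.

    Assumption: the paper uses an EMD-based complexity over sparse codes;
    compressed length is only a convenient, computable substitute here.
    """
    return len(zlib.compress(data, 9))


def normalized_distance(x: bytes, y: bytes) -> float:
    """Normalized compression distance between two byte strings."""
    cx, cy = complexity(x), complexity(y)
    cxy = complexity(x + y)  # complexity of the concatenation
    return (cxy - min(cx, cy)) / max(cx, cy)


def classify(query: bytes, labeled: list[tuple[str, bytes]]) -> str:
    """Return the label of the reference sample closest to the query."""
    label, _ = min(labeled, key=lambda item: normalized_distance(query, item[1]))
    return label


# Toy data (hypothetical): a highly repetitive sample and a varied one.
repetitive = b"wall floor wall floor " * 50
varied = bytes(range(256)) * 4
references = [("repetitive", repetitive), ("varied", varied)]
query = b"wall floor wall floor " * 30

print(classify(query, references))
```

A sample that shares structure with a reference adds little to the complexity of their concatenation, so the distance to that reference is small; this is the property the nearest-neighbour rule relies on.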

Keywords

