September 2019 Full

proposed approach, the two models can compensate for each other. Specifically, the image-based point cloud model can be used to extend the SLAM point clouds from short distances, whereas the SLAM model can provide a scale for the image-based model. Moreover, as discussed, the SLAM results can provide additional depth constraints for the SfM process to improve it. The designed scale-adaptive registration can then merge those kinds of point clouds into a common coordinate system to produce enhanced and extended 3D mapping results. Two challenging cases were used to evaluate the performance of the proposed solution. The theoretical analysis and experimental validation yield the following conclusions.

1. The incorporation of additional depth constraints from the SLAM results benefits the offline SfM; moreover, data collection can be completed in a single measurement.

2. The fusion of the distant point cloud model from the RGB image sequences with the short-range point clouds from the depth sensors can significantly improve the coverage of the 3D mapping results (more than 50% in our cases).

3. The designed scale-adaptive registration can ensure the geometric accuracy of the structural edges (i.e., accuracy in distant ranges is 1% at 20 m in our cases) and the 3D point quality (i.e., the bias is lower than 10 cm for the major surface of about 800 m²).
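The role of the SLAM model as a scale reference for the image-based model can be illustrated with a small example. The following is a minimal sketch, not the scale-adaptive registration designed in this paper: it assumes that matched point pairs between the two models are already known, and it recovers only the scale factor as the ratio of the point spreads about their centroids, a quantity that rotation and translation do not change. The function names and the sample coordinates are hypothetical.

```python
import math


def centroid(points):
    """Return the mean of a list of 3D points given as (x, y, z) tuples."""
    n = len(points)
    return tuple(sum(p[i] for p in points) / n for i in range(3))


def rms_spread(points):
    """Root-mean-square distance of the points from their centroid."""
    c = centroid(points)
    total = sum(sum((p[i] - c[i]) ** 2 for i in range(3)) for p in points)
    return math.sqrt(total / len(points))


def estimate_scale(sfm_points, slam_points):
    """Scale that maps unscaled SfM points onto metric SLAM points.

    Assumption: both lists hold the same physical points in the same order.
    """
    if len(sfm_points) != len(slam_points) or len(sfm_points) < 2:
        raise ValueError("need at least two matched point pairs")
    sfm_spread = rms_spread(sfm_points)
    if sfm_spread == 0.0:
        raise ValueError("SfM points coincide; scale is undefined")
    return rms_spread(slam_points) / sfm_spread


# Hypothetical data: metric SLAM points, and the same points in an SfM frame
# that is rotated 90 degrees about z, shifted, and shrunk by a factor of 4.
slam = [(0.0, 0.0, 0.0), (2.0, 0.0, 0.0), (0.0, 4.0, 0.0), (0.0, 0.0, 6.0)]
sfm = [(-y / 4 + 1.0, x / 4 - 2.0, z / 4 + 0.5) for x, y, z in slam]

scale = estimate_scale(sfm, slam)
print(f"estimated scale: {scale:.3f}")
```

In practice the correspondences would be noisy and partly wrong, so a scale estimate of this kind would normally be wrapped in a robust scheme and followed by a refinement of rotation and translation.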

Although RGB-D sensors are rarely used in real mapping cases, this paper shows the potential of such sensors to generate enhanced and extended 3D models with high mobility and low cost. Such low-cost equipment could be used to quickly build 3D models in large indoor spaces, such as shopping malls, hospitals, and airports, for a variety of indoor navigation applications. Thus, our future work will not only focus on the methods of related technical aspects but also consider the application of equipment in mapping and modeling projects.

Acknowledgments

This work was supported by grants from the Hong Kong Polytechnic University (Project Nos. 1-ZEAB and 1-ZVN6) and grants from the National Natural Science Foundation of China (Project Nos. 41671426 and 41471345).


PHOTOGRAMMETRIC ENGINEERING & REMOTE SENSING, September 2019, page 641
