Multimodal CNN Pedestrian Classification: A Study on Combining LIDAR and Camera Data

Gledson Melotti; Cristiano Premebida; Nuno M. M. Da S. Goncalves; Urbano J. C. Nunes; Diego R. Faria

doi:10.1109/ITSC.2018.8569666

College of Engineering and Physical Sciences

Research output: Chapter in Book/Published conference output › Conference publication

Abstract

This paper presents a study on deep-learning-based pedestrian classification using data from a monocular camera and a 3D LIDAR sensor, separately and in combination. Early and late multimodal sensor fusion approaches are revisited and compared in terms of classification performance. Pedestrian classification finds applications in advanced driver assistance systems (ADAS) and autonomous driving, and it has regained particular attention recently because of, among other reasons, safety concerns involving self-driving vehicles. A Convolutional Neural Network (CNN) is used in this work as the classifier in distinct situations: with data from a single sensor as input, and with data from both sensors combined in the CNN input layer. Range (distance) and intensity (reflectance) data from the LIDAR are treated as separate channels and are fed to the CNN in the form of dense maps, obtained through sensor coordinate transformation and spatial filtering; this allows the same CNN-based approach to be applied directly to data from both sensors. For late fusion, the outputs of the individual CNNs are combined by means of learning and non-learning approaches. Pedestrian classification is evaluated on a ‘binary classification’ dataset created from the KITTI Vision Benchmark Suite, and results are shown for each sensor modality individually and for the fusion strategies.
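The two fusion styles named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `early_fusion` and `late_fusion`, the nested-list image layout, and the three fixed combination rules (average, product, maximum) are assumptions chosen for illustration, since the abstract does not say which non-learning rules were used. Early fusion is shown as appending the dense LIDAR range and reflectance maps to the camera channels before the CNN input layer; late fusion is shown as combining the class-probability outputs of two separately trained classifiers.

```python
from typing import List

Map2D = List[List[float]]        # H x W dense map (hypothetical layout)
Image = List[List[List[float]]]  # H x W x C image (hypothetical layout)


def early_fusion(rgb: Image, lidar_range: Map2D, lidar_reflectance: Map2D) -> Image:
    """Append the two dense LIDAR maps to the camera channels, pixel by pixel."""
    if not (len(rgb) == len(lidar_range) == len(lidar_reflectance)):
        raise ValueError("all inputs must have the same height")
    fused = []
    for rgb_row, d_row, r_row in zip(rgb, lidar_range, lidar_reflectance):
        if not (len(rgb_row) == len(d_row) == len(r_row)):
            raise ValueError("all inputs must have the same width")
        # Each pixel grows from C camera channels to C + 2 channels.
        fused.append([list(px) + [d, r] for px, d, r in zip(rgb_row, d_row, r_row)])
    return fused


def late_fusion(p_camera: List[float], p_lidar: List[float],
                rule: str = "average") -> List[float]:
    """Combine two class-probability vectors with a fixed (non-learning) rule.

    The rules here are assumed examples, not taken from the paper.
    """
    if len(p_camera) != len(p_lidar):
        raise ValueError("probability vectors must have the same length")
    if rule == "average":
        scores = [(a + b) / 2 for a, b in zip(p_camera, p_lidar)]
    elif rule == "product":
        scores = [a * b for a, b in zip(p_camera, p_lidar)]
    elif rule == "maximum":
        scores = [max(a, b) for a, b in zip(p_camera, p_lidar)]
    else:
        raise ValueError(f"unknown rule: {rule}")
    total = sum(scores)
    if total == 0:
        raise ValueError("combined scores sum to zero; cannot normalise")
    # Renormalise so the fused scores sum to one.
    return [s / total for s in scores]


if __name__ == "__main__":
    # Toy 1 x 2 image with three camera channels per pixel.
    rgb = [[[0.1, 0.2, 0.3], [0.4, 0.5, 0.6]]]
    fused = early_fusion(rgb, [[5.0, 6.0]], [[0.7, 0.8]])
    print(len(fused[0][0]), "channels per pixel after early fusion")

    # Toy [non-pedestrian, pedestrian] outputs from two classifiers.
    for rule in ("average", "product", "maximum"):
        print(rule, late_fusion([0.4, 0.6], [0.2, 0.8], rule))
```

A learning-based late fusion would replace the fixed rule with a small classifier trained on the concatenated CNN outputs; that variant is omitted here because the abstract gives no detail on it.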

Original language	English
Title of host publication	2018 21st International Conference on Intelligent Transportation Systems (ITSC)
Publisher	IEEE
Pages	3138-3143
ISBN (Electronic)	978-1-7281-0323-5
ISBN (Print)	978-1-7281-0321-1
DOIs	https://doi.org/10.1109/ITSC.2018.8569666
Publication status	Published - 10 Dec 2018
Event	2018 IEEE International Conference on Intelligent Transportation Systems (ITSC), Maui, HI, USA. Duration: 4 Nov 2018 → 7 Nov 2018

Publication series

Name	2018 21st International Conference on Intelligent Transportation Systems (ITSC)
Publisher	IEEE
ISSN (Print)	2153-0009
ISSN (Electronic)	2153-0017

Conference

Conference	2018 IEEE International Conference on Intelligent Transportation Systems (ITSC)
Period	4/11/18 → 7/11/18

Access to Document

10.1109/ITSC.2018.8569666

Cite this

Melotti, G., Premebida, C., Goncalves, N. M. M. D. S., Nunes, U. J. C., & Faria, D. R. (2018). Multimodal CNN Pedestrian Classification: A Study on Combining LIDAR and Camera Data. In 2018 21st International Conference on Intelligent Transportation Systems (ITSC) (pp. 3138-3143). (2018 21st International Conference on Intelligent Transportation Systems (ITSC)). IEEE. https://doi.org/10.1109/ITSC.2018.8569666

@inproceedings{d835f5ffc5c64a5db7e1a93295d75c69,

title = "Multimodal CNN Pedestrian Classification: A Study on Combining LIDAR and Camera Data",

author = "Gledson Melotti and Cristiano Premebida and Goncalves, {Nuno M. M. Da S.} and Nunes, {Urbano J. C.} and Faria, {Diego R.}",

year = "2018",

month = dec,

day = "10",

doi = "10.1109/ITSC.2018.8569666",

language = "English",

isbn = "978-1-7281-0321-1",

series = "2018 21st International Conference on Intelligent Transportation Systems (ITSC)",

publisher = "IEEE",

pages = "3138--3143",

booktitle = "2018 21st International Conference on Intelligent Transportation Systems (ITSC)",

address = "United States",

note = "2018 IEEE International Conference on Intelligent Transportation Systems (ITSC) ; Conference date: 04-11-2018 Through 07-11-2018",

}


TY - GEN

T1 - Multimodal CNN Pedestrian Classification: A Study on Combining LIDAR and Camera Data

AU - Melotti, Gledson

AU - Premebida, Cristiano

AU - Goncalves, Nuno M. M. Da S.

AU - Nunes, Urbano J. C.

AU - Faria, Diego R.

PY - 2018/12/10

Y1 - 2018/12/10

UR - https://ieeexplore.ieee.org/document/8569666/

U2 - 10.1109/ITSC.2018.8569666

DO - 10.1109/ITSC.2018.8569666

M3 - Conference publication

SN - 978-1-7281-0321-1

T3 - 2018 21st International Conference on Intelligent Transportation Systems (ITSC)

SP - 3138

EP - 3143

BT - 2018 21st International Conference on Intelligent Transportation Systems (ITSC)

PB - IEEE

T2 - 2018 IEEE International Conference on Intelligent Transportation Systems (ITSC)

Y2 - 4 November 2018 through 7 November 2018

ER -
