A Passive Learning Sensor Architecture for Multimodal Image Labeling: An Application for Social Robots

Marco Gutiérrez; Luis Manso; Harit Pandya; Pedro Núñez

doi:10.3390/s17020353

College of Engineering and Physical Sciences

Research output: Contribution to journal › Article › peer-review

Abstract

Object detection and classification have countless applications in human-robot interaction systems and are necessary skills for autonomous robots that perform tasks in household scenarios. Despite great advances in deep learning and computer vision, social robots performing non-trivial tasks usually spend most of their time finding and modeling objects. Working in real scenarios means dealing with constant environmental changes and relatively low-quality sensor data due to the distance at which objects are often found. Ambient intelligence systems equipped with different sensors can also benefit from the ability to find objects, enabling them to inform humans about their location. For these applications to succeed, systems need to detect objects that may contain other objects while working with relatively low-resolution sensor data. A passive learning architecture for sensors has been designed to take advantage of multimodal information obtained using an RGB-D camera and trained semantic language models. The main contribution of the architecture is improved sensor performance under conditions of low resolution and high light variation, achieved by combining image labeling and word semantics. The tests performed on each stage of the architecture compare this solution with current research labeling techniques for the application of an autonomous social robot working in an apartment. The results obtained demonstrate that the proposed sensor architecture outperforms state-of-the-art approaches.

Original language	English
Article number	353
Journal	Sensors
Volume	17
Issue number	2
DOIs	https://doi.org/10.3390/s17020353
Publication status	Published - 11 Feb 2017

Bibliographical note

© 2017 by the authors; licensee MDPI, Basel, Switzerland. This article is an open access
article distributed under the terms and conditions of the Creative Commons Attribution
(CC BY) license (http://creativecommons.org/licenses/by/4.0/).

Access to Document

10.3390/s17020353 (Licence: CC BY 3.0)

A Passive Learning Sensor Architecture for Multimodal Image Labeling
Final published version, 4.06 MB (Licence: CC BY 3.0)

Cite this

@article{f16f42f734f849849f8d6fe8886827e2,

title = "A Passive Learning Sensor Architecture for Multimodal Image Labeling: An Application for Social Robots",

abstract = "Object detection and classification have countless applications in human-robot interacting systems. It is a necessary skill for autonomous robots that perform tasks in household scenarios. Despite the great advances in deep learning and computer vision, social robots performing non-trivial tasks usually spend most of their time finding and modeling objects. Working in real scenarios means dealing with constant environment changes and relatively low-quality sensor data due to the distance at which objects are often found. Ambient intelligence systems equipped with different sensors can also benefit from the ability to find objects, enabling them to inform humans about their location. For these applications to succeed, systems need to detect the objects that may potentially contain other objects, working with relatively low-resolution sensor data. A passive learning architecture for sensors has been designed in order to take advantage of multimodal information, obtained using an RGB-D camera and trained semantic language models. The main contribution of the architecture lies in the improvement of the performance of the sensor under conditions of low resolution and high light variations using a combination of image labeling and word semantics. The tests performed on each of the stages of the architecture compare this solution with current research labeling techniques for the application of an autonomous social robot working in an apartment. The results obtained demonstrate that the proposed sensor architecture outperforms state-of-the-art approaches.",

author = "Marco Guti{\'e}rrez and Luis Manso and Harit Pandya and Pedro N{\'u}{\~n}ez",

note = "{\textcopyright} 2017 by the authors; licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).",

year = "2017",

month = feb,

day = "11",

doi = "10.3390/s17020353",

language = "English",

volume = "17",

journal = "Sensors",

issn = "1424-8239",

publisher = "MDPI AG",

number = "2",

}

TY - JOUR

T1 - A Passive Learning Sensor Architecture for Multimodal Image Labeling: An Application for Social Robots

AU - Gutiérrez, Marco

AU - Manso, Luis

AU - Pandya, Harit

AU - Núñez, Pedro

N1 - © 2017 by the authors; licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).

PY - 2017/2/11

Y1 - 2017/2/11

AB - Object detection and classification have countless applications in human-robot interacting systems. It is a necessary skill for autonomous robots that perform tasks in household scenarios. Despite the great advances in deep learning and computer vision, social robots performing non-trivial tasks usually spend most of their time finding and modeling objects. Working in real scenarios means dealing with constant environment changes and relatively low-quality sensor data due to the distance at which objects are often found. Ambient intelligence systems equipped with different sensors can also benefit from the ability to find objects, enabling them to inform humans about their location. For these applications to succeed, systems need to detect the objects that may potentially contain other objects, working with relatively low-resolution sensor data. A passive learning architecture for sensors has been designed in order to take advantage of multimodal information, obtained using an RGB-D camera and trained semantic language models. The main contribution of the architecture lies in the improvement of the performance of the sensor under conditions of low resolution and high light variations using a combination of image labeling and word semantics. The tests performed on each of the stages of the architecture compare this solution with current research labeling techniques for the application of an autonomous social robot working in an apartment. The results obtained demonstrate that the proposed sensor architecture outperforms state-of-the-art approaches.

UR - https://www.mdpi.com/1424-8220/17/2/353

U2 - 10.3390/s17020353

DO - 10.3390/s17020353

M3 - Article

SN - 1424-8239

VL - 17

JO - Sensors

JF - Sensors

IS - 2

M1 - 353

ER -