Object-Centric Approach to Prediction and Labeling of Manipulation Tasks
ICRA 2018 Spotlight Video
Interactive Session Thu PM Pod F.7
Authors: Chen, Ee Heng; Burschka, Darius
Title: Object-Centric Approach to Prediction and Labeling of Manipulation Tasks
Abstract:
We propose an object-centric framework to label and predict human manipulation actions from observations of object trajectories in 3D space. The goal is to lift low-level sensor observations to a context-specific human vocabulary. The low-level visual sensory input from a depth camera is processed into high-level descriptive action labels using a directed action graph representation. The graph is built on the concepts of pre-computed Location Areas (LA), regions within a scene where an action typically occurs, and Sector-Maps (SM), reference trajectories between the LAs. The framework consists of two stages: an offline teaching phase for graph generation, and an online action recognition phase that maps the current observations onto the generated graph. This graph representation allows the framework to predict the most probable action from the observed motion in real time and to adapt its structure whenever a new LA appears. Furthermore, the descriptive action labels not only enable a better exchange of information between a human and a robot, but also allow the robot to perform high-level reasoning. We present experimental results on real human manipulation actions, using a system designed with this framework, to show the prediction and labeling performance that can be achieved.
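To make the abstract's structure concrete, here is a minimal, hypothetical sketch (not the authors' implementation) of a directed action graph whose nodes are Location Areas and whose edges carry Sector-Maps. The class names, the nearest-trajectory matching, and the toy data are all assumptions for illustration; the real system works on depth-camera observations and the paper's own Sector-Map matching.

from dataclasses import dataclass, field


@dataclass
class SectorMap:
    """Reference trajectory between two Location Areas, with an action label."""
    label: str        # e.g. "place" (hypothetical label)
    reference: list   # list of 3D waypoints [(x, y, z), ...]


@dataclass
class ActionGraph:
    # edges[src_la][dst_la] -> SectorMap
    edges: dict = field(default_factory=dict)

    def teach(self, src_la, dst_la, sector_map):
        """Offline teaching phase: extend the graph with a demonstrated action."""
        self.edges.setdefault(src_la, {})[dst_la] = sector_map

    def predict(self, src_la, observed):
        """Online phase: return the most probable action label for a partial
        trajectory leaving src_la, here by nearest reference trajectory."""
        def distance(traj, ref):
            # Crude point-to-trajectory distance; a stand-in for the actual
            # Sector-Map matching described in the paper.
            return sum(
                min(sum((a - b) ** 2 for a, b in zip(p, q)) for q in ref)
                for p in traj
            ) / max(len(traj), 1)

        candidates = self.edges.get(src_la, {})
        if not candidates:
            return None
        best = min(candidates.values(),
                   key=lambda sm: distance(observed, sm.reference))
        return best.label


# Toy usage: teach one action offline, then predict from a partial observation.
graph = ActionGraph()
graph.teach("table", "shelf",
            SectorMap("place", [(0, 0, 0), (0.5, 0.2, 0.3), (1, 0.4, 0.6)]))
print(graph.predict("table", [(0.1, 0.05, 0.05), (0.45, 0.18, 0.28)]))  # -> "place"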