Robot Learning Manipulation Action Plans by “Watching” Unconstrained Videos from the World Wide Web

From Yezhou Yang, Yi Li, Cornelia Fermüller and Yiannis Aloimonos:

In order to advance action generation and creation in robots beyond simple learned schemas we need computational tools that allow us to automatically interpret and represent human actions. This paper presents a system that learns manipulation action plans by processing unconstrained videos from the World Wide Web. Its goal is to robustly generate the sequence of atomic actions of seen longer actions in video in order to acquire knowledge for robots. The lower level of the system consists of two convolutional neural network (CNN) based recognition modules, one for classifying the hand grasp type and the other for object recognition. The higher level is a probabilistic manipulation action grammar based parsing module that aims at generating visual sentences for robot manipulation.


The list of the grasping types.

Experiments conducted on a publicly available unconstrained video dataset show that the system is able to learn manipulation actions by “watching” unconstrained videos with high accuracy.... (article at Kurzweilai.net) (original paper)
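To make the two-level design in the abstract more concrete, here is a minimal sketch of how the outputs of the two recognition modules might be combined with a higher-level prior over actions. It is not the authors' grammar-based parser. The label names, the probabilities, the `action_prior` table and the `best_triple` function are all hypothetical and chosen only for illustration.

```python
from itertools import product

# Hypothetical outputs of the two lower-level recognizers for one video clip.
# In the described system these would come from the two CNN modules.
grasp_probs = {"power-small": 0.7, "precision-small": 0.2, "rest": 0.1}
object_probs = {"knife": 0.6, "bowl": 0.3, "tomato": 0.1}

# Hypothetical prior P(action | grasp, object). This table is a stand-in for
# the higher-level module, NOT the paper's probabilistic grammar.
action_prior = {
    ("power-small", "knife"): {"cut": 0.9, "stir": 0.1},
    ("power-small", "bowl"): {"stir": 0.8, "pour": 0.2},
    ("precision-small", "tomato"): {"place": 1.0},
}


def best_triple(grasp_probs, object_probs, action_prior):
    """Return the (grasp, object, action, score) with the highest joint score."""
    best = None
    for (g, pg), (o, po) in product(grasp_probs.items(), object_probs.items()):
        # Pairs with no entry in the prior contribute no candidate actions.
        for a, pa in action_prior.get((g, o), {}).items():
            score = pg * po * pa
            if best is None or score > best[3]:
                best = (g, o, a, score)
    return best


grasp, obj, action, score = best_triple(grasp_probs, object_probs, action_prior)
print(f"{action}({grasp}, {obj}) score={score:.3f}")
```

The point of the sketch is only that uncertain low-level predictions can be re-ranked by higher-level knowledge about which actions are plausible for a given grasp and object. The system described above does this with a probabilistic manipulation action grammar rather than a lookup table.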
