YouTube launches labelled video dataset

As proven by the likes of Netflix, analytics is fundamental to success in video-on-demand (VOD), and Google has thrown its hat into the ring with YouTube-8M, a dataset of eight million YouTube video URLs.

The company argues that many recent breakthroughs in machine learning and machine perception have come from the availability of large labelled datasets, which it says has significantly accelerated research in image understanding. It adds that one of the key bottlenecks for further advances in this area has been the lack of real-world video datasets with the same scale and diversity as image datasets.

Google believes that video analysis provides even more information for detecting and recognising objects and for understanding human actions and interactions with the world, and that improving video understanding can lead to better video search and discovery.

Explaining the mission to improve analysis with YouTube-8M, software engineers Sudheendra Vijayanarasimhan and Paul Natsev said: “[YouTube-8M] represents a significant increase in scale and diversity compared to existing video datasets. For example, Sports-1M, the largest existing labelled video dataset we are aware of, has around one million YouTube videos and 500 sports-specific classes; [YouTube-8M] represents nearly an order of magnitude increase in both number of videos and classes.”

The engineers are confident that the new dataset can significantly accelerate research on video understanding, as they believe it enables researchers and students without access to big data or big machines to do their research at an unprecedented scale. The hope is that the work will spur exciting new research on video modelling architectures and representation learning, especially approaches that deal effectively with noisy or incomplete labels, transfer learning and domain adaptation.



Source: http://www.rapidtvnews.com/2016100344533/youtube-launches-labelled-video-dataset.html#axzz4M3UqJAcx
