DALY is a dataset for training and testing temporal and spatial action localization. It contains high-quality annotations for 10 actions in 31 hours of video (3.3M frames), an order of magnitude more than standard action localization datasets at publication time.
[project page]
SURREAL is a synthetic dataset of humans performing actions in a 3D-rendered scene. It contains 6 million images, and networks trained on this synthetic data generalize reasonably well to real data. It uses MoCap sequences from CMU, combined with a wide range of 3D body shapes, clothing, and background textures.
[project page]
Simple-to-use video shot classification with a DenseTrack-based action classifier. Created for the AXES European project.
[github]