Dense Trajectories Video Description

You can download the source code and compile it under Linux.

Notes:

Before using the code, make sure that your video is read correctly. We use the same video decoding code as HOG3D; check the HOG3D website for reference. If you encounter problems, try converting your video to a different format (e.g., with mencoder). Since we use the ffmpeg library, various formats should work with the code. Please also note that our code is intended only for scientific or personal use. If you have problems running the code or find bugs, feel free to contact me.

An Example:

Output the help information with -h:

Usage: DenseTrack video_file [options]
Options:
  -h                        Display this message and exit
  -S [start frame]          The start frame to compute feature (default: S=0 frame)
  -E [end frame]            The end frame for feature computing (default: E=last frame)
  -L [trajectory length]    The length of the trajectory (default: L=15 frames)
  -W [sampling stride]      The stride for dense sampling feature points (default: W=5 pixels)
  -N [neighborhood size]    The neighborhood size for computing the descriptor (default: N=32 pixels)
  -s [spatial cells]        The number of cells in the nxy axis (default: nxy=2 cells)
  -t [temporal cells]       The number of cells in the nt axis (default: nt=3 cells)

Compute the features for a video file

DenseTrack myvideo.vob [options] | gzip > myfeatures.txt.gz

If no options are given, the features are computed using the default parameters.
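
For batch processing, the same pipeline can be driven from a script. The following Python sketch builds the command line from the flags listed in the help text and gzips the program's standard output. It assumes that the DenseTrack binary is on your PATH; the helper names build_command and extract_features are hypothetical and are not part of the released code.

```python
import gzip
import shutil
import subprocess


def build_command(video, binary="DenseTrack", **options):
    """Return the argument list for one DenseTrack run.

    Keyword names are the single-letter flags from the help text,
    so L=20 becomes "-L 20". The binary name is an assumption.
    """
    allowed = {"S", "E", "L", "W", "N", "s", "t"}
    cmd = [binary, video]
    for flag, value in options.items():
        if flag not in allowed:
            raise ValueError(f"unknown option: -{flag}")
        cmd += [f"-{flag}", str(value)]
    return cmd


def extract_features(video, out_path, **options):
    """Run DenseTrack and gzip its standard output into out_path."""
    cmd = build_command(video, **options)
    with subprocess.Popen(cmd, stdout=subprocess.PIPE) as proc, \
            gzip.open(out_path, "wb") as out:
        # Stream the output so large feature files never sit in memory.
        shutil.copyfileobj(proc.stdout, out)
    if proc.returncode != 0:
        raise RuntimeError(f"DenseTrack exited with status {proc.returncode}")
```

For example, extract_features("myvideo.vob", "myfeatures.txt.gz", L=20, W=10) would run DenseTrack with a trajectory length of 20 frames and a sampling stride of 10 pixels, leaving all other parameters at their defaults.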

The format of the computed features

Each feature is written on its own line, in the following format:

frameNum mean_x mean_y var_x var_y length scale Trajectory HOG HOF MBHx MBHy

The first seven elements describe the trajectory:

frameNum:     The frame on which the trajectory ends
mean_x:       The mean value of the x coordinates of the trajectory
mean_y:       The mean value of the y coordinates of the trajectory
var_x:        The variance of the x coordinates of the trajectory
var_y:        The variance of the y coordinates of the trajectory
length:       The length of the trajectory
scale:        The scale on which the trajectory is computed

The remaining elements are the five descriptors, concatenated in this order:

Trajectory:    2x[trajectory length] (default: 30 dimensions)
HOG:           8x[spatial cells]x[spatial cells]x[temporal cells] (default: 96 dimensions)
HOF:           9x[spatial cells]x[spatial cells]x[temporal cells] (default: 108 dimensions)
MBHx:          8x[spatial cells]x[spatial cells]x[temporal cells] (default: 96 dimensions)
MBHy:          8x[spatial cells]x[spatial cells]x[temporal cells] (default: 96 dimensions)
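
With the default parameters, each line therefore holds 7 + 30 + 96 + 108 + 96 + 96 = 433 values. The Python sketch below shows one way to compute the descriptor sizes from the formulas above and to split a line into named parts. It assumes that the values are separated by whitespace; the function names are hypothetical and are not part of the released code. The parameters passed to the parser must match the -L, -s and -t values used when the features were computed.

```python
import gzip

INFO_FIELDS = ("frameNum", "mean_x", "mean_y", "var_x", "var_y",
               "length", "scale")


def descriptor_dims(traj_len=15, spatial_cells=2, temporal_cells=3):
    """Return the size of each descriptor, in output order."""
    cells = spatial_cells * spatial_cells * temporal_cells
    return {
        "Trajectory": 2 * traj_len,
        "HOG": 8 * cells,
        "HOF": 9 * cells,
        "MBHx": 8 * cells,
        "MBHy": 8 * cells,
    }


def parse_feature_line(line, traj_len=15, spatial_cells=2, temporal_cells=3):
    """Split one output line into trajectory info and five descriptors."""
    values = [float(v) for v in line.split()]
    dims = descriptor_dims(traj_len, spatial_cells, temporal_cells)
    expected = len(INFO_FIELDS) + sum(dims.values())
    if len(values) != expected:
        # Guards against parameters that do not match the feature file.
        raise ValueError(f"expected {expected} values, got {len(values)}")
    feature = dict(zip(INFO_FIELDS, values))
    offset = len(INFO_FIELDS)
    for name, size in dims.items():
        feature[name] = values[offset:offset + size]
        offset += size
    return feature


def iter_features(path, **params):
    """Yield parsed features from a gzipped feature file, line by line."""
    with gzip.open(path, "rt") as handle:
        for line in handle:
            if line.strip():
                yield parse_feature_line(line, **params)
```

For example, looping over iter_features("myfeatures.txt.gz") yields one dictionary per trajectory, with the HOG descriptor available under the "HOG" key as a list of 96 numbers when the defaults are used.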

Citation

Please cite our paper if you use the code.

@inproceedings{wang:2011:inria-00583818:1,
  AUTHOR = {Heng Wang and Alexander Kl{\"a}ser and Cordelia Schmid and Cheng-Lin Liu},
  TITLE = {{Action Recognition by Dense Trajectories}},
  BOOKTITLE = {IEEE Conference on Computer Vision \& Pattern Recognition},
  YEAR = {2011},
  MONTH = Jun,
  PAGES = {3169-3176},
  ADDRESS = {Colorado Springs, United States},
  URL = {http://hal.inria.fr/inria-00583818/en}
}