To measure the running time, set the TIMEFLG environment variable. A small table will then be sent to the standard output. Note: in bash use the export command, in tcsh the setenv command.
The following sections briefly describe each program in the package.
The default detector type is Harris-Laplace, but it can be changed with the -dtype option. For each detected point an orientation can be computed (it is stored and later used by ComputeDescriptor) with the -angle option. Optional affine region estimation can be invoked with -affine. To process a set of images, one can specify a list file instead of a single image. The list file must be an ASCII text file in which each line contains the file name of an image with a full or relative path. I suggest avoiding special non-ASCII characters as well as spaces in file names. The output of such a detection is still a single file, but a file identifier is saved together with each detection. When using a list file, if you also plan to use -angle or -affine, do not specify them here but later in ComputeDescriptor instead. To indicate that a list file has been specified instead of a single image, start the first argument with an @ sign (no space between @ and the list file name).
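A list file of the kind described above can be generated with a short script. The following is only a minimal sketch in Python; the helper names is_safe_name, write_list_file and list_argument are hypothetical and not part of the package. It rejects file names containing whitespace or non-ASCII characters, as advised above, and builds the @-prefixed first argument.

```python
from pathlib import Path


def is_safe_name(path: str) -> bool:
    """True if the path is pure ASCII and contains no whitespace."""
    return path.isascii() and not any(ch.isspace() for ch in path)


def write_list_file(image_paths, list_path):
    """Write one image path per line; refuse names that may confuse the tools."""
    bad = [p for p in image_paths if not is_safe_name(p)]
    if bad:
        raise ValueError(f"unsafe file names: {bad}")
    Path(list_path).write_text("\n".join(image_paths) + "\n", encoding="ascii")


def list_argument(list_path):
    """Return the first argument for Detect: '@' directly followed by the list file name."""
    return "@" + str(list_path)
```

For instance, after write_list_file(["img/a.pgm", "img/b.pgm"], "a.lst"), the value of list_argument("a.lst") is @a.lst, which can be passed as the first argument of Detect.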
We have made some changes to the implementation of Harris-Laplace and Hessian-Laplace. In our experience the new implementation of Harris-Laplace improves the results on average; however, the old implementation is still available with the -old option.
Additional parameters. The scale selection for Harris-Laplace, Hessian-Laplace and LoG can be turned off with -noscale. This switches the detectors from scale-invariant (or, more precisely, scale-covariant) mode to multi-scale mode. The detector is still run on the whole scale pyramid, but no scale selection is applied, so one can expect many more detections: the same (x,y) location can be selected many times with different scale values. The minimum and maximum scale can also be set. (The new implementation does not yet support minscale=maxscale.) E.g. to get the standard Harris detector on the original image, run:
Detect -old -dt har -noscale -minsc 1 -maxsc 1 a.pgm a.har
The scale step, i.e. the multiplier between consecutive levels of the pyramid, can also be set with -scalestep; it is highly recommended to keep this setting within reasonable bounds.
The detector type dense helps to simulate a dense representation by creating a set of descriptors along a grid on the image. The grid is defined on each scale between minscale and maxscale (quantized with the scalestep multiplier). The cell width and height are computed as currentscale*gridstep, where gridstep is a parameter. The minimum scale can also be set with -msbysize.
For more options and the precise syntax, see Detect -help. For examples, see Section 6.
The default descriptor is SIFT; it can be changed with the -dtype option. Optional angle (rotation invariance) and affine (viewpoint invariance) computation can be requested to precede the descriptor computation. (If you have already done that in Detect, it is unnecessary to do it again!) If the detection was done on a list file, you must specify the same file here with the same syntax.
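To illustrate how such a dense grid grows with the parameters, here is a rough sketch in Python. It is not the code used by Detect: the function names are hypothetical, the scale levels are assumed to be minscale multiplied repeatedly by scalestep, and the grid points are assumed to sit at the cell centres.

```python
def scale_levels(minscale, maxscale, scalestep):
    """Scales minscale, minscale*scalestep, ... that do not exceed maxscale."""
    if scalestep <= 1.0:
        raise ValueError("scalestep must be greater than 1")
    levels = []
    scale = minscale
    while scale <= maxscale * (1 + 1e-9):  # small tolerance for rounding
        levels.append(scale)
        scale *= scalestep
    return levels


def dense_grid(width, height, minscale, maxscale, scalestep, gridstep):
    """Return (x, y, scale) tuples on a regular grid for every scale level."""
    points = []
    for scale in scale_levels(minscale, maxscale, scalestep):
        cell = scale * gridstep  # cell width and height at this scale
        y = cell / 2.0           # assumption: points sit at cell centres
        while y < height:
            x = cell / 2.0
            while x < width:
                points.append((x, y, scale))
                x += cell
            y += cell
    return points
```

With this sketch an 8x8 image with minscale 1, maxscale 2, scalestep 2 and gridstep 4 yields four points at scale 1 and one point at scale 2, which shows how quickly the number of descriptors falls as the scale grows.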
SIFT descriptor. The implementation is inspired by David Lowe’s implementation [1] of SIFT descriptors. The present version of ComputeDescriptor contains a reimplemented algorithm; however, our old version, which was based on David Lowe’s code, is still available with the -old option. By default, SIFT is computed on a 4x4 grid, and for each cell the orientation histogram has 8 bins. These parameters can now be changed with -siftis and -siftos. The default settings lead to a 128-dimensional descriptor. The -siftws parameter is the window size to which the detected patch is downscaled before SIFT computation. This is necessary for performance reasons. Lowering this setting further can speed up computation, especially for detections with large scales, but can also significantly decrease the quality of the descriptors. -siftscone specifies the patch size (neighborhood) on which SIFT is computed if the scale of the detection is 1 (only modify this if you know what you are doing!). The option -unnormalized switches off the normalization of the descriptor; it is only useful if you would like to implement your own normalization. Otherwise the descriptors are normalized to unit length.
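The relation between the grid parameters and the descriptor length, and what normalization to unit length means, can be sketched in a few lines of Python. This is an illustration only, not the code of ComputeDescriptor. It assumes that -siftis is the number of cells per side and -siftos the number of orientation bins (following the order in which the options are named above), and that "unit length" refers to the Euclidean norm. The function names are hypothetical.

```python
import math


def sift_dimension(siftis=4, siftos=8):
    """Descriptor length for a siftis x siftis spatial grid with siftos orientation bins."""
    return siftis * siftis * siftos


def normalize_unit_length(descriptor):
    """Scale a descriptor to unit Euclidean length; all-zero vectors are returned unchanged."""
    norm = math.sqrt(sum(v * v for v in descriptor))
    if norm == 0.0:
        return list(descriptor)
    return [v / norm for v in descriptor]
```

With the defaults, sift_dimension() gives 4*4*8 = 128. A function like normalize_unit_length is the kind of step one would replace with one's own normalization when working with -unnormalized output.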
Local Jets. The computation is based on earlier code by Krystian Mikolajczyk [2]. Steerable filters are computed up to the 4th order, therefore the length of such a descriptor is 15. The components are:
1: the pixel value
2,3: first derivatives (Dx, Dy)
4,5,6: second derivatives (Dxx, Dxy, Dyy)
7,8,9,10: third derivatives (Dxxx, Dxxy, Dxyy, Dyyy)
11,12,13,14,15: fourth derivatives (Dxxxx, Dxxxy, Dxxyy, Dxyyy, Dyyyy)
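The length of 15 follows from counting the derivatives of each order: order k contributes k+1 components, so 1+2+3+4+5 = 15. The short Python sketch below (the function name is hypothetical) enumerates the components in the order of the list above, which can be handy for labelling the columns of a local jet descriptor.

```python
def local_jet_components(max_order=4):
    """List component names in order: pixel value, Dx, Dy, Dxx, Dxy, Dyy, ..."""
    names = []
    for order in range(max_order + 1):
        if order == 0:
            names.append("pixel value")
            continue
        # For a given order, go from all-x to all-y, e.g. Dxxx, Dxxy, Dxyy, Dyyy.
        for ny in range(order + 1):
            names.append("D" + "x" * (order - ny) + "y" * ny)
    return names
```

Note that Python indices start at 0, so component number n of the list above is local_jet_components()[n - 1].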
Spin Images. The implementation is inspired by Svetlana Lazebnik’s code [3]. Our reimplementation provides the following parameters: spinis and spinds determine the number of bins in the intensity and distance dimensions. The spinws parameter, like siftws, is the window size to which the detected patch is downscaled before descriptor computation, to achieve better performance. Lowering this setting further can speed up computation, especially for detections with large scales, but can also significantly decrease the quality of the descriptors.
The program has options for different colors and pen widths. If the detection was done on a list of images, the list file can also be specified here (in the same way), but the detection result has to be preselected for only one image (see SelectCorners). The second argument can be the output of Detect, ComputeDescriptor or SelectCorners.
Create thumbnails. DrawCorners can be used to create an image with small image patches extracted from the detected locations. This feature is available via the -th_... parameters. E.g.:
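To make the role of the two bin counts concrete: a spin image is a two-dimensional histogram over (distance from the patch centre, intensity). The Python sketch below is a deliberately simplified version with hard binning, written only for illustration. It is not the algorithm of ComputeDescriptor or of the original code, it assumes intensities already scaled to [0, 1], and the function name is hypothetical.

```python
import math


def spin_image(patch, spinis, spinds):
    """Hard-binned 2D histogram with spinds distance bins (rows) and spinis intensity bins (columns).

    patch is a non-empty square list of rows with intensities in [0, 1].
    """
    size = len(patch)
    if size == 0:
        raise ValueError("patch must not be empty")
    centre = (size - 1) / 2.0
    max_dist = centre * math.sqrt(2.0) or 1.0  # distance from the centre to a corner
    hist = [[0.0] * spinis for _ in range(spinds)]
    for r, row in enumerate(patch):
        for c, value in enumerate(row):
            dist = math.hypot(r - centre, c - centre) / max_dist
            d_bin = min(int(dist * spinds), spinds - 1)
            i_bin = min(int(value * spinis), spinis - 1)
            hist[d_bin][i_bin] += 1.0
    total = float(size * size)
    # Normalize so that all entries sum to 1.
    return [[count / total for count in row] for row in hist]
```

In this sketch the descriptor has spinis*spinds entries, so raising either bin count increases its length accordingly.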
Detect @a.lst all.har
DrawCorners -th_on @a.lst all.har a.gif
will run Harris-Laplace on multiple images, followed by the creation of one huge image with all the detected patches. The number of patches per line, the distance between patches and the maximum number of patches can all be controlled by different options.
Possible selection criteria:
SelectCorners can write its output to the same file as its input. In that case the file is overwritten and its previous content is lost (so make a backup if you are not sure). For the precise options see SelectCorners -help, and also see the examples in Section 6.
Dumps the content of a file (output of Detect or ComputeDescriptor) to the standard output in ASCII format. This is basically one way to convert the data to text format. With the -size option the program only prints the number of interest points.
Creates an ASCII text file where each line corresponds to a detection. The structure of a line is the following: x y scale fileid m1,1 m1,2 m2,2 d0 d1 d2 ... dn
where the first 4 fields (location, scale, file identifier) always appear and the others can be requested by the options: -add affine for the affine second moment matrix (m1,1 m1,2 m2,2) and -adddesc for the descriptor values (dx).
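Reading such a file back is straightforward. The following Python sketch parses one line into a dictionary. It assumes that the fields are separated by whitespace, that fileid is an integer, and that the caller knows which optional fields were requested when the file was written; the function name and the two flags are hypothetical.

```python
def parse_detection_line(line, has_affine=False, has_descriptor=False):
    """Parse 'x y scale fileid [m11 m12 m22] [d0 ... dn]' into a dict."""
    fields = line.split()
    if len(fields) < 4:
        raise ValueError("expected at least x, y, scale and fileid")
    record = {
        "x": float(fields[0]),
        "y": float(fields[1]),
        "scale": float(fields[2]),
        "fileid": int(float(fields[3])),  # tolerate '2' as well as '2.0'
    }
    rest = fields[4:]
    if has_affine:
        if len(rest) < 3:
            raise ValueError("affine matrix requested but fewer than 3 values found")
        record["affine"] = tuple(float(v) for v in rest[:3])  # (m11, m12, m22)
        rest = rest[3:]
    if has_descriptor:
        record["descriptor"] = [float(v) for v in rest]
    return record
```

Since only three values of the second moment matrix are stored, the matrix is presumably symmetric, i.e. m2,1 equals m1,2; this is an inference from the line format, not something the format itself states.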
This program needs two arguments:
The converter creates a struct array with fields:
x: the horizontal coordinate of the center of the interest point
y: the vertical coordinate of the center of the interest point
scale: the detected scale level
angle: the computed orientation (dominant gradient)
fileid: the file id (numbered from 0, e.g. 2 means the 3rd image in the list file used for Detect)
affine: the normalized 2nd moment matrix of the affine estimation
descriptor: vector of dimension 0, 15 or 128 for no descriptor, local jet or SIFT computation, respectively
note: string, the image name; in case of a list file, only the variable part of the names