The objective of the optimised robust correlation is to find the global
extremum in a multi-dimensional search space that corresponds to the best
match between a pair of images. This search space is defined by the set of
all valid geometric and photometric transformations. In our implementation
of the proposed method the geometric transformations are translation,
scaling and rotation
.
Given a point in the multi-dimensional search space, a combined score function is
evaluated. This function and some of its properties are described in Section 2.1.
To find the global optimum of the score function a search technique based on random
exponential perturbations is employed. This optimisation method, which was shown to be
particularly suitable for the given search problem (see Section 3),
is discussed in Section 2.2. Finally in Section 2.3, a
quasi-random sampling technique used for estimating the value of the score function is
outlined.
Given a transformation t, a match score s is computed as a weighted sum of two scores, an
area score
and a grey level score
:
where
denotes a constant in the interval [0,1]
.
The area score is included to encourage large overlaps and is defined as
where
and
denote the sampling sets corresponding to the reference and
test images,
and
, respectively. The derivation of this function is
based on probabilistic arguments [8]. The grey level score, which measures
the similarity between the intensity distributions, is defined as
where
denotes the robust kernel,
the intensity transformation,
the
projection function and
the maximum response of the robust kernel.
The kernel function
used in the experiments reported in Section 3 is
a simple quadratic function and is defined as
where
and
denote the compared grey levels and
the cut-off distance.
The latter defines the width of the kernel and grey level differences greater than
this constant will not contribute to the final score.
The intensity transformation
implements a linear mapping between grey levels in
the reference and test images. It is defined as
where g denotes a grey level. Pixels are projected from the reference image to the
test image using an affine projection function
where p
denotes the pixel to be projected and
the result of the projection. The horizontal and vertical coordinates of the projection are
defined as
where
denotes the centre of gravity computed from the sampling set.
The search technique we employ is based on random exponential perturbations (see Algorithm 1). In each iteration, the transformation between reference and test image is perturbed by adding a random vector drawn from an exponential distribution. The new transformation is accepted only if the score was increased.
The approach described above is similar to simulated annealing [10] at zero temperature. Successful applications of simulated annealing within the areas of object detection and recognition have been reported in [9] and [1].
If the object under consideration can be adequately represented using only a fraction of the pixels then this is clearly advantageous since the execution time of the method is directly dependent on the number of samples used. In our application this is certainly the case since an image of a face contains a high degree of redundant information, and large regions can often be represented with only a few points.
A sampling technique commonly used in Monte Carlo integration is based on Sobol
sequences [15]. A Sobol sequence is a quasi-random sequence of numbers
maximally spread out over a given hyper-cube. The sequence is generated number-theoretically,
rather than randomly, and successive points at any stage fill in the gaps in the
previously generated distribution.
The use of Sobol sequences leads to faster convergence compared to uniformly distributed random
numbers since the fractional error of the approximation decreases as
instead
of
[15], where N is the number of samples and d the dimensionality
of the approximated function.
In Figure 1, three Sobol sequences are shown.
Figure 1: Two-dimensional Sobol sequences at three different sampling rates.
Combining the optimised robust correlation method with random sampling yields several benefits. The quasi-random nature of the process implies that the sampling points will not interfere with (and possibly cancel out) a specific frequency. In contrast, if sampling points are positioned on a grid, aliasing is much more likely. Furthermore, it is possible to continuously increase the sampling density -- until some convergence criterion is met -- and at the same time maintain approximately uniform density throughout the image. This is because the sampling points are avoiding the chance clustering that occurs with random points drawn from a uniform distribution. A direct consequence of this property is that Sobol sequences can be used for continuous multi-resolution matching. By varying the sampling density the image can be represented in different resolutions.