next up previous
Next: 3 Experiments Up: Fast Face Localisation and Previous: 1 Introduction

2 Optimised Robust Correlation

  The objective of the optimised robust correlation is to find the global extremum in a multi-dimensional search space that corresponds to the best match between a pair of images. This search space is defined by the set of all valid geometric and photometric transformations. In our implementation of the proposed method the geometric transformations are translation, scaling and rotationgif. Given a point in the multi-dimensional search space, a combined score function is evaluated. This function and some of its properties are described in Section 2.1. To find the global optimum of the score function a search technique based on random exponential perturbations is employed. This optimisation method, which was shown to be particularly suitable for the given search problem (see Section 3), is discussed in Section 2.2. Finally in Section 2.3, a quasi-random sampling technique used for estimating the value of the score function is outlined.

2.1 Score Function

  Given a transformation t, a match score s is computed as a weighted sum of two scores, an area score tex2html_wrap_inline732 and a grey level score tex2html_wrap_inline734 :

displaymath38

where tex2html_wrap_inline736 denotes a constant in the interval [0,1]gif. The area score is included to encourage large overlaps and is defined as

displaymath43

where tex2html_wrap_inline742 and tex2html_wrap_inline744 denote the sampling sets corresponding to the reference and test images, tex2html_wrap_inline746 and tex2html_wrap_inline748 , respectively. The derivation of this function is based on probabilistic arguments [8]. The grey level score, which measures the similarity between the intensity distributions, is defined as

displaymath57

where tex2html_wrap_inline750 denotes the robust kernel, tex2html_wrap_inline752 the intensity transformation, tex2html_wrap_inline754 the projection function and tex2html_wrap_inline756 the maximum response of the robust kernel. The kernel function tex2html_wrap_inline750 used in the experiments reported in Section 3 is a simple quadratic function and is defined as

displaymath81

where tex2html_wrap_inline760 and tex2html_wrap_inline762 denote the compared grey levels and tex2html_wrap_inline764 the cut-off distance. The latter defines the width of the kernel and grey level differences greater than this constant will not contribute to the final score. The intensity transformation tex2html_wrap_inline752 implements a linear mapping between grey levels in the reference and test images. It is defined as

displaymath104

where g denotes a grey level. Pixels are projected from the reference image to the test image using an affine projection function tex2html_wrap_inline770 where p denotes the pixel to be projected and tex2html_wrap_inline774 the result of the projection. The horizontal and vertical coordinates of the projection are defined as

eqnarray116

where tex2html_wrap_inline776 denotes the centre of gravity computed from the sampling set.

2.2 Optimisation Method

  The search technique we employ is based on random exponential perturbations (see Algorithm 1). In each iteration, the transformation between reference and test image is perturbed by adding a random vector drawn from an exponential distribution. The new transformation is accepted only if the score was increased.

  algorithm184

The approach described above is similar to simulated annealing [10] at zero temperature. Successful applications of simulated annealing within the areas of object detection and recognition have been reported in [9] and [1].

2.3 Random Sampling

  If the object under consideration can be adequately represented using only a fraction of the pixels then this is clearly advantageous since the execution time of the method is directly dependent on the number of samples used. In our application this is certainly the case since an image of a face contains a high degree of redundant information, and large regions can often be represented with only a few points.

A sampling technique commonly used in Monte Carlo integration is based on Sobol sequences [15]. A Sobol sequence is a quasi-random sequence of numbers maximally spread out over a given hyper-cube. The sequence is generated number-theoretically, rather than randomly, and successive points at any stage fill in the gaps in the previously generated distribution. The use of Sobol sequences leads to faster convergence compared to uniformly distributed random numbers since the fractional error of the approximation decreases as tex2html_wrap_inline808 instead of tex2html_wrap_inline810  [15], where N is the number of samples and d the dimensionality of the approximated function. In Figure 1, three Sobol sequences are shown.

   figure232
Figure 1: Two-dimensional Sobol sequences at three different sampling rates.

Combining the optimised robust correlation method with random sampling yields several benefits. The quasi-random nature of the process implies that the sampling points will not interfere with (and possibly cancel out) a specific frequency. In contrast, if sampling points are positioned on a grid, aliasing is much more likely. Furthermore, it is possible to continuously increase the sampling density -- until some convergence criterion is met -- and at the same time maintain approximately uniform density throughout the image. This is because the sampling points are avoiding the chance clustering that occurs with random points drawn from a uniform distribution. A direct consequence of this property is that Sobol sequences can be used for continuous multi-resolution matching. By varying the sampling density the image can be represented in different resolutions.


next up previous
Next: 3 Experiments Up: Fast Face Localisation and Previous: 1 Introduction

Kenneth Jonsson
Mon Jul 14 14:56:41 BST 1997