Implementational Improvements for Active Region Models

 

D.C. Alexander and B.F. Buxton

Dept. Computer Science, University College London,

Gower Street, London. WC1E 6BT.

Abstract

In this paper, existing region based active contours – active region models (ARMs) -are investigated. We have found ARMs to be an effective solution to the problem of tracking the boundaries of country lanes in sequences of images of the type we would expect to obtain from a camera mounted on the front of an autonomous cross country vehicle, see [1]. However, we have found some problems with existing implementations. This paper describes some of these problems and some modifications that have proved effective solutions when applying ARMs within this domain. The new models are shown to compare favourably in terms of computation time and convergence properties.

1. Introduction

In recent years, active contour models (ACMs) have emerged as an effective mechanism for segmentation and tracking of object boundaries in images or image sequences, [3,7,9]. Central to the implementation of any ACM is the minimisation of a functional that describes the energy of the contour. This energy functional typically has two components, an internal energy component that applies shape constraints to the model, and an external energy derived from the data to which the model is being applied. Since the original formulation, in [7], many variations and improvements have been suggested both in the form of the energy functional itself and the techniques used for minimisation, [2,3,8,12]. However, ACMs in general are sensitive to initial conditions and, although judicious use of a scale space can help, see [9], are only really effective when the initial estimate of the contour in the image is good. The older segmentation technique of region growing, [14], has a distinct advantage here. The initial region specified in the image, as a starting point for the algorithm, does not have to approximate the true boundary in such a precise way. So long as the data contained in the initial region is representative of the region as a whole, a region growing algorithm should yield a good approximation to the true region. The disadvantage of region growing strategies is the lack of built in constraints on connectivity of the extracted region and smoothness of its boundary. The resultant sensitivity to noise causes the boundary of the extracted region to be highly irregular and regions extracted often include holes. Techniques such as mathematical morphology can help but at the cost of further processing which introduces its own artefacts and increases computation times.

Recently, algorithms have been developed, [5,10,13], which combine the techniques of ACMs and region growing. The external part of the ACM energy functional is replaced by a term derived from local region information. Points on the contour are allowed to expand or contract according to the fit between local region information and a global model of the region derived from the initial configuration. The resulting active region model (ARM) retains the desirable characteristics of both techniques. Regularity of the contour can be controlled by the shape constraints in the energy functional; and, by examining local region information, boundary points are able to traverse large homogeneous areas of the image providing robustness to the initial configuration.

In the next section we review the ARM implementation of Ivins and Porrill – the "statistical snake", [5]. The simplicity of this implementation makes it preferable to the "anticipating snake" of Ronfard, [10], as a starting point. Also, although in essence the algorithms are quite similar, the fact that the statistical snake is already geared towards tracking applications (of particular interest to the authors, see [1]) makes it preferable to the "region competition" algorithm of Zhu and Yuille, [13]. Section 3 presents simple heuristic solutions to some practical problems associated with current ARM implementations. In section 4, some analysis of the energy functional used in [5] is provided and some changes to the original suggested. In section 5, we discuss alternative energy minimisation techniques that have been used to drive ACMs, in particular, greedy algorithms, and show how they can be used with ARMs. Some experiments and results are detailed in section 6 and conclusions are drawn in section 7.

2. The Statistical Snake

Ivins and Porrill’s statistical snake, [5,6] is based, to a large extent, on the original ACM implementation described in [8]. An energy functional, E, is specified:

.

(1)

The first two terms in (1) correspond respectively to the tension and stiffness energies of the contour model u, parametrised by l , and together comprise the internal energy. The third term is the external energy derived from the image data. G is a goodness functional that returns a measure of the likelihood that the pixel, indexed by x and y in the image, I, is part of the region of interest. R is the interior of the contour and a , b and r are parameters used to weight these three energy terms. Thus as the energy is minimised, the contour deforms to enclose as many pixels with positive goodness as possible while excluding those with negative goodness. The algorithm is initially given a region of the image known to belong to the object of interest. This seed region serves two purposes. It is used as the initial configuration of the model, and to construct a statistical model of the attributes (e.g., intensity, colour, texture) of the data comprising the region as a whole from which the goodness functional in (1) is derived, see [1,6].

In practice the contour is described by a discrete set of elements on the boundary, u(l ), l =1, …, n. The co-ordinates of each element are stored to sub-pixel accuracy. The energy functional is never explicitly calculated but variational calculus is used to derive minimisation forces on each element of the boundary, see [5,7]. In this way, the force on each element is found to be:

(2)

where t is the timestep or viscosity factor. The force due to the external energy is not formally derived but is intuitively defined, as for Cohen’s pressure force, [3], to act along the normal to the contour with magnitude proportional to a local goodness measure. In order to decrease the sensitivity to noise, the goodness of an average pixel value over some neighbourhood is taken in preference to simply taking the goodness of the pixel in which the contour element lies. Ivins and Porrill take the average over the lines joining the element to each of its neighbours (e.g. found by a Bresenham line drawing algorithm). In addition, as originally suggested by Cohen, [3], to aid stability of the model, the force vector, d u is normalised to unity if its magnitude is greater than one. The derivatives in (2) are approximated using finite differences.

Contour elements are updated explicitly, as opposed to the semi-implicit method of Kass, et al, [7], i.e. the position of each element is updated as soon as it is calculated. The explicit method has some computational advantage but the main advantage is the ease with which hard constraints, such as self-intersection avoidance, can be encoded. Neither method is guaranteed to converge, [11], but the semi-implicit method is more stable allowing termination conditions to be applied more effectively, see section 3.2.

2.1. Avoiding Self-Intersection

For this ARM, self-intersection is a serious problem since, if it should occur, the pressure force is reversed causing contraction forces to become expansion forces and vice versa often resulting in uncontrolled expansion. Self-intersection is avoided by keeping a 2D-accumulator array of positions in the image into which sections of the contour cannot move without causing intersection. Before a move is made, it is checked against the accumulator. If the move causes self-intersection, it is reversed, i.e., the vector calculated in (2) is negated; if this move also causes an intersection, the move is cancelled and the element remains where it is. Full details can be found in [6].

2.2. Addition and Deletion of Elements

Whilst deforming from the initial configuration to the true boundary, some parts of the contour may become significantly longer and other parts may become much shorter. In a section of the contour that is expanding, elements will tend to become further and further apart and once the boundary is reached, the resolution with which it can be represented will be low and so the contours approximation will be poor. Also, the tension force greatly increases and eventually outweighs the pressure force stopping the expanding contour before the true boundary is reached. This situation is resolved by adding new elements to the contour whenever a threshold distance between two consecutive elements is exceeded. Conversely, as a portion of the contour contracts, elements can become bunched needlessly increasing computation. This is similarly resolved by setting a lower threshold, below which one of the elements is removed from the contour.

2.3. Problems

In practice, we have found some problems with this ARM.

3. Some Heuristic Improvements

In this section heuristic solutions to the problems described above are proposed.

3.1. Deadlock

The solution to the problem of deadlock is remarkably simple. Rather than reversing the move, we propose that the offending element should simply be removed from the contour. Obviously, it is possible that the deletion of such an element itself could cause another self-intersection. This must be checked, and, if such a situation arises, we revert to the original strategy described in 2.1. This has proved to be extremely effective in almost all situations.

3.2. Non-Convergence

The ability to detect the point at which no further improvement to an ARM’s approximate boundary will occur is very desirable. Firstly, if no termination condition is employed, we must ensure that the fixed number of iterations used is at least comparable to the maximum number of iterations that would be required to find a good approximation to the true boundary for any image in the application domain. Secondly, when a system goes into oscillation about its ideal state, it will pass through many different configurations, some of which will be better than others and with a fixed number of iterations the solution eventually arrived at is essentially random within certain bounds. The ideal termination condition is full convergence, i.e., when further iterations have no effect on the configuration.

When the variational approach is used to minimise the energy of the contour, full convergence is hard to achieve. If a semi-implicit approach is used, it has been reported that, under favourable conditions, most of the elements will converge, but, cycles and chaotic behaviour can occur in part or all of the contour, [2,11]. When the update scheme is explicit, convergence very hard to attain. The dependence of convergence on the timestep, t , in (2), has been noted, [3,6,11], and empirical estimates of t ~ 0.01 to ensure convergence have been suggested, [6]. This increases the number of iterations required to reach the region boundary considerably and is not practical in many situations. Rather than seeking full convergence of ARMs based on the variational approach, in practice we need to find some other termination condition. An effective termination condition should allow us to detect good configurations of our model whilst keeping the value of the timestep close to unity so as to allow the model to expand and contract quickly when it is far away from the region boundary.

Figure 1 Typical graphs of the average force magnitude versus iteration number. The internal parameters a , b , and r (see(1)) have the values given in the legend, in that order.

A good candidate is the average force magnitude (AFM) around the elements of the contour. This approximates the energy change and so should decrease as the contour approaches its minimum energy configuration. Using the average value avoids complications introduced when elements are added to or deleted from the contour. In practice, the graph of the AFM versus iteration number is roughly monotonic but noisy, making selection of a suitable threshold value difficult. This problem is compounded by the fact that suitable values are highly dependent (though in no obvious mathematical way) on the value of the internal model parameters and, to some extent, on properties of the region of interest, see figure 1. For this reason, we adopt an adaptive threshold scheme. As we are tracking a single object through a temporal sequence of images, we assume that conditions do not change drastically and suitable values for the threshold remain fairly constant throughout the sequence. If the object or background changes significantly, or the internal parameter settings are altered, the scheme must be restarted from the beginning. The scheme proceeds as follows:

4. The Energy Functional

In this section we undertake an analysis of the Ivins and Porrill ARM energy functional in an attempt to improve stability and overall performance of the model. We are primarily concerned with correctly calculating the external energy (pressure force) and do not attempt to improve accuracy or stability by examining the internal energy. It should be noted, however, that several alternative internal energy formulations, [4,8,12,13], have been proposed that may improve results.

4.1. Calculating the Pressure Force

Consider an idealised snake, u(l ), continuously parametrised by the real-valued l . The pressure energy of u(l ), when applied to an image, I, is given by

.

(3)

If we displace the contour by a small amount d u(l ) and make a Taylor expansion we have:

.

(4)

The rate of change of pressure energy with respect to the contour itself is the sum of the energy around the contour. Thus, the first approximation to the change of pressure energy caused by a small displacement of the contour is the integral of the goodness around the contour, multiplied by the displacement. If the displacement is local to one part of the contour (c.f. moving a single element), we perform the integral only over the part of the contour that is being displaced. The magnitude of the force is thus

,

(5)

where d C is the parameter range over the portion of the contour that is displaced. In practice when l is discrete, we approximate u by a polygon around which we can still sum the goodness. To find the pressure force on an individual discrete element, we thus look at the properties of the image along the lines joining the element to its neighbours. An average along these lines is taken and used as the value at the element. This follows Ivins and Porrill’s scheme apart from the fact that (5) suggests that the goodness itself should be averaged rather than the pixel values.

4.2. Continuous Goodness Functionals

Another potential cause of oscillation is the discrete nature of the calculation of the pressure force line integral, (5). Despite the fact that the snake element positions are stored to sub-pixel accuracy, discreteness is introduced into the system by the grid based line drawing algorithm used to find the set of pixels to be averaged over. A move of just one pixel by a single element can alter the set of pixels on the line between it and its neighbours considerably, changing the average goodness drastically, especially if the contour is close to the edge of the region of interest.

To combat this problem, a continuous goodness functional can be created that allows us to evaluate the true goodness integral along the line joining two exact element positions. We can create the continuous functional by using bilinear interpolation between the goodness values of each pixel.

(6)

where and , m = int(x) and n = int(y), GI(x, y) abbreviates G(I(x, y)). An expression for the line integral of (6) between two points within a pixel is not hard to derive and the results are summed using a marching-squares process to yield the whole line integral.

5. Energy Minimisation

So far we have only considered the variational approach to energy minimisation. There are, however, some alternative energy minimisation techniques that have become increasing popular for use with ACMs. Two of the most popular are dynamic programming, [2], and greedy algorithms, [12,8]. The dynamic programming approach is computationally expensive and in this work we restrict our investigation to the application of greedy algorithms to ARMs energy minimisation.

The use of greedy algorithms, to minimise the energy of an ACM, originally suggested by Williams and Shah, [12], and later enhanced by Lai and Chin, [8], has become increasingly popular. Update of the elements is explicit and so allows the simple inclusion of hard constraints into the model. A neighbourhood (typically the 3x3 square with the pixel containing the element at the centre) is defined around each element. For each element in turn the overall energy change caused by a move to each candidate position is calculated. The position that minimises the resultant contour energy is chosen as the new position of that element. As each move is chosen so that the overall energy of the contour cannot increase, convergence is assured. Another favourable feature of this approach is speed. All the element positions are now stored as integers decreasing computation times. Moreover, no high order discrete derivatives have to be calculated, which can also reduce noise in the system.

In contrast to ACMs, the region energy in an ARM cannot be divided into contributions from each element of the contour as it is a property not of the contour itself but the region enclosed by the contour. We can however investigate the overall change in energy that occurs when an element is moved to a new position. The internal energy change is simple to calculate by noting that, using standard finite difference approximations for the first and second derivative, in (1), a move of one element effects the internal energy of that element and both immediate neighbours only, see [6,7]. The discrete nature of the image makes it impossible to obtain an exact value for the region energy, see figure 2, but we can make an approximation.

5.1. A Simple Greedy ARM

For the region energy term we can use methods similar to those employed in the variational approach. A simple and computationally cheap approach is to calculate the average goodness along the lines (discretely or continuously) to each neighbour from the starting position of the element. Then, for each candidate position, calculate the area (signed to indicate expansion/contraction of the boundary) of the two triangular regions in figure 2. The average of the two goodness values, weighted by the corresponding areas and scaled by the pressure coefficient, r , then gives an approximation to the change in external energy created by that move:

,

(7)

D l -1 and D l +1 denote the areas of the two triangles and d denotes Euclidean distance.

Figure 2. Three elements of a contour, u, on an image are shown. The grey square around the central element outlines the set of candidate new positions for that element. Consider moving the element up one pixel. The contour expands to include the two triangular regions D l -1 and D l +1. The resultant change in external energy is the goodness contained within those regions, which cannot be calculated exactly because the goodness is discretely defined with the image.

5.2. An Improved Strategy

In the above implementation, convergence is not guaranteed. Cycles, oscillations and even chaotic behaviour may still occur. The computed external energy change is inconsistent in the sense that the goodness calculated for a move from one position to another is to different to that calculated when moving back. By carefully selecting the end points of the lines along which the goodness is averaged for each candidate position, we can make the energy changes more consistent. Consider the 3x3 square pixel neighbourhood with the current element position at the centre. If we number these candidates as in figure 3, then the end points suggested in table 1 can be used. There are other possible choices, but the one given suffices.

This scheme is computationally more expensive (in terms of accesses to the underlying statistical data model) by a factor of the size of the elemental neighbourhood, but, the convergence properties of the model are greatly enhanced. However, although it is now the most common outcome, full convergence is still not guaranteed. It is still possible for longer cycles to occur.

5.3. A consistent measure of energy change

We can obtain a consistent value for the change in external energy by using bilinear interpolation. A continuous goodness functional is constructed as in (6). Using a marching-squares process around the boundary of the triangular regions created by a single element move, each region is decomposed into a set of rectangular and right-angled triangular sub-regions, each completely contained within one pixel. Expressions for the surface integral of (6) over such regions are again not hard to derive. These are summed to yield the external energy over the region as a whole. This implementation is consistent so convergence is assured. Computation time, however, is high and such a model may only be useful as a yardstick against which to measure performance of other simpler models.

6. Experiments and Results

This section presents results of a comparison of the convergence properties and computation times of some of the different ARM implementations described above. Table 2 provides a summary of the different ARMs that will be investigated.

Snake Type and Initials

Description

Original Ivins and Porrill

OIP

The original implementation, as described in section 2, with the termination condition of section 3.2 applied.

Collision Implies Deletion

CID

As above but with the collision resolution strategy of section 3.1.

Discrete Average Goodness

DAG

As CID, but averaging the goodness along the discrete lines joining two elements rather than the pixel values.

Interpolated Line Integral

ILI

As DAG but the line integral of the bilinearly interpolated goodness functional is used to generate the pressure force, see section 4.2.

Single Line Greedy

SLG

The fast greedy implementation where only one goodness average is calculated. See section 5.1.

Multi-Line Greedy

MLG

The improved model described in section 5.2.

Surface Integral Greedy

SIG

Greedy ARM using bilinear interpolation to calculate consistent external energy changes. See 5.3.

Table 2. Description of ARM implementations used in this work.

Each model is applied to a sequence of colour images of a country lane, as might be captured by a camera mounted on the front of an autonomous land vehicle, see [1]. Gaussian models for the colour data corresponding to objects of this type have been shown to be effective, [1], thus each ARM uses an internal Gaussian statistical colour model. In [1], optimal values for the parameters a , b and r were found for the original Ivins and Porrill ARM. These values, and the values used for other parameters, were the same for each ARM.

The variational models used the adaptive averaged force magnitude threshold described in section 3.2 as a termination condition. The greedy snakes have two termination conditions - full convergence, and simple cycle detection. Full convergence is identified by the fact that no contour elements changed position during the last iteration. To detect simple cycles a threshold is set on the number of consecutive iterations during which the same numbers of elements have moved. In this work, that threshold is set to 8, which works well in practice. No attempt to detect more complex cycles is made.

The results from this single image sequence are tabulated in table 3. These results are typical of findings on a wide range of similar image sequences.

 

Snake

Full Con.

Con. Detected

Non-con.

Total number of iterations

Relative Computation time

OIP

0

22

4

2460

0.33

CID

0

22

4

2460

0.33

DAG

0

21

5

2985

0.66

ILI

0

20

6

2247

1.48

SLG

0

11

15

3667

0.24

MLG

23

3

0

1992

0.43

SIG

26

0

0

1902

10.7

OIP-NC

0

0

26

5200

1

DAG-NC

0

0

26

5200

1.13

ILI-NC

0

0

26

5200

3.36

Table 3. Results of applying each snake to a single tracking task. The last three rows show results obtained for the variational models without the termination condition of section 3.2.

7. Conclusions

For the variational models, the rate of detection of convergence, i.e., termination conditions are satisfied, is fairly high. In this example, the OIP and CID models perform identically, no situation producing behavioural differences arises. In cases where such situations do arise, the CID model tends to be slightly faster. For the OIP and CID models, the original threshold found on the first image is 4.0, this rises to 7.6 by the 11th image of the sequence and remains at that value. For the DAG model, the original threshold is 4.1, rising to 4.6 by the 11th image and for the ILI, 3.7 rising to 7.0 again by the 11th image. The DAG model, whose threshold value changes little during the sequence, is the most stable in this case. On some image sequences, however, we have found the ILI model to be more stable. In general the ILI and DAG models are more stable than the OIP and CID models.

Although full convergence is never reached by the SLG model and detectable cycles occur in less than half the images of the sequence, the model is very fast and, unlike the variational models, overall performance does not deteriorate due to an adaptive termination condition. As expected, convergence of the MLG model is much better than the SLG and, despite the added complexity, computational performance is still good. Non-convergence is rare and in fact, over a range of image sequences comprising over 120 images, undetectable cycles arose only 6 times. However this number increases if the pressure coefficient, r , is increased. For the SIG model, convergence is assured. The overall number of iterations required is reduced, but at significant computational cost.

To decide which model is most suitable for a particular application, a full performance characterisation of each is required. Such a comparison, using the methodology outlined in [1], within the context of road/lane tracking, is currently being undertaken. Preliminary results indicate that using the termination condition for the variational models increases the accuracy of the resultant boundaries and that the DAG model provides the best results. The simple greedy ARM (SLG) provides performance comparable to the variational models without the termination condition, but at significantly reduced computational cost. The accuracy of the MLG model appears to be a slight improvement over the best variational model. For high-speed applications, the SLG ARM is an effective solution. For a little more accuracy with a small computational overhead, the MLG model can be used. Note, however, that as the fixed number of iterations increases, the computational performance of the MLG model becomes comparable to that of the SLG. The SIG model, although reliably convergent, is extremely slow and, in practice, offers little over the MLG.

The computation times given in table 4 are relative to the OIP model with a fixed number of 200 iterations per image. In our unoptimised implementation, none of the models runs in real time. However, using specialised hardware, Ivins’ original implementation runs at frame rate, [6].

References

[1]

DC Alexander, BF Buxton. "Modelling of Single Mode Distributions of Colour Data Using Directional Statistics". Proc. CVPR’97. Pp. 319-324, 1997

[2]

AA Amini, TE Weymouth, RC Jain, "Using dynamic programming for solving variational problems in vision." IEEE Trans. PAMI. Vol. 12. No. 9. 1990.

[3]

LD Cohen. "On Active Contours and Balloons". Computer Vision, Graphics and Image Processing - Image Understanding. Vol. 53. No. 2. Pp. 211-218. 1991.

[4]

LD Cohen, I Cohen, "Finite-Element Methods for Active Contour Models and Balloons for 2-D and 3-D Images" IEEE Trans. PAMI. Vol. 15. Pp. 1131-1147. 1993.

[5]

J. Ivins and J. Porrill, "Active Region Models for Segmenting Colours and Textures" Image and Vision Computing, 1995, Vol. 13, No. 5. Pp. 431-438.

[6]

J Ivins. "Statistical Snakes: Active Region Models". PhD Thesis. University of Sheffield. 1996.

[7]

M Kass, A Witkin, D Terzopoulos. "Snakes: Active Contour Models". International Journal of Computer Vision. Vol. 1. No. 4. Pp. 321-330. 1987.

[8]

KF Lai, RT Chin, "Deformable Contours: Modelling and Extraction." Proc. IEEE Conference CVPR, 1994.

[9]

F Leymarie, MD Levine. "Tracking Deformable Objects in the Plane Using an Active Contour Model". IEEE Trans. PAMI. Vol. 15. No. 6. Pp. 617-634. 1993.

[10]

R Ronfard. "Region-Based Strategies for Active Contour Models". International Journal of Computer Vision. Vol. 13. No. 2. Pp. 229-251. 1994.

[11]

R Samadani. "Adaptive snakes: control of damping and material parameters". SPIE Geometric Methods in Computer Vision. Vol. 1570. Pp. 202-213. 1991.

[12]

DJ Williams, M Shah. "A fast algorithm for active contours and curvature estimation". CVGIP - Image Understanding. Vol. 55. No. 1. Pp. 14-26. 1992.

[13]

SC Zhu, A Yuille. "Region Competition: Unifying Snakes, Region Growing, and Bayes/MDL for multiband Image Segmentation". IEEE Trans PAMI. Vol. 18. No. 9. Pp. 884-900. 1996.

[14]

SW Zucker. "Region Growing: Childhood and Adolescence." CGIP. Vol. 5. Pp. 382-399, 1976.