Optimization of local shape and appearance probabilities for segmentation of knee cartilage in 3-D MR images

https://doi.org/10.1016/j.cviu.2011.05.014

Abstract

We propose a fully automatic method for segmenting knee cartilage in 3-D MR images which consists of bone segmentation, bone-cartilage interface (BCI) classification, and cartilage segmentation. For bone segmentation, we propose a modified version of the recently presented branch-and-mincut method, and for classifying the BCI, we propose a voxel classification method based on binary classifiers of position and local appearance. The core contribution of this paper is the cartilage segmentation method, where localized Markov random fields (MRF) are separately constructed and optimized for local image patches. The region and boundary potentials of the MRFs are computed from the retrieved segmentation results of training images that are relevant to each local patch. Here, local shape and appearance cues are adaptively combined depending on the local image characteristics. For experimentation, two datasets are constructed: one comprising MR images of ten different subjects, and another comprising the baseline and two-year follow-up scans of nine different subjects. Both qualitative and quantitative comparisons of the results of the proposed method with semi-automatic segmentation methods demonstrate the potential of the proposed method for clinical application.

Highlights

► We propose a fully automatic segmentation method of knee cartilage.
► Relevant local shape and appearance information is determined from training images.
► Localized region and boundary probabilities are computed and adaptively integrated.
► Segmentation is the collective result of MRF optimizations on multiple local patches.
► Qualitative and quantitative evaluations support potential clinical application.

Introduction

Many discrete optimization algorithms have recently been developed to solve a wide range of computer vision problems. Though binary segmentation of an image into foreground and background is one of the simplest such problems, it is of vital importance in various applications including medical image analysis, image editing, and reconstruction of 3-D object structure. Graph cut methods [1], [2] have been widely applied to binary segmentation due to their advantages, such as practical efficiency, global optimality, and the flexibility of the underlying Markov random field (MRF). However, graph cut algorithms optimize pairwise MRFs, which can only model the characteristics of single voxels and voxel pairs. Modeling only these low-level visual cues cannot handle the ill-posedness of segmentation problems, mostly caused by weak boundaries, inhomogeneous intra-category intensity distributions, and overlapping inter-category intensity distributions. These difficulties hinder valid segmentation in medical images, where many organs and lesions of interest have diffuse edges and lie close to other objects with similar intensity distributions. In many cases, high-level cues must be incorporated to guide the segmentation toward the real boundary of the object. Most research on integrating high-level cues focuses on either optimizing MRFs with higher-order cliques [3], [4], [5] or utilizing prior information on shape and appearance [6], [7], [8].
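To make the pairwise MRF explicit, the sketch below evaluates the standard two-term energy that graph cut methods minimize for a binary labeling. The intensity-based unary terms, the contrast-sensitive Potts pairwise terms, and the axis-aligned neighbourhood are illustrative assumptions for this sketch, not the potentials used later in this paper.

```python
import numpy as np

def pairwise_mrf_energy(labels, image, fg_mean, bg_mean, lam=1.0, sigma=10.0):
    """Energy of a binary labeling x under a pairwise MRF:
    E(x) = sum_i theta_i(x_i) + lam * sum_{(i,j)} theta_ij(x_i, x_j)."""
    labels = np.asarray(labels, dtype=bool)
    image = np.asarray(image, dtype=np.float64)

    # Unary (region) term: squared deviation from an assumed class mean intensity.
    unary = np.where(labels, (image - fg_mean) ** 2, (image - bg_mean) ** 2).sum()

    # Pairwise (boundary) term over axis-aligned neighbour pairs: a penalty is
    # paid only where neighbouring labels disagree, attenuated across strong
    # intensity edges (contrast-sensitive Potts model).
    pairwise = 0.0
    for axis in range(image.ndim):
        lab = np.swapaxes(labels, 0, axis)
        img = np.swapaxes(image, 0, axis)
        disagree = lab[1:] != lab[:-1]
        contrast = np.exp(-((img[1:] - img[:-1]) ** 2) / (2.0 * sigma ** 2))
        pairwise += (disagree * contrast).sum()

    return unary + lam * pairwise
```

Because such contrast-sensitive Potts pairwise terms are submodular, a graph cut solver returns the binary labeling that minimizes this energy exactly; the limitation discussed above is that the model only sees single voxels and voxel pairs.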

Higher-order cliques comprising more pixels are able to represent more complex structural information than pixel pairs. In [4], Kohli et al. proposed the Pn Potts model, which assigns a smaller potential energy to a higher-order clique when the labels of all pixels in the clique are the same (consistent), thus enforcing label consistency within cliques. This work was extended in [5] by relaxing the all-or-nothing property of the Pn Potts model into the robust Pn model, where the potential of a clique depends on the number of pixels whose labels differ from the majority label. This extension allows cliques to be defined from segments of arbitrary size obtained by unsupervised segmentation methods such as mean shift [9], and it enhances segmentation performance over the fixed clique size of [4] by preserving detailed edge components. However, the dependence on unsupervised segmentation may cause problems for 3-D medical images, where unsupervised segmentation is especially difficult due to low contrast across boundaries.
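As a concrete illustration, a simplified two-parameter form of the robust Pn potential for a single clique could look like the following; the full model in [5] allows per-label truncation parameters and pixel weights, so this reduction is only indicative.

```python
import numpy as np

def robust_pn_potential(clique_labels, gamma_max, Q):
    """Simplified robust P^n potential for one clique (in the spirit of [5]):
    the cost grows linearly with the number of pixels disagreeing with the
    dominant label and is truncated at gamma_max. Q = 0 recovers the
    all-or-nothing P^n Potts behaviour (any disagreement costs gamma_max)."""
    clique_labels = np.asarray(clique_labels, dtype=int)
    counts = np.bincount(clique_labels)
    n_disagree = clique_labels.size - counts.max()  # pixels off the majority label
    if Q == 0:
        return 0.0 if n_disagree == 0 else float(gamma_max)
    return float(min(gamma_max, n_disagree * gamma_max / Q))
```

With a nonzero truncation parameter Q, a clique taken from an imperfect unsupervised segment is penalised gradually rather than all at once, which is what allows segments of arbitrary size to be used.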

A different approach to exploiting higher-order cliques is to learn clique statistics from training images. In [3], Roth and Black proposed a method to learn the representative appearances of a specific clique from training images. Although a similar approach has been applied to segmentation, e.g., a patch dictionary built from user-segmented images to define clique potentials in [4], this process is generally very expensive computationally and has yet to be applied to 3-D medical images.

On the other hand, segment characteristics can also be represented by statistical information regarding the shape and appearance of the object or the background. In their approach termed OBJCUT [6], Kumar et al. incorporated layered pictorial structures (LPS) with MRFs to model a realistic shape prior of four-legged animals through a set of latent shape parameters. Since the weight terms in their energy must be updated whenever any parameter of the MRF is changed, OBJCUT has a high computational demand. Kohli et al. [7] proposed POSECUT, which uses an articulated stickman model as a shape prior and dynamic graph cuts [10] for efficient computation. Though the stickman model is simple and easier to manipulate than the LPS, it is useful only when a relevant initial pose is given. Further, such intrinsic shape models are limited to human-like objects. For objects of interest in medical images, it is difficult to create a structural model since their local shape and appearance are subject to unstructured variations. For medical images, Freedman and Zhang [8] proposed a method that deals with these unstructured variations by computing pairwise MRF potentials based on the distance transform of a shape template used as a prior. However, the descriptive power of the distance transform is not strong enough to fully represent non-rigid transformations of flexible and foldable organs. The limitations of these methods show that global shape priors, including the LPS, stickman model, and distance transform, are not effectively applicable to sophisticated segmentation problems. Furthermore, integration of prior information on appearance based on these global models may render the appearance statistics irrelevant through global averaging.
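For intuition, the following sketch shows one simple way a distance transform of a shape template can be turned into segmentation potentials. It uses the template distance as a unary cue and assumes the template is already aligned to the image; Freedman and Zhang [8] instead embed the template distance in the pairwise terms, so this is a simplification rather than their formulation.

```python
import numpy as np
from scipy.ndimage import distance_transform_edt

def shape_prior_unary(template_mask, weight=1.0):
    """Signed distance from the template boundary: negative inside the
    template, positive outside. As a unary cost it discourages foreground
    labels far outside the template and background labels deep inside it."""
    template_mask = np.asarray(template_mask, dtype=bool)
    dist_outside = distance_transform_edt(~template_mask)  # distance to template, outside it
    dist_inside = distance_transform_edt(template_mask)    # distance to boundary, inside it
    signed_dist = dist_outside - dist_inside

    # Foreground cost grows with distance outside the template;
    # background cost is its mirror image inside the template.
    cost_fg = weight * np.maximum(signed_dist, 0.0)
    cost_bg = weight * np.maximum(-signed_dist, 0.0)
    return cost_fg, cost_bg
```

The weakness noted above is visible here: a single global distance map cannot follow strong non-rigid deformations, which motivates the localized shape and appearance modeling proposed in this paper.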

Our focus is on the problem of segmenting knee cartilage in 3-D magnetic resonance (MR) images. As illustrated in Fig. 1, the knee articular joint is composed of three bones, i.e., the femur, tibia, and patella, and of cartilage corresponding to each bone compartment. Knee cartilage segmentation is very challenging due to intensity inhomogeneities, the small size of the cartilage, low tissue contrast, and shape irregularity. In most current clinical practice, cartilage boundaries are identified through manual delineation by an expert. This process is extremely laborious, requiring hours per case.

Much of the previous work on knee cartilage segmentation has focused on semi-automatic methods. These methods combine sparse user annotations with various techniques such as active shape models [11], B-spline snakes [12], and graph cuts [13]. Although these techniques are much more efficient than manual boundary delineation, they still require tens of minutes and careful annotation.

Recent research has mainly been concerned with developing fully automatic methods requiring no user interaction. Folkesson et al. [14] proposed a method based on training k-nearest neighbor (k-NN) voxel classifiers on features such as voxel position, raw and Gaussian-smoothed intensities, and intensity derivatives. However, since these features cannot sufficiently represent cartilage voxel characteristics, the accuracy of the method is somewhat limited. More recent methods by Fripp et al. [15] and Yin et al. [16] are both based on a framework which first segments each bone compartment in the knee joint and then segments the corresponding cartilage compartments. Both methods construct a mesh model of each bone surface and identify vertices on the bone-cartilage interface (BCI) by training. In the method of Fripp et al. [15], a dense BCI is explicitly constructed and the outer cartilage boundary is determined by examining the intensity profile of a constrained local region in the direction normal to the BCI. In the method of Yin et al. [16], on the other hand, intensity distributions of tissues are learned and applied to an optimal multi-surface segmentation method [17] to resolve the interactions between cartilage compartments. Both methods concentrate on global characteristics such as the mean and variance of intensity or the estimated thickness of cartilage.
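The kind of feature-based voxel classification described by Folkesson et al. [14] can be approximated with off-the-shelf tools. The sketch below assembles per-voxel features from position, raw and Gaussian-smoothed intensities, and first-order intensity derivatives, and fits a k-NN classifier; the smoothing scale, neighbourhood size, and training-voxel subsampling are illustrative choices, not the settings of [14].

```python
import numpy as np
from scipy.ndimage import gaussian_filter
from sklearn.neighbors import KNeighborsClassifier

def voxel_features(volume, sigma=1.0):
    """Per-voxel features: (z, y, x) position, raw intensity, Gaussian-smoothed
    intensity, and the first-order derivatives of the smoothed volume."""
    smoothed = gaussian_filter(volume.astype(np.float32), sigma)
    grads = np.gradient(smoothed)                      # one derivative per axis
    coords = np.indices(volume.shape, dtype=np.float32)
    feats = np.stack([*coords, volume, smoothed, *grads], axis=-1)
    return feats.reshape(-1, feats.shape[-1])

def train_voxel_knn(volume, cartilage_mask, k=5, n_samples=20000, seed=0):
    """Fit a k-NN voxel classifier on a random subsample of labelled voxels."""
    X = voxel_features(volume)
    y = cartilage_mask.reshape(-1).astype(int)
    rng = np.random.default_rng(seed)
    idx = rng.choice(X.shape[0], size=min(n_samples, X.shape[0]), replace=False)
    return KNeighborsClassifier(n_neighbors=k).fit(X[idx], y[idx])
```

Such a classifier labels each voxel independently, which is precisely why purely feature-based approaches struggle with the weak boundaries and overlapping intensity distributions discussed earlier.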

In this paper, we propose a method for fully automatic segmentation of the three compartments of knee cartilage, i.e., femoral, tibial, and patellar, which is composed of bone segmentation, BCI classification, and cartilage segmentation as in [15], [16]. However, for each subprocess, new methods designed to increase flexibility and accuracy are proposed. Specifically, for automatic segmentation of bones, we propose a simple and efficient method based on a modified version of the recently presented branch-and-mincut method [18] with shape priors. We also propose an efficient method for classifying bone surface voxels into BCI and non-BCI by constructing a binary classifier based on position and local appearance. Finally, we propose a cartilage segmentation method based on MRF optimization of localized region and boundary probabilities acquired using relevant local shape and appearance information. We note that the cartilage segmentation method is the core contribution of this paper.

Section snippets

The proposed method

A flowchart summarizing the whole process is presented in Fig. 2. We assume that a training set Ω comprising N cases is established. Here, each case ω_n ∈ Ω, n = 1, …, N, comprises the MR image I_n, cartilage label mask C_n, bone label mask B_n, and bone-cartilage interface points BCI_n, which can be computed from C_n and B_n. We note that the collective set of all MR images {I_n}_{n=1}^{N}, cartilage label masks {C_n}_{n=1}^{N}, bone label masks {B_n}_{n=1}^{N}, and bone-cartilage interface points {BCI_n}_{n=1}^{N} of the training set Ω are
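Since the text notes that BCI_n can be derived from C_n and B_n, a minimal sketch of one training case and of that derivation is given below; the 26-neighbourhood adjacency used to define the interface (bone-surface voxels touching cartilage) is an assumption made for illustration.

```python
from dataclasses import dataclass
import numpy as np
from scipy.ndimage import binary_dilation

@dataclass
class TrainingCase:
    """One case omega_n of the training set Omega."""
    image: np.ndarray            # MR image I_n
    cartilage_mask: np.ndarray   # cartilage label mask C_n (boolean)
    bone_mask: np.ndarray        # bone label mask B_n (boolean)

    def bci_points(self):
        """Bone-cartilage interface BCI_n: coordinates of bone-surface voxels
        whose 26-neighbourhood contains cartilage voxels."""
        bone = self.bone_mask.astype(bool)
        cart = self.cartilage_mask.astype(bool)
        structure = np.ones((3, 3, 3), dtype=bool)
        # Bone surface = bone voxels adjacent to at least one non-bone voxel.
        surface = bone & binary_dilation(~bone, structure=structure)
        # Interface = surface voxels adjacent to cartilage.
        interface = surface & binary_dilation(cart, structure=structure)
        return np.argwhere(interface)
```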

Setup

For experiments, we used MR images of seventeen subjects from the database provided by the Osteoarthritis Initiative (OAI, http://www.oai.ucsf.edu). Each MR image is acquired with a double-echo steady-state (DESS) MR sequence and has 384 × 384 × 160 voxels with a voxel resolution of 0.36 × 0.36 × 0.70 mm³. The OAI has three subcohorts, namely, Progression, Incidence, and Normal. All 17 subjects in our experiments belong to the Progression subcohort, which showed symptoms of osteoarthritis at the baseline

Conclusion and future work

We have proposed a new fully automatic method for accurate knee cartilage segmentation based on learned local shape and appearance. The proposed method is able to model appearance and shape more flexibly and thus allows for a representation tailored to the characteristics relevant at different positions. This is demonstrated in the separate computation of region and boundary probabilities for each patch based on locally relevant statistics retrieved from the training set. Experimental

Acknowledgment

This research was supported by Basic Science Research Program through the National Research Foundation of Korea (NRF) funded by the Ministry of Education, Science and Technology (2010-0012006 and 2007-0053539).

References (27)

  • Y. Boykov, M.-P. Jolly, Interactive graph cuts for optimal boundary & region segmentation of objects in N-D images, ...
  • Y. Boykov et al., Graph cuts and efficient N-D image segmentation, International Journal of Computer Vision (2006)
  • S. Roth et al., Fields of experts, International Journal of Computer Vision (2009)
  • P. Kohli et al., P³ & beyond: move making algorithms for solving higher order functions, IEEE Transactions on Pattern Analysis and Machine Intelligence (2009)
  • P. Kohli et al., Robust higher order potentials for enforcing label consistency, International Journal of Computer Vision (2009)
  • M. Kumar, P. Torr, A. Zisserman, OBJ CUT, in: IEEE Conference on Computer Vision and Pattern Recognition, ...
  • P. Kohli et al., Simultaneous segmentation and pose estimation of humans using dynamic graph cuts, International Journal of Computer Vision (2008)
  • D. Freedman, T. Zhang, Interactive graph cut based segmentation with shape priors, in: IEEE Conference on Computer ...
  • D. Comaniciu et al., Mean shift: a robust approach toward feature space analysis, IEEE Transactions on Pattern Analysis and Machine Intelligence (2002)
  • P. Kohli, P. Torr, Efficiently solving dynamic Markov random fields using graph cuts, in: IEEE International Conference ...
  • S. Solloway et al., The use of active shape models for making thickness measurements of articular cartilage from MR images, Magnetic Resonance in Medicine (1997)
  • C. Kauffmann et al., Computer-aided method for quantification of cartilage thickness and volume changes using MRI: validation study using a synthetic model, IEEE Transactions on Biomedical Engineering (2003)
  • H. Shim et al., Knee cartilage: efficient and reproducible segmentation of high-spatial-resolution MR images with the semiautomated graph-cuts algorithm method, Radiology (2009)