Optimization of local shape and appearance probabilities for segmentation of knee cartilage in 3-D MR images
Highlights
- We propose a fully automatic segmentation method for knee cartilage.
- Relevant local shape and appearance information is determined from training images.
- Localized region and boundary probabilities are computed and adaptively integrated.
- Segmentation is the collective result of MRF optimizations on multiple local patches.
- Qualitative and quantitative evaluations support potential clinical application.
Introduction
Many discrete optimization algorithms have recently been developed to solve a wide range of computer vision problems. Though binary segmentation of an image into foreground and background is one of the simplest such problems, it is of vital importance in various applications including medical image analysis, image editing, and reconstruction of 3-D object structure. Graph cut methods [1], [2] have been widely applied to binary segmentation due to their advantages, such as practical efficiency, global optimality, and the flexibility of the underlying Markov random field (MRF). However, graph cut algorithms can only optimize pairwise MRFs, which model the characteristics of single voxels and voxel pairs. Modeling only these low-level visual cues cannot handle the ill-posedness of segmentation problems, which is mostly caused by weak boundaries, inhomogeneous intra-category intensity distributions, and overlapping inter-category intensity distributions. These difficulties hinder valid segmentation in medical images, where many organs and lesions of interest have diffuse edges and lie close to other objects with similar intensity distributions. In many cases, high-level cues must be incorporated to guide the segmentation toward the real boundary of the object. Most research on integrating high-level cues focuses on either optimizing MRFs with higher-order cliques [3], [4], [5] or utilizing prior information on shape and appearance [6], [7], [8].
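As a concrete illustration of the kind of pairwise-MRF energy that graph cuts minimize, the sketch below evaluates a contrast-sensitive Potts energy for a binary labeling of a 2-D image. The intensity means, the weights `lam` and `sigma`, and the unary scaling are illustrative assumptions, not values from any cited method.

```python
import numpy as np

def pairwise_mrf_energy(image, labels, lam=2.0, sigma=10.0):
    """Energy of a binary labeling under a simple pairwise MRF.

    Unary term: squared deviation from an assumed background/foreground
    mean intensity (a stand-in for learned intensity models).
    Pairwise term: contrast-sensitive Potts penalty on 4-connected
    neighbors, small where the intensity edge is strong.
    """
    image = np.asarray(image, dtype=float)
    labels = np.asarray(labels, dtype=int)
    mu = np.array([30.0, 200.0])            # assumed bg/fg mean intensities
    unary = ((image - mu[labels]) ** 2).sum() / 1000.0  # scaled for balance

    pairwise = 0.0
    for axis in (0, 1):                     # horizontal and vertical neighbors
        diff_l = np.diff(labels, axis=axis) != 0      # label discontinuities
        diff_i = np.diff(image, axis=axis)            # intensity contrast
        w = lam * np.exp(-(diff_i ** 2) / (2 * sigma ** 2))
        pairwise += (w * diff_l).sum()
    return unary + pairwise
```

A labeling whose discontinuities align with strong intensity edges receives a low energy; graph cuts find the global minimum of such energies exactly.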
Higher-order cliques comprising more pixels can represent more complex structural information than pixel pairs. In [4], Kohli et al. proposed the Pn Potts model, which assigns a smaller potential energy to a higher-order clique when the labels of all pixels in the clique are the same (consistent), thus enforcing label consistency within cliques. This work was extended in [5] by relaxing the all-or-nothing property of the Pn Potts model to the robust Pn model, where the potential of a clique depends on the number of pixels whose labels differ from the majority label. This extension accommodates segments of arbitrary size obtained from unsupervised segmentation methods such as mean shift [9] and improves segmentation performance over the fixed clique size of [4] by preserving detailed edge components. However, the dependence on unsupervised segmentation may cause problems for 3-D medical images, where unsupervised segmentation is especially difficult due to low boundary contrast.
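The robust Pn potential described above can be sketched as a small function: the cost grows linearly with the number of pixels disagreeing with the clique's majority label, then saturates. The values of `gamma_max` and the truncation ratio `trunc_q` are illustrative parameters, not values from [5].

```python
def robust_pn_potential(clique_labels, gamma_max=5.0, trunc_q=0.3):
    """Robust P^n potential for a binary-labeled clique.

    Cost is 0 for a fully consistent clique, rises linearly with the
    number of minority-label pixels, and is truncated at gamma_max.
    The strict P^n Potts model of [4] is the special case where any
    disagreement at all already costs gamma_max.
    """
    n = len(clique_labels)
    ones = sum(clique_labels)
    n_disagree = min(ones, n - ones)   # pixels deviating from majority label
    q = max(1.0, trunc_q * n)          # truncation threshold (illustrative)
    return min(gamma_max, gamma_max * n_disagree / q)
```

Because a few deviating pixels incur only a partial penalty, segments from unsupervised over-segmentation need not be perfectly aligned with the true object boundary.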
A different approach to exploiting higher-order cliques is to learn clique statistics from training images. In [3], Roth and Black proposed a method to learn the representative appearances of a specific clique from training images. Although a similar approach has been applied to segmentation, such as a patch dictionary built from user-segmented images to define clique potentials in [4], this process is generally very expensive computationally and has yet to be applied to 3-D medical images.
On the other hand, segment characteristics can also be represented by statistical information regarding the shape and appearance of the object or the background. In their approach, termed OBJCUT [6], Kumar et al. incorporated layered pictorial structures (LPS) into MRFs to model a realistic shape prior of four-legged animals through a set of latent shape parameters. Since the weight terms in their energy must be updated whenever any parameter of the MRF changes, OBJCUT is computationally demanding. Kohli et al. [7] proposed POSECUT, which uses an articulated stickman model as a shape prior and dynamic graph cuts [10] for efficient computation. Though the stickman model is simpler and easier to manipulate than the LPS, it is useful only when a relevant initial pose is given. Further, these intrinsic shape models are limited to human-like articulated objects. For objects of interest in medical images, it is difficult to create such a structural model, since their local shape and appearance are subject to unstructured variations. For medical images, Freedman and Zhang [8] proposed a method that deals with these unstructured variations by computing pairwise MRF potentials based on the distance transform of a shape template used as a prior. However, the descriptive power of the distance transform is not strong enough to fully represent non-rigid deformations of flexible and foldable organs. The limitations of these methods show that global shape priors, including the LPS, the stickman model, and the distance transform, are not effectively applicable to sophisticated segmentation problems. Furthermore, integrating prior appearance information based on these global models may render the appearance statistics irrelevant through global averaging.
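To make the distance-transform prior concrete, here is a simplified unary variant: the penalty for labeling a pixel as foreground grows with its distance to the shape template. Freedman and Zhang actually embed the template distance in pairwise potentials; the function name and the brute-force distance search here are our own, suitable only for tiny templates as an illustration.

```python
import math

def shape_prior_penalty(template, i, j):
    """Penalty for assigning foreground to pixel (i, j), equal to its
    Euclidean distance to the nearest foreground pixel of the shape
    template. Brute force over all template pixels; real code would use
    a linear-time distance transform (e.g. scipy's
    distance_transform_edt) precomputed once for the whole image.
    """
    fg = [(r, c) for r, row in enumerate(template)
          for c, v in enumerate(row) if v]
    return min(math.hypot(i - r, j - c) for r, c in fg)
```

Pixels inside the template incur no penalty, and the penalty rises smoothly outside it, which is exactly why such a prior cannot distinguish the many non-rigid deformations that all lie at similar distances from the template.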
Our focus is on the problem of segmenting knee cartilage in 3-D magnetic resonance (MR) images. As illustrated in Fig. 1, the knee articular joint is composed of three bones, i.e., the femur, tibia, and patella, and the cartilage corresponding to each bone compartment. Knee cartilage segmentation is very challenging due to intensity inhomogeneities, small size, low tissue contrast, and shape irregularity. In most current clinical practice, cartilage boundaries are identified through manual delineation by an expert. This process is extremely laborious, requiring hours per case.
Much of the previous work on knee cartilage segmentation has focused on semi-automatic methods. These methods combine sparse user annotations with various techniques such as active shape models [11], B-spline snakes [12], and graph cuts [13]. Although these techniques are much more efficient than manual boundary delineation, they still require tens of minutes and careful annotation.
Recent research has mainly been concerned with developing fully automatic methods requiring no user interaction. Folkesson et al. [14] proposed a method based on training k-nearest-neighbor (k-NN) voxel classifiers on features such as voxel position, raw and Gaussian-smoothed intensities, and intensity derivatives. However, since these features cannot sufficiently represent cartilage voxel characteristics, its accuracy is somewhat limited. More recent methods by Fripp et al. [15] and Yin et al. [16] are both based on a framework that first segments each bone compartment in the knee joint and then segments the corresponding cartilage compartments. Both methods construct a mesh model of each bone surface and identify vertices on the bone-cartilage interface (BCI) by training. In the method of Fripp et al. [15], a dense BCI is explicitly constructed, and the outer cartilage boundary is determined by examining the intensity profile of a constrained local region in the direction normal to the BCI. In the method of Yin et al. [16], tissue intensity distributions are learned and applied to an optimal multi-surface segmentation method [17] to resolve the interactions between cartilage compartments. Both methods concentrate on global characteristics such as the mean and variance of intensity or the estimated thickness of cartilage.
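A minimal sketch of the k-NN voxel classification scheme of [14]: each voxel is described by a feature vector and labeled by majority vote among its nearest training samples. The feature choice and helper name are illustrative, and a real implementation would query a k-d tree over millions of training voxels rather than do a linear scan.

```python
import math
from collections import Counter

def knn_classify(features, train_feats, train_labels, k=3):
    """Classify one voxel's feature vector (e.g. position, intensity,
    smoothed intensity, derivatives) by majority vote among its k
    nearest training samples under the Euclidean metric.
    """
    dists = sorted(
        (math.dist(features, f), lbl)
        for f, lbl in zip(train_feats, train_labels))
    votes = Counter(lbl for _, lbl in dists[:k])   # vote among k nearest
    return votes.most_common(1)[0][0]
```

The limitation noted above follows directly from this design: the classifier is only as discriminative as the per-voxel features it is given, with no shape or neighborhood-consistency model on top.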
In this paper, we propose a method for fully automatic segmentation of the three compartments of knee cartilage, i.e., femoral, tibial, and patellar, which is composed of bone segmentation, BCI classification, and cartilage segmentation, as in [15], [16]. For each subprocess, however, new methods designed to increase flexibility and accuracy are proposed. Specifically, for automatic bone segmentation we propose a simple and efficient method based on a modified version of the recently presented branch-and-mincut method [18] with shape priors. We also propose an efficient method for classifying bone surface voxels into BCI and non-BCI voxels by constructing a binary classifier based on position and local appearance. Finally, we propose a cartilage segmentation method based on MRF optimization of localized region and boundary probabilities acquired using relevant local shape and appearance information. We note that the cartilage segmentation method is the core contribution of this paper.
Section snippets
The proposed method
A flowchart summarizing the whole process is presented in Fig. 2. We assume that a training set Ω comprising N cases is established. Here, each case ωn ∈ Ω, n = 1, …, N, comprises an MR image, a cartilage label mask, a bone label mask, and the bone-cartilage interface points BCIn, which can be computed from the two label masks. We note that the collective sets of all MR images, cartilage label masks, bone label masks, and bone-cartilage interface points of the training set Ω are
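Under the assumptions above, one training case might be organized as follows. This is a sketch with our own field and method names, not the paper's notation, and the BCI derivation shown is one plausible reading of "computed from the two masks".

```python
from dataclasses import dataclass
import numpy as np

@dataclass
class TrainingCase:
    """One training case ω_n ∈ Ω (field names are ours, not the paper's)."""
    image: np.ndarray            # 3-D MR intensity volume
    cartilage_mask: np.ndarray   # binary cartilage label mask
    bone_mask: np.ndarray        # binary bone label mask

    def bci_points(self):
        """Bone-cartilage interface: bone voxels that are 6-adjacent to
        cartilage. np.roll wraps at the volume border, which is harmless
        for structures that do not touch the border.
        """
        touch = np.zeros(self.bone_mask.shape, dtype=bool)
        for axis in range(self.bone_mask.ndim):
            for shift in (1, -1):
                touch |= np.roll(self.cartilage_mask.astype(bool), shift, axis)
        return np.argwhere(self.bone_mask.astype(bool) & touch)
```

Storing BCIn as a derived quantity of the two masks, as here, keeps each case self-consistent: relabeling either mask automatically updates the interface.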
Setup
For the experiments, we used MR images of seventeen subjects from the database provided by the Osteoarthritis Initiative (OAI, http://www.oai.ucsf.edu). Each MR image was scanned with a double-echo steady-state (DESS) MR sequence and has 384 × 384 × 160 voxels with a voxel resolution of 0.36 × 0.36 × 0.70 mm³. The OAI has three subcohorts, namely, Progression, Incidence, and Normal. All 17 subjects in our experiments belong to the Progression subcohort, which showed symptoms of osteoarthritis at the baseline
Conclusion and future work
We have proposed a new fully-automatic method for accurate knee cartilage segmentation based on learned local shape and appearance. The proposed method is able to model appearance and shape more flexibly and thus allows for a more tailored representation of relative characteristics for different positions. This is demonstrated in the separate computation of region and boundary probabilities for each patch based on locally relevant statistics retrieved from the training set. Experimental
Acknowledgment
This research was supported by Basic Science Research Program through the National Research Foundation of Korea (NRF) funded by the Ministry of Education, Science and Technology (2010-0012006 and 2007-0053539).
References (27)
- Y. Boykov, M.-P. Jolly, Interactive graph cuts for optimal boundary & region segmentation of objects in N-D images, ...
- Y. Boykov et al., Graph cuts and efficient N-D image segmentation, International Journal of Computer Vision (2006)
- S. Roth et al., Fields of experts, International Journal of Computer Vision (2009)
- P. Kohli et al., P3 & beyond: move making algorithms for solving higher order functions, IEEE Transactions on Pattern Analysis and Machine Intelligence (2009)
- P. Kohli et al., Robust higher order potentials for enforcing label consistency, International Journal of Computer Vision (2009)
- M. Kumar, P. Torr, A. Zisserman, OBJ CUT, in: IEEE Conference on Computer Vision and Pattern Recognition, ...
- P. Kohli et al., Simultaneous segmentation and pose estimation of humans using dynamic graph cuts, International Journal of Computer Vision (2008)
- D. Freedman, T. Zhang, Interactive graph cut based segmentation with shape priors, in: IEEE Conference on Computer ...
- D. Comaniciu et al., Mean shift: a robust approach toward feature space analysis, IEEE Transactions on Pattern Analysis and Machine Intelligence (2002)
- P. Kohli, P. Torr, Efficiently solving dynamic Markov random fields using graph cuts, in: IEEE International Conference ...