Elsevier

Pattern Recognition

Volume 45, Issue 4, April 2012, Pages 1255-1264

Cytoplasm and nucleus segmentation in cervical smear images using Radiating GVF Snake

https://doi.org/10.1016/j.patcog.2011.09.018

Abstract

A Radiating Gradient Vector Flow (RGVF) Snake is proposed for accurate extraction of both the nucleus and cytoplasm from a single-cell cervical smear image. After preprocessing, the areas in the image are roughly clustered into nucleus, cytoplasm and background by a spatial K-means clustering algorithm. After the initial contours are extracted, the image is segmented using RGVF. RGVF involves a new edge map computation method and a stack-based refinement, and is thus robust to contamination and can effectively locate obscure boundaries. The boundaries can also be traced correctly even if there are interferences near the cytoplasm and nucleus regions. Experiments performed on the Herlev dataset, which contains 917 images, show the effectiveness of the proposed algorithm.

Highlights

► A coarse-to-fine segmentation framework is proposed for single-cell cervical cell images.
► Radiating GVF Snake is proposed based on GVF Snake.
► A new edge map computation method and a stack-based refinement are introduced into Radiating GVF Snake.
► Radiating GVF Snake is robust to contaminations and can effectively locate the relatively obscure boundaries.

Introduction

Cervical cancer is one of the most common malignancies among women. More than 200 thousand women die from this disease every year [1]. Unlike cancers that present distinctive symptoms, cervical cancer takes several years (generally 8–10 years) to develop from the pre-cancerous stage to the severe stage without any symptoms. By the time tell-tale symptoms appear, the disease is usually unresponsive to treatment [2]. Fortunately, it can be easily detected and prevented in its early stage with the Pap smear test [1], [2], which has been commonly used for a number of years to screen for abnormal cervical cells. Abnormal cervical cells that have undergone precancerous changes are called dysplastic cells, and they pass through three main phases. The first phase is called mildly dysplastic, in which the nucleus becomes larger and brighter than a normal one. The second phase is called moderately dysplastic, in which the nucleus is much larger and darker. In the final phase, called severely dysplastic, both the nucleus and cytoplasm change in size and texture: the nucleus is larger and darker with a grotesque shape, and the cytoplasm is usually darker and smaller [3]. Hence, cells in cancers and pre-cancers are characterized by many morphologic and architectural alterations, including changes in the shape and size of the cytoplasm and nucleus, an increased nuclear-cytoplasmic ratio, etc. Fig. 1 shows some representative examples of dysplastic and normal cells.

Manual smear analysis is a tedious, time-consuming and error-prone job, and the shortage of pathologists makes things worse. Machine-assisted automatic screening and diagnosis systems can therefore bring significant benefits in helping women prevent cervical cancer. An effective segmentation algorithm that detects the contours of the cytoplasm and nucleus plays a critical role in such a system, because it affects the accuracy of the final diagnostic result.

Several segmentation methods have been adopted for extracting the cytoplasm and nucleus from cervical smear images. Early work attempted to detect and segment cells through image thresholding techniques [4]. Morphological watersheds have also been used to separate the cytoplasm and nucleus of each cell [5]. However, these methods do not give satisfactory results because of the complexity of cervical smear images. Recent research falls into two categories: (1) extracting the nucleus boundaries only, in either single-cell or overlapping-cell cervical smear images [6], [7], [8], [9], [10]; (2) extracting the boundaries of both the nucleus and cytoplasm in single-cell cervical smear images [2], [11]. The first topic has attracted considerable research. The second, however, has not been effectively solved. The challenges are: (1) a (partially or entirely) obscure cytoplasm boundary caused by low intensity contrast with the background; (2) contamination caused by inflammatory cells, blood stains, etc.

We briefly survey recent research in both categories. Kale and Aksoy [6] proposed a hybrid two-phase method to find the nuclei in cervical smear images with overlapping cells. In the first phase, a hierarchy tree is constructed using multi-scale watershed segmentation with local gradient information, and the most meaningful regions are then selected from the tree by considering the spectral homogeneity and circularity of each segment. In the second phase, the nuclei of the cervical cells in the image are detected by applying an SVM-based classification to the meaningful regions. Experimental results show fairly good performance, with the final classification result reaching 96% accuracy. However, the technique is computationally expensive, since the selection of meaningful regions visits every node in the tree.

Yang-Mao et al. [2] proposed a contour enhancement approach to the extraction of nucleus and cytoplasm boundaries. A trim-meaning filter is used to remove Gaussian noise and impulse noise. The gradient of edges is then enhanced through a bi-group approach. The main contribution of their solution is the mean vector difference (MVD) enhancer, in which a probability of the direction of gradient vector flows (GVFs) [12] is assigned to each pixel of the image. Finally, the nucleus and cytoplasm contours are extracted by the Otsu method [13]. However, their solution is based on edge detection and cannot guarantee closed and continuous contours.

It is important to segment the cytoplasm accurately. Once the true boundary of the cytoplasm is located, quantitative metrics (e.g. the diameter of the cytoplasm, the nuclear-cytoplasmic ratio, etc.) can be calculated. This paper aims to extract accurate boundaries of both the nucleus and cytoplasm from single-cell cervical smear images. We investigate cervical smear image segmentation using the Snake model driven by GVF [12]. The GVF Snake is robust to the initialization of the Snake and is able to converge into boundary concavities. However, it is still sensitive to noise, which is why GVF sometimes performs badly in complex scenarios. Tang [14] improved the GVF Snake for skin cancer image segmentation by exploiting the direction information of the gradients. The basic idea of Tang's method is to compute the intensity gradient for each pixel along one of eight directions (0°, 45°, 90°, …, and 315°). We extend the multi-direction GVF Snake of Tang [14] to the Radiating GVF Snake, in which the intensity gradient for each pixel is computed along a radiating line that starts from the intensity-weighted centroid of the rough nucleus region and passes through the pixel itself. Computing gradients along radiating lines improves the ability of GVF to locate obscure boundaries. Furthermore, the intensity distribution along the radiating line may be used to suppress false radiating gradients caused by inconsistent staining and contamination (inflammatory cells, blood stains, etc.).
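The idea of a gradient taken along a radiating line can be illustrated with a minimal sketch. It assumes the radiating gradient at a pixel can be approximated by the directional derivative of the grayscale image along the unit vector pointing from the centroid to that pixel. The function name `radiating_gradient` and the use of `np.gradient` are illustrative choices, not taken from the paper, and the sketch does not reproduce the paper's edge map computation or its stack-based refinement.

```python
import numpy as np

def radiating_gradient(gray, centroid):
    """Directional derivative of `gray` along rays leaving `centroid` (row, col).

    Hypothetical helper: approximates the gradient along a radiating line by
    projecting the ordinary image gradient onto the outward unit vector.
    """
    # Finite-difference gradient: gy along rows, gx along columns.
    gy, gx = np.gradient(np.asarray(gray, dtype=np.float64))
    rows, cols = np.indices(gy.shape)
    # Vector from the centroid to each pixel.
    dy = rows - centroid[0]
    dx = cols - centroid[1]
    norm = np.hypot(dy, dx)
    # A pixel lying exactly on the centroid has no direction; its result is 0.
    norm[norm == 0] = 1.0
    # Positive values mean intensity increases when moving away from the centroid.
    return (gy * dy + gx * dx) / norm
```

On a horizontal intensity ramp, the result equals the cosine of the angle between the ray and the horizontal axis, which is a quick sanity check for the projection.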

A coarse-to-fine framework is proposed. First, each single-cell cervical smear image is converted to the CIELAB color space and the L dimension is normalized to form the grayscale image. The non-local means filter [15] is used to remove noise. A spatial K-means clustering algorithm is then proposed to extract the initial contours of the nucleus and cytoplasm. These rough initial contours are used to initialize the proposed RGVF Snake. Finally, the RGVF Snake is used to estimate accurate boundaries of the nucleus and cytoplasm.
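The first step of the framework, conversion to a normalized lightness image, can be sketched as follows. This is a minimal sketch that assumes the input is an sRGB image with values in [0, 1] and a D65 white point; the function names are hypothetical, and min-max scaling is only one plausible reading of "normalized". The non-local means denoising step is left out to keep the example dependency-free.

```python
import numpy as np

def srgb_to_lightness(rgb):
    """CIELAB L* (0..100) of an sRGB image with channel values in [0, 1]."""
    rgb = np.asarray(rgb, dtype=np.float64)
    # Undo the sRGB transfer curve to obtain linear-light RGB.
    linear = np.where(rgb <= 0.04045, rgb / 12.92, ((rgb + 0.055) / 1.055) ** 2.4)
    # Relative luminance Y, with the white point normalized to Y = 1.
    y = linear @ np.array([0.2126, 0.7152, 0.0722])
    # CIELAB companding function f(t), with its linear segment near black.
    delta = 6.0 / 29.0
    f = np.where(y > delta ** 3, np.cbrt(y), y / (3.0 * delta ** 2) + 4.0 / 29.0)
    return 116.0 * f - 16.0

def normalize(channel):
    """Min-max scale to [0, 1] (an assumed form of normalization)."""
    lo, hi = channel.min(), channel.max()
    if hi == lo:
        return np.zeros_like(channel)
    return (channel - lo) / (hi - lo)

def to_grayscale(rgb):
    """Normalized L* channel used as the grayscale image."""
    return normalize(srgb_to_lightness(rgb))
```

Only the L* channel is computed because the a* and b* channels are not needed for the grayscale image.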

This paper is organized as follows: Section 2 introduces the necessary preprocessing techniques, including conversion to a grayscale image and denoising with a non-local means filter. In Section 3, we adopt the spatial K-means clustering algorithm to extract the rough initial contours of the nucleus and cytoplasm. The proposed Radiating Gradient Vector Flow (RGVF) Snake is presented in Section 4 to obtain the final nucleus and cytoplasm boundaries. The effectiveness of the proposed coarse-to-fine framework is verified by a set of experiments in Section 5. Finally, we conclude in Section 6.

Section snippets

Conversion to grayscale image

To make examination under the microscope easier, the specimen of cervical cells is colored with dyes in a procedure called staining. Several dyes are commonly used, including the H&E stain, the Romanowsky–Giemsa stain and the Papanicolaou stain. These dyes generate a wide variety of colors. Furthermore, inflammatory cells, blood stains, changes in cellular constituents, and even the manual staining operation also affect the intensity and distribution of the stains. Since the

Rough segmentation using spatial K-means

Extracting the rough initial contours of the nucleus and cytoplasm, as well as labeling the intensity-weighted centroid of the nucleus, is the first step in our coarse-to-fine framework. This step serves as the basis for the remaining steps. To accomplish this objective, the cervical smear image should first be roughly divided into three classes.
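A rough three-class division and a centroid estimate can be sketched as below. The details of the paper's spatial K-means are not given in this excerpt, so the sketch makes its own assumptions: each pixel is described by its grayscale value in [0, 1] plus a weighted, normalized distance to the image centre, the centres are initialized evenly over the intensity range, and the centroid weights darker pixels more heavily. The names `spatial_kmeans`, `spatial_weight` and `intensity_weighted_centroid` are hypothetical.

```python
import numpy as np

def spatial_kmeans(gray, k=3, spatial_weight=0.5, iters=20):
    """Cluster pixels on (intensity, weighted distance to the image centre).

    Returns integer labels ordered by intensity: 0 is the darkest cluster
    and k - 1 the brightest.
    """
    h, w = gray.shape
    yy, xx = np.mgrid[0:h, 0:w]
    dist = np.hypot(yy - (h - 1) / 2.0, xx - (w - 1) / 2.0)
    dist = dist / max(dist.max(), 1e-12)
    feats = np.stack([gray.ravel(), spatial_weight * dist.ravel()], axis=1)
    # Deterministic start: centres spread over the intensity range,
    # all sharing the mean spatial feature.
    centres = np.column_stack([
        np.linspace(gray.min(), gray.max(), k),
        np.full(k, feats[:, 1].mean()),
    ])
    for _ in range(iters):
        # Assign each pixel to its nearest centre (squared Euclidean distance).
        d2 = ((feats[:, None, :] - centres[None, :, :]) ** 2).sum(axis=2)
        labels = d2.argmin(axis=1)
        # Move each non-empty centre to the mean of its members.
        for j in range(k):
            members = feats[labels == j]
            if len(members):
                centres[j] = members.mean(axis=0)
    # Relabel clusters by ascending centre intensity.
    rank = np.argsort(np.argsort(centres[:, 0]))
    return rank[labels].reshape(h, w)

def intensity_weighted_centroid(gray, mask):
    """Centroid (row, col) of `mask`, giving darker pixels larger weights."""
    weights = (1.0 - gray) * mask
    rows, cols = np.indices(gray.shape)
    total = weights.sum()
    return (rows * weights).sum() / total, (cols * weights).sum() / total
```

With a dark nucleus, a mid-gray cytoplasm and a bright background, label 0 would correspond to the nucleus, label 1 to the cytoplasm and label 2 to the background, and `intensity_weighted_centroid(gray, labels == 0)` gives a starting point for the radiating lines.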

GVF Snake

A Snake [23] is defined as a controlled continuity contour and can be represented parametrically by $r(s) = (x(s), y(s))$, where $s \in [0, 1]$. The contour is attracted to salient image features (lines, edges, terminations, etc.) by minimizing the Snake energy function

$$E_{snake} = \int_0^1 E(r(s))\,ds = \int_0^1 \left[ E_{int}(r(s)) + E_{ext}(r(s)) \right] ds,$$

where $E_{int}(r(s))$ is the internal deformation energy and $E_{ext}(r(s))$ is the external energy, which is designed based on the application scenario. $E_{int}(r(s))$ controls the smoothness of
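The energy functional above can be made concrete with a discrete sketch for a closed contour. Since this excerpt breaks off before the energy terms are specified, the sketch assumes the classical Snake form of the internal energy, $\frac{1}{2}(\alpha |r'(s)|^2 + \beta |r''(s)|^2)$, and takes the negative of an edge-strength map as the external energy. The function name, the parameters `alpha` and `beta`, and their default values are illustrative. The sketch only evaluates the energy of a given contour; it does not perform the minimization and does not compute a GVF field.

```python
import numpy as np

def snake_energy(points, edge_map, alpha=0.1, beta=0.1):
    """Discrete energy of a closed contour given as an (N, 2) array of (row, col).

    Derivatives are taken with respect to the point index, and the integral
    over s in [0, 1] is approximated by the mean over the N contour points.
    """
    pts = np.asarray(points, dtype=np.float64)
    nxt = np.roll(pts, -1, axis=0)
    prv = np.roll(pts, 1, axis=0)
    d1 = nxt - pts                 # first difference: stretching
    d2 = nxt - 2.0 * pts + prv     # second difference: bending
    e_int = 0.5 * (alpha * (d1 ** 2).sum(axis=1) + beta * (d2 ** 2).sum(axis=1))
    # Sample the edge map at the nearest pixel to each contour point.
    rows = np.clip(np.rint(pts[:, 0]).astype(int), 0, edge_map.shape[0] - 1)
    cols = np.clip(np.rint(pts[:, 1]).astype(int), 0, edge_map.shape[1] - 1)
    # Assumed external energy: strong edges lower the energy.
    e_ext = -edge_map[rows, cols]
    return float((e_int + e_ext).mean())
```

A contour lying on strong edges should score lower than one lying in a flat region, which is the behaviour the minimization exploits.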

Experiments and results

The experiments in this paper focused on evaluating the performance of RGVF in both the nucleus and cytoplasm segmentation from single-cell cervical smear images. The Herlev dataset [3] containing 917 cervical smear images is used as the test data. It also provides manual segmentation ground truth for all images.

We divide the experiments into two parts: (1) Experiments on nucleus segmentation. Our method's result is compared with that of [6], which also tested its nuclei segmentation algorithm

Conclusion

This paper proposed a coarse-to-fine framework to segment the nucleus and cytoplasm from cervical smear images. Since the cervical cells may be stained in different colors, we first transform the color images into grayscale. The L dimension of the CIELAB color space, which represents lightness (contrast), is extracted and normalized to form the grayscale images. A non-local means filter is then used to remove noise from the grayscale images while preserving the edge sharpness of the objects.

Acknowledgment

We thank Professor George Dounias for providing the Herlev dataset and the anonymous reviewers for their constructive comments. The work described in this paper was fully supported by a grant from the City University of Hong Kong (Project no. 7002696), the Natural Science Foundation of China (under Grant nos. 91024012; 60603015; 60970034) and the Foundation for the Author of National Excellent Doctoral Dissertation (Grant no. 2007B4).

KUAN LI received his M.S. degree in control science and engineering from the National University of Defense Technology in 2007. He is currently working toward a Ph.D. degree in the College of Computer Science at the National University of Defense Technology. His research interests include medical image processing and pattern recognition.

References (24)

  • M. Plissiti et al., Automated detection of cell nuclei in Pap smear images using morphological reconstruction and clustering, IEEE Transactions on Information Technology in Biomedicine (2011)
  • F. Vaschetto, E. Montseny, P. Sobrevilla, E. Lerma, THREECOND: an automated and unsupervised three colour fuzzy-based...

    ZHI LU received his M.Sc. and M.Phil. degrees in computer science from the City University of Hong Kong, Hong Kong S.A.R. He is currently a first-year Ph.D. student in the Department of Computer Science at the City University of Hong Kong. His research interests include medical image processing and pattern recognition.

    WENYIN LIU received the B.E. and M.E. degrees in computer science from Tsinghua University, Beijing, and the D.Sc. degree from the Technion-Israel Institute of Technology, Haifa. He is an assistant professor in the Department of Computer Science at the City University of Hong Kong. Earlier he was a full-time researcher at Microsoft Research China/Asia. His research interests include question answering, anti-phishing, graphics recognition, and performance evaluation. In 2003, he was awarded the International Conference on Document Analysis and Recognition Outstanding Young Researcher Award by the International Association for Pattern Recognition (IAPR). He was the TC10 Chair of the IAPR from 2006 to 2010 and has been a guest professor at the University of Science and Technology of China (USTC) since 2005. He is on the editorial board of the International Journal of Document Analysis and Recognition (IJDAR). He is a fellow of the IAPR and a senior member of the IEEE.

    JIANPING YIN received his M.S. degree and Ph.D. degree in Computer Science from the National University of Defense Technology, China, in 1986 and 1990, respectively. He is a full professor of computer science in the National University of Defense Technology. His research interests involve artificial intelligence, pattern recognition, algorithm design, and information security.
