Elsevier

Pattern Recognition

Volume 47, Issue 11, November 2014, Pages 3614-3629
Pattern Recognition

Smoothing of HMM parameters for efficient recognition of online handwriting

https://doi.org/10.1016/j.patcog.2014.04.019Get rights and content

Highlights

  • An HMM based unconstrained online handwriting recognition is presented.

  • A novel approach to smoothing of HMM parameters has been considered.

  • A circular distribution is used to exploit the directional nature of input data.

  • A fully connected non-homogeneous HMM is used.

  • It is used for limited vocabulary recognition of unconstrained Bangla handwriting.

Abstract

In this paper, we propose a novel approach to limited vocabulary recognition of unconstrained (mixed cursive) handwriting based on a hidden Markov model (HMM). Here, an input word sample is segmented into sub-strokes. However, instead of recognizing individual sub-strokes, we recognize the whole word sample as a basic unit. In this study, we consider both circular and linear features. The von Mises distribution is used to model circular feature components while the Gaussian distribution is used for the linear features. We implement fully connected non-homogeneous HMMs considering the enormous variability in the present handwriting style which often includes delayed strokes. Since such an HMM involves a large number of parameters, we implement a simple but novel approach to smoothing of HMM parameters to avoid possible over-fitting and poor generalization. Finally, in the proposed approach, we combine recognition results of two HMMs: one considering the input sample in the natural order and the other considering the sample in the reverse order. A major goal of the present study is to develop an efficient recognition scheme for online unconstrained handwritten words of Bangla, a popular Indic script.

Introduction

With the availability of pen computers at increasingly affordable prices, online handwriting recognition studies have received much attention. In the literature, there are several reports [1], [2], [3], [4], [5], [6], [7] on such online handwriting recognition studies of different scripts such as Latin, Japanese, Korean, Chinese and Arabic. Although the majority of existing online handwriting recognition studies [8], [9], [10], [11], [12], [13], [14], [15], [16], [17], [18], [19], [20], [21] on the Indic scripts considered only isolated characters, recently online handwritten word recognition studies [22], [23], [24], [25], [26], [27], [28] of a few Indic scripts have gained some momentum. The earliest work on Bangla (a major Indic script) online handwriting recognition [20] was based on its isolated basic characters. Later, Bhattacharya et al. [14] studied a direction code based feature vector for recognition of online handwritten Bangla basic characters. In 2008, Parui et al. [12] first identified the set of strokes used to write the basic characters of Bangla. Hidden Markov models were used to identify these strokes and a look-up table was used to recognize the characters in the second stage. In another study [11], strokes used to write isolated Bangla characters were recognized using a DTW-based technique and the characters were identified by a rule based method. In [10], a sub-stroke based recognition study of online handwritten Bangla basic characters was presented. In this study, recognition performance of an HMM-based classifier was compared with that of a DTW-based nearest-neighbor classifier. The first benchmark recognition results on databases of online handwritten isolated characters of major Indic scripts including Bangla were reported in [9]. In a recent study [8] of online handwritten isolated Bangla characters, a database of stroke level samples was semi-automatically developed from the isolated character database presented in [9] and a hidden Markov model based character classifier was studied where the stroke classes formed the state space.

Unconstrained handwriting in Bangla is mixed cursive in nature unlike other Indic scripts. This style of handwriting is a general combination of cursively written connected word fragments and discretely written isolated characters. This is the most difficult style of handwriting from the view point of its automatic recognition although it is the most realistic style required to be studied. On the other hand, Bangla has other common peculiarities of an Indic script compared to several other scripts as discussed in [23], [29]. This script is used by around 230 million people in the Indian subcontinent. It has a set of 50 basic character shapes 11 of which are vowels and the remaining 39 are consonants. Modified shapes of 10 characters among the basic vowels and another two among the basic consonants occur frequently. Shapes of these 62 characters (50 basic characters, 10 vowel modifiers and 2 consonant modifiers) are shown in Fig. 1. These modifiers are in fact diacritics and these are the modified shapes of a few basic characters when they appear in proximity of other characters such as the vowel modifier ‘AAKAR’ is the modified shape of the basic vowel ‘AA’ or ‘IKAR’ is the modified shape of the basic vowel ‘I’ as shown in Fig. 1. Two or more from the above 62 shapes often combine forming a large number of compound characters with complicated shapes. Although there are around 250 different compound characters, their occurrence percentage in a typical corpus is very low [30].

Only a few recognition studies [24], [25], [26], [27] of unconstrained Bangla online handwritten words are available in the literature. In the earliest such study [27], a segmentation-based strategy was used and the sub-strokes of the input word were recognized using a modified quadratic discriminant function (MQDF) classifier. A verification module used a set of rules to recognize the word from MQDF outputs. Fink et al. [26] studied a sub-stroke level feature representation and four hidden Markov model based writing models for recognition of these words. In this study, an input word sample was segmented wherever the pen along the trajectory turns more than a threshold value. Among different writing models studied in this work the largest model considered each pseudo-character in the actual context of its left and right neighbors as a distinct unit. On the other hand, a hybrid recognition strategy described in [25] used a multilayer perceptron (MLP) architecture for extraction of features from input word and a support vector machine (SVM) classifier trained on a fixed lexicon was used to recognize the word. Another study [24] identified 85 stroke classes used in cursive Bangla handwriting and recognized the strokes obtained from an input word using chain code histogram features and the SVM classifier. In the present attempt, we studied a non-homogeneous fully connected HMM-based classifier and a combination of linear and circular feature components to develop an efficient holistic recognition scheme for unconstrained cursive Bangla handwriting which often includes delayed strokes. The present approach has the advantage of incremental learning towards addition of new word classes into the lexicon. Additionally, we studied a novel two-level smoothing during estimation of HMM parameters to tackle the common problem of existence of several strokes occurring rarely. The lexicon used in the experimentation of the present study is so designed that it includes various difficult situations occurring in the real life scenario.

In the literature, both analytical and holistic approaches to machine reading of unconstrained handwriting have been studied. In an analytical approach, the input word sample is first segmented into characters or pseudo-characters which are recognized at a later stage to build the word-level interpretation. On the other hand, in the holistic approach, an input sample is considered as a single entity and it is recognized using certain global features of the word sample as a whole. The latter may successfully recognize poorly written word samples by avoiding segmentation ambiguities. Justifications of such holistic approaches to word recognition can be found in psychological studies of human reading, which shows that humans do not, in general, read words letter by letter. The importance of holistic approaches to handwriting recognition has been discussed in several reports [31], [32], [33]. In the present study of online unconstrained handwritten Bangla word recognition, the input sample is segmented into a variable number of sub-strokes each of which is either a character or part of a character. Unlike a typical holistic approach, here we do not consider a feature representation of the whole word sample as the basic element. On the other hand, unlike an analytic scheme our recognition strategy does not attempt to recognize the individual characters of the input word sample but recognizes the whole word sample as one entity.

The directional slope angle θ is a natural choice as a feature component of online handwriting samples. However, since θ is inherently circular in nature, commonly used statistical distributions cannot model them. As an alternative, the directional quantity θ is sometimes replaced [34] by two linear quantities sinθ and cosθ although these two are significantly dependent and also they increase the feature dimension. On the other hand, methodologies such as the von Mises distribution [35] are available to deal with directional or circular feature components. In the present study, we use both circular and linear feature components. We use the von Mises distribution to model circular feature components while the Gaussian distribution is used to model linear feature components. The von Mises distribution has practical benefits regarding parameter estimation as described in Section 3.4.10 of the book [35].

HMM based recognition algorithms for such unconstrained handwriting are considered to be more flexible and potentially more robust due to the enormous variability inherent in similar input data [36]. Moreover, vital contextual information such as morphological, lexical, and syntactical may be successfully exploited in the recognition process with the help of such a stochastic modeling. In a number of reports [2], [7], [23], [26], [31], [37], [38], the general understanding of the applicability of HMM in handwriting recognition tasks has been verified. Jung et al. [5] used a fully connected hidden Markov model with continuous observation probabilities to classify on-line Korean character space into clusters. Advantages of using such an HMM architecture include (i) it can achieve the highest likelihood due to its larger number of parameters, (ii) it can achieve the fastest convergence compared to the simpler models [39], (iii) it can exploit a long observation sequence by using experiences in longer past in comparison to simple left-to-right HMM. Although fully connected HMMs have been used before, we did not find any such use of non-homogeneous HMM in handwriting recognition tasks. Such a non-homogeneous HMM allows the transition probabilities to be different at different time instances. In the present task of unconstrained Bangla handwriting recognition, considering the enormous variation in the stroke order, we implemented a fully connected non-homogeneous HMM.

Determination of the number of states of an HMM is a crucial design issue. In [40], the same number of states is used irrespective of the class while in [41], intuitive knowledge was applied to decide the number of states in each underlying class. Successive state-splitting and state-merging according to maximum likelihood criterion were studied in [42] and [43] respectively. Lee et al. [44] studied a state-tying approach based on the structural similarity of patterns in a class. This approach was simulated for recognition of online Hangul handwriting. In modeling complex shape patterns of unconstrained handwriting of a script involving a large number of characters of widely varying cursive shapes such as those of Bangla script, an adequate number of states is usually high. However, finding a proper choice of the same is usually tricky and here we have taken a two-stage approach for this purpose. In the first stage, we pool together all the sub-strokes obtained from the whole set of training samples of the underlying classes, cluster them by K-means algorithm and use these clusters to determine the states of individual classes. A validation set of samples is used to obtain an optimal value of K. In the second stage, for each individual class we identify the clusters that are relevant to it. The number of states of a new (word) class consisting of characters appearing in the present lexicon may be obtained by computing only the latter stage of the above.

Finally, in this work, we have proposed a novel approach to estimation of initial and transition probabilities of an HMM. For a given observation, the state is determined so that it has the maximum posterior probability. In the classical approach, given an observation pair, the winner pair of states takes all where even the closely competing pairs of states do not get any contribution. This may cause overfitting of the training data affecting the generalization capability. In the proposed approach, we use a certain diffusion technique in estimating the transition and initial probabilities. Here, a pair of observations, instead of generating only one pair of winning states, contributes probabilities to all pairs of states such that a pair of states with higher posterior probabilities gets a higher contribution. Similarly, the initial state probabilities are also computed from the posterior probabilities, rather than determining the initial state of a training sample on the basis of the state giving the maximum posterior probability. Also, we diffuse a transition probability at a time instance over the preceding and the following time instances.

The rest of the paper is organized as follows. We describe our preprocessing operations in Section 2. Feature extraction approach is detailed in Section 3. Section 4 provides the detailed description of underlying HMMs, and Section 5 describes the estimation of the parameters of these HMMs. Section 7 provides the results of our simulations. Section 8 concludes the paper.

Section snippets

Preprocessing

As usual, the aim of our preprocessing operations is to reduce variations among samples of the same class. First of all, the repeated sample points (i.e., successive occurrences of the same point) are removed. This is followed by application of moving average operation for two times using window size 3. It helps to reduce noise. In the next step, we resample the points along each stroke separately so that the resulting sample points become equidistant. The size normalization of the word sample

Feature extraction

Generally, in a holistic recognition approach, the input word sample is treated as a single, indivisible entity for its feature representation [32]. However, the object recognition process among humans considers both the entire shape and the shapes of its various local parts. In [45], it has been observed that although geometrical features are important clues for human reading of cursive handwriting, special attentions are paid to specific parts of the word. In the present study, we first

Designing of HMM classifier

Several existing studies [48], [49], [50], [51] of online cursive handwriting recognition are based on HMMs. Such an HMM is a doubly embedded stochastic process with an underlying hidden process which can be observed through a second set of stochastic processes producing a sequence of observations [52]. A hidden process has several connected states. An HMM is completely specified by its number of states (K) and three probability distributions (π, A and B). The following notations will be used

Smoothing of HMM parameters

Often estimation of the initial and transition probabilities is adversely affected due to insufficient number of the training samples. In particular, some of these probabilities may be estimated as zero on the basis of the training samples, which may lead to over-fitting. To avoid this, that is, to make the estimated probabilities to have better generalization capabilities, certain smoothing operations are performed on the initial and transition probabilities. A trivial strategy is to smooth

Incremental learning

As described in Section 4, the generation of the proposed HMMs for the present classification task has two distinct steps. In the first step, the set of states is determined in an overall manner. In this step the underlying word classes are not dealt with individually but we pool together all the sub-strokes obtained from the entire set of training samples from all the word classes. This set of sub-strokes is clustered to identify the states of individual HMMs each representing a distinct word

Experimental results

For simulation purpose, we designed a lexicon (Lexe) of 110 Bangla words and developed a database of online handwritten samples on the basis of this lexicon. A few samples from this database are shown in Fig. 9. Lexe represents a general scenario in the sense that it includes various typical situations such as pairs of words with identical right tails like

and
or identical left tails such as
and
or identical at both the tails like
and
or one complete word appearing as the right tail of

Conclusions

In this work, we studied limited vocabulary recognition of unconstrained online handwritten Bangla words. A similar study of mixed cursive handwriting involving a large character set is rarely found. The lexicon set used in the present study has been carefully selected to include various possible difficulties. Difficult situations include lexicon set (i) consisting of words of lengths as small as 2 and also as large as 10, (ii) consisting of several groups of words with identical left or right

Conflict of interest

None declared.

Acknowledgments

This research work has been partially supported by the Technology Development for Indian Languages (TDIL) Programme of the Dept. of Information Technology, Govt. of India.

Oendrilla Samanta received her Bachelor degree with Hons. in Mathematics from Calcutta University in 2005 and Master of Computer Application from West Bengal University of Technology in 2008. She joined the Indian Statistical Institute, Kolkata in 2008 as a research scholar. She is working on online handwriting recognition of Bangla towards her Ph.D. thesis.

References (54)

  • H.J. Kim et al.

    An HMM-based character recognition network using level building

    Pattern Recognit.

    (1997)
  • J. Hu et al.

    Writer independent on-line handwriting recognition using an HMM approach

    Pattern Recognit.

    (2000)
  • C.L. Liu et al.

    Online recognition of Chinese charactersthe state-of-the-art

    IEEE Trans. Pattern Anal. Mach. Intell.

    (2004)
  • S. Jaeger et al.

    The state of art in Japanese online handwriting recognition compared to techniques in western handwriting recognition

    Int. J. Doc. Anal. Recognit.

    (2003)
  • R. Plamondon et al.

    On-line and off-line handwriting recognition: a comprehensive survey

    IEEE Trans. Pattern Anal. Mach. Intell.

    (2000)
  • C. Biswas, U. Bhattacharya, S.K. Parui, HMM based online handwritten Bangla character recognition using Dirichlet...
  • T. Mondal, U. Bhattacharya, S.K. Parui, K. Das, D. Mandalapu, On-line handwriting recognition of Indian scripts—the...
  • T. Mondal, U. Bhattacharya, S.K. Parui, K. Das, V. Roy, Database generation and recognition of online handwritten...
  • A. Bandyopadhyay, B. Chakraborty, Development of online handwriting recognition system: a case study with handwritten...
  • S.K. Parui, K. Guin, U. Bhattacharya, B.B. Chaudhuri, Online handwritten Bangla character recognition using HMM, in:...
  • A. Bharath, S. Madhvanath, FreePad: a novel handwriting based text input for pen and touch interfaces, in: Proceedings...
  • U. Bhattacharya, B.K. Gupta, S.K. Parui, Direction code based features for recognition of online handwritten characters...
  • V.J. Babu, L. Prashanth, R.R. Sharma, G.V.P. Rao, A. Bharath, HMM-based online handwriting recognition system for...
  • H. Swethalakshmi, C.C. Sekhar, V.S. Chakravarthy, Spatiostructural features for recognition of online handwritten...
  • H. Swethalakshmi, A. Jayaraman, V.S. Chakravarthy, C.C. Sekhar, Online handwritten character recognition of Devanagari...
  • N. Joshi, G. Sita, A.G. Ramakrishnan, V. Deepu, S. Madhvanath, Machine recognition of online handwritten Devanagari...
  • K.H. Aparna, V. Subramanian, M. Kasirajan, G.V. Prakash, V.S. Chakravarthy, S. Madhvanath, Online handwriting...
  • Cited by (41)

    • Age, gender and handedness prediction using handwritten text: A comprehensive survey

      2024, Engineering Applications of Artificial Intelligence
    • RNN based online handwritten word recognition in Devanagari and Bengali scripts using horizontal zoning

      2019, Pattern Recognition
      Citation Excerpt :

      After relating the popularity distribution of Indian languages shown in this figure with the scripts used to write these languages, it can be established that Devanagari and Bengali are the most and second most popular Indian scripts respectively. Some research works are available on online handwritten word recognition in different Indian scripts as well such as Devanagari [15,16], Bengali [12–15], and Tamil scripts [16–18]. Online recognition of handwritten text in Indian scripts faces some unique challenges as compared to Latin script due to large number of symbols, symbol order variations—especially the variations in temporal ordering of vowel modifiers occur in different samples of the same word.

    • An HMM framework based on spherical-linear features for online cursive handwriting recognition

      2018, Information Sciences
      Citation Excerpt :

      On the other hand, Schomaker and Segers [44] observed that although geometrical shape of a word is an important clue for human reading of cursive handwriting, special attentions are paid to its specific parts. In fact, explicit segmentation of the input word sample was used as a preprocessing step in some holistic handwriting recognition studies [27,41,42]. Segmentation-based and segmentation-free recognition approaches have been compared in [50].

    View all citing articles on Scopus

    Oendrilla Samanta received her Bachelor degree with Hons. in Mathematics from Calcutta University in 2005 and Master of Computer Application from West Bengal University of Technology in 2008. She joined the Indian Statistical Institute, Kolkata in 2008 as a research scholar. She is working on online handwriting recognition of Bangla towards her Ph.D. thesis.

    Ujjwal Bhattacharya received B. Sc. (Hons.), M. Sc. and M. Phil. degrees in Pure Mathematics from the Calcutta University in the years 1986, 1989 and 1991 respectively. He did Post-graduate Diploma in Computer Applications in 1990. He joined the Indian Statistical Institute, Kolkata in 1991 as a research scholar. He received National Scholarship (1980, 1982), ISCA Young Scientist Award (1995) and Amiya K. Pujari Award for best paper (2006). The topic of his Ph.D. thesis was offline handwritten Bangla character recognition. He is a member of the IUPRAI affiliated to IAPR. He joined the Computer Vision and Pattern Recognition Unit of Indian Statistical Institute as a faculty member in 1999. He is currently working at the same place in an Associate Professor (equiv.) position. His research interests include online/offline handwriting recognition, analysis of camera captured scene texts, historical document analysis and machine learning. He was one of the Guest Editors of the forthcoming Special Issue of Pattern Recognition Letter Journal on Frontiers of Handwriting Processing.

    S.K. Parui received his Master׳s degree in Statistics and his Ph.D. degree from the Indian Statistical Institute in 1975 and 1986, respectively. He is now a Professor in Computer Vision and Pattern Recognition Unit of the Indian Statistical Institute. His current research interests include image processing, handwriting recognition, statistical pattern recognition, neural networks and information retrieval. Prof. Parui has published over 130 papers in refereed journals, edited volumes and conference proceedings. He has supervised seven Ph.D. students and is currently supervising the Ph.D. work of three students.

    View full text