Using generative models for handwritten digit recognition

M. Revow, C. K. I. Williams, G. E. Hinton

    Research output: Contribution to journal › Article

    Abstract

    We describe a method of recognizing handwritten digits by fitting generative models that are built from deformable B-splines with Gaussian "ink generators" spaced along the length of the spline. The splines are adjusted using a novel elastic matching procedure based on the Expectation Maximization (EM) algorithm that maximizes the likelihood of the model generating the data. This approach has many advantages. (1) After identifying the model most likely to have generated the data, the system produces not only a classification of the digit but also a rich description of the instantiation parameters, which can yield information such as the writing style. (2) During the process of explaining the image, generative models can perform recognition-driven segmentation. (3) The method involves a relatively small number of parameters, and hence training is relatively easy and fast. (4) Unlike many other recognition schemes, it does not rely on some form of pre-normalization of input images, but can handle arbitrary scalings, translations and a limited degree of image rotation. We have demonstrated that our method of fitting models to images does not get trapped in poor local minima. The main disadvantage of the method is that it requires much more computation than more standard OCR techniques.
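The elastic matching idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract gives no algorithmic details, so the sketch assumes a uniform cubic B-spline, beads spaced evenly in the spline parameter (not in arc length), equal mixing weights, and a fixed isotropic variance `sigma`. The function names `bspline_basis`, `em_step` and `fit_spline` are hypothetical. Each EM iteration computes the responsibility of every Gaussian bead for every inked pixel, then moves the control points by solving a weighted least-squares problem.

```python
import numpy as np


def bspline_basis(n_beads, n_ctrl):
    """Uniform cubic B-spline basis, shape (n_beads, n_ctrl), open curve.

    Assumption: beads are spaced evenly in the spline parameter.
    Requires n_ctrl >= 4.
    """
    n_seg = n_ctrl - 3
    t = np.linspace(0.0, n_seg, n_beads, endpoint=False)
    seg = t.astype(int)          # index of the spline segment for each bead
    u = t - seg                  # local parameter in [0, 1)
    weights = np.stack(
        [
            (1 - u) ** 3,
            3 * u**3 - 6 * u**2 + 4,
            -3 * u**3 + 3 * u**2 + 3 * u + 1,
            u**3,
        ],
        axis=1,
    ) / 6.0
    B = np.zeros((n_beads, n_ctrl))
    rows = np.arange(n_beads)
    for k in range(4):
        B[rows, seg + k] = weights[:, k]
    return B


def em_step(ctrl, B, ink, sigma):
    """One EM iteration for a Gaussian mixture whose means lie on a spline.

    ctrl: (n_ctrl, 2) control points, ink: (n_ink, 2) inked pixel positions.
    Returns the updated control points and the log-likelihood of `ink`
    under the control points that were passed in.
    """
    mu = B @ ctrl                                            # bead centres
    d2 = ((ink[:, None, :] - mu[None, :, :]) ** 2).sum(-1)   # (n_ink, n_beads)
    log_p = (
        -d2 / (2 * sigma**2)
        - np.log(2 * np.pi * sigma**2)
        - np.log(len(mu))                                    # equal mixing weights
    )
    log_norm = np.logaddexp.reduce(log_p, axis=1)
    resp = np.exp(log_p - log_norm[:, None])                 # E-step

    # M-step: weighted least squares for the control points.
    w = resp.sum(axis=0)
    A = B.T @ (w[:, None] * B)
    b = B.T @ (resp.T @ ink)
    new_ctrl = np.linalg.lstsq(A, b, rcond=None)[0]
    return new_ctrl, log_norm.sum()


def fit_spline(ctrl, B, ink, sigma=0.3, n_iter=20):
    """Run several EM iterations; return final control points and the
    log-likelihood recorded at the start of each iteration."""
    history = []
    for _ in range(n_iter):
        ctrl, log_lik = em_step(ctrl, B, ink, sigma)
        history.append(log_lik)
    return ctrl, history
```

Calling `fit_spline(initial_ctrl, bspline_basis(24, 6), ink)` returns the fitted control points and a log-likelihood trace that should not decrease from one iteration to the next. A recognizer along the lines of the abstract would fit one such model per digit class and compare the resulting scores; that comparison needs a penalty on deformation, which this sketch leaves out.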
    Original language: English
    Pages (from-to): 592-606
    Number of pages: 15
    Journal: IEEE Transactions on Pattern Analysis and Machine Intelligence
    Volume: 18
    Issue number: 6
    Publication status: Published - Jun 1996

    Bibliographical note

    ©1996 IEEE. Personal use of this material is permitted. However, permission to reprint/republish this material for advertising or promotional purposes or for creating new collective works for resale or redistribution to servers or lists, or to reuse any copyrighted component of this work in other works must be obtained from the IEEE.

    Keywords

    • deformable model
    • elastic net
    • optical character recognition
    • generative model
    • probabilistic model
    • mixture model

    Cite this

    Revow, M. ; Williams, C. K. I. ; Hinton, G. E. / Using generative models for handwritten digit recognition. In: IEEE Transactions on Pattern Analysis and Machine Intelligence. 1996 ; Vol. 18, No. 6. pp. 592-606.
    @article{d2a5e6c4a031425aa44248611cdfda55,
    title = "Using generative models for handwritten digit recognition",
    keywords = "deformable model, elastic net, optical character recognition, generative model, probabilistic model, mixture model",
    author = "Revow, M. and Williams, {C. K. I.} and Hinton, {G. E.}",
    note = "{\textcopyright} 1996 IEEE. Personal use of this material is permitted. However, permission to reprint/republish this material for advertising or promotional purposes or for creating new collective works for resale or redistribution to servers or lists, or to reuse any copyrighted component of this work in other works must be obtained from the IEEE.",
    year = "1996",
    month = "6",
    language = "English",
    volume = "18",
    pages = "592--606",
    journal = "IEEE Transactions on Pattern Analysis and Machine Intelligence",
    issn = "0162-8828",
    publisher = "IEEE",
    number = "6",

    }

    TY - JOUR

    T1 - Using generative models for handwritten digit recognition

    AU - Revow, M.

    AU - Williams, C. K. I.

    AU - Hinton, G. E.

    N1 - ©1996 IEEE. Personal use of this material is permitted. However, permission to reprint/republish this material for advertising or promotional purposes or for creating new collective works for resale or redistribution to servers or lists, or to reuse any copyrighted component of this work in other works must be obtained from the IEEE.

    PY - 1996/6

    Y1 - 1996/6

    KW - deformable model

    KW - elastic net

    KW - optical character recognition

    KW - generative model

    KW - probabilistic model

    KW - mixture model

    UR - http://ieeexplore.ieee.org/xpls/abs_all.jsp?arnumber=506410

    M3 - Article

    VL - 18

    SP - 592

    EP - 606

    JO - IEEE Transactions on Pattern Analysis and Machine Intelligence

    JF - IEEE Transactions on Pattern Analysis and Machine Intelligence

    SN - 0162-8828

    IS - 6

    ER -