A hybrid generative/discriminative framework to train a semantic parser from an un-annotated corpus

Deyu Zhou, Yulan He

Research output: Chapter in Book/Report/Conference proceeding › Other chapter contribution

Abstract

We propose a hybrid generative/discriminative framework for semantic parsing that combines the hidden vector state (HVS) model with hidden Markov support vector machines (HM-SVMs). The HVS model extends the basic discrete Markov model by encoding context as a stack-oriented state vector. HM-SVMs combine the advantages of hidden Markov models and support vector machines. Using a modified K-means clustering method, a small set of the most representative sentences is automatically selected from an un-annotated corpus. These sentences, together with their abstract annotations, are used to train an HVS model, which is then applied to the whole corpus to generate semantic parses. The most confident parses are selected to form a fully annotated corpus, which is used to train the HM-SVMs. The framework was tested on the DARPA Communicator data. Experimental results show that the hybrid framework improves on the baseline HVS parser. Compared with HM-SVMs trained on the fully annotated corpus, the hybrid framework achieved comparable performance using only a small set of lightly annotated sentences.
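The data-selection steps in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it uses plain K-means rather than the paper's modified variant, represents sentences as bag-of-words count vectors (an assumption), and filters parses by a numeric confidence score whose origin is left open. All function names (`select_representatives`, `select_confident`, and the helpers) are hypothetical.

```python
import random
from collections import Counter


def bag_of_words(sentence, vocab):
    """Represent a sentence as word counts over a fixed vocabulary (assumed features)."""
    counts = Counter(sentence.lower().split())
    return [float(counts[w]) for w in vocab]


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def mean(vectors):
    """Component-wise mean of a non-empty list of vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]


def kmeans(vectors, k, iters=20, seed=0):
    """Plain K-means (not the paper's modified variant); requires k <= len(vectors)."""
    rng = random.Random(seed)
    centroids = [list(v) for v in rng.sample(vectors, k)]
    assign = [0] * len(vectors)
    for _ in range(iters):
        # Assignment step: each vector goes to its nearest centroid.
        assign = [min(range(k), key=lambda c: sq_dist(v, centroids[c]))
                  for v in vectors]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [v for v, a in zip(vectors, assign) if a == c]
            if members:  # an empty cluster keeps its previous centroid
                centroids[c] = mean(members)
    return centroids, assign


def select_representatives(sentences, k, seed=0):
    """Pick, for each non-empty cluster, the sentence closest to its centroid."""
    vocab = sorted({w for s in sentences for w in s.lower().split()})
    vectors = [bag_of_words(s, vocab) for s in sentences]
    centroids, assign = kmeans(vectors, k, seed=seed)
    reps = []
    for c in range(k):
        members = [i for i, a in enumerate(assign) if a == c]
        if members:
            best = min(members, key=lambda i: sq_dist(vectors[i], centroids[c]))
            reps.append(sentences[best])
    return reps


def select_confident(parses, threshold):
    """Keep (sentence, parse) pairs whose score reaches the threshold.

    `parses` is a list of (sentence, parse, score) triples; how the score
    is computed by the parser is outside this sketch.
    """
    return [(s, p) for s, p, score in parses if score >= threshold]
```

In this sketch, `select_representatives(corpus, k)` would supply the small set of sentences to annotate for the first-stage parser, and `select_confident(parses, threshold)` would filter that parser's output into training data for the second-stage model. Because clusters can end up empty, fewer than `k` representatives may be returned.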
Original language: English
Title of host publication: COLING '08
Subtitle of host publication: proceedings of the 22nd international conference on computational linguistics
Editors: Donia Scott, Hans Uszkoreit
Place of Publication: Stroudsburg, PA (US)
Publisher: Association for Computational Linguistics
Pages: 1113-1120
Number of pages: 8
Volume: 1
ISBN (Print): 978-1-905593-44-6
Publication status: Published - 1 Jan 2008

Fingerprint

Support vector machines
Semantics
Hidden Markov models

Bibliographical note

© 2008. Licensed under the Creative Commons Attribution-Noncommercial-Share Alike 3.0 Unported license (http://creativecommons.org/licenses/by-nc-sa/3.0/).
Some rights reserved.

Cite this

Zhou, D., & He, Y. (2008). A hybrid generative/discriminative framework to train a semantic parser from an un-annotated corpus. In D. Scott, & H. Uszkoreit (Eds.), COLING '08: proceedings of the 22nd international conference on computational linguistics (Vol. 1, pp. 1113-1120). Stroudsburg, PA (US): Association for Computational Linguistics.
Zhou, Deyu; He, Yulan. / A hybrid generative/discriminative framework to train a semantic parser from an un-annotated corpus. COLING '08: proceedings of the 22nd international conference on computational linguistics. editor / Donia Scott; Hans Uszkoreit. Vol. 1. Stroudsburg, PA (US): Association for Computational Linguistics, 2008. pp. 1113-1120.
@inbook{b42c6dfa851242d7b9eabceadb505eb0,
title = "A hybrid generative/discriminative framework to train a semantic parser from an un-annotated corpus",
abstract = "We propose a hybrid generative/discriminative framework for semantic parsing which combines the hidden vector state (HVS) model and the hidden Markov support vector machines (HM-SVMs). The HVS model is an extension of the basic discrete Markov model in which context is encoded as a stack-oriented state vector. The HM-SVMs combine the advantages of the hidden Markov models and the support vector machines. By employing a modified K-means clustering method, a small set of most representative sentences can be automatically selected from an un-annotated corpus. These sentences together with their abstract annotations are used to train an HVS model which could be subsequently applied on the whole corpus to generate semantic parsing results. The most confident semantic parsing results are selected to generate a fully-annotated corpus which is used to train the HM-SVMs. The proposed framework has been tested on the DARPA Communicator Data. Experimental results show that an improvement over the baseline HVS parser has been observed using the hybrid framework. When compared with the HM-SVMs trained from the fully-annotated corpus, the hybrid framework gave a comparable performance with only a small set of lightly annotated sentences.",
author = "Deyu Zhou and Yulan He",
note = "{\textcopyright} 2008. Licensed under the Creative Commons Attribution-Noncommercial-Share Alike 3.0 Unported license (http://creativecommons.org/licenses/by-nc-sa/3.0/). Some rights reserved.",
year = "2008",
month = "1",
day = "1",
language = "English",
isbn = "978-1-905593-44-6",
volume = "1",
pages = "1113--1120",
editor = "Donia Scott and Hans Uszkoreit",
booktitle = "COLING '08",
publisher = "Association for Computational Linguistics",
address = "Stroudsburg, PA (US)",
}

Zhou, D & He, Y 2008, A hybrid generative/discriminative framework to train a semantic parser from an un-annotated corpus. in D Scott & H Uszkoreit (eds), COLING '08: proceedings of the 22nd international conference on computational linguistics. vol. 1, Association for Computational Linguistics, Stroudsburg, PA (US), pp. 1113-1120.

TY - CHAP

T1 - A hybrid generative/discriminative framework to train a semantic parser from an un-annotated corpus

AU - Zhou, Deyu

AU - He, Yulan

N1 - © 2008. Licensed under the Creative Commons Attribution-Noncommercial-Share Alike 3.0 Unported license (http://creativecommons.org/licenses/by-nc-sa/3.0/). Some rights reserved.

PY - 2008/1/1

Y1 - 2008/1/1

N2 - We propose a hybrid generative/discriminative framework for semantic parsing which combines the hidden vector state (HVS) model and the hidden Markov support vector machines (HM-SVMs). The HVS model is an extension of the basic discrete Markov model in which context is encoded as a stack-oriented state vector. The HM-SVMs combine the advantages of the hidden Markov models and the support vector machines. By employing a modified K-means clustering method, a small set of most representative sentences can be automatically selected from an un-annotated corpus. These sentences together with their abstract annotations are used to train an HVS model which could be subsequently applied on the whole corpus to generate semantic parsing results. The most confident semantic parsing results are selected to generate a fully-annotated corpus which is used to train the HM-SVMs. The proposed framework has been tested on the DARPA Communicator Data. Experimental results show that an improvement over the baseline HVS parser has been observed using the hybrid framework. When compared with the HM-SVMs trained from the fully-annotated corpus, the hybrid framework gave a comparable performance with only a small set of lightly annotated sentences.

UR - http://www.scopus.com/inward/record.url?scp=80053418275&partnerID=8YFLogxK

M3 - Other chapter contribution

AN - SCOPUS:80053418275

SN - 978-1-905593-44-6

VL - 1

SP - 1113

EP - 1120

BT - COLING '08

A2 - Scott, Donia

A2 - Uszkoreit, Hans

PB - Association for Computational Linguistics

CY - Stroudsburg, PA (US)

ER -

Zhou D, He Y. A hybrid generative/discriminative framework to train a semantic parser from an un-annotated corpus. In Scott D, Uszkoreit H, editors, COLING '08: proceedings of the 22nd international conference on computational linguistics. Vol. 1. Stroudsburg, PA (US): Association for Computational Linguistics. 2008. p. 1113-1120