How weak categorizers based upon different principles strengthen performance

Victoria S. Uren; Thomas R. Addis

doi:10.1093/comjnl/45.5.511

Operations & Information Management

Research output: Contribution to journal › Article › peer-review

Abstract

Combining the results of classifiers has shown much promise in machine learning generally. However, published work on combining text categorizers suggests that, for this particular application, improvements in performance are hard to attain. Explorative research using a simple voting system is presented and discussed in the light of a probabilistic model that was originally developed for safety critical software. It was found that typical categorization approaches produce predictions which are too similar for combining them to be effective since they tend to fail on the same records. Further experiments using two less orthodox categorizers are also presented which suggest that combining text categorizers can be successful, provided the essential element of ‘difference’ is considered.
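The point about 'difference' can be illustrated with a minimal sketch of simple majority voting. This is not the authors' voting system or their data: the labels, the toy predictions, and the helper names `majority_vote`, `combine`, and `accuracy` are all assumptions made for illustration. It contrasts three categorizers that fail on the same records with three that fail on different records, each with the same individual accuracy.

```python
from collections import Counter


def majority_vote(votes):
    """Return the label that most categorizers assigned to one record."""
    return Counter(votes).most_common(1)[0][0]


def combine(categorizer_outputs):
    """Combine per-categorizer prediction lists record by record."""
    return [majority_vote(votes) for votes in zip(*categorizer_outputs)]


def accuracy(predicted, truth):
    """Fraction of records whose predicted label matches the true label."""
    return sum(p == t for p, t in zip(predicted, truth)) / len(truth)


# Hypothetical ground truth for six records (illustrative only).
truth = ["a", "b", "a", "b", "a", "b"]

# Three similar categorizers: each is 4/6 accurate and all fail on the
# same two records (the last two).
similar = [
    ["a", "b", "a", "b", "b", "a"],
    ["a", "b", "a", "b", "b", "a"],
    ["a", "b", "a", "b", "b", "a"],
]

# Three different categorizers: each is also 4/6 accurate, but each
# fails on a different pair of records.
different = [
    ["b", "a", "a", "b", "a", "b"],  # wrong on records 0 and 1
    ["a", "b", "b", "a", "a", "b"],  # wrong on records 2 and 3
    ["a", "b", "a", "b", "b", "a"],  # wrong on records 4 and 5
]

similar_accuracy = accuracy(combine(similar), truth)
different_accuracy = accuracy(combine(different), truth)
```

In this toy setup, voting over the similar categorizers leaves accuracy unchanged because their shared errors always win the vote, while voting over the different categorizers corrects every record because each error is outvoted by two correct predictions.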

Original language	English
Pages (from-to)	511-524
Number of pages	14
Journal	Computer Journal
Volume	45
Issue number	5
DOIs	https://doi.org/10.1093/comjnl/45.5.511
Publication status	Published - 2002

Cite this

@article{6d2ebb5f0fb74d319712e7de2debfc51,

title = "How weak categorizers based upon different principles strengthen performance",

abstract = "Combining the results of classifiers has shown much promise in machine learning generally. However, published work on combining text categorizers suggests that, for this particular application, improvements in performance are hard to attain. Explorative research using a simple voting system is presented and discussed in the light of a probabilistic model that was originally developed for safety critical software. It was found that typical categorization approaches produce predictions which are too similar for combining them to be effective since they tend to fail on the same records. Further experiments using two less orthodox categorizers are also presented which suggest that combining text categorizers can be successful, provided the essential element of {\textquoteleft}difference{\textquoteright} is considered. ",

author = "Uren, {Victoria S.} and Addis, {Thomas R.}",

year = "2002",

doi = "10.1093/comjnl/45.5.511",

language = "English",

volume = "45",

pages = "511--524",

journal = "Computer Journal",

issn = "0010-4620",

publisher = "Oxford University Press",

number = "5",

}

TY - JOUR

T1 - How weak categorizers based upon different principles strengthen performance

AU - Uren, Victoria S.

AU - Addis, Thomas R.

PY - 2002

Y1 - 2002

N2 - Combining the results of classifiers has shown much promise in machine learning generally. However, published work on combining text categorizers suggests that, for this particular application, improvements in performance are hard to attain. Explorative research using a simple voting system is presented and discussed in the light of a probabilistic model that was originally developed for safety critical software. It was found that typical categorization approaches produce predictions which are too similar for combining them to be effective since they tend to fail on the same records. Further experiments using two less orthodox categorizers are also presented which suggest that combining text categorizers can be successful, provided the essential element of ‘difference’ is considered.

AB - Combining the results of classifiers has shown much promise in machine learning generally. However, published work on combining text categorizers suggests that, for this particular application, improvements in performance are hard to attain. Explorative research using a simple voting system is presented and discussed in the light of a probabilistic model that was originally developed for safety critical software. It was found that typical categorization approaches produce predictions which are too similar for combining them to be effective since they tend to fail on the same records. Further experiments using two less orthodox categorizers are also presented which suggest that combining text categorizers can be successful, provided the essential element of ‘difference’ is considered.

UR - http://comjnl.oxfordjournals.org/content/45/5/511.abstract

U2 - 10.1093/comjnl/45.5.511

DO - 10.1093/comjnl/45.5.511

M3 - Article

SN - 0010-4620

VL - 45

SP - 511

EP - 524

JO - Computer Journal

JF - Computer Journal

IS - 5

ER -
