Does preprocessing really impact automatically generated taxonomy

R. Hafeez, S. Khan, I.A. Khan, M.A. Abbas

Research output: Chapter in Book/Published conference output > Conference publication

Abstract

Preprocessing is an essential first step in automatically generating a taxonomy from text documents, because text data is unstructured and is more inconsistent and noisy than structured data. Different combinations of preprocessing techniques have been applied in taxonomy generation to amplify pertinent information for further analysis and processing. This research investigates the impact of various preprocessing techniques on the quality of the generated taxonomy. Various combinations of preprocessing techniques were applied in taxonomy generation on two text data sets selected from different domains. The experimental results revealed that selecting a suitable combination of preprocessing techniques can improve the quality of the automatically generated taxonomy. However, applying all preprocessing techniques during generation does not guarantee high quality.
Original language: English
Title of host publication: Proceedings - 2017 13th International Conference on Emerging Technologies, ICET2017
Publisher: IEEE
Number of pages: 6
ISBN (Electronic): 978-1-5386-2260-5
Publication status: Published - 8 Feb 2018
