Does preprocessing really impact automatically generated taxonomy

R. Hafeez; S. Khan; I.A. Khan; M.A. Abbas

doi:10.1109/ICET.2017.8281710

Research output: Chapter in Book/Published conference output › Conference publication

Abstract

Preprocessing is an essential first step in automatically generating a taxonomy from text documents, because text data is unstructured and is more inconsistent and noisy than structured data. Different combinations of preprocessing techniques have been applied in taxonomy generation to amplify pertinent information for further analysis and processing. This research investigates the impact of various preprocessing techniques on the quality of the generated taxonomy. Various combinations of preprocessing techniques were applied in taxonomy generation on two text data sets selected from different domains. The experimental results revealed that selecting a suitable combination of preprocessing techniques can improve the quality of the automatically generated taxonomy. However, applying all preprocessing techniques does not guarantee high quality.

Original language	English
Title of host publication	Proceedings - 2017 13th International Conference on Emerging Technologies, ICET2017
Publisher	IEEE
Number of pages	6
ISBN (Electronic)	978-1-5386-2260-5
DOIs	https://doi.org/10.1109/ICET.2017.8281710
Publication status	Published - 8 Feb 2018

Cite this

@inproceedings{a018c00d7fb742e0838ad81067eaa86c,
  title = "Does preprocessing really impact automatically generated taxonomy",
  abstract = "Preprocessing is an essential and primary step in generating taxonomy automatically for text documents because text data is unstructured; and more inconsistent and noisy than structured data. Different combinations of preprocessing techniques have been applied in generating taxonomy to amplify pertinent information for further analysis and processing. This research investigates the impact of various preprocessing techniques on the quality of the generated taxonomy. Various combinations of preprocessing techniques have been applied in taxonomy generation on two text data sets, selected from different domains. The experimental results revealed that selecting a suitable combination of preprocessing techniques can improve the quality of automated taxonomy. However applying all preprocessing techniques in the generation does not guarantee high quality.",
  author = "R. Hafeez and S. Khan and I.A. Khan and M.A. Abbas",
  year = "2018",
  month = feb,
  day = "8",
  doi = "10.1109/ICET.2017.8281710",
  language = "English",
  booktitle = "Proceedings - 2017 13th International Conference on Emerging Technologies, ICET2017",
  publisher = "IEEE",
  address = "United States",
}

TY  - GEN
T1  - Does preprocessing really impact automatically generated taxonomy
AU  - Hafeez, R.
AU  - Khan, S.
AU  - Khan, I.A.
AU  - Abbas, M.A.
PY  - 2018/2/8
Y1  - 2018/2/8
N2  - Preprocessing is an essential and primary step in generating taxonomy automatically for text documents because text data is unstructured; and more inconsistent and noisy than structured data. Different combinations of preprocessing techniques have been applied in generating taxonomy to amplify pertinent information for further analysis and processing. This research investigates the impact of various preprocessing techniques on the quality of the generated taxonomy. Various combinations of preprocessing techniques have been applied in taxonomy generation on two text data sets, selected from different domains. The experimental results revealed that selecting a suitable combination of preprocessing techniques can improve the quality of automated taxonomy. However applying all preprocessing techniques in the generation does not guarantee high quality.
AB  - Preprocessing is an essential and primary step in generating taxonomy automatically for text documents because text data is unstructured; and more inconsistent and noisy than structured data. Different combinations of preprocessing techniques have been applied in generating taxonomy to amplify pertinent information for further analysis and processing. This research investigates the impact of various preprocessing techniques on the quality of the generated taxonomy. Various combinations of preprocessing techniques have been applied in taxonomy generation on two text data sets, selected from different domains. The experimental results revealed that selecting a suitable combination of preprocessing techniques can improve the quality of automated taxonomy. However applying all preprocessing techniques in the generation does not guarantee high quality.
UR  - http://www.scopus.com/inward/record.url?eid=2-s2.0-85050507561&partnerID=MN8TOARS
UR  - https://ieeexplore.ieee.org/document/8281710
U2  - 10.1109/ICET.2017.8281710
DO  - 10.1109/ICET.2017.8281710
M3  - Conference publication
BT  - Proceedings - 2017 13th International Conference on Emerging Technologies, ICET2017
PB  - IEEE
ER  -
