Research project selection is an important task for government and private research funding agencies. When a large number of research proposals are received, it is common to group them according to their similarities in research disciplines. The grouped proposals are then assigned to the appropriate experts for peer review. Current methods for grouping proposals are based on manual matching of similar research discipline areas and/or keywords. However, applicants often cannot designate the exact research discipline areas of their proposals accurately, owing to their subjective views and possible misinterpretations. The rich information in the proposals' full text can therefore be put to effective use. Text-mining methods have been proposed to solve the problem by automatically classifying text documents, mainly in English. However, these methods have limitations when dealing with non-English texts, e.g., Chinese research proposals. This paper presents a novel ontology-based text-mining approach to cluster research proposals based on their similarities in research areas. The method is efficient and effective for clustering research proposals with both English and Chinese texts. The method also includes an optimization model that considers applicants' characteristics for balancing proposals by geographical regions. The proposed method is tested and validated based on the selection process at the National Natural Science Foundation of China. The results can also be used to improve the efficiency and effectiveness of research project selection processes in other government and private research funding agencies.
Pages (from-to): 784 - 790
Journal: IEEE Transactions on Systems, Man and Cybernetics - Part A: Systems and Humans
Publication status: Published - 19 Mar 2012
Ma, J., Xu, W., Sun, Y., Turban, E., Wang, S., & Liu, O. (2012). An Ontology-Based Text-Mining Method to Cluster Proposals for Research Project Selection. IEEE Transactions on Systems, Man and Cybernetics - Part A: Systems and Humans, 42(3), 784 - 790. https://doi.org/10.1109/TSMCA.2011.2172205