StaResGRU-CNN with CMedLMs: A stacked residual GRU-CNN with pre-trained biomedical language models for predictive intelligence

Pin Ni; Gangmin Li; Patrick C.K. Hung; Victor Chang

doi:10.1016/j.asoc.2021.107975

Corresponding author: Victor Chang

Research output: Contribution to journal › Article › peer-review

Abstract

Predictive biomedical intelligence demands strong professional expertise and therefore depends on a large amount of external domain knowledge. Using transfer learning to obtain sufficient prior experience from massive biomedical text data is essential for improving the performance of specific downstream predictive and decision-making task models. This approach is efficient and convenient, but it has not been fully developed for Chinese Natural Language Processing (NLP) in the biomedical field. This study proposes a Stacked Residual Gated Recurrent Unit-Convolutional Neural Network (StaResGRU-CNN) combined with pre-trained language models (PLMs) for biomedical text-based predictive tasks. By exploring related paradigms in biomedical NLP based on transfer learning of external expert knowledge and comparing several Chinese and English language models, we identify key issues that remain undeveloped or are difficult to apply in the field of Chinese biomedicine. We therefore also propose a series of Chinese bioMedical Language Models (CMedLMs), with detailed evaluations on downstream tasks. Through transfer learning, the language models supply prior knowledge that improves the performance of downstream tasks and helps solve specific predictive NLP tasks in the Chinese biomedical field, so as to better serve predictive medical systems. Additionally, we propose a Disease Diagnosis Prediction task based on free-form text Electronic Medical Records (EMRs), which is used to evaluate the analyzed language models together with Clinical Named Entity Recognition and Biomedical Text Classification tasks. Our experiments show that introducing biomedical knowledge into the analyzed models significantly improves their performance on predictive biomedical NLP tasks of different granularity. Our proposed model also achieves competitive performance in these predictive intelligence tasks.

Original language	English
Article number	107975
Number of pages	14
Journal	Applied Soft Computing
Volume	113
Early online date	13 Oct 2021
DOIs	https://doi.org/10.1016/j.asoc.2021.107975
Publication status	Published - 1 Dec 2021

Bibliographical note

Copyright: © 2021, Elsevier. Licensed under the Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International http://creativecommons.org/licenses/by-nc-nd/4.0/

Funding Information:
This research is partly supported by VC Research (VCR 0000130) for Prof. Chang. At the same time, this study is also partially supported by the AI University Research Center (AI-URC) through the XJTLU Key Program Special Fund, China (KSF-P-02, KSF-A-17). And this work has received support from the Suzhou Bureau of Science and Technology through the Key Industrial Technology Innovation Program, China (No. SYG201840). We also appreciate Google TensorFlow Research Cloud (TFRC) for providing support in computing resources. In addition, we would like to thank all colleagues who participated in this research project, especially Ms. Yuming Li and Mr. Zhenjin Dai. We would also like to express our sincere thanks to Mr. Thomas Cilloni for providing English language support to the manuscript.

Keywords

Biomedical text mining
Named Entity Recognition
Natural language processing
Pre-trained language model
Predictive intelligence
Text classification
Transfer learning

Access to Document

10.1016/j.asoc.2021.107975

Accepted author manuscript (ASOC_accepted), 4.02 MB. Licence: CC BY-NC-ND 4.0

Cite this

@article{765170ed43dd433d9da2542876e42da2,

title = "StaResGRU-CNN with CMedLMs: A stacked residual GRU-CNN with pre-trained biomedical language models for predictive intelligence",

abstract = "Predictive biomedical intelligence demands strong professional expertise and therefore depends on a large amount of external domain knowledge. Using transfer learning to obtain sufficient prior experience from massive biomedical text data is essential for improving the performance of specific downstream predictive and decision-making task models. This approach is efficient and convenient, but it has not been fully developed for Chinese Natural Language Processing (NLP) in the biomedical field. This study proposes a Stacked Residual Gated Recurrent Unit-Convolutional Neural Network (StaResGRU-CNN) combined with pre-trained language models (PLMs) for biomedical text-based predictive tasks. By exploring related paradigms in biomedical NLP based on transfer learning of external expert knowledge and comparing several Chinese and English language models, we identify key issues that remain undeveloped or are difficult to apply in the field of Chinese biomedicine. We therefore also propose a series of Chinese bioMedical Language Models (CMedLMs), with detailed evaluations on downstream tasks. Through transfer learning, the language models supply prior knowledge that improves the performance of downstream tasks and helps solve specific predictive NLP tasks in the Chinese biomedical field, so as to better serve predictive medical systems. Additionally, we propose a Disease Diagnosis Prediction task based on free-form text Electronic Medical Records (EMRs), which is used to evaluate the analyzed language models together with Clinical Named Entity Recognition and Biomedical Text Classification tasks. Our experiments show that introducing biomedical knowledge into the analyzed models significantly improves their performance on predictive biomedical NLP tasks of different granularity. Our proposed model also achieves competitive performance in these predictive intelligence tasks.",

keywords = "Biomedical text mining, Named Entity Recognition, Natural language processing, Pre-trained language model, Predictive intelligence, Text classification, Transfer learning",

author = "Pin Ni and Gangmin Li and Hung, {Patrick C.K.} and Victor Chang",

note = "Copyright: {\textcopyright} 2021, Elsevier. Licensed under the Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International http://creativecommons.org/licenses/by-nc-nd/4.0/ Funding Information: This research is partly supported by VC Research (VCR 0000130) for Prof. Chang. At the same time, this study is also partially supported by the AI University Research Center (AI-URC) through the XJTLU Key Program Special Fund, China (KSF-P-02, KSF-A-17). And this work has received support from the Suzhou Bureau of Science and Technology through the Key Industrial Technology Innovation Program, China (No. SYG201840). We also appreciate Google TensorFlow Research Cloud (TFRC) for providing support in computing resources. In addition, we would like to thank all colleagues who participated in this research project, especially Ms. Yuming Li and Mr. Zhenjin Dai. We would also like to express our sincere thanks to Mr. Thomas Cilloni for providing English language support to the manuscript.",

year = "2021",

month = dec,

day = "1",

doi = "10.1016/j.asoc.2021.107975",

language = "English",

volume = "113",

journal = "Applied Soft Computing",

issn = "1568-4946",

publisher = "Elsevier",

}

TY - JOUR

T1 - StaResGRU-CNN with CMedLMs

T2 - A stacked residual GRU-CNN with pre-trained biomedical language models for predictive intelligence

AU - Ni, Pin

AU - Li, Gangmin

AU - Hung, Patrick C.K.

AU - Chang, Victor

N1 - Copyright: © 2021, Elsevier. Licensed under the Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International http://creativecommons.org/licenses/by-nc-nd/4.0/ Funding Information: This research is partly supported by VC Research (VCR 0000130) for Prof. Chang. At the same time, this study is also partially supported by the AI University Research Center (AI-URC) through the XJTLU Key Program Special Fund, China (KSF-P-02, KSF-A-17). And this work has received support from the Suzhou Bureau of Science and Technology through the Key Industrial Technology Innovation Program, China (No. SYG201840). We also appreciate Google TensorFlow Research Cloud (TFRC) for providing support in computing resources. In addition, we would like to thank all colleagues who participated in this research project, especially Ms. Yuming Li and Mr. Zhenjin Dai. We would also like to express our sincere thanks to Mr. Thomas Cilloni for providing English language support to the manuscript.

PY - 2021/12/1

Y1 - 2021/12/1

N2 - Predictive biomedical intelligence demands strong professional expertise and therefore depends on a large amount of external domain knowledge. Using transfer learning to obtain sufficient prior experience from massive biomedical text data is essential for improving the performance of specific downstream predictive and decision-making task models. This approach is efficient and convenient, but it has not been fully developed for Chinese Natural Language Processing (NLP) in the biomedical field. This study proposes a Stacked Residual Gated Recurrent Unit-Convolutional Neural Network (StaResGRU-CNN) combined with pre-trained language models (PLMs) for biomedical text-based predictive tasks. By exploring related paradigms in biomedical NLP based on transfer learning of external expert knowledge and comparing several Chinese and English language models, we identify key issues that remain undeveloped or are difficult to apply in the field of Chinese biomedicine. We therefore also propose a series of Chinese bioMedical Language Models (CMedLMs), with detailed evaluations on downstream tasks. Through transfer learning, the language models supply prior knowledge that improves the performance of downstream tasks and helps solve specific predictive NLP tasks in the Chinese biomedical field, so as to better serve predictive medical systems. Additionally, we propose a Disease Diagnosis Prediction task based on free-form text Electronic Medical Records (EMRs), which is used to evaluate the analyzed language models together with Clinical Named Entity Recognition and Biomedical Text Classification tasks. Our experiments show that introducing biomedical knowledge into the analyzed models significantly improves their performance on predictive biomedical NLP tasks of different granularity. Our proposed model also achieves competitive performance in these predictive intelligence tasks.

AB - Predictive biomedical intelligence demands strong professional expertise and therefore depends on a large amount of external domain knowledge. Using transfer learning to obtain sufficient prior experience from massive biomedical text data is essential for improving the performance of specific downstream predictive and decision-making task models. This approach is efficient and convenient, but it has not been fully developed for Chinese Natural Language Processing (NLP) in the biomedical field. This study proposes a Stacked Residual Gated Recurrent Unit-Convolutional Neural Network (StaResGRU-CNN) combined with pre-trained language models (PLMs) for biomedical text-based predictive tasks. By exploring related paradigms in biomedical NLP based on transfer learning of external expert knowledge and comparing several Chinese and English language models, we identify key issues that remain undeveloped or are difficult to apply in the field of Chinese biomedicine. We therefore also propose a series of Chinese bioMedical Language Models (CMedLMs), with detailed evaluations on downstream tasks. Through transfer learning, the language models supply prior knowledge that improves the performance of downstream tasks and helps solve specific predictive NLP tasks in the Chinese biomedical field, so as to better serve predictive medical systems. Additionally, we propose a Disease Diagnosis Prediction task based on free-form text Electronic Medical Records (EMRs), which is used to evaluate the analyzed language models together with Clinical Named Entity Recognition and Biomedical Text Classification tasks. Our experiments show that introducing biomedical knowledge into the analyzed models significantly improves their performance on predictive biomedical NLP tasks of different granularity. Our proposed model also achieves competitive performance in these predictive intelligence tasks.

KW - Biomedical text mining

KW - Named Entity Recognition

KW - Natural language processing

KW - Pre-trained language model

KW - Predictive intelligence

KW - Text classification

KW - Transfer learning

UR - https://www.sciencedirect.com/science/article/pii/S1568494621008978?via=ihub

UR - http://www.scopus.com/inward/record.url?scp=85118700862&partnerID=8YFLogxK

U2 - 10.1016/j.asoc.2021.107975

DO - 10.1016/j.asoc.2021.107975

M3 - Article

AN - SCOPUS:85118700862

SN - 1568-4946

VL - 113

JO - Applied Soft Computing

JF - Applied Soft Computing

M1 - 107975

ER -
