Construction and application of Chinese breast cancer knowledge graph based on multi-source heterogeneous data

Bo An

doi:10.3934/mbe.2023292

Mathematical Biosciences and Engineering

2023, Volume 20, Issue 4: 6776-6799. doi: 10.3934/mbe.2023292

Research article

Bo An ^{1,2}

1. Institute of Ethnology and Anthropology, Chinese Academy of Social Sciences, Beijing 100732, China
2. Beijing Academy of Artificial Intelligence, Beijing 100084, China

Academic Editor: Jinshan Tang

Received: 02 September 2022 Revised: 14 January 2023 Accepted: 17 January 2023 Published: 06 February 2023

Knowledge graphs are a critical resource for medical intelligence. General medical knowledge graphs try to cover all diseases and therefore contain a large volume of medical knowledge, which makes it impractical to review every triple manually. As a result, their quality is often insufficient to support intelligent medical applications. Breast cancer is currently among the cancers with the highest incidence, and there is an urgent need to use artificial intelligence to improve the efficiency of breast cancer diagnosis and treatment and the postoperative health of breast cancer patients. In response to this demand, this paper proposes a framework for constructing a breast cancer knowledge graph from heterogeneous data resources. Specifically, knowledge triples are extracted from clinical guidelines, medical encyclopedias and electronic medical records, and the triples from the different resources are then fused to build a breast cancer knowledge graph (BCKG). Experimental results demonstrate that BCKG can support knowledge-based question answering and breast cancer postoperative follow-up and healthcare, and can improve the quality and efficiency of breast cancer diagnosis, treatment and management.
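The fusion and question-answering steps described above can be illustrated with a minimal Python sketch. It is not the paper's method: the alias table, the relation names, the source labels and the sample triples are all hypothetical placeholders chosen for illustration, and entity alignment is reduced to a simple dictionary lookup.

```python
from collections import defaultdict

# Hypothetical alias table standing in for entity alignment (an assumption,
# not taken from the paper).
ALIASES = {"breast carcinoma": "breast cancer", "bc": "breast cancer"}


def normalize(entity):
    """Lower-case an entity mention and map known aliases to one canonical name."""
    key = entity.strip().lower()
    return ALIASES.get(key, key)


def fuse(sources):
    """Merge (head, relation, tail) triples from several named sources.

    Returns a dict mapping each normalized triple to the set of source
    names that assert it, so duplicates collapse and provenance is kept.
    """
    fused = defaultdict(set)
    for name, triples in sources.items():
        for head, rel, tail in triples:
            fused[(normalize(head), rel, normalize(tail))].add(name)
    return dict(fused)


def answer(fused, head, rel):
    """Return the sorted tails of all fused triples matching (head, rel)."""
    h = normalize(head)
    return sorted(t for (hh, r, t) in fused if hh == h and r == rel)


# Toy triples for illustration only; they are not data from the paper.
sources = {
    "guideline": [("Breast cancer", "treated_by", "surgery")],
    "encyclopedia": [
        ("breast carcinoma", "treated_by", "Surgery"),
        ("breast cancer", "symptom", "breast lump"),
    ],
    "emr": [("BC", "treated_by", "chemotherapy")],
}

fused = fuse(sources)
print(answer(fused, "Breast Carcinoma", "treated_by"))
```

Keeping the set of asserting sources per triple is one simple way to support later manual review, since triples confirmed by several resources can be prioritized differently from those seen only once.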

Citation: Bo An. Construction and application of Chinese breast cancer knowledge graph based on multi-source heterogeneous data[J]. Mathematical Biosciences and Engineering, 2023, 20(4): 6776-6799. doi: 10.3934/mbe.2023292

Abstract

Creating this inaugural special issue on Engineering Applications of Artificial Intelligence (AI) is important because of rapid technological advancement and the aim of reducing manpower by incorporating AI into various Industry 4.0 applications. My research reflects the multidisciplinary nature of systems (spanning mechanical, electrical, electronic, acoustical and marine engineering), from initial concepts through modelling and AI simulation to the creation of graphical user interfaces and their actual implementation and on-site testing. The special issue provides a good platform for sharing applied research results from researchers around the world.

For example, a phase partition-based ensemble learning framework built on least squares support vector regression (LSSVR) was used for soft sensor modeling to improve prediction accuracy in chemical and biological processes. Robotic grasping based on an improved Gaussian mixture model was also proposed using the virtual robot experimentation platform. A face image recognition algorithm based on the two-dimensional (2D) Gabor wavelet transform and Local Binary Pattern (LBP) was presented; it provides better classification performance across scales and directions under variation in illumination, gesture, expression and other factors. With growing awareness of cyber-security, a Kalman filter-based attack detection model was proposed, as were the block withholding delay attack and its countermeasure. The well-known convolutional neural network (CNN) approach was applied to obstacle detection for unmanned surface vehicles. In addition, an effective classifier based on a CNN and a regularized extreme learning machine (ELM) was adopted to reduce classification time in training and testing.

In summary, this issue covers a range of engineering applications of AI. It is imperative that we continue to progress in our search for better engineering systems design and simulation using AI. The progress reported in this special issue suggests that these aims are attainable. I hope that we can stay in contact and make this world a better place through "deep" collaborative research.

© 2023 the Author(s), licensee AIMS Press. This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0)