
Research on massive information query and intelligent analysis method in a complex large-scale system

Northeast Forestry University, Harbin, 150040, China

Special Issues: Neural Computation and Applications for Sustainable Energy Systems

With the rapid growth of big data and network information, querying and intelligently analyzing massive unstructured data in large-scale complex systems has become particularly important. Existing methods, which directly collate, sort, summarize, store, and retrieve documents, cannot meet the needs of information management and rapid retrieval at this scale. Taking the standardized storage, effective extraction, and standardized database construction of massive resume information in large-scale complex social systems as an example, this paper proposes a method for massive information query and intelligent analysis. The method exploits the semi-structured features of resume documents and constructs extraction rule models for the various kinds of resume data in order to extract massive resume information. It builds on HBase distributed storage and uses parallel computing to optimize storage and query efficiency, which supports the intelligent analysis and retrieval of massive resume information. The experimental results show that the method greatly improves the extraction accuracy and recall of resume information and that, compared with traditional methods, it brings clear improvements in three respects: massive information retrieval, query efficiency, and the intelligent analysis of complex systems.
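As a rough illustration of what an extraction rule model over semi-structured resume text could look like, the sketch below maps field names to regular expressions and applies each rule to a document. It is a minimal sketch under stated assumptions, not the authors' method: the field labels ("Name", "Born", "Education"), the rule table `RULES`, the function `extract_resume`, and the sample text are all hypothetical, and the abstract does not describe the actual rules.

```python
import re

# Hypothetical extraction rules: field name -> compiled pattern.
# Each rule relies on a labelled line ("Label: value"), which is the kind of
# semi-structured cue a resume document tends to offer. Both the ASCII colon
# and the full-width colon are accepted.
RULES = {
    "name": re.compile(r"^Name[ \t]*[:：][ \t]*(.+)$", re.MULTILINE),
    "birth_date": re.compile(r"^Born[ \t]*[:：][ \t]*(\d{4}-\d{2})", re.MULTILINE),
    "education": re.compile(r"^Education[ \t]*[:：][ \t]*(.+)$", re.MULTILINE),
}


def extract_resume(text):
    """Apply every rule to the text and return a field -> value record.

    Fields whose rule does not match are set to None, so each record has
    the same keys and can be stored in a uniform schema.
    """
    record = {}
    for field, pattern in RULES.items():
        match = pattern.search(text)
        record[field] = match.group(1).strip() if match else None
    return record


# Made-up sample document, for illustration only.
SAMPLE = """Name: Zhang Wei
Born: 1975-03
Education: Master of Engineering
"""

if __name__ == "__main__":
    print(extract_resume(SAMPLE))
```

Because every record comes back with the same keys, it could be written to a column-oriented store such as HBase with one column per field; the storage layout itself is not specified in the abstract and is left out here.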

Keywords: unstructured information; extraction rule model; HBase-based distributed storage; information intelligence system

Citation: Dailin Wang, Yunlei Lv, Danting Ren, Linhui Li. Research on massive information query and intelligent analysis method in a complex large-scale system. Mathematical Biosciences and Engineering, 2019, 16(4): 2906-2926. doi: 10.3934/mbe.2019143





© 2019 the Author(s), licensee AIMS Press. This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0)
