Export file:

Format

  • RIS(for EndNote,Reference Manager,ProCite)
  • BibTex
  • Text

Content

  • Citation Only
  • Citation and Abstract

Topic-based automatic summarization algorithm for Chinese short text

1 Nanjing University of Information Science and Technology, Nanjing 210044, China
2 Nanjing Institute of Technology, Nanjing 211167, China
3 King Saud University, Riyadh 11362, Saudi Arabia

Special Issues: Recent Achievements in Multimedia Data Processing

Most current automatic summarization methods are for English texts. The distinction between words in Chinese text is large, the types of parts of speech are many and complex, and polysemy or ambiguous words appear frequently. Therefore, compared with English text, Chinese text is more difficult to extract useful feature words. Due to the complex syntax of Chinese, there are currently relatively few automatic summarization methods for Chinese text. In the past, only the important sentences in the original text can be selected and simply arranged to obtain a summary with chaotic sentences and insufficient coherence. Meanwhile, because Chinese short text usually contains more redundant information and the sentence structure is not neat, we propose a topic-based automatic summary method for Chinese short text. Firstly, a key sentence selection method is proposed combining topic words and TF-IDF to obtain the score of each text corresponding to the topic in the original text data. Then the sentence with the highest score as the topic sentence of the topic is selected. Considering that the short text of Weibo may contain a lot of irrelevant information and sometimes even lack some important components of topic, three retouching mechanisms are proposed to improve the conciseness, richness and readability of topic sentence extraction results. We validate our approach on natural disaster and social hot event datasets from Sina Weibo. The experimental results show that the polished topic summary not only reflects the exact relationship between topic sentences and natural disasters or social hot events, but also has rich semantic information. More importantly, we can almost grasp the basic elements of natural disaster or social hot event from the topic sentence, so as to help the government guide disaster relief or meet the needs of users for quickly obtaining information of social hot events.
  Figure/Table
  Supplementary
  Article Metrics
Download full text in PDF

Export Citation

Article outline

Copyright © AIMS Press All Rights Reserved