Research article

Linguistic summarisation of multiple entities in RDF graphs

  • Received: 27 November 2023; Revised: 23 January 2024; Accepted: 25 January 2024; Published: 02 February 2024
  • Methods for producing summaries from structured data have gained interest due to the huge volume of data available on the Web. At the same time, there have been advances in natural language generation from Resource Description Framework (RDF) data. However, no efforts have been made to generate natural language summaries for groups of multiple RDF entities. This paper describes the first algorithm for summarising the information of a set of RDF entities in the form of human-readable text. The paper also proposes an experimental design for evaluating the summaries in a human task context. Experiments were carried out comparing machine-made summaries with summaries written by humans, both with and without the help of machine-made summaries. We developed criteria for evaluating the content and text quality of both types of summary, as well as a function measuring the agreement between machine-made and human-written summaries. The experiments indicated that machine-made natural language summaries can substantially help humans in writing their own textual descriptions of entity sets within a limited time.

    Citation: Elizaveta Zimina, Kalervo Järvelin, Jaakko Peltonen, Aarne Ranta, Kostas Stefanidis, Jyrki Nummenmaa. Linguistic summarisation of multiple entities in RDF graphs. Applied Computing and Intelligence, 2024, 4(1): 1-18. doi: 10.3934/aci.2024001



  • © 2024 the Author(s), licensee AIMS Press. This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0)