Research article

Efficient summarization with lightweight LLMs through sparse input activation and adaptive prompting

  • Published: 09 April 2026
  • Large language models (LLMs) are designed to read, reason about, and generate natural language, enabling improved access to information, problem solving, and communication. Their performance, however, depends strongly on prompt design and model scale. Lightweight LLMs (fewer than 5B parameters) often struggle to summarize complex technical documents such as research publications: domain-specific terminology, dense citations, and mathematical notation can increase hallucinations and reduce reliability. We introduce a hybrid framework that combines Natural Language Processing (NLP)-driven preprocessing, sparse input activation, and prompt engineering to enhance the summarization capacity of small-scale models, minimizing hallucinations and maximizing factual accuracy. The pipeline first cleans full articles and segments them into sections, removing references, citations, and formulas. On this normalized text, salient-sentence extraction and keyphrase extraction are performed as part of the sparse input activation module. The activated input and an optimized prompt are then fed to LLMs of different scales, and the resulting summaries are benchmarked across prompting strategies and model sizes. This workflow substantially improves the factual accuracy and reliability of summaries generated by lightweight LLMs, making them competitive for complex scientific and technical summarization tasks. Our adaptive prompting and sparse input activation approach raised average ROUGE-1 recall from about 32% to 46% (a 44% relative gain), human evaluation scores from 66% to 91% (a 39% relative gain), and coverage ratio from 39% to 60% (a 53% relative gain) over the baseline.
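The pipeline the abstract describes (clean and segment the article, strip citations and formulas, then extract salient sentences to form a sparse, activated input for the prompt) can be sketched as follows. This is a minimal illustration only: the function names, the TF-IDF-style sentence scoring, and the prompt template are assumptions for demonstration, not the authors' exact implementation.

```python
import math
import re
from collections import Counter

def preprocess(text):
    """Normalize a raw article: drop citation markers like [12] and inline $...$ formulas."""
    text = re.sub(r"\[\d+(,\s*\d+)*\]", "", text)   # bracketed citation markers
    text = re.sub(r"\$[^$]*\$", "", text)           # inline math
    return re.sub(r"\s+", " ", text).strip()

def split_sentences(text):
    """Naive sentence splitter on terminal punctuation."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]

def salient_sentences(text, k=3):
    """Score sentences by average TF-IDF of their words; keep top-k in document order."""
    sents = split_sentences(text)
    docs = [re.findall(r"[a-z]+", s.lower()) for s in sents]
    df = Counter(w for d in docs for w in set(d))    # document frequency per word
    n = len(docs)
    def score(d):
        tf = Counter(d)
        return sum(c * math.log((n + 1) / (df[w] + 1) + 1) for w, c in tf.items()) / (len(d) or 1)
    top = sorted(range(n), key=lambda i: score(docs[i]), reverse=True)[:k]
    return [sents[i] for i in sorted(top)]           # restore original order

def build_prompt(text, k=3):
    """Assemble the sparse, activated input for a lightweight LLM."""
    core = " ".join(salient_sentences(preprocess(text), k))
    return f"Summarize the following key content faithfully:\n{core}"
```

A keyphrase-extraction step (e.g., KeyBERT-style embedding similarity) would sit alongside `salient_sentences` in a fuller version; it is omitted here to keep the sketch dependency-free.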

    Citation: Srinivasan Subramanian, Ramya Madhuri Narapureddy, Mohana Priya Palanisamy, Kazi Aminul Islam, Md. Abdullah Al Hafiz Khan. Efficient summarization with lightweight LLMs through sparse input activation and adaptive prompting[J]. Applied Computing and Intelligence, 2026, 6(1): 58-78. doi: 10.3934/aci.2026004
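The abstract's headline metric, ROUGE-1 recall, measures how many of the reference summary's unigrams the generated summary recovers. A minimal sketch of the computation, assuming simple lowercase tokenization rather than the official ROUGE toolkit's stemming and preprocessing:

```python
import re
from collections import Counter

def rouge1_recall(candidate, reference):
    """ROUGE-1 recall: clipped unigram overlap divided by the reference unigram count."""
    cand = Counter(re.findall(r"[a-z0-9]+", candidate.lower()))
    ref = Counter(re.findall(r"[a-z0-9]+", reference.lower()))
    overlap = sum(min(c, ref[w]) for w, c in cand.items())
    return overlap / (sum(ref.values()) or 1)
```

For example, `rouge1_recall("the cat sat", "the cat sat on the mat")` returns 0.5, since three of the six reference unigrams are recovered.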



  • © 2026 the Author(s), licensee AIMS Press. This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0)
