Research article

Efficient summarization with lightweight LLMs through sparse input activation and adaptive prompting

  • Published: 09 April 2026
  • Large language models (LLMs) are designed to read, reason about, and generate natural language, enabling improved access to information, problem solving, and communication. Their performance, however, depends strongly on prompt design and model scale. Lightweight LLMs (fewer than 5B parameters) often struggle to summarize complex technical documents such as research publications: domain-specific terminology, dense citations, and mathematical notation can increase hallucinations and reduce reliability. We introduce a hybrid framework that combines Natural Language Processing (NLP)-driven preprocessing, sparse input activation, and prompt engineering to enhance the summarization capacity of small-scale models, minimizing hallucinations and maximizing factual accuracy. The pipeline first cleans full articles and segments them into sections, removing references, citations, and formulas. On this normalized text, salient-sentence extraction and keyphrase extraction are performed as part of the sparse input activation module. The activated input and an optimized prompt are then fed to LLMs of different scales, and the resulting summaries are benchmarked across prompting strategies and model sizes. This workflow substantially improves the factual accuracy and reliability of summaries generated by lightweight LLMs, making them competitive for complex scientific and technical summarization tasks. Our adaptive prompting and sparse input activation approach raised average ROUGE-1 recall from about 32% to 46% (a 44% relative gain), human evaluation scores from 66% to 91% (a 39% relative gain), and coverage ratio from 39% to 60% (a 53% relative gain) over the baseline.
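The pipeline the abstract describes (clean and segment the article, strip citations and formulas, then extract salient sentences to form a sparse, activated input for the prompt) can be sketched as follows. This is a minimal illustration only: the function names, the TF-IDF-style sentence scoring, and the prompt template are assumptions for demonstration, not the authors' exact implementation.

```python
import math
import re
from collections import Counter

def preprocess(text):
    """Normalize a raw article: drop citation markers like [12] and inline $...$ formulas."""
    text = re.sub(r"\[\d+(,\s*\d+)*\]", "", text)   # bracketed citation markers
    text = re.sub(r"\$[^$]*\$", "", text)           # inline math
    return re.sub(r"\s+", " ", text).strip()

def split_sentences(text):
    """Naive sentence splitter on terminal punctuation."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]

def salient_sentences(text, k=3):
    """Score sentences by average TF-IDF of their words; keep top-k in document order."""
    sents = split_sentences(text)
    docs = [re.findall(r"[a-z]+", s.lower()) for s in sents]
    df = Counter(w for d in docs for w in set(d))    # document frequency per word
    n = len(docs)
    def score(d):
        tf = Counter(d)
        return sum(c * math.log((n + 1) / (df[w] + 1) + 1) for w, c in tf.items()) / (len(d) or 1)
    top = sorted(range(n), key=lambda i: score(docs[i]), reverse=True)[:k]
    return [sents[i] for i in sorted(top)]           # restore original order

def build_prompt(text, k=3):
    """Assemble the sparse, activated input for a lightweight LLM."""
    core = " ".join(salient_sentences(preprocess(text), k))
    return f"Summarize the following key content faithfully:\n{core}"
```

A keyphrase-extraction step (e.g., KeyBERT-style embedding similarity) would sit alongside `salient_sentences` in a fuller version; it is omitted here to keep the sketch dependency-free.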

    Citation: Srinivasan Subramanian, Ramya Madhuri Narapureddy, Mohana Priya Palanisamy, Kazi Aminul Islam, Md. Abdullah Al Hafiz Khan. Efficient summarization with lightweight LLMs through sparse input activation and adaptive prompting[J]. Applied Computing and Intelligence, 2026, 6(1): 58-78. doi: 10.3934/aci.2026004
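The abstract's headline metric, ROUGE-1 recall, measures how many of the reference summary's unigrams the generated summary recovers. A minimal sketch of the computation, assuming simple lowercase tokenization rather than the official ROUGE toolkit's stemming and preprocessing:

```python
import re
from collections import Counter

def rouge1_recall(candidate, reference):
    """ROUGE-1 recall: clipped unigram overlap divided by the reference unigram count."""
    cand = Counter(re.findall(r"[a-z0-9]+", candidate.lower()))
    ref = Counter(re.findall(r"[a-z0-9]+", reference.lower()))
    overlap = sum(min(c, ref[w]) for w, c in cand.items())
    return overlap / (sum(ref.values()) or 1)
```

For example, `rouge1_recall("the cat sat", "the cat sat on the mat")` returns 0.5, since three of the six reference unigrams are recovered.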



  • © 2026 the Author(s), licensee AIMS Press. This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0)
