Paragraphic Query-Based Summarization Approach for Arabic Extracts

https://doi.org/10.48185/jitc.v5i1.1702

Authors

  • Mohammed Salem BinWahlan, Seiyun University
  • Mazin Alkathiri, Seiyun University
  • Khaled Mohammed Binabdl

Abstract

The exponential growth of information on the internet imposes a significant cost on individuals searching for relevant content, and numerous researchers are working to address this challenge. This study presents an alternative approach to query-based automatic text summarization. Unlike earlier methods, the approach generates the query from the sentences of the first paragraph of the document being summarized, so no user input is needed to submit a query. Applying the generated query to the document produces the final summary. The study also investigates the impact of different query lengths and similarity measures. The evaluation used the ROUGE metric and the EASC dataset. The experimental findings show that the proposed approach performs best with the Russell similarity measure and a longer query length, outperforming configurations that use the cosine, Forbes, or Jaccard similarity measures and shorter query lengths.
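
The pipeline the abstract describes can be illustrated with a minimal sketch. It is not the authors' implementation: the tokenizer, the binary (set-based) forms of the four similarity coefficients, the reading of "query length" as the number of leading first-paragraph sentences, and all function and parameter names are assumptions made for illustration. Russell-Rao is assumed to be the measure the abstract refers to. With a = terms shared by query and sentence, b = terms only in the query, c = terms only in the sentence, and n = vocabulary size, the sketch scores each sentence against the generated query and keeps the top-ranked ones in document order.

```python
import math
import re


def tokenize(text):
    """Split into lowercase word tokens (a stand-in for real Arabic stemming)."""
    return set(re.findall(r"\w+", text.lower()))


def counts(query, sentence, vocab_size):
    """Return the binary contingency counts (a, b, c, n) for two term sets."""
    a = len(query & sentence)   # terms present in both
    b = len(query - sentence)   # terms only in the query
    c = len(sentence - query)   # terms only in the sentence
    return a, b, c, vocab_size


def jaccard(a, b, c, n):
    return a / (a + b + c) if a + b + c else 0.0


def russell_rao(a, b, c, n):
    return a / n if n else 0.0


def forbes(a, b, c, n):
    denom = (a + b) * (a + c)
    return n * a / denom if denom else 0.0


def cosine(a, b, c, n):
    denom = math.sqrt((a + b) * (a + c))
    return a / denom if denom else 0.0


def summarize(paragraphs, measure, query_len=2, summary_len=2):
    """Rank sentences against a query built from the first paragraph.

    paragraphs: list of paragraphs, each a list of sentence strings.
    query_len:  assumed meaning of "query length": how many leading
                sentences of the first paragraph form the query.
    """
    sentences = [s for p in paragraphs for s in p]
    term_sets = [tokenize(s) for s in sentences]
    vocab = set().union(*term_sets)
    query = set().union(*(tokenize(s) for s in paragraphs[0][:query_len]))
    scored = [
        (measure(*counts(query, terms, len(vocab))), i)
        for i, terms in enumerate(term_sets)
    ]
    # Highest score first; ties go to the earlier sentence.
    top = sorted(scored, key=lambda x: (-x[0], x[1]))[:summary_len]
    # Restore document order for readability.
    return [sentences[i] for _, i in sorted(top, key=lambda x: x[1])]


if __name__ == "__main__":
    doc = [
        ["Cats chase mice.", "Mice fear cats."],
        ["Dogs bark loudly.", "Cats and mice share the house."],
    ]
    for measure in (russell_rao, cosine, forbes, jaccard):
        print(measure.__name__, summarize(doc, measure, query_len=1))
```

In this sketch the query sentences remain candidates for the summary, and swapping the `measure` argument is enough to compare coefficients; whether the published method treats either point the same way is not stated in the abstract.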

Published

2025-09-14

How to Cite

Salem BinWahlan, M., Alkathiri, M., & Mohammed Binabdl, K. (2025). Paragraphic Query-Based Summarization Approach for Arabic Extracts. Journal of Information Technology and Computing, 5(1), 1–14. https://doi.org/10.48185/jitc.v5i1.1702