Exploring the complexity of natural languages: A fuzzy evaluative perspective on Greenberg universals

Antoni Brosa-Rodríguez; M. Dolores Jiménez-López; Adrià Torrens-Urrutia

AIMS Mathematics

2024, Volume 9, Issue 1: 2181-2214. doi: 10.3934/math.2024109

Research article

Universitat Rovira i Virgili, Research Group on Mathematical Linguistics (GRLMC), 43002, Tarragona, Spain

Received: 13 August 2023; Revised: 01 November 2023; Accepted: 19 November 2023; Published: 21 December 2023
MSC: 03B65

In this paper, we introduced a fuzzy model for calculating complexity based on universality, aiming to measure the complexity of natural languages in terms of the degree of universality exhibited in their rules. We validated the model by conducting experiments on a corpus of 143 languages obtained from Universal Dependencies 2.11. To formalize the linguistic universals proposed by Greenberg, we employed the Grew tool to convert them into a formal rule representation. This formalization enabled the verification of universals within the corpus. By analyzing the corpus, we extracted the occurrences of each universal in different languages. The obtained results were used to define a fuzzy model that quantifies the degree of universality and complexity of both the Greenberg universals and the languages themselves, employing the mathematical theory of evaluative expressions from fuzzy natural logic (FNL). Our analysis revealed an inversely proportional relationship between the degree of universality and the level of complexity observed in the languages. The implications of our findings extended to various applications in the theoretical analysis and computational treatment of languages. In addition, the proposed model offered insights into the nature of language complexity, providing a valuable framework for further research and exploration.
Keywords: linguistic universals; linguistic complexity; evaluative expressions; fuzzy grammar; linguistic gradience; linguistic constraints
Citation: Antoni Brosa-Rodríguez, M. Dolores Jiménez-López, Adrià Torrens-Urrutia. Exploring the complexity of natural languages: A fuzzy evaluative perspective on Greenberg universals[J]. AIMS Mathematics, 2024, 9(1): 2181-2214. doi: 10.3934/math.2024109
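
The idea summarized in the abstract, deriving a degree of universality from corpus occurrences and reading complexity as its inverse, can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the `UniversalCount` structure, the use of a simple complement `1 - u` for complexity, and the crisp cut-offs in `evaluative_label` are hypothetical stand-ins and are not the membership functions or evaluative expressions defined in the paper.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class UniversalCount:
    """Corpus counts for one universal in one language (hypothetical structure)."""
    satisfied: int  # occurrences that conform to the universal
    violated: int   # occurrences that contradict it


def universality(count: UniversalCount) -> float:
    """Degree of universality in [0, 1]: share of conforming occurrences."""
    total = count.satisfied + count.violated
    if total == 0:
        raise ValueError("no occurrences to evaluate")
    return count.satisfied / total


def complexity(degree: float) -> float:
    """Assumed inverse relation: higher universality means lower complexity."""
    return 1.0 - degree


def evaluative_label(degree: float) -> str:
    """Map a degree to a linguistic label using illustrative cut-offs only."""
    if degree >= 0.8:
        return "high"
    if degree >= 0.5:
        return "medium"
    return "low"


def language_universality(counts: list[UniversalCount]) -> float:
    """Average the degrees of all universals observed for one language."""
    if not counts:
        raise ValueError("no universals observed")
    return sum(universality(c) for c in counts) / len(counts)
```

For example, a universal that holds in 3 of 4 matched occurrences would receive a degree of 0.75 and, under the complement assumption, a complexity of 0.25. A graded model in the spirit of fuzzy natural logic would replace the crisp cut-offs with overlapping membership functions.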



© 2024 the Author(s), licensee AIMS Press. This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0)