Related papers: Quantifying French Document Complexity

Complexity Metric for Code-Mixed Social Media Text

An evaluation metric is an absolute necessity for measuring the performance of any system and complexity of any data. In this paper, we have discussed how to determine the level of complexity of code-mixed social media texts that are…

Computation and Language · Computer Science 2017-07-06 Souvick Ghosh, Satanu Ghosh, Dipankar Das

Difficulty Estimation and Simplification of French Text Using LLMs

We leverage generative large language models for language learning applications, focusing on estimating the difficulty of foreign language texts and simplifying them to lower difficulty levels. We frame both tasks as prediction problems and…

Computation and Language · Computer Science 2024-07-26 Henri Jamet, Yash Raj Shrestha, Michalis Vlachos

Text Complexity Classification Based on Linguistic Information: Application to Intelligent Tutoring of ESL

The goal of this work is to build a classifier that can identify text complexity within the context of teaching reading to English as a Second Language (ESL) learners. To present language learners with texts that are suitable to their level…

Computation and Language · Computer Science 2023-06-22 M. Zakaria Kurdi

What do complexity measures measure? Correlating and validating corpus-based measures of morphological complexity

We present an analysis of eight measures used for quantifying morphological complexity of natural languages. The measures we study are corpus-based measures of morphological complexity with varying requirements for corpus annotation. We…

Computation and Language · Computer Science 2022-04-12 Çağrı Çöltekin, Taraka Rama

Measuring complexity

Complexity is a multi-faceted phenomenon, involving a variety of features including disorder, nonlinearity, and self-organisation. We use a recently developed rigorous framework for complexity to understand measures of complexity. We…

Adaptation and Self-Organizing Systems · Physics 2020-09-22 Karoline Wiesner, James Ladyman

A Readable Read: Automatic Assessment of Language Learning Materials based on Linguistic Complexity

Corpora and web texts can become a rich language learning resource if we have a means of assessing whether they are linguistically appropriate for learners at a given proficiency level. In this paper, we aim at addressing this issue by…

Computation and Language · Computer Science 2016-03-30 Ildikó Pilán, Sowmya Vajjala, Elena Volodina

A German Corpus for Text Similarity Detection Tasks

Text similarity detection aims at measuring the degree of similarity between a pair of texts. Corpora available for text similarity detection are designed to evaluate the algorithms to assess the paraphrase level among documents. In this…

Information Retrieval · Computer Science 2017-03-14 Juan-Manuel Torres-Moreno, Gerardo Sierra, Peter Peinl

Measuring the Overall Complexity of Graphical and Textual IEC 61131-3 Control Software

Software implements a significant proportion of functionality in factory automation. Thus, efficient development and the reuse of software parts, so-called units, enhance competitiveness. Thereby, complex control software units are more…

Software Engineering · Computer Science 2022-12-13 Juliane Fischer, Birgit Vogel-Heuser, Heiko Schneider, Nikolai Langer, Markus Felger, Matthias Bengel

Large Language Models for Difficulty Estimation of Foreign Language Content with Application to Language Learning

We use large language models to help learners enhance proficiency in a foreign language. This is accomplished by identifying content on topics that the user is interested in, and that closely align with the learner's proficiency level in…

Computation and Language · Computer Science 2023-09-12 Michalis Vlachos, Mircea Lungu, Yash Raj Shrestha, Johannes-Rudolf David

Predicting CEFRL levels in learner English on the basis of metrics and full texts

This paper analyses the contribution of language metrics and, potentially, of linguistic structures, to classify French learners of English according to levels of the Common European Framework of Reference for Languages (CEFRL). The purpose…

Computation and Language · Computer Science 2018-06-29 Taylor Arnold, Nicolas Ballier, Thomas Gaillat, Paula Lissòn

Complexity-entropy analysis at different levels of organization in written language

Written language is complex. A written text can be considered an attempt to convey a meaningful message which ends up being constrained by language rules, context dependence and highly redundant in its use of resources. Despite all these…

Computation and Language · Computer Science 2019-05-20 E. Estevez-Rams, A. Mesa Rodriguez, D. Estevez-Moya

Assessing the Quality of Scientific Papers

A multitude of factors are responsible for the overall quality of scientific papers, including readability, linguistic quality, fluency, semantic complexity, and of course domain-specific technical factors. These factors vary from one field…

Information Retrieval · Computer Science 2019-08-13 Roman Vainshtein, Gilad Katz, Bracha Shapira, Lior Rokach

Measuring Global Similarity between Texts

We propose a new similarity measure between texts which, contrary to the current state-of-the-art approaches, takes a global view of the texts to be compared. We have implemented a tool to compute our textual distance and conducted…

Computation and Language · Computer Science 2014-05-15 Uli Fahrenberg, Fabrizio Biondi, Kevin Corre, Cyrille Jegourel, Simon Kongshøj, Axel Legay

Measuring the Measuring Tools: An Automatic Evaluation of Semantic Metrics for Text Corpora

The ability to compare the semantic similarity between text corpora is important in a variety of natural language processing applications. However, standard methods for evaluating these metrics have yet to be established. We propose a set…

Computation and Language · Computer Science 2022-11-30 George Kour, Samuel Ackerman, Orna Raz, Eitan Farchi, Boaz Carmeli, Ateret Anaby-Tavor

Complexity measurement of natural and artificial languages

We compared entropy for texts written in natural languages (English, Spanish) and artificial languages (computer software) based on a simple expression for the entropy as a function of message length and specific word diversity. Code text…

Computation and Language · Computer Science 2015-12-03 Gerardo Febres, Klaus Jaffe, Carlos Gershenson

Estimating Lexical Complexity from Document-Level Distributions

Existing methods for complexity estimation are typically developed for entire documents. This limitation in scope makes them inapplicable for shorter pieces of text, such as health assessment tools. These typically consist of lists of…

Computation and Language · Computer Science 2024-04-02 Sondre Wold, Petter Mæhlum, Oddbjørn Hove

Termhood-based Comparability Metrics of Comparable Corpus in Special Domain

Cross-Language Information Retrieval (CLIR) and machine translation (MT) resources, such as dictionaries and parallel corpora, are scarce and hard to come by for special domains. Besides, these resources are just limited to a few languages,…

Computation and Language · Computer Science 2013-02-20 Sa Liu, Chengzhi Zhang

FrameNet automatic analysis : a study on a French corpus of encyclopedic texts

This article presents an automatic frame analysis system evaluated on a corpus of French encyclopedic history texts annotated according to the FrameNet formalism. The chosen approach relies on an integrated sequence labeling model which…

Computation and Language · Computer Science 2018-12-20 Gabriel Marzinotto, Géraldine Damnati, Frederic Bechet

A Conceptual Model for Measuring the Complexity of Spreadsheets

Spreadsheets are widely used in industry, even for critical business processes. This implies the need for proper risk assessment in spreadsheets to evaluate the reliability and validity of the spreadsheet's outcome. As related research has…

Software Engineering · Computer Science 2017-04-06 Thomas Reschenhofer, Bernhard Waltl, Klym Shumaiev, Florian Matthes

Standardizing the Measurement of Text Diversity: A Tool and a Comparative Analysis of Scores

The diversity across outputs generated by LLMs shapes perception of their quality and utility. High lexical diversity is often desirable, but there is no standard method to measure this property. Templated answer structures and "canned"…

Computation and Language · Computer Science 2026-02-19 Chantal Shaib, Venkata S. Govindarajan, Joe Barrow, Jiuding Sun, Alexa F. Siu, Byron C. Wallace, Ani Nenkova