Related papers: Generalised sequential crossover of words and lang…

200 papers

This paper outlines an application of the iterated version of generalised sequential crossover of two languages (which is, in some sense, an abstraction of the crossover of chromosomes in living organisms) in studying some classes of the newly…

Discrete Mathematics · Computer Science 2009-02-24 L. Jeganathan, R. Rama, Ritabrata Sengupta

This work is concerned with regular languages defined over large alphabets, either infinite or just too large to be expressed enumeratively. We define a generic model where transitions are labeled by elements of a finite partition of the…

Logic in Computer Science · Computer Science 2017-01-11 Irini-Eleftheria Mens, Oded Maler

Whether language models (LMs) have inductive biases that favor typologically frequent grammatical properties over rare, implausible ones has been investigated, typically using artificial languages (ALs) (White and Cotterell, 2021;…

Computation and Language · Computer Science 2025-10-15 Nadine El-Naggar, Tatsuki Kuribayashi, Ted Briscoe

Cross-lingual summarization (CLS) has attracted increasing interest in recent years due to the availability of large-scale web-mined datasets and the advancements of multilingual language models. However, given the rareness of naturally…

Computation and Language · Computer Science 2024-05-24 Ruochen Zhang, Carsten Eickhoff

Code-switching is the use of more than one language in the same conversation or utterance. Recently, multilingual contextual embedding models, trained on multiple monolingual corpora, have shown promising results on cross-lingual and…

Computation and Language · Computer Science 2020-05-15 Simran Khanuja, Sandipan Dandapat, Anirudh Srinivasan, Sunayana Sitaram, Monojit Choudhury

Systematic Generalization refers to a learning algorithm's ability to extrapolate learned behavior to unseen situations that are distinct but semantically similar to its training data. As shown in recent work, state-of-the-art deep learning…

Artificial Intelligence · Computer Science 2020-10-06 Tong Gao, Qi Huang, Raymond J. Mooney

The longstanding goal of multi-lingual learning has been to develop a universal cross-lingual model that can withstand the changes in multi-lingual data distributions. There has been a large amount of work to adapt such multi-lingual models…

Computation and Language · Computer Science 2024-01-01 Meryem M'hamdi, Xiang Ren, Jonathan May

Language models often respond inconsistently to translation-equivalent prompts across languages, undermining the reliability of multilingual systems. To quantify this, we give an information-theoretic definition of crosslingual consistency…

Computation and Language · Computer Science 2026-05-29 Tianyu Liu, Jirui Qi, Mrinmaya Sachan, Ryan Cotterell, Raquel Fernández, Arianna Bisazza

We introduce GECKO, a bilingual large language model (LLM) optimized for Korean and English, along with programming languages. GECKO is pretrained on a balanced, high-quality corpus of Korean and English, employing the LLaMA architecture. In…

Computation and Language · Computer Science 2024-05-27 Sungwoo Oh, Donggyu Kim

Cross-lingual summarization is the task of generating a summary in one language (e.g., English) for the given document(s) in a different language (e.g., Chinese). Against the backdrop of globalization, this task has attracted increasing…

Computation and Language · Computer Science 2022-08-31 Jiaan Wang, Fandong Meng, Duo Zheng, Yunlong Liang, Zhixu Li, Jianfeng Qu, Jie Zhou

In this article we present a generic interface to several next-to-leading order cross-section programs. This enables the user to implement his/her code once and make cross-checks with different programs.

High Energy Physics - Phenomenology · Physics 2007-05-23 Thomas Hadig, Gavin McCance

The longest common subsequence (LCS) problem is a classic and well-studied problem in computer science. LCS is a central problem in stringology and finds broad applications in text compression, error-detecting codes and biological…

Data Structures and Algorithms · Computer Science 2010-04-20 Shihabur Rahman Chowdhury, Masud Hasan, Sumaiya Iqbal, M. Sohel Rahman

Lexical resources are crucial for cross-linguistic analysis and can provide new insights into computational models for natural language learning. Here, we present an advanced database for comparative studies of words with multiple meanings,…

Computation and Language · Computer Science 2025-08-22 Annika Tjuka, Robert Forkel, Christoph Rzymski, Johann-Mattis List

Natural language is characterized by compositionality: the meaning of a complex expression is constructed from the meanings of its constituent parts. To facilitate the evaluation of the compositional abilities of language processing…

Computation and Language · Computer Science 2020-10-13 Najoung Kim, Tal Linzen

Multiple data types naturally co-occur when describing real-world phenomena and learning from them is a long-standing goal in machine learning research. However, existing self-supervised generative models approximating an ELBO are not able…

Machine Learning · Computer Science 2021-06-28 Thomas M. Sutter, Imant Daunhawer, Julia E. Vogt

We introduce a flexible class of well-quasi-orderings (WQOs) on words that generalizes the ordering of (not necessarily contiguous) subwords. Each such WQO induces a class of piecewise testable languages (PTLs) as Boolean combinations of…

Formal Languages and Automata Theory · Computer Science 2018-02-22 Georg Zetzsche

Traditional linguists have proposed the use of a General Service List (GSL) to assist new language learners in identifying the most important words in English. This process requires linguistic expertise, subjective input, and a considerable…

Computation and Language · Computer Science 2025-12-18 Dakota Ellis, Samy Bakikerali, Wanshan Chen, Bao Dinh, Uyen Le

Language identification for code-switching (CS), the phenomenon of alternating between two or more languages in conversations, has traditionally been approached under the assumption of a single language per token. However, if at least one…

Computation and Language · Computer Science 2019-04-04 Manuel Mager, Özlem Çetinoğlu, Katharina Kann

In neural abstractive summarization, the conventional sequence-to-sequence (seq2seq) model often suffers from repetition and semantic irrelevance. To tackle the problem, we propose a global encoding framework, which controls the information…

Computation and Language · Computer Science 2018-06-14 Junyang Lin, Xu Sun, Shuming Ma, Qi Su

Large language models (LLMs) have exhibited considerable cross-lingual generalization abilities, whereby they implicitly transfer knowledge across languages. However, the transfer is not equally successful for all languages, especially for…

Computation and Language · Computer Science 2023-12-25 Ningyu Xu, Qi Zhang, Jingting Ye, Menghan Zhang, Xuanjing Huang