Related papers: Generalised sequential crossover of words and lang…

Application of Generalised sequential crossover of languages to generalised splicing

This paper outlines an application of the iterated version of generalised sequential crossover of two languages (which is, in some sense, an abstraction of the crossover of chromosomes in living organisms) in studying some classes of the newly…

Discrete Mathematics · Computer Science 2009-02-24 L. Jeganathan, R. Rama, Ritabrata Sengupta

Learning Regular Languages over Large Ordered Alphabets

This work is concerned with regular languages defined over large alphabets, either infinite or just too large to be expressed enumeratively. We define a generic model where transitions are labeled by elements of a finite partition of the…

Logic in Computer Science · Computer Science 2017-01-11 Irini-Eleftheria Mens, Oded Maler

Which Word Orders Facilitate Length Generalization in LMs? An Investigation with GCG-Based Artificial Languages

Whether language models (LMs) have inductive biases that favor typologically frequent grammatical properties over rare, implausible ones has been investigated, typically using artificial languages (ALs) (White and Cotterell, 2021;…

Computation and Language · Computer Science 2025-10-15 Nadine El-Naggar, Tatsuki Kuribayashi, Ted Briscoe

CroCoSum: A Benchmark Dataset for Cross-Lingual Code-Switched Summarization

Cross-lingual summarization (CLS) has attracted increasing interest in recent years due to the availability of large-scale web-mined datasets and the advancements of multilingual language models. However, given the rareness of naturally…

Computation and Language · Computer Science 2024-05-24 Ruochen Zhang, Carsten Eickhoff

GLUECoS: An Evaluation Benchmark for Code-Switched NLP

Code-switching is the use of more than one language in the same conversation or utterance. Recently, multilingual contextual embedding models, trained on multiple monolingual corpora, have shown promising results on cross-lingual and…

Computation and Language · Computer Science 2020-05-15 Simran Khanuja, Sandipan Dandapat, Anirudh Srinivasan, Sunayana Sitaram, Monojit Choudhury

Systematic Generalization on gSCAN with Language Conditioned Embedding

Systematic Generalization refers to a learning algorithm's ability to extrapolate learned behavior to unseen situations that are distinct but semantically similar to its training data. As shown in recent work, state-of-the-art deep learning…

Artificial Intelligence · Computer Science 2020-10-06 Tong Gao, Qi Huang, Raymond J. Mooney

Cross-lingual Lifelong Learning

The longstanding goal of multi-lingual learning has been to develop a universal cross-lingual model that can withstand the changes in multi-lingual data distributions. There has been a large amount of work to adapt such multi-lingual models…

Computation and Language · Computer Science 2024-01-01 Meryem M'hamdi, Xiang Ren, Jonathan May

Post-Training Language Models for Crosslingual Consistency

Language models often respond inconsistently to translation-equivalent prompts across languages, undermining the reliability of multilingual systems. To quantify this, we give an information-theoretic definition of crosslingual consistency…

Computation and Language · Computer Science 2026-05-29 Tianyu Liu, Jirui Qi, Mrinmaya Sachan, Ryan Cotterell, Raquel Fernández, Arianna Bisazza

GECKO: Generative Language Model for English, Code and Korean

We introduce GECKO, a bilingual large language model (LLM) optimized for Korean and English, along with programming languages. GECKO is pretrained on a balanced, high-quality corpus of Korean and English, employing the LLaMA architecture. In…

Computation and Language · Computer Science 2024-05-27 Sungwoo Oh, Donggyu Kim

A Survey on Cross-Lingual Summarization

Cross-lingual summarization is the task of generating a summary in one language (e.g., English) for the given document(s) in a different language (e.g., Chinese). Against the background of globalization, this task has attracted increasing…

Computation and Language · Computer Science 2022-08-31 Jiaan Wang, Fandong Meng, Duo Zheng, Yunlong Liang, Zhixu Li, Jianfeng Qu, Jie Zhou

A common scheme for running NLO ep event generators

In this article we present a generic interface to several next-to-leading order cross-section programs. This enables the user to implement his/her code once and make cross-checks with different programs.

High Energy Physics - Phenomenology · Physics 2007-05-23 Thomas Hadig, Gavin McCance

An $O(n^2)$ Algorithm for Computing Longest Common Cyclic Subsequence

The longest common subsequence (LCS) problem is a classic and well-studied problem in computer science. LCS is a central problem in stringology and finds broad applications in text compression, error-detecting codes and biological…

Data Structures and Algorithms · Computer Science 2010-04-20 Shihabur Rahman Chowdhury, Masud Hasan, Sumaiya Iqbal, M. Sohel Rahman

Advancing the Database of Cross-Linguistic Colexifications with New Workflows and Data

Lexical resources are crucial for cross-linguistic analysis and can provide new insights into computational models for natural language learning. Here, we present an advanced database for comparative studies of words with multiple meanings,…

Computation and Language · Computer Science 2025-08-22 Annika Tjuka, Robert Forkel, Christoph Rzymski, Johann-Mattis List

COGS: A Compositional Generalization Challenge Based on Semantic Interpretation

Natural language is characterized by compositionality: the meaning of a complex expression is constructed from the meanings of its constituent parts. To facilitate the evaluation of the compositional abilities of language processing…

Computation and Language · Computer Science 2020-10-13 Najoung Kim, Tal Linzen

Generalized Multimodal ELBO

Multiple data types naturally co-occur when describing real-world phenomena and learning from them is a long-standing goal in machine learning research. However, existing self-supervised generative models approximating an ELBO are not able…

Machine Learning · Computer Science 2021-06-28 Thomas M. Sutter, Imant Daunhawer, Julia E. Vogt

PTL-separability and closures for WQOs on words

We introduce a flexible class of well-quasi-orderings (WQOs) on words that generalizes the ordering of (not necessarily contiguous) subwords. Each such WQO induces a class of piecewise testable languages (PTLs) as Boolean combinations of…

Formal Languages and Automata Theory · Computer Science 2018-02-22 Georg Zetzsche

From Data to Dialogue: Unlocking Language for All

Traditional linguists have proposed the use of a General Service List (GSL) to assist new language learners in identifying the most important words in English. This process requires linguistic expertise, subjective input, and a considerable…

Computation and Language · Computer Science 2025-12-18 Dakota Ellis, Samy Bakikerali, Wanshan Chen, Bao Dinh, Uyen Le

Subword-Level Language Identification for Intra-Word Code-Switching

Language identification for code-switching (CS), the phenomenon of alternating between two or more languages in conversations, has traditionally been approached under the assumption of a single language per token. However, if at least one…

Computation and Language · Computer Science 2019-04-04 Manuel Mager, Özlem Çetinoğlu, Katharina Kann

Global Encoding for Abstractive Summarization

In neural abstractive summarization, the conventional sequence-to-sequence (seq2seq) model often suffers from repetition and semantic irrelevance. To tackle the problem, we propose a global encoding framework, which controls the information…

Computation and Language · Computer Science 2018-06-14 Junyang Lin, Xu Sun, Shuming Ma, Qi Su

Are Structural Concepts Universal in Transformer Language Models? Towards Interpretable Cross-Lingual Generalization

Large language models (LLMs) have exhibited considerable cross-lingual generalization abilities, whereby they implicitly transfer knowledge across languages. However, the transfer is not equally successful for all languages, especially for…

Computation and Language · Computer Science 2023-12-25 Ningyu Xu, Qi Zhang, Jingting Ye, Menghan Zhang, Xuanjing Huang