Related papers: Towards Analyzing N-language Polyglot Programs
Multilingual programs, whose implementations are made of different languages, are gaining traction especially in domains, such as web programming, that particularly benefit from the additional flexibility brought by using multiple…
Contemporary software often becomes vastly complex, and we are required to use a variety of technologies and different programming languages for its development. As interoperability between programming languages could cause high overhead…
Large language models have transformed AI-assisted software engineering, but current research remains biased toward high-resource languages such as Python, with weaker performance in languages like Rust and OCaml. Since real-world systems…
As software grows increasingly complex, the quantity and diversity of concerns to be addressed also rise. To answer this diversity of concerns, developers may end up using multiple programming languages in a single software project, a…
Modular analysis of polyglot applications is challenging because heap object flows across language boundaries must be resolved. The state-of-the-art analyses for polyglot applications have two fundamental limitations. First, they assume…
Developer preferences, language capabilities and the persistence of older languages contribute to the trend that large software codebases are often multilingual, that is, written in more than one computer language. While developers can…
Context: Software development tools should work and behave consistently across different programming languages, so that developers do not have to familiarize themselves with new tooling for new languages. Also, being able to combine…
Large language models (LLMs) are at the forefront of transforming numerous domains globally. However, their inclusivity and effectiveness remain limited for non-Latin scripts and low-resource languages. This paper tackles the imperative…
The breakthrough of generative large language models (LLMs) that can solve different tasks through chat interaction has led to a significant increase in the use of general benchmarks to assess the quality or performance of these models…
The increasing complexity of software systems has driven significant advancements in program analysis, as traditional methods are unable to meet the demands of modern software development. To address these limitations, deep learning techniques,…
Polyglot is a pioneering project aimed at enhancing the non-English language performance of multilingual language models. Despite the availability of various multilingual models such as mBERT (Devlin et al., 2019), XGLM (Lin et al., 2022),…
Large language models (LLMs) demonstrate remarkable ability to comprehend, reason, and generate following natural language instructions. However, the development of LLMs has been primarily focused on high-resource languages, such as English,…
In modern software development, the JavaScript ecosystem of various frameworks and libraries used to develop contemporary web applications presents many advantages. JavaScript is a widely known interpreted programming language, simple to…
As part of research on a novel in-process multiprogramming-language interoperability system, this study investigates the interoperability and usage of multiple programming languages within a large dataset of GitHub projects and Stack…
Multilingual language models have significantly advanced due to rapid progress in natural language processing. Models like BLOOM 1.7B, trained on diverse multilingual datasets, aim to bridge linguistic gaps. However, their effectiveness in…
In recent years, large language models (e.g., OpenAI's GPT-4, Meta's LLaMA, Google's PaLM) have become the dominant approach for building AI systems to analyze and generate language online. However, the automated systems that increasingly…
NLP models today strive to support multiple languages and modalities, improving accessibility for diverse users. In this paper, we evaluate their multilingual, multimodal capabilities by testing on a visual reasoning task. We observe…
In this paper, we introduce PolyMath, a multilingual mathematical reasoning benchmark covering 18 languages and 4 easy-to-hard difficulty levels. Our benchmark ensures difficulty comprehensiveness, language diversity, and high-quality…
Multilingual Large Language Models (LLMs) exhibit remarkable cross-lingual abilities, yet often show a systematic bias toward the representations from other languages, resulting in semantic interference when generating content in…
The rise of Large Language Models (LLMs) has revolutionized natural language processing across numerous languages and tasks. However, evaluating LLM performance in a consistent and meaningful way across multiple European languages remains…