Related papers: Comparing with Python: Text Analysis in Stata

200 papers

Most of the data produced in software projects is of textual nature: source code, specifications, or documentations. The advances in quantitative analysis methods drove a lot of data analytics in software engineering. This has overshadowed…

Software Engineering · Computer Science 2016-12-02 S. Wagner, D. Méndez Fernández

This guide introduces Large Language Models (LLM) as a highly versatile text analysis method within the social sciences. As LLMs are easy-to-use, cheap, fast, and applicable on a broad range of text analysis tasks, ranging from text…

Computation and Language · Computer Science 2023-07-26 Petter Törnberg

Analyzing textual data is a very challenging task because of the huge volume of data generated daily. Fundamental issues in text analysis include the lack of structure in document datasets, the need for various preprocessing steps (e.g.,…

Databases · Computer Science 2016-12-20 Ciprian-Octavian Truică, Jérôme Darmont, Julien Velcin

Computer-assisted reading and analysis of text has various applications in the humanities and social sciences. The increasing size of many electronic text archives has the advantage of a more complete analysis but the disadvantage of taking…

Databases · Computer Science 2007-05-23 Steven Keith, Owen Kaser, Daniel Lemire

In these lecture notes, a selection of frequently required statistical tools will be introduced and illustrated. They allow to post-process data that stem from, e.g., large-scale numerical simulations (aka sequence of random experiments).…

Data Analysis, Statistics and Probability · Physics 2012-07-26 O. Melchert

Text data is inherently temporal. The meaning of words and phrases changes over time, and the context in which they are used is constantly evolving. This is not just true for social media data, where the language used is rapidly influenced…

Computation and Language · Computer Science 2025-03-05 Kai-Robin Lange, Niklas Benner, Lars Grönberg, Aymane Hachcham, Imene Kolli, Jonas Rieger, Carsten Jentsch

We use commercially available text analysis technology to process interview text data from a computational social science study. We find that topical clustering and terminological enrichment provide for convenient exploration and…

Computation and Language · Computer Science 2020-12-01 Jussi Karlgren, Renee Li, Eva M Meyersson Milgrom

Programs that process data that reside in files are widely used in varied domains, such as banking, healthcare, and web-traffic analysis. Precise static analysis of these programs in the context of software verification and transformation…

Programming Languages · Computer Science 2015-04-06 Raveendra Kumar Medicherla, Raghavan Komondoor, S. Narendran

Meta-analysis is a data aggregation method that establishes an overall and objective level of evidence based on the results of several studies. It is necessary to maintain a high level of homogeneity in the aggregation of data collected…

In Programming by Example, a system attempts to infer a program from input and output examples, generally by searching for a composition of certain base functions. Performing a naive brute force search is infeasible for even mildly involved…

Artificial Intelligence · Computer Science 2012-09-19 Aditya Krishna Menon, Omer Tamuz, Sumit Gulwani, Butler Lampson, Adam Tauman Kalai

Static source code analysis is a powerful tool for finding and fixing bugs when deployed properly; it is, however, all too easy to deploy it in a way that looks good superficially, but which misses important defects, shows many false…

Software Engineering · Computer Science 2022-02-25 Flash Sheridan

Text classification helps analyse texts for semantic meaning and relevance, by mapping the words against this hierarchy. An analysis of various types of texts is invaluable to understanding both their semantic meaning, as well as their…

Machine Learning · Computer Science 2022-11-16 Chaitanya Chadha, Vandit Gupta, Deepak Gupta, Ashish Khanna

Analyzing texts such as open-ended responses, headlines, or social media posts is a time- and labor-intensive process highly susceptible to bias. LLMs are promising tools for text analysis, using either a predefined (top-down) or a…

In recent years, dynamic languages, such as JavaScript or Python, have been increasingly used in a wide range of fields and applications. Their tricky and misunderstood behaviors pose a hard challenge for static analysis of these…

Programming Languages · Computer Science 2019-08-21 Vincenzo Arceri, Isabella Mastroeni

Critical text assessment is at the core of many expert activities, such as fact-checking, peer review, and essay grading. Yet, existing work treats critical text assessment as a black box problem, limiting interpretability and human-AI…

Computation and Language · Computer Science 2025-06-03 Nils Dycke, Matej Zečević, Ilia Kuznetsov, Beatrix Suess, Kristian Kersting, Iryna Gurevych

Python is one of the most popular programming languages; as such, projects written in Python involve an increasing number of diverse security vulnerabilities. However, existing state-of-the-art analysis tools for Python only support a few…

Software Engineering · Computer Science 2026-01-22 Yoann Marquer, Domenico Bianculli, Lionel C. Briand

A large amount of data is produced every second from modern information systems such as mobile devices, the world wide web, Internet of Things, social media, etc. Analysis and mining of this massive data requires a lot of advanced tools and…

Machine Learning · Computer Science 2020-01-13 Rising Odegua, Festus Ikpotokin

TextDescriptives is a Python package for calculating a large variety of metrics from text. It is built on top of spaCy and can be easily integrated into existing workflows. The package has already been used for analysing the linguistic…

Computation and Language · Computer Science 2023-10-30 Lasse Hansen, Ludvig Renbo Olsen, Kenneth Enevoldsen

Performance analysis is a critical step in the oft-repeated, iterative process of performance tuning of parallel programs. Per-process, per-thread traces (detailed logs of events with timestamps) enable in-depth analysis of parallel program…

Distributed, Parallel, and Cluster Computing · Computer Science 2024-05-15 Abhinav Bhatele, Rakrish Dhakal, Alexander Movsesyan, Aditya K. Ranjan, Onur Cankur

Literary analysis, criticism or studies is a largely valued field with dedicated journals and researchers which remains mostly within the humanities scope. Text analytics is the computer-aided process of deriving information from texts. In…

Computation and Language · Computer Science 2017-10-26 Renato Fabbri, Luis Henrique Garcia