Related papers: SODA: a TypeScript/JavaScript Library for Visualiz…

Soda: An Object-Oriented Functional Language for Specifying Human-Centered Problems

We present Soda (Symbolic Objective Descriptive Analysis), a language that helps to treat qualities and quantities in a natural way and greatly simplifies the task of checking their correctness. We present key properties for the language…

Programming Languages · Computer Science 2024-11-21 Julian Alfredo Mendez

TopoEmbedding, a web tool for the interactive analysis of persistent homology

Software libraries for Topological Data Analysis (TDA) offer limited support for interactive visualization. Most libraries only allow visualizing topological descriptors (e.g., persistence diagrams), and lose the connection with the…

Graphics · Computer Science 2022-04-22 Xueyi Bao, Guoxi Liu, Federico Iuricich

SODA: Semantic-Oriented Distributional Alignment for Generative Recommendation

Generative recommendation has emerged as a scalable alternative to traditional retrieve-and-rank pipelines by operating in a compact token space. However, existing methods mainly rely on discrete code-level supervision, which leads to…

Information Retrieval · Computer Science 2026-03-03 Ziqi Xue, Dingxian Wang, Yimeng Bai, Shuai Zhu, Jialei Li, Xiaoyan Zhao, Frank Yang, Andrew Rabinovich, Yang Zhang, Pablo N. Mendes

SAINE: Scientific Annotation and Inference Engine of Scientific Research

We present SAINE, a Scientific Annotation and Inference ENgine based on a set of standard open-source software, such as Label Studio and MLflow. We show that our annotation engine can benefit the further development of a more accurate…

Digital Libraries · Computer Science 2023-07-12 Susie Xi Rao, Yilei Tu, Peter H. Egger

SODA: A Semantics-Aware Optimization Framework for Data-Intensive Applications Using Hybrid Program Analysis

In the era of data explosion, a growing number of data-intensive computing frameworks, such as Apache Hadoop and Spark, have been proposed to handle the massive volume of unstructured data in parallel. Since programming models provided by…

Distributed, Parallel, and Cluster Computing · Computer Science 2021-07-27 Bingbing Rao, Zixia Liu, Hong Zhang, Siyang Lu, Liqiang Wang

YEDDA: A Lightweight Collaborative Text Span Annotation Tool

In this paper, we introduce YEDDA, a lightweight but efficient and comprehensive open-source tool for text span annotation. YEDDA provides a systematic solution for text span annotation, ranging from collaborative user…

Computation and Language · Computer Science 2018-05-28 Jie Yang, Yue Zhang, Linwei Li, Xingxuan Li

CADV: A software visualization approach for code annotations distribution

Code annotations are a widely used feature in Java systems to configure custom metadata on programming elements. Their increasing presence creates the need for approaches to assess and comprehend their usage and distribution. In this…

Software Engineering · Computer Science 2022-10-14 Phyllipe Lima, Jorge Melegati, Everaldo Gomes, Nathalya Stefhany Pereira, Eduardo Guerra, Paulo Meirelles

Using Glowscript to Teach Numerical Modeling in Undergraduate Biology Education

Mathematical and numerical modeling is an increasingly important, yet often neglected, topic for biology students. We have found Glowscript to facilitate teaching and introducing computer simulations to students. In particular, the built-in…

Physics Education · Physics 2022-04-28 Joshua G. Schreibeis, Olivia M. Merideth, Gavin A. Buxton

SODA: Million-scale Dialogue Distillation with Social Commonsense Contextualization

Data scarcity has been a long-standing issue in the field of open-domain social dialogue. To quench this thirst, we present SODA: the first publicly available, million-scale high-quality social dialogue dataset. By contextualizing social…

Computation and Language · Computer Science 2023-10-25 Hyunwoo Kim, Jack Hessel, Liwei Jiang, Peter West, Ximing Lu, Youngjae Yu, Pei Zhou, Ronan Le Bras, Malihe Alikhani, Gunhee Kim, Maarten Sap, Yejin Choi

Recommendations for Datasets for Source Code Summarization

Source Code Summarization is the task of writing short, natural language descriptions of source code. The main use for these descriptions is in software documentation, e.g., the one-sentence Java method descriptions in JavaDocs. Code…

Computation and Language · Computer Science 2019-04-05 Alexander LeClair, Collin McMillan

JSOL: JavaScript Open-source Library for Grammar of Graphics

In this paper, we introduce the JavaScript Open-source Library (JSOL), a high-level grammar for representing data in visualization graphs and plots. JSOL's perspective on the grammar of graphics is unique; it provides state-of-art…

Graphics · Computer Science 2022-01-13 Waleed A. Yousef, Hisham E. Mohammed, Andrew A. Naguib, Rafat S. Eid, Sherif E. Emabrak, Ahmed F. Hamed, Yusuf M. Khalifa, Shrouk T. AbdElrheem, Eman A. Awad, Sara G. Gaafar, Alaa M. Mamdoh, Nada A. Shawky

WASA: A Web Application for Sequence Annotation

Data annotation is an important and necessary task for all NLP applications. Designing and implementing a web-based application that enables many annotators to annotate and enter their input into one central database is not a trivial task.…

Computation and Language · Computer Science 2019-10-07 Fahad AlGhamdi, Mona Diab

SODA: Generating SQL for Business Users

The purpose of data warehouses is to enable business analysts to make better decisions. Over the years the technology has matured and data warehouses have become extremely successful. As a consequence, more and more data has been added to…

Databases · Computer Science 2012-07-03 Lukas Blunschi, Claudio Jossen, Donald Kossman, Magdalini Mori, Kurt Stockinger

CodeLens: An Interactive Tool for Visualizing Code Representations

Representing source code in a generic input format is crucial to automate software engineering tasks, e.g., applying machine learning algorithms to extract information. Visualizing code representations can further enable human experts to…

Software Engineering · Computer Science 2023-07-28 Yuejun Guo, Seifeddine Bettaieb, Qiang Hu, Yves Le Traon, Qiang Tang

Smaller but Better: Self-Paced Knowledge Distillation for Lightweight yet Effective LCMs

Large code models (LCMs) have remarkably advanced the field of code generation. Despite their impressive capabilities, they still face practical deployment issues, such as high inference costs, limited accessibility of proprietary LCMs, and…

Software Engineering · Computer Science 2025-05-21 Yujia Chen, Yang Ye, Zhongqi Li, Yuchi Ma, Cuiyun Gao

DR-Tools: a suite of lightweight open-source tools to measure and visualize Java source code

In Software Engineering, some of the most critical activities are maintenance and evolution. However, to perform both with quality, minimizing impacts and risks, developers need to analyze and identify where the main problems come from…

Software Engineering · Computer Science 2020-08-11 Guilherme Lacerda, Fabio Petrillo, Marcelo Pimenta

NOVA: A Practical Method for Creating Notebook-Ready Visual Analytics

How can we develop visual analytics (VA) tools that can be easily adopted? Visualization researchers have developed a large number of web-based VA tools to help data scientists in a wide range of tasks. However, adopting these standalone…

Human-Computer Interaction · Computer Science 2023-05-16 Zijie J. Wang, David Munechika, Seongmin Lee, Duen Horng Chau

NoteFlow: Recommending Charts as Sight Glasses for Tracing Data Flow in Computational Notebooks

Exploratory Data Analysis (EDA) is a routine task for data analysts, often conducted using flexible computational notebooks. During EDA, data workers process, visualize, and interpret data tables, making decisions about subsequent analysis.…

Human-Computer Interaction · Computer Science 2025-02-05 Yuan Tian, Dazhen Deng, Sen Yang, Huawei Zheng, Bowen Shi, Kai Xiong, Xinjing Yi, Yingcai Wu

DART: A Lightweight Quality-Suggestive Data-to-Text Annotation Tool

We present a lightweight annotation tool, the Data AnnotatoR Tool (DART), for the general task of labeling structured data with textual descriptions. The tool is implemented as an interactive application that reduces human efforts in…

Computation and Language · Computer Science 2020-12-02 Ernie Chang, Jeriah Caplinger, Alex Marin, Xiaoyu Shen, Vera Demberg

Data-Driven Evidence-Based Syntactic Sugar Design

Programming languages are essential tools for developers, and their evolution plays a crucial role in supporting the activities of developers. One instance of programming language evolution is the introduction of syntactic sugars, which are…

Software Engineering · Computer Science 2024-02-05 David OBrien, Robert Dyer, Tien N. Nguyen, Hridesh Rajan