Related papers: Parsing TTree Formula in Python
Uproot reads ROOT TTrees using pure Python. For numerical and (singly) jagged arrays, this is fast because a whole block of data can be interpreted as an array without modifying the data. For other cases, such as arrays of…
Trinity-RFT is a general-purpose, unified and easy-to-use framework designed for reinforcement fine-tuning (RFT) of large language models. It is built with a modular and decoupled design, consisting of (1) an RFT-core that unifies and…
Packrat parsing is a novel technique for implementing parsers in a lazy functional programming language. A packrat parser provides the power and flexibility of top-down parsing with backtracking and unlimited lookahead, but nevertheless…
In real practice, questions are typically complex and knowledge-intensive, requiring Large Language Models (LLMs) to recognize the multifaceted nature of the question and reason across multiple information sources. Iterative and adaptive…
This paper presents Tree Notation, a new simple, universal syntax. Language designers can invent new programming languages, called Tree Languages, on top of Tree Notation. Tree Languages have a number of advantages over traditional…
Programmers regularly use strings to encode many types of data, such as Unix file paths, URLs, and email addresses, that are conceptually different. However, existing mainstream programming languages use a unified string type to represent…
Source code can be parsed into the abstract syntax tree (AST) based on defined syntax rules. However, in pre-training, little work has considered the incorporation of tree structure into the learning process. In this paper, we present…
Current Large Language Models (LLMs) exhibit limited ability to understand table structures and to apply precise numerical reasoning, which is crucial for tasks such as table question answering (TQA) and table-based fact verification (TFV).…
Logic languages based on the theory of rational, possibly infinite, trees have much appeal in that rational trees allow for faster unification (due to the safe omission of the occurs-check) and increased expressivity (cyclic terms can…
TauREx 3 is the next generation of the TauREx exoplanet atmospheric retrieval framework for Windows, Mac, and Linux. It is a complete rewrite with a full Python stack that makes it easy-to-use, high-performance, dynamic, and flexible. The…
The Fortran programming language continues to dominate the scientific computing community, with many production codes written in the outdated Fortran-77 dialect, yet with many non-standard extensions such as Cray poiters. This creates…
Syntactic dependencies can be predicted with high accuracy, and are useful for both machine-learned and pattern-based information extraction tasks. However, their utility can be improved. These syntactic dependencies are designed to…
Large Language Models (LLMs) have shown strong performance in automated source-to-target code translation through pretraining on extensive code corpora. However, mainstream LLM-based code translation methods suffer from two critical…
The use of natural language processing (NLP) is gaining popularity in software engineering. In order to correctly perform NLP, we must pre-process the textual information to separate natural language from other information, such as log…
For the last several months the main focus of development in the ROOT I/O package has been code consolidation and performance improvements. Access to remote files is affected both by bandwidth and latency. We introduced a pre-fetch…
The Triple Pattern Fragment (TPF) approach is de-facto a new way to publish Linked Data at low cost and with high server availability. However, data providers hosting TPF servers are not able to analyze the SPARQL queries they execute…
We present a general method for converting any family of unsatisfiable CNF formulas that is hard for one of the simplest proof systems, tree resolution, into formulas that require large rank in any proof system that manipulates polynomials…
How can we perform computations over natural language representations to solve tasks that require symbolic and numeric reasoning? We propose natural language embedded programs (NLEP) as a unifying framework for addressing math/symbolic…
While it has become common to perform automated translations on natural language, performing translations between different representations of mathematical formulae has thus far not been possible. We implemented the first translator for…
Large language model agents that use external tools are often implemented through reactive execution, in which reasoning is repeatedly recomputed after each observation, increasing latency and sensitivity to error propagation. This work…