Data manipulation language
xarray: N-D labeled Arrays and Datasets in Python
Presentation given at the 2020 CZI EOSS Meeting.
ObspyDMT: a Python toolbox for retrieving and processing large seismological data sets
We present obspyDMT, a free, open-source software toolbox for the query, retrieval, processing and management of seismological data sets, including very large, heterogeneous and/or dynamically growing ones. ObspyDMT simplifies and speeds u…
Can language models automate data wrangling?
The automation of data science and other data manipulation processes depends on the integration and formatting of ‘messy’ data. Data wrangling is an umbrella term for these tedious and time-consuming tasks. Tasks such as transforming dates,…
Typos’ Effects on Web-Based Programming Code Output: A Computational Linguistics Study
Computational linguistics is concerned with understanding language from a computational perspective and constructing artifacts that are useful in processing and generating language. In the use of language, whether human language or program…
Scalable Micro-planned Generation of Discourse from Structured Data
We present a framework for generating natural language description from structured data such as tables; the problem comes under the category of data-to-text natural language generation (NLG). Modern data-to-text NLG systems typically use e…
Natural Language to SQL Queries: A Review
The relational database is the way of maintaining, storing, and accessing structured data but in order to access the data in that database the queries need to be translated in the format of SQL queries. Using natural language rather than S…
AI Assistants: A Framework for Semi-Automated Data Wrangling
Data wrangling tasks such as obtaining and linking data from various sources, transforming data formats, and correcting erroneous records, can constitute up to 80% of typical data engineering work. Despite the rise of machine learning and …
S, R, and data science
Data science is increasingly important and challenging. It requires computational tools and programming environments that handle big data and difficult computations, while supporting creative, high-quality analysis. The R language and rela…
The Architecture of an Agricultural Data Aggregation and Conversion Model for Smart Farming
Monitoring and control systems integrated into agricultural machinery enable the development of agricultural analyses with advanced management tools, but the full use of all available data is often limited by the lack of uniformity among d…
S, R, and Data Science
Data science is increasingly important and challenging. It requires computational tools and programming environments that handle big data and difficult computations, while supporting creative, high-quality analysis. The R language and relate…
Research Report: The Parsley Data Format Definition Language
Any program that reads formatted input relies on parsing software to check the input for validity and transform it into a representation suitable for further processing. Many security vulnerabilities can be attributed to poorly defined gra…
Expressing and Applying C++ Code Transformations for the HDF5 API Through a DSL
Hierarchical Data Format (HDF5) is a popular binary storage solution in high performance computing (HPC) and other scientific fields. It has bindings for many popular programming languages, including C++, which is widely used in the HPC fi…
The history and recent advances of Natural Language Interfaces for Databases Querying
Databases have always been the most important topic in the study of information systems, and an indispensable tool in all information management systems. However, the extraction of information stored in these databases is generally carried…
Mirror: A Natural Language Interface for Data Querying, Summarization, and Visualization
We present Mirror, an open-source platform for data exploration and analysis powered by large language models. Mirror offers an intuitive natural language interface for querying databases, and automatically generates executable SQL command…
DPDS
Successful data-driven science requires a complex combination of data engineering pipelines and data modelling techniques. Robust and defensible results can only be achieved when each step in the pipeline that is designed to clean, transfo…
beeRapp: an R shiny app for automated high-throughput explorative analysis of multivariate behavioral data
Summary: Animal behavioral studies typically generate high-dimensional datasets consisting of multiple correlated outcome measures across distinct or related behavioral domains. Here, we introduce the BEhavioral Explorative analysis R shiny…
Analyzing and Presenting Data with LabVIEW
LabVIEW is an abbreviation for Laboratory Virtual Instrument Engineering Workbench and allows scientists and engineers to develop and implement an interactive program. LabVIEW has been specially developed to take measurements, analyze data…
obspyDMT: A Python Toolbox for Retrieving and Processing of Large Seismological Datasets
We present obspyDMT, a free, open source software toolbox for the query, retrieval, processing and management of seismological data sets, including very large, heterogeneous, and/or dynamically growing ones. obspyDMT simplifies and speeds …
SpeakQL Natural Language to SQL
Generating SQL queries from natural language is a long-standing open problem that has recently attracted extensive interest. Natural Language Interface (NLI) is the confluence of Natural Language Processing (NLP) and Human-Compute…
Essentials of Data Wrangling
Fundamentally, data wrangling is an elaborate process of transforming, enriching, and mapping data from one raw data form into another, to make it more valuable for analysis and enhancing its quality. It is considered as a core task within…
Syntax and Table Aware Parsing Based Naturalized Structured Query Language
A database is characterized as an accumulation of data that is organized to access, manage, and update information effectively. All information is stored in a database and there are numerous approaches to interface with the database to access o…
MedEx – Data Analytics for Medical Domain Experts in Real-Time
Translational research in the medical sector is dependent on clear communication between all participants. Visualization helps to represent data from different sources in a comprehensible way across disciplines. Existing tools for clinical…
Leaf—An open-source, model-agnostic, data-driven web application for cohort discovery and translational biomedical research
Objective: Academic medical centers and health systems are increasingly challenged with supporting appropriate secondary use of data that originate from multiple sources. Enterprise Data Warehouses (EDWs) have emerged as central resources f…
Transforming spreadsheets with data noodles
© 2016 IEEE. Data wrangling is the term used by data scientists for the work of re-organising data into a new structure, before work starts on reporting or analysis. We present a prototype that applies programming by example methods to data…
Naturalizing a Programming Language via Interactive Learning
Our goal is to create a convenient natural language interface for performing well-specified but complex actions such as analyzing data, manipulating text, and querying databases. However, existing natural language interfaces for such tasks…
Data Preparation in Context of Social Sciences Research
In many research fields, including the social sciences, one needs to prepare quality data by pre-processing the raw data. An essential step in the data analysis process is data preparation. The raw data which we collect never comes in a form t…
TinyVLA: Towards Fast, Data-Efficient Vision-Language-Action Models for Robotic Manipulation
Vision-Language-Action (VLA) models have shown remarkable potential in visuomotor control and instruction comprehension through end-to-end learning processes. However, current VLA models face significant challenges: they are slow during in…
PyHelpers: An open-source toolkit for facilitating Python users' data manipulation tasks
PyHelpers is an open-source Python package designed to streamline data (pre-)processing and manipulation tasks. It accommodates a wide range of functions and classes grounded in practical applications, making common data operations more ac…
Generating SQL Command Syntax Using MySQL Based on Typing Command Sentence
An information retrieval system is a system that is widely used to retrieve information. This research will discuss how the system finds back the information stored in database tables. Tables in the database are arranged to store all forms of…
Hindi Language Interface to Database
In our everyday lives we require information to accomplish daily tasks. A database is one of the most important sources of information. Database systems have been widely used in data storage and retrieval. However, to extract information fro…