Raw data

MetaboAnalyst 5.0: narrowing the gap between raw spectra and functional insights
Since its first release over a decade ago, the MetaboAnalyst web-based platform has become widely used for comprehensive metabolomics data analysis and interpretation. Here we introduce MetaboAnalyst version 5.0, aiming to narrow the gap f…

DeepFM: A Factorization-Machine based Neural Network for CTR Prediction
Learning sophisticated feature interactions behind user behaviors is critical in maximizing CTR for recommender systems. Despite great progress, existing methods seem to have a strong bias towards low- or high-order interactions, or requir…

iProX: an integrated proteome resource
This FAIRsharing record describes: iProX is a public platform for collecting and sharing raw data, analysis results and metadata obtained from proteomics experiments. The iProX repository employs a web-based proteome data submission proces…

The Genome Sequence Archive Family: Toward Explosive Data Growth and Diverse Data Types
The Genome Sequence Archive (GSA) is a data repository for archiving raw sequence data, which provides data storage and sharing services for worldwide scientific communities. Considering explosive data growth with diverse data types, here …

Sustainable data analysis with Snakemake
Data analysis often entails a multitude of heterogeneous steps, from the application of various command line tools to the usage of scripting languages like R or Python for the generation of plots and tables. It is widely recognized that da…

Increasing Transparency Through a Multiverse Analysis
Empirical research inevitably includes constructing a data set by processing raw data into a form ready for statistical analysis. Data processing often involves choices among several reasonable options for excluding, transforming, and codi…

DeepSleepNet: A Model for Automatic Sleep Stage Scoring Based on Raw Single-Channel EEG
This paper proposes a deep learning model, named DeepSleepNet, for automatic sleep stage scoring based on raw single-channel EEG. Most of the existing methods rely on hand-engineered features, which require prior knowledge of sleep analysi…

Web of Science as a data source for research on scientific and scholarly activity
Web of Science (WoS) is the world’s oldest, most widely used and authoritative database of research publications and citations. Based on the Science Citation Index, founded by Eugene Garfield in 1964, it has expanded its selective, balance…

A review: Data pre-processing and data augmentation techniques
This review paper provides an overview of data pre-processing in Machine learning, focusing on all types of problems while building the machine learning problems. It deals with two significant issues in the pre-processing process (i). issu…

GGIR: A Research Community–Driven Open Source R Package for Generating Physical Activity and Sleep Outcomes From Multi-Day Raw Accelerometer Data
Recent technological advances have transformed the research on physical activity initially based on questionnaire data to the most recent objective data from accelerometers. The shift to availability of raw accelerations has increased meas…

From DFT to machine learning: recent approaches to materials science–a review
Recent advances in experimental and computational methods are increasing the quantity and complexity of generated data. This massive amount of raw data needs to be stored and interpreted in order to advance the materials science field. Ide…

ampvis2: an R package to analyse and visualise 16S rRNA amplicon data
Summary: Microbial community analysis using 16S rRNA gene amplicon sequencing is the backbone of many microbial ecology studies. Several approaches and pipelines exist for processing the raw data generated through DNA sequencing and convert…

Learning to Monitor Machine Health with Convolutional Bi-Directional LSTM Networks
In modern manufacturing systems and industries, more and more research efforts have been made in developing effective machine health monitoring systems. Among various machine health monitoring approaches, data-driven methods are gaining in…

Deep Learning for Electromyographic Hand Gesture Signal Classification Using Transfer Learning
In recent years, deep learning algorithms have become increasingly more prominent for their unparalleled ability to automatically learn discriminant features from large amounts of data. However, within the field of electromyography-based g…

Establishing microbial composition measurement standards with reference frames
Differential abundance analysis is controversial throughout microbiome research. Gold standard approaches require laborious measurements of total microbial load, or absolute number of microorganisms, to accurately determine taxonomic shift…

Privacy-Preserving Traffic Flow Prediction: A Federated Learning Approach
Existing traffic flow forecasting approaches by deep learning models achieve excellent success based on a large volume of datasets gathered by governments and organizations. However, these datasets may contain lots of user's private data, …

PlotsOfData—A web app for visualizing data together with their summaries
Reporting of the actual data in graphs and plots increases transparency and enables independent evaluation. On the other hand, data summaries are often used in graphs because they aid interpretation. To democratize state-of-the-art data vi…

fastMRI: An Open Dataset and Benchmarks for Accelerated MRI
Accelerating Magnetic Resonance Imaging (MRI) by taking fewer measurements has the potential to reduce medical costs, minimize stress to patients and make MRI possible in applications where it is currently prohibitively slow or expensive. …

Mapping single-cell data to reference atlases by transfer learning
Large single-cell atlases are now routinely generated to serve as references for analysis of smaller-scale studies. Yet learning from reference data is complicated by batch effects between datasets, limited availability of computational re…

SplitFed: When Federated Learning Meets Split Learning
Federated learning (FL) and split learning (SL) are two popular distributed machine learning approaches. Both follow a model-to-data scenario; clients train and test machine learning models without sharing raw data. SL provides better mode…

Applying Thematic Analysis to Education: A Hybrid Approach to Interpreting Data in Practitioner Research
Thematic analysis (TA), as a qualitative analytic method, is widely used in health care, psychology, and beyond. However, scant details are often given to demonstrate the process of data analysis, especially in the field of education. This…

Raincloud plots: a multi-platform tool for robust data visualization
Across scientific disciplines, there is a rapidly growing recognition of the need for more statistically robust, transparent approaches to data visualization. Complementary to this, many scientists have called for plotting tools that accur…

Rapid and Rigorous Qualitative Data Analysis
Despite the advantages of using qualitative data to advance research and practice, applied researchers agree that the most daunting task is trying to analyze the data rapidly and rigorously. This article introduces a quick and comprehensiv…

The BIG Data Center: from deposition to integration to translation
Biological data are generated at unprecedentedly exponential rates, posing considerable challenges in big data deposition, integration and translation. The BIG Data Center, established at Beijing Institute of Genomics (BIG), Chinese Academ…

Processing two-dimensional X-ray diffraction and small-angle scattering data in DAWN 2
A software package for the calibration and processing of powder X-ray diffraction and small-angle X-ray scattering data is presented. It provides a multitude of data processing and visualization tools as well as a command-line scripting in…

Large-scale analysis of test–retest reliabilities of self-regulation measures
The ability to regulate behavior in service of long-term goals is a widely studied psychological construct known as self-regulation. This wide interest is in part due to the putative relations between self-regulation and a range of real-wo…

Meta-analysis using individual participant data: one-stage and two-stage approaches, and why they may differ
Meta-analysis using individual participant data (IPD) obtains and synthesises the raw, participant-level data from a set of relevant studies. The IPD approach is becoming an increasingly popular tool as an alternative to traditional aggreg…

Retraction: Cardiovascular Disease, Drug Therapy, and Mortality in Covid-19. N Engl J Med. DOI: 10.1056/NEJMoa2007621.
To the Editor: Because all the authors were not granted access to the raw data and the raw data could not be made available to a third-party auditor, we are unable to validate the primary data sour...

Remaining useful life predictions for turbofan engine degradation using semi-supervised deep architecture
In recent years, research has proposed several deep learning (DL) approaches to providing reliable remaining useful life (RUL) predictions in Prognostics and Health Management (PHM) applications. Although supervised DL techniques, such as …