Ke Hu
Printed sensing human-machine interface with individualized adaptive machine learning Open
Developing intelligent robots with integrated sensing capabilities is critical for advanced manufacturing, medical robots, and embodied intelligence. Existing robotic sensing technologies are limited to recording of acceleration, driving t…
Word Level Timestamp Generation for Automatic Speech Recognition and Translation Open
We introduce a data-driven approach for enabling word-level timestamp prediction in the Canary model. Accurate timestamp information is crucial for a variety of downstream tasks such as speech content retrieval and timed subtitles. While t…
SALM-Duplex: Efficient and Direct Duplex Modeling for Speech-to-Speech Language Model Open
Spoken dialogue is an intuitive form of human-computer interaction, yet current speech language models often remain constrained to turn-based exchanges, lacking real-time adaptability such as user barge-in. We propose a novel duplex speech…
TESS photometry, radial velocity, and orbital period investigations of four eclipsing contact binaries Open
We collected photometric data from the Transiting Exoplanet Survey Satellite and spectroscopic observations from the Large Sky Area Multi-Object Fiber Spectroscopic Telescope. Using this data, we simultaneously analyzed the radial velocity…
Enhanced Landslide Risk Evaluation in Hydroelectric Reservoir Zones Utilizing an Improved Random Forest Approach Open
Landslides on reservoir slopes are one of the key geologic hazards that threaten the safe operation of hydropower plants. The aim of our study was to reduce the limitations of the existing methods of landslide risk assessment when dealing …
Research on the Cross-Industry Application of Autonomous Driving Technology in the Field of FinTech Open
This thesis focuses on the interdisciplinary integration of autonomous driving technology and financial technology (FinTech), exploring the synergistic effects and application prospects of these two cutting-edge fields under the impetus of…
Artificial Intelligence Empowering Robo-Advisors: A Data-Driven Wealth Management Model Analysis Open
In the digital age, the rapid development of financial technology has brought new opportunities to wealth management, especially with the emergence of robo-advisors as an innovative wealth management model that is increasingly favored by i…
Adversarial Machine Learning in Cybersecurity: Attacks and Defenses Open
Adversarial Machine Learning (AML) refers to the research field that involves testing and improving machine learning models by introducing adversarial samples or attack techniques. In the cybersecurity domain, AML has significant potential…
SpeechIQ: Speech-Agentic Intelligence Quotient Across Cognitive Levels in Voice Understanding by Large Language Models Open
We introduce Speech-based Intelligence Quotient (SIQ) as a new form of human cognition-inspired evaluation pipeline for voice understanding large language models, LLM Voice, designed to assess their voice understanding ability. Moving beyo…
Cation exchange reshapes Cu active sites to promote C−C coupling for efficiently selective production of C2H4 from CO2 photoreduction Open
Large-scale acceleration algorithms for a deep convective physical parameterization scheme on GPU Open
Early warning of geological hazards requires monitoring extreme weather conditions, such as heavy rainfall. Atmospheric circulation models are used for weather forecasting and climate simulation. As a critical physical process in atmospher…
NeKo: Cross-Modality Post-Recognition Error Correction with Tasks-Guided Mixture-of-Experts Language Model Open
Construction of a general-purpose post-recognition error corrector poses a crucial question: how can we most effectively train a model on a large mixture of domain datasets? The answer would lie in learning dataset-specific features and di…
VoiceTextBlender: Augmenting Large Language Models with Speech Capabilities via Single-Stage Joint Speech-Text Supervised Fine-Tuning Open
Recent studies have augmented large language models (LLMs) with speech capabilities, leading to the development of speech language models (SpeechLMs). Earlier SpeechLMs focused on single-turn speech-based question answering (QA), where use…
EMMeTT: Efficient Multimodal Machine Translation Training Open
A rising interest in the modality extension of foundation language models warrants discussion on the most effective, and efficient, multimodal training approach. This work focuses on neural machine translation (NMT) and proposes a joint mu…
Chain-of-Thought Prompting for Speech Translation Open
Large language models (LLMs) have demonstrated remarkable advancements in language understanding and generation. Building on the success of text-based LLMs, recent research has adapted these models to use speech embeddings for prompting, r…
OO Leo: An Active Contact Binary with Possible Solar-like Differential Rotation Open
With Transiting Exoplanet Survey Satellite (TESS) high-precision photometry and Large Sky Area Multi-object Fiber Spectroscopic Telescope medium-resolution spectra, we present the first light and radial velocity curve analyses for the ecli…
Research on a Multi-Dimensional Indicator Assessment Model for Evaluating Landslide Risk near Large Alpine Reservoirs Open
Geological disasters in large alpine reservoirs primarily take the form of landslide occurrences and are predominantly induced by slope instability. Presently, risk monitoring and assessment strategies tend to prioritize sudden alerts over…
Enhancing Visual Continual Learning with Language-Guided Supervision Open
Continual learning (CL) aims to empower models to learn new tasks without forgetting previously acquired knowledge. Most prior works concentrate on the techniques of architectures, replay data, regularization, etc. However, the category n…
Multilingual and Fully Non-Autoregressive ASR with Large Language Model Fusion: A Comprehensive Study Open
In the era of large models, the autoregressive nature of decoding often results in latency serving as a significant bottleneck. We propose a non-autoregressive LM-fused ASR system that effectively leverages the parallelization capabilities…
Preparation and Performance Investigation of Silicate Non-Sintered Ceramsite Using Engineering Waste Soil Under The Action of Alkali-Thermal Activation Open
Influence of Curve Location and Type of Adolescent Idiopathic Scoliosis on Static and Dynamic Plantar Pressure Open
Feature Norm Regularized Federated Learning: Transforming Skewed Distributions into Global Insights Open
In the field of federated learning, addressing non-independent and identically distributed (non-i.i.d.) data remains a quintessential challenge for improving global model performance. This work introduces the Feature Norm Regularized Feder…
Improving Joint Speech-Text Representations Without Alignment Open
The last year has seen astonishing progress in text-prompted image generation premised on the idea of a cross-modal representation space in which the text and image domains are represented jointly. In ASR, this idea has found application a…
Mixture-of-Expert Conformer for Streaming Multilingual ASR Open
End-to-end models with large capacity have significantly improved multilingual automatic speech recognition, but their computation cost poses challenges for on-device applications. We propose a streaming truly multilingual Conformer incorp…
IP Lyn: A Totally Eclipsing Contact Binary with an Extremely Low Mass Ratio Open
We present the first photometric and orbital period investigations for a neglected totally eclipsing contact binary IP Lyn. The photometric solutions derived from both ground-based and several surveys’ observations suggest that it is a sha…
Hot Subdwarf Stars Identified in LAMOST DR8 with Single-lined and Composite Spectra Open
A total of 222 hot subdwarf stars were identified with LAMOST DR8 spectra, among which 131 stars show composite spectra and have been decomposed, while 91 stars present single-lined spectra. Atmospheric parameters of all sample stars were …
Hybrid CNN Based Attention with Category Prior for User Image Behavior Modeling Open
User historical behaviors have proved useful for click-through rate (CTR) prediction in online advertising systems. At Meituan, one of the largest e-commerce platforms in China, an item is typically displayed with its image and whether a user…
Deep Position-wise Interaction Network for CTR Prediction Open
Click-through rate (CTR) prediction plays an important role in online advertising and recommender systems. In practice, the training of CTR models depends on click data which is intrinsically biased towards higher positions since higher…