Dawei Song
Defense Strategy Against False Data Injection Attacks on Cyber–Physical System for Vehicle–Grid Based on KNN-GAE
With the in-depth integration of electric vehicles (EVs) and smart grids, the Cyber–Physical System for Vehicle–Grid (CPSVG) has become a crucial component of power systems. However, its inherent characteristic of deep cyber–physical coupl…
ZigzagAttention: Efficient Long-Context Inference with Exclusive Retrieval and Streaming Heads
With the rapid development of large language models (LLMs), handling long contexts has become a vital capability. This long-context ability, however, brings difficulties in deployment, especially due to the increased consum…
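The retrieval/streaming distinction mentioned in the title can be illustrated with a toy classifier: heads whose attention mass concentrates on a few "sink" tokens plus a recent window behave as streaming heads (and need only a small KV cache), while heads spreading mass across the context act as retrieval heads. This is a minimal sketch; the window sizes, the 0.95 cutoff, and the classification rule are illustrative assumptions, not the paper's procedure.

```python
SINK, RECENT = 2, 4   # assumed: first tokens kept as sinks, sliding-window size

def head_kind(weights, sink=SINK, recent=RECENT, cutoff=0.95):
    """weights: one attention row (last query over all past positions).
    A head is 'streaming' if nearly all its mass sits on sinks + recent tokens."""
    n = len(weights)
    local = sum(weights[:sink]) + sum(weights[max(0, n - recent):])
    return "streaming" if local / sum(weights) >= cutoff else "retrieval"

# A head that mostly attends to sink + recent tokens (toy weights):
w_stream = [0.3, 0.3, 0.0, 0.0, 0.0, 0.1, 0.1, 0.2]
# A head spreading mass over the middle of the context (toy weights):
w_retr = [0.1, 0.1, 0.2, 0.2, 0.2, 0.1, 0.05, 0.05]

print(head_kind(w_stream), head_kind(w_retr))
```

Under such a split, only retrieval heads would keep the full KV cache, which is where the memory savings would come from.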
Optimizing interfacial stability of sulfurized polyacrylonitrile batteries by fluorinated composite polymer electrolytes
Sulfurized polyacrylonitrile (SPAN) is a promising cathode to address the notorious polysulfide shuttle effect and sluggish reaction dynamics of traditional lithium-sulfur (Li–S) batteries through its conductive pyridinic framework. Howeve…
Bridging External and Parametric Knowledge: Mitigating Hallucination of LLMs with Shared-Private Semantic Synergy in Dual-Stream Knowledge
Retrieval-augmented generation (RAG) is a cost-effective approach to mitigate the hallucination of Large Language Models (LLMs) by incorporating the retrieved external knowledge into the generation process. However, external knowledge may …
WindowKV: Task-Adaptive Group-Wise KV Cache Window Selection for Efficient LLM Inference
With the advancements in long-context inference capabilities of large language models (LLMs), the KV cache has become one of the foundational components. However, its substantial GPU memory consumption makes KV cache compression a key tech…
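Window-based KV cache selection can be sketched with a toy scorer: rank contiguous token windows by their total attention score and keep the top-k non-overlapping windows. This is a minimal illustration of the general idea only; the scoring signal, window size, and greedy selection here are assumptions, not WindowKV's actual method.

```python
def select_windows(scores, win, k):
    """Rank contiguous windows of `win` tokens by summed attention score and
    greedily keep the top-k non-overlapping ones (toy stand-in scorer)."""
    sums = [(sum(scores[i:i + win]), i) for i in range(len(scores) - win + 1)]
    sums.sort(reverse=True)
    kept, used = [], set()
    for s, i in sums:
        if all(j not in used for j in range(i, i + win)):
            kept.append((i, i + win))
            used.update(range(i, i + win))
        if len(kept) == k:
            break
    return sorted(kept)

# Toy per-token attention scores; positions 2-3 and 6-7 carry the most mass.
scores = [0.0, 0.1, 0.9, 0.8, 0.0, 0.0, 0.7, 0.6, 0.1, 0.0]
print(select_windows(scores, win=2, k=2))
```

Only the KV entries inside the kept windows would be retained, shrinking GPU memory at the cost of some recall.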
Frequency- and state-dependent dynamics of EEG microstates during propofol anesthesia
Electroencephalography microstate analysis has emerged as a powerful tool for investigating brain dynamics during anesthesia-induced unconsciousness. However, existing studies typically analyze EEG signals across broad frequency bands, lea…
Boosting High-Rate Lithium Metal Batteries by Using Ether-Based Gel Polymer Electrolyte
Ether-based electrolytes are widely used in lithium metal batteries owing to their higher compatibility with Li anodes compared to carbonate-based electrolytes. In contrast to the much-discussed concern of high-voltage resistance characteristics, …
Label-Based Disentanglement Measure among Hidden Units of Deep Learning
The capability to disentangle underlying factors hidden in the observable data, thereby obtaining their abstract representations, is considered one important ingredient for the subsequent success of deep networks in various application sce…
Quantum-inspired semantic matching based on neural networks with the duality of density matrices
Social media text can be semantically matched in different ways, viz., paraphrase identification, answer selection, community question answering, and so on. The performance of the above semantic matching tasks depends largely on the ability …
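The density-matrix representation underlying quantum-inspired text models can be sketched generically: a text is encoded as a mixture rho = sum_i p_i |v_i><v_i| over unit word vectors v_i with mixture weights p_i, giving a symmetric, unit-trace matrix. The 2-d vectors and weights below are hypothetical toy values, not the paper's construction.

```python
import math

def normalize(v):
    """Scale a vector to unit length (density matrices need unit vectors)."""
    n = math.sqrt(sum(x * x for x in v))
    return [x / n for x in v]

def outer(v):
    """Outer product |v><v| of a real vector with itself."""
    return [[a * b for b in v] for a in v]

# Toy 2-d "word vectors" and mixture weights (hypothetical values).
vecs = [normalize([1.0, 0.0]), normalize([1.0, 1.0])]
probs = [0.6, 0.4]

d = len(vecs[0])
rho = [[sum(p * outer(v)[i][j] for p, v in zip(probs, vecs))
        for j in range(d)] for i in range(d)]

trace = sum(rho[i][i] for i in range(d))  # a valid density matrix has trace 1
```

Similarity between two texts can then be measured between their density matrices rather than between single vectors, which is the general premise such models build on.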
Review on polymer electrolytes for lithium‐sulfurized polyacrylonitrile batteries
Lithium-sulfur (Li-S) batteries are regarded as the next generation of energy storage devices due to their high theoretical specific capacity (1675 mAh g⁻¹) and energy density (2600 Wh kg⁻¹). However, the commercial application has always been…
Investigating Context Effects in Similarity Judgements in Large Language Models
Large Language Models (LLMs) have revolutionised the capability of AI models in comprehending and generating natural language text. They are increasingly being used to empower and deploy agents in real-world scenarios, which make decisions…
Research on Network Crime Prediction Based on Improved PSO-BP Neural Network Algorithm
This research addresses the challenge of effectively predicting network crimes by introducing an enhanced model combining Particle Swarm Optimization (PSO) with Back Propagation (BP) neural networks. Traditional BP networks often suffer fr…
Bi-DCSpell: A Bi-directional Detector-Corrector Interactive Framework for Chinese Spelling Check
Chinese Spelling Check (CSC) aims to detect and correct potentially misspelled characters in Chinese sentences. Naturally, it involves the detection and correction subtasks, which interact with each other dynamically. Such interactions are…
Beyond the Speculative Game: A Survey of Speculative Execution in Large Language Models
With the ever-growing scale of (causal) large language models (LLMs), inference efficiency has become a core concern alongside their improved performance. In contrast to the memory footprint, the latency bottleneck seems to be…
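The core speculative-execution loop surveyed here can be sketched in its simplest (greedy-verification) form: a cheap draft model proposes several tokens, and the expensive target model keeps the longest prefix it agrees with, then emits one token of its own. Both "models" below are toy stand-in functions, not real LMs, and production variants verify against full probability distributions rather than greedy matches.

```python
def draft_next(ctx):
    """Cheap draft model: toy rule, emits the successor of the last token mod 10."""
    return (ctx[-1] + 1) % 10

def target_next(ctx):
    """Expensive target model: same toy rule, except it resets after token 5."""
    return 0 if ctx[-1] == 5 else (ctx[-1] + 1) % 10

def speculative_step(ctx, k=4):
    # 1) The draft model proposes k tokens autoregressively.
    proposal, tmp = [], list(ctx)
    for _ in range(k):
        t = draft_next(tmp)
        proposal.append(t)
        tmp.append(t)
    # 2) The target model verifies: accept the longest matching prefix,
    #    then emit one token of its own, so each step yields >= 1 token.
    out = list(ctx)
    for t in proposal:
        if target_next(out) == t:
            out.append(t)
        else:
            break
    out.append(target_next(out))
    return out

print(speculative_step([3]))
```

The speedup comes from the target model checking the k drafted tokens in a single batched forward pass instead of k sequential ones.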
LLM-Oriented Retrieval Tuner
Dense Retrieval (DR) is now considered a promising tool to enhance the memorization capacity of Large Language Models (LLMs) such as GPT-3 and GPT-4 by incorporating external memories. However, due to the paradigm discrepancy between text…
Towards the Law of Capacity Gap in Distilling Language Models
Language model (LM) distillation aims at distilling the knowledge in a large teacher LM to a small student one. As a critical issue facing LM distillation, a superior student often arises from a teacher of a relatively small scale instead …
On Elastic Language Models
Large-scale pretrained language models have achieved compelling performance in a wide range of language understanding and information retrieval tasks. Knowledge distillation offers an opportunity to compress a large language model to a sma…
Sparse Contrastive Learning of Sentence Embeddings
Recently, SimCSE has shown the feasibility of contrastive learning in training sentence embeddings and illustrates its expressiveness in spanning an aligned and uniform embedding space. However, prior studies have shown that dense models c…
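The SimCSE-style contrastive objective referenced here can be sketched directly: each anchor sentence is paired with a re-encoded positive, the other positives in the batch serve as in-batch negatives, and an InfoNCE loss over cosine similarities pulls aligned pairs together. The 2-d embeddings and temperature below are toy assumptions for illustration.

```python
import math

def cos(u, v):
    du = math.sqrt(sum(x * x for x in u))
    dv = math.sqrt(sum(x * x for x in v))
    return sum(a * b for a, b in zip(u, v)) / (du * dv)

def info_nce(anchors, positives, tau=0.05):
    """SimCSE-style loss: anchors[i] pairs with positives[i]; all other
    positives in the batch act as in-batch negatives."""
    total = 0.0
    for i, a in enumerate(anchors):
        logits = [cos(a, p) / tau for p in positives]
        m = max(logits)  # log-sum-exp with max-shift for numerical stability
        log_denom = m + math.log(sum(math.exp(l - m) for l in logits))
        total += log_denom - logits[i]
    return total / len(anchors)

# Toy 2-d sentence embeddings: aligned pairs give a low loss,
# mismatched pairs a high one.
aligned = info_nce([[1, 0], [0, 1]], [[0.9, 0.1], [0.1, 0.9]])
shuffled = info_nce([[1, 0], [0, 1]], [[0.1, 0.9], [0.9, 0.1]])
```

The "sparse" variant in the title would additionally constrain which embedding dimensions are active, but that mechanism is not reproduced here.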
Controllable Text Generation with Residual Memory Transformer
Large-scale Causal Language Models (CLMs), e.g., GPT-3 and ChatGPT, have brought great success in text generation. However, it remains an open challenge to control the generation process of a CLM while balancing flexibility, control granular…
A Quantum Probability Driven Framework for Joint Multi-Modal Sarcasm, Sentiment and Emotion Analysis
Sarcasm, sentiment, and emotion are three typical kinds of spontaneous affective responses of humans to external events and they are tightly intertwined with each other. Such events may be expressed in multiple modalities (e.g., linguistic…
A Survey of Quantum-Cognitively Inspired Sentiment Analysis Models
Quantum theory, originally proposed as a physical theory to describe the motions of microscopic particles, has been applied to various non-physics domains involving human cognition and decision-making that are inherently uncertain and exhi…
Task-agnostic Distillation of Encoder-Decoder Language Models
Finetuning pretrained language models (LMs) has enabled appealing performance on a diverse array of tasks. The intriguing task-agnostic property has driven a shifted focus from task-specific to task-agnostic distillation of LMs. While tas…
Lifting the Curse of Capacity Gap in Distilling Language Models
Pretrained language models (LMs) have shown compelling performance on various downstream tasks, but unfortunately they require a tremendous amount of inference compute. Knowledge distillation finds a path to compress LMs to small ones with…