Lin F. Yang
A Saddle Point Remedy: Power of Variable Elimination in Non-convex Optimization Open
The proliferation of saddle points, rather than poor local minima, is increasingly understood to be a primary obstacle in large-scale non-convex optimization for machine learning. Variable elimination algorithms, like Variable Projection (…
ARMOR: High-Performance Semi-Structured Pruning via Adaptive Matrix Factorization Open
Large language models (LLMs) present significant deployment challenges due to their immense computational and memory requirements. While semi-structured pruning, particularly 2:4 sparsity, offers a path to practical hardware acceleration, …
Research on a Controlled Knife Recognition System Based on YOLOv11s Open
With the rapid development of computer technology, the importance of object detection technology in the field of dangerous item detection is increasingly highlighted. This paper focuses on the precise detection of dangerous knives in publi…
Near-Optimal Sample Complexity Bounds for Constrained Average-Reward MDPs Open
Recent advances have significantly improved our understanding of the sample complexity of learning in average-reward Markov decision processes (AMDPs) under the generative model. However, much less is known about the constrained average-re…
Research on the influencing factors of generative artificial intelligence usage intent in post-secondary education: an empirical analysis based on the AIDUA extended model Open
Objective: Generative Artificial Intelligence (AIGC) presents a profound dialectic in higher education: its transformative potential is challenged by deep-seated psychological and ethical barriers. Traditional adoption models fail to captur…
Supplemental Material: Two episodes of orogenic gold mineralization at Chaihulanzi, NE China, in response to superimposed orogeny associated with subduction of the Paleo-Asian and Paleo-Pacific Ocean plates Open
Tables S1–S6
Advancing Large Language Models for Tibetan with Curated Data and Continual Pre-Training Open
Large language models have achieved remarkable progress across many languages. However, Tibetan, as a representative low-resource language, is particularly underrepresented in existing models due to the scarcity of high-quality training co…
Research on Capacitated Multi-Ship Replenishment Path Planning Problem Based on the Synergistic Hybrid Optimization Algorithm Open
Ship replenishment path planning is a critical problem in the field of maritime logistics. This study proposes a novel synergistic hybrid optimization algorithm (SHOA) that effectively integrates ant colony optimization (ACO), the Clarke–W…
Integrating Generative AI-Based Assistance Tool in Programming Education for Medical Students: A Cross-Sectional Study Open
Background: With the increasing importance of computational skills in healthcare, there is a growing need to equip medical students with programming knowledge to address complex healthcare challenges effectively. Traditional programming meth…
Effective equidistribution in rank 2 homogeneous spaces and values of quadratic forms Open
We establish effective equidistribution theorems, with a polynomial error rate, for orbits of unipotent subgroups in quotients of quasi-split, almost simple linear algebraic groups of absolute rank 2. As an application, inspired by the res…
Research on Ship Replenishment Path Planning Based on the Modified Whale Optimization Algorithm Open
Ship replenishment path planning has always been a critical concern for researchers in the field of security. This study proposes a modified whale optimization algorithm (MWOA) to address single-task ship replenishment path planning proble…
Transition Transfer $Q$-Learning for Composite Markov Decision Processes Open
To bridge the gap between empirical success and theoretical understanding in transfer reinforcement learning (RL), we study a principled approach with provable performance guarantees. We introduce a novel composite MDP framework where high…
Nearly Linear Row Sampling Algorithm for Quantile Regression Open
We give a row sampling algorithm for the quantile loss function with sample complexity nearly linear in the dimensionality of the data, improving upon the previous best algorithm whose sampling complexity has at least cubic dependence on t…
Hyper: Hyperparameter Robust Efficient Exploration in Reinforcement Learning Open
The exploration & exploitation dilemma poses significant challenges in reinforcement learning (RL). Recently, curiosity-based exploration methods achieved great success in tackling hard-exploration problems. However, they necessitate exte…
Misspecified $Q$-Learning with Sparse Linear Function Approximation: Tight Bounds on Approximation Error Open
The recent work by Dong & Yang (2023) showed that, for misspecified sparse linear bandits, one can obtain an $O(ε)$-optimal policy using a polynomial number of samples when the sparsity is a constant, where $ε$ is the misspecification…
Confident Natural Policy Gradient for Local Planning in $q_π$-realizable Constrained MDPs Open
The constrained Markov decision process (CMDP) framework emerges as an important reinforcement learning approach for imposing safety or other critical objectives while maximizing cumulative reward. However, the current understanding of how…
Learning for Bandits under Action Erasures Open
We consider a novel multi-arm bandit (MAB) setup, where a learner needs to communicate the actions to distributed agents over erasure channels, while the rewards for the actions are directly available to the learner through external sensor…
Don't Forget to Connect! Improving RAG with Graph-based Reranking Open
Retrieval Augmented Generation (RAG) has greatly improved the performance of Large Language Model (LLM) responses by grounding generation with context from existing documents. These systems work well when documents are clearly relevant to …
Research on acoustic methods for buried PE pipeline detection based on LSTM neural networks Open
As an essential component of urban infrastructure construction, polyethylene (PE) pipelines face the challenging task of underground detection due to the complex and dynamic nature of the subsurface environment, diverse installation paths,…
Uniform Last-Iterate Guarantee for Bandits and Reinforcement Learning Open
Existing metrics for reinforcement learning (RL) such as regret, PAC bounds, or uniform-PAC (Dann et al., 2017), typically evaluate the cumulative performance, while allowing the agent to play an arbitrarily bad policy at any finite time t…
Modeling Bellman Error with Logistic Distribution with Applications in Reinforcement Learning Open
Ir-Ids: A Network Intrusion Detection Method Based on Causal Feature Selection and Explainable Model Optimization Open
Multi-Agent Bandit Learning through Heterogeneous Action Erasure Channels Open
Multi-Armed Bandit (MAB) systems are witnessing an upswing in applications within multi-agent distributed environments, leading to the advancement of collaborative MAB algorithms. In such settings, communication between agents executing ac…
Horizon-Free and Instance-Dependent Regret Bounds for Reinforcement Learning with General Function Approximation Open
To tackle long planning horizon problems in reinforcement learning with general function approximation, we propose the first algorithm, termed as UCRL-WVTR, that achieves both horizon-free and instance-dependent, since it eli…
Adaptive Liquidity Provision in Uniswap V3 with Deep Reinforcement Learning Open
Decentralized exchanges (DEXs) are a cornerstone of decentralized finance (DeFi), allowing users to trade cryptocurrencies without the need for third-party authorization. Investors are incentivized to deposit assets into liquidity pools, a…
Scaling Distributed Multi-task Reinforcement Learning with Experience Sharing Open
Recently, DARPA launched the ShELL program, which aims to explore how experience sharing can benefit distributed lifelong learning agents in adapting to new challenges. In this paper, we address this issue by conducting both theoretical an…
On the Model-Misspecification in Reinforcement Learning Open
The success of reinforcement learning (RL) crucially depends on effective function approximation when dealing with complex ground-truth models. Existing sample-efficient RL algorithms primarily employ three approaches to function approxima…
Low-Switching Policy Gradient with Exploration via Online Sensitivity Sampling Open
Policy optimization methods are powerful algorithms in Reinforcement Learning (RL) for their flexibility to deal with policy parameterization and ability to handle model misspecification. However, these methods usually suffer from slow con…
Tackling Heavy-Tailed Rewards in Reinforcement Learning with Function Approximation: Minimax Optimal and Instance-Dependent Regret Bounds Open
While numerous works have focused on devising efficient algorithms for reinforcement learning (RL) with uniformly bounded rewards, it remains an open question whether sample or time-efficient algorithms for RL with large state-action space…
MetaVL: Transferring In-Context Learning Ability From Language Models to Vision-Language Models Open
Large-scale language models have shown the ability to adapt to a new task via conditioning on a few demonstrations (i.e., in-context learning). However, in the vision-language domain, most large-scale pre-trained vision-language (VL) model…