Daouda Sow
Dynamic Loss-Based Sample Reweighting for Improved Large Language Model Pretraining
Pretraining large language models (LLMs) on vast and heterogeneous datasets is crucial for achieving state-of-the-art performance across diverse downstream tasks. However, current training paradigms treat all samples equally, overlooking t…
Take the Bull by the Horns: Hard Sample-Reweighted Continual Training Improves LLM Generalization
In the rapidly advancing arena of large language models (LLMs), a key challenge is to enhance their capabilities amid a looming shortage of high-quality training data. Our study starts from an empirical strategy for the light continual tra…
Non-Convex Bilevel Optimization with Time-Varying Objective Functions
Bilevel optimization has become a powerful tool in a wide variety of machine learning problems. However, the current nonconvex bilevel optimization considers an offline dataset and static functions, which may not work well in emerging onli…
Doubly Robust Instance-Reweighted Adversarial Training
Assigning importance weights to adversarial data has achieved great success in training adversarially robust networks under limited model capacity. However, existing instance-reweighted adversarial training (AT) methods heavily depend on h…
Algorithm Design for Online Meta-Learning with Task Boundary Detection
Online meta-learning has recently emerged as a marriage between batch meta-learning and online learning, for achieving the capability of quick adaptation on new tasks in a lifelong manner. However, most existing approaches focus on the res…
A Primal-Dual Approach to Bilevel Optimization with Multiple Inner Minima
Bilevel optimization has found extensive applications in modern machine learning problems such as hyperparameter optimization, neural architecture search, meta-learning, etc. While bilevel problems with a unique inner minimal point (e.g., …
On the Convergence Theory for Hessian-Free Bilevel Algorithms
Bilevel optimization has arisen as a powerful tool in modern machine learning. However, due to the nested structure of bilevel optimization, even gradient-based methods require second-order derivative approximations via Jacobian- or/and He…
ES-Based Jacobian Enables Faster Bilevel Optimization
Bilevel optimization (BO) has arisen as a powerful tool for solving many modern machine learning problems. However, due to the nested structure of BO, existing gradient-based methods require second-order derivative approximations via Jacob…
A sequential guiding network with attention for image captioning
The recent advances of deep learning in both computer vision (CV) and natural language processing (NLP) provide us a new way of understanding semantics, by which we can deal with more challenging tasks such as automatic description generat…
Development of a Solar Controller with MLI Control
This work presents the development of a solar regulator which manages the charge and discharge of a (lead) battery installed in a photovoltaic system in order to extend its lifetime. The regulator is controlled by a microcontroller (PIC16F…