Adel Nabli
ACCO: Accumulate While You Communicate for Communication-Overlapped Sharded LLM Training Open
Training LLMs relies on distributed implementations using multiple GPUs to compute gradients in parallel with sharded optimizers. However, synchronizing gradients in data parallel setups introduces communication overhead that grows with th…
WASH: Train your Ensemble with Communication-Efficient Weight Shuffling, then Average Open
The performance of deep neural networks is enhanced by ensemble methods, which average the output of several models. However, this comes at an increased cost at inference. Weight averaging methods aim at balancing the generalization of ens…
Speech Sequence Embeddings using Nearest Neighbors Contrastive Learning Open
We introduce a simple neural encoder architecture that can be trained using an unsupervised contrastive learning objective which gets its positive samples from data-augmented k-Nearest Neighbors search. We show that when built on top of…
DADAO: Decoupled Accelerated Decentralized Asynchronous Optimization Open
This work introduces DADAO: the first decentralized, accelerated, asynchronous, primal, first-order algorithm to minimize a sum of $L$-smooth and $\mu$-strongly convex functions distributed over a given network of size $n$. Our key insight i…
Complexity of the Multilevel Critical Node Problem Open
In this work, we analyze a sequential game played in a graph called the Multilevel Critical Node problem (MCN). A defender and an attacker are the players of this game. The defender starts by preventively interdicting vertices (vaccination…
The multilevel critical node problem : theoretical intractability and a curriculum learning approach Open
Assessing the vulnerability of networks is an increasingly critical issue. In this thesis, we examine an approach that studies the defense of strategic infrastructure against malicious attacks through problems of…
Curriculum learning for multilevel budgeted combinatorial problems Open
Learning heuristics for combinatorial optimization problems through graph neural networks has recently shown promising results on some classic NP-hard problems. These are single-level optimization problems with only one player. Multilevel…