TrungTin Nguyen
Dendrograms of Mixing Measures for Softmax-Gated Gaussian Mixture of Experts: Consistency without Model Sweeps
We develop a unified statistical framework for softmax-gated Gaussian mixture of experts (SGMoE) that addresses three long-standing obstacles in parameter estimation and model selection: (i) non-identifiability of gating parameters up to c…
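For orientation, the objects named in this abstract can be written down concretely. In one common parametrisation of a softmax-gated Gaussian mixture of experts (stated here as background; the paper's notation may differ in detail), the conditional density and its associated mixing measure are
\[
p_G(y \mid x) \;=\; \sum_{k=1}^{K} \frac{\exp\!\big(\beta_{1k}^{\top} x + \beta_{0k}\big)}{\sum_{j=1}^{K} \exp\!\big(\beta_{1j}^{\top} x + \beta_{0j}\big)}\, \mathcal{N}\!\big(y \mid a_k^{\top} x + b_k,\, \sigma_k^2\big),
\qquad
G \;=\; \sum_{k=1}^{K} \exp(\beta_{0k})\, \delta_{(\beta_{1k},\, a_k,\, b_k,\, \sigma_k^2)},
\]
which makes explicit the translation invariance of the softmax gates: shifting every $\beta_{0k}$ by the same constant leaves $p_G$ unchanged, so the gating parameters are identifiable only up to such shifts.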
Modifications of the BIC for order selection in finite mixture models
Finite mixture models are ubiquitous tools in modern statistical modeling, and a frequently encountered problem that arises in their implementation is the choice of model order. In Keribin (2000, Sankhya: The Indian Journal of Statistics, …
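For reference, the modifications discussed here start from the standard BIC score for a candidate order $m$ (textbook form, stated for orientation rather than quoted from the paper):
\[
\mathrm{BIC}(m) \;=\; -2\,\ell_n\!\big(\hat{\theta}_m\big) + d_m \log n,
\]
where $\ell_n(\hat{\theta}_m)$ is the maximised log-likelihood of the $m$-component mixture, $d_m$ its number of free parameters, and $n$ the sample size; the selected order is the minimiser of this score over the candidate values of $m$.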
A Unified Framework for Variable Selection in Model-Based Clustering with Missing Not at Random
Model-based clustering integrated with variable selection is a powerful tool for uncovering latent structures within complex data. However, its effectiveness is often hindered by challenges such as identifying relevant variables that defin…
On the large-sample limits of some Bayesian model evaluation statistics
Model selection and order selection problems frequently arise in statistical practice. A popular approach to addressing these problems in the frequentist setting involves information criteria based on penalised maxima of log-likelihoods fo…
Revisiting Concentration Results for Approximate Bayesian Computation
Bayesian nonparametric mixture of experts for inverse problems
ExGra-Med: Extended Context Graph Alignment for Medical Vision-Language Models
State-of-the-art medical multi-modal LLMs (med-MLLMs), such as LLaVA-Med and BioMedGPT, primarily depend on scaling model size and data volume, with training driven largely by autoregressive objectives. However, we reveal that this approac…
Bayesian Likelihood Free Inference using Mixtures of Experts
Accelerating Transformers with Spectrum-Preserving Token Merging
Increasing the throughput of the Transformer architecture, a foundational component used in numerous state-of-the-art models for vision and language tasks (e.g., GPT, LLaVa), is an important problem in machine learning. One recent and effe…
Risk Bounds for Mixture Density Estimation on Compact Domains via the $h$-Lifted Kullback--Leibler Divergence
We consider the problem of estimating probability density functions based on sample data, using a finite mixture of densities from some component class. To this end, we introduce the $h$-lifted Kullback--Leibler (KL) divergence as a genera…
CompeteSMoE -- Effective Training of Sparse Mixture of Experts via Competition
Sparse mixture of experts (SMoE) offers an appealing solution to scale up model complexity beyond the means of increasing the network's depth or width. However, effective training of SMoE has proven to be challenging due to the represen…
Structure-Aware E(3)-Invariant Molecular Conformer Aggregation Networks
A molecule's 2D representation consists of its atoms, their attributes, and the molecule's covalent bonds. A 3D (geometric) representation of a molecule is called a conformer and consists of its atom types and Cartesian coordinates. Every …
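To make the representations described above concrete, the following minimal Python sketch records a conformer as atom types plus an (n_atoms, 3) array of Cartesian coordinates; the class name and fields are illustrative only and are not taken from the paper's code.

```python
from dataclasses import dataclass
import numpy as np

@dataclass
class Conformer:
    """A minimal 3D conformer record: one atom type per atom and
    an (n_atoms, 3) array of Cartesian coordinates."""
    atom_types: list[str]       # e.g. ["O", "H", "H"]
    coordinates: np.ndarray     # shape (n_atoms, 3)

# Toy usage: a water molecule with made-up coordinates (illustrative values).
water = Conformer(
    atom_types=["O", "H", "H"],
    coordinates=np.array([[0.00, 0.00, 0.0],
                          [0.96, 0.00, 0.0],
                          [-0.24, 0.93, 0.0]]),
)
```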
HyperRouter: Towards Efficient Training and Inference of Sparse Mixture of Experts
By routing input tokens to only a few split experts, Sparse Mixture-of-Experts has enabled efficient training of large language models. Recent findings suggest that fixing the routers can achieve competitive performance by alleviating the …
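For readers unfamiliar with the mechanism referred to here, the sketch below shows generic top-k softmax routing in a sparse MoE layer: each token is dispatched to its k highest-scoring experts and the renormalised scores serve as combination weights. It illustrates only the vanilla router that such work builds on, not HyperRouter's specific construction, and every name in it is illustrative.

```python
import torch
import torch.nn.functional as F

def topk_route(tokens, router_weights, k=2):
    """Generic top-k softmax routing: for each token, return the indices
    of its k highest-scoring experts and the renormalised mixing weights."""
    logits = tokens @ router_weights                     # (n_tokens, n_experts)
    scores = F.softmax(logits, dim=-1)
    top_scores, top_experts = scores.topk(k, dim=-1)
    top_scores = top_scores / top_scores.sum(dim=-1, keepdim=True)
    return top_experts, top_scores

# Toy usage: 4 tokens of dimension 8 routed among 6 experts.
tokens = torch.randn(4, 8)
router = torch.randn(8, 6)   # keeping this fixed mimics the frozen-router setting mentioned above
experts, weights = topk_route(tokens, router, k=2)
```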
A non-asymptotic theory for model selection in a high-dimensional mixture of experts via joint rank and variable selection
A Non-asymptotic Risk Bound for Model Selection in a High-Dimensional Mixture of Experts via Joint Rank and Variable Selection
A General Theory for Softmax Gating Multinomial Logistic Mixture of Experts
The mixture-of-experts (MoE) model incorporates the power of multiple submodels via gating functions to achieve greater performance in numerous regression and classification applications. From a theoretical perspective, while there have been p…
Towards Convergence Rates for Parameter Estimation in Gaussian-gated Mixture of Experts
Originally introduced as a neural network for ensemble learning, mixture of experts (MoE) has recently become a fundamental building block of highly successful modern deep neural networks for heterogeneous data analysis in several applicat…
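For concreteness, a Gaussian-gated mixture of experts (in one standard parametrisation; the paper's notation may differ) models the conditional density as
\[
p(y \mid x) \;=\; \sum_{k=1}^{K} \frac{\pi_k\, \mathcal{N}(x \mid \mu_k, \Sigma_k)}{\sum_{j=1}^{K} \pi_j\, \mathcal{N}(x \mid \mu_j, \Sigma_j)}\; \mathcal{N}\!\big(y \mid a_k^{\top} x + b_k,\, \tau_k^2\big),
\]
so the gate for expert $k$ is the posterior probability that the input $x$ was generated by the $k$-th Gaussian component.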
Demystifying Softmax Gating Function in Gaussian Mixture of Experts
Understanding the parameter estimation of softmax gating Gaussian mixture of experts has remained a long-standing open problem in the literature. It is mainly due to three fundamental theoretical challenges associated with the softmax gati…
A non-asymptotic approach for model selection via penalization in high-dimensional mixture of experts models
Mixtures of experts (MoE) are a popular class of statistical and machine learning models that have gained attention over the years due to their flexibility and efficiency. In this work, we consider Gaussian-gated localized MoE (GLoME) and bl…
Concentration results for approximate Bayesian computation without identifiability
We study the large sample behaviors of approximate Bayesian computation (ABC) posterior measures in situations when the data generating process is dependent on non-identifiable parameters. In particular, we establish the concentration of p…
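As background for the ABC posterior measures studied here, the sketch below is the textbook rejection-ABC algorithm in Python, not the paper's procedure; all function and variable names are illustrative. The accepted parameter draws form a sample from the ABC posterior at tolerance tol.

```python
import numpy as np

def rejection_abc(observed, prior_sampler, simulator, summary, n_draws=10_000, tol=0.1):
    """Keep prior draws whose simulated summary statistics land within
    `tol` of the observed summary; the kept draws approximate the ABC posterior."""
    s_obs = summary(observed)
    accepted = []
    for _ in range(n_draws):
        theta = prior_sampler()        # draw a parameter from the prior
        x_sim = simulator(theta)       # simulate data given that parameter
        if np.linalg.norm(summary(x_sim) - s_obs) <= tol:
            accepted.append(theta)
    return np.array(accepted)

# Toy usage: infer the mean of a Gaussian with known unit variance.
rng = np.random.default_rng(0)
data = rng.normal(1.5, 1.0, size=100)
abc_sample = rejection_abc(
    observed=data,
    prior_sampler=lambda: rng.normal(0.0, 5.0),
    simulator=lambda mu: rng.normal(mu, 1.0, size=100),
    summary=lambda x: np.array([x.mean()]),
)
```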
Bayesian nonparametric mixture of experts for high-dimensional inverse problems
HyperRouter: Towards Efficient Training and Inference of Sparse Mixture of Experts
Truong Do, Le Khiem, Quang Pham, TrungTin Nguyen, Thanh-Nam Doan, Binh Nguyen, Chenghao Liu, Savitha Ramasamy, Xiaoli Li, Steven Hoi. Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing. 2023.
Summary statistics and discrepancy measures for approximate Bayesian computation via surrogate posteriors
Mixture of expert posterior surrogates for approximate Bayesian computation
Model selection by penalization in mixture of experts models with a non-asymptotic approach
Approximation of probability density functions via location-scale finite mixtures in Lebesgue spaces
The class of location-scale finite mixtures is of enduring interest both from applied and theoretical perspectives of probability and statistics. We prove the following results: to an arbitrary degree of accuracy, (a) location-scale mix…
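For reference, a location-scale finite mixture built from a fixed density $g$ on $\mathbb{R}^d$ has the standard form
\[
f(x) \;=\; \sum_{k=1}^{K} \pi_k\, \frac{1}{\sigma_k^{d}}\, g\!\left(\frac{x - \mu_k}{\sigma_k}\right),
\qquad \pi_k \ge 0, \quad \sum_{k=1}^{K} \pi_k = 1,
\]
with locations $\mu_k \in \mathbb{R}^d$ and scales $\sigma_k > 0$; this is the usual textbook definition, stated here for orientation rather than quoted from the paper.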
Approximations of conditional probability density functions in Lebesgue spaces via mixture of experts models
Non-asymptotic model selection in block-diagonal mixture of polynomial experts models
Model selection, via penalized likelihood type criteria, is a standard task in many statistical inference and machine learning problems. Progress has led to deriving criteria with asymptotic consistency results and an increasing emphasi…