Marcos André Gonçalves
Extração Automática de Atributos de Sinais de Emissão Acústica com Redes Neurais Autocodificadoras para Predição de Integridade em Tubulações Open
Acoustic Emission (AE) signal analysis is a widely used technique for monitoring structural degradation processes, such as wall-thickness loss in pipelines. However, extracting informative features from these signals…
Um Estudo Comparativo de Estratégias de Seleção de Exemplos para In-Context Learning aplicado à Classificação Automática de Texto com Grandes Modelos de Linguagem Open
Automatic Text Classification (ATC) with Large Language Models (LLMs) can be performed via zero-shot (low cost, lower effectiveness) or fine-tuning (high cost, higher effectiveness). This study investigates in-context learning…
Pondere e Expanda: Impacto e Limitações de Representações Contextual-Esparsas na Modelagem de Tópicos Open
This work proposes the use of contextual-sparse representations in the Topic Modeling (TM) task, aiming to combine the interpretability of sparse representations with the semantic power of contextual representations. …
Inteligência Artificial Sustentável baseado em Engenharia de Dados, Aprendizado de Máquina e Transferência de Conhecimento para Processamento de Linguagem Natural Open
Large Language Models (LLMs), based on Artificial Intelligence techniques, have transformed Natural Language Processing (NLP), becoming the reference in tasks such as text classification, sentiment analysis, summarization…
Aprendizado Federado Incremental e Sensível ao Risco para Modelos de Ranqueamento em Cenários com Distribuições Heterogêneas de Dados Open
This work proposes a new Federated Learning to Rank (FL2R) strategy for scenarios with non-independent and non-identically distributed (non-IID) data across clients. We present FedRisk, an aggregation method…
A Comprehensive Exploitation of Instance Selection Methods for Automatic Text Classification — “Doing More with Less” Open
Recent progress in NLP has followed a “more is better” trend (more data, computing power, and model complexity) best exemplified by the Large Language Models (LLMs). However, training such models remains resource-intensive. This Ph.D. diss…
A Comprehensive Exploitation of Instance Selection Methods for Automatic Text Classification: “Doing More with Less” Open
Progress in Natural Language Processing (NLP) has been dictated by the “rule of more”: more data, more computing power and more complexity, best exemplified by the current Large Language Models (LLMs). Indeed, to properly work (with high a…
Optimizing Tail-Head Trade-off for Extreme Multi-Label Text Classification (XMTC) with RAG-Labels and a Dynamic Two-Stage Retrieval and Fusion Pipeline Open
Ranking-based Fusion Algorithms for Extreme Multi-label Text Classification (XMTC) Open
In the context of Extreme Multi-label Text Classification (XMTC), where labels are assigned to text instances from a large label space, the long-tail distribution of labels presents a significant challenge. Labels can be broadly categorize…
Semi-Active Vibration Control for High-Speed Elevator Using Magnetorheological Damper Open
This paper presents the results of investigating the application of magnetorheological fluids in controlling the lateral and angular vibrations of a high-speed elevator. Numerical simulations are performed for a mathematical model with two…
Monitoramento remoto de áreas utilizando VANTs: uma abordagem baseada na Transferência do Aprendizado e Aprendizado Federado Open
Remote monitoring of land with unmanned aerial vehicles (UAVs) has gained prominence in areas such as precision agriculture, environmental mapping, and urban management, enabling large-scale, real-time data collection. …
A comprehensive exploitation of instance selection methods for automatic text classification Open
Progress in Natural Language Processing (NLP) has been dictated by the rule of more: more data, more computing power and more complexity, best exemplified by the Large Language Models. However, training (or fine-tuning) large dense models …
Unraveling relevant cross-waves pattern drifts in patient-hospital risk factors among hospitalized COVID-19 patients using explainable machine learning methods Open
A Human-Centered Multiperspective and Interactive Visual Tool For Explainable Machine Learning Open
Understanding why a trained machine learning model makes some decisions is paramount to trusting the model and applying its recommendations in real-world applications. In this article, we present the design and development of an interactiv…
Instance-Selection-Inspired Undersampling Strategies for Bias Reduction in Small and Large Language Models for Binary Text Classification Open
QuantumCLEF 2025 - The Second Edition of the Quantum Computing Lab at CLEF Open
“Are the Current Topic Modeling Evaluation Metrics Enough?” Mitigating The Limitations of Topic Modeling Evaluation Metrics Using a Multi-Perspective Game Theoretic Approach Open
A Noise-Oriented and Redundancy-Aware Instance Selection Framework Open
Fine-tuned transformer-based deep-learning models are currently at the forefront of natural language processing (NLP) and information retrieval (IR) tasks. However, fine-tuning these transformers for specific tasks, especially when dealin…
Exploiting Contextual Embeddings in Hierarchical Topic Modeling and Investigating the Limits of the Current Evaluation Metrics Open
We investigate two essential challenges in the context of hierarchical topic modeling (HTM)—(i) the impact of data representation and (ii) topic evaluation. The data representation directly influences the performance of the topic generatio…
Estratégias de Undersampling para Redução de Viés em Classificação de Texto Baseada em Transformers Open
Automatic Text Classification (ATC) in unbalanced datasets is a common challenge in real-world applications. In this scenario, one (or more) class(es) is overrepresented, which usually causes a bias in the learning process towards these ma…
A New Risk-Sensitive Deep Learning Optimization Function for Ranking Tasks Open
This master's thesis proposes the RiskLoss function to deal with the (hard) problem of incorporating risk-sensitiveness measures into Deep Neural Networks (DNNs), by including two adaptations for neural network ranking in ad-hoc retrieval an…
On the Role of Semantic Word Clusters — CluWords — in Natural Language Processing (NLP) Open
This Ph.D. dissertation focuses on the proposal, design, and evaluation of a new textual document representation that combines the “best of two worlds”: efficient and effective frequentist information (TFIDF representations) with informa…
Um Estudo Aprofundado sobre Grupos Semânticos de Palavras - CluWords - em tarefas de PLN Open
This Ph.D. dissertation focused on proposing, designing and evaluating a novel textual document representation that exploits the “best of two worlds”: efficient and effective frequentist information (TFIDF representations) with semantic in…
Comprehensive statistical analysis reveals significant benefits of COVID-19 vaccination in hospitalized patients: propensity score, covariate adjustment, and feature importance by permutation Open
A Strategy to Combine 1stGen Transformers and Open LLMs for Automatic Text Classification Open
Transformer models have achieved state-of-the-art results, with Large Language Models (LLMs), an evolution of first-generation transformers (1stTR), being considered the cutting edge in several NLP tasks. However, the literature has yet to…
A Quantum Annealing Instance Selection Approach for Efficient and Effective Transformer Fine-Tuning Open
Deep Learning approaches have become pervasive in recent years due to their ability to solve complex tasks. However, these models need huge datasets for proper training and good generalization. This translates into high training and fine-t…
Pipelining Semantic Expansion and Noise Filtering for Sentiment Analysis of Short Documents – CluSent Method Open
The challenge of constructing effective sentiment models is exacerbated by a lack of sufficient information, particularly in short texts. Enhancing short texts with semantic relationships becomes crucial for capturing affective nuances and…
Multi-modality cardiac imaging confirms quadricuspid aortic valve and excludes papillary fibroelastoma Open
Why are You Traveling? Open
Using Active Learning for Segmentation and Semantic Classification of Legal Acts Extracted from Official Diaries Open
Based on openness and transparency for good governance, unimpeded and verifiable access to legal and regulatory information is essential. With such access, we can monitor government actions to ensure that public financial resources are not…