Umar Farooq Minhas
KG-TRICK: Unifying Textual and Relational Information Completion of Knowledge for Multilingual Knowledge Graphs
Multilingual knowledge graphs (KGs) provide high-quality relational and textual information for various NLP applications, but they are often incomplete, especially in non-English languages. Previous research has shown that combining inform…
Incremental IVF Index Maintenance for Streaming Vector Search
The prevalence of vector similarity search in modern machine learning applications and the continuously changing nature of data processed by these applications necessitate efficient and effective index maintenance techniques for vector sea…
Towards Cross-Cultural Machine Translation with Retrieval-Augmented Generation from Multilingual Knowledge Graphs
Translating text that contains entity names is a challenging task, as cultural-related references can vary significantly across languages. These variations may also be caused by transcreation, an adaptation process that entails more than t…
Entity Disambiguation via Fusion Entity Decoding
Entity disambiguation (ED), which links the mentions of ambiguous entities to their referent entities in a knowledge base, serves as a core component in entity linking (EL). Existing generative approaches demonstrate improved accuracy comp…
Enhancing Machine Translation Experiences with Multilingual Knowledge Graphs
Translating entity names, especially when a literal translation is not correct, poses a significant challenge. Although Machine Translation (MT) systems have achieved impressive results, they still struggle to translate cultural nuances an…
Increasing Coverage and Precision of Textual Information in Multilingual Knowledge Graphs
Recent work in Natural Language Processing and Computer Vision has been using textual information -- e.g., entity names and descriptions -- available in knowledge graphs to ground neural models to high-quality structured data. However, whe…
Growing and Serving Large Open-domain Knowledge Graphs
Applications of large open-domain knowledge graphs (KGs) to real-world problems pose many unique challenges. In this paper, we present extensions to Saga, our platform for continuous construction and serving of knowledge at scale. In par…
High-Throughput Vector Similarity Search in Knowledge Graphs
There is an increasing adoption of machine learning for encoding data into vectors to serve online recommendation and search use cases. As a result, recent data management systems propose augmenting query processing with online vector simi…
Bounding the Last Mile: Efficient Learned String Indexing
We introduce the RadixStringSpline (RSS) learned index structure for efficiently indexing strings. RSS is a tree of radix splines each indexing a fixed number of bytes. RSS approaches or exceeds the performance of traditional string indexe…
Instance-Optimized Data Layouts for Cloud Analytics Workloads
Today, businesses rely on efficiently running analytics on large amounts of operational and historical data to gain business insights and competitive advantage. Increasingly, such analytics are run using cloud-based data analytics services…
APEX: A High-Performance Learned Index on Persistent Memory
The recently released persistent memory (PM) offers high performance, persistence, and is cheaper than DRAM. This opens up new possibilities for indexes that operate and persist data directly on the memory bus. Recent learned indexes explo…
The state of SQL-on-Hadoop in the cloud
Managed Hadoop in the cloud, especially SQL-on-Hadoop, has been gaining attention recently. On Platform-as-a-Service (PaaS), analytical services like Hive and Spark come preconfigured for general-purpose and ready to use. Thus, giving comp…