Nearest neighbor search
A Survey on Learning to Hash
Nearest neighbor search is a problem of finding the data points from the database such that the distances from them to the query point are the smallest. Learning to hash is one of the major solutions to this problem and has been widely stu…
k-Nearest Neighbour Classifiers - A Tutorial
Perhaps the most straightforward classifier in the arsenal of Machine Learning techniques is the Nearest Neighbour Classifier: classification is achieved by identifying the nearest neighbours to a query example and using those neighbours to…
Deep Hashing Network for Efficient Similarity Retrieval
Due to the storage and retrieval efficiency, hashing has been widely deployed to approximate nearest neighbor search for large-scale multimedia retrieval. Supervised hashing, which improves the quality of hash coding by exploiting the sema…
Billion-Scale Similarity Search with GPUs
Similarity search finds application in specialized database systems handling complex data such as images or videos, which are typically represented by high-dimensional features and require specific indexing structures. This paper tackles t…
Collaborative Metric Learning
Metric learning algorithms produce distance metrics that capture the important relationships among data. In this work, we study the connection between metric learning and collaborative filtering. We propose Collaborative Metric Learning (C…
Asymmetric Deep Supervised Hashing
Hashing has been widely used for large-scale approximate nearest neighbor search because of its storage and search efficiency. Recent work has found that deep supervised hashing can significantly outperform non-deep supervised hashing in m…
Deep Quantization Network for Efficient Image Retrieval
Hashing has been widely applied to approximate nearest neighbor search for large-scale multimedia retrieval. Supervised hashing improves the quality of hash coding by exploiting the semantic similarity on data pairs and has received increa…
Graph PCA Hashing for Similarity Search
This paper proposes a new hashing framework to conduct similarity search via the following steps: first, employing linear clustering methods to obtain a set of representative data points and a set of landmarks of the big dataset; second, u…
Efficient and robust approximate nearest neighbor search using Hierarchical Navigable Small World graphs
We present a new approach for the approximate K-nearest neighbor search based on navigable small world graphs with controllable hierarchy (Hierarchical NSW, HNSW). The proposed solution is fully graph-based, without any need for additional…
Deep Graph-neighbor Coherence Preserving Network for Unsupervised Cross-modal Hashing
Unsupervised cross-modal hashing (UCMH) has become a hot topic recently. Current UCMH focuses on exploring data similarities. However, current UCMH methods calculate the similarity between two data points mainly by relying on the two data points' cross-m…
Exploring Nearest Neighbor Approaches for Image Captioning
We explore a variety of nearest neighbor baseline approaches for image captioning. These approaches find a set of nearest neighbor images in the training set from which a caption may be borrowed for the query image. We select a caption for…
Levenshtein Distance, Sequence Comparison and Biological Database Search
Levenshtein edit distance has played a central role—both past and present—in sequence alignment in particular and biological database similarity search in general. We start our review with a history of dynamic programming algorithms for co…
Optimization of distance formula in K-Nearest Neighbor method
K-Nearest Neighbor (KNN) is a method for classifying an object based on the learning data closest to it, determined by comparing previous and current data. In the learning process, KNN calculates the distance of the nea…
Efficient Hyperparameter Tuning with Grid Search for Text Categorization using kNN Approach with BM25 Similarity
In machine learning, hyperparameter tuning is the problem of choosing a set of optimal hyperparameters for a learning algorithm. Several approaches have been widely adopted for hyperparameter tuning, which is typically a time-consuming pro…
Meta-Path Guided Embedding for Similarity Search in Large-Scale Heterogeneous Information Networks
Most real-world data can be modeled as heterogeneous information networks (HINs) consisting of vertices of multiple types and their relationships. Search for similar vertices of the same type in large HINs, such as bibliographic networks a…
A comparative analysis of trajectory similarity measures
Computing trajectory similarity is a fundamental operation in movement analytics, required in search, clustering, and classification of trajectories, for example. Yet the range of different but interrelated trajectory similarity measures c…
JOSIE
We present a new solution for finding joinable tables in massive data lakes: given a table and one join column, find tables that can be joined with the given table on the largest number of distinct values. The problem can be formulated as …
mTM-align: a server for fast protein structure database search and multiple protein structure alignment
With the rapid increase of the number of protein structures in the Protein Data Bank, it becomes urgent to develop algorithms for efficient protein structure comparisons. In this article, we present the mTM-align server, which consists of …
DrugShot: querying biomedical search terms to retrieve prioritized lists of small molecules
Background: PubMed contains millions of abstracts that co-mention terms that describe drugs with other biomedical terms such as genes or diseases. Unique opportunities exist for leveraging these co-mentions by integrating them with other dr…
Fast Open Modification Spectral Library Searching through Approximate Nearest Neighbor Indexing
Open modification searching (OMS) is a powerful search strategy that identifies peptides carrying any type of modification by allowing a modified spectrum to match against its unmodified variant by using a very wide precursor mass window. …
An Improved DBSCAN Algorithm Based on the Neighbor Similarity and Fast Nearest Neighbor Query
DBSCAN is the most famous density-based clustering algorithm, and density-based clustering is one of the main clustering paradigms. However, there are many redundant distance computations during the process of DBSCAN clustering, due to brute force Range-Query used…
Binary Hashing for Approximate Nearest Neighbor Search on Big Data: A Survey
Nearest neighbor search is a fundamental problem in various domains, such as computer vision, data mining, and machine learning. With the explosive growth of data on the Internet, many new data structures using spatial partitions and recur…
Sequential Discrete Hashing for Scalable Cross-Modality Similarity Retrieval
With the dramatic development of the Internet, how to exploit large-scale retrieval techniques for multimodal web data has become one of the most popular but challenging problems in computer vision and multimedia. Recently, hashing methods…
A method for satellite time series anomaly detection based on fast-DTW and improved-KNN
Satellite anomaly detection faces problems such as unbalanced sample distribution, few fault samples, and subtle anomaly characteristics. These problems make existing anomaly detection methods difficult to train …
Spectral Multimodal Hashing and Its Application to Multimedia Retrieval
In recent years, multimedia retrieval has sparked much research interest in the multimedia, pattern recognition, and data mining communities. Although some attempts have been made along this direction, performing fast multimodal search at …
MLS3RDUH: Deep Unsupervised Hashing via Manifold based Local Semantic Similarity Structure Reconstructing
Most of the unsupervised hashing methods usually map images into semantic similarity-preserving hash codes by constructing local semantic similarity structure as guiding information, i.e., treating each point similar to its k nearest neigh…
Activity-relevant similarity values for fingerprints and implications for similarity searching
A largely unsolved problem in chemoinformatics is the issue of how calculated compound similarity relates to activity similarity, which is central to many applications. In general, activity relationships are predicted from calculated simil…
Improving Approximate Nearest Neighbor Search through Learned Adaptive Early Termination
In applications ranging from image search to recommendation systems, the problem of identifying a set of "similar" real-valued vectors to a query vector plays a critical role. However, retrieving these vectors and computing the correspondi…
A Distributed Storage and Computation k-Nearest Neighbor Algorithm Based Cloud-Edge Computing for Cyber-Physical-Social Systems
The k-nearest neighbor (kNN) algorithm is a classic supervised machine learning algorithm. It is widely used in cyber-physical-social systems (CPSS) to analyze and mine data. However, in practical CPSS applications, the standard linear kNN…
A generalized fuzzy k-nearest neighbor regression model based on Minkowski distance
The fuzzy k-nearest neighbor (FKNN) algorithm, one of the most well-known and effective supervised learning techniques, has often been used in data classification problems but rarely in regression settings. This paper introduces a new, mor…