Gonzalo Navarro
YOU?
Author Swipe
View article: CompactLTJ: Space & Time Efficient Leapfrog Triejoin on Graph Databases
CompactLTJ: Space & Time Efficient Leapfrog Triejoin on Graph Databases Open
Leapfrog Triejoin (LTJ) is arguably the most practical and popular worst-case-optimal (wco) algorithm for solving basic graph patterns in graph databases. Its main drawback is that it needs the database triples (subject, predicate, object)…
View article: Smallest Suffixient Sets: Effectiveness, Resilience, and Calculation
Smallest Suffixient Sets: Effectiveness, Resilience, and Calculation Open
A suffixient set is a novel combinatorial object that captures the essential information of repetitive strings in a way that, provided with a random access mechanism, supports various forms of pattern matching. In this paper, we study the …
View article: Practical Adaptive Dynamic Bitvectors
Practical Adaptive Dynamic Bitvectors Open
Introduction While operations rank and select on static bitvectors can be supported in constant time, lower bounds show that this is impossible when supporting updates; practical implementations offer time for the operations, which is clos…
View article: Worst-Case-Optimal Joins on Graphs with Topological Relations
Worst-Case-Optimal Joins on Graphs with Topological Relations Open
View article: PD155 RedETS Horizon Scanning: Impact In The Decision-Making Process
PD155 RedETS Horizon Scanning: Impact In The Decision-Making Process Open
Introduction The RedETS horizon scanning (HS) program in Spain is focused on identifying non-pharmaceutical emerging health technologies. HS is organized in three steps: (i) identification using different sources (PubMed, the biomedical pr…
View article: Computing MEMs and Relatives on Repetitive Text Collections
Computing MEMs and Relatives on Repetitive Text Collections Open
We consider the problem of computing the Maximal Exact Matches (MEMs) of a given pattern \(P[1\mathinner{.. }m]\) on a large repetitive text collection \(T[1\mathinner{.. }n]\) over an alphabet of size \(\sigma\) , which is represented as …
View article: Fast and Small Subsampled R-indexes
Fast and Small Subsampled R-indexes Open
The $r$-index represented a breakthrough in compressed indexing of repetitive text collections, outperforming its alternatives by orders of magnitude in query time. Its space usage, $O(r)$ where $r$ is the number of runs in the Burrows--Wh…
View article: Faster run-length compressed suffix arrays
Faster run-length compressed suffix arrays Open
We first review how we can store a run-length compressed suffix array (RLCSA) for a text $T$ of length $n$ over an alphabet of size $σ$ whose Burrows-Wheeler Transform (BWT) consists of $r$ runs in $O \left( \rule{0ex}{2ex} r \log (n / r) …
View article: New Compressed Indices for Multijoins on Graph Databases
New Compressed Indices for Multijoins on Graph Databases Open
A recent surprising result in the implementation of worst-case-optimal (wco) multijoins in graph databases (specifically, basic graph patterns) is that they can be supported on graph representations that take even less space than a plain r…
View article: Counting on General Run-Length Grammars
Counting on General Run-Length Grammars Open
We introduce a data structure for counting pattern occurrences in texts compressed with any run-length context-free grammar. Our structure uses space proportional to the grammar size and counts the occurrences of a pattern of length $m$ in…
View article: Generalized Straight-Line Programs
Generalized Straight-Line Programs Open
It was recently proved that any Straight-Line Program (SLP) generating a given string can be transformed in linear time into an equivalent balanced SLP of the same asymptotic size. We generalize this proof to a general class of grammars we…
View article: Faster Maximal Exact Matches with Lazy LCP Evaluation
Faster Maximal Exact Matches with Lazy LCP Evaluation Open
MONI (Rossi et al., JCB 2022) is a BWT-based compressed index for computing the matching statistics and maximal exact matches (MEMs) of a pattern (usually a DNA read) with respect to a highly repetitive text (usually a database of g…
View article: A Textbook Solution for Dynamic Strings
A Textbook Solution for Dynamic Strings Open
We consider the problem of maintaining a collection of strings while efficiently supporting splits and concatenations on them, as well as comparing two substrings, and computing the longest common prefix between two suffixes. This problem …
View article: BAT-LZ Out of Hell
BAT-LZ Out of Hell Open
Despite consistently yielding the best compression on repetitive text collections, the Lempel-Ziv parsing has resisted all attempts at offering relevant guarantees on the cost to access an arbitrary symbol. This makes it less attractive fo…
View article: Worst-Case-Optimal Similarity Joins on Graph Databases
Worst-Case-Optimal Similarity Joins on Graph Databases Open
We extend the concept of worst-case optimal equijoins in graph databases to the case where some nodes are required to be within the k-nearest neighbors (kNN) of others under some similarity function. We model the problem by superimposing t…
View article: Iterated Straight-Line Programs
Iterated Straight-Line Programs Open
We explore an extension to straight-line programs (SLPs) that outperforms, for some text families, the measure $δ$ based on substring complexity, a lower bound for most measures and compressors exploiting repetitiveness (which are crucial …
View article: Stronger compact representations of object trajectories
Stronger compact representations of object trajectories Open
[Absctract]: GraCT and ContaCT were the first compressed data structures to represent object trajectories, demonstrating that it was possible to use orders of magnitude less space than classical indexes while staying competitive in query t…
View article: Taxonomic classification with maximal exact matches in KATKA kernels and minimizer digests
Taxonomic classification with maximal exact matches in KATKA kernels and minimizer digests Open
For taxonomic classification, we are asked to index the genomes in a phylogenetic tree such that later, given a DNA read, we can quickly choose a small subtree likely to contain the genome from which that read was drawn. Although popular c…
View article: The Ring: Worst-case Optimal Joins in Graph Databases using (Almost) No Extra Space
The Ring: Worst-case Optimal Joins in Graph Databases using (Almost) No Extra Space Open
We present an indexing scheme for triple-based graphs that supports join queries in worst-case optimal (wco) time within compact space. This scheme, called a ring , regards each triple as a cyclic string of length 3. Each rotation of the t…
View article: Taxonomic Classification with Maximal Exact Matches in KATKA Kernels and Minimizer Digests
Taxonomic Classification with Maximal Exact Matches in KATKA Kernels and Minimizer Digests Open
For taxonomic classification, we are asked to index the genomes in a phylogenetic tree such that later, given a DNA read, we can quickly choose a small subtree likely to contain the genome from which that read was drawn. Although popular c…
View article: Suffixient Sets
Suffixient Sets Open
We define a suffixient set for a text $T [1..n]$ to be a set $S$ of positions between 1 and $n$ such that, for any edge descending from a node $u$ to a node $v$ in the suffix tree of $T$, there is an element $s \in S$ such that $u$'s path …
View article: Faster Maximal Exact Matches with Lazy LCP Evaluation
Faster Maximal Exact Matches with Lazy LCP Evaluation Open
MONI (Rossi et al., {\it JCB} 2022) is a BWT-based compressed index for computing the matching statistics and maximal exact matches (MEMs) of a pattern (usually a DNA read) with respect to a highly repetitive text (usually a database of ge…
View article: Optimizing RPQs over a compact graph representation
Optimizing RPQs over a compact graph representation Open
View article: Efficient construction of the BWT for repetitive text using string compression
Efficient construction of the BWT for repetitive text using string compression Open
View article: Dynamic Compact Data Structure for Temporal Reachability with Unsorted Contact Insertions
Dynamic Compact Data Structure for Temporal Reachability with Unsorted Contact Insertions Open
Temporal graphs represent interactions between entities over time. Deciding whether entities can reach each other through temporal paths is useful for various applications such as in communication networks and epidemiology. Previous works …
View article: Wheeler maps
Wheeler maps Open
Motivated by challenges in pangenomic read alignment, we propose a generalization of Wheeler graphs that we call Wheeler maps. A Wheeler map stores a text $T[1..n]$ and an assignment of tags to the characters of $T$ such that we can prepro…
View article: Evaluating Regular Path Queries on Compressed Adjacency Matrices
Evaluating Regular Path Queries on Compressed Adjacency Matrices Open
Regular Path Queries (RPQs), which are essentially regular expressions to be matched against the labels of paths in labeled graphs, are at the core of graph database query languages like SPARQL. A way to solve RPQs is to translate them int…
View article: MillenniumDB: An Open-Source Graph Database System
MillenniumDB: An Open-Source Graph Database System Open
In this systems paper, we present MillenniumDB: a novel graph database engine that is modular, persistent, and open source. MillenniumDB is based on a graph data model, which we call domain graphs, that provides a simple abstraction upon w…
View article: Maintaining the cycle structure of dynamic permutations
Maintaining the cycle structure of dynamic permutations Open
We present a new data structure for maintaining dynamic permutations, which we call a $\textit{forest of splay trees (FST)}$. The FST allows one to efficiently maintain the cycle structure of a permutation $π$ when the allowed updates are …
View article: Compact representations of spatial hierarchical structures with support for topological queries
Compact representations of spatial hierarchical structures with support for topological queries Open