Prefix-free graphs and suffix array construction in sublinear space Article Swipe
YOU?
·
· 2023
· Open Access
·
· DOI: https://doi.org/10.48550/arxiv.2306.14689
A recent paradigm shift in bioinformatics from a single reference genome to a pangenome brought with it several graph structures. These graph structures must implement operations, such as efficient construction from multiple genomes and read mapping. Read mapping is a well-studied problem in sequential data, and, together with data structures such as suffix array and Burrows-Wheeler transform, allows for efficient computation. Attempts to achieve comparatively high performance on graphs bring many complications since the common data structures on strings are not easily obtainable for graphs. In this work, we introduce prefix-free graphs, a novel pangenomic data structure; we show how to construct them and how to use them to obtain well-known data structures from stringology in sublinear space, allowing for many efficient operations on pangenomes.
Related Topics
- Type
- preprint
- Language
- en
- Landing Page
- http://arxiv.org/abs/2306.14689
- https://arxiv.org/pdf/2306.14689
- OA Status
- green
- Related Works
- 10
- OpenAlex ID
- https://openalex.org/W4382323343
Raw OpenAlex JSON
- OpenAlex ID
-
https://openalex.org/W4382323343Canonical identifier for this work in OpenAlex
- DOI
-
https://doi.org/10.48550/arxiv.2306.14689Digital Object Identifier
- Title
-
Prefix-free graphs and suffix array construction in sublinear spaceWork title
- Type
-
preprintOpenAlex work type
- Language
-
enPrimary language
- Publication year
-
2023Year of publication
- Publication date
-
2023-06-26Full publication date if available
- Authors
-
Andrej Baláž, Alessia PetesciaList of authors in order
- Landing page
-
https://arxiv.org/abs/2306.14689Publisher landing page
- PDF URL
-
https://arxiv.org/pdf/2306.14689Direct link to full text PDF
- Open access
-
YesWhether a free full text is available
- OA status
-
greenOpen access status per OpenAlex
- OA URL
-
https://arxiv.org/pdf/2306.14689Direct OA link when available
- Concepts
-
Computer science, Prefix, Data structure, Sublinear function, Suffix, Suffix array, Theoretical computer science, Computation, Compressed suffix array, Construct (python library), Algorithm, Discrete mathematics, Mathematics, Suffix tree, Computer network, Programming language, Linguistics, PhilosophyTop concepts (fields/topics) attached by OpenAlex
- Cited by
-
0Total citation count in OpenAlex
- Related works (count)
-
10Other works algorithmically related by OpenAlex
Full payload
| id | https://openalex.org/W4382323343 |
|---|---|
| doi | https://doi.org/10.48550/arxiv.2306.14689 |
| ids.doi | https://doi.org/10.48550/arxiv.2306.14689 |
| ids.openalex | https://openalex.org/W4382323343 |
| fwci | |
| type | preprint |
| title | Prefix-free graphs and suffix array construction in sublinear space |
| biblio.issue | |
| biblio.volume | |
| biblio.last_page | |
| biblio.first_page | |
| topics[0].id | https://openalex.org/T11269 |
| topics[0].field.id | https://openalex.org/fields/17 |
| topics[0].field.display_name | Computer Science |
| topics[0].score | 0.9995999932289124 |
| topics[0].domain.id | https://openalex.org/domains/3 |
| topics[0].domain.display_name | Physical Sciences |
| topics[0].subfield.id | https://openalex.org/subfields/1702 |
| topics[0].subfield.display_name | Artificial Intelligence |
| topics[0].display_name | Algorithms and Data Compression |
| topics[1].id | https://openalex.org/T10015 |
| topics[1].field.id | https://openalex.org/fields/13 |
| topics[1].field.display_name | Biochemistry, Genetics and Molecular Biology |
| topics[1].score | 0.9824000000953674 |
| topics[1].domain.id | https://openalex.org/domains/1 |
| topics[1].domain.display_name | Life Sciences |
| topics[1].subfield.id | https://openalex.org/subfields/1312 |
| topics[1].subfield.display_name | Molecular Biology |
| topics[1].display_name | Genomics and Phylogenetic Studies |
| topics[2].id | https://openalex.org/T10222 |
| topics[2].field.id | https://openalex.org/fields/13 |
| topics[2].field.display_name | Biochemistry, Genetics and Molecular Biology |
| topics[2].score | 0.9688000082969666 |
| topics[2].domain.id | https://openalex.org/domains/1 |
| topics[2].domain.display_name | Life Sciences |
| topics[2].subfield.id | https://openalex.org/subfields/1312 |
| topics[2].subfield.display_name | Molecular Biology |
| topics[2].display_name | Genomics and Chromatin Dynamics |
| is_xpac | False |
| apc_list | |
| apc_paid | |
| concepts[0].id | https://openalex.org/C41008148 |
| concepts[0].level | 0 |
| concepts[0].score | 0.7120466232299805 |
| concepts[0].wikidata | https://www.wikidata.org/wiki/Q21198 |
| concepts[0].display_name | Computer science |
| concepts[1].id | https://openalex.org/C141603448 |
| concepts[1].level | 2 |
| concepts[1].score | 0.6926887035369873 |
| concepts[1].wikidata | https://www.wikidata.org/wiki/Q134830 |
| concepts[1].display_name | Prefix |
| concepts[2].id | https://openalex.org/C162319229 |
| concepts[2].level | 2 |
| concepts[2].score | 0.6879509091377258 |
| concepts[2].wikidata | https://www.wikidata.org/wiki/Q175263 |
| concepts[2].display_name | Data structure |
| concepts[3].id | https://openalex.org/C117160843 |
| concepts[3].level | 2 |
| concepts[3].score | 0.6735744476318359 |
| concepts[3].wikidata | https://www.wikidata.org/wiki/Q338652 |
| concepts[3].display_name | Sublinear function |
| concepts[4].id | https://openalex.org/C2779804580 |
| concepts[4].level | 2 |
| concepts[4].score | 0.595281183719635 |
| concepts[4].wikidata | https://www.wikidata.org/wiki/Q102047 |
| concepts[4].display_name | Suffix |
| concepts[5].id | https://openalex.org/C2779259728 |
| concepts[5].level | 3 |
| concepts[5].score | 0.5644867420196533 |
| concepts[5].wikidata | https://www.wikidata.org/wiki/Q281472 |
| concepts[5].display_name | Suffix array |
| concepts[6].id | https://openalex.org/C80444323 |
| concepts[6].level | 1 |
| concepts[6].score | 0.5455666184425354 |
| concepts[6].wikidata | https://www.wikidata.org/wiki/Q2878974 |
| concepts[6].display_name | Theoretical computer science |
| concepts[7].id | https://openalex.org/C45374587 |
| concepts[7].level | 2 |
| concepts[7].score | 0.5392844080924988 |
| concepts[7].wikidata | https://www.wikidata.org/wiki/Q12525525 |
| concepts[7].display_name | Computation |
| concepts[8].id | https://openalex.org/C100903775 |
| concepts[8].level | 4 |
| concepts[8].score | 0.4946291148662567 |
| concepts[8].wikidata | https://www.wikidata.org/wiki/Q5157028 |
| concepts[8].display_name | Compressed suffix array |
| concepts[9].id | https://openalex.org/C2780801425 |
| concepts[9].level | 2 |
| concepts[9].score | 0.439251184463501 |
| concepts[9].wikidata | https://www.wikidata.org/wiki/Q5164392 |
| concepts[9].display_name | Construct (python library) |
| concepts[10].id | https://openalex.org/C11413529 |
| concepts[10].level | 1 |
| concepts[10].score | 0.350506067276001 |
| concepts[10].wikidata | https://www.wikidata.org/wiki/Q8366 |
| concepts[10].display_name | Algorithm |
| concepts[11].id | https://openalex.org/C118615104 |
| concepts[11].level | 1 |
| concepts[11].score | 0.22980636358261108 |
| concepts[11].wikidata | https://www.wikidata.org/wiki/Q121416 |
| concepts[11].display_name | Discrete mathematics |
| concepts[12].id | https://openalex.org/C33923547 |
| concepts[12].level | 0 |
| concepts[12].score | 0.21322882175445557 |
| concepts[12].wikidata | https://www.wikidata.org/wiki/Q395 |
| concepts[12].display_name | Mathematics |
| concepts[13].id | https://openalex.org/C2781166958 |
| concepts[13].level | 3 |
| concepts[13].score | 0.19836243987083435 |
| concepts[13].wikidata | https://www.wikidata.org/wiki/Q1426863 |
| concepts[13].display_name | Suffix tree |
| concepts[14].id | https://openalex.org/C31258907 |
| concepts[14].level | 1 |
| concepts[14].score | 0.07356828451156616 |
| concepts[14].wikidata | https://www.wikidata.org/wiki/Q1301371 |
| concepts[14].display_name | Computer network |
| concepts[15].id | https://openalex.org/C199360897 |
| concepts[15].level | 1 |
| concepts[15].score | 0.06620240211486816 |
| concepts[15].wikidata | https://www.wikidata.org/wiki/Q9143 |
| concepts[15].display_name | Programming language |
| concepts[16].id | https://openalex.org/C41895202 |
| concepts[16].level | 1 |
| concepts[16].score | 0.0 |
| concepts[16].wikidata | https://www.wikidata.org/wiki/Q8162 |
| concepts[16].display_name | Linguistics |
| concepts[17].id | https://openalex.org/C138885662 |
| concepts[17].level | 0 |
| concepts[17].score | 0.0 |
| concepts[17].wikidata | https://www.wikidata.org/wiki/Q5891 |
| concepts[17].display_name | Philosophy |
| keywords[0].id | https://openalex.org/keywords/computer-science |
| keywords[0].score | 0.7120466232299805 |
| keywords[0].display_name | Computer science |
| keywords[1].id | https://openalex.org/keywords/prefix |
| keywords[1].score | 0.6926887035369873 |
| keywords[1].display_name | Prefix |
| keywords[2].id | https://openalex.org/keywords/data-structure |
| keywords[2].score | 0.6879509091377258 |
| keywords[2].display_name | Data structure |
| keywords[3].id | https://openalex.org/keywords/sublinear-function |
| keywords[3].score | 0.6735744476318359 |
| keywords[3].display_name | Sublinear function |
| keywords[4].id | https://openalex.org/keywords/suffix |
| keywords[4].score | 0.595281183719635 |
| keywords[4].display_name | Suffix |
| keywords[5].id | https://openalex.org/keywords/suffix-array |
| keywords[5].score | 0.5644867420196533 |
| keywords[5].display_name | Suffix array |
| keywords[6].id | https://openalex.org/keywords/theoretical-computer-science |
| keywords[6].score | 0.5455666184425354 |
| keywords[6].display_name | Theoretical computer science |
| keywords[7].id | https://openalex.org/keywords/computation |
| keywords[7].score | 0.5392844080924988 |
| keywords[7].display_name | Computation |
| keywords[8].id | https://openalex.org/keywords/compressed-suffix-array |
| keywords[8].score | 0.4946291148662567 |
| keywords[8].display_name | Compressed suffix array |
| keywords[9].id | https://openalex.org/keywords/construct |
| keywords[9].score | 0.439251184463501 |
| keywords[9].display_name | Construct (python library) |
| keywords[10].id | https://openalex.org/keywords/algorithm |
| keywords[10].score | 0.350506067276001 |
| keywords[10].display_name | Algorithm |
| keywords[11].id | https://openalex.org/keywords/discrete-mathematics |
| keywords[11].score | 0.22980636358261108 |
| keywords[11].display_name | Discrete mathematics |
| keywords[12].id | https://openalex.org/keywords/mathematics |
| keywords[12].score | 0.21322882175445557 |
| keywords[12].display_name | Mathematics |
| keywords[13].id | https://openalex.org/keywords/suffix-tree |
| keywords[13].score | 0.19836243987083435 |
| keywords[13].display_name | Suffix tree |
| keywords[14].id | https://openalex.org/keywords/computer-network |
| keywords[14].score | 0.07356828451156616 |
| keywords[14].display_name | Computer network |
| keywords[15].id | https://openalex.org/keywords/programming-language |
| keywords[15].score | 0.06620240211486816 |
| keywords[15].display_name | Programming language |
| language | en |
| locations[0].id | pmh:oai:arXiv.org:2306.14689 |
| locations[0].is_oa | True |
| locations[0].source.id | https://openalex.org/S4306400194 |
| locations[0].source.issn | |
| locations[0].source.type | repository |
| locations[0].source.is_oa | True |
| locations[0].source.issn_l | |
| locations[0].source.is_core | False |
| locations[0].source.is_in_doaj | False |
| locations[0].source.display_name | arXiv (Cornell University) |
| locations[0].source.host_organization | https://openalex.org/I205783295 |
| locations[0].source.host_organization_name | Cornell University |
| locations[0].source.host_organization_lineage | https://openalex.org/I205783295 |
| locations[0].license | |
| locations[0].pdf_url | https://arxiv.org/pdf/2306.14689 |
| locations[0].version | submittedVersion |
| locations[0].raw_type | text |
| locations[0].license_id | |
| locations[0].is_accepted | False |
| locations[0].is_published | False |
| locations[0].raw_source_name | |
| locations[0].landing_page_url | http://arxiv.org/abs/2306.14689 |
| locations[1].id | doi:10.48550/arxiv.2306.14689 |
| locations[1].is_oa | True |
| locations[1].source.id | https://openalex.org/S4306400194 |
| locations[1].source.issn | |
| locations[1].source.type | repository |
| locations[1].source.is_oa | True |
| locations[1].source.issn_l | |
| locations[1].source.is_core | False |
| locations[1].source.is_in_doaj | False |
| locations[1].source.display_name | arXiv (Cornell University) |
| locations[1].source.host_organization | https://openalex.org/I205783295 |
| locations[1].source.host_organization_name | Cornell University |
| locations[1].source.host_organization_lineage | https://openalex.org/I205783295 |
| locations[1].license | cc-by |
| locations[1].pdf_url | |
| locations[1].version | |
| locations[1].raw_type | article |
| locations[1].license_id | https://openalex.org/licenses/cc-by |
| locations[1].is_accepted | False |
| locations[1].is_published | |
| locations[1].raw_source_name | |
| locations[1].landing_page_url | https://doi.org/10.48550/arxiv.2306.14689 |
| indexed_in | arxiv, datacite |
| authorships[0].author.id | https://openalex.org/A5059632877 |
| authorships[0].author.orcid | https://orcid.org/0000-0002-3113-3389 |
| authorships[0].author.display_name | Andrej Baláž |
| authorships[0].author_position | first |
| authorships[0].raw_author_name | Baláž, Andrej |
| authorships[0].is_corresponding | False |
| authorships[1].author.id | https://openalex.org/A5034083296 |
| authorships[1].author.orcid | https://orcid.org/0000-0002-9000-3664 |
| authorships[1].author.display_name | Alessia Petescia |
| authorships[1].author_position | last |
| authorships[1].raw_author_name | Petescia, Alessia |
| authorships[1].is_corresponding | False |
| has_content.pdf | False |
| has_content.grobid_xml | False |
| is_paratext | False |
| open_access.is_oa | True |
| open_access.oa_url | https://arxiv.org/pdf/2306.14689 |
| open_access.oa_status | green |
| open_access.any_repository_has_fulltext | False |
| created_date | 2025-10-10T00:00:00 |
| display_name | Prefix-free graphs and suffix array construction in sublinear space |
| has_fulltext | False |
| is_retracted | False |
| updated_date | 2025-11-06T06:51:31.235846 |
| primary_topic.id | https://openalex.org/T11269 |
| primary_topic.field.id | https://openalex.org/fields/17 |
| primary_topic.field.display_name | Computer Science |
| primary_topic.score | 0.9995999932289124 |
| primary_topic.domain.id | https://openalex.org/domains/3 |
| primary_topic.domain.display_name | Physical Sciences |
| primary_topic.subfield.id | https://openalex.org/subfields/1702 |
| primary_topic.subfield.display_name | Artificial Intelligence |
| primary_topic.display_name | Algorithms and Data Compression |
| related_works | https://openalex.org/W2576782855, https://openalex.org/W2359436045, https://openalex.org/W153119118, https://openalex.org/W2143531254, https://openalex.org/W2608759984, https://openalex.org/W363060408, https://openalex.org/W2003608043, https://openalex.org/W2055795184, https://openalex.org/W2738126475, https://openalex.org/W2063556310 |
| cited_by_count | 0 |
| locations_count | 2 |
| best_oa_location.id | pmh:oai:arXiv.org:2306.14689 |
| best_oa_location.is_oa | True |
| best_oa_location.source.id | https://openalex.org/S4306400194 |
| best_oa_location.source.issn | |
| best_oa_location.source.type | repository |
| best_oa_location.source.is_oa | True |
| best_oa_location.source.issn_l | |
| best_oa_location.source.is_core | False |
| best_oa_location.source.is_in_doaj | False |
| best_oa_location.source.display_name | arXiv (Cornell University) |
| best_oa_location.source.host_organization | https://openalex.org/I205783295 |
| best_oa_location.source.host_organization_name | Cornell University |
| best_oa_location.source.host_organization_lineage | https://openalex.org/I205783295 |
| best_oa_location.license | |
| best_oa_location.pdf_url | https://arxiv.org/pdf/2306.14689 |
| best_oa_location.version | submittedVersion |
| best_oa_location.raw_type | text |
| best_oa_location.license_id | |
| best_oa_location.is_accepted | False |
| best_oa_location.is_published | False |
| best_oa_location.raw_source_name | |
| best_oa_location.landing_page_url | http://arxiv.org/abs/2306.14689 |
| primary_location.id | pmh:oai:arXiv.org:2306.14689 |
| primary_location.is_oa | True |
| primary_location.source.id | https://openalex.org/S4306400194 |
| primary_location.source.issn | |
| primary_location.source.type | repository |
| primary_location.source.is_oa | True |
| primary_location.source.issn_l | |
| primary_location.source.is_core | False |
| primary_location.source.is_in_doaj | False |
| primary_location.source.display_name | arXiv (Cornell University) |
| primary_location.source.host_organization | https://openalex.org/I205783295 |
| primary_location.source.host_organization_name | Cornell University |
| primary_location.source.host_organization_lineage | https://openalex.org/I205783295 |
| primary_location.license | |
| primary_location.pdf_url | https://arxiv.org/pdf/2306.14689 |
| primary_location.version | submittedVersion |
| primary_location.raw_type | text |
| primary_location.license_id | |
| primary_location.is_accepted | False |
| primary_location.is_published | False |
| primary_location.raw_source_name | |
| primary_location.landing_page_url | http://arxiv.org/abs/2306.14689 |
| publication_date | 2023-06-26 |
| publication_year | 2023 |
| referenced_works_count | 0 |
| abstract_inverted_index.A | 0 |
| abstract_inverted_index.a | 7, 12, 39, 92 |
| abstract_inverted_index.In | 85 |
| abstract_inverted_index.as | 27, 51 |
| abstract_inverted_index.in | 4, 42, 115 |
| abstract_inverted_index.is | 38 |
| abstract_inverted_index.it | 16 |
| abstract_inverted_index.on | 67, 77, 123 |
| abstract_inverted_index.to | 11, 62, 100, 105, 108 |
| abstract_inverted_index.we | 88, 97 |
| abstract_inverted_index.and | 33, 54, 103 |
| abstract_inverted_index.are | 79 |
| abstract_inverted_index.for | 58, 83, 119 |
| abstract_inverted_index.how | 99, 104 |
| abstract_inverted_index.not | 80 |
| abstract_inverted_index.the | 73 |
| abstract_inverted_index.use | 106 |
| abstract_inverted_index.Read | 36 |
| abstract_inverted_index.and, | 45 |
| abstract_inverted_index.data | 48, 75, 95, 111 |
| abstract_inverted_index.from | 6, 30, 113 |
| abstract_inverted_index.high | 65 |
| abstract_inverted_index.many | 70, 120 |
| abstract_inverted_index.must | 23 |
| abstract_inverted_index.read | 34 |
| abstract_inverted_index.show | 98 |
| abstract_inverted_index.such | 26, 50 |
| abstract_inverted_index.them | 102, 107 |
| abstract_inverted_index.this | 86 |
| abstract_inverted_index.with | 15, 47 |
| abstract_inverted_index.These | 20 |
| abstract_inverted_index.array | 53 |
| abstract_inverted_index.bring | 69 |
| abstract_inverted_index.data, | 44 |
| abstract_inverted_index.graph | 18, 21 |
| abstract_inverted_index.novel | 93 |
| abstract_inverted_index.shift | 3 |
| abstract_inverted_index.since | 72 |
| abstract_inverted_index.work, | 87 |
| abstract_inverted_index.allows | 57 |
| abstract_inverted_index.common | 74 |
| abstract_inverted_index.easily | 81 |
| abstract_inverted_index.genome | 10 |
| abstract_inverted_index.graphs | 68 |
| abstract_inverted_index.obtain | 109 |
| abstract_inverted_index.recent | 1 |
| abstract_inverted_index.single | 8 |
| abstract_inverted_index.space, | 117 |
| abstract_inverted_index.suffix | 52 |
| abstract_inverted_index.achieve | 63 |
| abstract_inverted_index.brought | 14 |
| abstract_inverted_index.genomes | 32 |
| abstract_inverted_index.graphs, | 91 |
| abstract_inverted_index.graphs. | 84 |
| abstract_inverted_index.mapping | 37 |
| abstract_inverted_index.problem | 41 |
| abstract_inverted_index.several | 17 |
| abstract_inverted_index.strings | 78 |
| abstract_inverted_index.Attempts | 61 |
| abstract_inverted_index.allowing | 118 |
| abstract_inverted_index.mapping. | 35 |
| abstract_inverted_index.multiple | 31 |
| abstract_inverted_index.paradigm | 2 |
| abstract_inverted_index.together | 46 |
| abstract_inverted_index.construct | 101 |
| abstract_inverted_index.efficient | 28, 59, 121 |
| abstract_inverted_index.implement | 24 |
| abstract_inverted_index.introduce | 89 |
| abstract_inverted_index.pangenome | 13 |
| abstract_inverted_index.reference | 9 |
| abstract_inverted_index.sublinear | 116 |
| abstract_inverted_index.obtainable | 82 |
| abstract_inverted_index.operations | 122 |
| abstract_inverted_index.pangenomic | 94 |
| abstract_inverted_index.sequential | 43 |
| abstract_inverted_index.structure; | 96 |
| abstract_inverted_index.structures | 22, 49, 76, 112 |
| abstract_inverted_index.transform, | 56 |
| abstract_inverted_index.well-known | 110 |
| abstract_inverted_index.operations, | 25 |
| abstract_inverted_index.pangenomes. | 124 |
| abstract_inverted_index.performance | 66 |
| abstract_inverted_index.prefix-free | 90 |
| abstract_inverted_index.stringology | 114 |
| abstract_inverted_index.structures. | 19 |
| abstract_inverted_index.computation. | 60 |
| abstract_inverted_index.construction | 29 |
| abstract_inverted_index.well-studied | 40 |
| abstract_inverted_index.comparatively | 64 |
| abstract_inverted_index.complications | 71 |
| abstract_inverted_index.bioinformatics | 5 |
| abstract_inverted_index.Burrows-Wheeler | 55 |
| cited_by_percentile_year | |
| countries_distinct_count | 0 |
| institutions_distinct_count | 2 |
| citation_normalized_percentile |