Towards Optimal Grammars for RNA Structures
Evarista Onokpasa, Sebastian Wild, Prudence W. H. Wong
2024 · Open Access · DOI: https://doi.org/10.48550/arxiv.2401.16623
In past work (Onokpasa, Wild, Wong, DCC 2023), we showed (a) that, for joint compression of RNA sequence and structure, stochastic context-free grammars are the best known compressors and (b) that grammars with better compression ability also perform better in ab initio structure prediction. Previous grammars were manually curated by human experts. In this work, we develop a framework for automatic and systematic search algorithms for stochastic grammars with better compression (and prediction) ability for RNA. We perform an exhaustive search of small grammars and identify grammars that surpass the performance of human-expert grammars.
Metadata
- Type: preprint
- Language: en
- Landing Page: http://arxiv.org/abs/2401.16623, https://arxiv.org/pdf/2401.16623
- OA Status: green
- Related Works: 10
- OpenAlex ID: https://openalex.org/W4391420940
All OpenAlex metadata
- OpenAlex ID: https://openalex.org/W4391420940 (canonical identifier for this work in OpenAlex)
- DOI: https://doi.org/10.48550/arxiv.2401.16623 (Digital Object Identifier)
- Title: Towards Optimal Grammars for RNA Structures (work title)
- Type: preprint (OpenAlex work type)
- Language: en (primary language)
- Publication year: 2024 (year of publication)
- Publication date: 2024-01-29 (full publication date if available)
- Authors: Evarista Onokpasa, Sebastian Wild, Prudence W. H. Wong (list of authors in order)
- Landing page: https://arxiv.org/abs/2401.16623 (publisher landing page)
- PDF URL: https://arxiv.org/pdf/2401.16623 (direct link to full text PDF)
- Open access: Yes (whether a free full text is available)
- OA status: green (open access status per OpenAlex)
- OA URL: https://arxiv.org/pdf/2401.16623 (direct OA link when available)
- Concepts: Rule-based machine translation, Computer science, Natural language processing (top concepts, i.e. fields/topics, attached by OpenAlex)
- Cited by: 0 (total citation count in OpenAlex)
- Related works (count): 10 (other works algorithmically related by OpenAlex)
Full payload
| Field | Value |
|---|---|
| id | https://openalex.org/W4391420940 |
| doi | https://doi.org/10.48550/arxiv.2401.16623 |
| ids.doi | https://doi.org/10.48550/arxiv.2401.16623 |
| ids.openalex | https://openalex.org/W4391420940 |
| fwci | |
| type | preprint |
| title | Towards Optimal Grammars for RNA Structures |
| awards[0].id | https://openalex.org/G8905621085 |
| awards[0].funder_id | https://openalex.org/F4320334627 |
| awards[0].display_name | |
| awards[0].funder_award_id | EP/X039447/1 |
| awards[0].funder_display_name | Engineering and Physical Sciences Research Council |
| biblio.issue | |
| biblio.volume | |
| biblio.last_page | |
| biblio.first_page | |
| topics[0].id | https://openalex.org/T10521 |
| topics[0].field.id | https://openalex.org/fields/13 |
| topics[0].field.display_name | Biochemistry, Genetics and Molecular Biology |
| topics[0].score | 0.9993000030517578 |
| topics[0].domain.id | https://openalex.org/domains/1 |
| topics[0].domain.display_name | Life Sciences |
| topics[0].subfield.id | https://openalex.org/subfields/1312 |
| topics[0].subfield.display_name | Molecular Biology |
| topics[0].display_name | RNA and protein synthesis mechanisms |
| topics[1].id | https://openalex.org/T11482 |
| topics[1].field.id | https://openalex.org/fields/13 |
| topics[1].field.display_name | Biochemistry, Genetics and Molecular Biology |
| topics[1].score | 0.9620000123977661 |
| topics[1].domain.id | https://openalex.org/domains/1 |
| topics[1].domain.display_name | Life Sciences |
| topics[1].subfield.id | https://openalex.org/subfields/1312 |
| topics[1].subfield.display_name | Molecular Biology |
| topics[1].display_name | RNA modifications and cancer |
| topics[2].id | https://openalex.org/T12029 |
| topics[2].field.id | https://openalex.org/fields/13 |
| topics[2].field.display_name | Biochemistry, Genetics and Molecular Biology |
| topics[2].score | 0.9617999792098999 |
| topics[2].domain.id | https://openalex.org/domains/1 |
| topics[2].domain.display_name | Life Sciences |
| topics[2].subfield.id | https://openalex.org/subfields/1312 |
| topics[2].subfield.display_name | Molecular Biology |
| topics[2].display_name | DNA and Biological Computing |
| funders[0].id | https://openalex.org/F4320334627 |
| funders[0].ror | https://ror.org/0439y7842 |
| funders[0].display_name | Engineering and Physical Sciences Research Council |
| is_xpac | False |
| apc_list | |
| apc_paid | |
| concepts[0].id | https://openalex.org/C53893814 |
| concepts[0].level | 2 |
| concepts[0].score | 0.6976819038391113 |
| concepts[0].wikidata | https://www.wikidata.org/wiki/Q7378909 |
| concepts[0].display_name | Rule-based machine translation |
| concepts[1].id | https://openalex.org/C41008148 |
| concepts[1].level | 0 |
| concepts[1].score | 0.47158563137054443 |
| concepts[1].wikidata | https://www.wikidata.org/wiki/Q21198 |
| concepts[1].display_name | Computer science |
| concepts[2].id | https://openalex.org/C204321447 |
| concepts[2].level | 1 |
| concepts[2].score | 0.29188477993011475 |
| concepts[2].wikidata | https://www.wikidata.org/wiki/Q30642 |
| concepts[2].display_name | Natural language processing |
| keywords[0].id | https://openalex.org/keywords/rule-based-machine-translation |
| keywords[0].score | 0.6976819038391113 |
| keywords[0].display_name | Rule-based machine translation |
| keywords[1].id | https://openalex.org/keywords/computer-science |
| keywords[1].score | 0.47158563137054443 |
| keywords[1].display_name | Computer science |
| keywords[2].id | https://openalex.org/keywords/natural-language-processing |
| keywords[2].score | 0.29188477993011475 |
| keywords[2].display_name | Natural language processing |
| language | en |
| locations[0].id | pmh:oai:arXiv.org:2401.16623 |
| locations[0].is_oa | True |
| locations[0].source.id | https://openalex.org/S4306400194 |
| locations[0].source.issn | |
| locations[0].source.type | repository |
| locations[0].source.is_oa | True |
| locations[0].source.issn_l | |
| locations[0].source.is_core | False |
| locations[0].source.is_in_doaj | False |
| locations[0].source.display_name | arXiv (Cornell University) |
| locations[0].source.host_organization | https://openalex.org/I205783295 |
| locations[0].source.host_organization_name | Cornell University |
| locations[0].source.host_organization_lineage | https://openalex.org/I205783295 |
| locations[0].license | |
| locations[0].pdf_url | https://arxiv.org/pdf/2401.16623 |
| locations[0].version | submittedVersion |
| locations[0].raw_type | |
| locations[0].license_id | |
| locations[0].is_accepted | False |
| locations[0].is_published | False |
| locations[0].raw_source_name | |
| locations[0].landing_page_url | http://arxiv.org/abs/2401.16623 |
| locations[1].id | doi:10.48550/arxiv.2401.16623 |
| locations[1].is_oa | True |
| locations[1].source.id | https://openalex.org/S4306400194 |
| locations[1].source.issn | |
| locations[1].source.type | repository |
| locations[1].source.is_oa | True |
| locations[1].source.issn_l | |
| locations[1].source.is_core | False |
| locations[1].source.is_in_doaj | False |
| locations[1].source.display_name | arXiv (Cornell University) |
| locations[1].source.host_organization | https://openalex.org/I205783295 |
| locations[1].source.host_organization_name | Cornell University |
| locations[1].source.host_organization_lineage | https://openalex.org/I205783295 |
| locations[1].license | |
| locations[1].pdf_url | |
| locations[1].version | |
| locations[1].raw_type | article |
| locations[1].license_id | |
| locations[1].is_accepted | False |
| locations[1].is_published | |
| locations[1].raw_source_name | |
| locations[1].landing_page_url | https://doi.org/10.48550/arxiv.2401.16623 |
| indexed_in | arxiv, datacite |
| authorships[0].author.id | https://openalex.org/A5023241798 |
| authorships[0].author.orcid | |
| authorships[0].author.display_name | Evarista Onokpasa |
| authorships[0].author_position | first |
| authorships[0].raw_author_name | Onokpasa, Evarista |
| authorships[0].is_corresponding | False |
| authorships[1].author.id | https://openalex.org/A5071263179 |
| authorships[1].author.orcid | https://orcid.org/0000-0002-6061-9177 |
| authorships[1].author.display_name | Sebastian Wild |
| authorships[1].author_position | middle |
| authorships[1].raw_author_name | Wild, Sebastian |
| authorships[1].is_corresponding | False |
| authorships[2].author.id | https://openalex.org/A5063692696 |
| authorships[2].author.orcid | https://orcid.org/0000-0001-7935-7245 |
| authorships[2].author.display_name | Prudence W. H. Wong |
| authorships[2].author_position | last |
| authorships[2].raw_author_name | Wong, Prudence W. H. |
| authorships[2].is_corresponding | False |
| has_content.pdf | True |
| has_content.grobid_xml | True |
| is_paratext | False |
| open_access.is_oa | True |
| open_access.oa_url | https://arxiv.org/pdf/2401.16623 |
| open_access.oa_status | green |
| open_access.any_repository_has_fulltext | False |
| created_date | 2025-10-10T00:00:00 |
| display_name | Towards Optimal Grammars for RNA Structures |
| has_fulltext | True |
| is_retracted | False |
| updated_date | 2025-11-06T06:51:31.235846 |
| primary_topic.id | https://openalex.org/T10521 |
| primary_topic.field.id | https://openalex.org/fields/13 |
| primary_topic.field.display_name | Biochemistry, Genetics and Molecular Biology |
| primary_topic.score | 0.9993000030517578 |
| primary_topic.domain.id | https://openalex.org/domains/1 |
| primary_topic.domain.display_name | Life Sciences |
| primary_topic.subfield.id | https://openalex.org/subfields/1312 |
| primary_topic.subfield.display_name | Molecular Biology |
| primary_topic.display_name | RNA and protein synthesis mechanisms |
| related_works | https://openalex.org/W2748952813, https://openalex.org/W2390279801, https://openalex.org/W2358668433, https://openalex.org/W2376932109, https://openalex.org/W2001405890, https://openalex.org/W2382290278, https://openalex.org/W2350741829, https://openalex.org/W2530322880, https://openalex.org/W1596801655, https://openalex.org/W2359140296 |
| cited_by_count | 0 |
| locations_count | 2 |
| best_oa_location.id | pmh:oai:arXiv.org:2401.16623 |
| best_oa_location.is_oa | True |
| best_oa_location.source.id | https://openalex.org/S4306400194 |
| best_oa_location.source.issn | |
| best_oa_location.source.type | repository |
| best_oa_location.source.is_oa | True |
| best_oa_location.source.issn_l | |
| best_oa_location.source.is_core | False |
| best_oa_location.source.is_in_doaj | False |
| best_oa_location.source.display_name | arXiv (Cornell University) |
| best_oa_location.source.host_organization | https://openalex.org/I205783295 |
| best_oa_location.source.host_organization_name | Cornell University |
| best_oa_location.source.host_organization_lineage | https://openalex.org/I205783295 |
| best_oa_location.license | |
| best_oa_location.pdf_url | https://arxiv.org/pdf/2401.16623 |
| best_oa_location.version | submittedVersion |
| best_oa_location.raw_type | |
| best_oa_location.license_id | |
| best_oa_location.is_accepted | False |
| best_oa_location.is_published | False |
| best_oa_location.raw_source_name | |
| best_oa_location.landing_page_url | http://arxiv.org/abs/2401.16623 |
| primary_location.id | pmh:oai:arXiv.org:2401.16623 |
| primary_location.is_oa | True |
| primary_location.source.id | https://openalex.org/S4306400194 |
| primary_location.source.issn | |
| primary_location.source.type | repository |
| primary_location.source.is_oa | True |
| primary_location.source.issn_l | |
| primary_location.source.is_core | False |
| primary_location.source.is_in_doaj | False |
| primary_location.source.display_name | arXiv (Cornell University) |
| primary_location.source.host_organization | https://openalex.org/I205783295 |
| primary_location.source.host_organization_name | Cornell University |
| primary_location.source.host_organization_lineage | https://openalex.org/I205783295 |
| primary_location.license | |
| primary_location.pdf_url | https://arxiv.org/pdf/2401.16623 |
| primary_location.version | submittedVersion |
| primary_location.raw_type | |
| primary_location.license_id | |
| primary_location.is_accepted | False |
| primary_location.is_published | False |
| primary_location.raw_source_name | |
| primary_location.landing_page_url | http://arxiv.org/abs/2401.16623 |
| publication_date | 2024-01-29 |
| publication_year | 2024 |
| referenced_works_count | 0 |
| abstract_inverted_index.a | 59 |
| abstract_inverted_index.In | 0, 54 |
| abstract_inverted_index.We | 78 |
| abstract_inverted_index.ab | 42 |
| abstract_inverted_index.an | 80 |
| abstract_inverted_index.by | 51 |
| abstract_inverted_index.in | 41 |
| abstract_inverted_index.of | 15, 83, 93 |
| abstract_inverted_index.we | 8, 57 |
| abstract_inverted_index.(a) | 11 |
| abstract_inverted_index.(b) | 29 |
| abstract_inverted_index.DCC | 6 |
| abstract_inverted_index.RNA | 16 |
| abstract_inverted_index.and | 18, 28, 63, 86 |
| abstract_inverted_index.are | 23 |
| abstract_inverted_index.for | 12, 61, 67, 76 |
| abstract_inverted_index.the | 24, 91 |
| abstract_inverted_index.(and | 73 |
| abstract_inverted_index.RNA. | 77 |
| abstract_inverted_index.also | 37 |
| abstract_inverted_index.best | 25 |
| abstract_inverted_index.have | 33 |
| abstract_inverted_index.past | 1 |
| abstract_inverted_index.show | 38 |
| abstract_inverted_index.that | 10, 30, 89 |
| abstract_inverted_index.this | 55 |
| abstract_inverted_index.were | 48 |
| abstract_inverted_index.with | 70 |
| abstract_inverted_index.work | 2 |
| abstract_inverted_index.Wild, | 4 |
| abstract_inverted_index.Wong, | 5 |
| abstract_inverted_index.human | 52 |
| abstract_inverted_index.joint | 13 |
| abstract_inverted_index.known | 26 |
| abstract_inverted_index.small | 84 |
| abstract_inverted_index.which | 32 |
| abstract_inverted_index.work, | 56 |
| abstract_inverted_index.2023), | 7 |
| abstract_inverted_index.better | 34, 39, 71 |
| abstract_inverted_index.initio | 43 |
| abstract_inverted_index.search | 65, 82 |
| abstract_inverted_index.showed | 9 |
| abstract_inverted_index.ability | 36, 75 |
| abstract_inverted_index.curated | 50 |
| abstract_inverted_index.develop | 58 |
| abstract_inverted_index.perform | 79 |
| abstract_inverted_index.surpass | 90 |
| abstract_inverted_index.Previous | 46 |
| abstract_inverted_index.experts. | 53 |
| abstract_inverted_index.grammars | 22, 31, 47, 69, 85, 88 |
| abstract_inverted_index.identify | 87 |
| abstract_inverted_index.manually | 49 |
| abstract_inverted_index.sequence | 17 |
| abstract_inverted_index.automatic | 62 |
| abstract_inverted_index.framework | 60 |
| abstract_inverted_index.grammars. | 95 |
| abstract_inverted_index.structure | 44 |
| abstract_inverted_index.(Onokpasa, | 3 |
| abstract_inverted_index.algorithms | 66 |
| abstract_inverted_index.exhaustive | 81 |
| abstract_inverted_index.stochastic | 20, 68 |
| abstract_inverted_index.structure, | 19 |
| abstract_inverted_index.systematic | 64 |
| abstract_inverted_index.compression | 14, 35, 72 |
| abstract_inverted_index.compressors | 27 |
| abstract_inverted_index.performance | 40, 92 |
| abstract_inverted_index.prediction) | 74 |
| abstract_inverted_index.prediction. | 45 |
| abstract_inverted_index.context-free | 21 |
| abstract_inverted_index.human-expert | 94 |
| cited_by_percentile_year | |
| countries_distinct_count | 0 |
| institutions_distinct_count | 3 |
| citation_normalized_percentile | |
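
The `abstract_inverted_index.*` rows in the payload store the abstract as a map from each token to the word positions where it occurs. A minimal sketch of how such rows could be turned back into running text is shown here. The function name `rebuild_abstract`, the `PREFIX` constant, and the `excerpt` dictionary are illustrative assumptions, not part of any OpenAlex client library, and the excerpt is hand-reduced to positions 0 to 9.

```python
from typing import Dict, List

# Key prefix used by the flattened payload rows (taken from the table above).
PREFIX = "abstract_inverted_index."


def rebuild_abstract(flat_rows: Dict[str, List[int]]) -> str:
    """Rebuild text from flattened 'abstract_inverted_index.<token>' rows.

    Hypothetical helper: each value lists the word positions of the token.
    """
    by_position: Dict[int, str] = {}
    for key, positions in flat_rows.items():
        if not key.startswith(PREFIX):
            continue  # ignore unrelated payload fields
        token = key[len(PREFIX):]
        for pos in positions:
            by_position[pos] = token
    # Emit tokens in ascending position order; missing positions are skipped.
    return " ".join(by_position[pos] for pos in sorted(by_position))


# Excerpt of the payload, reduced by hand to positions 0-9 only
# (later occurrences such as "In" at 54 or "we" at 57 are dropped here).
excerpt = {
    "abstract_inverted_index.In": [0],
    "abstract_inverted_index.past": [1],
    "abstract_inverted_index.work": [2],
    "abstract_inverted_index.(Onokpasa,": [3],
    "abstract_inverted_index.Wild,": [4],
    "abstract_inverted_index.Wong,": [5],
    "abstract_inverted_index.DCC": [6],
    "abstract_inverted_index.2023),": [7],
    "abstract_inverted_index.we": [8],
    "abstract_inverted_index.showed": [9],
    "language": [],  # non-index field, ignored by the helper
}

print(rebuild_abstract(excerpt))
# prints: In past work (Onokpasa, Wild, Wong, DCC 2023), we showed
```

Applied to the complete set of rows (positions 0 to 95), the same procedure should reproduce the abstract quoted at the top of this record, since every position maps to exactly one token.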