Scaling Dense Representations for Single Cell with Transcriptome-Scale Context
2024 · Open Access · DOI: https://doi.org/10.1101/2024.11.28.625303
Developing a unified model of cellular systems is a canonical challenge in biology. Recently, a wealth of public single-cell RNA sequencing data as well as rapid scaling of self-supervised learning methods have provided new avenues to address this longstanding challenge. However, rapid parameter scaling has been essential to the success of large language models in text and images, while similar scaling has not been attempted with Transformer architectures for cellular modeling. To produce accurate, transferable, and biologically meaningful representations of cellular systems, we develop AIDO.Cell, a pretrained module for representing gene expression and cellular systems in an AI-driven Digital Organism [1]. AIDO.Cell contains a series of 3M, 10M, 100M, and 650M parameter encoder-only dense Transformer models pre-trained on 50 million human cells from diverse tissues using a read-depth-aware masked gene expression pretraining objective. Unlike previous models, AIDO.Cell is capable of handling the entire human transcriptome as input without truncation or sampling tricks, thus learning accurate and general representations of the human cell’s entire transcriptional context. This pretraining with a longer context was enabled through FlashAttention-2, mixed precision, and large-scale distributed systems training. AIDO.Cell (100M) achieves state-of-the-art results in tasks such as zero-shot clustering, cell-type classification, and perturbation modeling. Our findings reveal interesting loss scaling behaviors as we increase AIDO.Cell’s parameters from 3M to 650M, providing insights for future directions in single-cell modeling. Models and code are available through ModelGenerator in https://github.com/genbio-ai/AIDO and on Hugging Face.
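The masked gene expression objective named in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: the mask rate, the sentinel mask value, the mean-squared-error loss, and the function names `mask_expression` and `masked_mse` are all assumptions made for illustration, and the read-depth-aware conditioning is omitted entirely. The sketch only shows the general idea of hiding some gene values and scoring predictions at the hidden positions.

```python
import random


def mask_expression(expr, mask_rate=0.15, mask_value=-1.0, rng=None):
    """Hide a random subset of gene values.

    Returns a masked copy of `expr` and the sorted indices that were hidden.
    The mask rate and the sentinel value are illustrative assumptions.
    """
    rng = rng or random.Random(0)
    n_masked = max(1, round(len(expr) * mask_rate))
    masked_idx = sorted(rng.sample(range(len(expr)), n_masked))
    masked = list(expr)
    for i in masked_idx:
        masked[i] = mask_value
    return masked, masked_idx


def masked_mse(pred, target, masked_idx):
    """Mean squared error computed only over the masked positions."""
    return sum((pred[i] - target[i]) ** 2 for i in masked_idx) / len(masked_idx)


# Toy normalized expression vector for 8 hypothetical genes.
expr = [0.0, 1.2, 0.0, 3.4, 0.7, 0.0, 2.1, 0.0]
masked, idx = mask_expression(expr, mask_rate=0.25)

# A perfect "model" that recovers the original values has zero loss.
loss_perfect = masked_mse(expr, expr, idx)

# Echoing the masked input back is penalized at the hidden positions.
loss_naive = masked_mse(masked, expr, idx)
```

In an actual pretraining setup, a Transformer encoder would take the place of the stand-in predictions used here, and the loss would be backpropagated through it.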
- Type: preprint
- Language: en
- Landing Page: https://doi.org/10.1101/2024.11.28.625303, https://www.biorxiv.org/content/biorxiv/early/2024/12/03/2024.11.28.625303.full.pdf
- OA Status: green
- Cited By: 8
- References: 39
- Related Works: 10
- OpenAlex ID: https://openalex.org/W4404945205
Raw OpenAlex JSON
- OpenAlex ID: https://openalex.org/W4404945205 (Canonical identifier for this work in OpenAlex)
- DOI: https://doi.org/10.1101/2024.11.28.625303 (Digital Object Identifier)
- Title: Scaling Dense Representations for Single Cell with Transcriptome-Scale Context (Work title)
- Type: preprint (OpenAlex work type)
- Language: en (Primary language)
- Publication year: 2024 (Year of publication)
- Publication date: 2024-12-03 (Full publication date if available)
- Authors: Nicholas Ho, Caleb N. Ellington, Jing Hou, Sohan Addagudi, Shentong Mo, Tianhua Tao, Dian Li, Yonghao Zhuang, Hongyi Wang, Xingyi Cheng, Le Song, Eric P. Xing (List of authors in order)
- Landing page: https://doi.org/10.1101/2024.11.28.625303 (Publisher landing page)
- PDF URL: https://www.biorxiv.org/content/biorxiv/early/2024/12/03/2024.11.28.625303.full.pdf (Direct link to full text PDF)
- Open access: Yes (Whether a free full text is available)
- OA status: green (Open access status per OpenAlex)
- OA URL: https://www.biorxiv.org/content/biorxiv/early/2024/12/03/2024.11.28.625303.full.pdf (Direct OA link when available)
- Concepts: Computer science, Scaling, Context (archaeology), Encoder, Artificial intelligence, Transformer, Computational biology, Biology, Mathematics, Engineering, Electrical engineering, Geometry, Operating system, Voltage, Paleontology (Top concepts, i.e. fields/topics, attached by OpenAlex)
- Cited by: 8 (Total citation count in OpenAlex)
- Citations by year (recent): 2025: 8 (Per-year citation counts, last 5 years)
- References (count): 39 (Number of works referenced by this work)
- Related works (count): 10 (Other works algorithmically related by OpenAlex)
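The `abstract_inverted_index.*` rows in the full payload store the abstract as a map from each word to the positions where it occurs. A minimal sketch of how such a map could be turned back into running text is shown here; the function name `reconstruct_abstract` and the `sample` dictionary are illustrative, and the sample holds only a handful of rows copied from the payload (position 1 is deliberately left out, so the output has a gap there).

```python
from __future__ import annotations


def reconstruct_abstract(inverted_index: dict[str, list[int]]) -> str:
    """Rebuild text from a word -> positions map by sorting on position."""
    positioned = [
        (pos, word)
        for word, positions in inverted_index.items()
        for pos in positions
    ]
    # Tuples sort by position first, which restores the original word order.
    return " ".join(word for _, word in sorted(positioned))


# A few rows from the payload, restricted to positions 0 and 2 through 8.
sample = {
    "Abstract": [0],
    "a": [2],
    "unified": [3],
    "model": [4],
    "of": [5],
    "cellular": [6],
    "systems": [7],
    "is": [8],
}

text = reconstruct_abstract(sample)
```

Applied to the complete index rather than this subset, the same approach should yield the full abstract with "Abstract" as its first token.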
Full payload
| Field | Value |
|---|---|
| id | https://openalex.org/W4404945205 |
| doi | https://doi.org/10.1101/2024.11.28.625303 |
| ids.doi | https://doi.org/10.1101/2024.11.28.625303 |
| ids.openalex | https://openalex.org/W4404945205 |
| fwci | 3.8419778 |
| type | preprint |
| title | Scaling Dense Representations for Single Cell with Transcriptome-Scale Context |
| biblio.issue | |
| biblio.volume | |
| biblio.last_page | |
| biblio.first_page | |
| topics[0].id | https://openalex.org/T11289 |
| topics[0].field.id | https://openalex.org/fields/13 |
| topics[0].field.display_name | Biochemistry, Genetics and Molecular Biology |
| topics[0].score | 1.0 |
| topics[0].domain.id | https://openalex.org/domains/1 |
| topics[0].domain.display_name | Life Sciences |
| topics[0].subfield.id | https://openalex.org/subfields/1312 |
| topics[0].subfield.display_name | Molecular Biology |
| topics[0].display_name | Single-cell and spatial transcriptomics |
| topics[1].id | https://openalex.org/T12859 |
| topics[1].field.id | https://openalex.org/fields/13 |
| topics[1].field.display_name | Biochemistry, Genetics and Molecular Biology |
| topics[1].score | 0.9988999962806702 |
| topics[1].domain.id | https://openalex.org/domains/1 |
| topics[1].domain.display_name | Life Sciences |
| topics[1].subfield.id | https://openalex.org/subfields/1304 |
| topics[1].subfield.display_name | Biophysics |
| topics[1].display_name | Cell Image Analysis Techniques |
| topics[2].id | https://openalex.org/T10621 |
| topics[2].field.id | https://openalex.org/fields/13 |
| topics[2].field.display_name | Biochemistry, Genetics and Molecular Biology |
| topics[2].score | 0.9894000291824341 |
| topics[2].domain.id | https://openalex.org/domains/1 |
| topics[2].domain.display_name | Life Sciences |
| topics[2].subfield.id | https://openalex.org/subfields/1312 |
| topics[2].subfield.display_name | Molecular Biology |
| topics[2].display_name | Gene Regulatory Network Analysis |
| is_xpac | False |
| apc_list | |
| apc_paid | |
| concepts[0].id | https://openalex.org/C41008148 |
| concepts[0].level | 0 |
| concepts[0].score | 0.6892776489257812 |
| concepts[0].wikidata | https://www.wikidata.org/wiki/Q21198 |
| concepts[0].display_name | Computer science |
| concepts[1].id | https://openalex.org/C99844830 |
| concepts[1].level | 2 |
| concepts[1].score | 0.6027362942695618 |
| concepts[1].wikidata | https://www.wikidata.org/wiki/Q102441924 |
| concepts[1].display_name | Scaling |
| concepts[2].id | https://openalex.org/C2779343474 |
| concepts[2].level | 2 |
| concepts[2].score | 0.5215986967086792 |
| concepts[2].wikidata | https://www.wikidata.org/wiki/Q3109175 |
| concepts[2].display_name | Context (archaeology) |
| concepts[3].id | https://openalex.org/C118505674 |
| concepts[3].level | 2 |
| concepts[3].score | 0.5051954388618469 |
| concepts[3].wikidata | https://www.wikidata.org/wiki/Q42586063 |
| concepts[3].display_name | Encoder |
| concepts[4].id | https://openalex.org/C154945302 |
| concepts[4].level | 1 |
| concepts[4].score | 0.4999673366546631 |
| concepts[4].wikidata | https://www.wikidata.org/wiki/Q11660 |
| concepts[4].display_name | Artificial intelligence |
| concepts[5].id | https://openalex.org/C66322947 |
| concepts[5].level | 3 |
| concepts[5].score | 0.4562777578830719 |
| concepts[5].wikidata | https://www.wikidata.org/wiki/Q11658 |
| concepts[5].display_name | Transformer |
| concepts[6].id | https://openalex.org/C70721500 |
| concepts[6].level | 1 |
| concepts[6].score | 0.326898992061615 |
| concepts[6].wikidata | https://www.wikidata.org/wiki/Q177005 |
| concepts[6].display_name | Computational biology |
| concepts[7].id | https://openalex.org/C86803240 |
| concepts[7].level | 0 |
| concepts[7].score | 0.2164558470249176 |
| concepts[7].wikidata | https://www.wikidata.org/wiki/Q420 |
| concepts[7].display_name | Biology |
| concepts[8].id | https://openalex.org/C33923547 |
| concepts[8].level | 0 |
| concepts[8].score | 0.1428864300251007 |
| concepts[8].wikidata | https://www.wikidata.org/wiki/Q395 |
| concepts[8].display_name | Mathematics |
| concepts[9].id | https://openalex.org/C127413603 |
| concepts[9].level | 0 |
| concepts[9].score | 0.09203392267227173 |
| concepts[9].wikidata | https://www.wikidata.org/wiki/Q11023 |
| concepts[9].display_name | Engineering |
| concepts[10].id | https://openalex.org/C119599485 |
| concepts[10].level | 1 |
| concepts[10].score | 0.0 |
| concepts[10].wikidata | https://www.wikidata.org/wiki/Q43035 |
| concepts[10].display_name | Electrical engineering |
| concepts[11].id | https://openalex.org/C2524010 |
| concepts[11].level | 1 |
| concepts[11].score | 0.0 |
| concepts[11].wikidata | https://www.wikidata.org/wiki/Q8087 |
| concepts[11].display_name | Geometry |
| concepts[12].id | https://openalex.org/C111919701 |
| concepts[12].level | 1 |
| concepts[12].score | 0.0 |
| concepts[12].wikidata | https://www.wikidata.org/wiki/Q9135 |
| concepts[12].display_name | Operating system |
| concepts[13].id | https://openalex.org/C165801399 |
| concepts[13].level | 2 |
| concepts[13].score | 0.0 |
| concepts[13].wikidata | https://www.wikidata.org/wiki/Q25428 |
| concepts[13].display_name | Voltage |
| concepts[14].id | https://openalex.org/C151730666 |
| concepts[14].level | 1 |
| concepts[14].score | 0.0 |
| concepts[14].wikidata | https://www.wikidata.org/wiki/Q7205 |
| concepts[14].display_name | Paleontology |
| keywords[0].id | https://openalex.org/keywords/computer-science |
| keywords[0].score | 0.6892776489257812 |
| keywords[0].display_name | Computer science |
| keywords[1].id | https://openalex.org/keywords/scaling |
| keywords[1].score | 0.6027362942695618 |
| keywords[1].display_name | Scaling |
| keywords[2].id | https://openalex.org/keywords/context |
| keywords[2].score | 0.5215986967086792 |
| keywords[2].display_name | Context (archaeology) |
| keywords[3].id | https://openalex.org/keywords/encoder |
| keywords[3].score | 0.5051954388618469 |
| keywords[3].display_name | Encoder |
| keywords[4].id | https://openalex.org/keywords/artificial-intelligence |
| keywords[4].score | 0.4999673366546631 |
| keywords[4].display_name | Artificial intelligence |
| keywords[5].id | https://openalex.org/keywords/transformer |
| keywords[5].score | 0.4562777578830719 |
| keywords[5].display_name | Transformer |
| keywords[6].id | https://openalex.org/keywords/computational-biology |
| keywords[6].score | 0.326898992061615 |
| keywords[6].display_name | Computational biology |
| keywords[7].id | https://openalex.org/keywords/biology |
| keywords[7].score | 0.2164558470249176 |
| keywords[7].display_name | Biology |
| keywords[8].id | https://openalex.org/keywords/mathematics |
| keywords[8].score | 0.1428864300251007 |
| keywords[8].display_name | Mathematics |
| keywords[9].id | https://openalex.org/keywords/engineering |
| keywords[9].score | 0.09203392267227173 |
| keywords[9].display_name | Engineering |
| language | en |
| locations[0].id | doi:10.1101/2024.11.28.625303 |
| locations[0].is_oa | True |
| locations[0].source.id | https://openalex.org/S4306402567 |
| locations[0].source.issn | |
| locations[0].source.type | repository |
| locations[0].source.is_oa | False |
| locations[0].source.issn_l | |
| locations[0].source.is_core | False |
| locations[0].source.is_in_doaj | False |
| locations[0].source.display_name | bioRxiv (Cold Spring Harbor Laboratory) |
| locations[0].source.host_organization | https://openalex.org/I2750212522 |
| locations[0].source.host_organization_name | Cold Spring Harbor Laboratory |
| locations[0].source.host_organization_lineage | https://openalex.org/I2750212522 |
| locations[0].license | cc-by-nc-nd |
| locations[0].pdf_url | https://www.biorxiv.org/content/biorxiv/early/2024/12/03/2024.11.28.625303.full.pdf |
| locations[0].version | acceptedVersion |
| locations[0].raw_type | posted-content |
| locations[0].license_id | https://openalex.org/licenses/cc-by-nc-nd |
| locations[0].is_accepted | True |
| locations[0].is_published | False |
| locations[0].raw_source_name | |
| locations[0].landing_page_url | https://doi.org/10.1101/2024.11.28.625303 |
| indexed_in | crossref |
| authorships[0].author.id | https://openalex.org/A5102978796 |
| authorships[0].author.orcid | https://orcid.org/0000-0001-9843-0866 |
| authorships[0].author.display_name | Nicholas Ho |
| authorships[0].countries | US |
| authorships[0].affiliations[0].institution_ids | https://openalex.org/I74973139 |
| authorships[0].affiliations[0].raw_affiliation_string | Carnegie Mellon University |
| authorships[0].affiliations[1].raw_affiliation_string | GenBio AI |
| authorships[0].institutions[0].id | https://openalex.org/I74973139 |
| authorships[0].institutions[0].ror | https://ror.org/05x2bcf33 |
| authorships[0].institutions[0].type | education |
| authorships[0].institutions[0].lineage | https://openalex.org/I74973139 |
| authorships[0].institutions[0].country_code | US |
| authorships[0].institutions[0].display_name | Carnegie Mellon University |
| authorships[0].author_position | first |
| authorships[0].raw_author_name | Nicholas Ho |
| authorships[0].is_corresponding | False |
| authorships[0].raw_affiliation_strings | Carnegie Mellon University, GenBio AI |
| authorships[1].author.id | https://openalex.org/A5047695651 |
| authorships[1].author.orcid | https://orcid.org/0000-0001-7029-8023 |
| authorships[1].author.display_name | Caleb N. Ellington |
| authorships[1].affiliations[0].raw_affiliation_string | GenBio AI |
| authorships[1].author_position | middle |
| authorships[1].raw_author_name | Caleb N. Ellington |
| authorships[1].is_corresponding | False |
| authorships[1].raw_affiliation_strings | GenBio AI |
| authorships[2].author.id | https://openalex.org/A5101756256 |
| authorships[2].author.orcid | https://orcid.org/0000-0003-0792-8971 |
| authorships[2].author.display_name | Jing Hou |
| authorships[2].countries | US |
| authorships[2].affiliations[0].raw_affiliation_string | GenBio AI |
| authorships[2].affiliations[1].institution_ids | https://openalex.org/I74973139 |
| authorships[2].affiliations[1].raw_affiliation_string | Carnegie Mellon University |
| authorships[2].institutions[0].id | https://openalex.org/I74973139 |
| authorships[2].institutions[0].ror | https://ror.org/05x2bcf33 |
| authorships[2].institutions[0].type | education |
| authorships[2].institutions[0].lineage | https://openalex.org/I74973139 |
| authorships[2].institutions[0].country_code | US |
| authorships[2].institutions[0].display_name | Carnegie Mellon University |
| authorships[2].author_position | middle |
| authorships[2].raw_author_name | Jinyu Hou |
| authorships[2].is_corresponding | False |
| authorships[2].raw_affiliation_strings | Carnegie Mellon University, GenBio AI |
| authorships[3].author.id | https://openalex.org/A5114967852 |
| authorships[3].author.orcid | |
| authorships[3].author.display_name | Sohan Addagudi |
| authorships[3].countries | US |
| authorships[3].affiliations[0].raw_affiliation_string | GenBio AI |
| authorships[3].affiliations[1].institution_ids | https://openalex.org/I74973139 |
| authorships[3].affiliations[1].raw_affiliation_string | Carnegie Mellon University |
| authorships[3].institutions[0].id | https://openalex.org/I74973139 |
| authorships[3].institutions[0].ror | https://ror.org/05x2bcf33 |
| authorships[3].institutions[0].type | education |
| authorships[3].institutions[0].lineage | https://openalex.org/I74973139 |
| authorships[3].institutions[0].country_code | US |
| authorships[3].institutions[0].display_name | Carnegie Mellon University |
| authorships[3].author_position | middle |
| authorships[3].raw_author_name | Sohan Addagudi |
| authorships[3].is_corresponding | False |
| authorships[3].raw_affiliation_strings | Carnegie Mellon University, GenBio AI |
| authorships[4].author.id | https://openalex.org/A5042783792 |
| authorships[4].author.orcid | https://orcid.org/0000-0003-3308-9585 |
| authorships[4].author.display_name | Shentong Mo |
| authorships[4].countries | AE |
| authorships[4].affiliations[0].institution_ids | https://openalex.org/I4210113480 |
| authorships[4].affiliations[0].raw_affiliation_string | Mohamed bin Zayed University of Artificial Intelligence |
| authorships[4].affiliations[1].raw_affiliation_string | GenBio AI |
| authorships[4].institutions[0].id | https://openalex.org/I4210113480 |
| authorships[4].institutions[0].ror | https://ror.org/0258gkt32 |
| authorships[4].institutions[0].type | education |
| authorships[4].institutions[0].lineage | https://openalex.org/I4210113480 |
| authorships[4].institutions[0].country_code | AE |
| authorships[4].institutions[0].display_name | Mohamed bin Zayed University of Artificial Intelligence |
| authorships[4].author_position | middle |
| authorships[4].raw_author_name | Shentong Mo |
| authorships[4].is_corresponding | False |
| authorships[4].raw_affiliation_strings | GenBio AI, Mohamed bin Zayed University of Artificial Intelligence |
| authorships[5].author.id | https://openalex.org/A5081822522 |
| authorships[5].author.orcid | |
| authorships[5].author.display_name | Tianhua Tao |
| authorships[5].countries | US |
| authorships[5].affiliations[0].institution_ids | https://openalex.org/I201448701 |
| authorships[5].affiliations[0].raw_affiliation_string | University of Washington |
| authorships[5].affiliations[1].raw_affiliation_string | GenBio AI |
| authorships[5].institutions[0].id | https://openalex.org/I201448701 |
| authorships[5].institutions[0].ror | https://ror.org/00cvxb145 |
| authorships[5].institutions[0].type | education |
| authorships[5].institutions[0].lineage | https://openalex.org/I201448701 |
| authorships[5].institutions[0].country_code | US |
| authorships[5].institutions[0].display_name | University of Washington |
| authorships[5].author_position | middle |
| authorships[5].raw_author_name | Tianhua Tao |
| authorships[5].is_corresponding | False |
| authorships[5].raw_affiliation_strings | GenBio AI, University of Washington |
| authorships[6].author.id | https://openalex.org/A5100675443 |
| authorships[6].author.orcid | https://orcid.org/0000-0001-9968-1907 |
| authorships[6].author.display_name | Dian Li |
| authorships[6].affiliations[0].raw_affiliation_string | GenBio AI |
| authorships[6].author_position | middle |
| authorships[6].raw_author_name | Dian Li |
| authorships[6].is_corresponding | False |
| authorships[6].raw_affiliation_strings | GenBio AI |
| authorships[7].author.id | https://openalex.org/A5076407338 |
| authorships[7].author.orcid | |
| authorships[7].author.display_name | Yonghao Zhuang |
| authorships[7].countries | US |
| authorships[7].affiliations[0].raw_affiliation_string | GenBio AI |
| authorships[7].affiliations[1].institution_ids | https://openalex.org/I74973139 |
| authorships[7].affiliations[1].raw_affiliation_string | Carnegie Mellon University |
| authorships[7].institutions[0].id | https://openalex.org/I74973139 |
| authorships[7].institutions[0].ror | https://ror.org/05x2bcf33 |
| authorships[7].institutions[0].type | education |
| authorships[7].institutions[0].lineage | https://openalex.org/I74973139 |
| authorships[7].institutions[0].country_code | US |
| authorships[7].institutions[0].display_name | Carnegie Mellon University |
| authorships[7].author_position | middle |
| authorships[7].raw_author_name | Yonghao Zhuang |
| authorships[7].is_corresponding | False |
| authorships[7].raw_affiliation_strings | Carnegie Mellon University, GenBio AI |
| authorships[8].author.id | https://openalex.org/A5100701041 |
| authorships[8].author.orcid | https://orcid.org/0009-0006-0034-0074 |
| authorships[8].author.display_name | Hongyi Wang |
| authorships[8].affiliations[0].raw_affiliation_string | GenBio AI |
| authorships[8].author_position | middle |
| authorships[8].raw_author_name | Hongyi Wang |
| authorships[8].is_corresponding | False |
| authorships[8].raw_affiliation_strings | GenBio AI |
| authorships[9].author.id | https://openalex.org/A5034236827 |
| authorships[9].author.orcid | |
| authorships[9].author.display_name | Xingyi Cheng |
| authorships[9].countries | AE |
| authorships[9].affiliations[0].institution_ids | https://openalex.org/I4210113480 |
| authorships[9].affiliations[0].raw_affiliation_string | Mohamed bin Zayed University of Artificial Intelligence |
| authorships[9].affiliations[1].raw_affiliation_string | GenBio AI |
| authorships[9].institutions[0].id | https://openalex.org/I4210113480 |
| authorships[9].institutions[0].ror | https://ror.org/0258gkt32 |
| authorships[9].institutions[0].type | education |
| authorships[9].institutions[0].lineage | https://openalex.org/I4210113480 |
| authorships[9].institutions[0].country_code | AE |
| authorships[9].institutions[0].display_name | Mohamed bin Zayed University of Artificial Intelligence |
| authorships[9].author_position | middle |
| authorships[9].raw_author_name | Xingyi Cheng |
| authorships[9].is_corresponding | False |
| authorships[9].raw_affiliation_strings | GenBio AI, Mohamed bin Zayed University of Artificial Intelligence |
| authorships[10].author.id | https://openalex.org/A5030589527 |
| authorships[10].author.orcid | https://orcid.org/0000-0002-9655-2787 |
| authorships[10].author.display_name | Le Song |
| authorships[10].countries | AE |
| authorships[10].affiliations[0].institution_ids | https://openalex.org/I4210113480 |
| authorships[10].affiliations[0].raw_affiliation_string | Mohamed bin Zayed University of Artificial Intelligence |
| authorships[10].affiliations[1].raw_affiliation_string | GenBio AI |
| authorships[10].institutions[0].id | https://openalex.org/I4210113480 |
| authorships[10].institutions[0].ror | https://ror.org/0258gkt32 |
| authorships[10].institutions[0].type | education |
| authorships[10].institutions[0].lineage | https://openalex.org/I4210113480 |
| authorships[10].institutions[0].country_code | AE |
| authorships[10].institutions[0].display_name | Mohamed bin Zayed University of Artificial Intelligence |
| authorships[10].author_position | middle |
| authorships[10].raw_author_name | Le Song |
| authorships[10].is_corresponding | True |
| authorships[10].raw_affiliation_strings | GenBio AI, Mohamed bin Zayed University of Artificial Intelligence |
| authorships[11].author.id | https://openalex.org/A5009547049 |
| authorships[11].author.orcid | https://orcid.org/0009-0005-9158-4201 |
| authorships[11].author.display_name | Eric P. Xing |
| authorships[11].countries | AE, US |
| authorships[11].affiliations[0].institution_ids | https://openalex.org/I74973139 |
| authorships[11].affiliations[0].raw_affiliation_string | Carnegie Mellon University |
| authorships[11].affiliations[1].raw_affiliation_string | GenBio AI |
| authorships[11].affiliations[2].institution_ids | https://openalex.org/I4210113480 |
| authorships[11].affiliations[2].raw_affiliation_string | Mohamed bin Zayed University of Artificial Intelligence |
| authorships[11].institutions[0].id | https://openalex.org/I4210113480 |
| authorships[11].institutions[0].ror | https://ror.org/0258gkt32 |
| authorships[11].institutions[0].type | education |
| authorships[11].institutions[0].lineage | https://openalex.org/I4210113480 |
| authorships[11].institutions[0].country_code | AE |
| authorships[11].institutions[0].display_name | Mohamed bin Zayed University of Artificial Intelligence |
| authorships[11].institutions[1].id | https://openalex.org/I74973139 |
| authorships[11].institutions[1].ror | https://ror.org/05x2bcf33 |
| authorships[11].institutions[1].type | education |
| authorships[11].institutions[1].lineage | https://openalex.org/I74973139 |
| authorships[11].institutions[1].country_code | US |
| authorships[11].institutions[1].display_name | Carnegie Mellon University |
| authorships[11].author_position | last |
| authorships[11].raw_author_name | Eric P. Xing |
| authorships[11].is_corresponding | True |
| authorships[11].raw_affiliation_strings | Carnegie Mellon University, GenBio AI, Mohamed bin Zayed University of Artificial Intelligence |
| has_content.pdf | True |
| has_content.grobid_xml | False |
| is_paratext | False |
| open_access.is_oa | True |
| open_access.oa_url | https://www.biorxiv.org/content/biorxiv/early/2024/12/03/2024.11.28.625303.full.pdf |
| open_access.oa_status | green |
| open_access.any_repository_has_fulltext | False |
| created_date | 2025-10-10T00:00:00 |
| display_name | Scaling Dense Representations for Single Cell with Transcriptome-Scale Context |
| has_fulltext | False |
| is_retracted | False |
| updated_date | 2025-11-06T06:51:31.235846 |
| primary_topic.id | https://openalex.org/T11289 |
| primary_topic.field.id | https://openalex.org/fields/13 |
| primary_topic.field.display_name | Biochemistry, Genetics and Molecular Biology |
| primary_topic.score | 1.0 |
| primary_topic.domain.id | https://openalex.org/domains/1 |
| primary_topic.domain.display_name | Life Sciences |
| primary_topic.subfield.id | https://openalex.org/subfields/1312 |
| primary_topic.subfield.display_name | Molecular Biology |
| primary_topic.display_name | Single-cell and spatial transcriptomics |
| related_works | https://openalex.org/W4390516098, https://openalex.org/W141820298, https://openalex.org/W2181948922, https://openalex.org/W2384362569, https://openalex.org/W2049584446, https://openalex.org/W2079781215, https://openalex.org/W4378770497, https://openalex.org/W2142795561, https://openalex.org/W4308245303, https://openalex.org/W2014033564 |
| cited_by_count | 8 |
| counts_by_year[0].year | 2025 |
| counts_by_year[0].cited_by_count | 8 |
| locations_count | 1 |
| best_oa_location.id | doi:10.1101/2024.11.28.625303 |
| best_oa_location.is_oa | True |
| best_oa_location.source.id | https://openalex.org/S4306402567 |
| best_oa_location.source.issn | |
| best_oa_location.source.type | repository |
| best_oa_location.source.is_oa | False |
| best_oa_location.source.issn_l | |
| best_oa_location.source.is_core | False |
| best_oa_location.source.is_in_doaj | False |
| best_oa_location.source.display_name | bioRxiv (Cold Spring Harbor Laboratory) |
| best_oa_location.source.host_organization | https://openalex.org/I2750212522 |
| best_oa_location.source.host_organization_name | Cold Spring Harbor Laboratory |
| best_oa_location.source.host_organization_lineage | https://openalex.org/I2750212522 |
| best_oa_location.license | cc-by-nc-nd |
| best_oa_location.pdf_url | https://www.biorxiv.org/content/biorxiv/early/2024/12/03/2024.11.28.625303.full.pdf |
| best_oa_location.version | acceptedVersion |
| best_oa_location.raw_type | posted-content |
| best_oa_location.license_id | https://openalex.org/licenses/cc-by-nc-nd |
| best_oa_location.is_accepted | True |
| best_oa_location.is_published | False |
| best_oa_location.raw_source_name | |
| best_oa_location.landing_page_url | https://doi.org/10.1101/2024.11.28.625303 |
| primary_location.id | doi:10.1101/2024.11.28.625303 |
| primary_location.is_oa | True |
| primary_location.source.id | https://openalex.org/S4306402567 |
| primary_location.source.issn | |
| primary_location.source.type | repository |
| primary_location.source.is_oa | False |
| primary_location.source.issn_l | |
| primary_location.source.is_core | False |
| primary_location.source.is_in_doaj | False |
| primary_location.source.display_name | bioRxiv (Cold Spring Harbor Laboratory) |
| primary_location.source.host_organization | https://openalex.org/I2750212522 |
| primary_location.source.host_organization_name | Cold Spring Harbor Laboratory |
| primary_location.source.host_organization_lineage | https://openalex.org/I2750212522 |
| primary_location.license | cc-by-nc-nd |
| primary_location.pdf_url | https://www.biorxiv.org/content/biorxiv/early/2024/12/03/2024.11.28.625303.full.pdf |
| primary_location.version | acceptedVersion |
| primary_location.raw_type | posted-content |
| primary_location.license_id | https://openalex.org/licenses/cc-by-nc-nd |
| primary_location.is_accepted | True |
| primary_location.is_published | False |
| primary_location.raw_source_name | |
| primary_location.landing_page_url | https://doi.org/10.1101/2024.11.28.625303 |
| publication_date | 2024-12-03 |
| publication_year | 2024 |
| referenced_works | https://openalex.org/W4401736259, https://openalex.org/W4322718191, https://openalex.org/W4384918448, https://openalex.org/W2341128841, https://openalex.org/W4310942643, https://openalex.org/W4391262585, https://openalex.org/W2561754210, https://openalex.org/W4206247338, https://openalex.org/W4297243391, https://openalex.org/W4389132297, https://openalex.org/W4378838672, https://openalex.org/W4399387478, https://openalex.org/W4386553577, https://openalex.org/W4387696832, https://openalex.org/W4387866792, https://openalex.org/W4402694371, https://openalex.org/W4286750685, https://openalex.org/W4401157551, https://openalex.org/W2973727699, https://openalex.org/W3204998121, https://openalex.org/W2896457183, https://openalex.org/W4399317989, https://openalex.org/W4399657245, https://openalex.org/W4393399080, https://openalex.org/W3168867926, https://openalex.org/W2095705004, https://openalex.org/W4399470739, https://openalex.org/W3211595762, https://openalex.org/W4282937885, https://openalex.org/W4384523448, https://openalex.org/W4225591000, https://openalex.org/W4378505278, https://openalex.org/W2901677030, https://openalex.org/W2951506174, https://openalex.org/W2523369352, https://openalex.org/W4285596404, https://openalex.org/W4387122971, https://openalex.org/W4385955324, https://openalex.org/W4392168151 |
| referenced_works_count | 39 |
| abstract_inverted_index.a | 2, 9, 15, 86, 104, 127, 169 |
| abstract_inverted_index.3M | 212 |
| abstract_inverted_index.50 | 119 |
| abstract_inverted_index.To | 72 |
| abstract_inverted_index.an | 97 |
| abstract_inverted_index.as | 23, 25, 146, 191, 206 |
| abstract_inverted_index.in | 12, 55, 96, 188, 220, 230 |
| abstract_inverted_index.is | 8, 138 |
| abstract_inverted_index.of | 5, 17, 28, 51, 80, 106, 140, 159 |
| abstract_inverted_index.on | 118, 233 |
| abstract_inverted_index.or | 150 |
| abstract_inverted_index.to | 36, 48, 213 |
| abstract_inverted_index.we | 83, 207 |
| abstract_inverted_index.3M, | 107 |
| abstract_inverted_index.Our | 199 |
| abstract_inverted_index.RNA | 20 |
| abstract_inverted_index.and | 57, 76, 93, 110, 156, 178, 196, 224, 232 |
| abstract_inverted_index.are | 226 |
| abstract_inverted_index.for | 69, 89, 217 |
| abstract_inverted_index.has | 45, 62 |
| abstract_inverted_index.new | 34 |
| abstract_inverted_index.not | 63 |
| abstract_inverted_index.the | 49, 142, 160 |
| abstract_inverted_index.was | 172 |
| abstract_inverted_index.10M, | 108 |
| abstract_inverted_index.650M | 111 |
| abstract_inverted_index.This | 166 |
| abstract_inverted_index.[1]. | 101 |
| abstract_inverted_index.been | 46, 64 |
| abstract_inverted_index.code | 225 |
| abstract_inverted_index.data | 22 |
| abstract_inverted_index.from | 123, 211 |
| abstract_inverted_index.gene | 91, 130 |
| abstract_inverted_index.have | 32 |
| abstract_inverted_index.loss | 203 |
| abstract_inverted_index.such | 190 |
| abstract_inverted_index.text | 56 |
| abstract_inverted_index.this | 38 |
| abstract_inverted_index.thus | 153 |
| abstract_inverted_index.well | 24 |
| abstract_inverted_index.with | 66, 168 |
| abstract_inverted_index.100M, | 109 |
| abstract_inverted_index.650M, | 214 |
| abstract_inverted_index.Face. | 235 |
| abstract_inverted_index.cells | 122 |
| abstract_inverted_index.dense | 114 |
| abstract_inverted_index.human | 121, 144, 161 |
| abstract_inverted_index.input | 147 |
| abstract_inverted_index.large | 52 |
| abstract_inverted_index.mixed | 176 |
| abstract_inverted_index.model | 4 |
| abstract_inverted_index.rapid | 26, 42 |
| abstract_inverted_index.tasks | 189 |
| abstract_inverted_index.using | 126 |
| abstract_inverted_index.while | 59 |
| abstract_inverted_index.(100M) | 184 |
| abstract_inverted_index.Models | 223 |
| abstract_inverted_index.Unlike | 134 |
| abstract_inverted_index.entire | 143, 163 |
| abstract_inverted_index.future | 218 |
| abstract_inverted_index.longer | 170 |
| abstract_inverted_index.masked | 129 |
| abstract_inverted_index.models | 54, 116 |
| abstract_inverted_index.module | 88 |
| abstract_inverted_index.public | 18 |
| abstract_inverted_index.reveal | 201 |
| abstract_inverted_index.series | 105 |
| abstract_inverted_index.wealth | 16 |
| abstract_inverted_index.Digital | 99 |
| abstract_inverted_index.Hugging | 234 |
| abstract_inverted_index.address | 37 |
| abstract_inverted_index.avenues | 35 |
| abstract_inverted_index.capable | 139 |
| abstract_inverted_index.context | 171 |
| abstract_inverted_index.develop | 84 |
| abstract_inverted_index.diverse | 124 |
| abstract_inverted_index.enabled | 173 |
| abstract_inverted_index.general | 157 |
| abstract_inverted_index.images, | 58 |
| abstract_inverted_index.methods | 31 |
| abstract_inverted_index.million | 120 |
| abstract_inverted_index.models, | 136 |
| abstract_inverted_index.produce | 73 |
| abstract_inverted_index.results | 187 |
| abstract_inverted_index.scaling | 27, 44, 61, 204 |
| abstract_inverted_index.similar | 60 |
| abstract_inverted_index.success | 50 |
| abstract_inverted_index.systems | 7, 95, 181 |
| abstract_inverted_index.through | 174, 228 |
| abstract_inverted_index.tissues | 125 |
| abstract_inverted_index.tricks, | 152 |
| abstract_inverted_index.unified | 3 |
| abstract_inverted_index.without | 148 |
| abstract_inverted_index.Abstract | 0 |
| abstract_inverted_index.However, | 41 |
| abstract_inverted_index.Organism | 100 |
| abstract_inverted_index.accurate | 155 |
| abstract_inverted_index.achieves | 185 |
| abstract_inverted_index.biology. | 13 |
| abstract_inverted_index.cellular | 6, 70, 81, 94 |
| abstract_inverted_index.cell’s | 162 |
| abstract_inverted_index.contains | 103 |
| abstract_inverted_index.context. | 165 |
| abstract_inverted_index.findings | 200 |
| abstract_inverted_index.handling | 141 |
| abstract_inverted_index.increase | 208 |
| abstract_inverted_index.insights | 216 |
| abstract_inverted_index.language | 53 |
| abstract_inverted_index.learning | 30, 154 |
| abstract_inverted_index.previous | 135 |
| abstract_inverted_index.provided | 33 |
| abstract_inverted_index.sampling | 151 |
| abstract_inverted_index.systems, | 82 |
| abstract_inverted_index.AI-driven | 98 |
| abstract_inverted_index.AIDO.Cell | 102, 137, 183 |
| abstract_inverted_index.Recently, | 14 |
| abstract_inverted_index.accurate, | 74 |
| abstract_inverted_index.attempted | 65 |
| abstract_inverted_index.available | 227 |
| abstract_inverted_index.behaviors | 205 |
| abstract_inverted_index.canonical | 10 |
| abstract_inverted_index.cell-type | 194 |
| abstract_inverted_index.challenge | 11 |
| abstract_inverted_index.essential | 47 |
| abstract_inverted_index.modeling. | 71, 198, 222 |
| abstract_inverted_index.parameter | 43, 112 |
| abstract_inverted_index.providing | 215 |
| abstract_inverted_index.training. | 182 |
| abstract_inverted_index.zero-shot | 192 |
| abstract_inverted_index.AIDO.Cell, | 85 |
| abstract_inverted_index.Developing | 1 |
| abstract_inverted_index.challenge. | 40 |
| abstract_inverted_index.directions | 219 |
| abstract_inverted_index.expression | 92, 131 |
| abstract_inverted_index.meaningful | 78 |
| abstract_inverted_index.objective. | 133 |
| abstract_inverted_index.parameters | 210 |
| abstract_inverted_index.precision, | 177 |
| abstract_inverted_index.pretrained | 87 |
| abstract_inverted_index.sequencing | 21 |
| abstract_inverted_index.truncation | 149 |
| abstract_inverted_index.Transformer | 67, 115 |
| abstract_inverted_index.clustering, | 193 |
| abstract_inverted_index.distributed | 180 |
| abstract_inverted_index.interesting | 202 |
| abstract_inverted_index.large-scale | 179 |
| abstract_inverted_index.pre-trained | 117 |
| abstract_inverted_index.pretraining | 132, 167 |
| abstract_inverted_index.single-cell | 19, 221 |
| abstract_inverted_index.biologically | 77 |
| abstract_inverted_index.encoder-only | 113 |
| abstract_inverted_index.longstanding | 39 |
| abstract_inverted_index.perturbation | 197 |
| abstract_inverted_index.representing | 90 |
| abstract_inverted_index.AIDO.Cell’s | 209 |
| abstract_inverted_index.architectures | 68 |
| abstract_inverted_index.transcriptome | 145 |
| abstract_inverted_index.transferable, | 75 |
| abstract_inverted_index.ModelGenerator | 229 |
| abstract_inverted_index.classification, | 195 |
| abstract_inverted_index.representations | 79, 158 |
| abstract_inverted_index.self-supervised | 29 |
| abstract_inverted_index.transcriptional | 164 |
| abstract_inverted_index.read-depth-aware | 128 |
| abstract_inverted_index.state-of-the-art | 186 |
| abstract_inverted_index.FlashAttention-2, | 175 |
| abstract_inverted_index.https://github.com/genbio-ai/AIDO | 231 |
| cited_by_percentile_year.max | 99 |
| cited_by_percentile_year.min | 98 |
| corresponding_author_ids | https://openalex.org/A5030589527, https://openalex.org/A5009547049 |
| countries_distinct_count | 2 |
| institutions_distinct_count | 12 |
| corresponding_institution_ids | https://openalex.org/I4210113480, https://openalex.org/I74973139 |
| sustainable_development_goals[0].id | https://metadata.un.org/sdg/4 |
| sustainable_development_goals[0].score | 0.6399999856948853 |
| sustainable_development_goals[0].display_name | Quality Education |
| citation_normalized_percentile.value | 0.90362735 |
| citation_normalized_percentile.is_in_top_1_percent | False |
| citation_normalized_percentile.is_in_top_10_percent | True |
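
The `abstract_inverted_index.*` rows above map each abstract token to the zero-based word positions where it occurs. As a minimal sketch of how such an index can be turned back into running text, the Python below inverts the mapping and joins tokens in position order. The function name `rebuild_abstract` and the `sample` dictionary are illustrative assumptions, not part of any official client; the sample covers only positions 223 to 229 from the table, and for `and` it keeps just the one position that falls in that range.

```python
from typing import Dict, List


def rebuild_abstract(inverted_index: Dict[str, List[int]]) -> str:
    """Rebuild text from a token -> list-of-positions mapping (hypothetical helper)."""
    # Invert the mapping so each position points at its token.
    token_at: Dict[int, str] = {}
    for token, positions in inverted_index.items():
        for pos in positions:
            token_at[pos] = token
    # Emit tokens in ascending position order, separated by single spaces.
    return " ".join(token_at[pos] for pos in sorted(token_at))


# Subset of the rows above (positions 223-229 only).
sample = {
    "Models": [223],
    "and": [224],
    "code": [225],
    "are": [226],
    "available": [227],
    "through": [228],
    "ModelGenerator": [229],
}

print(rebuild_abstract(sample))
```

Sorting the positions rather than assuming a contiguous range means a partial index, like this sample, still yields its tokens in the right order; gaps are simply skipped.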