Multivariate regression with missing response data for modelling regional DNA methylation QTLs
2025 · Open Access · DOI: https://doi.org/10.48550/arxiv.2507.05990
Identifying genetic regulators of DNA methylation (mQTLs) with multivariate models enhances statistical power, but is challenged by missing data from bisulfite sequencing. Standard imputation-based methods can introduce bias, limiting reliable inference. We propose missoNet, a novel convex estimation framework that jointly estimates regression coefficients and the precision matrix from data with missing responses. By using unbiased surrogate estimators, our three-stage procedure avoids imputation while simultaneously performing variable selection and learning the conditional dependence structure among responses. We establish theoretical error bounds, and our simulations demonstrate that missoNet consistently outperforms existing methods in both prediction and sparsity recovery. In a real-world mQTL analysis of the CARTaGENE cohort, missoNet achieved superior predictive accuracy and false-discovery control on a held-out validation set, identifying known and credible novel genetic associations. The method offers a robust, efficient, and theoretically grounded tool for genomic analyses, and is available as an R package.
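The abstract does not spell out the surrogate estimators, so the following is only a minimal sketch of one common way to build unbiased cross-product estimates without imputation. It assumes responses are missing completely at random and that every response (and every pair of responses) is observed at least once. The function name `surrogate_cross_products` and the inverse-probability weighting used here are illustrative assumptions, not the missoNet package's API or the authors' exact construction.

```python
import numpy as np


def surrogate_cross_products(X, Y):
    """Estimate X'Y/n and Y'Y/n when Y contains NaNs, without imputing.

    Assumption (not from the source): entries of Y are missing completely
    at random, so zero-filled cross-products can be rescaled by the
    empirical observation rates to remove the bias from zero-filling.
    """
    X = np.asarray(X, dtype=float)
    Y = np.asarray(Y, dtype=float)
    n = X.shape[0]

    observed = ~np.isnan(Y)                  # True where a response is observed
    Z = np.where(observed, Y, 0.0)           # zero-fill the missing entries
    M = observed.astype(float)

    rho = M.mean(axis=0)                     # observation rate of each response
    pair_rho = M.T @ M / n                   # co-observation rate of each pair

    S_xy = (X.T @ Z / n) / rho               # column j is divided by rho[j]
    S_yy = (Z.T @ Z / n) / pair_rho          # entry (j, k) divided by pair_rho[j, k]
    return S_xy, S_yy


# Toy data: 4 samples, 1 predictor, 2 responses with one missing value each.
X = np.array([[1.0], [2.0], [3.0], [4.0]])
Y = np.array([[1.0, np.nan], [2.0, 1.0], [np.nan, 2.0], [4.0, 3.0]])
S_xy, S_yy = surrogate_cross_products(X, Y)
print(S_xy)
print(S_yy)
```

With these weights, each entry equals the average of the corresponding product over the rows where it is observed. A matrix assembled this way is not guaranteed to be positive semidefinite (the toy `S_yy` here is not), which is one reason a downstream estimation step has to be designed with such inputs in mind.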
- Type: preprint
- Language: en
- Landing Page: http://arxiv.org/abs/2507.05990, https://arxiv.org/pdf/2507.05990
- OA Status: green
- OpenAlex ID: https://openalex.org/W4416061787
Raw OpenAlex JSON
- OpenAlex ID: https://openalex.org/W4416061787 (canonical identifier for this work in OpenAlex)
- DOI: https://doi.org/10.48550/arxiv.2507.05990 (Digital Object Identifier)
- Title: Multivariate regression with missing response data for modelling regional DNA methylation QTLs (work title)
- Type: preprint (OpenAlex work type)
- Language: en (primary language)
- Publication year: 2025
- Publication date: 2025-07-08
- Authors: Shomoita Alam, Yixiao Zeng, Sasha Bernatsky, Marie Hudson, Inés Colmegna, David A. Stephens, Celia M.T. Greenwood, Archer Y. Yang (listed in author order)
- Landing page: https://arxiv.org/abs/2507.05990 (publisher landing page)
- PDF URL: https://arxiv.org/pdf/2507.05990 (direct link to full text PDF)
- Open access: Yes (a free full text is available)
- OA status: green (open access status per OpenAlex)
- OA URL: https://arxiv.org/pdf/2507.05990 (direct OA link)
- Cited by: 0 (total citation count in OpenAlex)
Full payload
| Field | Value |
|---|---|
| id | https://openalex.org/W4416061787 |
| doi | https://doi.org/10.48550/arxiv.2507.05990 |
| ids.doi | https://doi.org/10.48550/arxiv.2507.05990 |
| ids.openalex | https://openalex.org/W4416061787 |
| fwci | |
| type | preprint |
| title | Multivariate regression with missing response data for modelling regional DNA methylation QTLs |
| biblio.issue | |
| biblio.volume | |
| biblio.last_page | |
| biblio.first_page | |
| is_xpac | False |
| apc_list | |
| apc_paid | |
| language | en |
| locations[0].id | pmh:oai:arXiv.org:2507.05990 |
| locations[0].is_oa | True |
| locations[0].source.id | https://openalex.org/S4306400194 |
| locations[0].source.issn | |
| locations[0].source.type | repository |
| locations[0].source.is_oa | True |
| locations[0].source.issn_l | |
| locations[0].source.is_core | False |
| locations[0].source.is_in_doaj | False |
| locations[0].source.display_name | arXiv (Cornell University) |
| locations[0].source.host_organization | https://openalex.org/I205783295 |
| locations[0].source.host_organization_name | Cornell University |
| locations[0].source.host_organization_lineage | https://openalex.org/I205783295 |
| locations[0].license | |
| locations[0].pdf_url | https://arxiv.org/pdf/2507.05990 |
| locations[0].version | submittedVersion |
| locations[0].raw_type | text |
| locations[0].license_id | |
| locations[0].is_accepted | False |
| locations[0].is_published | False |
| locations[0].raw_source_name | |
| locations[0].landing_page_url | http://arxiv.org/abs/2507.05990 |
| locations[1].id | doi:10.48550/arxiv.2507.05990 |
| locations[1].is_oa | True |
| locations[1].source.id | https://openalex.org/S4306400194 |
| locations[1].source.issn | |
| locations[1].source.type | repository |
| locations[1].source.is_oa | True |
| locations[1].source.issn_l | |
| locations[1].source.is_core | False |
| locations[1].source.is_in_doaj | False |
| locations[1].source.display_name | arXiv (Cornell University) |
| locations[1].source.host_organization | https://openalex.org/I205783295 |
| locations[1].source.host_organization_name | Cornell University |
| locations[1].source.host_organization_lineage | https://openalex.org/I205783295 |
| locations[1].license | cc-by |
| locations[1].pdf_url | |
| locations[1].version | |
| locations[1].raw_type | article |
| locations[1].license_id | https://openalex.org/licenses/cc-by |
| locations[1].is_accepted | False |
| locations[1].is_published | |
| locations[1].raw_source_name | |
| locations[1].landing_page_url | https://doi.org/10.48550/arxiv.2507.05990 |
| indexed_in | arxiv, datacite |
| authorships[0].author.id | https://openalex.org/A5076379861 |
| authorships[0].author.orcid | |
| authorships[0].author.display_name | Shomoita Alam |
| authorships[0].author_position | first |
| authorships[0].raw_author_name | Alam, Shomoita |
| authorships[0].is_corresponding | False |
| authorships[1].author.id | https://openalex.org/A5027890723 |
| authorships[1].author.orcid | |
| authorships[1].author.display_name | Yixiao Zeng |
| authorships[1].author_position | middle |
| authorships[1].raw_author_name | Zeng, Yixiao |
| authorships[1].is_corresponding | False |
| authorships[2].author.id | https://openalex.org/A5052911433 |
| authorships[2].author.orcid | https://orcid.org/0000-0002-9515-2802 |
| authorships[2].author.display_name | Sasha Bernatsky |
| authorships[2].author_position | middle |
| authorships[2].raw_author_name | Bernatsky, Sasha |
| authorships[2].is_corresponding | False |
| authorships[3].author.id | https://openalex.org/A5105250440 |
| authorships[3].author.orcid | https://orcid.org/0000-0001-6718-2468 |
| authorships[3].author.display_name | Marie Hudson |
| authorships[3].author_position | middle |
| authorships[3].raw_author_name | Hudson, Marie |
| authorships[3].is_corresponding | False |
| authorships[4].author.id | https://openalex.org/A5089015395 |
| authorships[4].author.orcid | https://orcid.org/0000-0002-8091-8334 |
| authorships[4].author.display_name | Inés Colmegna |
| authorships[4].author_position | middle |
| authorships[4].raw_author_name | Colmegna, Inés |
| authorships[4].is_corresponding | False |
| authorships[5].author.id | https://openalex.org/A5085069223 |
| authorships[5].author.orcid | https://orcid.org/0000-0001-9811-7140 |
| authorships[5].author.display_name | David A. Stephens |
| authorships[5].author_position | middle |
| authorships[5].raw_author_name | Stephens, David A. |
| authorships[5].is_corresponding | False |
| authorships[6].author.id | https://openalex.org/A5010018617 |
| authorships[6].author.orcid | https://orcid.org/0000-0002-2427-5696 |
| authorships[6].author.display_name | Celia M.T. Greenwood |
| authorships[6].author_position | middle |
| authorships[6].raw_author_name | Greenwood, Celia M. T. |
| authorships[6].is_corresponding | False |
| authorships[7].author.id | https://openalex.org/A5012091034 |
| authorships[7].author.orcid | |
| authorships[7].author.display_name | Archer Y. Yang |
| authorships[7].author_position | last |
| authorships[7].raw_author_name | Yang, Archer Y. |
| authorships[7].is_corresponding | False |
| has_content.pdf | False |
| has_content.grobid_xml | False |
| is_paratext | False |
| open_access.is_oa | True |
| open_access.oa_url | https://arxiv.org/pdf/2507.05990 |
| open_access.oa_status | green |
| open_access.any_repository_has_fulltext | False |
| created_date | 2025-10-10T00:00:00 |
| display_name | Multivariate regression with missing response data for modelling regional DNA methylation QTLs |
| has_fulltext | False |
| is_retracted | False |
| updated_date | 2025-11-28T10:25:13.592616 |
| primary_topic | |
| cited_by_count | 0 |
| locations_count | 2 |
| best_oa_location.id | pmh:oai:arXiv.org:2507.05990 |
| best_oa_location.is_oa | True |
| best_oa_location.source.id | https://openalex.org/S4306400194 |
| best_oa_location.source.issn | |
| best_oa_location.source.type | repository |
| best_oa_location.source.is_oa | True |
| best_oa_location.source.issn_l | |
| best_oa_location.source.is_core | False |
| best_oa_location.source.is_in_doaj | False |
| best_oa_location.source.display_name | arXiv (Cornell University) |
| best_oa_location.source.host_organization | https://openalex.org/I205783295 |
| best_oa_location.source.host_organization_name | Cornell University |
| best_oa_location.source.host_organization_lineage | https://openalex.org/I205783295 |
| best_oa_location.license | |
| best_oa_location.pdf_url | https://arxiv.org/pdf/2507.05990 |
| best_oa_location.version | submittedVersion |
| best_oa_location.raw_type | text |
| best_oa_location.license_id | |
| best_oa_location.is_accepted | False |
| best_oa_location.is_published | False |
| best_oa_location.raw_source_name | |
| best_oa_location.landing_page_url | http://arxiv.org/abs/2507.05990 |
| primary_location.id | pmh:oai:arXiv.org:2507.05990 |
| primary_location.is_oa | True |
| primary_location.source.id | https://openalex.org/S4306400194 |
| primary_location.source.issn | |
| primary_location.source.type | repository |
| primary_location.source.is_oa | True |
| primary_location.source.issn_l | |
| primary_location.source.is_core | False |
| primary_location.source.is_in_doaj | False |
| primary_location.source.display_name | arXiv (Cornell University) |
| primary_location.source.host_organization | https://openalex.org/I205783295 |
| primary_location.source.host_organization_name | Cornell University |
| primary_location.source.host_organization_lineage | https://openalex.org/I205783295 |
| primary_location.license | |
| primary_location.pdf_url | https://arxiv.org/pdf/2507.05990 |
| primary_location.version | submittedVersion |
| primary_location.raw_type | text |
| primary_location.license_id | |
| primary_location.is_accepted | False |
| primary_location.is_published | False |
| primary_location.raw_source_name | |
| primary_location.landing_page_url | http://arxiv.org/abs/2507.05990 |
| publication_date | 2025-07-08 |
| publication_year | 2025 |
| referenced_works_count | 0 |
| abstract_inverted_index.R | 144 |
| abstract_inverted_index.a | 34, 98, 115, 129 |
| abstract_inverted_index.By | 53 |
| abstract_inverted_index.In | 97 |
| abstract_inverted_index.We | 31, 76 |
| abstract_inverted_index.an | 143 |
| abstract_inverted_index.as | 142 |
| abstract_inverted_index.by | 16 |
| abstract_inverted_index.in | 91 |
| abstract_inverted_index.is | 14, 140 |
| abstract_inverted_index.of | 3, 102 |
| abstract_inverted_index.on | 114 |
| abstract_inverted_index.DNA | 4 |
| abstract_inverted_index.The | 126 |
| abstract_inverted_index.and | 44, 68, 81, 94, 111, 121, 132, 139 |
| abstract_inverted_index.but | 13 |
| abstract_inverted_index.can | 25 |
| abstract_inverted_index.for | 136 |
| abstract_inverted_index.our | 58, 82 |
| abstract_inverted_index.the | 45, 70, 103 |
| abstract_inverted_index.both | 92 |
| abstract_inverted_index.data | 18, 49 |
| abstract_inverted_index.from | 19, 48 |
| abstract_inverted_index.mQTL | 100 |
| abstract_inverted_index.set, | 118 |
| abstract_inverted_index.that | 39, 85 |
| abstract_inverted_index.tool | 135 |
| abstract_inverted_index.with | 7, 50 |
| abstract_inverted_index.among | 74 |
| abstract_inverted_index.bias, | 27 |
| abstract_inverted_index.error | 79 |
| abstract_inverted_index.known | 120 |
| abstract_inverted_index.novel | 35, 123 |
| abstract_inverted_index.using | 54 |
| abstract_inverted_index.while | 63 |
| abstract_inverted_index.avoids | 61 |
| abstract_inverted_index.convex | 36 |
| abstract_inverted_index.matrix | 47 |
| abstract_inverted_index.method | 127 |
| abstract_inverted_index.models | 9 |
| abstract_inverted_index.offers | 128 |
| abstract_inverted_index.power, | 12 |
| abstract_inverted_index.(mQTLs) | 6 |
| abstract_inverted_index.bounds, | 80 |
| abstract_inverted_index.cohort, | 105 |
| abstract_inverted_index.control | 113 |
| abstract_inverted_index.genetic | 1, 124 |
| abstract_inverted_index.genomic | 137 |
| abstract_inverted_index.jointly | 40 |
| abstract_inverted_index.methods | 24, 90 |
| abstract_inverted_index.missing | 17, 51 |
| abstract_inverted_index.propose | 32 |
| abstract_inverted_index.robust, | 130 |
| abstract_inverted_index.Standard | 22 |
| abstract_inverted_index.accuracy | 110 |
| abstract_inverted_index.achieved | 107 |
| abstract_inverted_index.analysis | 101 |
| abstract_inverted_index.credible | 122 |
| abstract_inverted_index.enhances | 10 |
| abstract_inverted_index.existing | 89 |
| abstract_inverted_index.grounded | 134 |
| abstract_inverted_index.held-out | 116 |
| abstract_inverted_index.learning | 69 |
| abstract_inverted_index.limiting | 28 |
| abstract_inverted_index.package. | 145 |
| abstract_inverted_index.reliable | 29 |
| abstract_inverted_index.sparsity | 95 |
| abstract_inverted_index.superior | 108 |
| abstract_inverted_index.unbiased | 55 |
| abstract_inverted_index.variable | 66 |
| abstract_inverted_index.CARTaGENE | 104 |
| abstract_inverted_index.analyses, | 138 |
| abstract_inverted_index.available | 141 |
| abstract_inverted_index.bisulfite | 20 |
| abstract_inverted_index.establish | 77 |
| abstract_inverted_index.estimates | 41 |
| abstract_inverted_index.framework | 38 |
| abstract_inverted_index.introduce | 26 |
| abstract_inverted_index.precision | 46 |
| abstract_inverted_index.procedure | 60 |
| abstract_inverted_index.recovery. | 96 |
| abstract_inverted_index.selection | 67 |
| abstract_inverted_index.structure | 73 |
| abstract_inverted_index.surrogate | 56 |
| abstract_inverted_index.challenged | 15 |
| abstract_inverted_index.dependence | 72 |
| abstract_inverted_index.efficient, | 131 |
| abstract_inverted_index.estimation | 37 |
| abstract_inverted_index.imputation | 62 |
| abstract_inverted_index.inference. | 30 |
| abstract_inverted_index.performing | 65 |
| abstract_inverted_index.prediction | 93 |
| abstract_inverted_index.predictive | 109 |
| abstract_inverted_index.real-world | 99 |
| abstract_inverted_index.regression | 42 |
| abstract_inverted_index.regulators | 2 |
| abstract_inverted_index.responses. | 52, 75 |
| abstract_inverted_index.validation | 117 |
| abstract_inverted_index.Identifying | 0 |
| abstract_inverted_index.conditional | 71 |
| abstract_inverted_index.demonstrate | 84 |
| abstract_inverted_index.estimators, | 57 |
| abstract_inverted_index.identifying | 119 |
| abstract_inverted_index.methylation | 5 |
| abstract_inverted_index.outperforms | 88 |
| abstract_inverted_index.sequencing. | 21 |
| abstract_inverted_index.simulations | 83 |
| abstract_inverted_index.statistical | 11 |
| abstract_inverted_index.theoretical | 78 |
| abstract_inverted_index.three-stage | 59 |
| abstract_inverted_index.coefficients | 43 |
| abstract_inverted_index.consistently | 87 |
| abstract_inverted_index.multivariate | 8 |
| abstract_inverted_index.associations. | 125 |
| abstract_inverted_index.theoretically | 133 |
| abstract_inverted_index.simultaneously | 64 |
| abstract_inverted_index.false-discovery | 112 |
| abstract_inverted_index.imputation-based | 23 |
| abstract_inverted_index.\texttt{missoNet} | 86, 106 |
| abstract_inverted_index.\texttt{missoNet}, | 33 |
| cited_by_percentile_year | |
| countries_distinct_count | 0 |
| institutions_distinct_count | 8 |
| citation_normalized_percentile |
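
The `abstract_inverted_index.*` rows in the payload store the abstract as a map from each token to the word positions where it occurs. As a minimal sketch, the text can be rebuilt by inverting that map and joining tokens in position order. The helper name `rebuild_abstract` is hypothetical, and the sample dictionary copies only a handful of rows from the table, so the result is truncated to the first seven words.

```python
def rebuild_abstract(inverted_index):
    """Rebuild text from a {token: [positions]} inverted index."""
    by_position = {}
    for token, positions in inverted_index.items():
        for pos in positions:
            by_position[pos] = token
    # Join the tokens in ascending position order.
    return " ".join(by_position[pos] for pos in sorted(by_position))


# A few rows copied from the payload table (positions as listed there).
sample_index = {
    "Identifying": [0],
    "genetic": [1, 124],
    "regulators": [2],
    "of": [3, 102],
    "DNA": [4],
    "methylation": [5],
    "(mQTLs)": [6],
}

# Positions 102 and 124 sort after 0..6, so the first seven words are contiguous.
first_words = " ".join(rebuild_abstract(sample_index).split()[:7])
print(first_words)  # Identifying genetic regulators of DNA methylation (mQTLs)
```

Applied to the complete set of `abstract_inverted_index.*` rows, the same function would be expected to reproduce the abstract shown at the top of this record, with `\texttt{missoNet}` kept in its raw LaTeX form as stored in the index.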