Incorporating External Controls for Estimating the Average Treatment Effect on the Treated with High-Dimensional Data: Retaining Double Robustness and Ensuring Double Safety
2025 · Open Access · DOI: https://doi.org/10.48550/arxiv.2509.20586
Randomized controlled trials (RCTs) are widely regarded as the gold standard for causal inference in biomedical research. For instance, when estimating the average treatment effect on the treated (ATT), a doubly robust estimation procedure can be applied, requiring either the propensity score model or the control outcome model to be correctly specified. In this paper, we address scenarios where external control data, often with a much larger sample size, are available. Such data are typically easier to obtain from historical records or third-party sources. However, we find that incorporating external controls into the standard doubly robust estimator for ATT may paradoxically result in reduced efficiency compared to using the estimator without external controls. This counterintuitive outcome suggests that the naive incorporation of external controls could be detrimental to estimation efficiency. To resolve this issue, we propose a novel doubly robust estimator that guarantees higher efficiency than the standard approach without external controls, even under model misspecification. When all models are correctly specified, this estimator aligns with the standard doubly robust estimator that incorporates external controls and achieves semiparametric efficiency. The asymptotic theory developed in this work applies to high-dimensional confounder settings, which are increasingly common with the growing prevalence of electronic health record data. We demonstrate the effectiveness of our methodology through extensive simulation studies and a real-world data application.
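
To make the double robustness property mentioned in the abstract concrete, here is a minimal Python sketch of the textbook doubly robust ATT estimator without external controls. It is not the estimator proposed in the paper, whose details are not given in this record. The function name `dr_att`, its arguments, and the simulated data are assumptions made purely for illustration; the nuisance values (`pi_hat` for the propensity score, `m0_hat` for the control outcome model) are taken as already fitted.

```python
import numpy as np

def dr_att(y, a, pi_hat, m0_hat):
    """Textbook doubly robust ATT estimate from fitted nuisance values.

    y      : observed outcomes
    a      : treatment indicator (1 = treated, 0 = control)
    pi_hat : estimated propensity scores P(A = 1 | X)
    m0_hat : estimated control outcome means E[Y | A = 0, X]
    """
    y, a, pi_hat, m0_hat = (np.asarray(v, dtype=float) for v in (y, a, pi_hat, m0_hat))
    resid = y - m0_hat                 # residual from the control outcome model
    odds = pi_hat / (1.0 - pi_hat)     # weight that maps controls onto the treated population
    n_treated = a.sum()
    return float(np.sum(a * resid - (1.0 - a) * odds * resid) / n_treated)

# Illustrative simulation (hypothetical data): the true ATT is 2.
rng = np.random.default_rng(0)
n = 5000
x = rng.normal(size=n)
pi_true = 1.0 / (1.0 + np.exp(-x))     # true propensity score
a = rng.binomial(1, pi_true)
y = x + 2.0 * a + rng.normal(size=n)   # true control outcome mean is x

att_both = dr_att(y, a, pi_true, x)                    # both nuisance models correct
att_bad_outcome = dr_att(y, a, pi_true, np.zeros(n))   # outcome model wrong, propensity correct
print(att_both, att_bad_outcome)
```

Both calls should land near the true value of 2 in this simulation, which illustrates why only one of the two nuisance models needs to be correct for consistency. The efficiency comparison with and without external controls, which is the paper's main subject, is not reproduced here.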
- Type: preprint
- Language: en
- Landing Page: http://arxiv.org/abs/2509.20586 (PDF: https://arxiv.org/pdf/2509.20586)
- OA Status: green
- OpenAlex ID: https://openalex.org/W4414787978
Raw OpenAlex JSON
- OpenAlex ID: https://openalex.org/W4414787978
- DOI: https://doi.org/10.48550/arxiv.2509.20586
- Title: Incorporating External Controls for Estimating the Average Treatment Effect on the Treated with High-Dimensional Data: Retaining Double Robustness and Ensuring Double Safety
- Type: preprint
- Language: en
- Publication year: 2025
- Publication date: 2025-09-24
- Authors: Chi-Shian Dai, Chao Ying, Yang Ning, Jiwei Zhao
- Landing page: https://arxiv.org/abs/2509.20586
- PDF URL: https://arxiv.org/pdf/2509.20586
- Open access: Yes
- OA status: green
- OA URL: https://arxiv.org/pdf/2509.20586
- Cited by: 0
Full payload
| Field | Value |
|---|---|
| id | https://openalex.org/W4414787978 |
| doi | https://doi.org/10.48550/arxiv.2509.20586 |
| ids.doi | https://doi.org/10.48550/arxiv.2509.20586 |
| ids.openalex | https://openalex.org/W4414787978 |
| fwci | |
| type | preprint |
| title | Incorporating External Controls for Estimating the Average Treatment Effect on the Treated with High-Dimensional Data: Retaining Double Robustness and Ensuring Double Safety |
| biblio.issue | |
| biblio.volume | |
| biblio.last_page | |
| biblio.first_page | |
| topics[0].id | https://openalex.org/T10876 |
| topics[0].field.id | https://openalex.org/fields/22 |
| topics[0].field.display_name | Engineering |
| topics[0].score | 0.7871000170707703 |
| topics[0].domain.id | https://openalex.org/domains/3 |
| topics[0].domain.display_name | Physical Sciences |
| topics[0].subfield.id | https://openalex.org/subfields/2207 |
| topics[0].subfield.display_name | Control and Systems Engineering |
| topics[0].display_name | Fault Detection and Control Systems |
| is_xpac | False |
| apc_list | |
| apc_paid | |
| language | en |
| locations[0].id | pmh:oai:arXiv.org:2509.20586 |
| locations[0].is_oa | True |
| locations[0].source.id | https://openalex.org/S4306400194 |
| locations[0].source.issn | |
| locations[0].source.type | repository |
| locations[0].source.is_oa | True |
| locations[0].source.issn_l | |
| locations[0].source.is_core | False |
| locations[0].source.is_in_doaj | False |
| locations[0].source.display_name | arXiv (Cornell University) |
| locations[0].source.host_organization | https://openalex.org/I205783295 |
| locations[0].source.host_organization_name | Cornell University |
| locations[0].source.host_organization_lineage | https://openalex.org/I205783295 |
| locations[0].license | |
| locations[0].pdf_url | https://arxiv.org/pdf/2509.20586 |
| locations[0].version | submittedVersion |
| locations[0].raw_type | text |
| locations[0].license_id | |
| locations[0].is_accepted | False |
| locations[0].is_published | False |
| locations[0].raw_source_name | |
| locations[0].landing_page_url | http://arxiv.org/abs/2509.20586 |
| locations[1].id | doi:10.48550/arxiv.2509.20586 |
| locations[1].is_oa | True |
| locations[1].source.id | https://openalex.org/S4306400194 |
| locations[1].source.issn | |
| locations[1].source.type | repository |
| locations[1].source.is_oa | True |
| locations[1].source.issn_l | |
| locations[1].source.is_core | False |
| locations[1].source.is_in_doaj | False |
| locations[1].source.display_name | arXiv (Cornell University) |
| locations[1].source.host_organization | https://openalex.org/I205783295 |
| locations[1].source.host_organization_name | Cornell University |
| locations[1].source.host_organization_lineage | https://openalex.org/I205783295 |
| locations[1].license | cc-by |
| locations[1].pdf_url | |
| locations[1].version | |
| locations[1].raw_type | article |
| locations[1].license_id | https://openalex.org/licenses/cc-by |
| locations[1].is_accepted | False |
| locations[1].is_published | |
| locations[1].raw_source_name | |
| locations[1].landing_page_url | https://doi.org/10.48550/arxiv.2509.20586 |
| indexed_in | arxiv, datacite |
| authorships[0].author.id | https://openalex.org/A5013760027 |
| authorships[0].author.orcid | |
| authorships[0].author.display_name | Chi-Shian Dai |
| authorships[0].author_position | first |
| authorships[0].raw_author_name | Dai, Chi-Shian |
| authorships[0].is_corresponding | False |
| authorships[1].author.id | https://openalex.org/A5112699690 |
| authorships[1].author.orcid | |
| authorships[1].author.display_name | Chao Ying |
| authorships[1].author_position | middle |
| authorships[1].raw_author_name | Ying, Chao |
| authorships[1].is_corresponding | False |
| authorships[2].author.id | https://openalex.org/A5070406375 |
| authorships[2].author.orcid | https://orcid.org/0000-0001-6877-9231 |
| authorships[2].author.display_name | Yang Ning |
| authorships[2].author_position | middle |
| authorships[2].raw_author_name | Ning, Yang |
| authorships[2].is_corresponding | False |
| authorships[3].author.id | https://openalex.org/A5100960844 |
| authorships[3].author.orcid | |
| authorships[3].author.display_name | Jiwei Zhao |
| authorships[3].author_position | last |
| authorships[3].raw_author_name | Zhao, Jiwei |
| authorships[3].is_corresponding | False |
| has_content.pdf | False |
| has_content.grobid_xml | False |
| is_paratext | False |
| open_access.is_oa | True |
| open_access.oa_url | https://arxiv.org/pdf/2509.20586 |
| open_access.oa_status | green |
| open_access.any_repository_has_fulltext | False |
| created_date | 2025-10-10T00:00:00 |
| display_name | Incorporating External Controls for Estimating the Average Treatment Effect on the Treated with High-Dimensional Data: Retaining Double Robustness and Ensuring Double Safety |
| has_fulltext | False |
| is_retracted | False |
| updated_date | 2025-11-06T06:51:31.235846 |
| primary_topic.id | https://openalex.org/T10876 |
| primary_topic.field.id | https://openalex.org/fields/22 |
| primary_topic.field.display_name | Engineering |
| primary_topic.score | 0.7871000170707703 |
| primary_topic.domain.id | https://openalex.org/domains/3 |
| primary_topic.domain.display_name | Physical Sciences |
| primary_topic.subfield.id | https://openalex.org/subfields/2207 |
| primary_topic.subfield.display_name | Control and Systems Engineering |
| primary_topic.display_name | Fault Detection and Control Systems |
| cited_by_count | 0 |
| locations_count | 2 |
| best_oa_location.id | pmh:oai:arXiv.org:2509.20586 |
| best_oa_location.is_oa | True |
| best_oa_location.source.id | https://openalex.org/S4306400194 |
| best_oa_location.source.issn | |
| best_oa_location.source.type | repository |
| best_oa_location.source.is_oa | True |
| best_oa_location.source.issn_l | |
| best_oa_location.source.is_core | False |
| best_oa_location.source.is_in_doaj | False |
| best_oa_location.source.display_name | arXiv (Cornell University) |
| best_oa_location.source.host_organization | https://openalex.org/I205783295 |
| best_oa_location.source.host_organization_name | Cornell University |
| best_oa_location.source.host_organization_lineage | https://openalex.org/I205783295 |
| best_oa_location.license | |
| best_oa_location.pdf_url | https://arxiv.org/pdf/2509.20586 |
| best_oa_location.version | submittedVersion |
| best_oa_location.raw_type | text |
| best_oa_location.license_id | |
| best_oa_location.is_accepted | False |
| best_oa_location.is_published | False |
| best_oa_location.raw_source_name | |
| best_oa_location.landing_page_url | http://arxiv.org/abs/2509.20586 |
| primary_location.id | pmh:oai:arXiv.org:2509.20586 |
| primary_location.is_oa | True |
| primary_location.source.id | https://openalex.org/S4306400194 |
| primary_location.source.issn | |
| primary_location.source.type | repository |
| primary_location.source.is_oa | True |
| primary_location.source.issn_l | |
| primary_location.source.is_core | False |
| primary_location.source.is_in_doaj | False |
| primary_location.source.display_name | arXiv (Cornell University) |
| primary_location.source.host_organization | https://openalex.org/I205783295 |
| primary_location.source.host_organization_name | Cornell University |
| primary_location.source.host_organization_lineage | https://openalex.org/I205783295 |
| primary_location.license | |
| primary_location.pdf_url | https://arxiv.org/pdf/2509.20586 |
| primary_location.version | submittedVersion |
| primary_location.raw_type | text |
| primary_location.license_id | |
| primary_location.is_accepted | False |
| primary_location.is_published | False |
| primary_location.raw_source_name | |
| primary_location.landing_page_url | http://arxiv.org/abs/2509.20586 |
| publication_date | 2025-09-24 |
| publication_year | 2025 |
| referenced_works_count | 0 |
| abstract_inverted_index.a | 29, 64, 136, 216 |
| abstract_inverted_index.In | 52 |
| abstract_inverted_index.To | 130 |
| abstract_inverted_index.We | 204 |
| abstract_inverted_index.as | 7 |
| abstract_inverted_index.be | 35, 49, 125 |
| abstract_inverted_index.in | 14, 102, 183 |
| abstract_inverted_index.of | 121, 199, 208 |
| abstract_inverted_index.on | 25 |
| abstract_inverted_index.or | 43, 81 |
| abstract_inverted_index.to | 48, 76, 106, 127, 187 |
| abstract_inverted_index.we | 55, 85, 134 |
| abstract_inverted_index.ATT | 98 |
| abstract_inverted_index.For | 17 |
| abstract_inverted_index.The | 179 |
| abstract_inverted_index.all | 157 |
| abstract_inverted_index.and | 175, 215 |
| abstract_inverted_index.are | 4, 69, 73, 159, 192 |
| abstract_inverted_index.can | 34 |
| abstract_inverted_index.for | 11, 97 |
| abstract_inverted_index.may | 99 |
| abstract_inverted_index.our | 209 |
| abstract_inverted_index.the | 8, 21, 26, 39, 44, 92, 108, 118, 146, 166, 196, 206 |
| abstract_inverted_index.Such | 71 |
| abstract_inverted_index.This | 113 |
| abstract_inverted_index.When | 156 |
| abstract_inverted_index.data | 72, 218 |
| abstract_inverted_index.even | 152 |
| abstract_inverted_index.find | 86 |
| abstract_inverted_index.from | 78 |
| abstract_inverted_index.gold | 9 |
| abstract_inverted_index.into | 91 |
| abstract_inverted_index.much | 65 |
| abstract_inverted_index.than | 145 |
| abstract_inverted_index.that | 87, 117, 141, 171 |
| abstract_inverted_index.this | 53, 132, 162, 184 |
| abstract_inverted_index.when | 19 |
| abstract_inverted_index.with | 63, 165, 195 |
| abstract_inverted_index.work | 185 |
| abstract_inverted_index.could | 124 |
| abstract_inverted_index.data, | 61 |
| abstract_inverted_index.data. | 203 |
| abstract_inverted_index.model | 42, 47, 154 |
| abstract_inverted_index.naive | 119 |
| abstract_inverted_index.novel | 137 |
| abstract_inverted_index.often | 62 |
| abstract_inverted_index.score | 41 |
| abstract_inverted_index.size, | 68 |
| abstract_inverted_index.under | 153 |
| abstract_inverted_index.using | 107 |
| abstract_inverted_index.where | 58 |
| abstract_inverted_index.which | 191 |
| abstract_inverted_index.(ATT), | 28 |
| abstract_inverted_index.(RCTs) | 3 |
| abstract_inverted_index.aligns | 164 |
| abstract_inverted_index.causal | 12 |
| abstract_inverted_index.common | 194 |
| abstract_inverted_index.doubly | 30, 94, 138, 168 |
| abstract_inverted_index.easier | 75 |
| abstract_inverted_index.effect | 24 |
| abstract_inverted_index.either | 38 |
| abstract_inverted_index.health | 201 |
| abstract_inverted_index.higher | 143 |
| abstract_inverted_index.issue, | 133 |
| abstract_inverted_index.larger | 66 |
| abstract_inverted_index.models | 158 |
| abstract_inverted_index.obtain | 77 |
| abstract_inverted_index.paper, | 54 |
| abstract_inverted_index.record | 202 |
| abstract_inverted_index.result | 101 |
| abstract_inverted_index.robust | 31, 95, 139, 169 |
| abstract_inverted_index.sample | 67 |
| abstract_inverted_index.theory | 181 |
| abstract_inverted_index.trials | 2 |
| abstract_inverted_index.widely | 5 |
| abstract_inverted_index.address | 56 |
| abstract_inverted_index.applies | 186 |
| abstract_inverted_index.average | 22 |
| abstract_inverted_index.control | 45, 60 |
| abstract_inverted_index.growing | 197 |
| abstract_inverted_index.outcome | 46, 115 |
| abstract_inverted_index.propose | 135 |
| abstract_inverted_index.records | 80 |
| abstract_inverted_index.reduced | 103 |
| abstract_inverted_index.resolve | 131 |
| abstract_inverted_index.studies | 214 |
| abstract_inverted_index.through | 211 |
| abstract_inverted_index.treated | 27 |
| abstract_inverted_index.without | 110, 149 |
| abstract_inverted_index.However, | 84 |
| abstract_inverted_index.achieves | 176 |
| abstract_inverted_index.applied, | 36 |
| abstract_inverted_index.approach | 148 |
| abstract_inverted_index.compared | 105 |
| abstract_inverted_index.controls | 90, 123, 174 |
| abstract_inverted_index.external | 59, 89, 111, 122, 150, 173 |
| abstract_inverted_index.regarded | 6 |
| abstract_inverted_index.sources. | 83 |
| abstract_inverted_index.standard | 10, 93, 147, 167 |
| abstract_inverted_index.suggests | 116 |
| abstract_inverted_index.controls, | 151 |
| abstract_inverted_index.controls. | 112 |
| abstract_inverted_index.correctly | 50, 160 |
| abstract_inverted_index.developed | 182 |
| abstract_inverted_index.estimator | 96, 109, 140, 163, 170 |
| abstract_inverted_index.extensive | 212 |
| abstract_inverted_index.inference | 13 |
| abstract_inverted_index.instance, | 18 |
| abstract_inverted_index.procedure | 33 |
| abstract_inverted_index.requiring | 37 |
| abstract_inverted_index.research. | 16 |
| abstract_inverted_index.scenarios | 57 |
| abstract_inverted_index.settings, | 190 |
| abstract_inverted_index.treatment | 23 |
| abstract_inverted_index.typically | 74 |
| abstract_inverted_index.Randomized | 0 |
| abstract_inverted_index.asymptotic | 180 |
| abstract_inverted_index.available. | 70 |
| abstract_inverted_index.biomedical | 15 |
| abstract_inverted_index.confounder | 189 |
| abstract_inverted_index.controlled | 1 |
| abstract_inverted_index.efficiency | 104, 144 |
| abstract_inverted_index.electronic | 200 |
| abstract_inverted_index.estimating | 20 |
| abstract_inverted_index.estimation | 32, 128 |
| abstract_inverted_index.guarantees | 142 |
| abstract_inverted_index.historical | 79 |
| abstract_inverted_index.prevalence | 198 |
| abstract_inverted_index.propensity | 40 |
| abstract_inverted_index.real-world | 217 |
| abstract_inverted_index.simulation | 213 |
| abstract_inverted_index.specified, | 161 |
| abstract_inverted_index.specified. | 51 |
| abstract_inverted_index.demonstrate | 205 |
| abstract_inverted_index.detrimental | 126 |
| abstract_inverted_index.efficiency. | 129, 178 |
| abstract_inverted_index.methodology | 210 |
| abstract_inverted_index.third-party | 82 |
| abstract_inverted_index.application. | 219 |
| abstract_inverted_index.incorporates | 172 |
| abstract_inverted_index.increasingly | 193 |
| abstract_inverted_index.effectiveness | 207 |
| abstract_inverted_index.incorporating | 88 |
| abstract_inverted_index.incorporation | 120 |
| abstract_inverted_index.paradoxically | 100 |
| abstract_inverted_index.semiparametric | 177 |
| abstract_inverted_index.counterintuitive | 114 |
| abstract_inverted_index.high-dimensional | 188 |
| abstract_inverted_index.misspecification. | 155 |
| cited_by_percentile_year | |
| countries_distinct_count | 0 |
| institutions_distinct_count | 4 |
| citation_normalized_percentile |
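
The `abstract_inverted_index.*` rows in the payload store the abstract as a map from each word to the positions where it occurs. A minimal Python sketch of turning such a map back into running text follows, assuming the payload has already been parsed into a dictionary. The helper name `rebuild_abstract` is hypothetical, and `sample` holds only the first few positions copied from the table, not the full index.

```python
def rebuild_abstract(inverted_index):
    """Rebuild text from a {word: [positions]} inverted index."""
    by_position = {}
    for word, positions in inverted_index.items():
        for pos in positions:
            by_position[pos] = word
    # Join the words in ascending position order.
    return " ".join(by_position[pos] for pos in sorted(by_position))

# Subset of the index above (positions 0 to 7 only).
sample = {
    "Randomized": [0],
    "controlled": [1],
    "trials": [2],
    "(RCTs)": [3],
    "are": [4],
    "widely": [5],
    "regarded": [6],
    "as": [7],
}
print(rebuild_abstract(sample))
# prints "Randomized controlled trials (RCTs) are widely regarded as"
```

Applied to the complete index, the same function should reproduce the abstract shown at the top of this record, since each position from 0 to 219 appears under exactly one word.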