Replication Data for: The Balance-Sample Size Frontier in Matching Methods for Causal Inference
We propose a simplified approach to matching for causal inference that simultaneously optimizes both balance (similarity between the treated and control groups) and matched sample size. Existing approaches either fix the matched sample size and maximize balance or fix balance and maximize sample size, leaving analysts to settle for suboptimal solutions or attempt manual optimization by iteratively tweaking their matching method and rechecking balance. To jointly maximize balance and sample size, we introduce the matching frontier, the set of matching solutions with maximum balance for each possible sample size. Rather than iterating, researchers can choose matching solutions from the frontier for analysis in one step. We derive fast algorithms that calculate the matching frontier for several commonly used balance metrics. We demonstrate the approach with analyses of the effect of sex on judging and of job training programs, showing how the methods we introduce can extract new knowledge from existing data sets.
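The idea of a frontier (best achievable balance at each sample size) can be illustrated with a deliberately simplified sketch. This is not the authors' algorithm or one of their balance metrics: it assumes a single covariate, keeps all treated units, and measures imbalance as the mean distance from each retained control unit to its nearest treated unit. The function name `matching_frontier` and the toy data are hypothetical. Under these assumptions each control unit's distance is fixed, so the best subset of size n is simply the n closest controls.

```python
from typing import List, Tuple


def matching_frontier(treated: List[float], control: List[float]) -> List[Tuple[int, float]]:
    """Return (controls kept, mean imbalance) for every possible sample size.

    Assumed imbalance metric (illustrative only): the mean absolute distance
    from each retained control unit to its nearest treated unit.
    """
    # Distance from each control unit to its closest treated unit, smallest first.
    dists = sorted(min(abs(c - t) for t in treated) for c in control)

    frontier = []
    running_total = 0.0
    for n, d in enumerate(dists, start=1):
        # Keeping the n closest controls minimizes the mean for sample size n.
        running_total += d
        frontier.append((n, running_total / n))
    return frontier


# Hypothetical toy data: one covariate value per unit.
treated = [1.0, 2.0, 3.0]
control = [1.1, 2.5, 3.0, 7.0, 10.0]

frontier = matching_frontier(treated, control)
for n, imbalance in frontier:
    print(n, round(imbalance, 3))
```

Each point on the resulting list is one candidate matched sample. An analyst would pick a point that trades imbalance against sample size in one step, rather than repeatedly re-running a matching procedure and rechecking balance.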
Related Topics
- Type: dataset
- Language: en
- Landing Page: https://doi.org/10.7910/DVN/SURSEO
- OA Status: green
- Cited By: 1
- Related Works: 10
- OpenAlex ID: https://openalex.org/W4398893941
Raw OpenAlex JSON
| Field | Value | Description |
|---|---|---|
| OpenAlex ID | https://openalex.org/W4398893941 | Canonical identifier for this work in OpenAlex |
| DOI | https://doi.org/10.7910/dvn/surseo | Digital Object Identifier |
| Title | Replication Data for: The Balance-Sample Size Frontier in Matching Methods for Causal Inference | Work title |
| Type | dataset | OpenAlex work type |
| Language | en | Primary language |
| Publication year | 2016 | Year of publication |
| Publication date | 2016-03-18 | Full publication date if available |
| Authors | Gary King | List of authors in order |
| Landing page | https://doi.org/10.7910/DVN/SURSEO | Publisher landing page |
| Open access | Yes | Whether a free full text is available |
| OA status | green | Open access status per OpenAlex |
| OA URL | https://doi.org/10.7910/dvn/surseo | Direct OA link when available |
| Concepts | Replication (statistics), Frontier, Inference, Sample size determination, Causal inference, Matching (statistics), Balance (ability), Econometrics, Computer science, Statistics, Artificial intelligence, Psychology, Mathematics, Geography, Archaeology, Neuroscience | Top concepts (fields/topics) attached by OpenAlex |
| Cited by | 1 | Total citation count in OpenAlex |
| Citations by year (recent) | 2016: 1 | Per-year citation counts (last 5 years) |
| Related works (count) | 10 | Other works algorithmically related by OpenAlex |
Full payload
| Field | Value |
|---|---|
| id | https://openalex.org/W4398893941 |
| doi | https://doi.org/10.7910/dvn/surseo |
| ids.doi | https://doi.org/10.7910/dvn/surseo |
| ids.openalex | https://openalex.org/W4398893941 |
| fwci | |
| type | dataset |
| title | Replication Data for: The Balance-Sample Size Frontier in Matching Methods for Causal Inference |
| biblio.issue | |
| biblio.volume | |
| biblio.last_page | |
| biblio.first_page | |
| topics[0].id | https://openalex.org/T11303 |
| topics[0].field.id | https://openalex.org/fields/17 |
| topics[0].field.display_name | Computer Science |
| topics[0].score | 0.8801000118255615 |
| topics[0].domain.id | https://openalex.org/domains/3 |
| topics[0].domain.display_name | Physical Sciences |
| topics[0].subfield.id | https://openalex.org/subfields/1702 |
| topics[0].subfield.display_name | Artificial Intelligence |
| topics[0].display_name | Bayesian Modeling and Causal Inference |
| topics[1].id | https://openalex.org/T10136 |
| topics[1].field.id | https://openalex.org/fields/26 |
| topics[1].field.display_name | Mathematics |
| topics[1].score | 0.8341000080108643 |
| topics[1].domain.id | https://openalex.org/domains/3 |
| topics[1].domain.display_name | Physical Sciences |
| topics[1].subfield.id | https://openalex.org/subfields/2613 |
| topics[1].subfield.display_name | Statistics and Probability |
| topics[1].display_name | Statistical Methods and Inference |
| topics[2].id | https://openalex.org/T10845 |
| topics[2].field.id | https://openalex.org/fields/26 |
| topics[2].field.display_name | Mathematics |
| topics[2].score | 0.824999988079071 |
| topics[2].domain.id | https://openalex.org/domains/3 |
| topics[2].domain.display_name | Physical Sciences |
| topics[2].subfield.id | https://openalex.org/subfields/2613 |
| topics[2].subfield.display_name | Statistics and Probability |
| topics[2].display_name | Advanced Causal Inference Techniques |
| is_xpac | False |
| apc_list | |
| apc_paid | |
| concepts[0].id | https://openalex.org/C12590798 |
| concepts[0].level | 2 |
| concepts[0].score | 0.8232817649841309 |
| concepts[0].wikidata | https://www.wikidata.org/wiki/Q3933199 |
| concepts[0].display_name | Replication (statistics) |
| concepts[1].id | https://openalex.org/C2778571376 |
| concepts[1].level | 2 |
| concepts[1].score | 0.6538026332855225 |
| concepts[1].wikidata | https://www.wikidata.org/wiki/Q1355821 |
| concepts[1].display_name | Frontier |
| concepts[2].id | https://openalex.org/C2776214188 |
| concepts[2].level | 2 |
| concepts[2].score | 0.6002773642539978 |
| concepts[2].wikidata | https://www.wikidata.org/wiki/Q408386 |
| concepts[2].display_name | Inference |
| concepts[3].id | https://openalex.org/C129848803 |
| concepts[3].level | 2 |
| concepts[3].score | 0.5912736654281616 |
| concepts[3].wikidata | https://www.wikidata.org/wiki/Q2564360 |
| concepts[3].display_name | Sample size determination |
| concepts[4].id | https://openalex.org/C158600405 |
| concepts[4].level | 2 |
| concepts[4].score | 0.5790477991104126 |
| concepts[4].wikidata | https://www.wikidata.org/wiki/Q5054566 |
| concepts[4].display_name | Causal inference |
| concepts[5].id | https://openalex.org/C165064840 |
| concepts[5].level | 2 |
| concepts[5].score | 0.539749264717102 |
| concepts[5].wikidata | https://www.wikidata.org/wiki/Q1321061 |
| concepts[5].display_name | Matching (statistics) |
| concepts[6].id | https://openalex.org/C168031717 |
| concepts[6].level | 2 |
| concepts[6].score | 0.457535058259964 |
| concepts[6].wikidata | https://www.wikidata.org/wiki/Q1530280 |
| concepts[6].display_name | Balance (ability) |
| concepts[7].id | https://openalex.org/C149782125 |
| concepts[7].level | 1 |
| concepts[7].score | 0.4361550509929657 |
| concepts[7].wikidata | https://www.wikidata.org/wiki/Q160039 |
| concepts[7].display_name | Econometrics |
| concepts[8].id | https://openalex.org/C41008148 |
| concepts[8].level | 0 |
| concepts[8].score | 0.4267445206642151 |
| concepts[8].wikidata | https://www.wikidata.org/wiki/Q21198 |
| concepts[8].display_name | Computer science |
| concepts[9].id | https://openalex.org/C105795698 |
| concepts[9].level | 1 |
| concepts[9].score | 0.3202418386936188 |
| concepts[9].wikidata | https://www.wikidata.org/wiki/Q12483 |
| concepts[9].display_name | Statistics |
| concepts[10].id | https://openalex.org/C154945302 |
| concepts[10].level | 1 |
| concepts[10].score | 0.2516442835330963 |
| concepts[10].wikidata | https://www.wikidata.org/wiki/Q11660 |
| concepts[10].display_name | Artificial intelligence |
| concepts[11].id | https://openalex.org/C15744967 |
| concepts[11].level | 0 |
| concepts[11].score | 0.25127077102661133 |
| concepts[11].wikidata | https://www.wikidata.org/wiki/Q9418 |
| concepts[11].display_name | Psychology |
| concepts[12].id | https://openalex.org/C33923547 |
| concepts[12].level | 0 |
| concepts[12].score | 0.24346762895584106 |
| concepts[12].wikidata | https://www.wikidata.org/wiki/Q395 |
| concepts[12].display_name | Mathematics |
| concepts[13].id | https://openalex.org/C205649164 |
| concepts[13].level | 0 |
| concepts[13].score | 0.1889152228832245 |
| concepts[13].wikidata | https://www.wikidata.org/wiki/Q1071 |
| concepts[13].display_name | Geography |
| concepts[14].id | https://openalex.org/C166957645 |
| concepts[14].level | 1 |
| concepts[14].score | 0.05898725986480713 |
| concepts[14].wikidata | https://www.wikidata.org/wiki/Q23498 |
| concepts[14].display_name | Archaeology |
| concepts[15].id | https://openalex.org/C169760540 |
| concepts[15].level | 1 |
| concepts[15].score | 0.0 |
| concepts[15].wikidata | https://www.wikidata.org/wiki/Q207011 |
| concepts[15].display_name | Neuroscience |
| keywords[0].id | https://openalex.org/keywords/replication |
| keywords[0].score | 0.8232817649841309 |
| keywords[0].display_name | Replication (statistics) |
| keywords[1].id | https://openalex.org/keywords/frontier |
| keywords[1].score | 0.6538026332855225 |
| keywords[1].display_name | Frontier |
| keywords[2].id | https://openalex.org/keywords/inference |
| keywords[2].score | 0.6002773642539978 |
| keywords[2].display_name | Inference |
| keywords[3].id | https://openalex.org/keywords/sample-size-determination |
| keywords[3].score | 0.5912736654281616 |
| keywords[3].display_name | Sample size determination |
| keywords[4].id | https://openalex.org/keywords/causal-inference |
| keywords[4].score | 0.5790477991104126 |
| keywords[4].display_name | Causal inference |
| keywords[5].id | https://openalex.org/keywords/matching |
| keywords[5].score | 0.539749264717102 |
| keywords[5].display_name | Matching (statistics) |
| keywords[6].id | https://openalex.org/keywords/balance |
| keywords[6].score | 0.457535058259964 |
| keywords[6].display_name | Balance (ability) |
| keywords[7].id | https://openalex.org/keywords/econometrics |
| keywords[7].score | 0.4361550509929657 |
| keywords[7].display_name | Econometrics |
| keywords[8].id | https://openalex.org/keywords/computer-science |
| keywords[8].score | 0.4267445206642151 |
| keywords[8].display_name | Computer science |
| keywords[9].id | https://openalex.org/keywords/statistics |
| keywords[9].score | 0.3202418386936188 |
| keywords[9].display_name | Statistics |
| keywords[10].id | https://openalex.org/keywords/artificial-intelligence |
| keywords[10].score | 0.2516442835330963 |
| keywords[10].display_name | Artificial intelligence |
| keywords[11].id | https://openalex.org/keywords/psychology |
| keywords[11].score | 0.25127077102661133 |
| keywords[11].display_name | Psychology |
| keywords[12].id | https://openalex.org/keywords/mathematics |
| keywords[12].score | 0.24346762895584106 |
| keywords[12].display_name | Mathematics |
| keywords[13].id | https://openalex.org/keywords/geography |
| keywords[13].score | 0.1889152228832245 |
| keywords[13].display_name | Geography |
| keywords[14].id | https://openalex.org/keywords/archaeology |
| keywords[14].score | 0.05898725986480713 |
| keywords[14].display_name | Archaeology |
| language | en |
| locations[0].id | pmh:doi:10.7910/DVN/SURSEO |
| locations[0].is_oa | False |
| locations[0].source | |
| locations[0].license | |
| locations[0].pdf_url | |
| locations[0].version | submittedVersion |
| locations[0].raw_type | |
| locations[0].license_id | |
| locations[0].is_accepted | False |
| locations[0].is_published | False |
| locations[0].raw_source_name | |
| locations[0].landing_page_url | https://doi.org/10.7910/DVN/SURSEO |
| locations[1].id | doi:10.7910/dvn/surseo |
| locations[1].is_oa | True |
| locations[1].source.id | https://openalex.org/S4377196806 |
| locations[1].source.issn | |
| locations[1].source.type | repository |
| locations[1].source.is_oa | False |
| locations[1].source.issn_l | |
| locations[1].source.is_core | False |
| locations[1].source.is_in_doaj | False |
| locations[1].source.display_name | Harvard Dataverse |
| locations[1].source.host_organization | https://openalex.org/I136199984 |
| locations[1].source.host_organization_name | Harvard University |
| locations[1].source.host_organization_lineage | https://openalex.org/I136199984 |
| locations[1].license | other-oa |
| locations[1].pdf_url | |
| locations[1].version | |
| locations[1].raw_type | dataset |
| locations[1].license_id | https://openalex.org/licenses/other-oa |
| locations[1].is_accepted | False |
| locations[1].is_published | |
| locations[1].raw_source_name | |
| locations[1].landing_page_url | https://doi.org/10.7910/dvn/surseo |
| indexed_in | datacite |
| authorships[0].author.id | https://openalex.org/A5109724351 |
| authorships[0].author.orcid | |
| authorships[0].author.display_name | Gary King |
| authorships[0].author_position | first |
| authorships[0].raw_author_name | King, Gary (Harvard University); Lucas, Christopher; Nielsen, Richard |
| authorships[0].is_corresponding | True |
| has_content.pdf | False |
| has_content.grobid_xml | False |
| is_paratext | False |
| open_access.is_oa | True |
| open_access.oa_url | https://doi.org/10.7910/dvn/surseo |
| open_access.oa_status | green |
| open_access.any_repository_has_fulltext | False |
| created_date | 2025-10-10T00:00:00 |
| display_name | Replication Data for: The Balance-Sample Size Frontier in Matching Methods for Causal Inference |
| has_fulltext | False |
| is_retracted | False |
| updated_date | 2025-11-06T06:51:31.235846 |
| primary_topic.id | https://openalex.org/T11303 |
| primary_topic.field.id | https://openalex.org/fields/17 |
| primary_topic.field.display_name | Computer Science |
| primary_topic.score | 0.8801000118255615 |
| primary_topic.domain.id | https://openalex.org/domains/3 |
| primary_topic.domain.display_name | Physical Sciences |
| primary_topic.subfield.id | https://openalex.org/subfields/1702 |
| primary_topic.subfield.display_name | Artificial Intelligence |
| primary_topic.display_name | Bayesian Modeling and Causal Inference |
| related_works | https://openalex.org/W2347401120, https://openalex.org/W2261902776, https://openalex.org/W2041961361, https://openalex.org/W2310010941, https://openalex.org/W1988132375, https://openalex.org/W2334292868, https://openalex.org/W579144800, https://openalex.org/W2147233680, https://openalex.org/W2069525434, https://openalex.org/W2046798653 |
| cited_by_count | 1 |
| counts_by_year[0].year | 2016 |
| counts_by_year[0].cited_by_count | 1 |
| locations_count | 2 |
| best_oa_location.id | doi:10.7910/dvn/surseo |
| best_oa_location.is_oa | True |
| best_oa_location.source.id | https://openalex.org/S4377196806 |
| best_oa_location.source.issn | |
| best_oa_location.source.type | repository |
| best_oa_location.source.is_oa | False |
| best_oa_location.source.issn_l | |
| best_oa_location.source.is_core | False |
| best_oa_location.source.is_in_doaj | False |
| best_oa_location.source.display_name | Harvard Dataverse |
| best_oa_location.source.host_organization | https://openalex.org/I136199984 |
| best_oa_location.source.host_organization_name | Harvard University |
| best_oa_location.source.host_organization_lineage | https://openalex.org/I136199984 |
| best_oa_location.license | other-oa |
| best_oa_location.pdf_url | |
| best_oa_location.version | |
| best_oa_location.raw_type | dataset |
| best_oa_location.license_id | https://openalex.org/licenses/other-oa |
| best_oa_location.is_accepted | False |
| best_oa_location.is_published | False |
| best_oa_location.raw_source_name | |
| best_oa_location.landing_page_url | https://doi.org/10.7910/dvn/surseo |
| primary_location.id | pmh:doi:10.7910/DVN/SURSEO |
| primary_location.is_oa | False |
| primary_location.source | |
| primary_location.license | |
| primary_location.pdf_url | |
| primary_location.version | submittedVersion |
| primary_location.raw_type | |
| primary_location.license_id | |
| primary_location.is_accepted | False |
| primary_location.is_published | False |
| primary_location.raw_source_name | |
| primary_location.landing_page_url | https://doi.org/10.7910/DVN/SURSEO |
| publication_date | 2016-03-18 |
| publication_year | 2016 |
| referenced_works_count | 0 |
| abstract_inverted_index.a | 2 |
| abstract_inverted_index.To | 64 |
| abstract_inverted_index.We | 0, 105, 120 |
| abstract_inverted_index.by | 55 |
| abstract_inverted_index.in | 102 |
| abstract_inverted_index.of | 78, 124, 127 |
| abstract_inverted_index.on | 129 |
| abstract_inverted_index.or | 37, 51 |
| abstract_inverted_index.to | 5, 46 |
| abstract_inverted_index.we | 71, 140 |
| abstract_inverted_index.and | 19, 22, 34, 40, 61, 68, 131 |
| abstract_inverted_index.can | 93, 142 |
| abstract_inverted_index.fix | 29, 38 |
| abstract_inverted_index.for | 7, 48, 84, 100, 114 |
| abstract_inverted_index.how | 137 |
| abstract_inverted_index.job | 132 |
| abstract_inverted_index.new | 144 |
| abstract_inverted_index.one | 103 |
| abstract_inverted_index.set | 77 |
| abstract_inverted_index.sex | 128 |
| abstract_inverted_index.the | 17, 30, 73, 76, 98, 111, 125, 138 |
| abstract_inverted_index.both | 13 |
| abstract_inverted_index.data | 148 |
| abstract_inverted_index.each | 85 |
| abstract_inverted_index.fast | 107 |
| abstract_inverted_index.from | 97, 146 |
| abstract_inverted_index.show | 136 |
| abstract_inverted_index.size | 33 |
| abstract_inverted_index.than | 90 |
| abstract_inverted_index.that | 10, 109, 135 |
| abstract_inverted_index.used | 117 |
| abstract_inverted_index.with | 81, 122 |
| abstract_inverted_index.sets. | 149 |
| abstract_inverted_index.size, | 43, 70 |
| abstract_inverted_index.size. | 25, 88 |
| abstract_inverted_index.step. | 104 |
| abstract_inverted_index.their | 58 |
| abstract_inverted_index.Rather | 89 |
| abstract_inverted_index.causal | 8 |
| abstract_inverted_index.choose | 94 |
| abstract_inverted_index.derive | 106 |
| abstract_inverted_index.effect | 126 |
| abstract_inverted_index.either | 28 |
| abstract_inverted_index.manual | 53 |
| abstract_inverted_index.method | 60 |
| abstract_inverted_index.sample | 24, 32, 42, 69, 87 |
| abstract_inverted_index.settle | 47 |
| abstract_inverted_index.attempt | 52 |
| abstract_inverted_index.balance | 14, 36, 39, 67, 83, 118 |
| abstract_inverted_index.between | 16 |
| abstract_inverted_index.control | 20 |
| abstract_inverted_index.extract | 143 |
| abstract_inverted_index.groups) | 21 |
| abstract_inverted_index.jointly | 65 |
| abstract_inverted_index.judging | 130 |
| abstract_inverted_index.leaving | 44 |
| abstract_inverted_index.matched | 23, 31 |
| abstract_inverted_index.maximum | 82 |
| abstract_inverted_index.methods | 139 |
| abstract_inverted_index.propose | 1 |
| abstract_inverted_index.several | 115 |
| abstract_inverted_index.treated | 18 |
| abstract_inverted_index.Existing | 26 |
| abstract_inverted_index.analyses | 123 |
| abstract_inverted_index.analysis | 101 |
| abstract_inverted_index.analysts | 45 |
| abstract_inverted_index.approach | 4 |
| abstract_inverted_index.balance. | 63 |
| abstract_inverted_index.commonly | 116 |
| abstract_inverted_index.existing | 147 |
| abstract_inverted_index.frontier | 99, 113 |
| abstract_inverted_index.matching | 6, 59, 74, 79, 95, 112 |
| abstract_inverted_index.maximize | 35, 41, 66 |
| abstract_inverted_index.metrics. | 119 |
| abstract_inverted_index.possible | 86 |
| abstract_inverted_index.programs | 134 |
| abstract_inverted_index.training | 133 |
| abstract_inverted_index.tweaking | 57 |
| abstract_inverted_index.calculate | 110 |
| abstract_inverted_index.frontier, | 75 |
| abstract_inverted_index.inference | 9 |
| abstract_inverted_index.introduce | 72, 141 |
| abstract_inverted_index.knowledge | 145 |
| abstract_inverted_index.optimizes | 12 |
| abstract_inverted_index.solutions | 50, 80, 96 |
| abstract_inverted_index.algorithms | 108 |
| abstract_inverted_index.approaches | 27 |
| abstract_inverted_index.iterating, | 91 |
| abstract_inverted_index.rechecking | 62 |
| abstract_inverted_index.simplified | 3 |
| abstract_inverted_index.suboptimal | 49 |
| abstract_inverted_index.(similarity | 15 |
| abstract_inverted_index.demonstrate | 121 |
| abstract_inverted_index.iteratively | 56 |
| abstract_inverted_index.researchers | 92 |
| abstract_inverted_index.optimization | 54 |
| abstract_inverted_index.simultaneously | 11 |
| cited_by_percentile_year | |
| corresponding_author_ids | https://openalex.org/A5109724351 |
| countries_distinct_count | 0 |
| institutions_distinct_count | 1 |
| citation_normalized_percentile |
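
The `abstract_inverted_index.*` rows above store the abstract as a map from each word to the positions where it occurs. A minimal sketch of how such an index can be turned back into running text follows. The helper name `rebuild_abstract` is hypothetical, and the sample dictionary is only a fragment of the rows above, restricted to the first ten positions.

```python
from typing import Dict, List


def rebuild_abstract(inverted_index: Dict[str, List[int]]) -> str:
    """Rebuild text from a word -> positions mapping."""
    # Invert the mapping: position -> word.
    by_position = {}
    for word, positions in inverted_index.items():
        for pos in positions:
            by_position[pos] = word
    # Join the words in ascending position order.
    return " ".join(by_position[pos] for pos in sorted(by_position))


# Fragment of the index above, keeping only positions 0 through 9.
sample_index = {
    "We": [0],
    "propose": [1],
    "a": [2],
    "simplified": [3],
    "approach": [4],
    "to": [5],
    "matching": [6],
    "for": [7],
    "causal": [8],
    "inference": [9],
}

print(rebuild_abstract(sample_index))
```

Applied to the full set of rows, the same procedure should reproduce the abstract shown at the top of this page, since every word carries all of its positions.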