An investigation of over-training within semi-supervised machine learning models in the search for heavy resonances at the LHC
2021 · Open Access · DOI: https://doi.org/10.48550/arxiv.2109.07287
In particle physics, semi-supervised machine learning is an attractive option for reducing model dependencies in searches beyond the Standard Model. When semi-supervised techniques are used to train machine learning models in the search for bosons at the Large Hadron Collider, over-training of the model must be investigated. Internal fluctuations of the phase space and bias in training can cause semi-supervised models to label false signals within the phase space due to over-fitting. The issue of false signal generation in semi-supervised models has not been fully analyzed; the probability of such situations occurring must therefore be quantified using a toy Monte Carlo model. This investigation of $Zγ$ resonances is performed using a pure background Monte Carlo sample. Unique pure background samples are extracted to mimic ATLAS data in a background-plus-signal region, and multiple runs over these samples allow the probability of fake signals arising from over-training to be thoroughly investigated.
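The idea of counting fake signals over many background-only runs can be illustrated with a minimal sketch. It is not the authors' setup: the one-dimensional uniform feature, the per-bin "classifier", the significance formula, and all names and parameters (`run_once`, `fake_signal_rates`, `n_bins`, `top_k`, `threshold`) are assumptions chosen only to keep the example short. Two samples drawn from the same background are given different labels, an overly flexible model is fitted to them, and the apparent excess in the most signal-like bins is measured both on the training samples and on independent samples.

```python
import math
import random


def significance(n_label1, n_label0):
    """Crude excess estimate (N1 - N0) / sqrt(N1 + N0); 0 if no events."""
    total = n_label1 + n_label0
    return (n_label1 - n_label0) / math.sqrt(total) if total else 0.0


def histogram(values, n_bins):
    """Count values in [0, 1) into n_bins equal-width bins."""
    counts = [0] * n_bins
    for v in values:
        counts[min(int(v * n_bins), n_bins - 1)] += 1
    return counts


def run_once(rng, n_events=2000, n_bins=40, top_k=4):
    """One pseudo-experiment in which every sample is pure background."""
    def draw():
        return histogram([rng.random() for _ in range(n_events)], n_bins)

    # Label 0 plays the role of the reference sample, label 1 the "data-like"
    # sample. Both come from the same distribution, so no true signal exists.
    train0, train1, test0, test1 = draw(), draw(), draw(), draw()

    # Assumed toy classifier: per-bin fraction of label-1 training events.
    # With many bins it simply memorises statistical fluctuations.
    scores = [h1 / (h0 + h1) if h0 + h1 else 0.5
              for h0, h1 in zip(train0, train1)]
    selected = sorted(range(n_bins), key=scores.__getitem__, reverse=True)[:top_k]

    z_train = significance(sum(train1[b] for b in selected),
                           sum(train0[b] for b in selected))
    z_test = significance(sum(test1[b] for b in selected),
                          sum(test0[b] for b in selected))
    return z_train, z_test


def fake_signal_rates(n_runs=100, threshold=2.0, seed=1):
    """Fraction of runs whose apparent excess exceeds the threshold."""
    rng = random.Random(seed)
    n_train = n_test = 0
    for _ in range(n_runs):
        z_train, z_test = run_once(rng)
        n_train += z_train > threshold
        n_test += z_test > threshold
    return {"train": n_train / n_runs, "test": n_test / n_runs}


if __name__ == "__main__":
    print(fake_signal_rates())
```

In this toy, the rate measured on the training samples is expected to be far higher than the rate on independent samples, because the selected bins are exactly those that fluctuated upward during training. The spread between the two rates is one simple way to express how often over-training alone produces an apparent signal.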
Related Topics
- Type: preprint
- Language: en
- Landing Page: http://arxiv.org/abs/2109.07287, https://arxiv.org/pdf/2109.07287
- OA Status: green
- Related Works: 10
- OpenAlex ID: https://openalex.org/W3199611033
Raw OpenAlex JSON
- OpenAlex ID: https://openalex.org/W3199611033 (Canonical identifier for this work in OpenAlex)
- DOI: https://doi.org/10.48550/arxiv.2109.07287 (Digital Object Identifier)
- Title: An investigation of over-training within semi-supervised machine learning models in the search for heavy resonances at the LHC (Work title)
- Type: preprint (OpenAlex work type)
- Language: en (Primary language)
- Publication year: 2021 (Year of publication)
- Publication date: 2021-09-15 (Full publication date if available)
- Authors: Benjamin Lieberman, Joshua Choma, S. Dahbi, B. Mellado, X. Ruan (List of authors in order)
- Landing page: https://arxiv.org/abs/2109.07287 (Publisher landing page)
- PDF URL: https://arxiv.org/pdf/2109.07287 (Direct link to full text PDF)
- Open access: Yes (Whether a free full text is available)
- OA status: green (Open access status per OpenAlex)
- OA URL: https://arxiv.org/pdf/2109.07287 (Direct OA link when available)
- Concepts: Large Hadron Collider, Monte Carlo method, Artificial intelligence, Machine learning, Computer science, Supervised learning, Particle physics, Phase space, SIGNAL (programming language), Artificial neural network, Physics, Mathematics, Statistics, Thermodynamics, Programming language (Top concepts (fields/topics) attached by OpenAlex)
- Cited by: 0 (Total citation count in OpenAlex)
- Related works (count): 10 (Other works algorithmically related by OpenAlex)
Full payload
| id | https://openalex.org/W3199611033 |
|---|---|
| doi | https://doi.org/10.48550/arxiv.2109.07287 |
| ids.doi | https://doi.org/10.48550/arxiv.2109.07287 |
| ids.mag | 3199611033 |
| ids.openalex | https://openalex.org/W3199611033 |
| fwci | |
| type | preprint |
| title | An investigation of over-training within semi-supervised machine learning models in the search for heavy resonances at the LHC |
| biblio.issue | |
| biblio.volume | |
| biblio.last_page | |
| biblio.first_page | |
| topics[0].id | https://openalex.org/T10048 |
| topics[0].field.id | https://openalex.org/fields/31 |
| topics[0].field.display_name | Physics and Astronomy |
| topics[0].score | 0.9997000098228455 |
| topics[0].domain.id | https://openalex.org/domains/3 |
| topics[0].domain.display_name | Physical Sciences |
| topics[0].subfield.id | https://openalex.org/subfields/3106 |
| topics[0].subfield.display_name | Nuclear and High Energy Physics |
| topics[0].display_name | Particle physics theoretical and experimental studies |
| topics[1].id | https://openalex.org/T11044 |
| topics[1].field.id | https://openalex.org/fields/31 |
| topics[1].field.display_name | Physics and Astronomy |
| topics[1].score | 0.9987000226974487 |
| topics[1].domain.id | https://openalex.org/domains/3 |
| topics[1].domain.display_name | Physical Sciences |
| topics[1].subfield.id | https://openalex.org/subfields/3106 |
| topics[1].subfield.display_name | Nuclear and High Energy Physics |
| topics[1].display_name | Particle Detector Development and Performance |
| topics[2].id | https://openalex.org/T10527 |
| topics[2].field.id | https://openalex.org/fields/31 |
| topics[2].field.display_name | Physics and Astronomy |
| topics[2].score | 0.996399998664856 |
| topics[2].domain.id | https://openalex.org/domains/3 |
| topics[2].domain.display_name | Physical Sciences |
| topics[2].subfield.id | https://openalex.org/subfields/3106 |
| topics[2].subfield.display_name | Nuclear and High Energy Physics |
| topics[2].display_name | High-Energy Particle Collisions Research |
| is_xpac | False |
| apc_list | |
| apc_paid | |
| concepts[0].id | https://openalex.org/C87668248 |
| concepts[0].level | 2 |
| concepts[0].score | 0.8873273730278015 |
| concepts[0].wikidata | https://www.wikidata.org/wiki/Q40605 |
| concepts[0].display_name | Large Hadron Collider |
| concepts[1].id | https://openalex.org/C19499675 |
| concepts[1].level | 2 |
| concepts[1].score | 0.6831016540527344 |
| concepts[1].wikidata | https://www.wikidata.org/wiki/Q232207 |
| concepts[1].display_name | Monte Carlo method |
| concepts[2].id | https://openalex.org/C154945302 |
| concepts[2].level | 1 |
| concepts[2].score | 0.6194157600402832 |
| concepts[2].wikidata | https://www.wikidata.org/wiki/Q11660 |
| concepts[2].display_name | Artificial intelligence |
| concepts[3].id | https://openalex.org/C119857082 |
| concepts[3].level | 1 |
| concepts[3].score | 0.598991334438324 |
| concepts[3].wikidata | https://www.wikidata.org/wiki/Q2539 |
| concepts[3].display_name | Machine learning |
| concepts[4].id | https://openalex.org/C41008148 |
| concepts[4].level | 0 |
| concepts[4].score | 0.5611302256584167 |
| concepts[4].wikidata | https://www.wikidata.org/wiki/Q21198 |
| concepts[4].display_name | Computer science |
| concepts[5].id | https://openalex.org/C136389625 |
| concepts[5].level | 3 |
| concepts[5].score | 0.4893788695335388 |
| concepts[5].wikidata | https://www.wikidata.org/wiki/Q334384 |
| concepts[5].display_name | Supervised learning |
| concepts[6].id | https://openalex.org/C109214941 |
| concepts[6].level | 1 |
| concepts[6].score | 0.4584067165851593 |
| concepts[6].wikidata | https://www.wikidata.org/wiki/Q18334 |
| concepts[6].display_name | Particle physics |
| concepts[7].id | https://openalex.org/C151342819 |
| concepts[7].level | 2 |
| concepts[7].score | 0.4376016855239868 |
| concepts[7].wikidata | https://www.wikidata.org/wiki/Q62542 |
| concepts[7].display_name | Phase space |
| concepts[8].id | https://openalex.org/C2779843651 |
| concepts[8].level | 2 |
| concepts[8].score | 0.42315468192100525 |
| concepts[8].wikidata | https://www.wikidata.org/wiki/Q7390335 |
| concepts[8].display_name | SIGNAL (programming language) |
| concepts[9].id | https://openalex.org/C50644808 |
| concepts[9].level | 2 |
| concepts[9].score | 0.31715503334999084 |
| concepts[9].wikidata | https://www.wikidata.org/wiki/Q192776 |
| concepts[9].display_name | Artificial neural network |
| concepts[10].id | https://openalex.org/C121332964 |
| concepts[10].level | 0 |
| concepts[10].score | 0.30422940850257874 |
| concepts[10].wikidata | https://www.wikidata.org/wiki/Q413 |
| concepts[10].display_name | Physics |
| concepts[11].id | https://openalex.org/C33923547 |
| concepts[11].level | 0 |
| concepts[11].score | 0.17355984449386597 |
| concepts[11].wikidata | https://www.wikidata.org/wiki/Q395 |
| concepts[11].display_name | Mathematics |
| concepts[12].id | https://openalex.org/C105795698 |
| concepts[12].level | 1 |
| concepts[12].score | 0.16905361413955688 |
| concepts[12].wikidata | https://www.wikidata.org/wiki/Q12483 |
| concepts[12].display_name | Statistics |
| concepts[13].id | https://openalex.org/C97355855 |
| concepts[13].level | 1 |
| concepts[13].score | 0.0 |
| concepts[13].wikidata | https://www.wikidata.org/wiki/Q11473 |
| concepts[13].display_name | Thermodynamics |
| concepts[14].id | https://openalex.org/C199360897 |
| concepts[14].level | 1 |
| concepts[14].score | 0.0 |
| concepts[14].wikidata | https://www.wikidata.org/wiki/Q9143 |
| concepts[14].display_name | Programming language |
| keywords[0].id | https://openalex.org/keywords/large-hadron-collider |
| keywords[0].score | 0.8873273730278015 |
| keywords[0].display_name | Large Hadron Collider |
| keywords[1].id | https://openalex.org/keywords/monte-carlo-method |
| keywords[1].score | 0.6831016540527344 |
| keywords[1].display_name | Monte Carlo method |
| keywords[2].id | https://openalex.org/keywords/artificial-intelligence |
| keywords[2].score | 0.6194157600402832 |
| keywords[2].display_name | Artificial intelligence |
| keywords[3].id | https://openalex.org/keywords/machine-learning |
| keywords[3].score | 0.598991334438324 |
| keywords[3].display_name | Machine learning |
| keywords[4].id | https://openalex.org/keywords/computer-science |
| keywords[4].score | 0.5611302256584167 |
| keywords[4].display_name | Computer science |
| keywords[5].id | https://openalex.org/keywords/supervised-learning |
| keywords[5].score | 0.4893788695335388 |
| keywords[5].display_name | Supervised learning |
| keywords[6].id | https://openalex.org/keywords/particle-physics |
| keywords[6].score | 0.4584067165851593 |
| keywords[6].display_name | Particle physics |
| keywords[7].id | https://openalex.org/keywords/phase-space |
| keywords[7].score | 0.4376016855239868 |
| keywords[7].display_name | Phase space |
| keywords[8].id | https://openalex.org/keywords/signal |
| keywords[8].score | 0.42315468192100525 |
| keywords[8].display_name | SIGNAL (programming language) |
| keywords[9].id | https://openalex.org/keywords/artificial-neural-network |
| keywords[9].score | 0.31715503334999084 |
| keywords[9].display_name | Artificial neural network |
| keywords[10].id | https://openalex.org/keywords/physics |
| keywords[10].score | 0.30422940850257874 |
| keywords[10].display_name | Physics |
| keywords[11].id | https://openalex.org/keywords/mathematics |
| keywords[11].score | 0.17355984449386597 |
| keywords[11].display_name | Mathematics |
| keywords[12].id | https://openalex.org/keywords/statistics |
| keywords[12].score | 0.16905361413955688 |
| keywords[12].display_name | Statistics |
| language | en |
| locations[0].id | pmh:oai:arXiv.org:2109.07287 |
| locations[0].is_oa | True |
| locations[0].source.id | https://openalex.org/S4306400194 |
| locations[0].source.issn | |
| locations[0].source.type | repository |
| locations[0].source.is_oa | True |
| locations[0].source.issn_l | |
| locations[0].source.is_core | False |
| locations[0].source.is_in_doaj | False |
| locations[0].source.display_name | arXiv (Cornell University) |
| locations[0].source.host_organization | https://openalex.org/I205783295 |
| locations[0].source.host_organization_name | Cornell University |
| locations[0].source.host_organization_lineage | https://openalex.org/I205783295 |
| locations[0].license | |
| locations[0].pdf_url | https://arxiv.org/pdf/2109.07287 |
| locations[0].version | submittedVersion |
| locations[0].raw_type | text |
| locations[0].license_id | |
| locations[0].is_accepted | False |
| locations[0].is_published | False |
| locations[0].raw_source_name | |
| locations[0].landing_page_url | http://arxiv.org/abs/2109.07287 |
| locations[1].id | doi:10.48550/arxiv.2109.07287 |
| locations[1].is_oa | True |
| locations[1].source.id | https://openalex.org/S4306400194 |
| locations[1].source.issn | |
| locations[1].source.type | repository |
| locations[1].source.is_oa | True |
| locations[1].source.issn_l | |
| locations[1].source.is_core | False |
| locations[1].source.is_in_doaj | False |
| locations[1].source.display_name | arXiv (Cornell University) |
| locations[1].source.host_organization | https://openalex.org/I205783295 |
| locations[1].source.host_organization_name | Cornell University |
| locations[1].source.host_organization_lineage | https://openalex.org/I205783295 |
| locations[1].license | cc-by |
| locations[1].pdf_url | |
| locations[1].version | |
| locations[1].raw_type | article |
| locations[1].license_id | https://openalex.org/licenses/cc-by |
| locations[1].is_accepted | False |
| locations[1].is_published | |
| locations[1].raw_source_name | |
| locations[1].landing_page_url | https://doi.org/10.48550/arxiv.2109.07287 |
| indexed_in | arxiv, datacite |
| authorships[0].author.id | https://openalex.org/A5072042723 |
| authorships[0].author.orcid | https://orcid.org/0000-0001-5281-8937 |
| authorships[0].author.display_name | Benjamin Lieberman |
| authorships[0].author_position | first |
| authorships[0].raw_author_name | Benjamin Lieberman |
| authorships[0].is_corresponding | False |
| authorships[1].author.id | https://openalex.org/A5005904301 |
| authorships[1].author.orcid | https://orcid.org/0000-0001-8848-211X |
| authorships[1].author.display_name | Joshua Choma |
| authorships[1].author_position | middle |
| authorships[1].raw_author_name | Joshua Choma |
| authorships[1].is_corresponding | False |
| authorships[2].author.id | https://openalex.org/A5101695572 |
| authorships[2].author.orcid | https://orcid.org/0000-0002-5222-7894 |
| authorships[2].author.display_name | S. Dahbi |
| authorships[2].author_position | middle |
| authorships[2].raw_author_name | Salah-Eddine Dahbi |
| authorships[2].is_corresponding | False |
| authorships[3].author.id | https://openalex.org/A5108748925 |
| authorships[3].author.orcid | |
| authorships[3].author.display_name | B. Mellado |
| authorships[3].author_position | middle |
| authorships[3].raw_author_name | Bruce Mellado |
| authorships[3].is_corresponding | False |
| authorships[4].author.id | https://openalex.org/A5009691987 |
| authorships[4].author.orcid | https://orcid.org/0000-0001-5621-6677 |
| authorships[4].author.display_name | X. Ruan |
| authorships[4].author_position | last |
| authorships[4].raw_author_name | Xifeng Ruan |
| authorships[4].is_corresponding | False |
| has_content.pdf | False |
| has_content.grobid_xml | False |
| is_paratext | False |
| open_access.is_oa | True |
| open_access.oa_url | https://arxiv.org/pdf/2109.07287 |
| open_access.oa_status | green |
| open_access.any_repository_has_fulltext | False |
| created_date | 2021-09-27T00:00:00 |
| display_name | An investigation of over-training within semi-supervised machine learning models in the search for heavy resonances at the LHC |
| has_fulltext | False |
| is_retracted | False |
| updated_date | 2025-11-06T06:51:31.235846 |
| primary_topic.id | https://openalex.org/T10048 |
| primary_topic.field.id | https://openalex.org/fields/31 |
| primary_topic.field.display_name | Physics and Astronomy |
| primary_topic.score | 0.9997000098228455 |
| primary_topic.domain.id | https://openalex.org/domains/3 |
| primary_topic.domain.display_name | Physical Sciences |
| primary_topic.subfield.id | https://openalex.org/subfields/3106 |
| primary_topic.subfield.display_name | Nuclear and High Energy Physics |
| primary_topic.display_name | Particle physics theoretical and experimental studies |
| related_works | https://openalex.org/W2748952813, https://openalex.org/W2031643159, https://openalex.org/W4382984329, https://openalex.org/W2895132917, https://openalex.org/W2919860591, https://openalex.org/W188003848, https://openalex.org/W3027795944, https://openalex.org/W171602090, https://openalex.org/W2323558309, https://openalex.org/W2950032071 |
| cited_by_count | 0 |
| locations_count | 2 |
| best_oa_location.id | pmh:oai:arXiv.org:2109.07287 |
| best_oa_location.is_oa | True |
| best_oa_location.source.id | https://openalex.org/S4306400194 |
| best_oa_location.source.issn | |
| best_oa_location.source.type | repository |
| best_oa_location.source.is_oa | True |
| best_oa_location.source.issn_l | |
| best_oa_location.source.is_core | False |
| best_oa_location.source.is_in_doaj | False |
| best_oa_location.source.display_name | arXiv (Cornell University) |
| best_oa_location.source.host_organization | https://openalex.org/I205783295 |
| best_oa_location.source.host_organization_name | Cornell University |
| best_oa_location.source.host_organization_lineage | https://openalex.org/I205783295 |
| best_oa_location.license | |
| best_oa_location.pdf_url | https://arxiv.org/pdf/2109.07287 |
| best_oa_location.version | submittedVersion |
| best_oa_location.raw_type | text |
| best_oa_location.license_id | |
| best_oa_location.is_accepted | False |
| best_oa_location.is_published | False |
| best_oa_location.raw_source_name | |
| best_oa_location.landing_page_url | http://arxiv.org/abs/2109.07287 |
| primary_location.id | pmh:oai:arXiv.org:2109.07287 |
| primary_location.is_oa | True |
| primary_location.source.id | https://openalex.org/S4306400194 |
| primary_location.source.issn | |
| primary_location.source.type | repository |
| primary_location.source.is_oa | True |
| primary_location.source.issn_l | |
| primary_location.source.is_core | False |
| primary_location.source.is_in_doaj | False |
| primary_location.source.display_name | arXiv (Cornell University) |
| primary_location.source.host_organization | https://openalex.org/I205783295 |
| primary_location.source.host_organization_name | Cornell University |
| primary_location.source.host_organization_lineage | https://openalex.org/I205783295 |
| primary_location.license | |
| primary_location.pdf_url | https://arxiv.org/pdf/2109.07287 |
| primary_location.version | submittedVersion |
| primary_location.raw_type | text |
| primary_location.license_id | |
| primary_location.is_accepted | False |
| primary_location.is_published | False |
| primary_location.raw_source_name | |
| primary_location.landing_page_url | http://arxiv.org/abs/2109.07287 |
| publication_date | 2021-09-15 |
| publication_year | 2021 |
| referenced_works_count | 0 |
| abstract_inverted_index.a | 88, 110, 127 |
| abstract_inverted_index.In | 0 |
| abstract_inverted_index.an | 7 |
| abstract_inverted_index.at | 33 |
| abstract_inverted_index.be | 44, 100, 144 |
| abstract_inverted_index.in | 23, 28, 54, 77, 126 |
| abstract_inverted_index.is | 6, 107 |
| abstract_inverted_index.of | 40, 48, 73, 95, 104, 135 |
| abstract_inverted_index.to | 10, 60, 69, 122, 141, 143 |
| abstract_inverted_index.The | 71 |
| abstract_inverted_index.and | 52, 85 |
| abstract_inverted_index.can | 56 |
| abstract_inverted_index.due | 68, 140 |
| abstract_inverted_index.for | 31 |
| abstract_inverted_index.has | 80 |
| abstract_inverted_index.not | 81 |
| abstract_inverted_index.the | 16, 29, 34, 38, 41, 49, 65, 93, 133 |
| abstract_inverted_index.toy | 89 |
| abstract_inverted_index.This | 102 |
| abstract_inverted_index.When | 19 |
| abstract_inverted_index.been | 82 |
| abstract_inverted_index.bias | 53 |
| abstract_inverted_index.data | 125 |
| abstract_inverted_index.fake | 137 |
| abstract_inverted_index.must | 43, 99 |
| abstract_inverted_index.pure | 111, 118 |
| abstract_inverted_index.runs | 131 |
| abstract_inverted_index.such | 96 |
| abstract_inverted_index.$Zγ$ | 105 |
| abstract_inverted_index.ATLAS | 124 |
| abstract_inverted_index.Carlo | 91, 114 |
| abstract_inverted_index.Large | 35 |
| abstract_inverted_index.Monte | 90, 113 |
| abstract_inverted_index.cause | 57 |
| abstract_inverted_index.false | 62, 74 |
| abstract_inverted_index.fully | 83 |
| abstract_inverted_index.issue | 72 |
| abstract_inverted_index.label | 61 |
| abstract_inverted_index.mimic | 123 |
| abstract_inverted_index.model | 12, 42 |
| abstract_inverted_index.phase | 50, 66 |
| abstract_inverted_index.space | 51, 67 |
| abstract_inverted_index.these | 136 |
| abstract_inverted_index.using | 109 |
| abstract_inverted_index.Hadron | 36 |
| abstract_inverted_index.Model. | 18 |
| abstract_inverted_index.beyond | 15 |
| abstract_inverted_index.bosons | 32 |
| abstract_inverted_index.enable | 132 |
| abstract_inverted_index.model, | 92 |
| abstract_inverted_index.models | 27, 59, 79 |
| abstract_inverted_index.option | 9 |
| abstract_inverted_index.reduce | 11 |
| abstract_inverted_index.search | 30 |
| abstract_inverted_index.signal | 75 |
| abstract_inverted_index.unique | 117 |
| abstract_inverted_index.within | 64 |
| abstract_inverted_index.Through | 116 |
| abstract_inverted_index.machine | 4, 25 |
| abstract_inverted_index.region, | 129 |
| abstract_inverted_index.sample. | 115 |
| abstract_inverted_index.samples | 120 |
| abstract_inverted_index.signals | 63, 138 |
| abstract_inverted_index.Internal | 46 |
| abstract_inverted_index.Standard | 17 |
| abstract_inverted_index.analyzed | 84 |
| abstract_inverted_index.learning | 5, 26 |
| abstract_inverted_index.multiple | 130 |
| abstract_inverted_index.particle | 1 |
| abstract_inverted_index.physics, | 2 |
| abstract_inverted_index.searches | 14 |
| abstract_inverted_index.training | 24, 55 |
| abstract_inverted_index.Collider, | 37 |
| abstract_inverted_index.extracted | 121 |
| abstract_inverted_index.occurring | 98, 139 |
| abstract_inverted_index.performed | 108 |
| abstract_inverted_index.therefore | 86 |
| abstract_inverted_index.utilizing | 20, 87 |
| abstract_inverted_index.attractive | 8 |
| abstract_inverted_index.background | 112, 119 |
| abstract_inverted_index.generation | 76 |
| abstract_inverted_index.resonances | 106 |
| abstract_inverted_index.situations | 97 |
| abstract_inverted_index.techniques | 22 |
| abstract_inverted_index.thoroughly | 145 |
| abstract_inverted_index.probability | 94, 134 |
| abstract_inverted_index.quantified. | 101 |
| abstract_inverted_index.dependencies | 13 |
| abstract_inverted_index.fluctuations | 47 |
| abstract_inverted_index.investigated. | 45, 146 |
| abstract_inverted_index.investigation | 103 |
| abstract_inverted_index.over-fitting. | 70 |
| abstract_inverted_index.over-training | 39, 142 |
| abstract_inverted_index.semi-supervised | 3, 21, 58, 78 |
| abstract_inverted_index.background-plus-signal | 128 |
| cited_by_percentile_year | |
| countries_distinct_count | 0 |
| institutions_distinct_count | 5 |
| citation_normalized_percentile |
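
The `abstract_inverted_index.*` rows above store the abstract as a mapping from each word to the positions at which it occurs. A minimal sketch of how such a mapping can be turned back into running text is given below; the function name `rebuild_abstract` is hypothetical, and the `excerpt` dictionary is a hand-copied subset of the rows above, limited to positions 0 through 9, not the full index.

```python
def rebuild_abstract(inverted_index):
    """Rebuild text from a {word: [positions]} inverted index."""
    words_by_position = {}
    for word, positions in inverted_index.items():
        for position in positions:
            words_by_position[position] = word
    # Join the words in ascending position order.
    return " ".join(words_by_position[p] for p in sorted(words_by_position))


# Subset of the record's index, restricted to positions 0..9 (assumption:
# the remaining positions are omitted here for brevity).
excerpt = {
    "In": [0],
    "particle": [1],
    "physics,": [2],
    "semi-supervised": [3],
    "machine": [4],
    "learning": [5],
    "is": [6],
    "an": [7],
    "attractive": [8],
    "option": [9],
}

print(rebuild_abstract(excerpt))
# prints: In particle physics, semi-supervised machine learning is an attractive option
```

Punctuation stays attached to its word (for example `physics,`), so joining on single spaces is enough to recover the original wording.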