MTAdam: Automatic Balancing of Multiple Training Loss Terms
2021 · Open Access · DOI: https://doi.org/10.18653/v1/2021.emnlp-main.837
When training neural models, it is common to combine multiple loss terms. The balancing of these terms requires considerable human effort and is computationally demanding. Moreover, the optimal trade-off between the loss terms can change as training progresses, e.g., for adversarial terms. In this work, we generalize the Adam optimization algorithm to handle multiple loss terms. The guiding principle is that for every layer, the gradient magnitude of the terms should be balanced. To this end, the Multi-Term Adam (MTAdam) computes the derivative of each loss term separately, infers the first and second moments per parameter and loss term, and calculates a first moment for the magnitude per layer of the gradients arising from each loss. This magnitude is used to continuously balance the gradients across all layers, in a manner that both varies from one layer to the next and dynamically changes over time. Our results show that training with the new method leads to fast recovery from suboptimal initial loss weighting and to training outcomes that match or improve conventional training with the prescribed hyperparameters of each method.
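The procedure described in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the class name `MultiTermAdamSketch`, the decay rate `beta3`, the choice to rescale every term to the running magnitude of the first loss term, and the use of an element-wise maximum to combine the second moments are all assumptions made for illustration. Only the overall structure follows the abstract: per-term gradients, per-parameter first and second moments for each term, and a per-layer running first moment of the gradient magnitude.

```python
import numpy as np


class MultiTermAdamSketch:
    """Illustrative multi-term Adam-style optimizer (not the authors' code)."""

    def __init__(self, params, num_terms, lr=1e-3, beta1=0.9, beta2=0.999,
                 beta3=0.9, eps=1e-8):
        self.params = params  # list of per-layer float arrays, updated in place
        self.lr, self.b1, self.b2, self.b3, self.eps = lr, beta1, beta2, beta3, eps
        self.t = 0
        # First and second moments per loss term, per layer, per parameter.
        self.m = [[np.zeros_like(p) for p in params] for _ in range(num_terms)]
        self.v = [[np.zeros_like(p) for p in params] for _ in range(num_terms)]
        # Running first moment of the gradient norm per loss term and layer.
        self.n = np.zeros((num_terms, len(params)))

    def step(self, grads_per_term):
        """grads_per_term[i][l] is the gradient of loss term i w.r.t. layer l."""
        self.t += 1
        c1 = 1 - self.b1 ** self.t  # Adam bias corrections
        c2 = 1 - self.b2 ** self.t
        for l, p in enumerate(self.params):
            update = np.zeros_like(p)
            v_max = np.zeros_like(p)
            for i, grads in enumerate(grads_per_term):
                g = grads[l]
                # Track the magnitude of this term's gradient for this layer.
                self.n[i, l] = (self.b3 * self.n[i, l]
                                + (1 - self.b3) * np.linalg.norm(g))
                # Assumption: rescale every term to the magnitude of term 0.
                g = g * self.n[0, l] / (self.n[i, l] + self.eps)
                self.m[i][l] = self.b1 * self.m[i][l] + (1 - self.b1) * g
                self.v[i][l] = self.b2 * self.v[i][l] + (1 - self.b2) * g * g
                update += self.m[i][l] / c1
                # Assumption: combine second moments by element-wise maximum.
                v_max = np.maximum(v_max, self.v[i][l] / c2)
            p -= self.lr * update / (np.sqrt(v_max) + self.eps)
```

In this sketch, the caller computes one gradient list per loss term (for example, with one backward pass per term) and passes them to `step`. Because each term is rescaled by its own running magnitude, multiplying one loss term by a constant leaves the update essentially unchanged, which is the sense in which a suboptimal initial loss weighting is corrected.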
- Type: article
- Language: en
- Landing Page: https://doi.org/10.18653/v1/2021.emnlp-main.837, https://aclanthology.org/2021.emnlp-main.837.pdf
- OA Status: hybrid
- Cited By: 12
- References: 39
- Related Works: 10
- OpenAlex ID: https://openalex.org/W3037545792
Raw OpenAlex JSON
- OpenAlex ID: https://openalex.org/W3037545792 (Canonical identifier for this work in OpenAlex)
- DOI: https://doi.org/10.18653/v1/2021.emnlp-main.837 (Digital Object Identifier)
- Title: MTAdam: Automatic Balancing of Multiple Training Loss Terms (Work title)
- Type: article (OpenAlex work type)
- Language: en (Primary language)
- Publication year: 2021 (Year of publication)
- Publication date: 2021-01-01 (Full publication date if available)
- Authors: Itzik Malkiel, Lior Wolf (List of authors in order)
- Landing page: https://doi.org/10.18653/v1/2021.emnlp-main.837 (Publisher landing page)
- PDF URL: https://aclanthology.org/2021.emnlp-main.837.pdf (Direct link to full text PDF)
- Open access: Yes (Whether a free full text is available)
- OA status: hybrid (Open access status per OpenAlex)
- OA URL: https://aclanthology.org/2021.emnlp-main.837.pdf (Direct OA link when available)
- Concepts: Hyperparameter, Computer science, Weighting, Term (time), Magnitude (astronomy), Training (meteorology), Moment (physics), Layer (electronics), Algorithm, Mathematical optimization, Artificial intelligence, Mathematics, Classical mechanics, Meteorology, Medicine, Radiology, Organic chemistry, Astronomy, Chemistry, Physics, Quantum mechanics (Top concepts (fields/topics) attached by OpenAlex)
- Cited by: 12 (Total citation count in OpenAlex)
- Citations by year (recent): 2024: 4, 2023: 3, 2022: 1, 2021: 2, 2020: 2 (Per-year citation counts (last 5 years))
- References (count): 39 (Number of works referenced by this work)
- Related works (count): 10 (Other works algorithmically related by OpenAlex)
Full payload
| id | https://openalex.org/W3037545792 |
|---|---|
| doi | https://doi.org/10.18653/v1/2021.emnlp-main.837 |
| ids.doi | https://doi.org/10.18653/v1/2021.emnlp-main.837 |
| ids.mag | 3037545792 |
| ids.openalex | https://openalex.org/W3037545792 |
| fwci | 0.63345162 |
| type | article |
| title | MTAdam: Automatic Balancing of Multiple Training Loss Terms |
| biblio.issue | |
| biblio.volume | |
| biblio.last_page | 10729 |
| biblio.first_page | 10713 |
| topics[0].id | https://openalex.org/T10812 |
| topics[0].field.id | https://openalex.org/fields/17 |
| topics[0].field.display_name | Computer Science |
| topics[0].score | 0.9988999962806702 |
| topics[0].domain.id | https://openalex.org/domains/3 |
| topics[0].domain.display_name | Physical Sciences |
| topics[0].subfield.id | https://openalex.org/subfields/1707 |
| topics[0].subfield.display_name | Computer Vision and Pattern Recognition |
| topics[0].display_name | Human Pose and Action Recognition |
| topics[1].id | https://openalex.org/T10036 |
| topics[1].field.id | https://openalex.org/fields/17 |
| topics[1].field.display_name | Computer Science |
| topics[1].score | 0.9976000189781189 |
| topics[1].domain.id | https://openalex.org/domains/3 |
| topics[1].domain.display_name | Physical Sciences |
| topics[1].subfield.id | https://openalex.org/subfields/1707 |
| topics[1].subfield.display_name | Computer Vision and Pattern Recognition |
| topics[1].display_name | Advanced Neural Network Applications |
| topics[2].id | https://openalex.org/T11689 |
| topics[2].field.id | https://openalex.org/fields/17 |
| topics[2].field.display_name | Computer Science |
| topics[2].score | 0.996999979019165 |
| topics[2].domain.id | https://openalex.org/domains/3 |
| topics[2].domain.display_name | Physical Sciences |
| topics[2].subfield.id | https://openalex.org/subfields/1702 |
| topics[2].subfield.display_name | Artificial Intelligence |
| topics[2].display_name | Adversarial Robustness in Machine Learning |
| is_xpac | False |
| apc_list | |
| apc_paid | |
| concepts[0].id | https://openalex.org/C8642999 |
| concepts[0].level | 2 |
| concepts[0].score | 0.7511935234069824 |
| concepts[0].wikidata | https://www.wikidata.org/wiki/Q4171168 |
| concepts[0].display_name | Hyperparameter |
| concepts[1].id | https://openalex.org/C41008148 |
| concepts[1].level | 0 |
| concepts[1].score | 0.6984790563583374 |
| concepts[1].wikidata | https://www.wikidata.org/wiki/Q21198 |
| concepts[1].display_name | Computer science |
| concepts[2].id | https://openalex.org/C183115368 |
| concepts[2].level | 2 |
| concepts[2].score | 0.6250454783439636 |
| concepts[2].wikidata | https://www.wikidata.org/wiki/Q856577 |
| concepts[2].display_name | Weighting |
| concepts[3].id | https://openalex.org/C61797465 |
| concepts[3].level | 2 |
| concepts[3].score | 0.5916267037391663 |
| concepts[3].wikidata | https://www.wikidata.org/wiki/Q1188986 |
| concepts[3].display_name | Term (time) |
| concepts[4].id | https://openalex.org/C126691448 |
| concepts[4].level | 2 |
| concepts[4].score | 0.5593929886817932 |
| concepts[4].wikidata | https://www.wikidata.org/wiki/Q2028919 |
| concepts[4].display_name | Magnitude (astronomy) |
| concepts[5].id | https://openalex.org/C2777211547 |
| concepts[5].level | 2 |
| concepts[5].score | 0.551886260509491 |
| concepts[5].wikidata | https://www.wikidata.org/wiki/Q17141490 |
| concepts[5].display_name | Training (meteorology) |
| concepts[6].id | https://openalex.org/C179254644 |
| concepts[6].level | 2 |
| concepts[6].score | 0.4855746626853943 |
| concepts[6].wikidata | https://www.wikidata.org/wiki/Q13222844 |
| concepts[6].display_name | Moment (physics) |
| concepts[7].id | https://openalex.org/C2779227376 |
| concepts[7].level | 2 |
| concepts[7].score | 0.44898492097854614 |
| concepts[7].wikidata | https://www.wikidata.org/wiki/Q6505497 |
| concepts[7].display_name | Layer (electronics) |
| concepts[8].id | https://openalex.org/C11413529 |
| concepts[8].level | 1 |
| concepts[8].score | 0.44134002923965454 |
| concepts[8].wikidata | https://www.wikidata.org/wiki/Q8366 |
| concepts[8].display_name | Algorithm |
| concepts[9].id | https://openalex.org/C126255220 |
| concepts[9].level | 1 |
| concepts[9].score | 0.39730751514434814 |
| concepts[9].wikidata | https://www.wikidata.org/wiki/Q141495 |
| concepts[9].display_name | Mathematical optimization |
| concepts[10].id | https://openalex.org/C154945302 |
| concepts[10].level | 1 |
| concepts[10].score | 0.33604440093040466 |
| concepts[10].wikidata | https://www.wikidata.org/wiki/Q11660 |
| concepts[10].display_name | Artificial intelligence |
| concepts[11].id | https://openalex.org/C33923547 |
| concepts[11].level | 0 |
| concepts[11].score | 0.24436193704605103 |
| concepts[11].wikidata | https://www.wikidata.org/wiki/Q395 |
| concepts[11].display_name | Mathematics |
| concepts[12].id | https://openalex.org/C74650414 |
| concepts[12].level | 1 |
| concepts[12].score | 0.0 |
| concepts[12].wikidata | https://www.wikidata.org/wiki/Q11397 |
| concepts[12].display_name | Classical mechanics |
| concepts[13].id | https://openalex.org/C153294291 |
| concepts[13].level | 1 |
| concepts[13].score | 0.0 |
| concepts[13].wikidata | https://www.wikidata.org/wiki/Q25261 |
| concepts[13].display_name | Meteorology |
| concepts[14].id | https://openalex.org/C71924100 |
| concepts[14].level | 0 |
| concepts[14].score | 0.0 |
| concepts[14].wikidata | https://www.wikidata.org/wiki/Q11190 |
| concepts[14].display_name | Medicine |
| concepts[15].id | https://openalex.org/C126838900 |
| concepts[15].level | 1 |
| concepts[15].score | 0.0 |
| concepts[15].wikidata | https://www.wikidata.org/wiki/Q77604 |
| concepts[15].display_name | Radiology |
| concepts[16].id | https://openalex.org/C178790620 |
| concepts[16].level | 1 |
| concepts[16].score | 0.0 |
| concepts[16].wikidata | https://www.wikidata.org/wiki/Q11351 |
| concepts[16].display_name | Organic chemistry |
| concepts[17].id | https://openalex.org/C1276947 |
| concepts[17].level | 1 |
| concepts[17].score | 0.0 |
| concepts[17].wikidata | https://www.wikidata.org/wiki/Q333 |
| concepts[17].display_name | Astronomy |
| concepts[18].id | https://openalex.org/C185592680 |
| concepts[18].level | 0 |
| concepts[18].score | 0.0 |
| concepts[18].wikidata | https://www.wikidata.org/wiki/Q2329 |
| concepts[18].display_name | Chemistry |
| concepts[19].id | https://openalex.org/C121332964 |
| concepts[19].level | 0 |
| concepts[19].score | 0.0 |
| concepts[19].wikidata | https://www.wikidata.org/wiki/Q413 |
| concepts[19].display_name | Physics |
| concepts[20].id | https://openalex.org/C62520636 |
| concepts[20].level | 1 |
| concepts[20].score | 0.0 |
| concepts[20].wikidata | https://www.wikidata.org/wiki/Q944 |
| concepts[20].display_name | Quantum mechanics |
| keywords[0].id | https://openalex.org/keywords/hyperparameter |
| keywords[0].score | 0.7511935234069824 |
| keywords[0].display_name | Hyperparameter |
| keywords[1].id | https://openalex.org/keywords/computer-science |
| keywords[1].score | 0.6984790563583374 |
| keywords[1].display_name | Computer science |
| keywords[2].id | https://openalex.org/keywords/weighting |
| keywords[2].score | 0.6250454783439636 |
| keywords[2].display_name | Weighting |
| keywords[3].id | https://openalex.org/keywords/term |
| keywords[3].score | 0.5916267037391663 |
| keywords[3].display_name | Term (time) |
| keywords[4].id | https://openalex.org/keywords/magnitude |
| keywords[4].score | 0.5593929886817932 |
| keywords[4].display_name | Magnitude (astronomy) |
| keywords[5].id | https://openalex.org/keywords/training |
| keywords[5].score | 0.551886260509491 |
| keywords[5].display_name | Training (meteorology) |
| keywords[6].id | https://openalex.org/keywords/moment |
| keywords[6].score | 0.4855746626853943 |
| keywords[6].display_name | Moment (physics) |
| keywords[7].id | https://openalex.org/keywords/layer |
| keywords[7].score | 0.44898492097854614 |
| keywords[7].display_name | Layer (electronics) |
| keywords[8].id | https://openalex.org/keywords/algorithm |
| keywords[8].score | 0.44134002923965454 |
| keywords[8].display_name | Algorithm |
| keywords[9].id | https://openalex.org/keywords/mathematical-optimization |
| keywords[9].score | 0.39730751514434814 |
| keywords[9].display_name | Mathematical optimization |
| keywords[10].id | https://openalex.org/keywords/artificial-intelligence |
| keywords[10].score | 0.33604440093040466 |
| keywords[10].display_name | Artificial intelligence |
| keywords[11].id | https://openalex.org/keywords/mathematics |
| keywords[11].score | 0.24436193704605103 |
| keywords[11].display_name | Mathematics |
| language | en |
| locations[0].id | doi:10.18653/v1/2021.emnlp-main.837 |
| locations[0].is_oa | True |
| locations[0].source.id | https://openalex.org/S4363608991 |
| locations[0].source.issn | |
| locations[0].source.type | conference |
| locations[0].source.is_oa | False |
| locations[0].source.issn_l | |
| locations[0].source.is_core | False |
| locations[0].source.is_in_doaj | False |
| locations[0].source.display_name | Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing |
| locations[0].source.host_organization | |
| locations[0].source.host_organization_name | |
| locations[0].license | cc-by |
| locations[0].pdf_url | https://aclanthology.org/2021.emnlp-main.837.pdf |
| locations[0].version | publishedVersion |
| locations[0].raw_type | proceedings-article |
| locations[0].license_id | https://openalex.org/licenses/cc-by |
| locations[0].is_accepted | True |
| locations[0].is_published | True |
| locations[0].raw_source_name | Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing |
| locations[0].landing_page_url | https://doi.org/10.18653/v1/2021.emnlp-main.837 |
| indexed_in | crossref |
| authorships[0].author.id | https://openalex.org/A5067773841 |
| authorships[0].author.orcid | https://orcid.org/0000-0003-4151-9119 |
| authorships[0].author.display_name | Itzik Malkiel |
| authorships[0].countries | IL |
| authorships[0].affiliations[0].institution_ids | https://openalex.org/I16391192 |
| authorships[0].affiliations[0].raw_affiliation_string | Tel Aviv University |
| authorships[0].institutions[0].id | https://openalex.org/I16391192 |
| authorships[0].institutions[0].ror | https://ror.org/04mhzgx49 |
| authorships[0].institutions[0].type | education |
| authorships[0].institutions[0].lineage | https://openalex.org/I16391192 |
| authorships[0].institutions[0].country_code | IL |
| authorships[0].institutions[0].display_name | Tel Aviv University |
| authorships[0].author_position | first |
| authorships[0].raw_author_name | Itzik Malkiel |
| authorships[0].is_corresponding | False |
| authorships[0].raw_affiliation_strings | Tel Aviv University |
| authorships[1].author.id | https://openalex.org/A5078102229 |
| authorships[1].author.orcid | https://orcid.org/0000-0001-5578-8892 |
| authorships[1].author.display_name | Lior Wolf |
| authorships[1].countries | IL |
| authorships[1].affiliations[0].institution_ids | https://openalex.org/I16391192 |
| authorships[1].affiliations[0].raw_affiliation_string | Tel Aviv University |
| authorships[1].institutions[0].id | https://openalex.org/I16391192 |
| authorships[1].institutions[0].ror | https://ror.org/04mhzgx49 |
| authorships[1].institutions[0].type | education |
| authorships[1].institutions[0].lineage | https://openalex.org/I16391192 |
| authorships[1].institutions[0].country_code | IL |
| authorships[1].institutions[0].display_name | Tel Aviv University |
| authorships[1].author_position | last |
| authorships[1].raw_author_name | Lior Wolf |
| authorships[1].is_corresponding | False |
| authorships[1].raw_affiliation_strings | Tel Aviv University |
| has_content.pdf | True |
| has_content.grobid_xml | True |
| is_paratext | False |
| open_access.is_oa | True |
| open_access.oa_url | https://aclanthology.org/2021.emnlp-main.837.pdf |
| open_access.oa_status | hybrid |
| open_access.any_repository_has_fulltext | False |
| created_date | 2025-10-10T00:00:00 |
| display_name | MTAdam: Automatic Balancing of Multiple Training Loss Terms |
| has_fulltext | True |
| is_retracted | False |
| updated_date | 2025-11-06T03:46:38.306776 |
| primary_topic.id | https://openalex.org/T10812 |
| primary_topic.field.id | https://openalex.org/fields/17 |
| primary_topic.field.display_name | Computer Science |
| primary_topic.score | 0.9988999962806702 |
| primary_topic.domain.id | https://openalex.org/domains/3 |
| primary_topic.domain.display_name | Physical Sciences |
| primary_topic.subfield.id | https://openalex.org/subfields/1707 |
| primary_topic.subfield.display_name | Computer Vision and Pattern Recognition |
| primary_topic.display_name | Human Pose and Action Recognition |
| related_works | https://openalex.org/W4390421286, https://openalex.org/W4280563792, https://openalex.org/W2140186469, https://openalex.org/W4389724018, https://openalex.org/W4318719684, https://openalex.org/W2775233965, https://openalex.org/W2180954594, https://openalex.org/W2376418092, https://openalex.org/W4318559728, https://openalex.org/W4360995913 |
| cited_by_count | 12 |
| counts_by_year[0].year | 2024 |
| counts_by_year[0].cited_by_count | 4 |
| counts_by_year[1].year | 2023 |
| counts_by_year[1].cited_by_count | 3 |
| counts_by_year[2].year | 2022 |
| counts_by_year[2].cited_by_count | 1 |
| counts_by_year[3].year | 2021 |
| counts_by_year[3].cited_by_count | 2 |
| counts_by_year[4].year | 2020 |
| counts_by_year[4].cited_by_count | 2 |
| locations_count | 1 |
| best_oa_location.id | doi:10.18653/v1/2021.emnlp-main.837 |
| best_oa_location.is_oa | True |
| best_oa_location.source.id | https://openalex.org/S4363608991 |
| best_oa_location.source.issn | |
| best_oa_location.source.type | conference |
| best_oa_location.source.is_oa | False |
| best_oa_location.source.issn_l | |
| best_oa_location.source.is_core | False |
| best_oa_location.source.is_in_doaj | False |
| best_oa_location.source.display_name | Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing |
| best_oa_location.source.host_organization | |
| best_oa_location.source.host_organization_name | |
| best_oa_location.license | cc-by |
| best_oa_location.pdf_url | https://aclanthology.org/2021.emnlp-main.837.pdf |
| best_oa_location.version | publishedVersion |
| best_oa_location.raw_type | proceedings-article |
| best_oa_location.license_id | https://openalex.org/licenses/cc-by |
| best_oa_location.is_accepted | True |
| best_oa_location.is_published | True |
| best_oa_location.raw_source_name | Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing |
| best_oa_location.landing_page_url | https://doi.org/10.18653/v1/2021.emnlp-main.837 |
| primary_location.id | doi:10.18653/v1/2021.emnlp-main.837 |
| primary_location.is_oa | True |
| primary_location.source.id | https://openalex.org/S4363608991 |
| primary_location.source.issn | |
| primary_location.source.type | conference |
| primary_location.source.is_oa | False |
| primary_location.source.issn_l | |
| primary_location.source.is_core | False |
| primary_location.source.is_in_doaj | False |
| primary_location.source.display_name | Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing |
| primary_location.source.host_organization | |
| primary_location.source.host_organization_name | |
| primary_location.license | cc-by |
| primary_location.pdf_url | https://aclanthology.org/2021.emnlp-main.837.pdf |
| primary_location.version | publishedVersion |
| primary_location.raw_type | proceedings-article |
| primary_location.license_id | https://openalex.org/licenses/cc-by |
| primary_location.is_accepted | True |
| primary_location.is_published | True |
| primary_location.raw_source_name | Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing |
| primary_location.landing_page_url | https://doi.org/10.18653/v1/2021.emnlp-main.837 |
| publication_date | 2021-01-01 |
| publication_year | 2021 |
| referenced_works | https://openalex.org/W2121927366, https://openalex.org/W2100438617, https://openalex.org/W2963981733, https://openalex.org/W2913340405, https://openalex.org/W2953106684, https://openalex.org/W60686164, https://openalex.org/W189587150, https://openalex.org/W1522301498, https://openalex.org/W2908510526, https://openalex.org/W2131241448, https://openalex.org/W2099471712, https://openalex.org/W2963073614, https://openalex.org/W2556522401, https://openalex.org/W4313908941, https://openalex.org/W3099577420, https://openalex.org/W2963866308, https://openalex.org/W4301206121, https://openalex.org/W2331128040, https://openalex.org/W2106411961, https://openalex.org/W2619503996, https://openalex.org/W2963815651, https://openalex.org/W2963150697, https://openalex.org/W2432541215, https://openalex.org/W582055897, https://openalex.org/W1498436455, https://openalex.org/W2896457183, https://openalex.org/W2251324968, https://openalex.org/W2962793481, https://openalex.org/W2963470893, https://openalex.org/W2963677766, https://openalex.org/W4239072543, https://openalex.org/W3124229194, https://openalex.org/W2108598243, https://openalex.org/W3108989341, https://openalex.org/W2405673391, https://openalex.org/W4385245566, https://openalex.org/W131533222, https://openalex.org/W1791560514, https://openalex.org/W2133665775 |
| referenced_works_count | 39 |
| abstract_inverted_index.a | 101, 129 |
| abstract_inverted_index.In | 42 |
| abstract_inverted_index.To | 73 |
| abstract_inverted_index.as | 35 |
| abstract_inverted_index.be | 71 |
| abstract_inverted_index.in | 128 |
| abstract_inverted_index.is | 5, 22, 59, 118 |
| abstract_inverted_index.it | 4 |
| abstract_inverted_index.of | 14, 67, 83, 109, 177 |
| abstract_inverted_index.or | 169 |
| abstract_inverted_index.to | 7, 51, 120, 137, 155, 164 |
| abstract_inverted_index.we | 45 |
| abstract_inverted_index.Our | 145 |
| abstract_inverted_index.The | 12, 56 |
| abstract_inverted_index.all | 126 |
| abstract_inverted_index.and | 21, 91, 96, 99, 140, 163 |
| abstract_inverted_index.can | 33 |
| abstract_inverted_index.for | 39, 61, 104 |
| abstract_inverted_index.new | 152 |
| abstract_inverted_index.one | 135 |
| abstract_inverted_index.per | 94, 107 |
| abstract_inverted_index.the | 26, 30, 47, 64, 68, 76, 81, 89, 105, 110, 123, 138, 151, 174 |
| abstract_inverted_index.Adam | 48, 78 |
| abstract_inverted_index.This | 116 |
| abstract_inverted_index.When | 0 |
| abstract_inverted_index.both | 132 |
| abstract_inverted_index.each | 84, 114, 178 |
| abstract_inverted_index.end, | 75 |
| abstract_inverted_index.fast | 156 |
| abstract_inverted_index.from | 113, 134, 158 |
| abstract_inverted_index.loss | 10, 31, 54, 85, 97, 161 |
| abstract_inverted_index.next | 139 |
| abstract_inverted_index.over | 143 |
| abstract_inverted_index.show | 147 |
| abstract_inverted_index.term | 86 |
| abstract_inverted_index.that | 60, 131, 148, 167 |
| abstract_inverted_index.this | 43, 74 |
| abstract_inverted_index.used | 119 |
| abstract_inverted_index.with | 150, 173 |
| abstract_inverted_index.e.g., | 38 |
| abstract_inverted_index.every | 62 |
| abstract_inverted_index.first | 90, 102 |
| abstract_inverted_index.human | 19 |
| abstract_inverted_index.layer | 108, 136 |
| abstract_inverted_index.leads | 154 |
| abstract_inverted_index.loss. | 115 |
| abstract_inverted_index.match | 168 |
| abstract_inverted_index.term, | 98 |
| abstract_inverted_index.terms | 16, 32, 69 |
| abstract_inverted_index.these | 15 |
| abstract_inverted_index.time. | 144 |
| abstract_inverted_index.work, | 44 |
| abstract_inverted_index.across | 125 |
| abstract_inverted_index.change | 34 |
| abstract_inverted_index.common | 6 |
| abstract_inverted_index.effort | 20 |
| abstract_inverted_index.handle | 52 |
| abstract_inverted_index.infers | 88 |
| abstract_inverted_index.layer, | 63 |
| abstract_inverted_index.manner | 130 |
| abstract_inverted_index.method | 153 |
| abstract_inverted_index.moment | 103 |
| abstract_inverted_index.neural | 2 |
| abstract_inverted_index.second | 92 |
| abstract_inverted_index.should | 70 |
| abstract_inverted_index.terms. | 11, 41, 55 |
| abstract_inverted_index.varies | 133 |
| abstract_inverted_index.arising | 112 |
| abstract_inverted_index.balance | 122 |
| abstract_inverted_index.between | 29 |
| abstract_inverted_index.changes | 142 |
| abstract_inverted_index.combine | 8 |
| abstract_inverted_index.guiding | 57 |
| abstract_inverted_index.improve | 170 |
| abstract_inverted_index.initial | 160 |
| abstract_inverted_index.layers, | 127 |
| abstract_inverted_index.method. | 179 |
| abstract_inverted_index.models, | 3 |
| abstract_inverted_index.moments | 93 |
| abstract_inverted_index.optimal | 27 |
| abstract_inverted_index.results | 146 |
| abstract_inverted_index.(MTAdam) | 79 |
| abstract_inverted_index.computes | 80 |
| abstract_inverted_index.gradient | 65 |
| abstract_inverted_index.multiple | 9, 53 |
| abstract_inverted_index.outcomes | 166 |
| abstract_inverted_index.recovery | 157 |
| abstract_inverted_index.requires | 17 |
| abstract_inverted_index.training | 1, 36, 149, 165, 172 |
| abstract_inverted_index.Moreover, | 25 |
| abstract_inverted_index.algorithm | 50 |
| abstract_inverted_index.balanced. | 72 |
| abstract_inverted_index.balancing | 13 |
| abstract_inverted_index.gradients | 111, 124 |
| abstract_inverted_index.magnitude | 66, 106, 117 |
| abstract_inverted_index.parameter | 95 |
| abstract_inverted_index.principle | 58 |
| abstract_inverted_index.trade-off | 28 |
| abstract_inverted_index.weighting | 162 |
| abstract_inverted_index.Multi-Term | 77 |
| abstract_inverted_index.calculates | 100 |
| abstract_inverted_index.demanding. | 24 |
| abstract_inverted_index.derivative | 82 |
| abstract_inverted_index.generalize | 46 |
| abstract_inverted_index.prescribed | 175 |
| abstract_inverted_index.suboptimal | 159 |
| abstract_inverted_index.adversarial | 40 |
| abstract_inverted_index.dynamically | 141 |
| abstract_inverted_index.progresses, | 37 |
| abstract_inverted_index.separately, | 87 |
| abstract_inverted_index.considerable | 18 |
| abstract_inverted_index.continuously | 121 |
| abstract_inverted_index.conventional | 171 |
| abstract_inverted_index.optimization | 49 |
| abstract_inverted_index.computationally | 23 |
| abstract_inverted_index.hyperparameters | 176 |
| cited_by_percentile_year.max | 98 |
| cited_by_percentile_year.min | 89 |
| countries_distinct_count | 1 |
| institutions_distinct_count | 2 |
| citation_normalized_percentile.value | 0.7967128 |
| citation_normalized_percentile.is_in_top_1_percent | False |
| citation_normalized_percentile.is_in_top_10_percent | False |
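
The `abstract_inverted_index` rows in the table map each token to the word positions at which it occurs in the abstract. As a minimal sketch, the readable abstract can be rebuilt from such a mapping as shown below. The function name `abstract_from_inverted_index` is hypothetical, and the input is assumed to be a dictionary from token to a list of integer positions, with the `abstract_inverted_index.` prefix already stripped from the keys.

```python
def abstract_from_inverted_index(inverted_index):
    """Rebuild text from a mapping of token -> list of word positions."""
    positions = {}
    for token, indexes in inverted_index.items():
        for idx in indexes:
            positions[idx] = token
    # Emit tokens in position order, separated by single spaces.
    return " ".join(positions[i] for i in sorted(positions))


# A few entries copied from the table (only positions 0 to 5 are included).
sample = {
    "When": [0],
    "training": [1],
    "neural": [2],
    "models,": [3],
    "it": [4],
    "is": [5],
}
print(abstract_from_inverted_index(sample))
```

Passing the full set of rows in the same form would yield the complete abstract text shown at the top of this record.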