An Analysis on Automated Metrics for Evaluating Japanese-English Chat Translation
2024 · Open Access · DOI: https://doi.org/10.48550/arxiv.2412.18190
This paper analyses how traditional baseline metrics, such as BLEU and TER, and neural-based methods, such as BERTScore and COMET, score the performance of several NMT models on chat translation, and how these metrics compare with human-annotated scores. The results show that, for ranking NMT models on chat translation, all metrics seem consistent in deciding which model outperforms the others. This implies that traditional baseline metrics, which are faster and simpler to use, can still be helpful. In terms of correlation with human judgment, however, neural-based metrics outperform traditional metrics, with COMET achieving the highest correlation with the human-annotated scores on chat translation. Nevertheless, we show that even the best metric struggles when scoring English translations of Japanese sentences that contain anaphoric zero-pronouns.
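The comparison against human judgment described above can be illustrated with a small sketch. It assumes Pearson correlation over segment-level scores; the abstract does not state which coefficient the authors used, and the score lists below are hypothetical values, not results from the paper.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length score lists."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two lists of equal length with at least 2 items")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of co-deviations and sums of squared deviations from the means.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant scores")
    return cov / sqrt(var_x * var_y)


# Hypothetical per-segment scores (illustration only, not data from the paper).
human_scores = [1, 2, 3, 4, 5]
metric_a_scores = [0.2, 0.4, 0.6, 0.8, 1.0]  # tracks the human scores closely
metric_b_scores = [0.5, 0.1, 0.9, 0.3, 0.7]  # tracks them only loosely

for name, scores in [("metric_a", metric_a_scores), ("metric_b", metric_b_scores)]:
    print(name, round(pearson(human_scores, scores), 3))
```

A metric whose coefficient is closer to 1 agrees more closely with the human annotations. A rank-based coefficient such as Spearman's or Kendall's could be substituted if only the ordering of segments matters.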
Related Topics
- Type: preprint
- Language: en
- Landing Page: http://arxiv.org/abs/2412.18190, https://arxiv.org/pdf/2412.18190
- OA Status: green
- Related Works: 10
- OpenAlex ID: https://openalex.org/W4405783866
Raw OpenAlex JSON
- OpenAlex ID: https://openalex.org/W4405783866 (canonical identifier for this work in OpenAlex)
- DOI: https://doi.org/10.48550/arxiv.2412.18190 (Digital Object Identifier)
- Title: An Analysis on Automated Metrics for Evaluating Japanese-English Chat Translation (work title)
- Type: preprint (OpenAlex work type)
- Language: en (primary language)
- Publication year: 2024 (year of publication)
- Publication date: 2024-12-24 (full publication date if available)
- Authors: Andre Rusli, Makoto Shishido (list of authors in order)
- Landing page: https://arxiv.org/abs/2412.18190 (publisher landing page)
- PDF URL: https://arxiv.org/pdf/2412.18190 (direct link to full text PDF)
- Open access: Yes (whether a free full text is available)
- OA status: green (open access status per OpenAlex)
- OA URL: https://arxiv.org/pdf/2412.18190 (direct OA link when available)
- Concepts: Computer science, Translation (biology), Natural language processing, Data science, Artificial intelligence, World Wide Web, Chemistry, Biochemistry, Gene, Messenger RNA (top concepts, i.e. fields/topics, attached by OpenAlex)
- Cited by: 0 (total citation count in OpenAlex)
- Related works (count): 10 (other works algorithmically related by OpenAlex)
Full payload
| id | https://openalex.org/W4405783866 |
|---|---|
| doi | https://doi.org/10.48550/arxiv.2412.18190 |
| ids.doi | https://doi.org/10.48550/arxiv.2412.18190 |
| ids.openalex | https://openalex.org/W4405783866 |
| fwci | |
| type | preprint |
| title | An Analysis on Automated Metrics for Evaluating Japanese-English Chat Translation |
| biblio.issue | |
| biblio.volume | |
| biblio.last_page | |
| biblio.first_page | |
| topics[0].id | https://openalex.org/T10181 |
| topics[0].field.id | https://openalex.org/fields/17 |
| topics[0].field.display_name | Computer Science |
| topics[0].score | 0.8528000116348267 |
| topics[0].domain.id | https://openalex.org/domains/3 |
| topics[0].domain.display_name | Physical Sciences |
| topics[0].subfield.id | https://openalex.org/subfields/1702 |
| topics[0].subfield.display_name | Artificial Intelligence |
| topics[0].display_name | Natural Language Processing Techniques |
| is_xpac | False |
| apc_list | |
| apc_paid | |
| concepts[0].id | https://openalex.org/C41008148 |
| concepts[0].level | 0 |
| concepts[0].score | 0.6569323539733887 |
| concepts[0].wikidata | https://www.wikidata.org/wiki/Q21198 |
| concepts[0].display_name | Computer science |
| concepts[1].id | https://openalex.org/C149364088 |
| concepts[1].level | 4 |
| concepts[1].score | 0.6542978286743164 |
| concepts[1].wikidata | https://www.wikidata.org/wiki/Q185917 |
| concepts[1].display_name | Translation (biology) |
| concepts[2].id | https://openalex.org/C204321447 |
| concepts[2].level | 1 |
| concepts[2].score | 0.5424313545227051 |
| concepts[2].wikidata | https://www.wikidata.org/wiki/Q30642 |
| concepts[2].display_name | Natural language processing |
| concepts[3].id | https://openalex.org/C2522767166 |
| concepts[3].level | 1 |
| concepts[3].score | 0.3725373148918152 |
| concepts[3].wikidata | https://www.wikidata.org/wiki/Q2374463 |
| concepts[3].display_name | Data science |
| concepts[4].id | https://openalex.org/C154945302 |
| concepts[4].level | 1 |
| concepts[4].score | 0.35382556915283203 |
| concepts[4].wikidata | https://www.wikidata.org/wiki/Q11660 |
| concepts[4].display_name | Artificial intelligence |
| concepts[5].id | https://openalex.org/C136764020 |
| concepts[5].level | 1 |
| concepts[5].score | 0.32919228076934814 |
| concepts[5].wikidata | https://www.wikidata.org/wiki/Q466 |
| concepts[5].display_name | World Wide Web |
| concepts[6].id | https://openalex.org/C185592680 |
| concepts[6].level | 0 |
| concepts[6].score | 0.0 |
| concepts[6].wikidata | https://www.wikidata.org/wiki/Q2329 |
| concepts[6].display_name | Chemistry |
| concepts[7].id | https://openalex.org/C55493867 |
| concepts[7].level | 1 |
| concepts[7].score | 0.0 |
| concepts[7].wikidata | https://www.wikidata.org/wiki/Q7094 |
| concepts[7].display_name | Biochemistry |
| concepts[8].id | https://openalex.org/C104317684 |
| concepts[8].level | 2 |
| concepts[8].score | 0.0 |
| concepts[8].wikidata | https://www.wikidata.org/wiki/Q7187 |
| concepts[8].display_name | Gene |
| concepts[9].id | https://openalex.org/C105580179 |
| concepts[9].level | 3 |
| concepts[9].score | 0.0 |
| concepts[9].wikidata | https://www.wikidata.org/wiki/Q188928 |
| concepts[9].display_name | Messenger RNA |
| keywords[0].id | https://openalex.org/keywords/computer-science |
| keywords[0].score | 0.6569323539733887 |
| keywords[0].display_name | Computer science |
| keywords[1].id | https://openalex.org/keywords/translation |
| keywords[1].score | 0.6542978286743164 |
| keywords[1].display_name | Translation (biology) |
| keywords[2].id | https://openalex.org/keywords/natural-language-processing |
| keywords[2].score | 0.5424313545227051 |
| keywords[2].display_name | Natural language processing |
| keywords[3].id | https://openalex.org/keywords/data-science |
| keywords[3].score | 0.3725373148918152 |
| keywords[3].display_name | Data science |
| keywords[4].id | https://openalex.org/keywords/artificial-intelligence |
| keywords[4].score | 0.35382556915283203 |
| keywords[4].display_name | Artificial intelligence |
| keywords[5].id | https://openalex.org/keywords/world-wide-web |
| keywords[5].score | 0.32919228076934814 |
| keywords[5].display_name | World Wide Web |
| language | en |
| locations[0].id | pmh:oai:arXiv.org:2412.18190 |
| locations[0].is_oa | True |
| locations[0].source.id | https://openalex.org/S4306400194 |
| locations[0].source.issn | |
| locations[0].source.type | repository |
| locations[0].source.is_oa | True |
| locations[0].source.issn_l | |
| locations[0].source.is_core | False |
| locations[0].source.is_in_doaj | False |
| locations[0].source.display_name | arXiv (Cornell University) |
| locations[0].source.host_organization | https://openalex.org/I205783295 |
| locations[0].source.host_organization_name | Cornell University |
| locations[0].source.host_organization_lineage | https://openalex.org/I205783295 |
| locations[0].license | |
| locations[0].pdf_url | https://arxiv.org/pdf/2412.18190 |
| locations[0].version | publishedVersion |
| locations[0].raw_type | text |
| locations[0].license_id | |
| locations[0].is_accepted | True |
| locations[0].is_published | True |
| locations[0].raw_source_name | |
| locations[0].landing_page_url | http://arxiv.org/abs/2412.18190 |
| locations[1].id | doi:10.48550/arxiv.2412.18190 |
| locations[1].is_oa | True |
| locations[1].source.id | https://openalex.org/S4306400194 |
| locations[1].source.issn | |
| locations[1].source.type | repository |
| locations[1].source.is_oa | True |
| locations[1].source.issn_l | |
| locations[1].source.is_core | False |
| locations[1].source.is_in_doaj | False |
| locations[1].source.display_name | arXiv (Cornell University) |
| locations[1].source.host_organization | https://openalex.org/I205783295 |
| locations[1].source.host_organization_name | Cornell University |
| locations[1].source.host_organization_lineage | https://openalex.org/I205783295 |
| locations[1].license | |
| locations[1].pdf_url | |
| locations[1].version | |
| locations[1].raw_type | article |
| locations[1].license_id | |
| locations[1].is_accepted | False |
| locations[1].is_published | |
| locations[1].raw_source_name | |
| locations[1].landing_page_url | https://doi.org/10.48550/arxiv.2412.18190 |
| indexed_in | arxiv, datacite |
| authorships[0].author.id | https://openalex.org/A5034479802 |
| authorships[0].author.orcid | https://orcid.org/0000-0002-7907-2394 |
| authorships[0].author.display_name | Andre Rusli |
| authorships[0].author_position | first |
| authorships[0].raw_author_name | Rusli, Andre |
| authorships[0].is_corresponding | False |
| authorships[1].author.id | https://openalex.org/A5067026252 |
| authorships[1].author.orcid | |
| authorships[1].author.display_name | Makoto Shishido |
| authorships[1].author_position | last |
| authorships[1].raw_author_name | Shishido, Makoto |
| authorships[1].is_corresponding | False |
| has_content.pdf | False |
| has_content.grobid_xml | False |
| is_paratext | False |
| open_access.is_oa | True |
| open_access.oa_url | https://arxiv.org/pdf/2412.18190 |
| open_access.oa_status | green |
| open_access.any_repository_has_fulltext | False |
| created_date | 2024-12-26T00:00:00 |
| display_name | An Analysis on Automated Metrics for Evaluating Japanese-English Chat Translation |
| has_fulltext | False |
| is_retracted | False |
| updated_date | 2025-11-06T06:51:31.235846 |
| primary_topic.id | https://openalex.org/T10181 |
| primary_topic.field.id | https://openalex.org/fields/17 |
| primary_topic.field.display_name | Computer Science |
| primary_topic.score | 0.8528000116348267 |
| primary_topic.domain.id | https://openalex.org/domains/3 |
| primary_topic.domain.display_name | Physical Sciences |
| primary_topic.subfield.id | https://openalex.org/subfields/1702 |
| primary_topic.subfield.display_name | Artificial Intelligence |
| primary_topic.display_name | Natural Language Processing Techniques |
| related_works | https://openalex.org/W2748952813, https://openalex.org/W3188962172, https://openalex.org/W2772917594, https://openalex.org/W4312825515, https://openalex.org/W2512040214, https://openalex.org/W4306742369, https://openalex.org/W2778913187, https://openalex.org/W4303457083, https://openalex.org/W2131146434, https://openalex.org/W3204019825 |
| cited_by_count | 0 |
| locations_count | 2 |
| best_oa_location.id | pmh:oai:arXiv.org:2412.18190 |
| best_oa_location.is_oa | True |
| best_oa_location.source.id | https://openalex.org/S4306400194 |
| best_oa_location.source.issn | |
| best_oa_location.source.type | repository |
| best_oa_location.source.is_oa | True |
| best_oa_location.source.issn_l | |
| best_oa_location.source.is_core | False |
| best_oa_location.source.is_in_doaj | False |
| best_oa_location.source.display_name | arXiv (Cornell University) |
| best_oa_location.source.host_organization | https://openalex.org/I205783295 |
| best_oa_location.source.host_organization_name | Cornell University |
| best_oa_location.source.host_organization_lineage | https://openalex.org/I205783295 |
| best_oa_location.license | |
| best_oa_location.pdf_url | https://arxiv.org/pdf/2412.18190 |
| best_oa_location.version | publishedVersion |
| best_oa_location.raw_type | text |
| best_oa_location.license_id | |
| best_oa_location.is_accepted | True |
| best_oa_location.is_published | True |
| best_oa_location.raw_source_name | |
| best_oa_location.landing_page_url | http://arxiv.org/abs/2412.18190 |
| primary_location.id | pmh:oai:arXiv.org:2412.18190 |
| primary_location.is_oa | True |
| primary_location.source.id | https://openalex.org/S4306400194 |
| primary_location.source.issn | |
| primary_location.source.type | repository |
| primary_location.source.is_oa | True |
| primary_location.source.issn_l | |
| primary_location.source.is_core | False |
| primary_location.source.is_in_doaj | False |
| primary_location.source.display_name | arXiv (Cornell University) |
| primary_location.source.host_organization | https://openalex.org/I205783295 |
| primary_location.source.host_organization_name | Cornell University |
| primary_location.source.host_organization_lineage | https://openalex.org/I205783295 |
| primary_location.license | |
| primary_location.pdf_url | https://arxiv.org/pdf/2412.18190 |
| primary_location.version | publishedVersion |
| primary_location.raw_type | text |
| primary_location.license_id | |
| primary_location.is_accepted | True |
| primary_location.is_published | True |
| primary_location.raw_source_name | |
| primary_location.landing_page_url | http://arxiv.org/abs/2412.18190 |
| publication_date | 2024-12-24 |
| publication_year | 2024 |
| referenced_works_count | 0 |
| abstract_inverted_index.a | 106 |
| abstract_inverted_index.On | 77 |
| abstract_inverted_index.as | 8, 16 |
| abstract_inverted_index.be | 75 |
| abstract_inverted_index.in | 46, 53, 127 |
| abstract_inverted_index.it | 82 |
| abstract_inverted_index.on | 25, 105 |
| abstract_inverted_index.to | 35, 71, 84 |
| abstract_inverted_index.we | 110 |
| abstract_inverted_index.NMT | 22, 44 |
| abstract_inverted_index.The | 38 |
| abstract_inverted_index.all | 49 |
| abstract_inverted_index.and | 10, 12, 18, 28, 69 |
| abstract_inverted_index.are | 67 |
| abstract_inverted_index.can | 73 |
| abstract_inverted_index.for | 42 |
| abstract_inverted_index.how | 3, 29 |
| abstract_inverted_index.the | 58, 78, 98, 102, 114 |
| abstract_inverted_index.BLEU | 9 |
| abstract_inverted_index.TER, | 11 |
| abstract_inverted_index.This | 0, 60 |
| abstract_inverted_index.best | 115 |
| abstract_inverted_index.chat | 26, 47, 107 |
| abstract_inverted_index.even | 113 |
| abstract_inverted_index.from | 122 |
| abstract_inverted_index.seem | 51 |
| abstract_inverted_index.show | 40, 111 |
| abstract_inverted_index.such | 7, 15 |
| abstract_inverted_index.that | 41, 62, 112 |
| abstract_inverted_index.use, | 72 |
| abstract_inverted_index.when | 33, 81, 118 |
| abstract_inverted_index.with | 87, 95, 101, 124 |
| abstract_inverted_index.COMET | 96 |
| abstract_inverted_index.comes | 83 |
| abstract_inverted_index.hand, | 80 |
| abstract_inverted_index.human | 88 |
| abstract_inverted_index.model | 56 |
| abstract_inverted_index.other | 79 |
| abstract_inverted_index.paper | 1 |
| abstract_inverted_index.score | 20, 104 |
| abstract_inverted_index.still | 74 |
| abstract_inverted_index.these | 30 |
| abstract_inverted_index.which | 55, 66 |
| abstract_inverted_index.COMET, | 19 |
| abstract_inverted_index.better | 85 |
| abstract_inverted_index.faster | 68 |
| abstract_inverted_index.metric | 116 |
| abstract_inverted_index.models | 23, 45 |
| abstract_inverted_index.English | 120 |
| abstract_inverted_index.highest | 99 |
| abstract_inverted_index.implies | 61 |
| abstract_inverted_index.metrics | 31, 50, 91 |
| abstract_inverted_index.others. | 59 |
| abstract_inverted_index.perform | 32 |
| abstract_inverted_index.ranking | 43 |
| abstract_inverted_index.results | 39 |
| abstract_inverted_index.scores. | 37 |
| abstract_inverted_index.scoring | 119 |
| abstract_inverted_index.several | 21 |
| abstract_inverted_index.simpler | 70 |
| abstract_inverted_index.However, | 109 |
| abstract_inverted_index.analyses | 2 |
| abstract_inverted_index.baseline | 5, 64 |
| abstract_inverted_index.compared | 34 |
| abstract_inverted_index.deciding | 54 |
| abstract_inverted_index.helpful. | 76 |
| abstract_inverted_index.methods, | 14 |
| abstract_inverted_index.metrics, | 6, 65, 94 |
| abstract_inverted_index.BERTScore | 17 |
| abstract_inverted_index.Japanese. | 128 |
| abstract_inverted_index.achieving | 97 |
| abstract_inverted_index.anaphoric | 125 |
| abstract_inverted_index.judgment, | 89 |
| abstract_inverted_index.sentences | 123 |
| abstract_inverted_index.struggles | 117 |
| abstract_inverted_index.consistent | 52 |
| abstract_inverted_index.outperform | 92 |
| abstract_inverted_index.correlation | 86, 100 |
| abstract_inverted_index.outperforms | 57 |
| abstract_inverted_index.performance | 24 |
| abstract_inverted_index.traditional | 4, 63, 93 |
| abstract_inverted_index.translation | 27 |
| abstract_inverted_index.neural-based | 13, 90 |
| abstract_inverted_index.translation. | 108 |
| abstract_inverted_index.translations | 121 |
| abstract_inverted_index.zero-pronoun | 126 |
| abstract_inverted_index.translations, | 48 |
| abstract_inverted_index.human-annotated | 36, 103 |
| cited_by_percentile_year | |
| countries_distinct_count | 0 |
| institutions_distinct_count | 2 |
| citation_normalized_percentile | |
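
The `abstract_inverted_index` rows in the payload above map each token of the abstract to the zero-based positions where it occurs. A minimal sketch of turning such a mapping back into running text follows; the function name `rebuild_abstract` is hypothetical, and the sample dictionary is a truncated excerpt that keeps only positions 0 to 9 from the table.

```python
def rebuild_abstract(inverted_index):
    """Rebuild text from a token -> list-of-positions mapping."""
    by_position = {}
    for token, positions in inverted_index.items():
        for pos in positions:
            by_position[pos] = token
    # Emit tokens in position order, joined by single spaces.
    return " ".join(by_position[pos] for pos in sorted(by_position))


# Truncated excerpt of the payload (positions 0-9 only), for illustration.
excerpt = {
    "This": [0],
    "paper": [1],
    "analyses": [2],
    "how": [3],
    "traditional": [4],
    "baseline": [5],
    "metrics,": [6],
    "such": [7],
    "as": [8],
    "BLEU": [9],
}

print(rebuild_abstract(excerpt))
# prints: This paper analyses how traditional baseline metrics, such as BLEU
```

With the full mapping, the same function should reproduce the whole abstract, since punctuation is stored as part of each token.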