Design of Reward Functions for Autonomous Driving Based on Reinforcement Learning: Balancing Safety and Efficiency
Autonomous driving, leveraging artificial intelligence and reinforcement learning (RL), has made significant strides in improving traffic efficiency and safety. However, current RL-based approaches often focus on single-objective optimization, such as maximizing either efficiency or safety. In real-world driving, multiple conflicting objectives, such as safety, efficiency, and comfort, must be balanced simultaneously, which remains underexplored. This paper proposes a multi-objective reward function design to balance safety and efficiency in autonomous driving. Using the Proximal Policy Optimization (PPO) algorithm, we train seven autonomous driving models with varying collision penalty strategies in the MetaDrive simulation environment. The results show that dynamic collision penalties outperform fixed penalties in balancing safety and efficiency, with Model 5 achieving the best overall performance. Despite this, all models underperform in left-turn scenarios, highlighting the need for further optimization in lateral control. This work provides insights into effective reward design for multi-objective reinforcement learning in autonomous driving.
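The contrast the abstract draws between fixed and dynamic collision penalties can be illustrated with a small sketch. The abstract does not state how the paper defines its dynamic penalty, so everything below is an assumption made for illustration: the `StepInfo` fields, the speed-scaled form of `dynamic_collision_penalty`, and all weights and constants are hypothetical and are not taken from the paper or from MetaDrive.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class StepInfo:
    """Hypothetical per-step signals an environment might expose."""
    progress_m: float  # longitudinal progress made this step, in metres
    speed_mps: float   # current speed, in metres per second
    crashed: bool      # whether a collision occurred this step


def fixed_collision_penalty(info: StepInfo, penalty: float = 5.0) -> float:
    """Constant penalty on any collision, regardless of circumstances."""
    return penalty if info.crashed else 0.0


def dynamic_collision_penalty(
    info: StepInfo, base: float = 2.0, per_mps: float = 0.5
) -> float:
    """Assumed dynamic form: the penalty grows with speed at impact."""
    if not info.crashed:
        return 0.0
    return base + per_mps * info.speed_mps


def reward(
    info: StepInfo,
    collision_penalty: Callable[[StepInfo], float],
    w_progress: float = 1.0,
    w_speed: float = 0.1,
    max_speed_mps: float = 22.0,
) -> float:
    """Efficiency term (progress plus normalized speed) minus a safety term."""
    speed_ratio = min(info.speed_mps / max_speed_mps, 1.0)
    efficiency = w_progress * info.progress_m + w_speed * speed_ratio
    return efficiency - collision_penalty(info)


fast_crash = StepInfo(progress_m=1.0, speed_mps=20.0, crashed=True)
slow_crash = StepInfo(progress_m=1.0, speed_mps=2.0, crashed=True)
print(reward(fast_crash, fixed_collision_penalty))
print(reward(fast_crash, dynamic_collision_penalty))
print(reward(slow_crash, dynamic_collision_penalty))
```

Under these assumed constants, the dynamic variant punishes a high-speed crash more heavily than the fixed variant and a low-speed crash more lightly, which is one plausible way a penalty could trade safety against efficiency.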
- Type: article
- Language: en
- Landing Page: https://doi.org/10.54254/2755-2721/2025.tj21921, https://www.ewadirect.com/proceedings/ace/article/view/21921/pdf
- OA Status: hybrid
- Cited By: 1
- References: 11
- Related Works: 10
- OpenAlex ID: https://openalex.org/W4409314994
Raw OpenAlex JSON
- OpenAlex ID: https://openalex.org/W4409314994 (canonical identifier for this work in OpenAlex)
- DOI: https://doi.org/10.54254/2755-2721/2025.tj21921 (Digital Object Identifier)
- Title: Design of Reward Functions for Autonomous Driving Based on Reinforcement Learning: Balancing Safety and Efficiency (work title)
- Type: article (OpenAlex work type)
- Language: en (primary language)
- Publication year: 2025 (year of publication)
- Publication date: 2025-04-10 (full publication date if available)
- Authors: Wenjun Zhang (list of authors in order)
- Landing page: https://doi.org/10.54254/2755-2721/2025.tj21921 (publisher landing page)
- PDF URL: https://www.ewadirect.com/proceedings/ace/article/view/21921/pdf (direct link to full text PDF)
- Open access: Yes (whether a free full text is available)
- OA status: hybrid (open access status per OpenAlex)
- OA URL: https://www.ewadirect.com/proceedings/ace/article/view/21921/pdf (direct OA link when available)
- Concepts: Reinforcement learning, Reinforcement, Computer science, Cognitive psychology, Psychology, Artificial intelligence, Social psychology (top concepts, i.e. fields/topics, attached by OpenAlex)
- Cited by: 1 (total citation count in OpenAlex)
- Citations by year (recent): 2025: 1 (per-year citation counts, last 5 years)
- References (count): 11 (number of works referenced by this work)
- Related works (count): 10 (other works algorithmically related by OpenAlex)
Full payload
| id | https://openalex.org/W4409314994 |
|---|---|
| doi | https://doi.org/10.54254/2755-2721/2025.tj21921 |
| ids.doi | https://doi.org/10.54254/2755-2721/2025.tj21921 |
| ids.openalex | https://openalex.org/W4409314994 |
| fwci | 2.17737573 |
| type | article |
| title | Design of Reward Functions for Autonomous Driving Based on Reinforcement Learning: Balancing Safety and Efficiency |
| biblio.issue | 1 |
| biblio.volume | 146 |
| biblio.last_page | 22 |
| biblio.first_page | 9 |
| topics[0].id | https://openalex.org/T11099 |
| topics[0].field.id | https://openalex.org/fields/22 |
| topics[0].field.display_name | Engineering |
| topics[0].score | 0.890999972820282 |
| topics[0].domain.id | https://openalex.org/domains/3 |
| topics[0].domain.display_name | Physical Sciences |
| topics[0].subfield.id | https://openalex.org/subfields/2203 |
| topics[0].subfield.display_name | Automotive Engineering |
| topics[0].display_name | Autonomous Vehicle Technology and Safety |
| is_xpac | False |
| apc_list | |
| apc_paid | |
| concepts[0].id | https://openalex.org/C97541855 |
| concepts[0].level | 2 |
| concepts[0].score | 0.8032110333442688 |
| concepts[0].wikidata | https://www.wikidata.org/wiki/Q830687 |
| concepts[0].display_name | Reinforcement learning |
| concepts[1].id | https://openalex.org/C67203356 |
| concepts[1].level | 2 |
| concepts[1].score | 0.6667987704277039 |
| concepts[1].wikidata | https://www.wikidata.org/wiki/Q1321905 |
| concepts[1].display_name | Reinforcement |
| concepts[2].id | https://openalex.org/C41008148 |
| concepts[2].level | 0 |
| concepts[2].score | 0.4453504979610443 |
| concepts[2].wikidata | https://www.wikidata.org/wiki/Q21198 |
| concepts[2].display_name | Computer science |
| concepts[3].id | https://openalex.org/C180747234 |
| concepts[3].level | 1 |
| concepts[3].score | 0.3612349331378937 |
| concepts[3].wikidata | https://www.wikidata.org/wiki/Q23373 |
| concepts[3].display_name | Cognitive psychology |
| concepts[4].id | https://openalex.org/C15744967 |
| concepts[4].level | 0 |
| concepts[4].score | 0.3404175043106079 |
| concepts[4].wikidata | https://www.wikidata.org/wiki/Q9418 |
| concepts[4].display_name | Psychology |
| concepts[5].id | https://openalex.org/C154945302 |
| concepts[5].level | 1 |
| concepts[5].score | 0.2560131847858429 |
| concepts[5].wikidata | https://www.wikidata.org/wiki/Q11660 |
| concepts[5].display_name | Artificial intelligence |
| concepts[6].id | https://openalex.org/C77805123 |
| concepts[6].level | 1 |
| concepts[6].score | 0.17270830273628235 |
| concepts[6].wikidata | https://www.wikidata.org/wiki/Q161272 |
| concepts[6].display_name | Social psychology |
| keywords[0].id | https://openalex.org/keywords/reinforcement-learning |
| keywords[0].score | 0.8032110333442688 |
| keywords[0].display_name | Reinforcement learning |
| keywords[1].id | https://openalex.org/keywords/reinforcement |
| keywords[1].score | 0.6667987704277039 |
| keywords[1].display_name | Reinforcement |
| keywords[2].id | https://openalex.org/keywords/computer-science |
| keywords[2].score | 0.4453504979610443 |
| keywords[2].display_name | Computer science |
| keywords[3].id | https://openalex.org/keywords/cognitive-psychology |
| keywords[3].score | 0.3612349331378937 |
| keywords[3].display_name | Cognitive psychology |
| keywords[4].id | https://openalex.org/keywords/psychology |
| keywords[4].score | 0.3404175043106079 |
| keywords[4].display_name | Psychology |
| keywords[5].id | https://openalex.org/keywords/artificial-intelligence |
| keywords[5].score | 0.2560131847858429 |
| keywords[5].display_name | Artificial intelligence |
| keywords[6].id | https://openalex.org/keywords/social-psychology |
| keywords[6].score | 0.17270830273628235 |
| keywords[6].display_name | Social psychology |
| language | en |
| locations[0].id | doi:10.54254/2755-2721/2025.tj21921 |
| locations[0].is_oa | True |
| locations[0].source.id | https://openalex.org/S4387281889 |
| locations[0].source.issn | 2755-2721, 2755-273X |
| locations[0].source.type | journal |
| locations[0].source.is_oa | False |
| locations[0].source.issn_l | 2755-2721 |
| locations[0].source.is_core | True |
| locations[0].source.is_in_doaj | False |
| locations[0].source.display_name | Applied and Computational Engineering |
| locations[0].source.host_organization | |
| locations[0].source.host_organization_name | |
| locations[0].license | cc-by |
| locations[0].pdf_url | https://www.ewadirect.com/proceedings/ace/article/view/21921/pdf |
| locations[0].version | publishedVersion |
| locations[0].raw_type | journal-article |
| locations[0].license_id | https://openalex.org/licenses/cc-by |
| locations[0].is_accepted | True |
| locations[0].is_published | True |
| locations[0].raw_source_name | Applied and Computational Engineering |
| locations[0].landing_page_url | https://doi.org/10.54254/2755-2721/2025.tj21921 |
| indexed_in | crossref |
| authorships[0].author.id | https://openalex.org/A5100447801 |
| authorships[0].author.orcid | https://orcid.org/0000-0001-7973-8769 |
| authorships[0].author.display_name | Wenjun Zhang |
| authorships[0].countries | CN |
| authorships[0].affiliations[0].institution_ids | https://openalex.org/I139759216 |
| authorships[0].affiliations[0].raw_affiliation_string | International School, Beijing University of Posts and Telecommunications |
| authorships[0].institutions[0].id | https://openalex.org/I139759216 |
| authorships[0].institutions[0].ror | https://ror.org/04w9fbh59 |
| authorships[0].institutions[0].type | education |
| authorships[0].institutions[0].lineage | https://openalex.org/I139759216 |
| authorships[0].institutions[0].country_code | CN |
| authorships[0].institutions[0].display_name | Beijing University of Posts and Telecommunications |
| authorships[0].author_position | first |
| authorships[0].raw_author_name | Weijin Zhang |
| authorships[0].is_corresponding | True |
| authorships[0].raw_affiliation_strings | International School, Beijing University of Posts and Telecommunications |
| has_content.pdf | True |
| has_content.grobid_xml | True |
| is_paratext | False |
| open_access.is_oa | True |
| open_access.oa_url | https://www.ewadirect.com/proceedings/ace/article/view/21921/pdf |
| open_access.oa_status | hybrid |
| open_access.any_repository_has_fulltext | False |
| created_date | 2025-10-10T00:00:00 |
| display_name | Design of Reward Functions for Autonomous Driving Based on Reinforcement Learning: Balancing Safety and Efficiency |
| has_fulltext | True |
| is_retracted | False |
| updated_date | 2025-11-06T03:46:38.306776 |
| primary_topic.id | https://openalex.org/T11099 |
| primary_topic.field.id | https://openalex.org/fields/22 |
| primary_topic.field.display_name | Engineering |
| primary_topic.score | 0.890999972820282 |
| primary_topic.domain.id | https://openalex.org/domains/3 |
| primary_topic.domain.display_name | Physical Sciences |
| primary_topic.subfield.id | https://openalex.org/subfields/2203 |
| primary_topic.subfield.display_name | Automotive Engineering |
| primary_topic.display_name | Autonomous Vehicle Technology and Safety |
| related_works | https://openalex.org/W4391375266, https://openalex.org/W2899084033, https://openalex.org/W2748952813, https://openalex.org/W4310083477, https://openalex.org/W2328553770, https://openalex.org/W2920061524, https://openalex.org/W1977959518, https://openalex.org/W2038908348, https://openalex.org/W2107890255, https://openalex.org/W2106552856 |
| cited_by_count | 1 |
| counts_by_year[0].year | 2025 |
| counts_by_year[0].cited_by_count | 1 |
| locations_count | 1 |
| best_oa_location.id | doi:10.54254/2755-2721/2025.tj21921 |
| best_oa_location.is_oa | True |
| best_oa_location.source.id | https://openalex.org/S4387281889 |
| best_oa_location.source.issn | 2755-2721, 2755-273X |
| best_oa_location.source.type | journal |
| best_oa_location.source.is_oa | False |
| best_oa_location.source.issn_l | 2755-2721 |
| best_oa_location.source.is_core | True |
| best_oa_location.source.is_in_doaj | False |
| best_oa_location.source.display_name | Applied and Computational Engineering |
| best_oa_location.source.host_organization | |
| best_oa_location.source.host_organization_name | |
| best_oa_location.license | cc-by |
| best_oa_location.pdf_url | https://www.ewadirect.com/proceedings/ace/article/view/21921/pdf |
| best_oa_location.version | publishedVersion |
| best_oa_location.raw_type | journal-article |
| best_oa_location.license_id | https://openalex.org/licenses/cc-by |
| best_oa_location.is_accepted | True |
| best_oa_location.is_published | True |
| best_oa_location.raw_source_name | Applied and Computational Engineering |
| best_oa_location.landing_page_url | https://doi.org/10.54254/2755-2721/2025.tj21921 |
| primary_location.id | doi:10.54254/2755-2721/2025.tj21921 |
| primary_location.is_oa | True |
| primary_location.source.id | https://openalex.org/S4387281889 |
| primary_location.source.issn | 2755-2721, 2755-273X |
| primary_location.source.type | journal |
| primary_location.source.is_oa | False |
| primary_location.source.issn_l | 2755-2721 |
| primary_location.source.is_core | True |
| primary_location.source.is_in_doaj | False |
| primary_location.source.display_name | Applied and Computational Engineering |
| primary_location.source.host_organization | |
| primary_location.source.host_organization_name | |
| primary_location.license | cc-by |
| primary_location.pdf_url | https://www.ewadirect.com/proceedings/ace/article/view/21921/pdf |
| primary_location.version | publishedVersion |
| primary_location.raw_type | journal-article |
| primary_location.license_id | https://openalex.org/licenses/cc-by |
| primary_location.is_accepted | True |
| primary_location.is_published | True |
| primary_location.raw_source_name | Applied and Computational Engineering |
| primary_location.landing_page_url | https://doi.org/10.54254/2755-2721/2025.tj21921 |
| publication_date | 2025-04-10 |
| publication_year | 2025 |
| referenced_works | https://openalex.org/W6677174981, https://openalex.org/W6770443880, https://openalex.org/W1592601589, https://openalex.org/W3003533476, https://openalex.org/W2623431351, https://openalex.org/W2781726626, https://openalex.org/W6740801417, https://openalex.org/W1777239053, https://openalex.org/W1983738939, https://openalex.org/W2032924574, https://openalex.org/W6779562873 |
| referenced_works_count | 11 |
| abstract_inverted_index.5 | 108 |
| abstract_inverted_index.a | 55 |
| abstract_inverted_index.In | 35 |
| abstract_inverted_index.as | 29, 41 |
| abstract_inverted_index.be | 46 |
| abstract_inverted_index.in | 13, 65, 86, 101, 119, 128, 143 |
| abstract_inverted_index.on | 25 |
| abstract_inverted_index.or | 33 |
| abstract_inverted_index.to | 60 |
| abstract_inverted_index.we | 75 |
| abstract_inverted_index.The | 91 |
| abstract_inverted_index.all | 116 |
| abstract_inverted_index.and | 5, 17, 44, 63, 104 |
| abstract_inverted_index.for | 125, 139 |
| abstract_inverted_index.has | 9 |
| abstract_inverted_index.the | 69, 87, 110, 123 |
| abstract_inverted_index.This | 52, 131 |
| abstract_inverted_index.best | 111 |
| abstract_inverted_index.into | 135 |
| abstract_inverted_index.made | 10 |
| abstract_inverted_index.need | 124 |
| abstract_inverted_index.show | 93 |
| abstract_inverted_index.such | 28 |
| abstract_inverted_index.that | 94 |
| abstract_inverted_index.with | 81, 106 |
| abstract_inverted_index.work | 132 |
| abstract_inverted_index.(PPO) | 73 |
| abstract_inverted_index.(RL), | 8 |
| abstract_inverted_index.Model | 107 |
| abstract_inverted_index.Using | 68 |
| abstract_inverted_index.fixed | 99 |
| abstract_inverted_index.focus | 24 |
| abstract_inverted_index.often | 23 |
| abstract_inverted_index.paper | 53 |
| abstract_inverted_index.seven | 77 |
| abstract_inverted_index.this, | 115 |
| abstract_inverted_index.train | 76 |
| abstract_inverted_index.which | 49 |
| abstract_inverted_index.Policy | 71 |
| abstract_inverted_index.design | 59, 138 |
| abstract_inverted_index.either | 31 |
| abstract_inverted_index.models | 80, 117 |
| abstract_inverted_index.reward | 57, 137 |
| abstract_inverted_index.safety | 62, 103 |
| abstract_inverted_index.Despite | 114 |
| abstract_inverted_index.balance | 61 |
| abstract_inverted_index.current | 20 |
| abstract_inverted_index.driving | 79 |
| abstract_inverted_index.dynamic | 95 |
| abstract_inverted_index.further | 126 |
| abstract_inverted_index.lateral | 129 |
| abstract_inverted_index.overall | 112 |
| abstract_inverted_index.penalty | 84 |
| abstract_inverted_index.remains | 50 |
| abstract_inverted_index.results | 92 |
| abstract_inverted_index.safety, | 42 |
| abstract_inverted_index.safety. | 18, 34 |
| abstract_inverted_index.strides | 12 |
| abstract_inverted_index.traffic | 15 |
| abstract_inverted_index.varying | 82 |
| abstract_inverted_index.However, | 19 |
| abstract_inverted_index.Proximal | 70 |
| abstract_inverted_index.RL-based | 21 |
| abstract_inverted_index.balanced | 47 |
| abstract_inverted_index.control. | 130 |
| abstract_inverted_index.driving, | 1, 37 |
| abstract_inverted_index.driving. | 67, 145 |
| abstract_inverted_index.function | 58 |
| abstract_inverted_index.insights | 134 |
| abstract_inverted_index.learning | 7, 142 |
| abstract_inverted_index.multiple | 38 |
| abstract_inverted_index.proposes | 54 |
| abstract_inverted_index.provides | 133 |
| abstract_inverted_index.MetaDrive | 88 |
| abstract_inverted_index.achieving | 109 |
| abstract_inverted_index.balancing | 102 |
| abstract_inverted_index.collision | 83, 96 |
| abstract_inverted_index.effective | 136 |
| abstract_inverted_index.improving | 14 |
| abstract_inverted_index.left-turn | 120 |
| abstract_inverted_index.penalties | 97, 100 |
| abstract_inverted_index.Autonomous | 0 |
| abstract_inverted_index.algorithm, | 74 |
| abstract_inverted_index.approaches | 22 |
| abstract_inverted_index.artificial | 3 |
| abstract_inverted_index.autonomous | 66, 78, 144 |
| abstract_inverted_index.efficiency | 16, 32, 64 |
| abstract_inverted_index.leveraging | 2 |
| abstract_inverted_index.maximizing | 30 |
| abstract_inverted_index.outperform | 98 |
| abstract_inverted_index.real-world | 36 |
| abstract_inverted_index.scenarios, | 121 |
| abstract_inverted_index.simulation | 89 |
| abstract_inverted_index.strategies | 85 |
| abstract_inverted_index.comfortmust | 45 |
| abstract_inverted_index.conflicting | 39 |
| abstract_inverted_index.efficiency, | 43, 105 |
| abstract_inverted_index.significant | 11 |
| abstract_inverted_index.Optimization | 72 |
| abstract_inverted_index.environment. | 90 |
| abstract_inverted_index.highlighting | 122 |
| abstract_inverted_index.intelligence | 4 |
| abstract_inverted_index.optimization | 127 |
| abstract_inverted_index.performance. | 113 |
| abstract_inverted_index.underperform | 118 |
| abstract_inverted_index.optimization, | 27 |
| abstract_inverted_index.reinforcement | 6, 141 |
| abstract_inverted_index.objectivessuch | 40 |
| abstract_inverted_index.underexplored. | 51 |
| abstract_inverted_index.multi-objective | 56, 140 |
| abstract_inverted_index.simultaneously, | 48 |
| abstract_inverted_index.single-objective | 26 |
| cited_by_percentile_year.max | 95 |
| cited_by_percentile_year.min | 91 |
| corresponding_author_ids | https://openalex.org/A5100447801 |
| countries_distinct_count | 1 |
| institutions_distinct_count | 1 |
| corresponding_institution_ids | https://openalex.org/I139759216 |
| citation_normalized_percentile.value | 0.7632577 |
| citation_normalized_percentile.is_in_top_1_percent | False |
| citation_normalized_percentile.is_in_top_10_percent | True |
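
The `abstract_inverted_index.*` rows in the payload map each token of the abstract to the word positions where it occurs. A minimal sketch of turning such a mapping back into running text follows; the function name `rebuild_abstract` is hypothetical, and `sample` is a hand-picked excerpt restricted to positions 0 through 5 of the rows above, not the full index.

```python
def rebuild_abstract(inverted_index: dict[str, list[int]]) -> str:
    """Place every token at each of its positions, then join in order."""
    by_position: dict[int, str] = {}
    for token, positions in inverted_index.items():
        for pos in positions:
            by_position[pos] = token
    return " ".join(by_position[pos] for pos in sorted(by_position))


# Excerpt of the payload rows, limited to positions 0..5 for brevity.
sample = {
    "Autonomous": [0],
    "driving,": [1],
    "leveraging": [2],
    "artificial": [3],
    "intelligence": [4],
    "and": [5],
}
print(rebuild_abstract(sample))
```

Because tokens are stored exactly as they appear in the source text, any extraction damage in the abstract (for example the fused tokens `objectivessuch` and `comfortmust`) is carried through unchanged by this reconstruction.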