Accelerating Multi-Task Temporal Difference Learning under Low-Rank Representation
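We study policy evaluation problems in multi-task reinforcement learning (RL) under a low-rank representation setting. In this setting, we are given $N$ learning tasks where the corresponding value functions of these tasks lie in an $r$-dimensional subspace, with $r < N$. One can apply the classic temporal-difference (TD) learning method to solve these problems, where this method learns the value function of each task independently. In this paper, we are interested in understanding whether one can exploit the low-rank structure of the multi-task setting to accelerate the performance of TD learning. To answer this question, we propose a new variant of the TD learning method, where we integrate the so-called truncated singular value decomposition step into the update of TD learning. This additional step enables TD learning to exploit the dominant directions arising from the low-rank structure when updating the iterates, thereby improving its performance. Our empirical results show that the proposed method significantly outperforms classic TD learning, where the performance gap increases as the rank $r$ decreases. From the theoretical point of view, introducing the truncated singular value decomposition step into TD learning might cause instability in the updates. We provide a theoretical result showing that this instability does not occur. Specifically, we prove that the proposed method converges at a rate $\mathcal{O}(\frac{\ln(t)}{t})$, where $t$ is the number of iterations. This rate matches that of standard TD learning.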
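The abstract describes adding a truncated singular value decomposition step to the TD update. The following Python sketch illustrates that idea in a small tabular setting; it is not the authors' algorithm. The shared transition matrix `P`, the reward table `R`, the uniform state sampling, the step-size schedule, and the function names `truncated_svd` and `multitask_td_with_truncation` are all assumptions made for illustration.

```python
import numpy as np


def truncated_svd(V, r):
    """Return the best rank-r approximation of V (keep the top r singular triplets)."""
    U, s, Vt = np.linalg.svd(V, full_matrices=False)
    return (U[:, :r] * s[:r]) @ Vt[:r, :]


def multitask_td_with_truncation(P, R, gamma, r, num_iters, seed=0):
    """Tabular TD(0) for N tasks with a rank-r truncation after every update.

    P: (S, S) row-stochastic transition matrix, shared by all tasks (assumption).
    R: (S, N) reward table, one column per task (assumption).
    Returns the (S, N) matrix of value estimates, one column per task.
    """
    rng = np.random.default_rng(seed)
    S, N = R.shape
    V = np.zeros((S, N))
    for t in range(1, num_iters + 1):
        alpha = 1.0 / (1.0 + 0.01 * t)      # illustrative decaying step size
        s = rng.integers(S)                 # sample a state uniformly (assumption)
        s_next = rng.choice(S, p=P[s])      # sample the next state from P
        # TD(0) update of row s, applied to all N tasks at once
        td_error = R[s] + gamma * V[s_next] - V[s]
        V[s] += alpha * td_error
        # Extra step: keep only the r dominant directions of the iterate
        V = truncated_svd(V, r)
    return V


# Small demo: value functions of rank r because the rewards have rank r.
demo_rng = np.random.default_rng(1)
S, N, r, gamma = 6, 4, 2, 0.9
P = demo_rng.random((S, S))
P /= P.sum(axis=1, keepdims=True)
R = demo_rng.normal(size=(S, r)) @ demo_rng.normal(size=(r, N))
V_star = np.linalg.solve(np.eye(S) - gamma * P, R)   # exact values for comparison
V_hat = multitask_td_with_truncation(P, R, gamma, r, num_iters=2000)
print("relative error:", np.linalg.norm(V_hat - V_star) / np.linalg.norm(V_star))
```

In this sketch the low-rank structure comes from the rewards: since every task shares `P`, the exact value matrix `(I - gamma * P)^{-1} R` has rank at most `r`, so projecting the iterate onto its top `r` singular directions does not exclude the true solution.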
Metadata
- Type
- article
- Language
- en
- Landing Page
- http://arxiv.org/abs/2503.02030
- https://arxiv.org/pdf/2503.02030
- OA Status
- green
- OpenAlex ID
- https://openalex.org/W4415334475
All OpenAlex metadata
Raw OpenAlex JSON
- OpenAlex ID
- https://openalex.org/W4415334475 (canonical identifier for this work in OpenAlex)
- Title
- Accelerating Multi-Task Temporal Difference Learning under Low-Rank Representation (work title)
- Type
- article (OpenAlex work type)
- Language
- en (primary language)
- Publication year
- 2025 (year of publication)
- Publication date
- 2025-03-03 (full publication date if available)
- Authors
- Yuanchao Bai, Sihan Zeng, Justin Romberg, Thinh T. Doan (list of authors in order)
- Landing page
- https://arxiv.org/abs/2503.02030 (publisher landing page)
- PDF URL
- https://arxiv.org/pdf/2503.02030 (direct link to full text PDF)
- Open access
- Yes (whether a free full text is available)
- OA status
- green (open access status per OpenAlex)
- OA URL
- https://arxiv.org/pdf/2503.02030 (direct OA link when available)
- Cited by
- 0 (total citation count in OpenAlex)
Full payload
| Field | Value |
|---|---|
| id | https://openalex.org/W4415334475 |
| doi | |
| ids.openalex | https://openalex.org/W4415334475 |
| fwci | 0.0 |
| type | article |
| title | Accelerating Multi-Task Temporal Difference Learning under Low-Rank Representation |
| biblio.issue | |
| biblio.volume | |
| biblio.last_page | |
| biblio.first_page | |
| topics[0].id | https://openalex.org/T11307 |
| topics[0].field.id | https://openalex.org/fields/17 |
| topics[0].field.display_name | Computer Science |
| topics[0].score | 0.9096999764442444 |
| topics[0].domain.id | https://openalex.org/domains/3 |
| topics[0].domain.display_name | Physical Sciences |
| topics[0].subfield.id | https://openalex.org/subfields/1702 |
| topics[0].subfield.display_name | Artificial Intelligence |
| topics[0].display_name | Domain Adaptation and Few-Shot Learning |
| topics[1].id | https://openalex.org/T10812 |
| topics[1].field.id | https://openalex.org/fields/17 |
| topics[1].field.display_name | Computer Science |
| topics[1].score | 0.9020000100135803 |
| topics[1].domain.id | https://openalex.org/domains/3 |
| topics[1].domain.display_name | Physical Sciences |
| topics[1].subfield.id | https://openalex.org/subfields/1707 |
| topics[1].subfield.display_name | Computer Vision and Pattern Recognition |
| topics[1].display_name | Human Pose and Action Recognition |
| is_xpac | False |
| apc_list | |
| apc_paid | |
| language | en |
| locations[0].id | pmh:oai:arXiv.org:2503.02030 |
| locations[0].is_oa | True |
| locations[0].source.id | https://openalex.org/S4306400194 |
| locations[0].source.issn | |
| locations[0].source.type | repository |
| locations[0].source.is_oa | True |
| locations[0].source.issn_l | |
| locations[0].source.is_core | False |
| locations[0].source.is_in_doaj | False |
| locations[0].source.display_name | arXiv (Cornell University) |
| locations[0].source.host_organization | https://openalex.org/I205783295 |
| locations[0].source.host_organization_name | Cornell University |
| locations[0].source.host_organization_lineage | https://openalex.org/I205783295 |
| locations[0].license | |
| locations[0].pdf_url | https://arxiv.org/pdf/2503.02030 |
| locations[0].version | submittedVersion |
| locations[0].raw_type | text |
| locations[0].license_id | |
| locations[0].is_accepted | False |
| locations[0].is_published | False |
| locations[0].raw_source_name | |
| locations[0].landing_page_url | http://arxiv.org/abs/2503.02030 |
| indexed_in | arxiv |
| authorships[0].author.id | https://openalex.org/A5024093994 |
| authorships[0].author.orcid | https://orcid.org/0000-0003-3449-6537 |
| authorships[0].author.display_name | Yuanchao Bai |
| authorships[0].author_position | first |
| authorships[0].raw_author_name | Bai, Yitao |
| authorships[0].is_corresponding | False |
| authorships[1].author.id | https://openalex.org/A5069872754 |
| authorships[1].author.orcid | https://orcid.org/0000-0002-0061-5780 |
| authorships[1].author.display_name | Sihan Zeng |
| authorships[1].author_position | middle |
| authorships[1].raw_author_name | Zeng, Sihan |
| authorships[1].is_corresponding | False |
| authorships[2].author.id | https://openalex.org/A5108137552 |
| authorships[2].author.orcid | |
| authorships[2].author.display_name | Justin Romberg |
| authorships[2].author_position | middle |
| authorships[2].raw_author_name | Romberg, Justin |
| authorships[2].is_corresponding | False |
| authorships[3].author.id | https://openalex.org/A5035207859 |
| authorships[3].author.orcid | https://orcid.org/0000-0001-5135-3429 |
| authorships[3].author.display_name | Thinh T. Doan |
| authorships[3].author_position | last |
| authorships[3].raw_author_name | Doan, Thinh T. |
| authorships[3].is_corresponding | False |
| has_content.pdf | False |
| has_content.grobid_xml | False |
| is_paratext | False |
| open_access.is_oa | True |
| open_access.oa_url | https://arxiv.org/pdf/2503.02030 |
| open_access.oa_status | green |
| open_access.any_repository_has_fulltext | False |
| created_date | 2025-10-19T00:00:00 |
| display_name | Accelerating Multi-Task Temporal Difference Learning under Low-Rank Representation |
| has_fulltext | False |
| is_retracted | False |
| updated_date | 2025-11-06T04:12:42.849631 |
| primary_topic.id | https://openalex.org/T11307 |
| primary_topic.field.id | https://openalex.org/fields/17 |
| primary_topic.field.display_name | Computer Science |
| primary_topic.score | 0.9096999764442444 |
| primary_topic.domain.id | https://openalex.org/domains/3 |
| primary_topic.domain.display_name | Physical Sciences |
| primary_topic.subfield.id | https://openalex.org/subfields/1702 |
| primary_topic.subfield.display_name | Artificial Intelligence |
| primary_topic.display_name | Domain Adaptation and Few-Shot Learning |
| cited_by_count | 0 |
| locations_count | 1 |
| best_oa_location.id | pmh:oai:arXiv.org:2503.02030 |
| best_oa_location.is_oa | True |
| best_oa_location.source.id | https://openalex.org/S4306400194 |
| best_oa_location.source.issn | |
| best_oa_location.source.type | repository |
| best_oa_location.source.is_oa | True |
| best_oa_location.source.issn_l | |
| best_oa_location.source.is_core | False |
| best_oa_location.source.is_in_doaj | False |
| best_oa_location.source.display_name | arXiv (Cornell University) |
| best_oa_location.source.host_organization | https://openalex.org/I205783295 |
| best_oa_location.source.host_organization_name | Cornell University |
| best_oa_location.source.host_organization_lineage | https://openalex.org/I205783295 |
| best_oa_location.license | |
| best_oa_location.pdf_url | https://arxiv.org/pdf/2503.02030 |
| best_oa_location.version | submittedVersion |
| best_oa_location.raw_type | text |
| best_oa_location.license_id | |
| best_oa_location.is_accepted | False |
| best_oa_location.is_published | False |
| best_oa_location.raw_source_name | |
| best_oa_location.landing_page_url | http://arxiv.org/abs/2503.02030 |
| primary_location.id | pmh:oai:arXiv.org:2503.02030 |
| primary_location.is_oa | True |
| primary_location.source.id | https://openalex.org/S4306400194 |
| primary_location.source.issn | |
| primary_location.source.type | repository |
| primary_location.source.is_oa | True |
| primary_location.source.issn_l | |
| primary_location.source.is_core | False |
| primary_location.source.is_in_doaj | False |
| primary_location.source.display_name | arXiv (Cornell University) |
| primary_location.source.host_organization | https://openalex.org/I205783295 |
| primary_location.source.host_organization_name | Cornell University |
| primary_location.source.host_organization_lineage | https://openalex.org/I205783295 |
| primary_location.license | |
| primary_location.pdf_url | https://arxiv.org/pdf/2503.02030 |
| primary_location.version | submittedVersion |
| primary_location.raw_type | text |
| primary_location.license_id | |
| primary_location.is_accepted | False |
| primary_location.is_published | False |
| primary_location.raw_source_name | |
| primary_location.landing_page_url | http://arxiv.org/abs/2503.02030 |
| publication_date | 2025-03-03 |
| publication_year | 2025 |
| referenced_works_count | 0 |
| abstract_inverted_index.a | 11, 95, 193, 212 |
| abstract_inverted_index.In | 15, 63 |
| abstract_inverted_index.TD | 87, 99, 116, 123, 156, 182, 229 |
| abstract_inverted_index.To | 89 |
| abstract_inverted_index.We | 0, 191 |
| abstract_inverted_index.an | 34, 186 |
| abstract_inverted_index.as | 163 |
| abstract_inverted_index.at | 211 |
| abstract_inverted_index.in | 5, 33, 69 |
| abstract_inverted_index.is | 217 |
| abstract_inverted_index.of | 29, 59, 78, 86, 98, 115, 172, 220, 226 |
| abstract_inverted_index.on | 188 |
| abstract_inverted_index.to | 82, 125, 131, 136 |
| abstract_inverted_index.we | 18, 66, 93, 103, 204 |
| abstract_inverted_index.$N$ | 21 |
| abstract_inverted_index.$r$ | 166 |
| abstract_inverted_index.$t$ | 216 |
| abstract_inverted_index.One | 39 |
| abstract_inverted_index.Our | 144 |
| abstract_inverted_index.are | 19, 67 |
| abstract_inverted_index.can | 40, 73 |
| abstract_inverted_index.due | 130 |
| abstract_inverted_index.for | 48 |
| abstract_inverted_index.gap | 161 |
| abstract_inverted_index.its | 142 |
| abstract_inverted_index.lie | 32 |
| abstract_inverted_index.low | 133 |
| abstract_inverted_index.new | 96 |
| abstract_inverted_index.not | 201 |
| abstract_inverted_index.one | 72 |
| abstract_inverted_index.the | 25, 42, 56, 75, 79, 84, 105, 113, 127, 132, 138, 149, 154, 159, 164, 169, 175, 189, 198, 207, 218, 227 |
| abstract_inverted_index.(RL) | 9 |
| abstract_inverted_index.(TD) | 45 |
| abstract_inverted_index.From | 168 |
| abstract_inverted_index.This | 118, 222 |
| abstract_inverted_index.does | 200 |
| abstract_inverted_index.each | 60 |
| abstract_inverted_index.into | 112, 181 |
| abstract_inverted_index.rank | 134, 165 |
| abstract_inverted_index.rate | 213, 223 |
| abstract_inverted_index.show | 147 |
| abstract_inverted_index.step | 111, 120, 180 |
| abstract_inverted_index.task | 61 |
| abstract_inverted_index.that | 148, 197, 206, 225 |
| abstract_inverted_index.this | 16, 53, 64, 91 |
| abstract_inverted_index.will | 121 |
| abstract_inverted_index.with | 37 |
| abstract_inverted_index.apply | 41 |
| abstract_inverted_index.cause | 185 |
| abstract_inverted_index.given | 20 |
| abstract_inverted_index.might | 184 |
| abstract_inverted_index.point | 171 |
| abstract_inverted_index.prove | 205 |
| abstract_inverted_index.study | 1 |
| abstract_inverted_index.tasks | 23, 31 |
| abstract_inverted_index.these | 30, 50 |
| abstract_inverted_index.under | 10 |
| abstract_inverted_index.value | 27, 57, 109, 178 |
| abstract_inverted_index.view, | 173 |
| abstract_inverted_index.where | 24, 52, 102, 158, 215 |
| abstract_inverted_index.$r<N$. | 38 |
| abstract_inverted_index.answer | 90 |
| abstract_inverted_index.enable | 122 |
| abstract_inverted_index.learns | 55 |
| abstract_inverted_index.method | 47, 54, 151, 209 |
| abstract_inverted_index.number | 219 |
| abstract_inverted_index.paper, | 65 |
| abstract_inverted_index.policy | 2 |
| abstract_inverted_index.result | 195 |
| abstract_inverted_index.update | 114, 137 |
| abstract_inverted_index.classic | 43, 155 |
| abstract_inverted_index.exploit | 74, 126 |
| abstract_inverted_index.happen. | 202 |
| abstract_inverted_index.matches | 224 |
| abstract_inverted_index.method, | 101 |
| abstract_inverted_index.propose | 94 |
| abstract_inverted_index.provide | 192 |
| abstract_inverted_index.results | 146 |
| abstract_inverted_index.setting | 81 |
| abstract_inverted_index.showing | 196 |
| abstract_inverted_index.solving | 49 |
| abstract_inverted_index.variant | 97 |
| abstract_inverted_index.whether | 71 |
| abstract_inverted_index.dominant | 128 |
| abstract_inverted_index.function | 28, 58 |
| abstract_inverted_index.learning | 8, 22, 46, 100, 124, 183 |
| abstract_inverted_index.low-rank | 12, 76 |
| abstract_inverted_index.problems | 4, 51 |
| abstract_inverted_index.proposed | 150, 208 |
| abstract_inverted_index.setting, | 17 |
| abstract_inverted_index.setting. | 14 |
| abstract_inverted_index.singular | 108, 177 |
| abstract_inverted_index.standard | 228 |
| abstract_inverted_index.updates. | 190 |
| abstract_inverted_index.converges | 210 |
| abstract_inverted_index.empirical | 145 |
| abstract_inverted_index.improving | 141 |
| abstract_inverted_index.increases | 162 |
| abstract_inverted_index.integrate | 104 |
| abstract_inverted_index.iterates, | 139 |
| abstract_inverted_index.learning, | 157 |
| abstract_inverted_index.learning. | 88, 117, 230 |
| abstract_inverted_index.question, | 92 |
| abstract_inverted_index.so-called | 106 |
| abstract_inverted_index.structure | 77, 135 |
| abstract_inverted_index.subspace, | 36 |
| abstract_inverted_index.truncated | 107, 176 |
| abstract_inverted_index.accelerate | 83 |
| abstract_inverted_index.additional | 119 |
| abstract_inverted_index.decreases. | 167 |
| abstract_inverted_index.directions | 129 |
| abstract_inverted_index.evaluation | 3 |
| abstract_inverted_index.interested | 68 |
| abstract_inverted_index.multi-task | 6, 80 |
| abstract_inverted_index.therefore, | 140 |
| abstract_inverted_index.instability | 187, 199 |
| abstract_inverted_index.introducing | 174 |
| abstract_inverted_index.iterations. | 221 |
| abstract_inverted_index.outperforms | 153 |
| abstract_inverted_index.performance | 85, 160 |
| abstract_inverted_index.theoretical | 170, 194 |
| abstract_inverted_index.performance. | 143 |
| abstract_inverted_index.Specifically, | 203 |
| abstract_inverted_index.corresponding | 26 |
| abstract_inverted_index.decomposition | 110, 179 |
| abstract_inverted_index.reinforcement | 7 |
| abstract_inverted_index.significantly | 152 |
| abstract_inverted_index.understanding | 70 |
| abstract_inverted_index.independently. | 62 |
| abstract_inverted_index.representation | 13 |
| abstract_inverted_index.$r$-dimensional | 35 |
| abstract_inverted_index.temporal-difference | 44 |
| abstract_inverted_index.$\mathcal{O}(\frac{\ln(t)}{t})$, | 214 |
| cited_by_percentile_year | |
| countries_distinct_count | 0 |
| institutions_distinct_count | 4 |
| citation_normalized_percentile.value | 0.22635621 |
| citation_normalized_percentile.is_in_top_1_percent | False |
| citation_normalized_percentile.is_in_top_10_percent | True |
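
The `abstract_inverted_index` rows in the payload above store the abstract as a mapping from each word to the positions where it occurs. A minimal Python sketch of turning such a mapping back into running text is shown below; the function name `rebuild_abstract` is hypothetical, and the `sample` dictionary contains only the entries for positions 0 to 9 taken from the rows above, not the full index.

```python
def rebuild_abstract(inverted_index):
    """Rebuild text from a {word: [positions]} mapping by sorting on position."""
    words_by_position = {}
    for word, positions in inverted_index.items():
        for position in positions:
            words_by_position[position] = word
    return " ".join(words_by_position[i] for i in sorted(words_by_position))


# Subset of the payload rows (positions 0-9 only), written as a dictionary.
sample = {
    "We": [0],
    "study": [1],
    "policy": [2],
    "evaluation": [3],
    "problems": [4],
    "in": [5],
    "multi-task": [6],
    "reinforcement": [7],
    "learning": [8],
    "(RL)": [9],
}

print(rebuild_abstract(sample))
# prints: We study policy evaluation problems in multi-task reinforcement learning (RL)
```

Applied to the full set of rows, the same procedure yields the complete abstract, including the portion that is cut off where the page displays it.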