Episodic Contextual Bandits with Knapsacks under Conversion Models
2025 · Open Access · DOI: https://doi.org/10.48550/arxiv.2507.06859
We study an online setting in which a decision maker (DM) interacts with contextual bandit-with-knapsack (BwK) instances over repeated episodes. The episodes start with different resource amounts, and the contexts' probability distributions are non-stationary within an episode. All episodes share the same latent conversion model, which governs the random outcome given a request's context and an allocation decision. Our model captures applications such as dynamic pricing of perishable resources with episodic replenishment, and first-price auctions in repeated episodes with different starting budgets. We design an online algorithm whose regret is sub-linear in $T$, the number of episodes, assuming access to a "confidence bound oracle" that achieves $o(T)$ regret. Such an oracle is readily available from the existing contextual bandit literature. We overcome the technical challenge posed by arbitrarily many possible contexts, which leads to a reinforcement learning problem with an unbounded state space. When the DM is provided with unlabeled feature data, our framework yields improved regret bounds in certain settings, which is novel in the contextual BwK literature.
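To make the setting concrete, the following is a minimal simulation sketch of the episodic structure described in the abstract, not the authors' algorithm. Everything specific in it is an assumption made for illustration: the logistic form of the conversion model, the two-dimensional context, the drifting context mean, the unit resource consumption per conversion, and the hypothetical names `conversion_probability`, `run_episode`, and `fixed_price`.

```python
import math
import random

def conversion_probability(theta, context, price):
    """Hypothetical logistic conversion model: P(conversion | context, price)."""
    score = sum(t * x for t, x in zip(theta, context)) - price
    return 1.0 / (1.0 + math.exp(-score))

def run_episode(theta, budget, horizon, price_policy, rng):
    """Simulate one episode and return (revenue, units_left)."""
    revenue = 0.0
    remaining = budget
    for step in range(horizon):
        if remaining <= 0:
            break  # knapsack constraint: stop once the resource runs out
        # Non-stationary contexts (assumed): the mean drifts within the episode.
        drift = step / horizon
        context = [rng.gauss(drift, 1.0), 1.0]
        price = price_policy(context, remaining, horizon - step)
        if rng.random() < conversion_probability(theta, context, price):
            revenue += price
            remaining -= 1  # each conversion consumes one unit (assumed)
    return revenue, remaining

def fixed_price(context, remaining, steps_left):
    """Placeholder policy: always quote the same price."""
    return 1.0

rng = random.Random(0)
theta = [0.5, 1.0]          # latent parameter shared by all episodes
budgets = (5, 10, 20)       # episodes start with different resource amounts
results = [run_episode(theta, b, 50, fixed_price, rng) for b in budgets]
```

A learning algorithm would replace `fixed_price` with a policy that estimates `theta` across episodes; the sketch only fixes the interaction protocol that such a policy would face.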
- Type: preprint
- Language: en
- Landing Page: http://arxiv.org/abs/2507.06859 (PDF: https://arxiv.org/pdf/2507.06859)
- OA Status: green
- OpenAlex ID: https://openalex.org/W4416104240
Raw OpenAlex JSON
- OpenAlex ID: https://openalex.org/W4416104240 (canonical identifier for this work in OpenAlex)
- DOI: https://doi.org/10.48550/arxiv.2507.06859 (Digital Object Identifier)
- Title: Episodic Contextual Bandits with Knapsacks under Conversion Models (work title)
- Type: preprint (OpenAlex work type)
- Language: en (primary language)
- Publication year: 2025 (year of publication)
- Publication date: 2025-07-09 (full publication date if available)
- Authors: Ziyi Li, Wang Cheung (list of authors in order)
- Landing page: https://arxiv.org/abs/2507.06859 (publisher landing page)
- PDF URL: https://arxiv.org/pdf/2507.06859 (direct link to full text PDF)
- Open access: Yes (whether a free full text is available)
- OA status: green (open access status per OpenAlex)
- OA URL: https://arxiv.org/pdf/2507.06859 (direct OA link when available)
- Cited by: 0 (total citation count in OpenAlex)
Full payload
| id | https://openalex.org/W4416104240 |
|---|---|
| doi | https://doi.org/10.48550/arxiv.2507.06859 |
| ids.doi | https://doi.org/10.48550/arxiv.2507.06859 |
| ids.openalex | https://openalex.org/W4416104240 |
| fwci | |
| type | preprint |
| title | Episodic Contextual Bandits with Knapsacks under Conversion Models |
| biblio.issue | |
| biblio.volume | |
| biblio.last_page | |
| biblio.first_page | |
| is_xpac | False |
| apc_list | |
| apc_paid | |
| language | en |
| locations[0].id | pmh:oai:arXiv.org:2507.06859 |
| locations[0].is_oa | True |
| locations[0].source.id | https://openalex.org/S4306400194 |
| locations[0].source.issn | |
| locations[0].source.type | repository |
| locations[0].source.is_oa | True |
| locations[0].source.issn_l | |
| locations[0].source.is_core | False |
| locations[0].source.is_in_doaj | False |
| locations[0].source.display_name | arXiv (Cornell University) |
| locations[0].source.host_organization | https://openalex.org/I205783295 |
| locations[0].source.host_organization_name | Cornell University |
| locations[0].source.host_organization_lineage | https://openalex.org/I205783295 |
| locations[0].license | |
| locations[0].pdf_url | https://arxiv.org/pdf/2507.06859 |
| locations[0].version | submittedVersion |
| locations[0].raw_type | text |
| locations[0].license_id | |
| locations[0].is_accepted | False |
| locations[0].is_published | False |
| locations[0].raw_source_name | |
| locations[0].landing_page_url | http://arxiv.org/abs/2507.06859 |
| locations[1].id | doi:10.48550/arxiv.2507.06859 |
| locations[1].is_oa | True |
| locations[1].source.id | https://openalex.org/S4306400194 |
| locations[1].source.issn | |
| locations[1].source.type | repository |
| locations[1].source.is_oa | True |
| locations[1].source.issn_l | |
| locations[1].source.is_core | False |
| locations[1].source.is_in_doaj | False |
| locations[1].source.display_name | arXiv (Cornell University) |
| locations[1].source.host_organization | https://openalex.org/I205783295 |
| locations[1].source.host_organization_name | Cornell University |
| locations[1].source.host_organization_lineage | https://openalex.org/I205783295 |
| locations[1].license | cc-by |
| locations[1].pdf_url | |
| locations[1].version | |
| locations[1].raw_type | article |
| locations[1].license_id | https://openalex.org/licenses/cc-by |
| locations[1].is_accepted | False |
| locations[1].is_published | |
| locations[1].raw_source_name | |
| locations[1].landing_page_url | https://doi.org/10.48550/arxiv.2507.06859 |
| indexed_in | arxiv, datacite |
| authorships[0].author.id | https://openalex.org/A5100389146 |
| authorships[0].author.orcid | https://orcid.org/0009-0007-2708-2471 |
| authorships[0].author.display_name | Ziyi Li |
| authorships[0].author_position | first |
| authorships[0].raw_author_name | Li, Zitian |
| authorships[0].is_corresponding | False |
| authorships[1].author.id | https://openalex.org/A5120341144 |
| authorships[1].author.orcid | |
| authorships[1].author.display_name | Wang Cheung |
| authorships[1].author_position | last |
| authorships[1].raw_author_name | Cheung, Wang Chi |
| authorships[1].is_corresponding | False |
| has_content.pdf | False |
| has_content.grobid_xml | False |
| is_paratext | False |
| open_access.is_oa | True |
| open_access.oa_url | https://arxiv.org/pdf/2507.06859 |
| open_access.oa_status | green |
| open_access.any_repository_has_fulltext | False |
| created_date | 2025-10-10T00:00:00 |
| display_name | Episodic Contextual Bandits with Knapsacks under Conversion Models |
| has_fulltext | False |
| is_retracted | False |
| updated_date | 2025-11-28T08:54:21.124428 |
| primary_topic | |
| cited_by_count | 0 |
| locations_count | 2 |
| best_oa_location.id | pmh:oai:arXiv.org:2507.06859 |
| best_oa_location.is_oa | True |
| best_oa_location.source.id | https://openalex.org/S4306400194 |
| best_oa_location.source.issn | |
| best_oa_location.source.type | repository |
| best_oa_location.source.is_oa | True |
| best_oa_location.source.issn_l | |
| best_oa_location.source.is_core | False |
| best_oa_location.source.is_in_doaj | False |
| best_oa_location.source.display_name | arXiv (Cornell University) |
| best_oa_location.source.host_organization | https://openalex.org/I205783295 |
| best_oa_location.source.host_organization_name | Cornell University |
| best_oa_location.source.host_organization_lineage | https://openalex.org/I205783295 |
| best_oa_location.license | |
| best_oa_location.pdf_url | https://arxiv.org/pdf/2507.06859 |
| best_oa_location.version | submittedVersion |
| best_oa_location.raw_type | text |
| best_oa_location.license_id | |
| best_oa_location.is_accepted | False |
| best_oa_location.is_published | False |
| best_oa_location.raw_source_name | |
| best_oa_location.landing_page_url | http://arxiv.org/abs/2507.06859 |
| primary_location.id | pmh:oai:arXiv.org:2507.06859 |
| primary_location.is_oa | True |
| primary_location.source.id | https://openalex.org/S4306400194 |
| primary_location.source.issn | |
| primary_location.source.type | repository |
| primary_location.source.is_oa | True |
| primary_location.source.issn_l | |
| primary_location.source.is_core | False |
| primary_location.source.is_in_doaj | False |
| primary_location.source.display_name | arXiv (Cornell University) |
| primary_location.source.host_organization | https://openalex.org/I205783295 |
| primary_location.source.host_organization_name | Cornell University |
| primary_location.source.host_organization_lineage | https://openalex.org/I205783295 |
| primary_location.license | |
| primary_location.pdf_url | https://arxiv.org/pdf/2507.06859 |
| primary_location.version | submittedVersion |
| primary_location.raw_type | text |
| primary_location.license_id | |
| primary_location.is_accepted | False |
| primary_location.is_published | False |
| primary_location.raw_source_name | |
| primary_location.landing_page_url | http://arxiv.org/abs/2507.06859 |
| publication_date | 2025-07-09 |
| publication_year | 2025 |
| referenced_works_count | 0 |
| abstract_inverted_index.a | 6, 51, 90, 102, 134 |
| abstract_inverted_index.DM | 154 |
| abstract_inverted_index.We | 0, 83, 121 |
| abstract_inverted_index.an | 2, 34, 55, 85, 108, 111, 139 |
| abstract_inverted_index.as | 63 |
| abstract_inverted_index.in | 16, 33, 76, 93, 149 |
| abstract_inverted_index.is | 113, 155, 162 |
| abstract_inverted_index.of | 97 |
| abstract_inverted_index.on | 66 |
| abstract_inverted_index.to | 101, 133, 164 |
| abstract_inverted_index.All | 36 |
| abstract_inverted_index.BwK | 167 |
| abstract_inverted_index.Our | 58, 143 |
| abstract_inverted_index.and | 26, 54, 72 |
| abstract_inverted_index.are | 31 |
| abstract_inverted_index.the | 27, 39, 46, 95, 123, 153, 165 |
| abstract_inverted_index.$T$, | 94 |
| abstract_inverted_index.(DM) | 9 |
| abstract_inverted_index.Such | 110 |
| abstract_inverted_index.from | 116 |
| abstract_inverted_index.many | 128 |
| abstract_inverted_index.same | 40 |
| abstract_inverted_index.such | 62 |
| abstract_inverted_index.that | 88, 106 |
| abstract_inverted_index.upon | 50 |
| abstract_inverted_index.when | 152 |
| abstract_inverted_index.with | 11, 22, 69, 79, 126, 138, 157 |
| abstract_inverted_index.(BwK) | 14 |
| abstract_inverted_index.These | 19 |
| abstract_inverted_index.bound | 104 |
| abstract_inverted_index.data, | 160 |
| abstract_inverted_index.first | 73 |
| abstract_inverted_index.leads | 132 |
| abstract_inverted_index.maker | 8 |
| abstract_inverted_index.model | 59 |
| abstract_inverted_index.novel | 163 |
| abstract_inverted_index.price | 74 |
| abstract_inverted_index.share | 38 |
| abstract_inverted_index.start | 21 |
| abstract_inverted_index.state | 141 |
| abstract_inverted_index.study | 1 |
| abstract_inverted_index.where | 5 |
| abstract_inverted_index.which | 44, 131, 161 |
| abstract_inverted_index.access | 100 |
| abstract_inverted_index.bandit | 119 |
| abstract_inverted_index.bounds | 148 |
| abstract_inverted_index.design | 84 |
| abstract_inverted_index.latent | 41 |
| abstract_inverted_index.model, | 43 |
| abstract_inverted_index.number | 96 |
| abstract_inverted_index.online | 3, 86 |
| abstract_inverted_index.oracle | 112 |
| abstract_inverted_index.random | 47 |
| abstract_inverted_index.regret | 91, 147 |
| abstract_inverted_index.space. | 142 |
| abstract_inverted_index.certain | 150 |
| abstract_inverted_index.context | 53 |
| abstract_inverted_index.dynamic | 64 |
| abstract_inverted_index.feature | 159 |
| abstract_inverted_index.governs | 45 |
| abstract_inverted_index.oracle} | 105 |
| abstract_inverted_index.outcome | 48 |
| abstract_inverted_index.pricing | 65 |
| abstract_inverted_index.problem | 137 |
| abstract_inverted_index.readily | 114 |
| abstract_inverted_index.achieves | 89, 107 |
| abstract_inverted_index.amounts, | 25 |
| abstract_inverted_index.assuming | 99 |
| abstract_inverted_index.auctions | 75 |
| abstract_inverted_index.budgets. | 82 |
| abstract_inverted_index.captures | 60 |
| abstract_inverted_index.decision | 7 |
| abstract_inverted_index.episode. | 35 |
| abstract_inverted_index.episodes | 20, 37, 78 |
| abstract_inverted_index.episodic | 70 |
| abstract_inverted_index.existing | 117 |
| abstract_inverted_index.improved | 146 |
| abstract_inverted_index.learning | 136 |
| abstract_inverted_index.overcome | 122 |
| abstract_inverted_index.possible | 129 |
| abstract_inverted_index.provided | 156 |
| abstract_inverted_index.provides | 145 |
| abstract_inverted_index.repeated | 17, 77 |
| abstract_inverted_index.resource | 24 |
| abstract_inverted_index.setting, | 4 |
| abstract_inverted_index.settings | 151 |
| abstract_inverted_index.starting | 81 |
| abstract_inverted_index.algorithm | 87 |
| abstract_inverted_index.available | 115 |
| abstract_inverted_index.challenge | 125 |
| abstract_inverted_index.contexts' | 28 |
| abstract_inverted_index.contexts, | 130 |
| abstract_inverted_index.decision. | 57 |
| abstract_inverted_index.different | 23, 80 |
| abstract_inverted_index.episodes, | 98 |
| abstract_inverted_index.episodes. | 18 |
| abstract_inverted_index.framework | 144 |
| abstract_inverted_index.instances | 15 |
| abstract_inverted_index.interacts | 10 |
| abstract_inverted_index.request's | 52 |
| abstract_inverted_index.resources | 68 |
| abstract_inverted_index.technical | 124 |
| abstract_inverted_index.unbounded | 140 |
| abstract_inverted_index.unlabeled | 158 |
| abstract_inverted_index.allocation | 56 |
| abstract_inverted_index.contextual | 12, 118, 166 |
| abstract_inverted_index.contingent | 49 |
| abstract_inverted_index.conversion | 42 |
| abstract_inverted_index.perishable | 67 |
| abstract_inverted_index.sub-linear | 92 |
| abstract_inverted_index.arbitrarily | 127 |
| abstract_inverted_index.literature. | 120, 168 |
| abstract_inverted_index.probability | 29 |
| abstract_inverted_index.applications | 61 |
| abstract_inverted_index.distributions | 30 |
| abstract_inverted_index.reinforcement | 135 |
| abstract_inverted_index.$o(T)$-regret. | 109 |
| abstract_inverted_index.non-stationary | 32 |
| abstract_inverted_index.replenishment, | 71 |
| abstract_inverted_index.\emph{confidence | 103 |
| abstract_inverted_index.bandit-with-knapsack | 13 |
| cited_by_percentile_year | |
| countries_distinct_count | 0 |
| institutions_distinct_count | 2 |
| citation_normalized_percentile | |
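
The `abstract_inverted_index` rows above store the abstract as a map from each word to the positions where it occurs. A minimal sketch of turning such rows back into running text follows; the helper names `parse_positions` and `rebuild_abstract` are hypothetical, and the sample is trimmed to positions 0 through 4 of the table for brevity.

```python
def parse_positions(cell):
    """Parse a table cell such as '0, 83, 121' into a list of ints."""
    return [int(part) for part in cell.split(",") if part.strip()]

def rebuild_abstract(inverted_index):
    """Turn {word: [positions]} back into the ordered word sequence."""
    by_position = {}
    for word, positions in inverted_index.items():
        for position in positions:
            by_position[position] = word
    return " ".join(by_position[p] for p in sorted(by_position))

# Rows copied from the table, trimmed to positions 0-4 for brevity.
rows = {"We": "0", "study": "1", "an": "2", "online": "3", "setting,": "4"}
index = {word: parse_positions(cell) for word, cell in rows.items()}
abstract_start = rebuild_abstract(index)
print(abstract_start)  # We study an online setting,
```

Feeding all rows with their full position lists through the same two helpers would reproduce the whole abstract, since every position from 0 to 168 appears exactly once in the table.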