Constructing Per-Shot Bitrate Ladders using Visual Information Fidelity
2024 · Open Access · DOI: https://doi.org/10.48550/arxiv.2408.01932
Video service providers need their delivery systems to be able to adapt to network conditions, user preferences, display settings, and other factors. HTTP Adaptive Streaming (HAS) offers dynamic switching between different video representations to simultaneously enhance bandwidth consumption and users' streaming experiences. Per-shot encoding, pioneered by Netflix, optimizes the encoding parameters on each scene or shot. The Dynamic Optimizer (DO) uses the Video Multi-Method Assessment Fusion (VMAF) perceptual video quality prediction engine to deliver high-quality videos at reduced bitrates. Here we develop a perceptually optimized method of constructing optimal per-shot bitrate and quality ladders, using an ensemble of low-level features and Visual Information Fidelity (VIF) features. During inference, our method predicts the bitrate or quality ladder of a source video without any compression or quality estimation. We compare the performance of our model against other content-adaptive bitrate ladder prediction methods, a fixed bitrate ladder, and reference bitrate ladders constructed via exhaustive encoding using Bjontegaard-delta (BD) metrics. Our proposed method shows excellent gains in bitrate and quality against the fixed bitrate ladder and only small losses against the reference bitrate ladder, while providing significant computational advantages.
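The abstract compares ladders using Bjontegaard-delta (BD) metrics. As a rough illustration of what a BD-rate figure expresses, the sketch below averages the log-bitrate gap between two rate-quality curves over their shared quality range. It is a simplified variant under stated assumptions: it uses piecewise-linear interpolation rather than the cubic fit of the standard Bjontegaard procedure, the function names `_interp` and `bd_rate_linear` are hypothetical, and the sample points are made up for illustration and are not results from the paper.

```python
import math


def _interp(x, xs, ys):
    """Piecewise-linear interpolation of (xs, ys) at x; xs must be increasing."""
    for i in range(len(xs) - 1):
        x0, x1 = xs[i], xs[i + 1]
        if x0 <= x <= x1:
            return ys[i] + (ys[i + 1] - ys[i]) * (x - x0) / (x1 - x0)
    raise ValueError("x lies outside the curve's quality range")


def bd_rate_linear(ref, test, steps=1000):
    """Average bitrate change (%) of `test` versus `ref` at equal quality.

    `ref` and `test` are lists of (bitrate, quality) points.
    Negative values mean `test` needs less bitrate for the same quality.
    Simplification: linear interpolation of log-bitrate, not a cubic fit.
    """
    ref = sorted(ref, key=lambda p: p[1])
    test = sorted(test, key=lambda p: p[1])
    q_ref = [q for _, q in ref]
    q_test = [q for _, q in test]
    log_r_ref = [math.log(r) for r, _ in ref]
    log_r_test = [math.log(r) for r, _ in test]

    # Only the quality interval covered by both curves is comparable.
    lo = max(q_ref[0], q_test[0])
    hi = min(q_ref[-1], q_test[-1])
    if hi <= lo:
        raise ValueError("curves share no quality range")

    # Midpoint-rule average of the log-bitrate difference.
    total = 0.0
    for k in range(steps):
        q = lo + (k + 0.5) * (hi - lo) / steps
        total += _interp(q, q_test, log_r_test) - _interp(q, q_ref, log_r_ref)
    return (math.exp(total / steps) - 1.0) * 100.0


# Illustrative (bitrate in kbps, quality score) points, not data from the paper.
fixed_ladder = [(500, 60.0), (1000, 75.0), (2000, 86.0), (4000, 93.0)]
# A hypothetical ladder that reaches the same quality at 80% of the bitrate.
adaptive_ladder = [(0.8 * r, q) for r, q in fixed_ladder]

print(round(bd_rate_linear(fixed_ladder, adaptive_ladder), 2))
```

In this toy case the second ladder uses 20% less bitrate at every quality level, so the result is a BD-rate of about -20%.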
Related Topics
- Type: preprint
- Language: en
- Landing Page: http://arxiv.org/abs/2408.01932, https://arxiv.org/pdf/2408.01932
- OA Status: green
- Related Works: 10
- OpenAlex ID: https://openalex.org/W4406489805
Raw OpenAlex JSON
- OpenAlex ID: https://openalex.org/W4406489805
- DOI: https://doi.org/10.48550/arxiv.2408.01932
- Title: Constructing Per-Shot Bitrate Ladders using Visual Information Fidelity
- Type: preprint
- Language: en
- Publication year: 2024
- Publication date: 2024-08-04
- Authors: Krishna Srikar Durbha, Alan C. Bovik
- Landing page: https://arxiv.org/abs/2408.01932
- PDF URL: https://arxiv.org/pdf/2408.01932
- Open access: Yes
- OA status: green
- OA URL: https://arxiv.org/pdf/2408.01932
- Concepts: Shot (pellet), Fidelity, Computer science, High fidelity, Artificial intelligence, Computer vision, Physics, Telecommunications, Acoustics, Materials science, Metallurgy
- Cited by: 0
- Related works (count): 10
Full payload
| id | https://openalex.org/W4406489805 |
|---|---|
| doi | https://doi.org/10.48550/arxiv.2408.01932 |
| ids.doi | https://doi.org/10.48550/arxiv.2408.01932 |
| ids.openalex | https://openalex.org/W4406489805 |
| fwci | |
| type | preprint |
| title | Constructing Per-Shot Bitrate Ladders using Visual Information Fidelity |
| biblio.issue | |
| biblio.volume | |
| biblio.last_page | |
| biblio.first_page | |
| topics[0].id | https://openalex.org/T12357 |
| topics[0].field.id | https://openalex.org/fields/17 |
| topics[0].field.display_name | Computer Science |
| topics[0].score | 0.9962999820709229 |
| topics[0].domain.id | https://openalex.org/domains/3 |
| topics[0].domain.display_name | Physical Sciences |
| topics[0].subfield.id | https://openalex.org/subfields/1707 |
| topics[0].subfield.display_name | Computer Vision and Pattern Recognition |
| topics[0].display_name | Digital Media Forensic Detection |
| topics[1].id | https://openalex.org/T11165 |
| topics[1].field.id | https://openalex.org/fields/17 |
| topics[1].field.display_name | Computer Science |
| topics[1].score | 0.9876999855041504 |
| topics[1].domain.id | https://openalex.org/domains/3 |
| topics[1].domain.display_name | Physical Sciences |
| topics[1].subfield.id | https://openalex.org/subfields/1707 |
| topics[1].subfield.display_name | Computer Vision and Pattern Recognition |
| topics[1].display_name | Image and Video Quality Assessment |
| topics[2].id | https://openalex.org/T10741 |
| topics[2].field.id | https://openalex.org/fields/17 |
| topics[2].field.display_name | Computer Science |
| topics[2].score | 0.9825000166893005 |
| topics[2].domain.id | https://openalex.org/domains/3 |
| topics[2].domain.display_name | Physical Sciences |
| topics[2].subfield.id | https://openalex.org/subfields/1711 |
| topics[2].subfield.display_name | Signal Processing |
| topics[2].display_name | Video Coding and Compression Technologies |
| is_xpac | False |
| apc_list | |
| apc_paid | |
| concepts[0].id | https://openalex.org/C2778344882 |
| concepts[0].level | 2 |
| concepts[0].score | 0.821230411529541 |
| concepts[0].wikidata | https://www.wikidata.org/wiki/Q278938 |
| concepts[0].display_name | Shot (pellet) |
| concepts[1].id | https://openalex.org/C2776459999 |
| concepts[1].level | 2 |
| concepts[1].score | 0.5872613191604614 |
| concepts[1].wikidata | https://www.wikidata.org/wiki/Q2119376 |
| concepts[1].display_name | Fidelity |
| concepts[2].id | https://openalex.org/C41008148 |
| concepts[2].level | 0 |
| concepts[2].score | 0.5357211232185364 |
| concepts[2].wikidata | https://www.wikidata.org/wiki/Q21198 |
| concepts[2].display_name | Computer science |
| concepts[3].id | https://openalex.org/C113364801 |
| concepts[3].level | 2 |
| concepts[3].score | 0.5001363754272461 |
| concepts[3].wikidata | https://www.wikidata.org/wiki/Q26674 |
| concepts[3].display_name | High fidelity |
| concepts[4].id | https://openalex.org/C154945302 |
| concepts[4].level | 1 |
| concepts[4].score | 0.35212671756744385 |
| concepts[4].wikidata | https://www.wikidata.org/wiki/Q11660 |
| concepts[4].display_name | Artificial intelligence |
| concepts[5].id | https://openalex.org/C31972630 |
| concepts[5].level | 1 |
| concepts[5].score | 0.33973029255867004 |
| concepts[5].wikidata | https://www.wikidata.org/wiki/Q844240 |
| concepts[5].display_name | Computer vision |
| concepts[6].id | https://openalex.org/C121332964 |
| concepts[6].level | 0 |
| concepts[6].score | 0.1478765904903412 |
| concepts[6].wikidata | https://www.wikidata.org/wiki/Q413 |
| concepts[6].display_name | Physics |
| concepts[7].id | https://openalex.org/C76155785 |
| concepts[7].level | 1 |
| concepts[7].score | 0.14509356021881104 |
| concepts[7].wikidata | https://www.wikidata.org/wiki/Q418 |
| concepts[7].display_name | Telecommunications |
| concepts[8].id | https://openalex.org/C24890656 |
| concepts[8].level | 1 |
| concepts[8].score | 0.08339151740074158 |
| concepts[8].wikidata | https://www.wikidata.org/wiki/Q82811 |
| concepts[8].display_name | Acoustics |
| concepts[9].id | https://openalex.org/C192562407 |
| concepts[9].level | 0 |
| concepts[9].score | 0.07061508297920227 |
| concepts[9].wikidata | https://www.wikidata.org/wiki/Q228736 |
| concepts[9].display_name | Materials science |
| concepts[10].id | https://openalex.org/C191897082 |
| concepts[10].level | 1 |
| concepts[10].score | 0.0 |
| concepts[10].wikidata | https://www.wikidata.org/wiki/Q11467 |
| concepts[10].display_name | Metallurgy |
| keywords[0].id | https://openalex.org/keywords/shot |
| keywords[0].score | 0.821230411529541 |
| keywords[0].display_name | Shot (pellet) |
| keywords[1].id | https://openalex.org/keywords/fidelity |
| keywords[1].score | 0.5872613191604614 |
| keywords[1].display_name | Fidelity |
| keywords[2].id | https://openalex.org/keywords/computer-science |
| keywords[2].score | 0.5357211232185364 |
| keywords[2].display_name | Computer science |
| keywords[3].id | https://openalex.org/keywords/high-fidelity |
| keywords[3].score | 0.5001363754272461 |
| keywords[3].display_name | High fidelity |
| keywords[4].id | https://openalex.org/keywords/artificial-intelligence |
| keywords[4].score | 0.35212671756744385 |
| keywords[4].display_name | Artificial intelligence |
| keywords[5].id | https://openalex.org/keywords/computer-vision |
| keywords[5].score | 0.33973029255867004 |
| keywords[5].display_name | Computer vision |
| keywords[6].id | https://openalex.org/keywords/physics |
| keywords[6].score | 0.1478765904903412 |
| keywords[6].display_name | Physics |
| keywords[7].id | https://openalex.org/keywords/telecommunications |
| keywords[7].score | 0.14509356021881104 |
| keywords[7].display_name | Telecommunications |
| keywords[8].id | https://openalex.org/keywords/acoustics |
| keywords[8].score | 0.08339151740074158 |
| keywords[8].display_name | Acoustics |
| keywords[9].id | https://openalex.org/keywords/materials-science |
| keywords[9].score | 0.07061508297920227 |
| keywords[9].display_name | Materials science |
| language | en |
| locations[0].id | pmh:oai:arXiv.org:2408.01932 |
| locations[0].is_oa | True |
| locations[0].source.id | https://openalex.org/S4306400194 |
| locations[0].source.issn | |
| locations[0].source.type | repository |
| locations[0].source.is_oa | True |
| locations[0].source.issn_l | |
| locations[0].source.is_core | False |
| locations[0].source.is_in_doaj | False |
| locations[0].source.display_name | arXiv (Cornell University) |
| locations[0].source.host_organization | https://openalex.org/I205783295 |
| locations[0].source.host_organization_name | Cornell University |
| locations[0].source.host_organization_lineage | https://openalex.org/I205783295 |
| locations[0].license | |
| locations[0].pdf_url | https://arxiv.org/pdf/2408.01932 |
| locations[0].version | submittedVersion |
| locations[0].raw_type | text |
| locations[0].license_id | |
| locations[0].is_accepted | False |
| locations[0].is_published | False |
| locations[0].raw_source_name | |
| locations[0].landing_page_url | http://arxiv.org/abs/2408.01932 |
| locations[1].id | doi:10.48550/arxiv.2408.01932 |
| locations[1].is_oa | True |
| locations[1].source.id | https://openalex.org/S4306400194 |
| locations[1].source.issn | |
| locations[1].source.type | repository |
| locations[1].source.is_oa | True |
| locations[1].source.issn_l | |
| locations[1].source.is_core | False |
| locations[1].source.is_in_doaj | False |
| locations[1].source.display_name | arXiv (Cornell University) |
| locations[1].source.host_organization | https://openalex.org/I205783295 |
| locations[1].source.host_organization_name | Cornell University |
| locations[1].source.host_organization_lineage | https://openalex.org/I205783295 |
| locations[1].license | cc-by |
| locations[1].pdf_url | |
| locations[1].version | |
| locations[1].raw_type | article |
| locations[1].license_id | https://openalex.org/licenses/cc-by |
| locations[1].is_accepted | False |
| locations[1].is_published | |
| locations[1].raw_source_name | |
| locations[1].landing_page_url | https://doi.org/10.48550/arxiv.2408.01932 |
| indexed_in | arxiv, datacite |
| authorships[0].author.id | https://openalex.org/A5062368899 |
| authorships[0].author.orcid | https://orcid.org/0009-0006-0042-725X |
| authorships[0].author.display_name | Krishna Srikar Durbha |
| authorships[0].author_position | first |
| authorships[0].raw_author_name | Durbha, Krishna Srikar |
| authorships[0].is_corresponding | False |
| authorships[1].author.id | https://openalex.org/A5075463806 |
| authorships[1].author.orcid | https://orcid.org/0000-0001-6067-710X |
| authorships[1].author.display_name | Alan C. Bovik |
| authorships[1].author_position | last |
| authorships[1].raw_author_name | Bovik, Alan C. |
| authorships[1].is_corresponding | False |
| has_content.pdf | False |
| has_content.grobid_xml | False |
| is_paratext | False |
| open_access.is_oa | True |
| open_access.oa_url | https://arxiv.org/pdf/2408.01932 |
| open_access.oa_status | green |
| open_access.any_repository_has_fulltext | False |
| created_date | 2025-10-10T00:00:00 |
| display_name | Constructing Per-Shot Bitrate Ladders using Visual Information Fidelity |
| has_fulltext | False |
| is_retracted | False |
| updated_date | 2025-11-19T23:35:23.961156 |
| primary_topic.id | https://openalex.org/T12357 |
| primary_topic.field.id | https://openalex.org/fields/17 |
| primary_topic.field.display_name | Computer Science |
| primary_topic.score | 0.9962999820709229 |
| primary_topic.domain.id | https://openalex.org/domains/3 |
| primary_topic.domain.display_name | Physical Sciences |
| primary_topic.subfield.id | https://openalex.org/subfields/1707 |
| primary_topic.subfield.display_name | Computer Vision and Pattern Recognition |
| primary_topic.display_name | Digital Media Forensic Detection |
| related_works | https://openalex.org/W2772917594, https://openalex.org/W2036807459, https://openalex.org/W2058170566, https://openalex.org/W2755342338, https://openalex.org/W2166024367, https://openalex.org/W3116076068, https://openalex.org/W2229312674, https://openalex.org/W2951359407, https://openalex.org/W2079911747, https://openalex.org/W1969923398 |
| cited_by_count | 0 |
| locations_count | 2 |
| best_oa_location.id | pmh:oai:arXiv.org:2408.01932 |
| best_oa_location.is_oa | True |
| best_oa_location.source.id | https://openalex.org/S4306400194 |
| best_oa_location.source.issn | |
| best_oa_location.source.type | repository |
| best_oa_location.source.is_oa | True |
| best_oa_location.source.issn_l | |
| best_oa_location.source.is_core | False |
| best_oa_location.source.is_in_doaj | False |
| best_oa_location.source.display_name | arXiv (Cornell University) |
| best_oa_location.source.host_organization | https://openalex.org/I205783295 |
| best_oa_location.source.host_organization_name | Cornell University |
| best_oa_location.source.host_organization_lineage | https://openalex.org/I205783295 |
| best_oa_location.license | |
| best_oa_location.pdf_url | https://arxiv.org/pdf/2408.01932 |
| best_oa_location.version | submittedVersion |
| best_oa_location.raw_type | text |
| best_oa_location.license_id | |
| best_oa_location.is_accepted | False |
| best_oa_location.is_published | False |
| best_oa_location.raw_source_name | |
| best_oa_location.landing_page_url | http://arxiv.org/abs/2408.01932 |
| primary_location.id | pmh:oai:arXiv.org:2408.01932 |
| primary_location.is_oa | True |
| primary_location.source.id | https://openalex.org/S4306400194 |
| primary_location.source.issn | |
| primary_location.source.type | repository |
| primary_location.source.is_oa | True |
| primary_location.source.issn_l | |
| primary_location.source.is_core | False |
| primary_location.source.is_in_doaj | False |
| primary_location.source.display_name | arXiv (Cornell University) |
| primary_location.source.host_organization | https://openalex.org/I205783295 |
| primary_location.source.host_organization_name | Cornell University |
| primary_location.source.host_organization_lineage | https://openalex.org/I205783295 |
| primary_location.license | |
| primary_location.pdf_url | https://arxiv.org/pdf/2408.01932 |
| primary_location.version | submittedVersion |
| primary_location.raw_type | text |
| primary_location.license_id | |
| primary_location.is_accepted | False |
| primary_location.is_published | False |
| primary_location.raw_source_name | |
| primary_location.landing_page_url | http://arxiv.org/abs/2408.01932 |
| publication_date | 2024-08-04 |
| publication_year | 2024 |
| referenced_works_count | 0 |
| abstract_inverted_index.a | 82, 117, 140 |
| abstract_inverted_index.We | 126 |
| abstract_inverted_index.an | 95 |
| abstract_inverted_index.at | 76 |
| abstract_inverted_index.be | 8 |
| abstract_inverted_index.by | 45 |
| abstract_inverted_index.in | 162 |
| abstract_inverted_index.of | 86, 97, 116, 130 |
| abstract_inverted_index.on | 51 |
| abstract_inverted_index.or | 54, 113, 123 |
| abstract_inverted_index.to | 7, 10, 12, 33, 72 |
| abstract_inverted_index.we | 80 |
| abstract_inverted_index.Our | 156 |
| abstract_inverted_index.The | 56 |
| abstract_inverted_index.and | 19, 38, 91, 100, 144, 164, 171 |
| abstract_inverted_index.any | 121 |
| abstract_inverted_index.our | 108, 131 |
| abstract_inverted_index.the | 48, 61, 111, 128, 167, 176 |
| abstract_inverted_index.via | 149 |
| abstract_inverted_index.(BD) | 154 |
| abstract_inverted_index.(DO) | 59 |
| abstract_inverted_index.HTTP | 22 |
| abstract_inverted_index.Here | 79 |
| abstract_inverted_index.able | 9 |
| abstract_inverted_index.each | 52 |
| abstract_inverted_index.need | 3 |
| abstract_inverted_index.only | 172 |
| abstract_inverted_index.user | 15 |
| abstract_inverted_index.uses | 60 |
| abstract_inverted_index.(HAS) | 25 |
| abstract_inverted_index.(VIF) | 104 |
| abstract_inverted_index.Video | 0, 62 |
| abstract_inverted_index.adapt | 11 |
| abstract_inverted_index.fixed | 141, 168 |
| abstract_inverted_index.gains | 161 |
| abstract_inverted_index.model | 132 |
| abstract_inverted_index.other | 20, 134 |
| abstract_inverted_index.scene | 53 |
| abstract_inverted_index.shot. | 55 |
| abstract_inverted_index.shows | 159 |
| abstract_inverted_index.small | 173 |
| abstract_inverted_index.their | 4 |
| abstract_inverted_index.using | 94, 152 |
| abstract_inverted_index.video | 31, 68, 119 |
| abstract_inverted_index.while | 180 |
| abstract_inverted_index.(VMAF) | 66 |
| abstract_inverted_index.During | 106 |
| abstract_inverted_index.Fusion | 65 |
| abstract_inverted_index.Visual | 101 |
| abstract_inverted_index.engine | 71 |
| abstract_inverted_index.ladder | 115, 137, 170 |
| abstract_inverted_index.losses | 174 |
| abstract_inverted_index.method | 85, 109, 158 |
| abstract_inverted_index.offers | 26 |
| abstract_inverted_index.source | 118 |
| abstract_inverted_index.users' | 39 |
| abstract_inverted_index.videos | 75 |
| abstract_inverted_index.Dynamic | 57 |
| abstract_inverted_index.against | 133, 166, 175 |
| abstract_inverted_index.between | 29 |
| abstract_inverted_index.bitrate | 90, 112, 136, 142, 146, 163, 169, 178 |
| abstract_inverted_index.compare | 127 |
| abstract_inverted_index.deliver | 73 |
| abstract_inverted_index.develop | 81 |
| abstract_inverted_index.display | 17 |
| abstract_inverted_index.dynamic | 27 |
| abstract_inverted_index.enhance | 35 |
| abstract_inverted_index.ladder, | 143, 179 |
| abstract_inverted_index.ladders | 147 |
| abstract_inverted_index.network | 13 |
| abstract_inverted_index.optimal | 88 |
| abstract_inverted_index.quality | 69, 92, 114, 124, 165 |
| abstract_inverted_index.reduced | 77 |
| abstract_inverted_index.service | 1 |
| abstract_inverted_index.systems | 6 |
| abstract_inverted_index.without | 120 |
| abstract_inverted_index.Adaptive | 23 |
| abstract_inverted_index.Fidelity | 103 |
| abstract_inverted_index.Netflix, | 46 |
| abstract_inverted_index.Per-shot | 42 |
| abstract_inverted_index.delivery | 5 |
| abstract_inverted_index.encoding | 49, 151 |
| abstract_inverted_index.ensemble | 96 |
| abstract_inverted_index.factors. | 21 |
| abstract_inverted_index.features | 99 |
| abstract_inverted_index.ladders, | 93 |
| abstract_inverted_index.methods, | 139 |
| abstract_inverted_index.metrics. | 155 |
| abstract_inverted_index.per-shot | 89 |
| abstract_inverted_index.predicts | 110 |
| abstract_inverted_index.proposed | 157 |
| abstract_inverted_index.Optimizer | 58 |
| abstract_inverted_index.Streaming | 24 |
| abstract_inverted_index.bandwidth | 36 |
| abstract_inverted_index.bitrates. | 78 |
| abstract_inverted_index.different | 30 |
| abstract_inverted_index.encoding, | 43 |
| abstract_inverted_index.excellent | 160 |
| abstract_inverted_index.features. | 105 |
| abstract_inverted_index.low-level | 98 |
| abstract_inverted_index.optimized | 84 |
| abstract_inverted_index.optimizes | 47 |
| abstract_inverted_index.pioneered | 44 |
| abstract_inverted_index.providers | 2 |
| abstract_inverted_index.providing | 181 |
| abstract_inverted_index.reference | 145, 177 |
| abstract_inverted_index.settings, | 18 |
| abstract_inverted_index.streaming | 40 |
| abstract_inverted_index.switching | 28 |
| abstract_inverted_index.Assessment | 64 |
| abstract_inverted_index.exhaustive | 150 |
| abstract_inverted_index.inference, | 107 |
| abstract_inverted_index.parameters | 50 |
| abstract_inverted_index.perceptual | 67 |
| abstract_inverted_index.prediction | 70, 138 |
| abstract_inverted_index.Information | 102 |
| abstract_inverted_index.advantages. | 184 |
| abstract_inverted_index.compression | 122 |
| abstract_inverted_index.conditions, | 14 |
| abstract_inverted_index.constructed | 148 |
| abstract_inverted_index.consumption | 37 |
| abstract_inverted_index.estimation. | 125 |
| abstract_inverted_index.performance | 129 |
| abstract_inverted_index.significant | 182 |
| abstract_inverted_index.Multi-Method | 63 |
| abstract_inverted_index.constructing | 87 |
| abstract_inverted_index.experiences. | 41 |
| abstract_inverted_index.high-quality | 74 |
| abstract_inverted_index.perceptually | 83 |
| abstract_inverted_index.preferences, | 16 |
| abstract_inverted_index.computational | 183 |
| abstract_inverted_index.simultaneously | 34 |
| abstract_inverted_index.representations | 32 |
| abstract_inverted_index.content-adaptive | 135 |
| abstract_inverted_index.Bjontegaard-delta | 153 |
| cited_by_percentile_year | |
| countries_distinct_count | 0 |
| institutions_distinct_count | 2 |
| citation_normalized_percentile |
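
The `abstract_inverted_index.*` rows in the payload store the abstract as a map from each token to the word positions where it occurs. A minimal sketch of turning such a map back into running text follows. The function name `rebuild_abstract` is hypothetical, and the sample holds only positions 0 through 12 copied from the rows above (later positions of "Video" and "to" are left out on purpose).

```python
def rebuild_abstract(inverted_index):
    """Rebuild text from a {token: [positions]} map by sorting on position."""
    by_position = {}
    for token, positions in inverted_index.items():
        for pos in positions:
            by_position[pos] = token
    return " ".join(by_position[pos] for pos in sorted(by_position))


# Positions 0-12 only, taken from the payload rows above.
sample = {
    "Video": [0],
    "service": [1],
    "providers": [2],
    "need": [3],
    "their": [4],
    "delivery": [5],
    "systems": [6],
    "to": [7, 10, 12],
    "be": [8],
    "able": [9],
    "adapt": [11],
}

print(rebuild_abstract(sample))
# Video service providers need their delivery systems to be able to adapt to
```

Applied to the full set of rows, the same loop reproduces the abstract shown at the top of the record, since every position from 0 to 184 is listed exactly once.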