Explanipedia

Variational image compression with a scale hyperprior Open

Johannes Ballé, David Minnen, Saurabh Singh, Sung Jin Hwang, Nick Johnston · 2018

Computer science Political science Materials science

We describe an end-to-end trainable model for image compression based on variational autoencoders. The model incorporates a hyperprior to effectively capture spatial dependencies in the latent representation. This hyperprior relates to sid…

Design, Implementation, and Evaluation of a Point Cloud Codec for Tele-Immersive Video Open

Rufael Mekuria, Kees Blom, Pablo César · 2016

Computer science

We present a generic and real-time time-varying point cloud codec for 3D immersive video. This codec is suitable for mixed reality applications where 3D point clouds are acquired at a fast rate. In this codec, intra frames are coded progre…

High Fidelity Neural Audio Compression Open

Alexandre Défossez, Jade Copet, Gabriel Synnaeve, Yossi Adi · 2022

Computer science Engineering Physics

We introduce a state-of-the-art real-time, high-fidelity, audio codec leveraging neural networks. It consists in a streaming encoder-decoder architecture with quantized latent space trained in an end-to-end fashion. We simplify and speed-u…

Lossy Image Compression with Compressive Autoencoders Open

Lucas Theis, Wenzhe Shi, Andrew Cunningham, Ferenc Huszár · 2017

Computer science Materials science

We propose a new approach to the problem of optimizing autoencoders for lossy image compression. New media formats, changing hardware technology, as well as diverse requirements and content types create a need for compression algorithms wh…

CompressAI: a PyTorch library and evaluation platform for end-to-end compression research Open

Jean Bégaint, Fabien Racapé, Simon Feltman, Akshay Pushparaja · 2020

Computer science Materials science

This paper presents CompressAI, a platform that provides custom operations, layers, models and tools to research, develop and evaluate end-to-end image and video compression codecs. In particular, CompressAI includes pre-trained models and…

Generative Compression Open

Shibani Santurkar, David Budden, Nir Shavit · 2018

Computer science Mathematics Materials science

Traditional image and video compression algorithms rely on hand-crafted encoder/decoder pairs (codecs) that lack adaptability and are agnostic to the data being compressed. We describe the concept of generative compression, the compression…

Speech Resynthesis from Discrete Disentangled Self-Supervised Representations Open

Adam Polyak, Yossi Adi, Jade Copet, Eugene Kharitonov, Kushal Lakhotia, et al. · 2021

Computer science Philosophy Economics

We propose using self-supervised discrete representations for the task of speech resynthesis. To generate disentangled representation, we separately extract low-bitrate representations for speech content, prosodic information, and speak…

Neural Codec Language Models are Zero-Shot Text to Speech Synthesizers Open

Chengyi Wang, Sanyuan Chen, Yu Wu, Ziqiang Zhang, Long Zhou, et al. · 2023

Computer science Physics Biology

We introduce a language modeling approach for text to speech synthesis (TTS). Specifically, we train a neural codec language model (called Vall-E) using discrete codes derived from an off-the-shelf neural audio codec model, and regard TTS …

Haptic Codecs for the Tactile Internet Open

Eckehard Steinbach, Matti Strese, Mohamad Eid, Xun Liu, Amit Bhardwaj, et al. · 2018

Computer science Psychology Mathematics

The Tactile Internet will enable users to physically explore remote environments and to make their skills available across distances. An important technological aspect in this context is the acquisition, compression, transmission, and disp…

Causal Contextual Prediction for Learned Image Compression Open

Zongyu Guo, Zhizheng Zhang, Runsen Feng, Zhibo Chen · 2021

Computer science Chemistry Physics

Over the past several years, we have witnessed impressive progress in the field of learned image compression. Recent learned image codecs are commonly based on autoencoders, that first encode an image into low-dimensional latent representa…

Hybrid Spatial-Temporal Entropy Modelling for Neural Video Compression Open

Jiahao Li, Bin Li, Yan Lu · 2022

Computer science Physics

For neural video codec, it is critical, yet challenging, to design an efficient entropy model which can accurately predict the probability distribution of the quantized latent representation. However, most existing video codecs directly…

Deep Learning-Based Video Coding Open

Dong Liu, Yue Li, Jianping Lin, Houqiang Li, Feng Wu · 2020

Computer science Mathematics

The past decade has witnessed the great success of deep learning in many disciplines, especially in computer vision and image processing. However, deep learning-based video coding remains in its infancy. We review the representative works …

Scalable Image Coding for Humans and Machines Open

Hyomin Choi, Ivan V. Bajić · 2022

Computer science Engineering Mathematics

At present, and increasingly so in the future, much of the captured visual content will not be seen by humans. Instead, it will be used for automated machine vision analytics and may require occasional human viewing. Examples of such ap…

MPEG Immersive Video Coding Standard Open

Jill M. Boyce, Renaud Doré, Adrian Dziembowski, Julien Fleureau, Joël Jung, et al. · 2021

Computer science Mathematics Economics

This article introduces the ISO/IEC MPEG Immersive Video (MIV) standard, MPEG-I Part 12, which is undergoing standardization. The draft MIV standard provides support for viewing immersive volumetric content captured by multiple cameras wit…

Coarse-to-Fine Hyper-Prior Modeling for Learned Image Compression Open

Yueyu Hu, Wenhan Yang, Jiaying Liu · 2020

Computer science

Approaches to image compression with machine learning now achieve superior performance on the compression rate compared to existing hybrid codecs. The conventional learning-based methods for image compression exploit hyper-prior and spati…

Intra Prediction and Mode Coding in VVC Open

Jonathan Pfaff, Alexey Filippov, Shan Liu, Xin Zhao, Jianle Chen, et al. · 2021

Computer science Mathematics

This paper presents the intra prediction and mode coding of the Versatile Video Coding (VVC) standard. This standard was collaboratively developed by the Joint Video Experts Team (JVET). It follows the traditional architecture of a hybrid …

A comprehensive study of the rate-distortion performance in MPEG point cloud compression Open

Evangelos Alexiou, Irene Viola, Tomás M. Borges, Tiago A. da Fonseca, Ricardo L. de Queiroz, et al. · 2019

Computer science

Recent trends in multimedia technologies indicate the need for richer imaging modalities to increase user engagement with the content. Among other alternatives, point clouds denote a viable solution that offers an immersive content represe…

Real-Time Adaptive Image Compression Open

Oren Rippel, Lubomir Bourdev · 2017

Computer science Mathematics Chemistry

We present a machine learning-based approach to lossy image compression which outperforms all existing codecs, while running in real-time. Our algorithm typically produces files 2.5 times smaller than JPEG and JPEG 2000, 2 times smaller th…

MPEG-H 3D Audio—The New Standard for Coding of Immersive Spatial Audio Open

Jürgen Herre, Johannes Hilpert, Achim Kuntz, Jan Plogsties · 2015

Computer science Physics Mathematics

pp. 770-779

WhatsApp network forensics: Decrypting and understanding the WhatsApp call signaling messages Open

Filip Karpíšek, Ibrahim Baggili, Frank Breitinger · 2015

Computer science Medicine Philosophy

WhatsApp is a widely adopted mobile messaging application with over 800 million users. Recently, a calling feature was added to the application and no comprehensive digital forensic analysis has been performed with regards to this feature …

Light Field Compression With Disparity-Guided Sparse Coding Based on Structural Key Views Open

Jie Chen, Junhui Hou, Lap‐Pui Chau · 2017

Computer science Mathematics

Recent imaging technologies are rapidly evolving for sampling richer and more immersive representations of the 3D world. One of the emerging technologies is light field (LF) cameras based on micro-lens arrays. To record the directional inf…

Image Coding For Machines: an End-To-End Learned Approach Open

Nam Le, Honglei Zhang, Francesco Cricri, Ramin G. Youvalari, Esa Rahtu · 2021

Computer science Mathematics

Over recent years, deep learning-based computer vision systems have been applied to images at an ever-increasing pace, oftentimes representing the only type of consumption for those images. Given the dramatic explosion in the number of …

JPEG XL next-generation image compression architecture and coding tools Open

Jyrki Alakuijala, Ruud van Asseldonk, Sami Boukortt, Zoltán Szabadka, Martin Bruse, et al. · 2019

Computer science

An update on the JPEG XL standardization effort: JPEG XL is a practical approach focused on scalable web distribution and efficient compression of high-quality images. It will provide various benefits compared to existing image formats: si…

A Novel Hybrid Cryptosystem for Secure Streaming of High Efficiency H.265 Compressed Videos in IoT Multimedia Applications Open

Abdulaziz Alarifi, Syam Sankar, Torki Altameem, K. C. Jithin, Mohammed Amoon, et al. · 2020

Computer science

In this modernistic age of innovative technologies like big data processing, cloud computing, and Internet of things, the utilization of multimedia information is growing daily. In contrast to other forms of multimedia, videos are extensiv…

Quality Evaluation Of Static Point Clouds Encoded Using MPEG Codecs Open

Stuart Perry, Huy Phi Cong, Luís A. da Silva Cruz, João Prazeres, Manuela Pereira, et al. · 2020

Computer science Mathematics Engineering

This paper presents a quality evaluation study of point cloud codecs that have been recently standardised by the MPEG committee. In particular, a subjective experiment to assess their performance in terms of bitrate against visual quality …

A Generalized Hausdorff Distance Based Quality Metric for Point Cloud Geometry Open

Alireza Javaheri, Catarina Brites, Fernando Pereira, João Ascenso · 2020

Mathematics Computer science Economics

Reliable quality assessment of decoded point cloud geometry is essential to evaluate the compression performance of emerging point cloud coding solutions and guarantee some target quality of experience. This paper proposes a novel point cl…

Convolutional Neural Networks to Enhance Coded Speech Open

Ziyue Zhao, Huijun Liu, Tim Fingscheidt · 2018

Computer science

Enhancing coded speech suffering from far-end acoustic background noise, quantization noise, and potentially transmission errors is a challenging task. In this paper, we propose two postprocessing approaches applying convolutional neural n…

Codec ≈ Codec