Our group works on various topics in the domain of image and video compression. To this end, we study state-of-the-art coding standards such as HEVC and develop new methods for upcoming standards such as VVC. In addition, we pursue new and unconventional approaches for coding content such as medical image sequences, fisheye videos, or 360° videos. Currently, the following topics are addressed:

  • Compression exploiting machine learning techniques
  • Compression of medical images
  • Energy efficient video compression

Contact person: Dr.-Ing. Christian Herglotz

Coding of Medical Content

Scalable Lossless Coding of Dynamic Medical Data Using Compensated Multi-Dimensional Wavelet-Lifting:

Contact Person: Daniela Lanz, M.Sc.

This project focuses on scalable lossless coding of dynamic medical data. An efficient scalable representation of dynamic volume data from medical devices such as Computed Tomography scanners is very important for telemedicine applications. At the same time, lossless reconstruction is required by law and has to be guaranteed. Compensated wavelet lifting combines scalability features and lossless reconstruction in a single processing step.

A wavelet transform (WT) decomposes a signal into a highpass and a lowpass subband. This allows for analyzing the signal at multiple resolutions and provides efficient coding of the volume through energy compaction in the lowpass subband. Furthermore, the quality of the lowpass subband can be increased by suitable approaches for motion compensation. With this coding scheme, quality scalability as well as spatial and temporal scalability can be achieved. The block diagram above shows the individual processing steps of 3-dimensional wavelet lifting.
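The lifting principle behind such a transform can be illustrated with a minimal sketch: a one-level integer Haar lifting, which splits a 1-D signal into lowpass and highpass subbands and reconstructs it exactly. This is a simplified, uncompensated example for illustration only; the project itself uses compensated multi-dimensional lifting on volume data.

```python
import numpy as np

def haar_lifting_forward(x):
    """One level of integer Haar wavelet lifting (lossless)."""
    x = np.asarray(x, dtype=np.int64)
    even, odd = x[0::2], x[1::2]
    # Predict step: highpass = difference to the even neighbour
    h = odd - even
    # Update step: lowpass = even samples corrected by half the detail;
    # the floored integer shift keeps the transform perfectly invertible
    l = even + (h >> 1)
    return l, h

def haar_lifting_inverse(l, h):
    """Exact reconstruction: undo the update, then the predict step."""
    even = l - (h >> 1)
    odd = h + even
    x = np.empty(even.size + odd.size, dtype=np.int64)
    x[0::2], x[1::2] = even, odd
    return x
```

Because the inverse applies the identical rounded operations in reverse order, the reconstruction is bit-exact, which is what makes lifting attractive for legally mandated lossless medical coding.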


Energy and Power Efficient Video Communications

Energy Efficient Video Coding

Contact person: Matthias Kränzler, M.Sc.

In recent years, the amount and share of video data in global internet traffic has been steadily increasing. Both encoding on the transmitter side and decoding on the receiver side have a high energy demand. Research on energy-efficient video decoding has shown that it is possible to optimize the energy demand of the decoding process. This research area deals with modeling the energy required for encoding video data. The aim of the modeling is to optimize the energy efficiency of the entire video coding chain.

„Big Buck Bunny“ by Big Buck Bunny is licensed under CC BY 3.0



Energy Efficient Video Decoding

Contact person: Dr.-Ing. Christian Herglotz

This field of research addresses the power consumption of video decoding systems. In this respect, software as well as hardware systems are studied in great detail. A comprehensive analysis of the decoding energy on various platforms under various conditions can be found on the DEVISTO homepage:

Decoding Energy Visualization Tool (DEVISTO)

With the help of a large number of measurements, sophisticated energy models could be constructed that accurately estimate the overall power and energy consumption. A visualization of the modeling process for software decoding is given on the DENESTO homepage:

Decoding Energy Estimation Tool (DENESTO)
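The basic structure of such a model can be sketched as a linear combination of bitstream feature counts with per-feature specific energies. The feature names and energy values below are purely illustrative assumptions, not measured values from the actual DENESTO models.

```python
# Hypothetical specific energies per bitstream feature (illustrative values, in mJ)
SPECIFIC_ENERGY = {
    "intra_blocks": 0.004,
    "inter_blocks": 0.006,
    "nonzero_coeffs": 0.0002,
    "deblocking_edges": 0.001,
}

def estimate_decoding_energy(feature_counts, base_energy=15.0):
    """Feature-based energy model sketch: E = E0 + sum_i n_i * e_i,
    where n_i counts how often feature i occurs in the bitstream."""
    return base_energy + sum(
        feature_counts.get(name, 0) * e for name, e in SPECIFIC_ENERGY.items()
    )
```

The specific energies are obtained offline by fitting the model to power measurements on the target platform; at runtime, only the feature counts of a bitstream are needed to predict its decoding energy.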

Finally, the information from the model can be exploited in rate-distortion optimization during encoding to obtain bit streams that require less decoding energy. The source code of such an encoder can be downloaded here:

Decoding-Energy-Rate-Distortion Optimization for Video Coding (DERDO)
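Conceptually, this extends the classic Lagrangian mode decision by an energy term, i.e. the encoder minimizes a cost of the form J = D + λ·R + μ·E_dec instead of J = D + λ·R. The following sketch shows this decision rule with hypothetical candidate values; it is an illustration of the principle, not code from the DERDO encoder.

```python
def select_mode(candidates, lam, mu):
    """Pick the coding mode minimizing J = D + lam * R + mu * E_dec.

    candidates: list of tuples (mode_name, distortion, rate_bits, decoding_energy).
    With mu = 0 this reduces to conventional rate-distortion optimization.
    """
    return min(candidates, key=lambda c: c[1] + lam * c[2] + mu * c[3])
```

Increasing μ shifts decisions towards modes that are cheaper to decode, trading a small rate-distortion loss for decoding-energy savings.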

Coding with Machine Learning

Coding High Resolution Video Data

Contact Person: Kristian Fischer, M.Sc.

Nowadays, more and more video data is provided in very high resolution. Media-service providers like Netflix or Amazon Prime are rapidly expanding their 4K offerings. A single 4K frame contains 3840×2160 pixels. Hence, the question arises how to transmit these enormous amounts of data to the end user.

In the past, it was shown to be beneficial in low-bitrate scenarios, especially for high-resolution videos such as 4K, to downscale the individual 4K frames to half the size in each dimension before transmission. By doing so, the video transmitted over the channel is only at Full HD resolution (1920×1080 pixels), which reduces coding artifacts significantly. Naturally, the resolution of the received video has to be increased back to 4K at the receiver side. For this purpose, several upscaling algorithms are evaluated to achieve the most appealing quality for the end user.
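The resampling steps of this pipeline can be sketched as follows for a single grayscale frame. Block averaging and nearest-neighbour upsampling are deliberately simple placeholder filters; the research evaluates far more sophisticated (including learned) upscalers.

```python
import numpy as np

def downscale_2x(frame):
    """Halve both dimensions by averaging each 2x2 block
    (e.g. 3840x2160 -> 1920x1080 before encoding)."""
    h, w = frame.shape
    return frame.reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))

def upscale_2x(frame):
    """Nearest-neighbour upsampling back to full resolution at the
    receiver; a stand-in for the actual upscaling algorithms studied."""
    return np.repeat(np.repeat(frame, 2, axis=0), 2, axis=1)
```

In the full system, the downscaled frame would be encoded, transmitted, and decoded between these two calls, so the channel only ever carries the Full HD representation.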

Deep Learning for Video Coding

Contact Person: Fabian Brand, M.Sc.

The increasing processing power of mobile devices will make it possible in the long term to employ deep learning techniques in coding standards. Many components of modern video coders can be implemented using neural networks. The focus of this research is intra-frame prediction, a concept that has been part of video coders for a long time. The technique estimates the content of a block solely from its spatial neighborhood, such that only the difference has to be transmitted. In contrast to so-called inter-frame prediction, which employs reference pictures to reduce temporal redundancy, intra-frame prediction uses only the image to be coded itself, thus reducing spatial redundancy.

Previous coding standards mainly use angular prediction, in which pixels from the border area are copied into the block along a certain angle. This method is very efficient but not able to predict non-linear structures. Since neural networks are so-called universal approximators, meaning they can approximate arbitrary functions arbitrarily closely, they are also able to predict more complex structures. The following picture shows an example block predicted both with a traditional method and with a neural-network-based approach. We see that the neural network performs better in predicting the round shape of the block.

Left: Original, Center: Traditional Method (VTM 4.2), Right: Prediction with a neural network.
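Two of the simplest traditional intra modes can be sketched to show what the neural network is competing against: the vertical angular mode copies the reference row above the block downwards, and the DC mode fills the block with the mean of its border pixels. This is a minimal illustration of the principle, not the exact reference-sample handling of HEVC or VVC.

```python
import numpy as np

def predict_vertical(top_ref):
    """Vertical angular mode: every row repeats the reference pixels
    directly above the block."""
    n = top_ref.size
    return np.tile(top_ref, (n, 1))

def predict_dc(top_ref, left_ref):
    """DC mode: a flat block filled with the mean of the neighbouring
    top and left reference pixels."""
    n = top_ref.size
    dc = np.concatenate([top_ref, left_ref]).mean()
    return np.full((n, n), dc)
```

Both modes can only extend the border content linearly or as a constant, which is exactly why curved structures like the round shape above are where learned predictors gain the most.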