Author: Patrick Vogler, IAG, University of Stuttgart

The steady increase of available computer resources has enabled engineers and scientists to use progressively more complex models to simulate a myriad of fluid flow problems. Yet, whereas modern high performance computers (HPC) have seen a steady growth in computing power, the same trend has not been mirrored by a significant gain in data transfer rates. Current systems are capable of producing and processing large amounts of data quickly, but their overall performance is often hampered by how fast they can transfer and store the computed data. Considering that CFD (computational fluid dynamics) researchers invariably seek to study simulations with increasingly higher temporal resolution on fine-grained computational grids, the imminent move to exascale performance will only exacerbate this problem. [6]

**Figure 1:** Stream-wise velocity field from a numerical simulation of a plate flow at Re_{θ,0} = 300.

**Figure 2:** Dyadic decomposition into sub-bands for the stream-wise velocity field of a plate flow.

One way to alleviate the I/O bottleneck would be to reduce the number of time steps that are written to the file system. While this trivial data reduction method may be tolerable for simulations that reach a steady-state solution after a short time, the same approach would be fatal for highly transient physical phenomena. Considering that most fluid flow problems are subject to diffusion, however, we can conclude that our numerical data sets will typically be smooth and continuous, resulting in a frequency spectrum that is dominated by lower modes (see Figures 1 and 2). [6] Thus our best way forward should be to use the otherwise wasted compute cycles to exploit these inherent statistical redundancies and create a more compact form of the information content. Since effective data storage is a pervasive problem in information technology, much effort has already been spent on developing newer and better compression algorithms. Most of the prevalent compression techniques, however, are so-called dictionary encoders (e.g. Lempel-Ziv encoding) that merely act upon the statistical redundancies in the underlying binary data structure and are unable to exploit the spatial redundancies present in numerical datasets. Furthermore, these lossless compression schemes are limited to a size reduction of only 10-20% and do not allow for an efficient lossy compression that neglects parts of the original data contributing little to the overall information content. [5]

A prominent compression standard that allows for both lossy and lossless compression in a single code-stream, however, can be found in the world of entertainment technology. JPEG2000 is an image compression standard that is typically used to store natural and computer-generated images of any bit depth and colour space (e.g. 16-bit grey scale images), and it combines many desirable features that are applicable to numerical simulations. Unlike the original JPEG standard, which uses the discrete cosine transform (DCT) to reduce the spatial redundancy during its compression stage, the JPEG2000 codec is based on the discrete wavelet transform (DWT). The DWT can be performed with either the reversible Le Gall 5/3 tap filter for lossless coding or the irreversible Cohen-Daubechies-Feauveau 9/7 tap filter for lossy coding. [1] While Fourier-based transforms are simple and efficient in exploiting the low-frequency nature of most numerical data sets, their major disadvantage lies in the non-locality of their basis functions. Thus, if a DCT coefficient is quantized, the effect of the lossy compression stage will be felt throughout the entire flow field. Discrete wavelet transforms, on the other hand, allow for the definition of so-called Regions of Interest (ROI), which are coded and transmitted with better quality and less distortion than the rest of the flow field (see Figure 3). Furthermore, the dyadic decomposition into sub-bands (see Figure 2) and JPEG2000's entropy encoder (Embedded Block Coding with Optimized Truncation) allow the numerical dataset to be transmitted with increasing sample accuracy or spatial resolution. Finally, JPEG2000's volumetric extension (JP3D) translates the same capabilities to multi-dimensional data sets by applying the one-dimensional discrete wavelet transform along the axis of each subsequent dimension (see Figure 4). [2]
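
As an illustration of the reversible path, the following is a minimal Python sketch of one decomposition level of the 5/3 lifting scheme on a one-dimensional integer signal. The function names, the restriction to even-length input and the mirrored boundary handling are assumptions made for this example; they are not taken from a particular JPEG2000 implementation.

```python
def legall53_forward(x):
    """One level of the reversible 5/3 lifting transform on an even-length list of ints."""
    n = len(x)
    if n < 2 or n % 2:
        raise ValueError("expects an even number of samples (at least 2)")
    half = n // 2
    even, odd = x[0::2], x[1::2]
    # Predict step: detail = odd sample minus the floored mean of its even neighbours.
    # The missing right neighbour of the last odd sample is mirrored at the boundary.
    detail = [odd[i] - ((even[i] + even[min(i + 1, half - 1)]) >> 1)
              for i in range(half)]
    # Update step: approximation = even sample plus a rounded quarter of the
    # neighbouring details (the missing left detail is mirrored as well).
    approx = [even[i] + ((detail[max(i - 1, 0)] + detail[i] + 2) >> 2)
              for i in range(half)]
    return approx, detail


def legall53_inverse(approx, detail):
    """Undo legall53_forward exactly by reversing the lifting steps in opposite order."""
    half = len(approx)
    even = [approx[i] - ((detail[max(i - 1, 0)] + detail[i] + 2) >> 2)
            for i in range(half)]
    odd = [detail[i] + ((even[i] + even[min(i + 1, half - 1)]) >> 1)
           for i in range(half)]
    x = [0] * (2 * half)
    x[0::2], x[1::2] = even, odd
    return x


# A smooth (linear) integer signal: the detail band is almost empty.
signal = [10, 12, 14, 16, 18, 20, 22, 24]
approx, detail = legall53_forward(signal)
restored = legall53_inverse(approx, detail)
print("approximation:", approx)
print("detail:       ", detail)
print("lossless:     ", restored == signal)
```

Because every lifting step only adds or subtracts a rounded integer, the inverse undoes it exactly. Applying the forward step again to the approximation coefficients would yield the next dyadic level.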

**Figure 3:** ROI mask generation in the wavelet domain.

**Figure 4:** 2-level 3D Mallat decomposition for volumetric numerical datasets.

One of the downsides of using image compression codecs, however, is that they have been designed for integer datasets and do not allow for the lossy and lossless encoding of IEEE 754 double precision floating point numbers. Our first approach to handling floating point values is to use the fixed point number format Q, which maps the floating point values onto the dynamic range of a specified integer type (e.g. 64-bit). [4] Due to the floating point arithmetic and the many-to-one mapping, however, information is irreversibly lost during the pre-processing of the floating point data, and thus true lossless compression cannot be achieved. To circumvent this problem we plan to split the floating point data sets into their sign, exponent and mantissa integer fields (see Figures 5 and 6) and compress them separately.
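
The information loss of the fixed point approach can be illustrated with a short Python sketch. It assumes a simple Q-style mapping with a fixed number of fractional bits; the helper names and the choice of 30 fractional bits are hypothetical and serve only to make the many-to-one mapping visible.

```python
import numpy as np


def to_fixed_point(values, frac_bits):
    """Quantise doubles to signed integers with `frac_bits` fractional bits (assumed Q-style mapping)."""
    scaled = np.round(np.asarray(values, dtype=np.float64) * 2.0 ** frac_bits)
    return scaled.astype(np.int64)


def from_fixed_point(fixed, frac_bits):
    """Map the fixed point integers back to doubles."""
    return fixed.astype(np.float64) / 2.0 ** frac_bits


# Two distinct doubles that differ by less than one quantisation step,
# plus one value that is exactly representable in the fixed point format.
values = np.array([0.1, 0.1 + 1e-12, 0.25])
fixed = to_fixed_point(values, frac_bits=30)
restored = from_fixed_point(fixed, frac_bits=30)

print("fixed point integers:", fixed)
print("bit-exact round trip:", restored == values)
```

The first two inputs collapse onto the same integer, so the original doubles cannot be recovered from the integer field alone, whereas 0.25 survives the round trip unchanged.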

**Figure 5:** Exponent field of the stream-wise velocity from a numerical simulation of a plate flow at Re_{θ,0} = 300.

**Figure 6:** Mantissa field of the stream-wise velocity from a numerical simulation of a plate flow at Re_{θ,0} = 300.
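
The split of double precision values into sign, exponent and mantissa integer fields can be sketched in Python with NumPy as follows. This is only an illustration of the bit layout defined by IEEE 754 (1 sign bit, 11 exponent bits, 52 mantissa bits); the function names are hypothetical and the sketch does not include the subsequent compression of the fields.

```python
import numpy as np

MANTISSA_BITS = 52
EXPONENT_MASK = np.uint64(0x7FF)
MANTISSA_MASK = np.uint64((1 << MANTISSA_BITS) - 1)


def split_ieee754(values):
    """Reinterpret doubles as 64-bit integers and extract the three bit fields."""
    bits = np.ascontiguousarray(values, dtype=np.float64).view(np.uint64)
    sign = (bits >> np.uint64(63)).astype(np.uint8)
    exponent = ((bits >> np.uint64(MANTISSA_BITS)) & EXPONENT_MASK).astype(np.uint16)
    mantissa = bits & MANTISSA_MASK
    return sign, exponent, mantissa


def merge_ieee754(sign, exponent, mantissa):
    """Reassemble the doubles bit for bit from the three integer fields."""
    bits = ((sign.astype(np.uint64) << np.uint64(63))
            | (exponent.astype(np.uint64) << np.uint64(MANTISSA_BITS))
            | mantissa.astype(np.uint64))
    return bits.view(np.float64)


values = np.array([1.0, -0.15625, 300.0, 0.0])
sign, exponent, mantissa = split_ieee754(values)
restored = merge_ieee754(sign, exponent, mantissa)

print("sign:    ", sign)
print("exponent:", exponent)   # biased exponents as stored in the bit pattern
print("lossless:", np.array_equal(restored, values))
```

Since the split is a pure reinterpretation of the bit pattern, no floating point arithmetic is involved and the original values are recovered exactly.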

The IEEE 754 representation, however, is highly non-linear. The distance between consecutive floating point numbers depends on the value of their exponent, inevitably introducing high-frequency signals into the mantissa field. This in turn results in large coefficients for the detail sub-bands of the wavelet decomposition, thus degrading the overall compression efficiency. This could be addressed by applying a so-called shape-adaptive discrete wavelet transform (SA-DWT) inside the smooth regions of the mantissa field. [3] Yet this approach only works as long as the smooth regions the SA-DWT is applied to remain larger than one pixel and the signal segment starts at an even-numbered position. If this is not the case, the shape-adaptive wavelet transform will introduce a phase shift in the wavelet coefficients and subsequently distort the shape of the sub-band images. Since the compression algorithm applies the wavelet decomposition to multiple lower-resolution approximations of the original dataset, the likelihood of this happening every time a compression is attempted is very high. Our hope is that so-called intraband prediction methods, which are used in the High Efficiency Video Coding (HEVC) standard, could instead be used to overcome this problem. [7]
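
The discontinuity in the mantissa field can be made visible with a small Python example. It samples a smooth ramp that crosses 1.0, where the exponent changes, and looks at the differences between neighbouring mantissa values. The ramp and the helper name are assumptions chosen purely for illustration.

```python
import numpy as np


def mantissa_field(values):
    """Return the 52-bit mantissa of each double as an unsigned integer."""
    bits = np.ascontiguousarray(values, dtype=np.float64).view(np.uint64)
    return bits & np.uint64((1 << 52) - 1)


# A smooth, evenly spaced ramp from 0.9375 to 1.0625 (step 1/64, exactly representable).
u = 1.0 + np.arange(-4, 5) * 2.0 ** -6

# Convert to float64 before differencing so that the unsigned integers cannot wrap around.
m = mantissa_field(u).astype(np.float64)
jumps = np.abs(np.diff(m))

print("values:        ", u)
print("mantissa jumps:", jumps)
print("largest jump between samples", int(np.argmax(jumps)), "and", int(np.argmax(jumps)) + 1)
```

Although the ramp itself is linear, the mantissa drops from almost its maximum value to zero where the samples cross 1.0. A wavelet transform sees this step as a high-frequency feature.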

If you want to find out more about the contents of this article, please contact Patrick Vogler, Dr. Jing Zhang, Björn Dick, Uwe Küster and Prof. Ulrich Rist.

**References**

[1] Tinku Acharya and Ping-Sing Tsai. JPEG2000 Standard for Image Compression: Concepts, Algorithms and VLSI Architectures. John Wiley & Sons, Hoboken, New Jersey, 2005.

[2] C. Christopoulos, A. Skodras, and T. Ebrahimi. The JPEG2000 still image coding system: an overview. IEEE Transactions on Consumer Electronics, 46(4):1103-1127, Nov 2000.

[3] M. N. Gamito and M. Salles Dias. Lossless coding of floating point data with JPEG 2000 Part 10. In A. G. Tescher, editor, Applications of Digital Image Processing XXVII, volume 5558, pages 276-287, November 2004.

[4] P. Lindstrom. Fixed-rate compressed floating-point arrays. IEEE Transactions on Visualization and Computer Graphics, 20(12):2674-2683, Dec 2014.

[5] Alexander Loddoch and Jörg Schmalzl. Variable quality compression of fluid dynamical data sets using a 3-D DCT technique. Geochemistry, Geophysics, Geosystems, 7(1), 2006. Q01003.

[6] J. Schmalzl. Using standard image compression algorithms to store data from computational fluid dynamics. Computers and Geosciences, 29:1021-1031, October 2003.

[7] Vivienne Sze, Madhukar Budagavi, and Gary J. Sullivan. High Efficiency Video Coding (HEVC). Springer International Publishing, Switzerland, 2014.