Author: Martin Vymazal, Imperial College London


High-order methods on unstructured grids are increasingly being used to improve the accuracy of flow simulations, since they provide geometric flexibility and high fidelity at the same time. We are particularly interested in efficient algorithms for the incompressible Navier-Stokes equations that employ a high-order space discretization and a time-splitting scheme. The cost of one time step is largely determined by the amount of work needed to obtain the pressure field, which is defined as the solution of a scalar elliptic problem. Several Galerkin-type methods are available for this task, each with specific advantages and drawbacks.

The high-order continuous Galerkin (CG) method is the oldest. Compared to its discontinuous counterparts, it involves fewer unknowns (figure 1), especially in a low-order setting. The CG solution can be accelerated by means of static condensation, which produces a globally coupled system involving only the degrees of freedom on the mesh skeleton. The element interior unknowns are subsequently obtained from the mesh skeleton data by solving independent local problems that do not require any parallel communication.
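
As a rough illustration of static condensation, the sketch below solves a 1D model problem, -u'' = 1 on [0, 1] with u(0) = u(1) = 0, using quadratic elements. The model problem, the function names and the use of exact rational arithmetic are assumptions made for this example and are not taken from the solver discussed in the post. Each element's midpoint unknown is eliminated locally (a Schur complement), only the vertex ("skeleton") unknowns enter the global system, and the midpoint values are then recovered element by element.

```python
from fractions import Fraction as F


def element_matrices(h):
    """Stiffness matrix and load vector of one quadratic element of length h
    for -u'' = 1, local nodes ordered (left vertex, midpoint, right vertex)."""
    k = [[7, -8, 1], [-8, 16, -8], [1, -8, 7]]
    K = [[F(entry) / (3 * h) for entry in row] for row in k]
    f = [h / 6, 4 * h / 6, h / 6]
    return K, f


def condense(K, f):
    """Eliminate the interior (midpoint) unknown of one element.
    Returns the 2x2 Schur complement and the condensed vertex load."""
    b = (0, 2)  # local skeleton (vertex) indices
    i = 1       # local interior index
    S = [[K[p][q] - K[p][i] * K[i][q] / K[i][i] for q in b] for p in b]
    g = [f[p] - K[p][i] * f[i] / K[i][i] for p in b]
    return S, g


def solve_dense(A, rhs):
    """Gaussian elimination without pivoting (adequate for this small SPD system)."""
    n = len(rhs)
    A = [row[:] for row in A]
    rhs = rhs[:]
    for c in range(n):
        for r in range(c + 1, n):
            m = A[r][c] / A[c][c]
            for j in range(c, n):
                A[r][j] -= m * A[c][j]
            rhs[r] -= m * rhs[c]
    x = [F(0)] * n
    for r in reversed(range(n)):
        s = sum(A[r][j] * x[j] for j in range(r + 1, n))
        x[r] = (rhs[r] - s) / A[r][r]
    return x


def solve_poisson(num_elements):
    h = F(1, num_elements)
    K, f = element_matrices(h)
    S, g = condense(K, f)
    # Global system on the unknown vertices only; the two end vertices are Dirichlet.
    n = num_elements - 1
    A = [[F(0)] * n for _ in range(n)]
    rhs = [F(0)] * n
    for e in range(num_elements):
        verts = (e - 1, e)  # indices of this element's vertices among the unknowns
        for a, p in enumerate(verts):
            if not 0 <= p < n:
                continue
            rhs[p] += g[a]
            for c, q in enumerate(verts):
                if 0 <= q < n:
                    A[p][q] += S[a][c]
    u_vert = [F(0)] + solve_dense(A, rhs) + [F(0)]
    # Independent local problems: recover each midpoint from its two vertex values.
    u_mid = [
        (f[1] - K[1][0] * u_vert[e] - K[1][2] * u_vert[e + 1]) / K[1][1]
        for e in range(num_elements)
    ]
    return u_vert, u_mid


u_vert, u_mid = solve_poisson(4)
```

In this toy setting the recovery loop at the end touches only data belonging to a single element, which is the property that makes the interior solves free of parallel communication.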

Author: Carlos Falconi, ASCS


The 8th regular general meeting of asc(s took place in Leinfelden-Echterdingen on 12th May 2016. The Managing Director, Alexander F. Walser, reported a positive premium increase of 22% and briefed network members and invited guests on the project activities of asc(s during 2015. The Scientific Project Manager, Carlos J. Falconi D., presented ExaFLOW to the audience. He gave an overview of all technical objectives to be accomplished during the project period and reported on intermediate results. A discussion with members followed, aimed at collecting feedback on the exploitability of ExaFLOW results. The recently elected board members of asc(s, including Prof. Dr.-Ing. Dr. h.c. Dr. h.c. Prof. e.h. Michael M. Resch from the High Performance Computing Center at the University of Stuttgart, Dr. Steffen Frik from Adam Opel AG, Jürgen Kohler from Daimler AG, Dr. Detlef Schneider from Altair Engineering GmbH and Nurcan Rasig, Sales Manager at Cray Computer Deutschland, underlined the importance of transferring know-how and applying the project results to more industrial cases.

Author: Christian Jacobs, University of Southampton


ExaFLOW-funded researchers at the University of Southampton have developed and released a new software framework called OpenSBLI, which can automatically generate CFD model code from high-level problem specifications. This generated code is then targeted towards a variety of hardware backends (e.g. MPI, OpenMP, CUDA, OpenCL and OpenACC) through source-to-source translation using the OPS library.

A key feature of OpenSBLI is the separation of concerns that it introduces between model developers, numerical analysts, and parallel computing experts. The numerical analysts can focus on the numerical methods and their variants, while writing architecture-dependent code for HPC systems is the task of the parallel computing experts, who can support and introduce optimisations for exascale hardware architectures and the associated backends once these become available. As a result of this abstraction, the end-user model developer need not change or rewrite their high-level problem specification in order to run their simulations on new HPC hardware.
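
The idea of keeping one high-level specification and swapping the backend can be illustrated with a toy generator. This sketch does not use, and should not be read as, the OpenSBLI or OPS API: the names `SPEC`, `stencil_expression`, `BACKEND_PREFIX` and `generate_kernel` are hypothetical, and real source-to-source translation is far more involved than prefixing a pragma. It only shows that the problem description can stay fixed while the emitted C code changes per backend.

```python
# Hypothetical high-level specification (not OpenSBLI syntax):
# right-hand side of du/dt = -c * du/dx with second-order central differences.
SPEC = {
    "field": "u",
    "rhs": "-c * ({stencil}) / (2.0 * dx)",
    "stencil": {-1: -1.0, +1: 1.0},  # grid offset -> weight
}


def stencil_expression(field, weights):
    """Turn {offset: weight} into a C expression over the array `field`."""
    terms = []
    for offset, weight in sorted(weights.items()):
        terms.append(f"({weight!r}) * {field}[i + ({offset})]")
    return " + ".join(terms)


# Backend-specific decoration, kept separate from the specification above.
BACKEND_PREFIX = {
    "serial": "",
    "openmp": "#pragma omp parallel for\n",
}


def generate_kernel(spec, backend):
    """Emit a C loop for the given specification and backend name."""
    expr = spec["rhs"].format(
        stencil=stencil_expression(spec["field"], spec["stencil"])
    )
    loop = (
        "for (int i = 1; i < n - 1; ++i) {\n"
        f"    rhs[i] = {expr};\n"
        "}\n"
    )
    return BACKEND_PREFIX[backend] + loop


serial_code = generate_kernel(SPEC, "serial")
openmp_code = generate_kernel(SPEC, "openmp")
```

Here `SPEC` plays the role of the model developer's input, while `BACKEND_PREFIX` is the only part a parallel computing expert would need to extend for a new target.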

More details about OpenSBLI can be found in the paper: 

C. T. Jacobs, S. P. Jammy, N. D. Sandham (In Press). OpenSBLI: A framework for the automated derivation and parallel execution of finite difference solvers on a range of computer architectures. Journal of Computational Science. DOI: 10.1016/j.jocs.2016.11.001. Pre-print: and on the project's website:

Author: Adam Peplinski, KTH


Computational Fluid Dynamics (CFD) relies on the numerical solution of partial differential equations and is one of the most computationally intensive parts of a wide range of scientific and engineering applications. Its accuracy depends strongly on the quality of the adopted grid and of the approximation space in which the solution is computed. Unfortunately, for most interesting cases it is not possible to find the optimal grid in advance, and self-adapting algorithms have to be used during the simulation. One of them is Adaptive Mesh Refinement (AMR), which allows both the grid structure and the approximation space to be modified dynamically. This makes it possible to control the computational error during the simulation and to increase the accuracy of the numerical simulation at minimal computational cost.
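
A minimal 1D sketch of the refinement loop behind AMR is given below. It is an illustration under stated assumptions, not the algorithm used in the project: it changes only the grid (bisection of elements) and leaves the approximation space fixed, and the error indicator (the gap between a function and its linear interpolant at the element midpoint), the test function `steep_front` and the tolerance are all choices made for this example.

```python
import math


def steep_front(x):
    """Test function with a sharp transition around x = 0.5 (an assumed example)."""
    return math.tanh(20.0 * (x - 0.5))


def interpolation_error(f, a, b):
    """Error indicator: |f - linear interpolant| evaluated at the element midpoint."""
    mid = 0.5 * (a + b)
    return abs(f(mid) - 0.5 * (f(a) + f(b)))


def adapt(f, nodes, tol, max_passes=20):
    """Bisect every element whose indicator exceeds tol, until none does."""
    for _ in range(max_passes):
        refined = [nodes[0]]
        changed = False
        for a, b in zip(nodes, nodes[1:]):
            if interpolation_error(f, a, b) > tol:
                refined.append(0.5 * (a + b))  # split this element in two
                changed = True
            refined.append(b)
        nodes = refined
        if not changed:
            break
    return nodes


mesh = adapt(steep_front, [0.0, 0.25, 0.5, 0.75, 1.0], tol=1e-3)
widths = [b - a for a, b in zip(mesh, mesh[1:])]
```

Starting from four equal elements, the loop leaves the elements far from the front untouched and concentrates small elements where the function changes rapidly, which is the sense in which refinement buys accuracy at limited cost.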

Figure 1: The error of the stream-wise velocity component for the conforming mesh. Black lines show element boundaries.

Author: Michael Bareford, EPCC


Nektar++ [1] is an open-source MPI-based spectral element code that combines the accuracy of spectral methods with the geometric flexibility of finite elements, specifically, hp-version FEM. Nektar++ was initially developed by Imperial College London and is one of the ExaFLOW co-design applications being actively developed by the consortium. It supports several scalable solvers for many sets of partial differential equations, from (in)compressible Navier-Stokes to the bidomain model of cardiac electrophysiology. The test case named in the title is a simulation of the blood flow through an aorta using the unsteady diffusion equations with a continuous Galerkin projection [2]. This is a small and well-understood problem used as a benchmark to enable understanding of the I/O performance of this code. The results of this work lead to improved I/O efficiency for the ExaFLOW use cases.

The aorta dataset is a mesh of a pair of intercostal arterial branches in the descending aorta, as described by Cantwell et al. [1]; see supplementary material S6 therein. The original aortic mesh contained approximately sixty thousand elements, both prisms and tetrahedra. However, the tests discussed in this report use a more refined version of this dataset, one that features curved elements of the aorta. The test case itself is run using the advection-diffusion-reaction solver (ADRSolver) to simulate mass transport. We executed the test case on ARCHER [3] for a range of node counts, 2^n, where n is in the range 1 to 8, in order to generate the various checkpoint files that could then be used by a specially written I/O benchmarker.
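
For reference, the node counts can be enumerated as below. This assumes that the counts are powers of two and that the range 1 to 8 is inclusive; the variable name is chosen for this example.

```python
# Assumed reading: node count = 2**n for n = 1, ..., 8 (inclusive).
node_counts = [2 ** n for n in range(1, 9)]
```

Under that reading the runs span 2 to 256 nodes.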