GRChombo and GRTeclyn

Panel discussion introduction

Miren Radia

University of Cambridge

Wednesday 23 July 2025

Predecessor/current code: GRChombo

Built on the Chombo libraries for “fully-adaptive” block-structured AMR
Written in C++14
Explicit vectorization through C++ templates to achieve good performance
Hybrid MPI + OpenMP parallelization
Project started in ~2012 with code becoming open source in 2018.
Intended for more exotic problems than some other NR codes:
- Cosmology: Inflation, cosmic strings, etc.
- Boson stars/oscillatons/axion stars
- Modified gravity
- Higher-dimensional BHs
- Much more

Successor/future code: GRTeclyn

GRTeclyn is an in-development port of GRChombo to AMReX.
“Teclyn” is Welsh for “tool”.
Features:
- Built on the AMReX framework for block-structured AMR with good GPU support.
- Higher-order spatial interpolation between coarser and finer levels than GRChombo.
- Black-hole binary evolution
- Familiar structure for existing GRChombo users
- Tested and runs well on Nvidia, AMD and Intel GPUs

Figure: bar chart of the mean walltime taken to evolve a single timestep on several GPUs and one CPU. Intel Xeon Platinum 8480 CPU (480 s), Nvidia GH200 GPU (56 s), AMD MI300X GPU (76 s), AMD MI210 GPU (358 s), Nvidia A100 GPU (105 s) and Intel PVC 1550 GPU (176 s).
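To make the chart easier to read as relative performance, the quoted walltimes can be turned into speed-ups over the CPU result. The snippet below is only a convenience calculation on the numbers in the chart description. The function name `speedups` is made up for this illustration, and the slide does not say how the CPU and GPU runs were configured (for example sockets or devices per run), so the ratios should be read with that caveat.

```python
# Mean walltime per timestep in seconds, copied from the chart description.
walltimes = {
    "Intel Xeon Platinum 8480 CPU": 480.0,
    "Nvidia GH200 GPU": 56.0,
    "AMD MI300X GPU": 76.0,
    "AMD MI210 GPU": 358.0,
    "Nvidia A100 GPU": 105.0,
    "Intel PVC 1550": 176.0,
}


def speedups(times, baseline):
    """Return walltime(baseline) / walltime(device) for every device."""
    ref = times[baseline]
    return {name: ref / t for name, t in times.items()}


if __name__ == "__main__":
    result = speedups(walltimes, "Intel Xeon Platinum 8480 CPU")
    # Print the fastest device first.
    for name, s in sorted(result.items(), key=lambda kv: -kv[1]):
        print(f"{name:30s} {s:5.2f}x")
```

On these figures the GH200 is roughly 8.6 times faster per timestep than the Xeon result, while the MI210 is only about 1.3 times faster.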

Methods

Cell-centred finite-difference discretization
Block-structured AMR
- Can emulate “moving boxes” for compact object binaries
- Generally something more adaptive for more exotic spacetimes
- Can fix some of the coarser grids to avoid regridding noise near GW extraction spheres
- Arbitrary refinement criterion
Method of lines
- Fourth-order Runge-Kutta time integration
- Fourth or sixth-order spatial discretization
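As a toy illustration of the method of lines with fourth-order Runge-Kutta and fourth-order centred differences, the sketch below evolves the 1D advection equation u_t + u_x = 0 on a periodic, cell-centred grid. It is not taken from GRChombo or GRTeclyn: the test problem, the function names (`ddx4`, `rk4_step`, `advect`) and the CFL factor of 0.25 are all assumptions chosen for brevity.

```python
import math


def ddx4(u, dx):
    """Fourth-order centred first derivative on a periodic grid."""
    n = len(u)
    return [
        (u[(i - 2) % n] - 8.0 * u[(i - 1) % n]
         + 8.0 * u[(i + 1) % n] - u[(i + 2) % n]) / (12.0 * dx)
        for i in range(n)
    ]


def rhs(u, dx, c=1.0):
    """Semi-discrete right-hand side of u_t = -c u_x."""
    return [-c * d for d in ddx4(u, dx)]


def rk4_step(u, dt, dx):
    """One classical fourth-order Runge-Kutta step."""
    k1 = rhs(u, dx)
    k2 = rhs([a + 0.5 * dt * b for a, b in zip(u, k1)], dx)
    k3 = rhs([a + 0.5 * dt * b for a, b in zip(u, k2)], dx)
    k4 = rhs([a + dt * b for a, b in zip(u, k3)], dx)
    return [
        a + dt / 6.0 * (p + 2.0 * q + 2.0 * r + s)
        for a, p, q, r, s in zip(u, k1, k2, k3, k4)
    ]


def advect(n, t_end=1.0, cfl=0.25):
    """Advect a sine wave once around the unit domain; return the max error."""
    dx = 1.0 / n
    x = [(i + 0.5) * dx for i in range(n)]  # cell centres
    u = [math.sin(2.0 * math.pi * xi) for xi in x]
    steps = int(round(t_end / (cfl * dx)))
    dt = t_end / steps
    for _ in range(steps):
        u = rk4_step(u, dt, dx)
    exact = [math.sin(2.0 * math.pi * (xi - t_end)) for xi in x]
    return max(abs(a - b) for a, b in zip(u, exact))


if __name__ == "__main__":
    e32, e64 = advect(32), advect(64)
    print(f"error(n=32) = {e32:.3e}, error(n=64) = {e64:.3e}")
    print(f"ratio = {e32 / e64:.1f}")
```

Doubling the resolution should reduce the error by a factor close to 16, which is the expected signature of a fourth-order scheme.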

Kreiss-Oliger dissipation
Boundary conditions:
- Fourth-order interpolation at finer-level boundaries from coarser levels
- Outgoing radiation (Sommerfeld)
- Symmetric/reflective
- Extrapolating
CCZ4 evolution system
Moving puncture gauge conditions
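The Kreiss-Oliger dissipation mentioned above can be sketched in 1D as follows. This assumes the seven-point, sixth-derivative form that is commonly paired with fourth-order finite differences, added to the right-hand side of the evolution equations. The slides do not state which form or coefficient the codes use, so the stencil choice, the value `sigma=0.3` and the function name `kreiss_oliger` are assumptions made for illustration.

```python
import math


def kreiss_oliger(u, dx, sigma=0.3):
    """Seven-point Kreiss-Oliger term on a periodic grid.

    Returns sigma / (64 dx) * (u[i-3] - 6 u[i-2] + 15 u[i-1] - 20 u[i]
    + 15 u[i+1] - 6 u[i+2] + u[i+3]), to be added to the right-hand side.
    """
    n = len(u)
    weights = (1.0, -6.0, 15.0, -20.0, 15.0, -6.0, 1.0)
    return [
        sigma / (64.0 * dx)
        * sum(w * u[(i + k - 3) % n] for k, w in enumerate(weights))
        for i in range(n)
    ]


if __name__ == "__main__":
    n = 64
    dx = 1.0 / n
    # A well-resolved mode is almost untouched by the dissipation term...
    smooth = [math.sin(2.0 * math.pi * (i + 0.5) * dx) for i in range(n)]
    # ...whereas grid-scale (checkerboard) noise is damped strongly.
    noise = [(-1.0) ** i for i in range(n)]
    print("max |KO(smooth)| =", max(abs(v) for v in kreiss_oliger(smooth, dx)))
    print("max |KO(noise)|  =", max(abs(v) for v in kreiss_oliger(noise, dx)))
```

For the checkerboard mode the term reduces to -sigma/dx times the field, so the highest-frequency noise decays quickly, while the term acting on the smooth mode scales as dx^5 and converges away with resolution.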

Coding paradigms

Simple structure in modern C++ that is easy to adapt to new problems.
- We inherit this from Chombo/AMReX (both very similar).
No complicated code-generation procedures
- Easier to debug and/or modify
Low number of dependencies
- Easier to set up/simplifies portability
Performance optimization, e.g.:
- GRChombo: Explicit vector intrinsics instead of relying on compiler auto-vectorization
- GRTeclyn: GPU managed memory is not used, so data stays GPU-resident and unnecessary transfers are avoided

Challenges and solutions

Shorter term

Challenges

Transitioning GRChombo users to GRTeclyn
Training research users to modify, develop and run the code in a GPU-performant way
Improving scientific productivity (time and computational resources)

Solutions

Training/group hackathons
Better documentation
Collaborate with other codes on developing shared tools

Longer term

Future GPU performance gains are largely confined to lower-precision floating-point datatypes (e.g. 32-bit, 16-bit, 8-bit and even lower).
- Like everyone else we currently use FP64 (double precision) everywhere.
- We will need more accuracy for future GW detectors, not less!
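A small, generic illustration of why dropping from FP64 to FP32 is not free: the sketch below accumulates the same value many times, once in double precision and once with every intermediate result rounded to single precision. It is unrelated to the actual GRTeclyn code. FP32 is emulated here with the standard `struct` module, and the helper names `to_fp32` and `accumulate` are hypothetical.

```python
import struct


def to_fp32(x):
    """Round a Python float (FP64) to the nearest FP32 value."""
    return struct.unpack("f", struct.pack("f", x))[0]


def accumulate(value, n, round_fn=lambda x: x):
    """Add `value` n times, passing every intermediate sum through round_fn."""
    v = round_fn(value)
    total = 0.0
    for _ in range(n):
        total = round_fn(total + v)
    return total


if __name__ == "__main__":
    n = 100_000
    exact = 0.1 * n
    fp64 = accumulate(0.1, n)
    fp32 = accumulate(0.1, n, to_fp32)
    print(f"FP64 relative error: {abs(fp64 - exact) / exact:.2e}")
    print(f"FP32 relative error: {abs(fp32 - exact) / exact:.2e}")
```

The single-precision sum drifts by many orders of magnitude more than the double-precision one, and long evolutions accumulate rounding error in a broadly similar way.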

Any questions?

Why AMReX?

Mature and performant cross-vendor (Nvidia/AMD/Intel) GPU support
Similar block-structured AMR capabilities to Chombo
AMReX previously received significant support as part of the US Exascale Computing Project (ECP).
- There are many AMReX applications across diverse research areas.
- Large user community (Slack workspace/GitHub discussions)
AMReX is now an established project in the High Performance Software Foundation.
Helpful and very active development team:
- In particular, Weiqun Zhang has been instrumental in helping us to get started.