FTorch: an ICCS success story
How can a programming language developed in the 1950s combine with cutting-edge machine learning and modelling? ICCS Research Software Engineers saw an opportunity to make a difference for the scientific community.
What was the problem?
Many fields, from climate science to nuclear fusion, rely on large-scale scientific modelling codes running on high-performance supercomputers to conduct research. Recently there has been interest in combining these codes with machine learning (ML) techniques to create ‘hybrid models’ that draw on advances in data-driven science.
What are the benefits? Hybrid models allow us to replace time-consuming processes with ML emulators, reducing computational cost and run time. They also let us learn from real-world data rather than relying on physical approximations, which can improve accuracy, and they let us draw on emulators of higher-resolution models.
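As a rough illustration of the hybrid-model idea, the sketch below shows a toy time-stepping loop in which one process can be supplied either by a physics routine or by an emulator. Everything here is hypothetical and chosen for illustration: the names `reference_tendency`, `emulator_tendency` and `run_model` are not from FTorch or any real model, and the "emulator" is a hand-set linear function standing in for a trained ML model.

```python
from typing import Callable, List

# A tendency function maps a temperature (K) to a rate of change (K per step).
Tendency = Callable[[float], float]


def reference_tendency(temperature: float) -> float:
    """Hypothetical stand-in for an expensive physics scheme.

    Relaxes the temperature towards 280 K.
    """
    return -0.1 * (temperature - 280.0)


def emulator_tendency(temperature: float) -> float:
    """Hypothetical stand-in for a trained ML emulator of the scheme above.

    A real emulator would be a trained model; this is a hand-set linear
    function with a small deliberate error, used only for illustration.
    """
    return -0.098 * temperature + 27.45


def run_model(state: List[float], tendency: Tendency,
              n_steps: int, dt: float = 1.0) -> List[float]:
    """Forward-Euler time loop.

    The loop does not care whether the tendency comes from physics
    or from an emulator: that is the 'hybrid' part.
    """
    for _ in range(n_steps):
        state = [t + dt * tendency(t) for t in state]
    return state


initial = [270.0, 280.0, 290.0]
physics_result = run_model(initial, reference_tendency, n_steps=50)
hybrid_result = run_model(initial, emulator_tendency, n_steps=50)

# Compare the two runs point by point.
for p, h in zip(physics_result, hybrid_result):
    print(f"physics: {p:.3f} K   hybrid: {h:.3f} K   difference: {h - p:+.3f} K")
```

Because the time loop only sees a function from temperature to tendency, swapping the physics routine for the emulator leaves the rest of the model untouched. In this toy both pieces are written in Python, so the swap is trivial.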
Combining ML with older modelling codes is not straightforward, however. It presents technical challenges, and exploring these new frontiers raises new scientific questions and calls for new methods.
Fortran – a programming language about to celebrate its 70th birthday – was originally developed for scientific computing, so many of the modelling codes written in it have been in use for a long time. It also has a native array syntax that is very useful for writing numerical code for the partial differential equations and grids on which many models are based.
Modern ML development, by contrast, is typically done in Python, where PyTorch is a popular library in science.
ICCS engineers recognised an opportunity to bridge the gap between Fortran and modern ML technologies, which led to the initial idea for a potential solution.
Developing a solution
What became known as FTorch started as part of a collaboration with the DataWave project, whose researchers had worked out how to couple ML to their model (Model of an idealized Moist Atmosphere – MiMA), but the coupling was very slow. A team of ICCS Research Software Engineers (RSEs) used their software and computing knowledge to engineer a better approach, as detailed in this previous blog post.
Following this, ICCS RSE Jack Atkinson noted that several projects faced a similar challenge – coupling ML models from Python to Fortran codes – so he decided to generalise the solution, and FTorch was born. The code became a standalone library with a focus on FAIR (Findable, Accessible, Interoperable, Reusable) principles.
What impact has FTorch had?
Since its initial development in 2022, FTorch has been adopted globally by research projects including those working with CESM (the Community Earth System Model), ICON (the German weather forecasting model), and nuclear fusion research at the UK Atomic Energy Authority (UKAEA).
In the United States, the National Center for Atmospheric Research (NCAR) develops an open-source Earth-system model, CESM, that is widely used for studying climate projections, extreme events and more.
Brian Dobbins, CESM Chief Software Engineer at NCAR, sets out how valuable his team has found FTorch:
“Improving [CESM] via the use of AI/ML is a high-priority for a number of teams around the world. The well-designed FTorch package has been instrumental in enabling some of this work already, and is a testament to the vision, expertise and collaborative spirit of Jack and the [ICCS] team.
“Our scientists have highlighted FTorch’s user-friendly interface as a significant advantage, often preferring it over other tools. This ease of use has been instrumental in its adoption across research projects. FTorch aligns well with our requirements for an open, high-performance, easy-to-use and multi-platform solution, reflecting the goals that Jack and his team have successfully achieved.
“Motivated by these early successes and positive experiences, we’ve submitted a proposal to the US National Science Foundation to fund additional development work, in collaboration with Jack and his team, to further improve the software. I expect it will become the cornerstone for integrating inference-based physics not just in a future version of CESM, but quite possibly a number of other modeling systems as well.”
At the UKAEA, FTorch has been used to deploy Gaussian process surrogate models for turbulence into the nuclear fusion codebase for simulating plasma, leading to a 100x speedup in simulation. This work supports simulations for the UK’s Spherical Tokamak for Energy Production (a device for generating power through nuclear fusion), and showcases the significant performance improvements facilitated by FTorch.
Most recently, FTorch has been adopted by Microsoft Research to provide a Fortran interface to their density functional theory code Skala.
These high-profile applications across diverse scientific domains demonstrate FTorch's maturity, reliability and substantial value to UK research infrastructure.
Browse a full list of FTorch users and papers.
Ease of use and "developer efficiency"
What has made FTorch so effective? Jack Atkinson explains: “We believe that FTorch’s focus on ease of use is what has given us a lot of success. For example, we often only discover users when they cite our paper – after they have successfully used it in the research without needing our input.
“I like to say that there are two types of efficiency in research – computational and developer. Whilst we have a focus on being computationally-efficient for running high-performance code, we also focus on developer efficiency.
“Developers are often making many changes to code to explore a variety of scenarios. We want this to be easy to do, and for a PhD or postdoc without detailed computer science or software expertise to be able to use FTorch to solve their problem with minimal effort to focus on their science. To this end we also work to abstract many of the more complicated aspects away from the end user.”
Funding from the Accelerate Programme for Scientific Discovery and the Cambridge Centre for Data-Driven Discovery supported an ML coupling workshop. This brought together not only FTorch users but also people involved in hybrid modelling more generally, from domains ranging from climate and fusion to chemistry, and from academia, industry and modelling centres across the UK and Europe.
The workshop shows how projects, software and RSEs can bring together diverse domains and facilitate collaboration, knowledge transfer and new connections.
What are the next steps for FTorch?
While FTorch has already proven its worth, the ICCS team is always looking for improvements and future collaborations.
Senior RSE Dr Joe Wallwork has led work to bring Autograd, PyTorch’s automatic differentiation engine, into FTorch, which will enable online training and further capabilities. The RSE team has a 6-month resource allocation to support maintenance and carry out improvements for FTorch. ICCS has also recently applied for funding to keep FTorch up to date with upstream changes in PyTorch, as users have requested.
Jack Atkinson summarises: “The FTorch team are always keen to collaborate with anyone who has interesting ideas and to extend FTorch to support their use-cases.
“Community contributions are welcome, and FTorch contains code from several external contributors. Some of these have worked on the project during student or industrial placements, whilst others use it in their projects and joined hackathons and developer meetings to add back to the open-source community project. FTorch showcases the power of collaboration between software engineers and researchers.”