Fortran and Docker: How to Combine Legacy Code with Cutting-Edge Components
*This sentence was changed to correct an error in the Docker build command (corrected 6/8/18).
When you think about Fortran, you might conjure up images of punch cards, mainframes, and engineers from the past. You might not think about fast-running web-based tools or modern architecture. But here at Urban, Fortran still has a place alongside cutting-edge tools.
In this post, I’ll walk you behind the scenes to share the benefits of Fortran and how we combine it with other, “modern” technologies, such as containers, to provide greater flexibility, portability, and scaling without the pain of rewriting the model.
Why Fortran?
Organizations with long institutional memory, like Urban, often have many complex models in older programming languages that would be time-consuming to rewrite in a more “modern” language (e.g., Python). These models are frequently referred to as legacy systems. When these systems are stable, well-documented and still actively developed, there is no reason to read the word “legacy” as a pejorative.
But also, for many of our models, Fortran is the right tool for the right problem. Economists frequently think in formulas and mathematical expressions. Fortran (derived from FORmula TRANslation) is excellent at being a simple way to code these expressions. (To be clear, for the Fortran experts out there, we are talking about modern Fortran here, with very little in the way of C-comment blocks and GOTO statements.) The inputs and outputs are sufficiently large that at the time of original development, Fortran was the proper choice.
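As a hypothetical illustration (not taken from any Urban model), a formula such as the present value of a payment stream, PV = sum over t of p(t) / (1 + r)^t, translates almost symbol for symbol into modern Fortran. The program name, variable names, and numbers below are assumptions chosen for the example.

```fortran
program present_value
  implicit none

  integer, parameter :: n = 5        ! assumed number of periods
  real, parameter    :: r = 0.03     ! assumed discount rate
  real    :: payment(n), pv
  integer :: t

  payment = 100.0                    ! same payment in every period

  ! PV = sum over t of payment(t) / (1 + r)**t, as a single array expression.
  ! [(t, t = 1, n)] builds the array of exponents 1, 2, ..., n.
  pv = sum(payment / (1.0 + r)**[(t, t = 1, n)])

  print *, 'Present value:', pv
end program present_value
```

The whole-array arithmetic is what keeps the code close to the formula: there is no explicit loop, only the expression itself.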
Further, a compiled language, like Fortran, is fast. Interpreted languages, like Python, defer parsing and translation until runtime. Compiled languages do that work once, up front, with a compiler that translates the code into machine code. As a result, the Fortran code typically executes more efficiently.
Most important, when combined with other technologies, maintaining our models in Fortran lets the modeling team continue to work in the way to which they are accustomed and allows us greater flexibility in creating tools, sharing them, and scaling our operations.
Change the way you deploy, not the way you develop
Now to the exciting part: combining the existing Fortran with the newer technology concept of containers. (We are using containers more and more here at Urban. You can read more about how Urban uses containers for Natural Language Processing here.)
If you aren’t familiar with containers, think of a container as the wrapping paper on the present of the Fortran code that runs the model. It defines the operating system, installs programs, and can even ensure that we have database connections and compiled model code. Most important, it ensures that we can replicate the environment and code easily. It also allows us to move the code around, change versions, test changes, and deploy those changes without having to manually copy files. For those interested in diving into the details of Docker, the company that popularized container technology, you can read their overview here.
Once Docker is installed, we can deploy some Fortran code with the following steps.
If you already know about Fortran compilation, you can skip ahead to the Dockerfile section. If you want to try out gfortran and make, I suggest these resources:
- https://www.gnu.org/software/make/
- https://www.gnu.org/software/make/manual/make.html
- https://gcc.gnu.org
- https://gcc.gnu.org/onlinedocs/gfortran/Invoking-GNU-Fortran.html
You can clone this repository to get all the source files that are referenced and shown throughout.
A simple HelloWorld program
As a simple illustration, we’ll use the classic example of a small HelloWorld program.
If you’re a frequent programmer, it shouldn’t be too difficult to read.
There are some variable declarations, an input file is read, a calculation is done, and output is written. All these pieces are also a part of our web calculators, just with a lot more code. To compile and build this code using gfortran, we would execute the command gfortran helloworld.f90 -o HelloWorld, which gives us an executable named HelloWorld that we could then run at the command line using the command ./HelloWorld.
A successful run of this code using the input shown will create an output file like the one above, and print to the console:
Hello Humans!
Starting to read file.
Done my calculations.
Results successfully written.
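For readers without the repository at hand, a minimal sketch of a program with this shape might look like the following. Only the four console messages come from the post. The file names input.txt and output.txt, the array size of ten, and the "calculation" (doubling each value) are assumptions made for illustration and may differ from the repository's helloworld.f90.

```fortran
program helloworld
  implicit none

  ! Assumed problem size and file names (hypothetical, chosen for this sketch)
  integer, parameter :: n = 10
  character(len=*), parameter :: infile  = 'input.txt'
  character(len=*), parameter :: outfile = 'output.txt'

  real    :: values(n), results(n)
  integer :: unit_in, unit_out, ios

  print *, 'Hello Humans!'

  ! Read n numbers from the input file (list-directed, so layout is flexible)
  print *, 'Starting to read file.'
  open(newunit=unit_in, file=infile, status='old', action='read', iostat=ios)
  if (ios /= 0) stop 'Could not open input file.'
  read(unit_in, *, iostat=ios) values
  if (ios /= 0) stop 'Could not read input values.'
  close(unit_in)

  ! The "calculation": an assumed stand-in that doubles every value
  results = 2.0 * values
  print *, 'Done my calculations.'

  ! Write the results to the output file
  open(newunit=unit_out, file=outfile, status='replace', action='write')
  write(unit_out, *) results
  close(unit_out)
  print *, 'Results successfully written.'

end program helloworld
```

To try the sketch, place ten numbers separated by spaces or newlines in input.txt next to the executable before running it.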
Automate compilation and build
We can automate the compilation and build by creating a Makefile. (In many cases, if you are using a development environment such as Visual Studio, this step gets taken care of by the development environment via settings menus.)
As the code gets more complicated, automation becomes more valuable. Without it, we would have to manually compile each file in the right order and issue the command to build the final executable by hand, and we’d inevitably make a few mistakes. Both Make and gfortran offer far more options than we use here.
Here our Makefile is about as simple as it gets:
You can see the compile line at line 5, with a few extra options added for good measure:
- -g: this flag will help us when debugging if we run into errors
- -c: this flag signals that we should compile this file into an object (the .o files)
- -ffree-line-length: this flag controls how many columns of each free-form source line the compiler reads (gfortran's default is 132; the -ffree-line-length-none form removes the limit), so long lines of code are not cut off
Line 3 contains the link command, which combines the objects into the final executable; its -o flag names that executable.
The last section gives instructions for removing files when we want to clean up and force a full recompilation. Make will attempt to decide if things need compiling based on whether they have changed, but as more files and objects get added, sometimes we want to force full re-builds.
We just have one file and one object here, but projects typically have multiple files and multiple objects.
We run this Makefile by executing the command make HelloWorld, and we would then run the executable at the command line in the same way as before: ./HelloWorld. To clean up and remove everything, we execute make clean.
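A Makefile matching this description might look like the sketch below. It is a reconstruction under stated assumptions, not the original file: it is laid out so that the link command sits on line 3 and the compile command on line 5, as the text describes, and it assumes the -none form of the line-length flag. Recipe lines must begin with a tab character, not spaces.

```makefile
# Sketch: link the object file into the executable, then compile the source.
HelloWorld: helloworld.o
	gfortran -o HelloWorld helloworld.o
helloworld.o: helloworld.f90
	gfortran -g -c -ffree-line-length-none helloworld.f90

# Remove build products so the next make rebuilds everything.
.PHONY: clean
clean:
	rm -f *.o HelloWorld
```

Because helloworld.o lists helloworld.f90 as its prerequisite, Make recompiles only when the source file is newer than the object; make clean deletes the object and the executable to force a full rebuild.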
Create the Dockerfile
As we briefly mentioned above, Docker is used to build containers into which we will copy our Fortran code. We’ll then compile, build, and execute our code inside the container. To define what we need the container to be and do, we use a Dockerfile.
As with the Makefile, we are keeping the Dockerfile as simple as possible:
Again, this is about as simple as we can get. The Dockerfile starts by specifying the base image (here CentOS) as our operating system. Next come several commands that install the components needed for Fortran. After that, you will see commands to copy the HelloWorld Fortran code and other required files into the container, run the make command, and lastly define the command that will execute when we issue a docker run command.
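A Dockerfile following those steps could be sketched as below. Everything specific in it is an assumption: the post only says the base image is CentOS, so the image tag, the package names, the /app working directory, and the input.txt and output.txt file names are hypothetical. The echo lines in the final command are inferred from the extra console messages the post reports from docker run.

```dockerfile
# Base image: CentOS, as in the post (this particular image tag is an assumption)
FROM quay.io/centos/centos:stream9

# Install the Fortran compiler and make
RUN dnf install -y gcc-gfortran make && dnf clean all

# Copy the source, the Makefile, and the (hypothetical) input file into the container
WORKDIR /app
COPY helloworld.f90 Makefile input.txt ./

# Compile and link inside the container
RUN make HelloWorld

# Default command for docker run: run the model, then print its output file
CMD ["sh", "-c", "echo 'running fortran' && ./HelloWorld && echo 'fortran complete' && echo 'print output file' && cat output.txt"]
```

Compiling during the image build, rather than at run time, means every container started from the image already holds a ready-to-run executable.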
Now let’s create and run this container locally and test it out!
First, our Docker image needs to be built. We do this by executing docker build -t helloworld . (note the period; it’s important!) in the directory where we saved our Dockerfile.* Depending on how you have installed Docker, you might need to start your default docker-machine before this will run properly. A successful build will show Docker pulling images, installing the Fortran components, and compiling the code.
Next, run docker image ls to see that the Docker image we named “helloworld” in the build command was created successfully.
Lastly, we will run our code in the container with docker run helloworld. If this works properly, we will see the same messages echoed out to the console as before, plus some additional printing of our output:
running fortran
Hello Humans!
Starting to read file.
Done my calculations.
Results successfully written.
fortran complete
print output file
20.0000000 18.0000000 16.0000000 14.0000000 12.0000000 12.0000000 14.0000000 16.0000000 18.0000000 20.0000000
There is a lot more to dig into with Docker; this just scratches the surface.
The possibilities have opened up!
Now that we have the Fortran encapsulated in Docker, we can move it wherever we want and call it as a service from other applications.
For example, we could use this Fortran container in conjunction with a web application container, and call the Fortran from Python or Node.js after building a nice shiny web-based front end. Much like in the Tax Proposal Calculator, this would give a user a more modern and user-friendly way of interacting with the model code, combining a fast and well-validated model with a modern interface. It also enables web development teams to change out the interface as needed, without ever having to touch the model itself.
We could also create a server in the cloud that runs the model for our internal modeling teams and can easily scale up and down as needed. If we needed to perform thousands of runs, we could replicate our modeling environment easily and programmatically by adding more cloud computing capacity on demand. Researchers could then worry about finding compelling research questions to answer and be less constrained by the CPU or memory limits they would face if they could only run code on desktop computers or internal servers.
In future posts, we’ll talk more about how you can combine Docker with other web- and cloud-based technologies.
Don’t underestimate legacy systems
An application coded in a language that might seem old-fashioned should not be overlooked, put on the shelf, or assumed to be unusable in more modern ways. You may not need to change the way you develop, just the way you deploy.