Chemistry - Why is FORTRAN so commonly used in computational chemistry?

Solution 1:

I don't think that's really true anymore.

Some Fortran use is historical (i.e., early codes were developed in FORTRAN because that was the best programming language for number crunching in the 70s and 80s). Heck, the name stands for "formula translation."

Some Fortran use is because of performance. The language was designed to be "especially suited to numeric computation and scientific computing."

Many times, I find chemistry coders sticking to Fortran because they know it and have existing highly optimized numeric code-bases.

I think the performance argument isn't necessarily true anymore, given modern, highly optimizing C and C++ compilers.

I write a lot of code in C and C++ for performance and glue a lot of things with Python. I know some quantum programs are written exclusively or primarily in C++. Here are a few open source examples:

  • Psi4 - Written in C++ and Python
  • MPQC - Written in C++
  • LibInt - Written in C++ for efficient quantum integrals.
  • LibXC - Written in C with Fortran "bindings" for DFT exchange-correlation functionals

This is my opinion, but my recommendation for faster performance in chemistry would be Python with some C or C++ mixed in.

I find I'm more efficient coding in Python, partly because of the language, partly because of the many packages, and partly because I don't have to compile. All of that matters.

Also, you can run Python scripts and functions in parallel, on the GPU, and even compile them, e.g. with Numba. As I said, if I think performance is crucial, I'll write pieces in C or usually C++ and link to Python as needed.

Solution 2:

I think it does make sense to provide a somewhat alternative view and to clarify the matter.

FORTRAN vs. Fortran

First off, one has to distinguish the old FORTRAN from the new Fortran, where, by convention, the name of the old language is usually written in all caps. The old FORTRAN (all the way up to FORTRAN 77) is indeed still used because of tons of legacy code, but the new Fortran (starting from Fortran 90) is used mainly because it is a very elegant, simple, yet powerful and efficiently implemented language for number crunching.

DSL vs. GPL in general

Note specifically that even modern Fortran is, in my opinion, a domain-specific language (DSL), and herein lies its relative weakness compared to general-purpose languages (GPLs) such as the aforementioned C++: Fortran is specialised for a particular task (number crunching) and might not be so suitable for some related tasks (say, automated analysis of the final results, their graphical representation, etc.).

General-purpose languages, such as C++, give you more flexibility (in language features, in 3rd party libraries, etc.), so that you can solve not just the primary task (number crunching) but also the related tasks using the same language. If, however, you choose Fortran for number crunching, you often have to use one more language (e.g. Python) for these related tasks. Think of it as using two different DSLs: one for the primary task, another for the related tasks. Of course, you could also use Python together with C++, but an experienced C++ developer (which you are supposed to be if you choose to do number crunching in it) would not necessarily gain anything from using another language instead of the C++ beast.

Fortran vs. C++ specifically

All this is somewhat subjective, but anyway, here are my 5 cents. Overall, Fortran is simpler than C++, but (and because) C++ is more feature-rich. Basically, this is because Fortran is a DSL, while C++ is a GPL. As I said, this is subjective to some degree, and besides, complexity is one of the most complex things in the universe, so we could start a debate about it. But hey, just one word, templates, and the discussion is in principle over. Templates make C++ the beast, but everything comes at a price.

Note that I didn't say that C++ is more powerful, since, in my opinion, "more feature-rich" does not always mean "more powerful". It depends first of all on whether you actually need these additional features. Seriously, do you need the level of generality (and metaprogramming) that C++ templates provide for number crunching? Not necessarily. And if you don't, C++ is not more powerful than Fortran for you, although it is still more feature-rich.
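As a rough illustration of the kind of generic code templates make possible, here is a minimal sketch of a single dot product that the compiler instantiates separately for each element type. The name `dot` is made up for this example and does not come from any of the packages mentioned in this thread.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper for illustration: one generic dot product that is
// instantiated separately for each element type T (float, double, ...).
template <typename T>
T dot(const std::vector<T>& a, const std::vector<T>& b) {
    assert(a.size() == b.size());  // precondition: equal lengths
    T sum = T{};                   // zero of the element type
    for (std::size_t i = 0; i < a.size(); ++i) {
        sum += a[i] * b[i];
    }
    return sum;
}
```

Calling `dot` with two `std::vector<double>` arguments instantiates the `double` version, and calling it with two `std::vector<float>` arguments instantiates a second, independent version from the same source. Whether that generality is worth the added language complexity for a given number-crunching code is a judgement call.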

Templates are not the only feature of C++ that Fortran does not have. Exceptions and the Standard Library are two other notable features that Fortran lacks. Again, it is not so likely that you will greatly benefit from these features for number crunching. But for some other tasks they might be very helpful, so C++ as a GPL includes them, while Fortran as a DSL for number crunching does not. Think of Fortran as a "suitcase language" just for number crunching and of C++ as a "trunk language" for everything.
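To make these two features concrete, here is a small sketch that uses a Standard Library algorithm for the summation and an exception to report invalid input. The function `mean` is a hypothetical helper written for this illustration only.

```cpp
#include <numeric>
#include <stdexcept>
#include <vector>

// Hypothetical helper for illustration: arithmetic mean of a data set.
// std::accumulate (Standard Library) does the summation, and an
// exception reports input for which the mean is undefined.
double mean(const std::vector<double>& values) {
    if (values.empty()) {
        throw std::invalid_argument("mean: empty input");
    }
    const double sum = std::accumulate(values.begin(), values.end(), 0.0);
    return sum / static_cast<double>(values.size());
}
```

A caller can wrap `mean(data)` in a `try` block and catch `std::invalid_argument` to recover from bad input, which is handy in the analysis and I/O code around a calculation, even if the inner numerical loops rarely need it.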

There is also one great feature of the Fortran language that C++ was very slow to catch up on: modules, which only arrived in C++20. I mean real modules, not the ancient preprocessor machinery, which irritates me more than anything else.


Solution 3:

FORTRAN used to have an edge when it came to speed due to having much better optimizing compilers, in part due to its relative simplicity. Now that C and C++ compilers are nearly on par (and occasionally better), other factors such as programmability come into play. However, there is a lot of legacy code out there as well.

I work in quantum chemistry, and many of the programs that I work with use a mix of languages but are starting to incorporate more C++. The PSI4 package even mixes Python and C++ to combine the speed of one with the usability of the other.

Here is a list of various programs and the language(s) they use.

  • ORCA - C++
  • MPQC - C++
  • PSI4 - C++, Python
  • PySCF - Python
  • Q-Chem - FORTRAN, C++
  • CFour/ACES - FORTRAN, recently some C++
  • NWChem - FORTRAN, C
  • GAMESS(US) and GAMESS(UK) - FORTRAN
  • Gaussian - FORTRAN
  • Molpro - FORTRAN
  • Dalton - FORTRAN, some C
  • MRCC - FORTRAN
  • MOLCAS/OpenMOLCAS - FORTRAN
  • DIRAC - FORTRAN
  • ADF - FORTRAN
  • CASINO - FORTRAN
  • COLUMBUS - FORTRAN
  • CP2K - FORTRAN
  • TURBOMOL - FORTRAN

Quantum chemistry packages


Solution 4:

FORTRAN is a less expressive language than C++, and that allowed older compilers to optimize it much more efficiently. With modern compilers, there is little to no difference in performance. There are just a few places where FORTRAN's weaker safety guarantees allow more extreme optimizations. You likely won't notice the difference unless you spend 6-10 years developing in FORTRAN.
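One commonly cited case of this kind is argument aliasing. This is an assumption about what is meant here, since no specific case is named above. Fortran does not allow a procedure to modify a dummy argument that overlaps with another one, so the compiler may assume arguments do not alias. C++ has to allow for it. The sketch below uses a made-up kernel, `scaled_add`, to show that aliasing can legitimately change the result in C++, which is why the compiler cannot simply keep `*scale` in a register.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical kernel for illustration: y[i] += (*scale) * x[i].
// Because `scale` may legally point into `y`, a C++ compiler cannot
// assume *scale stays constant across iterations unless it can prove
// that the pointers do not overlap.
void scaled_add(double* y, const double* x, const double* scale, std::size_t n) {
    assert(y != nullptr && x != nullptr && scale != nullptr);
    for (std::size_t i = 0; i < n; ++i) {
        y[i] += (*scale) * x[i];
    }
}
```

With `y = {1, 1, 1}`, `x = {1, 1, 1}` and a separate `scale` of 1, the result is `{2, 2, 2}`. If `scale` points at `y[0]` instead, the first iteration changes the scale factor and the result becomes `{2, 3, 3}`.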

Generally speaking, it is easier to do development in the newer languages because they take better advantage of modern computing capabilities. FORTRAN is very strict in its format because it had to be compiled by computers of the mid-1970s. You tell the computer exactly what to do in its language, rather than telling it what you want to do in a more human-readable language.

FORTRAN is still used for two major reasons that I can uncover:

  • Some problems are simple enough that the expressiveness of C++ is wasted, and you can actually write a clearer program with the simpler FORTRAN.
  • Many programs were written by the previous generation. FORTRAN was, in the past, simply better for computational work.

Solution 5:

I stumbled on an article in J. Appl. Cryst. that is partially related to the original question. It compares Fortran to other programming languages in the context of crystallographic algorithms in a reusable software framework and summarizes the comparison in this condensed diagram:

[Figure: diagram from the cited article comparing Fortran with other programming languages]

(doi 10.1107/S0021889801017824)

It should be added that the article relates to software packages written in the old Fortran 77 dialect.
