C++ project organisation (with gtest, cmake and doxygen)

C++ build systems are a bit of a black art and the older the project the more weird stuff you can find so it is not surprising that a lot of questions come up. I'll try to walk through the questions one by one and mention some general things regarding building C++ libraries.

Separating headers and cpp files in directories. This is only essential if you are building a component that is supposed to be used as a library as opposed to an actual application. Your headers are the basis for users to interact with what you offer and must be installed. This means they have to be in a subdirectory (no-one wants lots of headers ending up in top-level /usr/include/) and your headers must be able to include themselves with such a setup.

└── prj
    ├── include
    │   └── prj
    │       ├── header2.h
    │       └── header.h
    └── src
        └── x.cpp

works well, because include paths work out and you can use easy globbing for install targets.

Bundling dependencies: I think this largely depends on the ability of the build system to locate and configure dependencies and how dependent your code on a single version is. It also depends on how able your users are and how easy is the dependency to install on their platform. CMake comes with a find_package script for Google Test. This makes things a lot easier. I would go with bundling only when necessary and avoid it otherwise.

How to build: Avoid in-source builds. CMake makes out of source-builds easy and it makes life a lot easier.

I suppose you also want to use CTest to run tests for your system (it also comes with build-in support for GTest). An important decision for directory layout and test organization will be: Do you end up with subprojects? If so, you need some more work when setting up CMakeLists and should split your subprojects into subdirectories, each with its own include and src files. Maybe even their own doxygen runs and outputs (combining multiple doxygen projects is possible, but not easy or pretty).

You will end up with something like this:

└── prj
    ├── CMakeLists.txt <-- (1)
    ├── include
    │   └── prj
    │       ├── header2.hpp
    │       └── header.hpp
    ├── src
    │   ├── CMakeLists.txt <-- (2)
    │   └── x.cpp
    └── test
        ├── CMakeLists.txt <-- (3)
        ├── data
        │   └── testdata.yyy
        └── testcase.cpp

where

  • (1) configures dependencies, platform specifics and output paths
  • (2) configures the library you are going to build
  • (3) configures the test executables and test-cases

In case you have sub-components I would suggest adding another hierarchy and use the tree above for each sub-project. Then things get tricky, because you need to decide if sub-components search and configure their dependencies or if you do that in the top-level. This should be decided on a case-by-case basis.

Doxygen: After you managed to go through the configuration dance of doxygen, it is trivial to use CMake add_custom_command to add a doc target.

This is how my projects end up and I have seen some very similar projects, but of course this is no cure all.

Addendum At some point you will want to generate a config.hpp file that contains a version define and maybe a define to some version control identifier (a Git hash or SVN revision number). CMake has modules to automate finding that information and to generate files. You can use CMake's configure_file to replace variables in a template file with variables defined inside the CMakeLists.txt.

If you are building libraries you will also need an export define to get the difference between compilers right, e.g. __declspec on MSVC and visibility attributes on GCC/clang.


As a starter, there are some conventional names for directories that you cannot ignore, these are based on the long tradition with the Unix file system. These are:

trunk
├── bin     : for all executables (applications)
├── lib     : for all other binaries (static and shared libraries (.so or .dll))
├── include : for all header files
├── src     : for source files
└── doc     : for documentation

It is probably a good idea to stick to this basic layout, at least at the top-level.

About splitting the header files and source files (cpp), both schemes are fairly common. However, I tend to prefer keeping them together, it is just more practical on day-to-day tasks to have the files together. Also, when all the code is under one top-level folder, i.e., the trunk/src/ folder, you can notice that all the other folders (bin, lib, include, doc, and maybe some test folder) at the top level, in addition to the "build" directory for an out-of-source build, are all folders that contain nothing more than files that are generated in the build process. And thus, only the src folder needs to be backed up, or much better, kept under a version control system / server (like Git or SVN).

And when it comes to installing your header files on the destination system (if you want to eventually distribute your library), well, CMake has a command for installing files (implicitly creates a "install" target, to do "make install") which you can use to put all the headers into the /usr/include/ directory. I just use the following cmake macro for this purpose:

# custom macro to register some headers as target for installation:
#  setup_headers("/path/to/header/something.h" "/relative/install/path")
macro(setup_headers HEADER_FILES HEADER_PATH)
  foreach(CURRENT_HEADER_FILE ${HEADER_FILES})
    install(FILES "${SRCROOT}${CURRENT_HEADER_FILE}" DESTINATION "${INCLUDEROOT}${HEADER_PATH}")
  endforeach(CURRENT_HEADER_FILE)
endmacro(setup_headers)

Where SRCROOT is a cmake variable that I set to the src folder, and INCLUDEROOT is cmake variable that I configure to wherever to headers need to go. Of course, there are many other ways to do this, and I'm sure my way is not the best. The point is, there is no reason to split the headers and sources just because only the headers need to be installed on the target system, because it is very easy, especially with CMake (or CPack), to pick out and configure the headers to be installed without having to have them in a separate directory. And this is what I have seen in most libraries.

Quote: Secondly I would like to use the Google C++ Testing Framework for unit testing my code as it seems fairly easy to use. Do you suggest bundling this with my code, for example in a "inc/gtest" or "contrib/gtest" folder? If bundled, do you suggest using the fuse_gtest_files.py script to reduce the number or files, or leaving it as is? If not bundled how is this dependency handled?

Don't bundle dependencies with your library. This is generally a pretty horrible idea, and I always hate it when I'm stuck trying to build a library that did that. It should be your last resort, and beware of the pitfalls. Often, people bundle dependencies with their library either because they target a terrible development environment (e.g., Windows), or because they only support an old (deprecated) version of the library (dependency) in question. The main pitfall is that your bundled dependency might clash with already installed versions of the same library / application (e.g., you bundled gtest, but the person trying to build your library already has a newer (or older) version of gtest already installed, then the two might clash and give that person a very nasty headache). So, as I said, do it at your own risk, and I would say only as a last resort. Asking the people to install a few dependencies before being able to compile your library is a much lesser evil than trying to resolve clashes between your bundled dependencies and existing installations.

Quote: When it comes to writing tests, how are these generally organised? I was thinking to have one cpp file for each class (test_vector3.cpp for example) but all compiled in to one binary so that they can all be run together easily?

One cpp file per class (or small cohesive group of classes and functions) is more usual and practical in my opinion. However, definitely, don't compile them all into one binary just so that "they can all be run together". That's a really bad idea. Generally, when it comes to coding, you want to split things up as much as it is reasonable to do so. In the case of unit-tests, you don't want one binary to run all the tests, because that means that any little change that you make to anything in your library is likely to cause a near total recompilation of that unit-test program, and that's just minutes / hours lost waiting for recompilation. Just stick to a simple scheme: 1 unit = 1 unit-test program. Then, use either a script or a unit-test framework (such as gtest and/or CTest) to run all the test programs and report to failure/success rates.

Quote: Since the gtest library is generally build using cmake and make, I was thinking that it would make sense for my project to also be built like this? If I decided to use the following project layout:

I would rather suggest this layout:

trunk
├── bin
├── lib
│   └── project
│       └── libvector3.so
│       └── libvector3.a        products of installation / building
├── docs
│   └── Doxyfile
├── include
│   └── project
│       └── vector3.hpp
│_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
│
├── src
│   └── CMakeLists.txt
│   └── Doxyfile.in
│   └── project                 part of version-control / source-distribution
│       └── CMakeLists.txt
│       └── vector3.hpp
│       └── vector3.cpp
│       └── test
│           └── test_vector3.cpp
│_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
│
├── build
└── test                        working directories for building / testing
    └── test_vector3

A few things to notice here. First, the sub-directories of your src directory should mirror the sub-directories of your include directory, this is just to keep things intuitive (also, try to keep your sub-directory structure reasonably flat (shallow), because deep nesting of folders is often more of a hassle than anything else). Second, the "include" directory is just an installation directory, its contents are just whatever headers are picked out of the src directory.

Third, the CMake system is intended to be distributed over the source sub-directories, not as one CMakeLists.txt file at the top-level. This keeps things local, and that's good (in the spirit of splitting things up into independent pieces). If you add a new source, a new header, or a new test program, all you need is to edit one small and simple CMakeLists.txt file in the sub-directory in question, without affecting anything else. This also allows you to restructure the directories with ease (CMakeLists are local and contained in the sub-directories being moved). The top-level CMakeLists should contain most of the top-level configurations, such as setting up destination directories, custom commands (or macros), and finding packages installed on the system. The lower-level CMakeLists should contain only simple lists of headers, sources, and unit-test sources, and the cmake commands that register them to compilation targets.

Quote: How would the CMakeLists.txt have to look so that it can either build just the library or the library and the tests?

Basic answer is that CMake allows you to specifically exclude certain targets from "all" (which is what is built when you type "make"), and you can also create specific bundles of targets. I can't do a CMake tutorial here, but it is fairly straight forward to find out by yourself. In this specific case, however, the recommended solution is, of course, to use CTest, which is just an additional set of commands that you can use in the CMakeLists files to register a number of targets (programs) that are marked as unit-tests. So, CMake will put all the tests in a special category of builds, and that is exactly what you asked for, so, problem solved.

Quote: Also I have seen quite a few projects that have a build ad a bin directory. Does the build happen in the build directory and then the binaries moved out in to the bin directory? Would the binaries for the tests and the library live in the same place? Or would it make more sense to structure it as follows:

Having a build directory outside the source ("out-of-source" build) is really the only sane thing to do, it is the de facto standard these days. So, definitely, have a separate "build" directory, outside the source directory, just as the CMake people recommend, and as every programmer I have ever met does. As for the bin directory, well, that is a convention, and it is probably a good idea to stick to it, as I said in the beginning of this post.

Quote: I would also like to use doxygen to document my code. Is it possible to get this to automatically run with cmake and make?

Yes. It is more than possible, it is awesome. Depending on how fancy you want to get, there are several possibilities. CMake does have a module for Doxygen (i.e., find_package(Doxygen)) which allows you to register targets that will run Doxygen on some files. If you want to do more fancy things, like updating the version number in the Doxyfile, or automatically entering a date / author stamps for source files and so on, it is all possible with a bit of CMake kung-fu. Generally, doing this will involve that you keep a source Doxyfile (e.g., the "Doxyfile.in" that I put in the folder layout above) which has tokens to be found and replaced by CMake's parsing commands. In my top-level CMakeLists file, you will find one such piece of CMake kung-fu that does a few fancy things with cmake-doxygen together.