Profiling a CMake Build

Author: thothonegan
Tags: c++ cmake makefile profiling

It is pretty hard to profile a build system to see where compile time goes, especially for C++, since slowdowns generally involve either templates or extra header files.

Assuming you've already done things such as using ninja, using multiple build jobs, and basic header hygiene, it is hard to know where to optimize next.

So let's figure out a way to measure it using cmake's makefile generator. Roughly, we just want to see which specific compiles are taking the longest, so we can look at the headers they use and optimize those specifically, instead of guessing and spending hours with almost no effect (like I did).

First generate the build with cmake. The only important part is using the Unix Makefiles generator (which is the default on Unix-like systems if unspecified), plus whatever other options you need for your build.

cmake "$HACKERGUILD/Repo/Wolf" \
    -DCMAKE_BUILD_TYPE="Debug" \
    -DCMAKE_EXPORT_COMPILE_COMMANDS=YES

Then we need to alter the generated makefiles so we can measure it.

First, create a file called rusage in your home directory [or somewhere] and mark it executable. This file will act like a shell and use GNU time to measure every command going through it in a specific format.

It contains: (thanks to http://alangrow.com/blog/profiling-every-command-in-a-makefile )

#!/bin/sh
exec time -f 'rc=%x elapsed=%e user=%U system=%S maxrss=%M avgrss=%t ins=%I outs=%O minflt=%R majflt=%F swaps=%W avgmem=%K avgdata=%D argv="%C"' "$@"

Now we need to update the makefiles cmake wrote so they use it:

find . -type f -print0 | xargs -0 sed -i 's@SHELL = /bin/sh@SHELL = ${HOME}/rusage /bin/sh@'
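The one-liner above runs sed over every file under the build directory. If you'd rather be a bit more conservative, here is a rough sketch that only touches files that look like generated makefiles and then reports how many were patched. It assumes the generated files are named Makefile, Makefile2, or *.make, that GNU sed is available for -i, and that the wrapper path contains no '@'. The function name patch_makefiles and its arguments are made up for this example. Unlike the one-liner, the wrapper path is expanded by the shell rather than by make.

```shell
# Hypothetical helper: patch_makefiles BUILD_DIR WRAPPER
# Rewrites "SHELL = /bin/sh" so every recipe runs through WRAPPER.
patch_makefiles() {
    build_dir=$1
    wrapper=$2
    # Assumption: generated makefiles are named Makefile, Makefile2 or *.make.
    find "$build_dir" -type f \
        \( -name Makefile -o -name Makefile2 -o -name '*.make' \) \
        -exec sed -i "s@^SHELL = /bin/sh@SHELL = $wrapper /bin/sh@" {} +
    # Print how many files now route their commands through the wrapper.
    grep -rlF "SHELL = $wrapper /bin/sh" "$build_dir" | wc -l
}
```

From the build directory you would call it as: patch_makefiles . "$HOME/rusage". A count of zero means nothing matched and the build will not be measured.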

And last, we turn off cmake's automatic regeneration of the build system so it doesn't overwrite our changes. You'll want to trash the build dir when you're done anyway.

sed -i 's/all: cmake_check_build_system/all:/' Makefile

OK, now we do a normal build, outputting to both the terminal and a file [timings.txt] so we can analyze it later:

make 2>&1 | tee timings.txt

Now wait a long time, because your build is slow and we're purposely not using multiple jobs to get better timings (multiple jobs might work, but I haven't tried it).

Done? Ok, now we need to analyze the information.

We need to throw out any line that doesn't have timing information, sort by the elapsed time (which is wall-clock time), and then output that to another file.

grep 'elapsed' timings.txt | sort -t '=' -k 3 -r -n > timings-sorted.txt
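Every line in that file still carries all the counters, which gets noisy when you only care about "how long, and what command". As a rough sketch, the function below prints just the elapsed time and the command for the N slowest entries. The name slowest_commands is made up for this example, and it assumes each timing record starts with rc= and ends with argv="..." exactly as in the rusage format above.

```shell
# Hypothetical helper: slowest_commands FILE [COUNT]
# Prints "elapsed<TAB>command" for the COUNT slowest entries (default 20).
slowest_commands() {
    file=$1
    count=${2:-20}
    # Assumption: timing records begin with "rc=<n> elapsed=<secs>".
    awk '/^rc=[0-9]+ elapsed=/ {
        elapsed = $2; sub(/^elapsed=/, "", elapsed)   # second field is elapsed=<secs>
        cmd = $0; sub(/^.* argv="/, "", cmd)          # keep only the argv payload
        sub(/"$/, "", cmd)                            # drop the closing quote
        printf "%s\t%s\n", elapsed, cmd
    }' "$file" |
        LC_ALL=C sort -k 1,1 -n -r |
        head -n "$count"
}
```

For example, slowest_commands timings.txt 10 lists the ten slowest commands, slowest first.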

Now we have a nicer timings file. The biggest things to note here:

  • Multiple directories in cmake will each have their own time entries, since make is called recursively. As such, a lot of the top entries might just be make calling the next level down.
  • Depending on your program, linking might also be a huge cost compared to the individual files. While there are solutions here too, such as using GCC's visibility attributes or switching to gold/lld, they might not be as worth looking at.
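To get a feel for how much of the total falls into each of those buckets, here is a rough sketch that sums elapsed time per kind of command. The categories are guesses based on how the makefile generator usually writes its recipes: recursive calls start with make, link steps go through cmake -E cmake_link_script, and compiles contain " -c ". Check these against your own timings file before trusting the numbers. The name time_by_kind is made up for this example.

```shell
# Hypothetical helper: time_by_kind FILE
# Sums elapsed seconds per rough category: make, link, compile, other.
time_by_kind() {
    awk '/^rc=[0-9]+ elapsed=/ {
        elapsed = $2; sub(/^elapsed=/, "", elapsed)
        cmd = $0; sub(/^.* argv="/, "", cmd); sub(/"$/, "", cmd)
        sub(/^\/bin\/sh -c /, "", cmd)     # strip the shell prefix added via SHELL
        # Assumed patterns; adjust them to match your own log.
        if (cmd ~ /^([^ ]*\/)?g?make( |$)/)   kind = "make"      # recursive make call
        else if (cmd ~ /cmake_link_script/)   kind = "link"      # link step
        else if (cmd ~ / -c /)                kind = "compile"   # compiler invocation
        else                                  kind = "other"
        total[kind] += elapsed
    }
    END { for (k in total) printf "%s\t%.2f\n", k, total[k] }' "$1" |
        LC_ALL=C sort
}
```

Keep in mind that the make totals include the time of everything they call, so they overlap with the other categories rather than adding to them.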