C++17

Author: thothonegan

Tags: c++ c++17 programming code

As you might know, I’m excited about C++17. However, you might not have a clue why, other than me ranting about things like Modules, UCS, and so on.

A list of possible features came out recently, so I decided to list what I’m most interested in and why.

https://docs.google.com/viewer?a=v&pid=forums&srcid=MTc1NDc3NTUyODUyOTc0MTkyOTUBMTExNzY5NTk0OTYwMjgzMDc4NzYBcDJPYmgzMWVENjBKATAuMQEBdjI

Modules

This is the mother of all features, and I’m surprised it’s even on the list. To understand it, we’ll need to cover how C++ compiles stuff.

C++, like C, is mostly compiled textually. Every file you compile is a ‘translation unit’ that gets parsed from start to end. Every time the preprocessor hits #include, it jumps to another file, dumps it into the current file, and keeps going. So the simple program:

#include <iostream>
int main () {
    std::cout << "Hello World" << std::endl;
    return 0;
}

comes out at 17271 lines when run through only the preprocessor. That’s 17k lines before the compiler has even started on your own code. And this happens for every source file, which ends up including a lot of the same files over and over again. Now there are some simple fixes for this: most compilers have precompiled headers, which try to compile some of this ahead of time to speed things up. Other languages like Java and C# instead understand the entire project and can be a lot smarter about how other files are included. Modules tries to do something similar for C++.

There are two major parts to modules. The first is creating a module, the second is using one. Using is a bit simpler and is how most programs will start out. Basically when you do #include or @import (the keyword isn’t defined yet, so I’m using the Obj-C one), if there’s a module it’ll import the module instead. The module is precompiled by the compiler, so instead of doing a textual include, it can just add it to its internal AST (or whatever representation it uses). This is a whole lot faster than opening <iostream>, then including its headers, then including their headers, etc, etc.

It also allows better separation: right now it’s easy for any header to leak stuff into your files (for example, the macro ‘min’ on Windows, the macro ‘True’ on X11, ‘near’ and ‘far’ which are reserved on Windows, etc). Modules instead restrict what an import can bring into a file.

Lastly, tools can reason about modules a whole lot better. When I start up Visual Studio on any of my projects, it takes about 5 minutes to load everything, because it ends up parsing about 20k worth of headers, even though I don’t use 95% of them directly. Modules allow this to be handled way better.
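To make the leaking problem concrete, here is a minimal sketch. The #define below is only a stand-in for a header that defines a ‘min’ macro, and the function name smallest is made up for the example. Because includes are textual, the macro rewrites every later use of the name, including std::min, and the usual workaround is to wrap the name in parentheses:

```cpp
#include <algorithm>

// Stand-in for the contents of some third-party header (assumption for
// illustration): a function-like macro that leaks into every file that
// includes it.
#define min(a, b) (((a) < (b)) ? (a) : (b))

// Hypothetical helper, not from any library.
int smallest (int a, int b)
{
    // Writing std::min (a, b) here would not compile: the preprocessor
    // expands 'min (' before the compiler ever sees the std:: qualifier.
    // Parenthesising the name prevents function-like macro expansion.
    return (std::min) (a, b);
}
```

With a module, the importing file would not see the macro unless the module chose to export it, so the workaround would not be needed.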

The second part of modules is how to create them. At the moment there are two methods: the backwards-compatible way, and the ‘do everything as a module’ way. Some of this is clang-specific, and it’s hard to say what will be in the C++ version.

A module is described by a file called the ‘module map’. The module map basically defines which headers are part of a module, and what rules apply to that module. A simple example:

module std [system] {
    module vector {
        requires cplusplus
        header "vector"
    }
}

which is then used by

@import std.vector;
std::vector<int> v; // etc

The module map can define which headers belong to which module, which languages a module supports, and so on. The compiler will create a module bundle out of it (clang does this the first time a file using the module is compiled), which it can then reuse for every other file that uses the module.

If you’re going all in with modules, an older proposal (http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2006/n2073.pdf) would instead allow you to mark public and private sections in your source files. The compiler would then take your source file and generate a header file for you from the public section (which could then be used for the module). This would have some very nice side effects, like private functions and members being completely hidden in the source file. Unfortunately I don’t see it in clang’s manual anymore, and it’s definitely a much bigger change. However, there are a lot of cool things going in this direction.

For more info, clang’s rules for modules (which are the basis for the C++ proposal) can be found here : http://clang.llvm.org/docs/Modules.html .

Uniform Call Syntax (UCS)

This is a major bombshell of a change. It’s kind of small syntax-wise, but it has huge implications. Basically in C++ and the STL, there are generally two ways to perform an action on an object. The first is to call a member function:

object.function(a, b, c)

and the second is to pass it to another function to handle:

function(object, a, b, c)

(think std::sort() for an STL example of this second case). Personally, I greatly prefer the first form because it makes the connection to the object more obvious. UCS basically proposes unifying these two forms, so that if a call works using one syntax, the other will work too. So for example, the following code will work:

class Object
{
    public:
    void memfunc (); // internal function
};
void outerfunc (Object& outer); // outer function

int main (int argc, char** argv) // some random function
{
    Object o;
    o.memfunc(); // C++ allows this already
    o.outerfunc(); // degenerates to outerfunc (o);
    memfunc(o); // degenerates to o.memfunc()
    outerfunc(o); // C++ allows already
    return 0;
}
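For comparison, today’s C++ only accepts two of those four calls. The closest existing approximation of memfunc(o) that I know of is std::mem_fn from <functional>, which wraps a member function so it can be called free-function style. A minimal sketch, assuming the same Object but with return values added so the calls can be checked (the global name memfunc and the helper callAllStyles are made up for this example):

```cpp
#include <functional>

class Object
{
    public:
    int memfunc () const { return 1; } // member function
};
int outerfunc (const Object&) { return 2; } // free function

// Wrap the member function in a callable object, so that memfunc (o)
// forwards to o.memfunc(). This is opt-in per function, unlike UCS.
const auto memfunc = std::mem_fn (&Object::memfunc);

// Hypothetical helper that exercises every call style that compiles today.
int callAllStyles (const Object& o)
{
    // o.outerfunc() has no equivalent today; that direction needs UCS.
    return o.memfunc() + memfunc (o) + outerfunc (o);
}
```

The wrapper has to be written out for every member function you want to call this way, which is exactly the boilerplate UCS would make unnecessary.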

So it looks like it’s just syntax, right? If you prefer one style over the other, you’re probably using it already, and it’s a minor change. The reason I’m excited about it is that it can simulate reopening classes! For example, in the previous example outerfunc() could be defined in a different header, a different source file, or even a different library, and yet it’s used as if it were part of the object. If you’ve used Obj-C’s categories before, you’ll understand how powerful this is. For a trivial example, say you have std::string. It’s a nice class, but you want a lowerCase() function that returns the lowercase version of a string. Right now you have to create a separate function and call it in a totally different way, which makes it feel non-native. With UCS, you can instead do

#include <algorithm>
#include <cctype>
#include <string>
namespace std { // reopen std
    std::string lowerCase (std::string str) // by value, so we modify and return the copy
    { std::transform (str.begin(), str.end(), str.begin(), ::tolower); return str; }
}

and used via:

{
    std::string str = "AWESOME";
    std::string lc = str.lowerCase();
}

Very very nice. Now I’m sure this will cause a lot of problems for some people (like the people who hate macros and operator overloading), but for me the ability to add methods to classes I have no control over is one of the few things I miss from other languages.
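For contrast, here is a minimal sketch of what works without UCS: the same helper as an ordinary free function. The StringUtil namespace is made up for the example, and the lambda is my own choice so that std::tolower always receives an unsigned char value:

```cpp
#include <algorithm>
#include <cctype>
#include <string>

namespace StringUtil // hypothetical namespace for this sketch
{
    // Takes the string by value, so the caller's string is left untouched
    // and the lowercased copy is returned.
    std::string lowerCase (std::string str)
    {
        std::transform (str.begin(), str.end(), str.begin(),
            [] (unsigned char c) { return static_cast<char> (std::tolower (c)); });
        return str;
    }
}
```

It is called as StringUtil::lowerCase(str) rather than str.lowerCase(), which is the ‘totally different way’ that UCS would smooth over.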

Nested Namespace Definitions

Already added to C++17, but very important for what I do. It’s a pretty simple change. Basically instead of:

namespace A { namespace B { int x; } }

you can just do:

namespace A::B { int x; }

Wolf/Endless has a ton of namespaces for various modules, generally about 3 deep, but some up to 5 or 6 for internal stuff (Wolf::Window::Driver::Wayland::Window for example). At the moment I do:

namespace Wolf { namespace Window { namespace Driver { namespace Wayland
{
    int stuff;
} } } }

which is semi-OK, but anything that makes formatting easier is something I want.
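A minimal sketch of how the two spellings relate, assuming a compiler with C++17 support: both open the same namespace, so they can be mixed freely while migrating. The variable moreStuff and the alias name WWDW are made up for the example; the alias itself is a long-standing feature that shortens the use sites in the meantime:

```cpp
// Classic spelling: one namespace keyword and one brace pair per level.
namespace Wolf { namespace Window { namespace Driver { namespace Wayland
{
    int stuff = 1;
} } } }

// C++17 spelling: reopens the very same namespace in a single declaration.
namespace Wolf::Window::Driver::Wayland
{
    int moreStuff = stuff + 1; // 'stuff' is visible here: same namespace
}

// A namespace alias gives the long name a short handle at the use site.
namespace WWDW = Wolf::Window::Driver::Wayland;
```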

The Future

Of course no matter what happens, we’ll probably have to wait 5 years for MSVC (and embedded compilers) to catch up, but every improvement helps. C++ has made huge improvements since C++11, and even if only half of the proposed features make it in, it’ll simplify a lot of the stuff I’m working on.