Coding Conventions

Author: thothonegan

Tags: programming c++

I've had a few people ask how I lay out my code, so here's a quick overview. Of course, coding conventions are a very touchy topic for a lot of programmers, so don't take this as the 'best' approach or anything. It's just what works for me. A lot of it has been inspired by various other guides such as the Google C++ Style Guide, so look at those too.

Files

Naming Conventions

Rule: .cpp for C++ source files, .hpp for C++ header files.

Justification: Mostly it makes it obvious what language a file is written in. If you've ever included a C header without extern "C" and it didn't support C++ correctly, you know why this is useful.

General Indentation

Rule: Tabs for indentation, spaces for alignment.

Justification: This is probably the number one thing programmers argue about. Personally, I find having a specific alignment character nicer than not. And I hate typing lots of spaces in languages that require them (Python, or YAML: even if my editor is set up for tab=spaces, it always puts the cursor in the wrong place). Or even this stupid markdown editor, because Tumblr doesn't understand triple grave code blocks properly *kicks*.

Header Guards

Rule: Use old style C #define header guards, generally named from the project down (like WOLFCORE_THREAD_HPP ).

Justification: I know some people prefer #pragma once and it's supported practically everywhere, but it's still non-standard. This is mostly an 'I'm used to it' type thing, and there are a few cases where it's useful. For example, you might want to guarantee a certain file was included before you (like the hell that is Windows.h).
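A rough sketch of both points, squeezed into one listing so it compiles by itself. The two 'files' and everything in them (Platform.hpp, PlatformVersion, the Thread body) are made up for illustration and aren't the real Wolf headers.

```cpp
// --- WolfCore/Platform.hpp (hypothetical file) ---
#ifndef WOLFCORE_PLATFORM_HPP
#define WOLFCORE_PLATFORM_HPP

namespace WolfCore
{
    constexpr int PlatformVersion = 1; // made-up constant
}

#endif // WOLFCORE_PLATFORM_HPP

// --- WolfCore/Thread.hpp (hypothetical file) ---
#ifndef WOLFCORE_THREAD_HPP
#define WOLFCORE_THREAD_HPP

// The guard macro doubles as a "was this included yet?" check,
// which #pragma once can't give you.
#ifndef WOLFCORE_PLATFORM_HPP
#error "Include WolfCore/Platform.hpp before WolfCore/Thread.hpp"
#endif

namespace WolfCore
{
    class Thread
    {
        public:
            int platformVersion () const
            { return PlatformVersion; }
    };
}

#endif // WOLFCORE_THREAD_HPP
```

Including Thread.hpp a second time would skip everything between the #ifndef and #endif, and including it without Platform.hpp stops the build with a readable message.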

Forward declarations

Rule: Prefer forward declarations for classes to using includes.

Justification: This is something I never cared about till my compile times increased due to how big my projects have gotten. C++ also has a lot of constructs (templates, inlines) which require headers to be included, so it's best to give the compiler as much help as possible. When modules become standardized, hopefully they'll help with this.
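A small sketch of what that looks like. Sprite and Texture are hypothetical classes, and the two 'files' are shown in one listing. The header only stores a pointer to Texture, so the forward declaration is enough and Texture.hpp only has to be included by the source file that actually uses it.

```cpp
// --- Sprite.hpp (hypothetical) ---
namespace Wolf
{
    class Texture; // forward declaration: no #include "Texture.hpp" here

    class Sprite
    {
        public:
            explicit Sprite (Texture* texture)
            : m_texture (texture)
            {}

            Texture* texture () const
            { return m_texture; }

        private:
            Texture* m_texture; // a pointer to an incomplete type is fine
    };
}

// --- Texture.hpp (hypothetical), only needed where Texture is really used ---
namespace Wolf
{
    class Texture
    {
        public:
            int width () const
            { return 64; }
    };
}
```

Anything that includes Sprite.hpp no longer has to recompile when Texture changes, unless it uses Texture itself.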

Documentation

Rule: Everything in headers should be documented. In source files, just readable comments are needed.

Justification: For me, header files are like the index of a book. The main reason I'm looking through them is to find an API I need, or to learn how to use a class. Having everything documented and neatly sectioned makes this easy (especially when combined with doxygen). I definitely document more than most people do, but it's come in handy too many times to count when coming back to a class I wrote years before and completely forgot the quirks of.

The one other related big rule: if something happens such as a crash due to using an API wrong, the documentation for that function must have pointed that out. If it did, I as a user of the function was wrong; if it didn't, that's a bug in the function (either the documentation should point it out, or the function needs to be fixed). This rule alone has improved the stability of my programs, because just making it not crash anymore doesn't mean it's actually fixed. And if it's possible for the function to detect that condition (in debug mode) and abort, it is required to. If you can get every crash down to where the app detects it before the system does, that is huge. It's much nicer to get an 'Abort: Tried to access a null pointer here [backtrace]' instead of 'SIGSEGV'.
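Here's roughly what that rule looks like in practice. WOLF_REQUIRE and stringLength() are hypothetical names invented for this sketch (not the actual Wolf macros), 'debug mode' is assumed to mean NDEBUG is not defined, and there's no backtrace here, just the file and line.

```cpp
#include <cstddef>
#include <cstdio>
#include <cstdlib>
#include <cstring>

// Hypothetical helper macro: print a readable message, then abort.
#define WOLF_REQUIRE(condition, message) \
    do \
    { \
        if (!(condition)) \
        { \
            std::fprintf(stderr, "Abort: %s (%s:%d)\n", message, __FILE__, __LINE__); \
            std::abort(); \
        } \
    } while (false)

//
/// \brief Get the length of a C string
///
/// \param str The string to measure. Must NOT be null: debug builds abort
///            with a message, release builds are undefined behavior.
//
inline std::size_t stringLength (const char* str)
{
#ifndef NDEBUG
    // The documented misuse is detected before the system turns it into a SIGSEGV.
    WOLF_REQUIRE(str != nullptr, "stringLength() was given a null pointer");
#endif

    return std::strlen(str);
}
```

The documentation states the precondition, and the debug build enforces it, so a null pointer shows up as a readable abort at the call that broke the contract.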

Naming Conventions

Macros

Rule: Macros will be named in all capitals. WOLF_APPLICATION() for example.

Justification: This keeps the preprocessor neatly separated from the rest of the language. Anything in all caps or with #stuff you know happens before the compiler runs.

Constants

Rule: Constants will be CapitalCamelCase. Note this generally only applies to full constants, and not just variables marked 'const'.

Justification: With variables being different, this makes it easy to tell if something's designed to be assignable or not.

Variables

Rule: Variables will be lowerCamelCase (with the exceptions of prefixes). Variable names can have a prefix of '[letters]_[name]' if they fall into specific categories. Namely:

m : A member variable
s : A static variable (usually file scope)

A variable might also have postfix names too, though that's mostly up to the module. Common ones are S for seconds (durationS), Str for string (durationStr) and so on.

Justification: The lowerCamelCase is basically there to contrast with constants, so at a glance you can tell roughly what a name refers to. With the prefixes, you also know the scope of a variable and cannot easily get confused between local variables and long-living variables like members. The prefixes and postfixes are a form of Apps Hungarian notation (not to be confused with Systems Hungarian, a la lpCmdLine, which is what Windows uses and is annoying to work with).
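Putting the constant and variable rules side by side, as a sketch. WolfExample, Timer, and every name inside are invented for illustration.

```cpp
namespace WolfExample // hypothetical namespace
{
    // A full constant: CapitalCamelCase.
    constexpr double SecondsPerMinute = 60.0;

    // A file-scope static: s_ prefix.
    static int s_timerCount = 0;

    class Timer
    {
        public:
            explicit Timer (double durationS)
            : m_durationS (durationS)
            { ++s_timerCount; }

            double durationMinutes () const
            {
                // Marked const, but still just a local variable: lowerCamelCase.
                const double minutes = m_durationS / SecondsPerMinute;
                return minutes;
            }

        private:
            // A member (m_ prefix) holding seconds (S postfix).
            double m_durationS;
    };
}
```

Reading durationMinutes(), you can tell without scrolling that m_durationS outlives the call, SecondsPerMinute never changes, and minutes is a throwaway local.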

Namespaces/Classes/Structs/Enums/Unions

Rule: Containers will use CapitalCamelCase, very much like constants. Add 'Private' somewhere to the name if it's for internal use.

Justification: Mostly distinctiveness again. The mixing between constants and containers also generally isn't a problem, and the compiler will point it out if you accidentally use one as the other.

Functions/Methods

Rule: Functions/Methods will use lowerCamelCase with one of the prefixes below if required. Generally, if a function returns a value, the name should say what it returns (aka do not do val = calculate(), instead val = calculateLength()). Setters start with set, while getters just name the value. Prefixes used include:

p : Protected/Private function - not callable outside the class and/or descendants.
i : Internal function - might be public, but isn't part of the common/normal interface. Be careful!
s : Static function - limited to the file. Note that this ISN'T used for class static functions (since they are public interface).
v : Virtual function. See Private Virtuals below for more details. Not public generally.
r : CRTP function. Called by one of the classes we inherit from. Not public.

Justification: Being different from constants and containers makes names easier to deal with. You can tell instantly that WolfCore::Module is some form of container, while WolfCore::Module::v_init is a virtual function. The mixing with variables isn't an issue since they are never valid in the same contexts.

The setter/getter thing is mostly a question of style. The goal is to make public usage as straightforward as possible, and allow information hiding as easily as possible (since it costs 0 due to inlining). With the member variable rule above, this then means you have something like:

m_value : The member variable 'value' which stores the value.
value() : A function to retrieve the member variable 'value'.
setValue(Type&) : A function to set the member variable 'value'.
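As a concrete sketch of that trio (Window and its title are hypothetical, picked only to have something to store):

```cpp
#include <string>

class Window // hypothetical class
{
    public:
        // Getter: just names the value.
        const std::string& title () const
        { return m_title; }

        // Setter: 'set' plus the value name.
        void setTitle (const std::string& title)
        { m_title = title; }

    private:
        // The member variable behind both.
        std::string m_title;
};
```

Callers write window.title() and window.setTitle("..."), and never see m_title, so the storage can change later without touching them.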

And now probably the most distinct thing in my style: the function prefixes. Anything off the normal public path has a prefix, which allows devs to know the category of a function without affecting users of the class. I'll explain each rule individually, and justify them separately.

p_function() : Protected/Private functions. These are functions the normal user shouldn't ever care about, so there's no reason they should come up in code-completion/etc. When using certain APIs, I've tried to call a function only to realize it was private, and was just part of the header because of how classes work. While IDEs try to show them differently, a little icon change isn't easily noticeable. This makes it a lot more obvious. There is no distinction between protected/private because I default everything to private, and in the few cases you use protected, you know what you're doing (since you're working on the backend of the class).
i_function() : 'Internal' public functions. There are a few cases where you need a function to be public that really shouldn't be - but for some reason friend won't solve it, or makes an even bigger mess. Internal functions allow you to make something public, yet it's still not a 'normal' function. If you ever see a call to one of these, it's a big warning sign that it's doing something odd. An example in Wolf is Matrix::i_fromEigenMatrix() : nothing in Wolf publicly should depend on my matrix implementation being Eigen, yet it's needed for some of the helper classes to be able to easily pass things around.
s_function() : Static per file functions. This is mostly used for private helper functions for classes which aren't part of the public interface. Mostly they'll end up in an anonymous namespace. Things like s_initKeyboardTable() in various window drivers that just set up a global table for looking up keyboard keys. Note this is not used for static class functions. Those use normal naming conventions (like Manager::manager()).
v_function() : This is Private Virtuals. More details at the end, but it is a way to separate virtual implementations from the actual calling interface. You almost never call a v_function() directly, but it marks the interface a parent class might call you with.
r_function() : CRTP. If you don't need the runtime calling abilities that virtuals provide, CRTP allows you to require a specific interface that's completely resolved at compile time (mostly at 0 cost due to inlining). From a code writer's perspective, it's the same as a virtual v_function(). Might write a blog post about CRTP someday, cause it's a handy technique that a lot of people don't know about.

All of these are cases where you're either limited in where you can call it, or you need to be careful when you're calling it. Having these special function prefixes has helped me a lot.
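Since CRTP is the least familiar one, here's a minimal sketch of an r_ function. Shape and Square are hypothetical, and the friend line is just one way to keep the r_ function out of the public interface.

```cpp
// Base class template: the 'interface' that requires r_area() from its child.
template <typename Derived>
class Shape
{
    public:
        double area () const
        {
            // Resolved at compile time: no vtable, and easy to inline.
            return static_cast<const Derived*>(this)->r_area();
        }
};

class Square : public Shape<Square>
{
    // Lets the base class call the private r_ function.
    friend class Shape<Square>;

    public:
        explicit Square (double side)
        : m_side (side)
        {}

    private:
        // CRTP function: called by Shape<Square>, not by users.
        double r_area () const
        { return m_side * m_side; }

        double m_side;
};
```

Users only ever call square.area(). If Square forgets to provide r_area(), the first call to area() fails to compile instead of failing at runtime.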

Indentation/Spacing Rules

These are more the 'feel good' type of rules. Can't really justify most of them other than it's just how I do things.

Tabs (Literal tab character, editor view set to 4 spaces)
Mostly Allman style in formatting. I've experimented a bit with others such as 1TBS, but while Allman loses a bit in density it's pretty nice to read.
Spaces around operators, keywords, etc
I do allow single statement constructs such as if, but I've been trying to switch to a condensed Allman for those cases.

Example:

if (condition)
{ break; }

instead of

if (condition)
    break;

Spaces after function names.
Namespaces use Allman, but if multiple namespaces are grouped together, I use one line. I don't need this as much as I used to, but this is probably the number one thing that editors despise me doing (they keep trying to fix it). This is mostly to prevent indentation explosion.

Example:

namespace Wolf { namespace Core
{
    class A;
} }

C++17 will add namespace Wolf::Core which will help.

Decorations on variables go with the types: aka const Type* t. Yes, I know if you're abusing C this looks weird (the char* a, b example, where only a ends up a pointer), but the answer is to stop abusing C. Yes, it's technically part of the variable not the type, yes it's 'how you use it' and 'not part of its type', but it acts like part of the type, thus it's part of the type. Ducks quack and all. Speaking of...
One declaration per line. End of story. No char* a, b crap. If you have a ton of variables to generate, use preprocessor macros or a supporting tool.
Don't use void in the parameter list. I used to, and a lot of code of mine still does, but blank is just fine. Unless you're in C of course, where (before C23) an empty parameter list means 'unspecified arguments' rather than 'no arguments'.
Prefer using to typedef. C++11 made it way more powerful, and it's more readable too.
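A quick sketch of the last few rules together. All names here (Callback, StringMap, countEntries, sameTarget) are made up for the example.

```cpp
#include <cstddef>
#include <map>
#include <string>
#include <type_traits>

// typedef buries the new name in the middle of the declaration...
typedef void (*OldCallback)(int);

// ...while using reads left to right.
using Callback = void (*)(int);

static_assert(std::is_same<Callback, OldCallback>::value,
              "same type, just spelled differently");

// using can also be a template, which typedef cannot.
template <typename Value>
using StringMap = std::map<std::string, Value>;

// Empty parameter list: no void needed in C++.
inline std::size_t countEntries ()
{
    StringMap<int> table;
    table["a"] = 1;
    table["b"] = 2;
    return table.size();
}

inline bool sameTarget ()
{
    int value = 0;

    // Decorations go with the type, one declaration per line.
    const int* first = &value;
    const int* second = &value;

    return first == second;
}
```

With 'const int* first, second;' the second variable would have been a plain const int, which is exactly the trap the one-per-line rule avoids.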

A quick example of everything put together, WolfThread::Thread's header! (some parts removed for space).

namespace WolfThread
{
    //
    /// \brief A class that represents a thread of execution
    ///
    /// Subclass Constructor must have the format:
    ///   explicit Thread (ThreadFunctionPointer f);
    //
    class Thread
    {
        public:
            /// \name Types
            /// \{

                //
                /// \brief Function prototype for threads
                //
                using FunctionPointer = WolfType::Function;

            /// \}

            /// \name Construction/Destruction
            /// \{

                //
                /// \brief Destroy the thread - will join if joinable
                //
                virtual ~Thread () {}
            
            /// \}
            
            /// \name Joining/Detaching
            /// \{
                //
                /// \brief Join the thread - waits for it to finish its execution
                //
                void join ()
                { v_join(); }
                
                //
                /// \brief Detach the thread - the thread is now independent of this handle
                //
                void detach ()
                { v_detach(); }

            /// \}
            
            WOLF_DEFAULT_MOVE(Thread);
            WOLF_DEFAULT_ASSIGN_MOVE(Thread);
            
            WOLF_DISABLE_COPY (Thread);
            WOLF_DISABLE_ASSIGN (Thread);
            
        protected /*child interface */:
            //
            /// \brief The thread is starting. Should be called by children when the thread started.
            //
            void p_threadStarted ();
            
            //
            /// \brief The thread is exiting. Cleans up runloop/etc : should be called by children classes when the thread is about to return.
            //
            void p_threadExiting ();
            
        private: /* child interface*/
            //
            /// \brief Join the thread - waits for it to finish its execution
            //
            virtual void v_join () = 0;
            
            //
            /// \brief Detach the thread - the thread is now independent of this handle
            //
            virtual void v_detach () = 0;
            
        
    };
}

Private Virtuals

So a quick aside about virtuals. Virtual functions are pretty amazing, but there are a few issues with using them:

Interface and Implementation are mixed. The caller of the function is tied directly to the implementation.
Not able to easily insert your own code. Want code to run before and after every call? Can't really do it, without assuming that your child classes will call back to you. Speaking of..
Can't guarantee what child classes will do. Will they call you before they do things? After?
What if later on you want the interface to change? All the callers and the callees have to change. Even if it's mostly compatible.
If you have behavior in your parent class, you can't guarantee you're overridden. C++ does let a pure virtual function have a body, so you can have default behavior and still be pure, but it's still odd.

So the C++ STL (among other places) has an interesting way to fix these. Instead of having one function (say init()) that is both called and overridden, you have two:

public: void init () { v_init(); } // calls the virtual - public interface
private: virtual void v_init () = 0; // child HAS to override [or not if you don't make it pure]

Even though it's private, you can override it from a child class. They just can't call the base version (since it's private). So what does this gain us?

Interface and Implementation are separate. The implementation only messes with the v_init() one, and doesn't know/care if init() is the only caller of it.
Since the interface is separate, we can easily add any code we want before/after the v_init() call. None of the callers have to change, because it's all filtering through init.
We still can't guarantee what child classes will do, but that's fine because they can't call the parent function anyways! So we can guarantee we can do things before/after them regardless of what they do.
If you want to change the interface or the implementation, go right ahead. Only the callers or callees will need to change as long as you keep them compatible. You can even have multiple virtual interfaces and the public API choose whichever it wants.
You can have behavior in your parent class, yet your function is still pure virtual and must be overridden (since it's not part of the implementation function).
And yes, at least for simple cases like this it's 0 cost - C++ will inline it all anyways.
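To make the before/after point concrete, here's a small sketch. Module, AudioModule, the log, and p_record() are all hypothetical and exist only so the call order is visible.

```cpp
#include <string>
#include <vector>

class Module // hypothetical base class
{
    public:
        virtual ~Module () {}

        // Public interface: the only way callers reach v_init().
        void init ()
        {
            m_log.push_back("before init"); // parent code that always runs first
            v_init();
            m_log.push_back("after init");  // parent code that always runs last
        }

        const std::vector<std::string>& log () const
        { return m_log; }

    protected:
        // Helper so children can add to the log.
        void p_record (const std::string& entry)
        { m_log.push_back(entry); }

    private:
        // Implementation hook: children override it, but cannot call it on the parent.
        virtual void v_init () = 0;

        std::vector<std::string> m_log;
};

class AudioModule : public Module
{
    private:
        void v_init () override
        { p_record("audio init"); }
};
```

After AudioModule::init() runs, the log reads before init, audio init, after init, no matter what the child does inside v_init().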

The one downside you get is that you have to duplicate your functions - for every virtual, you need the public interface function too. Not a huge deal while you're building things. And if you document all interfaces like I do, it costs a bit more time there too.

This takes some time to get used to, but it comes in handy.

Ending

Wow, that turned out to be a lot more than I expected - and I bet I missed a lot of things. Anyways, that's a basic look at my code formatting - I still tweak things here and there, and there are definitely places in my codebase with older styles. The most important thing though is having a style - because even if it's different (Unreal and its 'everything is capitalized' for example), as long as it's consistent it's fine to work with.