The Pragmatic C++ Programmer

A couple of days ago I was studying at my university library when my colleague Miguel Madrid got up and started to traverse the library looking for programming books. It’s a game we usually play: trying to find a good-quality book in a place full of Java 2 SE manuals…

There are some gems in that library though. There are a couple of copies of Alexandrescu’s “Modern C++ Design” (no longer that modern, right?) and “C++ Template Metaprogramming”, the latter borrowed only by me in the last five years according to the registry. I always try to have a copy of both; it’s easy, since there are only a few people doing C++ there, and none of them reach the “TMP mental asylum” I’m usually in.

But that day, Miguel came over with a copy of “The Pragmatic Programmer”. “One of the most influential books in the history of software engineering”, the cover says. I’m always so scared of what software engineering examples look like…

 

Software engineering books

Ignore the fact that I didn’t like the book at all. For me, it’s only another example of how someone sells his own experiences as a “how you should do” book. What really matters for me are the code examples and guidelines.

I’m always suspicious of a book of this kind that provides examples in multiple programming languages. I’m sorry guys, but each language has its own rules, design decisions, and optimal ways to perform a task. A language is designed to be used in a specific way. Of course you can use a chainsaw as a toothbrush, but don’t expect things to work as well as you hoped…

There’s nothing wrong with providing examples in different programming languages at the beginning, but things start to stink when a guideline is implemented in almost exactly the same way in two completely different languages. What’s wrong with “The Pragmatic Programmer”? The fact that the only real difference between its examples is that in Java the author uses dots, and in C++ arrows.

People don’t understand C++

C++ is hard, I will not deny it. But there are a lot of people who do not understand how C++ works, and even if you consider C++ an object-oriented language (I’m afraid it’s not), C++ OOP has nothing to do with Java-like OOP.

OOP is not about objects and classes but about communication between program modules, yet we usually forget that and try to model everything as an object. And since people are used to garbage-collected OO languages, most of them think that objects live elsewhere and should be referenced. So they use the horrible Class* ptr = new Class(); pattern by default, and never grasp the role of constructors and destructors except in wrapper classes that manage new/delete automatically.

From “The Pragmatic Programmer”:

If circumstances permit we can change from a pointer var to an actual node object

Here the author realizes that C++ has ctors and dtors and hence you can write a wrapper class that manages object instantiation/destruction for you. That’s the point where I started to cry. No, being in the pre-move-semantics era is not a reason to Javaize every variable in order to speed up object passing. And I’m sure the author didn’t write the examples in Java++ because of performance concerns.

Last year I had a teacher who, when teaching OpenGL, showed this as an example of what a 2D vector class should look like:

Then I got up in the middle of the classroom and shouted “THAT’S JAVA!!!”. Not kidding, everybody there looked at me as if to say “What is this f… idiot talking about?”. Then I went up to my teacher and asked him why he was doing C++ that way:

I usually write C, C++, and Java in exactly the same way because I find it’s the most elegant.

Ok so you write three of the most different programming languages in the world in exactly the same way…

Then there was the “OK guys, you can optimize your regular polygon function by storing the sin and cos results in a variable instead of computing them on each loop iteration” example, which ignores optimizer capabilities that completely outperform us, such as computing sin and cos together in a single instruction, plus loop hoisting. But I don’t expect any good C/C++ advice from a place where people still think that manual assembly outperforms any compiler, and boast about discarding OpenCV in favor of their own “fast square root routine” for image processing.

Things started to get weird when some mates asked me how they could implement operator+(), since their compiler didn’t allow them to write vector2d* operator+(vector2d*, vector2d*).

The point is that they don’t understand the C++ object model. Objects live on the stack unless explicitly stated otherwise. This is one of the first topics I cover when teaching C++, to make people understand that C++ objects are tied to their scope and cannot be moved out of it. When a function returns a value, the object does not fly out of the function and reach the caller; instead, a value is exchanged between an object living in the callee, an object living in the caller, and an intermediate object between the two contexts that we usually don’t care about. Think of C++ objects as plants, not as bees flying around.
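As a minimal sketch of that object model, assume a hypothetical vector2d made of two floats (the name comes from the question above; the members and the sum_example helper are assumptions for illustration). The pointer-only signature is rejected because an overloaded operator needs at least one parameter of class or enumeration type, so the usual form takes the operands by const reference and returns a new value:

```cpp
// Hypothetical value type: the name vector2d comes from the text,
// the two float members are an assumption for this sketch.
struct vector2d {
    float x;
    float y;
};

// Operands are read through const references; the result is a fresh value.
// No pointer appears anywhere in the signature.
vector2d operator+(const vector2d& lhs, const vector2d& rhs) {
    return {lhs.x + rhs.x, lhs.y + rhs.y};
}

// Hypothetical helper showing typical use: no new, no delete, no pointers.
vector2d sum_example() {
    vector2d a{1.0f, 2.0f};  // a and b live in this function's scope
    vector2d b{3.0f, 4.0f};
    return a + b;            // the caller receives its own value
}
```

The objects a and b never leave sum_example; only their values take part in the sum that the caller ends up holding.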

I don’t expect any C++ class to have a sizeof() greater than 60 bytes. That nearly fills an L1 cache line (typically 64 bytes). And I trust RVO to avoid deep copies. Of course, always profile first, but you may notice that it’s hard to find a context where the compiler does not apply copy elision. Long live value semantics. It’s even better with modern C++, where you don’t worry about object passing anymore, since the last corner cases not covered by N/RVO are handled by move semantics.
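One way to check this claim rather than take it on faith is to instrument a type and count what happens on return. The sketch below uses a hypothetical tracker type (not from the original post) whose copy and move constructors increment counters. Returning a temporary is a candidate for RVO (elision is guaranteed since C++17), and returning a named local is either elided by NRVO or falls back to the move constructor, so the copy counter should stay at zero in both cases:

```cpp
// Hypothetical instrumented type that counts how often it is copied or moved.
struct tracker {
    static int copies;
    static int moves;

    tracker() = default;
    tracker(const tracker&) { ++copies; }
    tracker(tracker&&) noexcept { ++moves; }
};

int tracker::copies = 0;
int tracker::moves = 0;

// Returns a temporary: a candidate for RVO (guaranteed elision since C++17).
tracker make_temporary() {
    return tracker{};
}

// Returns a named local: NRVO may elide the transfer entirely;
// when it does not, the move constructor is used, not the copy constructor.
tracker make_named() {
    tracker t;
    return t;
}
```

Whether the move counter ends up at zero depends on the compiler and its flags, which is exactly why profiling or inspecting the output is still worthwhile.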

But why? Why is C++ that hard to get?

Ask yourselves this. At least in my experience, when I ask people why they find C++ that hard, they answer something along the lines of:

Manual memory management. C++ has no GC.

Garbage collection…. Why do we need garbage collection if we have our beloved

Garbage collectors aren't required in C++ when there is }

?
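To make the closing brace concrete, here is a small sketch with a hypothetical scoped_resource class (an assumption for illustration, not code from the post) that records when it is acquired and released. In everyday code this role is played by standard handles such as std::vector, std::string or std::unique_ptr; the hand-written destructor here exists only to make the release visible:

```cpp
#include <string>
#include <utility>
#include <vector>

// Hypothetical resource handle that logs its lifetime into a caller-supplied vector.
class scoped_resource {
public:
    scoped_resource(std::string name, std::vector<std::string>& log)
        : name_(std::move(name)), log_(log) {
        log_.push_back("acquire " + name_);
    }

    // Runs automatically when the enclosing scope ends.
    ~scoped_resource() { log_.push_back("release " + name_); }

    // A unique handle: copying it would release the resource twice.
    scoped_resource(const scoped_resource&) = delete;
    scoped_resource& operator=(const scoped_resource&) = delete;

private:
    std::string name_;
    std::vector<std::string>& log_;
};

// Returns the sequence of events recorded around one inner scope.
std::vector<std::string> run_scope() {
    std::vector<std::string> log;
    {
        scoped_resource a("a", log);
        scoped_resource b("b", log);
        log.push_back("work");
    }  // b is released first, then a, right at this brace
    return log;
}
```

Release happens in reverse order of construction and at a known point in the program, with no collector involved.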

Also consider the special member functions of a class: ctors, dtor, assignment operators. I’m always surprised to see people reinventing the wheel when an aggregation of STL resource handlers does the work automatically. This is the very well-known Rule of Zero.

Here’s some real code:

Clearly the author comes from a C background. This class suffers from a lot of repetitive code, code that mimics the job the compiler already does for class member variables (look at the init() function and where it’s called from), etc. I’m not worried about performance here; you may be surprised to learn that this was extracted from a library that has the merit of being the fastest lib in its field, outperforming even the code of the C++ gurus at Boost. Imagine what this lib could do after an in-depth C++ guidelines review.

C++ is hard, but it’s not hard on “mundane” tasks like defining a class and its special members. It’s the programmer who makes it hard by ignoring how the language works. I always say this to my C++ pupils: I don’t remember the last time I wrote a C++ destructor, assignment operator, etc., except for freaky purposes.

Dealing with templates can be hard. Dealing with name lookup rules is hard. But implementing object value semantics in a class that’s just an aggregate of other objects is not hard. You should just rely on the language.
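A minimal sketch of what relying on the language looks like, using a hypothetical polyline class (the name and members are assumptions for illustration). It declares no destructor, copy or move constructor, or assignment operator; the compiler generates all of them memberwise from std::string and std::vector:

```cpp
#include <string>
#include <vector>

// Hypothetical class: a plain aggregate of standard resource handlers.
// No special member function is written by hand (Rule of Zero).
struct polyline {
    std::string name;
    std::vector<float> xs;
    std::vector<float> ys;
};

// Returned by value; copy elision or the generated move constructor
// takes care of handing the result to the caller.
polyline make_square() {
    return polyline{"square",
                    {0.0f, 1.0f, 1.0f, 0.0f},
                    {0.0f, 0.0f, 1.0f, 1.0f}};
}
```

Copying a polyline deep-copies its coordinates, moving one transfers the underlying buffers, and destroying one frees them, all without a single line of resource management code in the class itself.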

So, what should we do?

That depends on the context, of course. But my main advice is to know how something works before buzzing about how horrible it is. Do C in C++ if you like, even Java++, but then don’t cry when your codebase starts going crazy.

Here is some advice of my own:

  • Know the language: C++ is a huge elephant, but the elephant is not exactly what you are used to in other OO languages. I don’t even consider C++ an object-oriented language, since it’s not object/class centered. Choose the tool (functional programming, generic algorithms, objects, whatever) that best fits your problem. If you strip away some of those paradigms and use only a small subset of the language, it does not work as well or as easily as it could.
  • Understand how it works: If you try to do things the same way you learned in other languages, things go wrong. Each language is different. Pay special attention to software engineering guidelines, OO patterns, etc., since these are usually written for mainstream OO languages based on reference semantics. They may work, but they can be suboptimal with C++ value semantics. Doing reference semantics, i.e. Javaizing everything with pointers/smart pointers, does not work well, since C++ is not designed for such intensive use of dynamic allocation. Take into account that in reference-based OO languages such as Java, doing new has almost zero cost, since the whole language and its runtime are designed to work that way. That’s not the case for C++.
  • Trust the compiler: The ages when a compiler was a mere code translator are gone. When optimizing by hand you are trying to beat the result of almost 30 years of research on compilers and optimization algorithms, boxed in a thing that runs on a chip that measures its computational power in MFLOPS. Even if you have an awesome brain that can compete with that power, you will be doomed at the point of code generation, since you play in an ecosystem where CPU instruction sets and architectures are very far from the “fetch, decode, execute” scheme we all learned at school/college. The Ford factory pipeline is not a valid metaphor for how CPUs work these days. This is reflected in the fact that if you think of C++ as a “syntactical abstraction layer hiding a couple of assembly instructions” you are wrong 99% of the time. That’s no longer the case, since hardware is not that simple. Think of your C++ code as a high-level description of what the program should do, not of how it really does it.
    But be careful! Don’t treat the compiler as a genius! Check its assembly output from time to time to see what it actually did. The cool point here is that the more readable the code is, the more optimizable it is, since the optimizer understands your intention. Write convoluted code like a fast square root routine that I’m sure relies on Undefined Behavior and you will get code that runs 30% slower than what GCC would generate on its own, considering the effort the glibc guys put into efficient floating point code generation.
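As a small illustration of the “readable code” side of that advice, here is a sketch that states the intent directly through std::sqrt instead of a hand-rolled routine. The vector2d type and the length function are hypothetical names for this example, and no performance figure is claimed for it; the point is only that the code says what it means and leaves instruction selection to the compiler:

```cpp
#include <cmath>

// Hypothetical value type, assumed for this sketch.
struct vector2d {
    float x;
    float y;
};

// States the intent directly: the Euclidean length of v.
// How the square root is actually computed is left to the compiler
// and the standard library.
float length(const vector2d& v) {
    return std::sqrt(v.x * v.x + v.y * v.y);
}
```

If this ever shows up in a profile, that is the moment to look at the generated assembly and measure alternatives, not before.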

 


  • Collin Rogowski

    Interesting article, although it could have been a little bit less “ranty” for my taste. ;-)

    When I started working with C++ coming from a Java background I did the same thing you described and worked only with pointers to objects and not with objects. Until it hit me that this was not the way to go :-)

    What I find the most difficult while (still) learning this, is that the decision to switch from objects on the stack to objects on the heap depends on the size of the object. So if I work incrementally (starting with a small class which gets bigger and bigger as my program grows), I suddenly have to switch from objects to pointers… And that’s a lot of work. I guess you could argue that maybe you should strive to only have classes that are small enough so that this problem doesn’t arise… Do you have some insights into this?

    • Manu343726

      I think that @MaxGalkin summarizes it pretty well: http://yacoder.net/blog/2015/04/26/cpp-curiosities-move-semantics-sizeof-areaof-and-pointy-types/

      The point is that C++ objects come primarily from two categories:

      – Values
      – Resource handlers

      The former are those classes that are a pure aggregation of values; think of std::complex or a proper vector2d implementation instead of the monstrosity above. These classes keep all the data they manage directly in the same location (stack, heap, whatever) where the object lives, and that data is directly part of the object’s memory. As you can see, there’s no point in allocating an object of this category on the heap if you are playing on the stack. Since sizeof(x) == areaof(x), C++11 move semantics do not make any performance difference.

      The latter are those classes that manage a resource that either needs to be allocated at runtime (a dynamic array, etc.) or is not managed directly by the language (a file handle, for example). In these cases, not all data managed by the object is part of the object’s own data; some of it is referenced by one or more of its members. Even if that’s the case, **the object can still live on the stack, and there’s no reason to dynamically allocate the object**. As Max says in the article, move semantics are worth it if sizeof(T) << areaof(T), so that there's a difference between swapping the handles and copying the data those handles refer to.

      Most huge objects are really objects with big areaof() but not sizeof(), that is, when an object manages large chunks of data it usually does externally to the object itself. We have good examples in the standard library, such as vector, string, map, etc.

      So my answer is: Since C++11 the language has proper semantics to handle "resource bypassing": Move semantics means that you are able to differentiate when a resource is being passed to another context (object), instead of wasting time copying the resource into the new context and then delete the original (now dead) resource.
      The only true reason you might want dynamic allocation of objects is to share them between multiple contexts, and that's exactly what std::shared_ptr (A resource handler ready for that purpose) is supposed to do. As you can see, you can avoid new/delete/naked-ptrs completely in modern c++.

      • Nikesh Singh Tanwer

        Any Suggestion for newbie in C++, from where i should start learning C++, so that i will not do this kind of mistakes ?

      • Collin Rogowski

        Thx a lot. I read up on move semantics (didn’t know it beforehand). It seems C++11 is actually quite different from the old C++. This is gonna take a while to get ingrained… :-)

  • http://about.me/bradserbu Brad Serbu

    Are you using Java++ as a tongue and cheek reference to Java?
    Is it an idiomatic term used in other communities/forums/chatrooms/posts that I’m not familiar with?

    Loved the article and the use of that term in no way detracted from the comprehension of it, but I can’t help but wonder the intent behind the usage of the term.

    Cheers.

    • Manu343726

      > Are you using Java++ as a tongue and cheek reference to Java?

      Yep, that’s how I refer to that code that’s supposed to be C++ since they are compiling with a C++ compiler, but has more to do with Java idioms and related.

      • http://about.me/bradserbu Brad Serbu

        Thanks. I thought you were referring to Java code. It’s clear to me now the term “Java++” = C++ code written like it was Java – i.e. the example of your teacher.

  • Liquidify

    Any advice for a complete noob to C++ coming from a very light python background? What are the first books and steps I should be taking?
