C++ criticism by other people

Part of C++ FQA Lite

This page is a collection of the best C++ criticism by FQA readers, copied from e-mail messages and online discussions. If you know an interesting consequence of C++ problems not mentioned in the FQA, please send me e-mail. Similarly to the FQA errors page, this one lists things that can be proved / tested rather than qualitative statements. The stuff is published with credits or anonymously, according to the choice of each author.

The issues listed here (or the FQA itself) are not supposed to be "new" in the sense that they were never discussed in a published work (if C++ problems were so hard to discover as to take decades, discussing them wouldn't necessarily be worth the trouble).

There's lots of (well-reasoned or entertaining or both) C++ criticism on the web, including several pieces by celebrity programmers. However, I made the decision not to cite famous quotes by celebrities on this site. The main reason is that I don't want to make the FQA more convincing to the people who mostly value the credentials of an author and ignore things like facts and reasoning. I want to work less both with C++ and with these people. So I'd rather have them use C++ than convince them to switch to something else.

Implicit type conversions

Anonymous: I'd add something about the broken type system - how the following code is legal and compiles without warnings with most compilers, for example:

#include <string>
void foo(const std::string &) {}
int main() 
{
    foo(false);
}

Another example:

class A { public: int a; };
class B : public A { public: int b; };
int main() 
{
    A * p = new B[10];
    p[5].a = 1;
}

Why should a static type system allow this without an explicit cast?

Yossi: The FQA doesn't talk much about implicit type conversions, since the FAQ doesn't. The problem is quite important though. It wouldn't be so bad if C++ detected run time errors (as opposed to compiling the second example to code modifying the wrong place), and/or if so many C++ programmers didn't think that "with C++, when it compiles and links, it will run correctly" (I actually heard this one, and then there are many large C++ monolithic applications without unit tests speaking for themselves).

Note that this has nothing to do with safety and C++ being a "power tool allowing you to do dangerous things" because it's so "high-performance". This argument only makes sense for explicit casts. What the code demonstrates is unexpected interactions between pairs of different implicit conversions (in the first example, bool -> const char* -> std::string; in the second, B[] -> B* -> A*).

By the way, the second example explains the FAQ's remark about arrays being evil in the context of inheritance and substitutability. The thing is that with arrays of objects, there's an implicit type conversion that allows you to violate the substitutability principle without a compile time error. With std::vector<B>, there's no implicit cast. I didn't understand the FAQ was talking about that, because most of the time, when inheritance and polymorphism are involved, you allocate arrays of pointers to objects. And in that case, there's no difference between a vector<B*> and a B** - you'd need an explicit cast in both cases. I automatically thought the question was the continuation of the discussion in preceding questions about why the compiler wouldn't do the cast implicitly, and in that context the "arrays are evil" remark didn't quite fit. I completely forgot about the arrays-of-objects case, which is something a newbie coming from another language (and with C++, some people stay "newbies" for years) could very well try to use.
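The conversions involved can be checked at compile time. The following is a minimal sketch, assuming C++11 type traits; A and B mirror the classes in the example above, and everything else is illustration rather than anything taken from the FAQ. It shows that an array of B converts to A* with no cast, while std::vector<B> has no comparable implicit conversion:

```cpp
#include <type_traits>
#include <vector>

struct A { int a; };
struct B : A { int b; };

// A pointer to derived converts implicitly to a pointer to base...
static_assert(std::is_convertible<B*, A*>::value, "B* -> A* is implicit");

// ...and an array decays to a pointer first, so B[10] reaches A* with no cast.
static_assert(std::is_convertible<B (&)[10], A*>::value,
              "B[10] -> B* -> A* is implicit");

// Indexing through A* steps by sizeof(A), but the elements are sizeof(B) apart.
static_assert(sizeof(B) > sizeof(A), "p[5] would land inside the wrong element");

// The containers are unrelated types: no implicit conversion in either form.
static_assert(!std::is_convertible<std::vector<B>, std::vector<A>>::value,
              "vector<B> is not a vector<A>");
static_assert(!std::is_convertible<std::vector<B>&, A*>::value,
              "vector<B> does not decay to A*");
```

If any of these assumptions were wrong, the file would simply fail to compile.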

C++ grammar: the type name vs object name issue

drorz: In C/C++ you can not separate parsing into separate syntax and semantic passes. No existing compiler does it in two separate passes.

In the example:

AA BB(CC);

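The parse tree is different in the following cases: if CC names a type, the statement declares a function BB that takes a CC and returns an AA; if CC names an object, it defines an object BB of type AA initialized with CC.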

You cannot (more precisely, no one has done it in a real C/C++ compiler) fix a wrong parse tree in the semantic analysis pass.

Consider this example:

x * y(z);

in two different contexts:

int main() {
    int x, y(int), z;
    x * y(z);
}

and

int main() {
    struct x { x(int) {} } *z;
    x * y(z);
}

In the first case x * y(z) is an expression, and in the second case it is a declaration of the pointer y. The parse trees for those cases are completely different.

Yossi: This is the first part of the problem making the C++ grammar undecidable. The second part of the problem is that AA may really be Template<Params>::InnerDef, and figuring out whether InnerDef is a type name or an object name is equivalent to solving the halting problem, since templates may instantiate themselves recursively and in fact represent arbitrary recursive functions. Maybe I'll expand on this one later. In particular, it has to do with template specializations, which are discussed in the next item.

Purists who don't like the "nearly context-free" expression in Defective C++: when you write parsers, it does make sense to discuss the "extent" of your dependence on the context. For example, C++ inherits the type name/object name riddle from C. But in C, you can solve it using a single dictionary of typedef names. Of course, theoretically the important part is that the C grammar is decidable (though not context-free). In practice, what matters is that it's easy to parse. In particular, you can use a parser generator for context-free grammars (yacc/bison is one mature program in this family) with the simple "symbol table hack" described above, and get a working parser. This is what "nearly context-free" means.
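The "symbol table hack" can be sketched in a few lines. This is only an illustration under stated assumptions: classify, StatementKind and the type_names table are hypothetical names, and a real parser would update the table as it reads declarations instead of receiving it ready-made.

```cpp
#include <cassert>
#include <set>
#include <string>

enum class StatementKind { Declaration, Expression };

// Hypothetical helper: classifies a statement shaped like "x * y(z);"
// by checking whether its first identifier is a known type name.
inline StatementKind classify(const std::set<std::string>& type_names,
                              const std::string& first_identifier) {
    return type_names.count(first_identifier) != 0 ? StatementKind::Declaration
                                                   : StatementKind::Expression;
}

// Usage: the same tokens are classified differently depending on the table.
inline void classify_example() {
    const std::set<std::string> no_types;           // after "int x, y(int), z;"
    const std::set<std::string> x_is_type = {"x"};  // after "struct x { ... } *z;"
    assert(classify(no_types, "x") == StatementKind::Expression);
    assert(classify(x_is_type, "x") == StatementKind::Declaration);
}
```

A single lookup like this is enough for C. It stops being enough once the name in question is a member of a template specialization.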

I think the example is excellent since it took me quite some time to figure out what the code means myself (I think it's the asterisk). The AA BB(CC); example used in Defective C++ is simpler, but I think it doesn't convey the point as clearly, since apparently it makes it intuitively easier to counter with something like "you can solve the ambiguity at the semantic analysis stage". Note that you can always counter with that - for example, you can say that the "parse tree" of your language is simply the list of characters in the file, and the rest is semantic analysis.

C++ grammar: type vs object and template specializations

drorz: Consider the following example:

  
#include <cstdio>
template<int n> struct confusing
{
    static int q;
};
template<> struct confusing<1>
{
    template<int n>
    struct q
    {
        q(int x)
        {
            printf("Separated syntax and semantics.\n");
        }
        operator int () { return 0; }
    };
};
char x;
int main()
{
    int x = confusing<sizeof(x)>::q < 3 > (2);
    return 0;
}

If you "didn't care" about semantics during parsing, then confusing<1>::q is a typename, so confusing<1>::q<3>(2) creates an object of type confusing<1>::q<3> with the argument 2.

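If you "do" semantics during the syntax pass, then confusing<4> will be looked up (sizeof(x) refers to the int x being declared, not to the global char x, and sizeof(int) is 4 on most platforms), and confusing<4>::q is a variable. The declaration would "expand" to int x = (confusing<4>::q < 3) > 2.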

You can see that parse trees in those cases are completely different, based on the output of the sizeof operator!
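Both readings can be made visible side by side. The sketch below is a variation on the example, not the original code: the members are made constexpr (an assumption that needs C++11) so that the results can be checked with static_assert, and the conversion operator returns 7 instead of 0 to tell the two cases apart. Compilers may warn about the chained comparison in the first line.

```cpp
template<int n> struct confusing {
    static constexpr int q = 0;  // here q names a variable
};

template<> struct confusing<1> {
    template<int n> struct q {   // here q names a class template
        constexpr q(int) {}
        constexpr operator int() const { return 7; }
    };
};

// Parsed as (confusing<4>::q < 3) > 2, that is (0 < 3) > 2, that is 1 > 2.
constexpr int as_comparison = confusing<4>::q < 3 > (2);

// Parsed as a temporary of type confusing<1>::q<3> built from 2,
// which is then converted to int by the conversion operator.
constexpr int as_construction = confusing<1>::q < 3 > (2);

static_assert(as_comparison == 0, "comparison chain evaluates to false");
static_assert(as_construction == 7, "object construction, then operator int");
```

The token sequence after the :: is identical in both initializers; only the template argument differs.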

Yossi: ...and sizeof depends on the platform and the implementation details of inheritance (including multiple and virtual), virtual functions, etc. The "parser" gets closer and closer to a full-blown compiler.

The problem is the freedom that template specializations have when defining members. Now if anybody showed me a useful application of the ability to define something as an inner type in one specialization and a static variable in some other specialization, I'd be surprised.

The reddit thread which this and the previous example are taken from has a detailed discussion about parsing C++.

printf, iostream and internationalization

Alexander E. Patrakov (patrakov at ums dot usu dot ru): The FQA lists valid information for and against the use of <iostream> instead of <cstdio>. There is, however, one more thing for <cstdio> and against <iostream>: the possibility to translate program messages to a different natural language (using, e.g., gettext). And here I don't mean that there is currently no gettext equivalent for C++ iostreams, but that there is no way to design such a thing correctly.

Translation works on phrases, not on their parts. Consider, e.g., such C statements:

printf("Read %d files\n", total);
printf("New data were found in %d files\n", found);

With the standard C++ iostreams, this becomes:

cout << "Read " << total << " files\n";
cout << "New data were found in " << found << " files\n";

A well-designed program fetches translations from a message catalog, Windows resource or anywhere else except its own source code. With C and gettext message catalogs, the translator sees the whole phrases such as "Read %d files", "New data were found in %d files", etc. If the same approach were applied to C++, the translator would see just "Read ", "New data were found in ", and "files" (used twice). Lack of context is the least of all worries. The real problem is that, e.g., when translating to Russian, the two instances of " files" have to be translated slightly differently, because Russian has six grammatical cases and different cases are required in the two sentences:

Read %d files => Прочитано %d файлов
New data were found in %d files => Новые данные были найдены в %d файлах

(approximately - I don't want to overwhelm the example with the singular/plural treatment)

Even worse, examples exist with two format substitutions where they have to be reordered when translating. C (or, more precisely, the Single UNIX Specification) allows such reordering with something like printf("%2$d x %1$d inches", width, num); but in C++ the output order of fields is hard-coded.

The downside is, of course, that nobody except the translator checks the translated format string, and wrongly-copied conversion specifiers can crash a program in the corresponding locale (and this did happen with sed and vim in the past).
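To make the whole-phrase approach concrete, here is a minimal sketch. The translate function and its in-memory table are hypothetical stand-ins for a real catalog lookup such as gettext, and format_count is an illustrative helper; only std::snprintf is a real library call.

```cpp
#include <cassert>
#include <cstdio>
#include <map>
#include <string>

// Hypothetical stand-in for a gettext-style catalog lookup: the key is the
// whole English phrase, so the translator sees the whole sentence.
inline const char* translate(const char* msgid) {
    static const std::map<std::string, std::string> russian = {
        {"Read %d files", "Прочитано %d файлов"},
        {"New data were found in %d files", "Новые данные были найдены в %d файлах"},
    };
    const auto it = russian.find(msgid);
    return it != russian.end() ? it->second.c_str() : msgid;  // fall back to English
}

// Formats one translated message that carries a single %d.
inline std::string format_count(const char* msgid, int count) {
    char buffer[256];
    std::snprintf(buffer, sizeof buffer, translate(msgid), count);
    return buffer;
}

// Usage: each sentence gets its own grammatical case for "files".
inline void format_count_example() {
    assert(format_count("Read %d files", 3) == "Прочитано 3 файлов");
    assert(format_count("New data were found in %d files", 3) ==
           "Новые данные были найдены в 3 файлах");
}
```

The format string reaches snprintf unchecked, which is exactly the downside mentioned above: a catalog entry with the wrong conversion specifier would go undetected until run time.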

See how Trolltech handles the abovementioned problems in their Qt toolkit.

Yossi: I really like this example because it can be a real eye-opener for a practical programmer, and I wish I had heard and thought about it several years ago. Clearly the printf interface gets something right that iostream doesn't, since it seems to save us lots of trouble. What is it that printf gets right? Could it be that representing the program structure using compile time constructs incomprehensible to any tool except for the compiler is not the way to go? Effectively the advantage of the printf program is that it's easier for other programs to manipulate. The idea that backfires is that program structure may be encoded in arbitrarily complex ways and the only one who ever has to worry about it is the compiler writer.

But maybe this translation business is a singularity in the computing universe, and we shouldn't infer general conclusions from it? Well, here's another example. Suppose you want to do real time logging. You don't have enough time and/or bandwidth to do the formatting at the target machine. And yet you want to log free text, not some strict binary format with versioning schemes and fixed size limits and other headaches. With printf-style interface, you can log packets of (for example) 32 bit words - size, constant format string pointer, and the list of arguments. You can then extract the format strings from the executable file (reading ELF or COFF files is easy - there are examples on the net of about 200 lines of C code), and do the formatting at the host machine. Now, with iostream-like interface, the format string is split to many little parts, and all kinds of types come in the middle - types of data items have to be encoded in the logged packets, too. And you'd have to log calls to I/O manipulators such as hex, setfill, etc. Clearly the overhead per logged data word is going to increase significantly.
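The logging scheme can be sketched as follows. Everything here is hypothetical (LogPacket, make_packet, render), and both sides run in one process, so the format pointer is used directly; in the setup described above the host would instead map the logged address to a string extracted from the executable file.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdio>
#include <string>
#include <utility>
#include <vector>

// Hypothetical packet: the address of a constant format string plus raw
// 32-bit argument words. No text is produced when the packet is created.
struct LogPacket {
    const char* format;
    std::vector<std::int32_t> args;
};

// "Target" side: store the pointer and the words, nothing else.
inline LogPacket make_packet(const char* format, std::vector<std::int32_t> args) {
    return LogPacket{format, std::move(args)};
}

// "Host" side: do the actual formatting. This sketch handles at most three
// integer arguments; surplus arguments passed to snprintf are ignored.
inline std::string render(const LogPacket& packet) {
    int words[3] = {0, 0, 0};
    for (std::size_t i = 0; i < packet.args.size() && i < 3; ++i) {
        words[i] = static_cast<int>(packet.args[i]);
    }
    char buffer[256];
    std::snprintf(buffer, sizeof buffer, packet.format, words[0], words[1], words[2]);
    return buffer;
}

// Usage: formatting happens only when the packet is rendered.
inline void render_example() {
    const LogPacket packet = make_packet("temp=%d limit=%d", {21, 30});
    assert(render(packet) == "temp=21 limit=30");
}
```

The per-message cost on the target is one pointer plus the argument words, whatever the length of the text.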

Think about it: how can it be that a simplistic "1 format string plus N arguments of dynamic types" interface beats an advanced "statically dispatched polymorphic operators" interface, and what makes it surprising to you?

Static binding rules

Miguel Catalina: The following test program does not compile under gcc 4.3.{1,2}:

#include <cmath>
struct my_class {
  my_class(int) {}
};
inline my_class operator&&(my_class,int){return my_class(0);}
int main(void)
{
  double x = std::pow(1.0, 1.0);
  (void) x; // to avoid unused variable warning
}

The error message is:

$ g++ -Wall -pedantic-errors simple_test.cpp
simple_test.cpp: In function ‘int main()’:
simple_test.cpp:11: error: ambiguous overload for ‘operator&&’ in ‘std::__traitor<std::__is_integer<double>, std::__is_floating<double> >::__value && std::__traitor<std::__is_integer<double>, std::__is_floating<double> >::__value’
simple_test.cpp:11: note: candidates are: operator&&(bool, bool) <built-in>
simple_test.cpp:7: note:                 my_class operator&&(my_class, int)

The reason for this error has to do with this declaration in cmath:

  template<typename _Tp, typename _Up>
    inline
    typename __gnu_cxx::__promote_2<
    typename __gnu_cxx::__enable_if<__is_arithmetic<_Tp>::__value
				    && __is_arithmetic<_Up>::__value,
				    _Tp>::__type, _Up>::__type
    pow(_Tp __x, _Up __y)
    {
      typedef typename __gnu_cxx::__promote_2<_Tp, _Up>::__type __type;
      return pow(__type(__x), __type(__y));
    }

Tracing a few header files up brings us to bits/cpp_type_traits.h:

  template<typename _Tp>
    struct __is_arithmetic
    : public __traitor<__is_integer<_Tp>, __is_floating<_Tp> >
    { };

...and:

  template<class _Sp, class _Tp>
    struct __traitor
    {
      enum { __value = bool(_Sp::__value) || bool(_Tp::__value) };
      typedef typename __truth_type<__value>::__type __type;
    };

So it turns out that the operation that is giving us trouble is the && inside the __enable_if in the template declaration of pow(). We are invoking && with two enum operands (__is_arithmetic<T>::__value is an unnamed enum). I guess the compiler is treating unnamed enums as ints. So the compiler is trying to call operator&&(int,int). But there isn't one; there are only operator&&(my_class,int) and operator&&(bool,bool). So the compiler is trying to do an implicit conversion of the operands so that they can match the available prototypes. There are implicit ways of converting an int to a my_class, as well as converting an int to a bool. The compiler does not know which one to use, hence the ambiguity.

The question is: why on Earth when you are trying to invoke a function that only deals with doubles, do you have to deal with the ambiguity between two available implicit conversions for types that have nothing to do with double?
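The ambiguity can be reproduced without <cmath>. The following is a minimal sketch in which flag is a hypothetical unnamed enumerator standing in for __traitor<...>::__value; the commented-out line is the one that would be rejected, and the comments give my reading of the overload resolution rules. The last line shows that converting the operands to bool at the point of use removes the ambiguity, which is something only the author of the header can do.

```cpp
struct my_class {
    my_class(int) {}
};
inline my_class operator&&(my_class, int) { return my_class(0); }

enum { flag = 1 };  // unnamed enum, like __traitor<...>::__value

// flag && flag would be ambiguous here:
//   built-in operator&&(bool, bool): enum -> bool for both operands
//   operator&&(my_class, int):       enum -> my_class (user-defined conversion)
//                                    enum -> int      (integral promotion)
// The built-in wins on the first operand, the user-defined operator on the
// second, so neither candidate is better overall.
// static_assert(flag && flag, "does not compile");

// With bool operands the built-in operator is an exact match for both
// arguments and is selected.
static_assert(bool(flag) && bool(flag), "built-in && is selected");
```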

Yossi: Takes time to wrap one's mind around this, um, treason (don't you just love the public __traitor bit? I guess a "traitor" is something used to generate so-called "type traits", a key idiom in the world of C++ templates arcana). Now that we (presumably) understand the error message, how would you work around the problem? If the compiler would barf trying to dispatch an operator with user-defined types, we could specifically define the operator with the prototype it would pick as the best match (as the GNU STL implementors themselves do in similar situations). But we can't define operator&&(int,int). Now what?


Copyright © 2007-2009 Yossi Kreinin
revised 17 October 2009