Saturday, February 18, 2017

What's wrong with C?

(The following post is adapted from an answer I wrote in response to this question.)

I started programming in C++ when I was a sophomore in college. Later, I also learned to program in C (without the ++) when working as a student Unix systems administrator.  I wrote somewhere between 10,000 and 20,000 lines of C++ when working on my doctoral dissertation.  Early in my career as a professor, I taught both the C++ and C languages in different courses at different times.

I've since given up entirely on the C and C++ languages. I no longer teach them or write any code in them.

So what do I use now? For quick scripting, I use Python. For building something with a GUI, I usually program in Java. For robot programming, I use both Python and Java, depending on the situation and the available libraries. (I have ambitions to begin using Rust for robot programming as well.)  I like Haskell a lot, but I have not really found a niche for it in the projects I have been pursuing.

My goal in this post is to outline my rationale for giving up on the C family. Some of these comments apply specifically to C, others to both C and C++. None of these critiques are original; all of these points have been made by other people. I am assembling them here to give a concise rationale as to why one long-time C/C++ programmer decided to move on.

What is great about C is that it is well-suited to low-level programming that directly addresses hardware. Unlike Python, Java, any .NET language, or anything with a garbage collector, it does not have a run-time. The code you write in C is translated directly into assembly language, and aside from libraries you explicitly invoke, the only code that runs is the code you write. If you are writing an operating system or code for an embedded device with real-time constraints, it can be an appealing option.  (I now believe that Rust is a strictly superior option in these situations, but that is a subject for a future blog post.)

However, there are numerous problems with C that often make it unproductive or impractical. Problems that also apply to C++ are noted explicitly:
  • Minimal facilities for data abstraction (C). Higher-level languages like Python and Java provide many advanced mechanisms for data abstraction, including classes, objects, polymorphism, lambdas, and so forth. While all of the above can be imitated using C (especially through clever use of function pointers), it is quite painful.
  • Minimal data-structure libraries (C). Other languages include data-structure libraries as part of their standard libraries. While third-party data-structure libraries are available for C, finding, vetting, and building them is a significant inconvenience compared to using standard libraries shipped with every implementation.
  • Poor string-handling (C). Using strings in C requires lots of low-level operations. Equivalent work can be done in languages like Python with much less code.
  • Lack of safety (C, C++). In a safe language, like Java or Python, errors in programs are trapped as they happen. C is an unsafe language. That is, many programming errors, such as reading past the end of an array or using a pointer to freed memory, result in undefined behavior rather than a reported error. This can make C programs insanely difficult to debug. In a safe language, the program throws an exception; you trace the exception, find the problem, and fix it. In an unsafe language, all bets are off. Get very familiar with valgrind.
  • Poorly defined semantics (C, C++). According to the C standard, the compiler can generate any code it wants in the presence of undefined behavior. Furthermore, even extremely experienced C programmers can fail to recognize situations that result in undefined behavior. As compiler writers discover more opportunities to optimize away code that results in undefined behavior, existing programs that were believed to work can suddenly start failing simply because the compiler was updated. As bug submitter felix-gcc put it, “Guys, your obligation is not just to implement the C standard. Your obligation is also not to break apps that depend on you.” And this isn’t just one developer; even gcc itself and the Linux kernel frequently execute code with undefined behavior. The fact that this can happen and is so pervasive reveals fatal flaws in the C standard itself. For more details about these problems, I recommend the following:
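
To make the undefined-behavior point in the last bullet concrete, here is a minimal sketch of a signed-overflow check of the kind an optimizing compiler is allowed to remove. The function names are hypothetical and invented for this illustration; they are not taken from gcc, the Linux kernel, or any bug report mentioned above.

```c
#include <limits.h>
#include <stdbool.h>

/* Hypothetical helper names, chosen for this illustration only. */

/* Intended to detect whether x + 1 would overflow. For x == INT_MAX the
 * addition itself is a signed overflow, which is undefined behavior, so an
 * optimizing compiler is permitted to assume it never happens and reduce
 * the whole comparison to "false". */
bool increment_overflows_fragile(int x) {
    return x + 1 < x;
}

/* Same intent, but the check is made before any arithmetic is performed,
 * so every possible input has well-defined behavior. */
bool increment_overflows_checked(int x) {
    return x == INT_MAX;
}
```

Both functions agree on every input except INT_MAX. For that one input, the result of the fragile version depends on the compiler and optimization level, which is why no expected output is given for it. Nothing in the source text warns the programmer that the first version is the dangerous one.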
All of that said, what really makes this issue important to me is the fact that I am now my department's Operating Systems instructor. I had to conclude that I did not want to teach that course using the C language. I have decided to teach it using Rust, the details of which I will describe in a series of future blog posts.
