What makes cover-up preferable to error handling

October 2nd, 2009

There was a Forth tutorial, which I now fail to find, that literally had a "crash course" right at the beginning, where you were shown how to crash a Forth interpreter. Not much of a challenge – `echo 0 @ | pforth` does the trick for me – but I liked the presentation: "now we've learned how to crash, no need to return to that in the future".

So, let's have a Python & Perl crash course – do something illegal and see what happens. We'll start with my favorite felony – out-of-bounds array access:

python -c 'a=(1,2,3); print "5th:",a[5]'
perl -e '@a=(1,2,3); print "5th: $a[5]\n"'

The output:

5th:
Traceback (most recent call last):
  File "<string>", line 1, in <module>
IndexError: tuple index out of range
5th:

Python in fact crashed, telling us it didn't like the index.

Perl was more kind and rewarded our out-of-bounds index with what looks like the empty string. Being the kind of evildoer who's only further provoked by the gentle reactions of a do-gooder, what shall we do to further harass it? Well, it looks like anything makes a good index (and I mean anything: if @a=(11,22), then $a["1"]==22 and $a["xxx"]==11). But perhaps some things don't make good arrays.

python -c 'x=5; print "2nd:",x[2]'
perl -e '$x=5; print "2nd: $x[2]\n"'

Output:

2nd:
Traceback (most recent call last):
  File "<string>", line 1, in <module>
TypeError: 'int' object is unsubscriptable
2nd:

Python gives us its familiar elaborate complaints, while Perl gives us its familiar laconic empty string. Its kindness and flexibility are such that in a numeric context, it would helpfully give us the number 0 – the empty string is what we get in a string context.

Is there any way to exceed the limits of Perl's patience? What about nonsensical operations – I dunno, concatenating a hash/dictionary/map/whatever you call it and a string?

python -c 'map={1:2,3:4}; print "map+abc:",map+"abc"'
perl -e '%map=(1,2,3,4); print "map+abc: ",%map . "abc\n"'

Output:

map+abc:
Traceback (most recent call last):
  File "<string>", line 1, in <module>
TypeError: unsupported operand type(s) for +: 'dict' and 'str'
map+abc: 2/8abc

Python doesn't like our operands, but Perl retains its non-judgmental frame of mind. A Perl detractor could point out that silently converting %map to "2/8" (the hash's used buckets/allocated buckets) is patently insane. A Perl aficionado could point out that Perl seems to be following Python's motto "Explicit is better than implicit" better than Python itself. In Python you can't tell the type of map at the point of usage. The Perl code clearly states it's a hash with %; moreover, . specifically means string concatenation (as opposed to +). So arguably you get what you asked for. Well, the one thing that is not debatable is that we still can't crash Perl.

OK, so Perl is happy with indexes which aren't and it is happy with arrays which aren't, and generally with variables of some type which aren't. What about variables that simply aren't?

python -c 'print "y:",y'
perl -e 'print "y: $y\n"'

Output:

y:
Traceback (most recent call last):
  File "<string>", line 1, in <module>
NameError: name 'y' is not defined
y:

NameError vs that hallmark of tolerance, the empty string, a helpful default value for a variable never defined.

By the way, I think this is how $x[5] manages to evaluate to an empty string when $x isn't an array: $x[5] is unrelated to the scalar variable $x – it looks for the array variable @x, which lives in another namespace. There's no @x, so you get an empty array, which has no 5th element, so you get "". I think I understand it all, except for one thing: is there any way at all to disturb the divine serenity of this particular programming language?

python -c 'a=0; print 1/a'
perl -e '$a=0; print 1/$a'

This finally manages to produce:

Traceback (most recent call last):
  File "<string>", line 1, in <module>
ZeroDivisionError: integer division or modulo by zero
Illegal division by zero at -e line 1.

The second message about "illegal" division by zero (is there any other kind?) comes from our no longer tolerant friend, making me wonder. What is so special about division by zero? Why not be consistent with one's generally calm demeanor and return something useful like "" or 0? It would be perfectly reasonable – I did it myself, or more accurately asked to have a hardware divider return zero in these cases, because there wasn't what hardware calls exception handling (having the processor jump to a handler in the middle of an instruction stream). We lived happily ever after, so what's wrong with 0?
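In Python terms, the policy I asked the hardware for amounts to something like this – a made-up helper of mine, not anything Perl or Python actually ships:

# safe_div is hypothetical -- a sketch of "return zero instead of complaining":
def safe_div(a, b):
    """Divide a by b, quietly returning 0 when b is 0."""
    return 0 if b == 0 else a / b

print(safe_div(1, 0))    # 0 -- no exception; the error is covered up
print(safe_div(4.0, 2))  # 2.0 -- ordinary division otherwise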

But the real question is, what explains the stunning difference between Perl's and Python's character? Is it philosophical, "There's More Than One Way To Do It (TMTOWTDI)" vs "There should be one – and preferably only one – obvious way to do it; Although that way may not be obvious at first unless you're Dutch" (actual Perl and Python mottos, respectively)? The latter approach encourages classifying program behavior as "erroneous", where the former tends instead to assume you're knowingly doing something clever in Yet Another Possible Way, right?

Modernism vs Postmodernism, maybe, as outlined by Perl's author in "Perl, the first postmodern computer language"? "Perl is like the perfect butler. Whatever you ask Perl to do, it says `Very good, sir,' or `Very good, madam.' ... Contrast that with the Modern idea of how a computer should behave. It's really rather patronizing: `I'm sorry Dave. I can't allow you to do that.'" The latter can be illustrated by Python's way of answering the wish of its many users to use braces rather than indentation for scoping:

>>> from __future__ import braces
SyntaxError: not a chance

So, "Very good, sir" vs "I can't allow you to do that". Makes sense with Python vs Perl, but what about, say, Lisp vs C++?

Lisp definitely has a "There's More Than One Way To Do It" motif in its culture. Look how many control flow facilities it has compared to Python – and on top of that people write flow macros, and generally "if you don't like the way built-in stuff works, you can fix it with macros", you know the drill. Guessing the user's intent in ambiguous situations? Perl says that it "uses the DWIM (that's "Do What I Mean") principle" for parsing, borrowing a term from Lisp environments. And yet:

(print (let ((x (make-array 3))) (aref x 5)))
*** - AREF: index 5 for #(NIL NIL NIL) is out of range
(print (let ((x 5)) (aref x 2)))
*** - AREF: argument 5 is not an array
(print (let ((m (make-hash-table))) (concatenate 'string m "def")))
*** - CONCATENATE: #S(HASH-TABLE :TEST FASTHASH-EQL) is not a SEQUENCE
(print y)
*** - EVAL: variable Y has no value
(print (/ 1 0))
*** - division by zero

5 out of 5, just like Python. Contrast that with C++ which definitely has a Bondage and Discipline culture, what with all the lengthy compiler error messages. Actually C++ would score 4 out of 5 on this test, but the test is a poor fit for statically typed languages. A more appropriate way to evaluate C++'s error handling approach would be to focus on errors only detectable at run time. The following message from The UNIX-HATERS Handbook records the reaction of a lisper upon his encounter with this approach:

Date: Mon, 8 Apr 91 11:29:56 PDT
From: Daniel Weise
To: UNIX-HATERS
Subject: From their cradle to our grave.

One reason why Unix programs are so fragile and unrobust is that C coders are trained from infancy to make them that way. For example, one of the first complete programs in Stroustrup's C++ book (the one after the "hello world" program, which, by the way, compiles into a 300K image), is a program that performs inch-to-centimeter and centimeter-to-inch conversion. The user indicates the unit of the input by appending "i" for inches and "c" for centimeters. Here is the outline of the program, written in true Unix and C style:

#include <stream.h>

main() {
  [declarations]
  cin >> x >> ch;
    ;; A design abortion.
    ;; This reads x, then reads ch.
  if (ch == 'i') [handle "i" case]
  else if (ch == 'c') [handle "c" case]
  else in = cm = 0;
    ;; That's right, don't report an error.
    ;; Just do something arbitrary.
[perform conversion] }

Thirteen pages later (page 31), an example is given that implements arrays with indexes that range from n to m, instead of the usual 0 to m. If the programmer gives an invalid index, the program just blithely returns the first element of the array. Unix brain death forever!

You could say that the sarcasm in the Lisp-style comments proudly intermixed with C++ code is uncalled for, since example programs are just that – example programs. As to the dreaded out-of-bounds array access cited in the second example, well, C++ doesn't handle that to avoid run time overhead.

But the cited program didn't just ignore the range problem the way C would – it went to the trouble of checking the index and then helpfully returned the 0th element the way Perl would. Probably as part of its illustration of how in C++ you could have custom array types which aren't like C arrays. But why Postmodern Perl arrays, in a generally Disciplined language?
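In Python terms, such a forgiving array would look something like this sketch (mine, not a translation of the book's code):

# Hypothetical sketch of the "forgiving" array described above: it checks the
# index, then quietly hands back the first element instead of reporting anything.
class ForgivingList(list):
    def __getitem__(self, i):
        if 0 <= i < len(self):
            return list.__getitem__(self, i)
        return list.__getitem__(self, 0)  # bad index? just blithely return element 0

a = ForgivingList([11, 22, 33])
print(a[1])    # 22
print(a[100])  # 11 -- no error report; Unix brain death forever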

Well, it was 1991 and C++ exceptions were very young, likely younger than the cited example programs. (They've since aged but haven't improved to the point where most library users would be happy to have to catch them, hence many library writers aren't throwing them.)

Likewise, Perl had all the features used in the examples above before it had exceptions – or more accurately before it had them under the spotlight, if I understand this correctly. (In Perl you handle exceptions by putting code in an eval { ... } block and then calls to the die() function jump to the end of that block, saving die's argument to $@ – instead of, well, dying. I think Perl had this relatively early; however it seems to only have become idiomatic after Perl 5's OO support and an option to send exception objects to $@, with people using die strictly for dying prior to that.) Perhaps Perl's helpful interpretations of obvious nonsense like $a["xxx"] aren't that helpful after all, but what would you rather have it do – die()?

AFAIK Python had exceptions under the spotlight from the beginning – although similarly to Perl it had exception strings before it had exception classes. And in fact it does its best to adhere to its "Errors should never pass silently" philosophy, the few deviations that come to mind having to do with Boolean contexts – the falsehood of empty strings, None and 0 together with None!=False/0==False/1==True/2!=True and similar gateways to pornographic programming.
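For the record, these are the comparisons in question – plain Python, nothing hypothetical:

print(bool("") or bool(None) or bool(0))  # False -- all three are falsy
print(None == False)                      # False -- None isn't False...
print(0 == False)                         # True  -- ...but 0 is
print(1 == True)                          # True
print(2 == True)                          # False -- truthy, yet not equal to True
print(2 != True)                          # True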

Lisp has conditions and restarts which blow exceptions out of the water, hence its willingness to report errors isn't surprising. However, it gained these features in the 80s; what did previous dialects do? ERRORSET, which is similar to a try/catch block, appears to predate the current error handling system, but it doesn't seem to have been there from the very beginning, either. I'm not familiar with the Lisp fossil record, but there's a function for indexing lists called NTH which returns NIL given an out-of-bounds index. Lists definitely predate arrays, so I assume NTH likewise predates AREF which complains given a bad index. Perhaps NTH doesn't complain about bad indexes since it also predates ERRORSET and any other form of exception handling?

The pattern seems to be: if the language has exceptions, most of its features and libraries handle errors. If it doesn't, they don't; errors are covered up.

(Although I won't be surprised if I'm wrong about the Lisp history part because Lisp is generally much more thoughtful than I am – just look at all the trouble Interlisp, by now an ancient dialect, went to in order to figure out whether a user wants to get an opportunity to fix an error manually or would rather have the program silently return to the topmost ERRORSET.)

awk and early PHP lack exceptions and are happy with out-of-bounds array access. Java and Ruby have exceptions and you'll get one upon such access. It isn't just the culture. Or is it? Perl is PHP's father and awk is PHP's grandfather. *sh and make, which, like Perl, produce an empty string from $nosuchvar, aren't good examples, either – sh is Perl's mother and make is Perl's sister. Is it really the Unix lineage that is at fault, as suggested by a dated message to a late mailing list?

Here's JavaScript, an offspring of Lisp:

>>> [1,2,3][5]+5
NaN
>>> [1,2,3][5]+"abc"
"undefinedabc"

I think this definitely rivals Perl. Apparently it's not the lineage that is the problem – and JavaScript didn't have exceptions during its first 2 years.

The thing is, errors are exceedingly gnarly to handle without exceptions. Unless you know what to do at the point where the error is detected, and you almost never know what to do at the point where the error is detected, you need to propagate a description of the error up the call chain. The higher a function sits up the call chain, the more kinds of errors it will have to propagate upwards.

(With a C++ background it can look like the big problem doesn't come from function calls but from operators and expressions which syntactically have nowhere to send their complaints. But a dynamic language could have those expressions evaluate to a special Error object just as easily as it can produce "" or "undefined". What this wouldn't solve is the need to clutter control flow somewhere down the road when deciding which control path to take or what should go to files or windows, now that every variable can have the value Error tainting all computations in its path.)
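To make that parenthetical concrete, here's a sketch of such an Error object in Python – entirely hypothetical, no language ships this exact thing:

# A taint-style Error value: operations on it yield it back instead of raising.
class Error(object):
    def __init__(self, what):
        self.what = what
    def __add__(self, other):
        return self                    # any operation on an Error stays an Error
    __radd__ = __sub__ = __mul__ = __getitem__ = __add__
    def __repr__(self):
        return "<Error: %s>" % self.what

y = Error("no such variable 'y'")
print(y + 5)           # <Error: no such variable 'y'>
print((y + 5)[2] * 3)  # still the same Error; nothing complains yet --
                       # the clutter returns when control flow or I/O must decide what to do with it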

Different errors carry different meta-data with them – propagating error codes alone ought to be punishable by death. What good is a "no such file" error if I don't know which file it is? What good is a "network layer malfunction" error if I can't tell that in fact it's a "no such file" error at the network layer because a higher level lumps all error codes from a lower level into one code? (It always lumps them together since the layers have different lifecycles and the higher level can't be bothered to update its error code list every time the lower level does.)

Different meta-data means, importantly for static languages, different types of output objects from callee functions depending on the error, which can only be handled in an ugly fashion. But even if there's no such problem, you have to test every function call for errors. If a function didn't originally return error information but now does, you have to change all points of call.

Nobody ever does it.
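To see what "doing it" would entail, here's a sketch of the chore – hypothetical functions and error codes, not any particular codebase:

# Life without exceptions: every call returns an (error, value) pair, every
# caller must check it, and a layer that lumps lower-level codes together
# loses the "which file?" metadata along the way.
NO_SUCH_FILE = 1
NETWORK_LAYER_MALFUNCTION = 2

def read_config(path):
    return NO_SUCH_FILE, None          # pretend the file is missing; the path travels no further

def fetch_over_network(url):
    err, cfg = read_config("/etc/fetcher.conf")
    if err:                            # the check you have to keep writing...
        return NETWORK_LAYER_MALFUNCTION, None   # ..."no such file" is lumped away here
    return 0, "data from " + url

def main():
    err, data = fetch_over_network("http://example.com/x")
    if err:
        print("error %d (which file? no idea)" % err)

main()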

The fact that apparently every programming language without exception handling covers up errors at the language level sets, to me, a kind of upper bound.

A designer of a successful programming language, whatever you think of that language, is definitely an above-average programmer – in fact, I think one that can safely be assumed to occupy the top tenth no matter how you rate programming ability (and really, how do you?). Moreover, the language designer is also in the top tenth if programmers are sorted by the importance they assign to their work and their motivation to get things right – because the problems are fundamental, because of the ego trip, because of everything.

And still, once decision or chance leads them into a situation where the language has no exceptions, they allow themselves to have errors covered up throughout their cherished system. Despite the obvious pain it causes them, as is evident from the need for rationalizations ranging from runtime overhead (as if you couldn't have an explicit unsafe subset for that – see C#) to the Postmodern Butler argument ("Fuck you plenty? Very good, sir!").

What does this leave us to expect from average programs? The ones written in the order of execution, top to bottom, with the same intellectual effort that goes into driving from point A to point B? The ones not exposed to the kind of massive testing/review that could exterminate bugs in a flophouse?

This is why nothing inspires my trust in a program like a tendency to occasionally wet its pants with an error call stack. Leaving some exceptions uncaught proves the programmers are somewhat sloppy but I wouldn't guess otherwise anyway. What's more important is that because of exceptions along the road, the program most likely won't make it to the code where shooting me in the foot is implemented. And if a program never spills its guts this way, especially if it's the kind of program with lots of I/O going on, I'm certain that under the hood, it's busy silencing desperate screams: where there are no exceptions, there must be cover-up.

1. Entity – Oct 3, 2009

Ahh yes, but you're forgetting a major part of your essay.

The other side of the coin in the argument would be that the problem with exception handling is that it allows the exception to be caught and the program to still continue its death cycle to hell.

Either way, it's an argument not of exception handling vs function error return codes. It's just about how to propagate errors through the different layers of the program.

No matter; in essence, when you design a program there is an infinite number of ways for the program to crash. Exception handling does not handle this very well because it still allows you to catch and continue when the program state can be inconsistent or unknown.

The solution is to design the program so that if the state of the program becomes unknown then it quits and prints out a call stack and memory dump.

Though this may seem like a harsh reality of the situation, you will find that exception handling is like candy: it sucks people into a mindset of "don't worry, somebody else will handle this problem, everything is alright". The problem with this mindset, as in the examples you've given above, is that the more exception code you write, the more code you're going to write to catch the unknown thrown exceptions. Keep adding more code until infinity.

Right now you would be screaming "what, crash the program and print a stack trace to the user?" No silly, that would be stupid. In most critical systems you will have 3-4 different machines doing the same calculation. If one becomes invalidated then that machine will restart itself and rejoin while the others take over. In consumer electronics you can see exactly the same design philosophies taken by different teams.

For example, my mobile phone by Nokia will freeze and require a complete reboot. Another phone by Nokia, by a different design team, would suddenly reboot itself on me while in use. Either way, I'd rather my mobile phone suddenly restart on me than find, while I'm driving, that it's locked up on an international phone call that cost me $500 because it didn't hang up.

It's not how you design your layers and error handling in your program that matters. What really matters is how fast you can detect invalid states in your program and find and fix bugs. Because whether you use exception handling or return codes, if you cannot identify invalid states in your program and go through major iteration life cycles of your project, it will never make it to market.

2. Andrew – Oct 3, 2009

It's important to mention that much of Perl's "cover-up" is optional, and Perl can be made more strict with the "use strict" and "use warnings" directives.

3. Yossi Kreinin – Oct 3, 2009

@Andrew: agreed, I probably should have mentioned it; I didn't because I didn't want to discuss how it works with older code and how various errors would be handled under these settings and how things evolved around that – I just don't know much about it. I think the dynamics are similar to (my impression of) the dynamics of error handling in Lisp, with older functions returning "something" given "weird" arguments and newer functions producing an error; but Perl's approach to compatibility is different here – old features are in fact given new, incompatible semantics but only when explicitly specified by the user.

This raises the general question – if you started out without exceptions and without much "strictness" ("without exceptions and therefore without much strictness" according to my claim), what are your chances to retrofit more strictness later, overcoming compatibility issues? What raises/lowers the chances to succeed, and what is the extent of success likely to be (will errors be no longer tolerated or merely logged? which ones will be handled which way? and which error cover-ups will remain intact, classified as "features"?) I don't think I know how to approximate an answer.

And while we're at it – am I right about the Perl exception handling history part?

@Entity: regarding the need to find bugs – much easier to do when array[badindex] crashes immediately than when it corrupts memory and moves on. Regarding ways to handle exceptions – depends on the system, but you usually end up having a few places where exceptions are handled and live happily ever after. For example, many embedded systems can simply be restarted, perhaps saving some of the state first; what's important though is for them to restart when a bug surfaces as opposed to corrupting data like the majority of them that are written in C(++) most probably will. At other times there are clear boundaries between subsystems limiting the amount of damage – the error is localized in a plugin/a subprocess/etc.; log the error and reload/restart/etc.

At the end there are two truly ugly scenarios: pressing forward despite the error and "killing everything" upon it. Exceptions avoid both.

4. Entity – Oct 3, 2009

Ahh yes, so you would be in the camp of "don't press forward, stop the program and restart".

Though the point I guess I didn't communicate clearly was that exception handling does not avoid both; in fact it's actually worse.

The reason for it being worse is that by looking at any bit of code you cannot clearly identify what part of the code will throw an exception. This is the nature of C++. The next problem with exceptions, in a language like C++ or in any language for that matter, is that they're an invisible goto.

I agree with your general viewpoint on exceptions and how great they are, but my viewpoint is that I've seen them used for worse more than for better. I've seen try, catch (everything) throughout Java programs and that's my main concern. The problem is that how people should use exceptions and how engineers actually use them are two different things.

5. Yossi Kreinin – Oct 3, 2009

C++ makes the finalization problem harder than it is elsewhere, so it's the worst example.

I wouldn't be "in the restarting camp" – it depends on the program. What doesn't is that pressing forward is awful.

6. Paul Eipper – Oct 3, 2009

Sometimes I wish for something like:

function_do_stuff(foo, bar) on error(handler_object)

where handler_object has the necessary callback for the exception thrown by function_do_stuff (or die() ;)

I just don't want all that error managing code mixed with the process logic.

7. arakyd – Oct 7, 2009

As this post so entertainingly points out, there are many ways to handle (apparent) errors that are even dumber than exceptions. Still, as a programmer who strives to be better than the average schlubs that Yossi has to clean up after, I agree 100% with Entity: exceptions are still a code smell. A function is just an interface; if there is no file, just return the appropriate value for that result, don't call it an error and invent an entirely new control structure just so I can throw it all over the place. I knew there might not be a file when I called you, so stop being so dramatic about it. Same with division by zero, or overflow, or out of memory, or whatever. It's all the same.

Exceptions are concessions to the fact that most programmers are incapable of dealing with the environments in which their functions operate, incapable of designing their programs in terms of small interfaces, and incapable of documenting those interfaces. It's not all their fault. They are usually forced to call all sorts of library functions or whatever which also have huge, poorly documented interfaces, and of course the language itself is typically a huge interface, although hopefully it is at least well documented and easy to interact with. Object orientation is especially good at increasing the sheer mass of an interface at exponential rates. Then there is usually an operating system, and so the horror compounds itself.

Another problem is that everyone wants to handle the common case first and not think about the edge cases until they have to. This is a stupid approach and should be beaten out of every CS student as soon as possible, but of course it is not. Everyone is too busy mastering a bunch of overcomplicated interfaces. The upside is that we get entertaining posts from the people who have to clean up the messes.

There is a pattern here, and it is that you get huge interfaces when you cross personal and group boundaries. This is the corollary to Conway's Law (system architecture follows social organization): the more people involved, the more huge interfaces you get. The less you know about your customer, the more you put in your interface and the more you have to carefully check every single edge case – which are now multiplying like rabbits – because your mystery customers want to do everything and they will do everything.

I am writing a program for end users at the moment, and a major goal is to make the user interface simple so that I do not have to write mountains of code asking the user if they really meant to do this instead of that and inevitably missing things which will either crash the program or require blind defensive programming in the form of elaborate error propagation schemes. I have figured out how to eliminate most of the libraries I was using without writing much more code myself (actually less, in some cases). I throw no exceptions, eliminate the need to check as many of the exceptions thrown by the libraries I do use as possible, and catch the rest with a pleasant message, state dump, and program close; no known bugs which cause this are allowed in release builds.

It takes more time to do things this way, but the end result is much more stable and pleasant for everyone. Even if it turns out not to be what the user wanted, the program is now small enough that it can be rewritten much faster than would have been possible if I had bloated it out with extraneous libraries and error handling. I do not have a sense of attachment to hundreds of thousands of lines of code that I had to sweat over and key in and test, because there aren't hundreds of thousands of lines of code.

The problem, of course, is that to scale you inevitably have to cross a lot of these sorts of boundaries, but even then the way to scale is to minimize the complexity of the interface. Unless you are Microsoft, or Larry Wall, or whatever, and get a lot of money or personal recognition from the fact that you have millions of programmers wandering around in your maze like interfaces, plaintively bleating and crapping themselves while others follow along behind with shovels and write amusing blog posts (or snarky comments).

8. Yossi Kreinin – Oct 8, 2009

"Another problem is that everyone wants to handle the common case first and not think about the edge cases until they have to. This is a stupid approach and should be beaten out of every CS student as soon as possible, but of course it is not."

Even if a CS department succeeded in thusly brainwashing their students, economics would soon beat the supposedly stupid approach right back into them.

Now in a non(or less)-snarky mode: I believe that you are sufficiently more optimistic than me regarding the ability of people to write (exceptionally) good code in reasonable time frames, hence your lower tolerance for the admittedly abundant stupidity in interfaces, team interactions, operating systems, etc.

In the world that I think I live in, exceptions are a very good way to significantly raise the quality and lower the damage of a mean/median piece of code. In the world that you think you live in (more specifically in the world that I think that you think you live in...) that simply isn't much of a virtue, as an ability to improve the nose-picking skills of the population isn't much of a virtue – the real goal should be to have people stop picking their noses in public, not having them get better at it.

But I believe (not really, just to go on with the example) that people will always, or at least in the foreseeable future, pick their noses in public, so if there's a way to at least have them then clean their fingers without using someone else's clothes, this way should be pursued; back from the analogy, without exceptions people won't, in my world, think about more edge cases but will simply cover up errors – and lie to themselves about it if that's what it takes to get their job (look like) done on time. But to you, once they stop cleaning their fingers with others' clothes, there will be a downside of lowering the pressure to stop nose-picking altogether, so it's not very good to go that way.

I think exceptions are an economical way to write passable software, you think they lead astray from the path to exceptional software. I think the core point is that we have different estimations of the possible and the likely.

9. Entity – Oct 8, 2009

I still think you're trying to solve the problem from a technical/engineering standpoint. You want an error handling mechanism or technique that can be used by your average developer so that they can produce maybe good or better code without having to teach them a whole bunch of new 'stuff'.

The problem is it's not a technical problem, it's a problem between people and how well you or they can communicate the message about your interface to each other. It's more of a cultural code. If everyone is friendly and TRUSTS each other and sets a guideline for how and where you will propagate errors through the layers, either with 'exceptions' or 'return codes', then that is going to produce a better product at the end of the day.

Though you may say that limiting developers to either exceptions or return codes in their software may result in less clever ways to do things, the truth is the reverse. You typically find people will get more creative with a narrow set of well-defined tools that behave exactly how they expect and that they can trust.

The other big question you have not talked about, even with exception handling, is the nightmare that is maintaining all the thrown exceptions around the place. Because let's say a new developer adds a new exception, let's call it "TimeOut". The program compiles, and it gets shipped to the customer. The customer reports that the program just suddenly stops working with a stack dump of "exception not handled: timeout". You go in and find out that this exception is thrown but needs to be caught and dealt with in all the different layers it's thrown through. An exception is caught in this layer and then re-thrown as a new exception in another layer!

What I'm saying is, if developers cannot separate their layers correctly with simple return codes and maintain them, what hope do they have following no design principles with exception handling other than "throw it, you will be okay"?

You cannot take an average programmer and economically turn this programmer into a better one by using a different technology, and that's not even talking about the social-political-engineering boundaries that you have to cross. Though you can take any C++ programmers and dump them into Visual Basic or C# and watch them fly, because C++ wastes so much brain power it's not funny.

If you want an economical way of doing what you're highlighting, then you should 'fire' the programmer and hire a new one that has the same ideas, practices and focus on writing good software as yourself. Considering that, if you're not a person in a high position you're going to have to work on team spirit or bring the two teams together. Otherwise, stay the course and get burnt by the very complex interfaces that emerge and the slipped product dates – or find a new job where the team is focused on the same goals as you.

It's always a people problem.

10. Yossi Kreinin – Oct 9, 2009

That something isn't a strictly technical problem but rather a "people problem" happening in a technical context doesn't make the problem solvable by strictly non-technical measures, nor does it mean that changing the technical details of an environment won't affect the situation for better or worse. You may be the CEO or the President of the World and still you won't be able to fire all "bad" programmers and hire only "good" ones and make them trust each other in all caps and write perfect code.

To me, realizing that some problems just aren't technical at their core but rather psychological goes together with realizing that some human behaviors simply can't be changed so all that can be done is adjusting the environment such that their damage is minimized. It sounds like two opposite things but it's actually the same thing: tech stuff is much easier to tweak than human attitudes which should be accepted as a given much more frequently; it is he who thinks that everything is a solvable technical problem who will rant most vitriolically about the narrow focus and lack of education of his fellow humans which supposedly prevent the always possible technical solution.

11. arakyd – Oct 11, 2009

Wrt handling the common case first, I think this is a matter of scale also. The larger the interface, the more you want to (and have to) focus on the common case. So, yes, in the average case you do have to start with the average case.

You are also right, Yossi, that it is always much easier to change the technology than to change the people. Exceptions or whatever are not a (locally) optimal solution to the technical problem, they are a (locally) optimal solution to the technical + hiring problem. I think the result is that better programmers who don't mind taking on risk in exchange for more satisfying work environments go off and find small teams of other good programmers who have found a problem that can be solved better by scaling technical proficiency than by scaling number of people and lines of code. In other words, the smarter nomads leave the organization and start their own organizations. It doesn't always work out obviously, but I don't know of anyone with good data on success rates. What's more clear is that it makes the hiring problem more difficult, and therefore the average quality of average code at the average large organization that much worse.

12. Yossi Kreinin – Oct 11, 2009

The trouble with running your own organization is (again) that it's not just about having a team that is technically strong. In fact in many cases deciding to do so more or less guarantees that you'll no longer do technical work yourself.

Also, some things just can't be done without quite sizable teams (and quite some funding), which is why most embedded development is done by companies ranging between medium and extra large, and why I'll think very carefully before choosing another gig in this hefty domain.

13. Kragen Javier Sitaker – Nov 8, 2009

In Wheat, we did in fact return an error object (there was an error-object constructor spelled "!!!") instead of raising exceptions, specifically with the idea of avoiding control-flow complication and thus making cleanup code reliable. Every time you did something to an error object, such as call a method on it or return it from a method, it would add some trace information.

(It also led to a nice error-handling operator: a !! b evaluates and returns b iff a evaluates to an error, so you could write foo.bar[baz].quux(3) !! walrus and get the walrus if foo didn't have a bar method, if indexing into it with baz failed, if the result didn't have a quux method, or if quux failed when given 3.)

At first this led to the problem you would expect: eventually most error objects would propagate up to an expression statement, at which point the value would be ignored and the error would be silently covered up. So we changed the semantics so that a method that encountered an error object as the value of an expression statement would continue on its way as normal – but would eventually return that error object as its return value, instead of whatever value was supposedly being returned.

This worked reasonably well in the small amount of code we wrote in Wheat.

I don't remember how we handled the case where someone tries to use an error object to direct their control flow, e.g. "if (!!!) ..." .

14. Yossi Kreinin – Nov 8, 2009

Is this the Wheat you're talking about?

It's interesting how you had error objects – I always wondered why on Earth people put up with contextless error codes. It's a bit frightening that code falls through errors like that though; I try to imagine how all the side effects in the trailer of the function which eventually returns an error object work out considering that something upwards failed to do what it was supposed to do.

15. Kragen Javier Sitaker – Nov 11, 2009

Yup, that's the one.

Every once in a while those side effects would end up overwriting some important things with error objects – whatever important things they were trying to modify based on the failed computation, of course. Initially Mark made it impossible to store error objects in data structures (only local variables and return values) but in a lot of cases *not* overwriting was even worse, and in any case you often wanted to preserve the error for inspection a little later.

So I guess you're not a fan of this "Go" thing?

16. Yossi Kreinin – Nov 12, 2009

I guess I'm not, at least not of their attitude towards exceptions. It's nice to think of a programming language as a "systems programming language" which is used to program "systems" which should handle errors in well thought-out ways, however to me it's a lot like saying that we shouldn't have prisons because it is better to live in a society where people obey laws regardless of the possibility of punishment – most of us don't live in such a society, and most of us aren't working on "systems"; there is crime in the streets and stupidity in the code whether you think there should be or not.

Also, "goroutines" creep me out, as does the tagline "don't communicate by sharing memory, share memory by communicating" ("don't ask what shared memory can do for you, ask what you can do for shared memory", as my roommate said when he read that).

It's nice that people behind C and Unix finally acknowledge the utility of garbage collection, and that a large C++ shop, and a favorite of the language creator's, acknowledges the problem with C++ build times; as to the rest of Go, I'll have to look into it.

17. Kragen Javier Sitaker – Nov 12, 2009

I think "systems" means "software that connects programs together". Quoting Rob Pike from 2000 in http://www.eng.uwaterloo.ca/~ejones/writing/systemsresearch.html, "Systems: Operating systems, networking, languages; the things that connect programs together. ... Web caches, web servers, file systems, network packet delays,
all that stuff. Performance, peripherals, and applications, but not kernels or even user-level applications....The focus is on applications and devices, not on infrastructure
and architecture, the domain of systems research."

So basically these guys were complaining that, for writing that kind of stuff (and Google has a huge quantity of it, and a lot of people working on it; it's where their competitive advantage comes from) they wanted a better language. I don't think they really care whether it's good for average programmers. If they cared about that, they would have come out with Modula-2 or something instead of C, right?

I prefer exceptions, myself.

I concur that the neologism "goroutines" is annoying. But I think my low tolerance for neologisms may be a personal flaw.

I thought the tagline was pretty insightful, really; message-passing and monitors are isomorphic and dual to one another (there's a paper from the 1970s about this, but unfortunately I can't remember even the terminology they used, let alone the title) but message-passing seems to me to make it possible for concurrency to make programs *simpler* instead of *more complicated*, even aside from possible efficiency or responsiveness gains. I've never seen that with shared memory. Maybe Newsqueak is the best example of this, although Python generators (and especially generator expressions) and Unix pipes may be good examples too.

I think it's kind of a shame that they're including unrestricted shared memory. It means that you can't, for example, asynchronously kill a thread safely, the way you can in Erlang.

About GC, I think they've been sold on GC for some uses for a while. Winterbottom was in that group, and his 1994 paper on Acid revealed that it used a mark-and-sweep GC. It's just a question of garbage collectors improving over the years to be applicable in more circumstances.

18. Oktalist – Mar 25, 2010

Here's where Python falls down for me...

#Python:
receipt = foo #declaration is first assignment, why?
#...snip...
reciept = bar #typo not caught, nasty bug

#Perl:
use strict;
my $receipt; #mandatory explicit variable declarations are good
$receipt = $foo;
#...snip...
$reciept = $bar; #compile-time error, unrecognised symbol

(in both cases assume foo and bar are already known names/symbols)

Admittedly the error Perl emits due to the typo is somewhat cryptic ("Global symbol requires explicit package name" when the symbol you meant to use was lexically-scoped (that being the kind of symbol declared by the "my" keyword) and hence needs no package name), but Perl is not the subject of this comment, Python is.

How do you mitigate this problem of Python's? "Name used only once" warnings would help, but what about object attributes? Only access them via accessor methods, I suppose (even from within its other methods). Inadequate workarounds to a problem that has no reason for being, IMO.

Disclaimer: I have avoided Python like the plague for exactly this reason, so if you reply with a decent solution then I will look like an idiot. But I will also be very pleased because it will mean I can finally use Python :)

But if I am right and the problem is not adequately solved and if I'm not misunderstanding Python's declaration semantics then I don't understand why more people don't find declaration-is-first-assignment such an offensive language "feature" trading in safety/correctness for a little brevity. I can't find many such people through Google, nor many discussions of this problem/feature. I did find one person claiming it encourages choosing more descriptive identifier names (yes, really), but I would say that it encourages exactly the opposite.

BTW I just found your blog via C++FQA while looking for C++ refactoring tools (yeah, I know ;) and it is *awesome* :) You have rescued one coder from the cargo cult, at least.

19. Yossi Kreinin – Mar 26, 2010

PyChecker seems to detect the name errors you don't want to hit at run time (heuristically, I guess, since you could stuff things into globals() at run time and have PyChecker erroneously report a problem). Anyway, I never used PyChecker, mostly because of being lazy and reluctant to set up an environment different from everyone's default, so I don't really know what's up with it. I generally agree that Python delays too many name errors to too late a stage (at least it then generates an error as opposed to the helpful default of make or non-strict Perl), and it's fairly annoying, however it's hardly a reason to avoid a language like the plague for me because this is an error you find the first time your tests cover the offending line, so it's not like it lets you plant this horrendous time bomb to explode at a mysterious unknown point in the future or something. Generally C++ has set my standard for programming languages very, very low; of all languages mentioned above, I'd probably avoid none except for make.

20. Oktalist – Apr 17, 2010

I am frequently frustrated by the standard get-out answer "tests will find the bugs", as tests can only tell you that *something* is wrong with a method, not which particular something that is.

It's not just Python; it seems to afflict most all scripting languages and is commonly mistakenly seen as a necessary implication of dynamic typing (with the result that dynamic typing and explicit declarations are perceived as being mutually exclusive) and therefore part of "the philosophy".

Ruby is a beautiful language, but I still have reservations due to the same issue. It also lacks such a mature tool as PyChecker, and even if such a tool existed, Ruby's open classes mean that something that looks like a variable access actually might be a method invocation and this ambiguity may only be resolved at runtime.

NOTE: I am continuing this discussion as I value your input on the matter and I wish to explore it in more depth and more generality in order to expand my knowledge surrounding it and solidify my opinions concerning it prior to ranting about it in my own blog in the hope that language designers will take heed. Just so you know I'm not arguing just for the sake of it.

As soon as a language designer does away with typed variables, they instantly think they can do away with explicit variable declarations altogether, apparently without thinking about what other facilities are provided by such declarations besides specifying type. The perceived benefit of this is to cut down on a few characters of code for each variable.

I've already discussed one class of mistakes that declarations guard against (or two classes, depending on how you look at it), typos in variable assignments and accesses. Also, when you get into the habit of writing a declaration upon introducing a new variable, the compiler can tell you when you are redeclaring (shadowing) a pre-existing identifier. Finally, declaration makes explicit a variable's scope, without which languages tend to end up with scoping rules that feel rather primitive to someone used to the C-like block scoping syntax of, say, Perl or Java (notwithstanding closures, which can be done with or without explicit declarations).

Forcing the programmer to declare their variables just makes code easier to follow, understand and reason about, I feel.

When Perl 5 introduced the strict pragma, people thought it would be too unwieldy. Now it is the accepted wisdom among the Perl community that it should be on by default (and in Perl 6 it is).

Even Visual Basic has option strict, and that's not even a truly dynamically typed language (variables not explicitly declared are assumed to be of type Variant). As soon as a language introduces such an option, it very quickly becomes accepted wisdom to have it turned on all the time; that should tell you that it pays off.

It is the one feature of Perl that I miss when emigrating to Python or Ruby, and the one feature that means that for me, Perl 5 remains superior to those languages, even in the face of its abhorrent @_ and inconsistent OO.

The following Python and Ruby idioms allow one to assert that a method uses only a specific set of local variable names, but they're still only a runtime check and they still don't address stone-age scoping rules.

import sys
# inside the method: complain about any local name outside the declared set
for v in sys._getframe().f_code.co_varnames:
    if v not in ("declare", "local", "var", "names"):
        raise NameError("'%s' not declared" % v)

local_variables.each do |v|
  unless [:declare, :local, :var, :names].include?(v)
    raise NameError.new("'#{v}' not declared")
  end
end

21. Yossi Kreinin – Apr 28, 2010

Well, "tests will find the bugs" may not be a very wise attitude, but in this particular case you only need to run the line once to find the problem so counting on that to happen is much less optimistic than "counting on testing" in general. As to the benefits of declarations unrelated to typing, I probably agree, though I don't think of it as a killer feature.


