I was just going to put a comment on his blog, but Julian has suggested something bordering on heresy, so I figure it’s appropriate for subject matter here. In addition, this is probably going to take a while, and I think it’s going to go beyond tackling Julian’s assertion.
He’s suggesting case preserving, case insensitivity is appropriate for programming languages. Case insensitive, case preserving in this case means don’t change the case of identifiers, but match all identifiers regardless of case (eg: “Blah” is still displayed as “Blah”, but is the same as “bLah”, which is still displayed as “bLah”). You’d have to read his blog to get the entire argument, but I’ll display (hopefully) the critical bits here. He first makes some arguments against case-sensitivity (or rather, some reasons against the benefits of case sensitivity) but he does not mention the strongest reasons for case sensitivity. I’ll give my reasons later, but first, I’ll address the case for case insensitivity.
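To pin down what "case preserving, case insensitive" would mean in practice, here is a minimal sketch in Python. The class `CaseInsensitiveScope` and its methods are made up for illustration; they are not from Julian's post or from any real compiler. The sketch assumes identifiers are compared after case folding, while the spelling used at declaration is remembered for display.

```python
class CaseInsensitiveScope:
    """Hypothetical symbol table: lookups ignore case, spellings are kept."""

    def __init__(self):
        # folded name -> (spelling used at declaration, value)
        self._entries = {}

    def declare(self, name, value):
        key = name.casefold()
        if key in self._entries:
            declared, _ = self._entries[key]
            raise NameError(f"{name!r} clashes with existing {declared!r}")
        self._entries[key] = (name, value)

    def lookup(self, name):
        # Any capitalisation finds the same entry.
        return self._entries[name.casefold()]


scope = CaseInsensitiveScope()
scope.declare("Blah", 1)
print(scope.lookup("bLah"))  # ('Blah', 1)
```

Nothing in the table rewrites the source text, so "bLah" stays "bLah" wherever it was typed. The cost shows up in `declare`: under this scheme, "Blah" and "BLAH" can never be two different things.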
Taking apart the argument
The basic gist of his point is that KEANU REEVES is understood as being the same as KeAnu ReEvEs by humans, so it should be understood as being the same by computers. However, humans assign significance to capitalisation, so capitalisation should be preferred. This makes the programming language more human. However, there are certain cases in human language where capitalisation or context is the discriminator, and computers don’t always have context. Let’s say Keanu’s first name was ‘Boing’ (“Mr Boing Reeves”). In this case, ‘Boing’ would clearly be different to ‘boing’. While in English this kind of case does not crop up often, in programming this is untrue. While Julian does mention the class Foo foo("bar"); case, he claims that the practice is indefensible. I think it’s perfectly fine. Say ‘Foo’ is a singleton: it’s often acceptable to call the instance ‘foo’. Even for classes which tend to have only a single instance in a program, I think it’s perfectly acceptable to call the object by the class name. In fact, we often do this in English (“the clock” to refer to an unspecified “Clock”, or “bike” for a particular “Bike”, whereas multiple clocks will be given some more specific names).
The worst thing about his argument is the rule of engineering: Consistency is good. Forced consistency is good. This is why we have coding conventions, checkstyle, indent. Any sane coding convention is going to force consistent capitalisation anyway. Worse, because the language allows inconsistent naming, and naming is difficult to check for a formatter or something like checkstyle, warnings can only come from the compiler. Should the compiler show a warning that inconsistent naming has been used? If so, what’s wrong with the same warning in a case-sensitive compiler?
The real reason for his argument comes from something I saw when he was setting up his framework. As soon as he mentioned case, I thought “he’s been using PHP, and is pissed because you don’t have to declare variables and messing up the case will leave a nasty bug with no warnings”. I know that sounds like a long shot, but it’s happened to me, and I went through the same thing, so I just went “if it were me, this is where I’d be going”. The place where he leaves his argument open is where he says:
The two most common capitalisation errors I make are: HOlding DOwn THe SHift KEy TOo LOng, and being inconsistent in CamelCasing the term “fileName” (I never did resolve satisfactorily whether it was one word or two!)
It’s unforgivable that you could have a spelling mistake in your code “KEanu” and it still runs correctly, giving no warnings. There’s no actual advantage to being able to have improper spelling (or is it grammar?). In addition, someone else reading the code could actually wonder if you meant different things when you said “filename” and “fileName”. The solution is declaring variables or having warnings for inconsistent case.
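The "warnings for inconsistent case" option is cheap to sketch. The Python below is a rough illustration only: the function name `inconsistent_case` is invented here, and the regex treats every word-like token as an identifier (keywords, string contents and comments included), which a real compiler or linter would not do.

```python
import re
from collections import defaultdict


def inconsistent_case(source: str) -> dict[str, list[str]]:
    """Group identifiers that differ only by capitalisation."""
    spellings = defaultdict(set)
    for name in re.findall(r"[A-Za-z_][A-Za-z0-9_]*", source):
        spellings[name.casefold()].add(name)
    # Keep only the names that were spelt more than one way.
    return {key: sorted(seen) for key, seen in spellings.items() if len(seen) > 1}


code = "fileName = 'a.txt'\nprint(filename)\n"
print(inconsistent_case(code))  # {'filename': ['fileName', 'filename']}
```

The same check works whether the language is case-sensitive or not. The only difference is whether the program behind the warning runs correctly or blows up.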
In case no one believes that someone might think “filename” and “fileName” were different, I’ll give you a story from uni. At uni we’d often get answers to questions which were wrong. People with a clue often figured out that the answer had a silly mistake and would continue nonetheless. People with less of a clue just got plain confused until year 3 or 4 where they realised that they saw this stuff so often that there must be mistakes in the answers. People with little or no clue would construct alternate abstract mathematical universes where the answers would somehow become correct. It was really quite scary to see them solve problems sometimes. In the same way, if code that looks funny executes correctly, we’re going to see people with strange voodoo consistency which they won’t play with. This is most definitely not good.
The correct solution is declaring variables. I was always undecided about the topic of declaring variables. I thought there was no need, and no point. I thought it was just there to make it easy for the compiler. I thought anime was lame when the characters declared their attacks. Then I saw Martian Successor Nadesico, and now I know that you declare attacks for more than just allowing the audience to know what you’re doing. You do it for style, and you do it because it’s what you believe in. It’s the same with variables. It’s not just for the compiler, it’s for style, and it’s what you want the variable to be…
gekigan punch;
gekigan flare;
Why case sensitivity is good
Julian mentions some lame reasons for why case sensitivity is good, and then takes them down like burnt effigies. The only thing I can salvage (other than the “Foo foo” thing) is how he mentions that the difference between A and a is minuscule. If you had a variable named ‘a’ and another named ‘A’, you would think of them as different. ‘a’ sounds like a scalar, or a vector, whereas ‘A’ sounds like a matrix. ‘Ax’ is “intuitively” a matrix multiplication. Surely case here is more important than the actual identifier used. ‘Ax’ or ‘By’ is still just a matrix multiplier.
I’ve already made the point of forced consistency. I can kind of extend this point by saying that you can be sure that a particular capitalisation has the connotations you attach to it. This has already been mentioned in one of the comments to Julian’s blog entry, but it’s what one good turn deserves. THIS_IDENTIFIER is clearly a constant, ThisIdentifier is clearly a class, thisIdentifier is clearly a variable. You can’t accidentally type thisIdentifier and get a class in a case-sensitive world, whereas you can in the case-insensitive world. In addition, this may form a sort of “Hungarian notation”, which is evil. For example: “c_this_identifier” for a constant, “clThisIdentifier” for a class, or “vThisIdentifier” for a variable.
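For what it's worth, the convention above is mechanical enough to write down. This is a rough Python sketch, with the function name `classify` and the exact regexes being my own assumptions about what "constant", "class" and "variable" capitalisation look like; real style guides differ on the details.

```python
import re


def classify(name: str) -> str:
    """Guess what an identifier is from its capitalisation alone."""
    if re.fullmatch(r"[A-Z][A-Z0-9]*(_[A-Z0-9]+)*", name):
        return "constant"   # THIS_IDENTIFIER
    if re.fullmatch(r"[A-Z][A-Za-z0-9]*", name):
        return "class"      # ThisIdentifier
    if re.fullmatch(r"[a-z][A-Za-z0-9]*", name):
        return "variable"   # thisIdentifier
    return "unknown"


print(classify("THIS_IDENTIFIER"))  # constant
print(classify("ThisIdentifier"))   # class
print(classify("thisIdentifier"))   # variable

# Fold the case and the class and the variable become the same name.
print("ThisIdentifier".casefold() == "thisIdentifier".casefold())  # True
```

The last line is the whole argument: the information the reader gets from the capitalisation is exactly the information a case-insensitive compiler throws away.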
The final point is important, but subtle. Case preserving, case insensitive identifiers encourage “more human” thinking. The problem is, when you’re thinking human, you’re almost definitely thinking wrong. The only reason people zone out when coding is that they’re thinking in the problem domain, and in the language of the problem domain. When you’re writing in C, you’re thinking in C. When you’re writing in something “intuitive”, you’re thinking “intuitively”, which is to say, less precisely. I can only speak for myself, but I find it hard to zone out in languages that are imprecise, like SQL or BASIC. I believe a part of that can be attributed to the imprecise nature of the language itself.
The fallacy of intuition
The real problem I have with his proposal is the ending. Julian ends with:
There is no longer any excuse for making humans learn and handle the quirks of the way computers store upper- and lower-case characters. Instead, software should handle the quirks of human language.
It sounds a lot like:
It is time for integration of the cases! Case-Preserving Case-Insensitivity: equal and yet different!
“Why won’t the machine just do what I want”
which sounds to me like:
I cant type properly and ny shuft ky is stuk itd be good if the puter fixed all my typing an dint crash all my 1338 code LOL!!!1
I occasionally have to type my password in two or three times to get in, because I get it wrong the first time. At times like that I think “maybe it’d be nice if it’d let me pass if I was close enough, or had a couple of close-enough guesses”. Then I come to my senses. LOL indeed.
Nothing against Julian in that last bit, btw. He certainly doesn’t type like that.
I’m a person who spends a lot of time thinking about how one should interact with the PC. I’m really keen on tablet PCs. I think “intuitiveness” is a load of fucking shit. A fallacy, a lie, a failure of higher thinking. It’s what happens when you’ve stayed up too long and your body is trying to hurt you so you’ll get some rest. I wish I had stronger words, but I don’t. Every intuitive program I’ve ever seen is a piece of shit. It’s always non standard, slower, and less flexible than whatever “less intuitive” thing was before it. I remember programs that had pictures of a virtual room which you could click on to do things. A desk on which you’d work on documents, a briefcase, a calculator, walls and TVs and shit.
Those programs don’t exist anymore.
You know why they ship Solitaire with every copy of Windows? So you’ll learn how to use a mouse. If they didn’t, I’m betting people would’ve stayed with whatever they were using before. Microsoft may or may not have known it, but they were probably betting that people would while away hours playing Minesweeper and Solitaire, honing their mousing skills before they’d ever want to do anything “intuitive” on their machines.
I can’t use Macs. Never have. I thought those buggers were meant to make sense. I went to Nathan’s house and started using his Mini while he was in the shower. I felt really uncomfortable until I found the terminal.
Anyone who ever says anything is intuitive is probably lying. Try picking up CAD and figuring out how to use it. I guarantee you’ll give up unless you’ve used some other CAD program, regardless of how “intuitive” the program claims to be. Hell, even go from the “drawing” model CAD programs to the CSG ones, and you’ll probably be screwed. This is because programs deal with concepts. If you don’t grasp the concept already, you think the program is not intuitive. Most people have written a letter, so they think they “get” word processing packages. Most people haven’t designed something to be built on a lathe, so they can’t “get” CAD.
In conclusion, case insensitivity is bad because it allows inconsistency, allows errors, and makes reading code harder. Case sensitivity is good because it’s consistent, gives more information to both the compiler and the reader, and allows for better “zoning”. Intuitiveness is bad because nothing touted as intuitive is ever standard, flexible, and powerful, and the idea of intuition as a goal is a fucking lie. Power is good because it allows professionals to do their jobs properly.
I think it’s time to expose intuition-loving hippies for the frauds they are. Power to the people! Olé!
Sunny,
I am glad to have provoked some thought on the matter – I knew I was treading into controversial territory, so I spent some time preparing the arguments.
Unfortunately, I left one minor point out. My praise for “Dictionary Definition Canonical Form” (http://www.somethinkodd.com/oddthinking/2005/10/26/the-world-of-case-sensitivity/) support in an IDE was added as a comment to my original post, just minutes before you posted this to your blog. To some extent, this shows that I agree with many of your objections, and propose that we can rely on simple technology to overcome them.
Let’s say Keanu’s first name was ‘Boing’ (“Mr Boing Reeves”). In this case, ‘Boing’ would clearly be different to ‘boing’.
I am not convinced it would be that clearly different, as I am sure many people with names that are homonyms with English words might attest!
While in English this kind of case does not crop up often, in programming this is untrue.
I am not saying that it is true for programming. I am saying it should be true for programming!
Say ‘Foo’ is a singleton, it’s often acceptable to call the instance ‘foo’.
So, if I had my way, this wouldn’t be possible. Is it such a great loss, in return for the benefits? I argue the answer is “No”.
“the clock” to refer to an unspecified “Clock”, or “bike” for a particular “Bike”
I strongly agree with the first example. The variable could be called the_clock, and we would both be happy. I don’t think the second example is true. We don’t say “Pick up bike”. Call it the_bike, my_bike or a_bike, but bike is a type of object not an instance of an object. (I can think of two minor exceptions to this: calling a dog “Dog” or calling a boy “Boy”. I don’t think they invalidate my argument.)
The worst thing about his argument is the rule of engineering: Consistency is good. Forced consistency is good. This is why we have coding conventions, checkstyle, indent.
You should explain why consistency is good. Where it promotes clarity, where it allows higher level abstractions because you agree on the lower-level definitions, where it makes items more likely to plug together, consistency is good.
Where it is forcing the user to jump through hoops to make the computer understand you, I don’t agree.
Any sane coding convention is going to force consistent capitalisation anyway. Worse, because the language allows inconsistent naming, and naming is difficult to check for a formatter or something like checkstyle, warnings can only possibly show up by the compiler. Should the compiler show a warning that inconsistent naming has been used? If so, what’s wrong with the same warning in a case-sensitive compiler?
Here’s where I point to powers of a decent IDE to say “let it take care of this for you”.
As soon as he mentioned case, I thought “he’s been using PHP, and is pissed because you don’t have to declare variables and messing up the case will leave a nasty bug with no warnings”. I know that sounds like a long shot, but it’s happened to me, and I went through the same thing
Sure, that’s happened to me. Sure, that’s happened to you too. It happens to everyone who uses PHP (and Python, and Perl, and CSS, and…). So let’s fix it in our development environments, so it doesn’t.
It’s unforgivable that you could have a spelling mistake in your code “KEanu” and it still runs correctly, giving no warnings. There’s no actual advantage to being able to have improper spelling (or is it grammar?).
We could make it illegal to write x = 1.200. We could insist that it be written as x = 1.2, but it wouldn’t add any value to be so pedantic, because 1.200 and 1.2 are both considered legal. So why isn’t KEanu just as legitimate? Why do you consider it a spelling error? (My argument would be more forceful here if we weren’t using a human name, and stuck to a typical function name – e.g. get_URL versus get_url.)
In addition, someone else reading the code could actually wonder if you meant different things when you said “filename” and “fileName”.
Some languages are case-insensitive when you write key-words. You can write IF, if or If. Do people wonder what is meant by the different capitalisations? No.
Sure, people who are still hung-up on case-sensitivity will take a while to get used to it, just like they have to get used to != versus <>.
The correct solution is declaring variables.
Amen, brother! Smalltalk has brought a great evil upon our world, and it has spread to many scripting languages. But this is a different topic, and one which is at least as controversial.
‘a’ sounds like a scalar, or a vector, whereas ‘A’ sounds like a matrix.
An interesting point. For mathematicians, the world is case-sensitive. I guess I would add physicists too, and their units.
It is not surprising their worlds are case-sensitive. They are writing out equations on the blackboard so often that they want to cram the symbols in quite densely.
I don’t think that is a typical case for programmers though. Soon, we would end up back at Fortran, where the name of the variable determines its type, because that is how it works in maths!
THIS_IDENTIFIER is clearly a constant, ThisIdentifier is clearly a class, thisIdentifier is clearly a variable.
Oh, so by your own arguments, ‘A’ must therefore be a constant matrix class? :-)
These are conventions (and I argue deplorable ones) that have arisen from case-sensitive languages. These case conventions can still be used! You just can’t have the identifiers overlap.
You can’t accidentally type thisIdentifier and get a class in a case-sensitive world, whereas you can in the case-insensitive world.
I think that you have this backwards! In the case-sensitive world, a simplistic case-shift (which I argue is an easy-to-overlook change – proof) could lead you to the wrong item. In a case-insensitive world, you would need to change the name of the variable in order to get the wrong type.
In addition, this may form a sort of “hungarian notation”, which is evil.
The most common form of Hungarian notation is, indeed, evil. There are some spirited defences of some versions based closely on the original. But, whatever your reason for arguing that the Hungarian notation is evil, the exact same arguments apply to the very case conventions that you were just defending!
You go on to attack “intuitive” software which will attempt to “do what I want”. That’s a straw-man. I am not an “intuition-loving hippie”; I am not asking for a computer with super-intelligent powers.
I wrote “If a computer can also disambiguate [variable names] accurately, it should do so”. You are arguing that computers can’t disambiguate woolly thinking. I agree, but I am not asking for that. All I am asking is that the compiler call the to_upper function.
I was using an IDE that took care of all of this back in 1991. Now that I have a computer in front of me that is over 1000 times more powerful, I don’t understand why I can’t have that same simplicity that I had back then. Give that power to the people!