Sep 24 2003
My First Programming Language(TM)
It is a long-standing debate in Computer Science as to which programming language is the best for beginning programmers. In the early 60’s, students at Cornell were programming in CORC, a language designed as a pedagogic replacement for PL/1 in introductory courses. At the same time, DITRAN was being developed at the University of Wisconsin: a tool for running FORTRAN programs that (among other things) gave the programmer errors in terms of the code they wrote, not the assembly language produced by the tool; that was a big feature in 1965, one we take for granted today.
Both of these languages were developed for the sole purpose of teaching programming to novices. They were not for developing operating systems, writing device drivers, video games, word processors, or anything else. The languages were designed for teaching, because both the students and the instructors were new to the whole “programming a computer” thing. This was, for almost everyone involved, something they had never done before.
Today, we have thousands of languages. Literally. Thousands. If you believe even one ounce of the Sapir-Whorf Hypothesis, you’re going to have to believe that the choice of first programming language will affect how a novice programmer thinks about the programs they write and the way they express ideas in code. As a journeyman programmer in more than one language, I know the way I shape solutions to problems is definitely shaped by the language I’m using; to believe it is any different for a beginner is foolish.
But this isn’t debated in Computer Science circles today. It’s accepted. And I’m happy that it is. What bothers me is the languages we teach introductory programming with are not designed for novices. The tools that the students use are not designed for novices. Furthermore, even if you use a tool designed for a novice with a language that isn’t, it is difficult to patch all the leaky abstractions that will persist despite your best efforts.
My use of leaky abstractions is akin to what Joel Spolsky wrote on his weblog on November 11 of 2002.
“[This] is what computer scientists like to call an abstraction: a simplification of something much more complicated that is going on under the covers. As it turns out, a lot of computer programming consists of building abstractions.”
A leaky abstraction is when something gets out from under the covers (oh my!). And from a pedagogical standpoint, leaky abstractions make for difficult questions at times when you’d really rather not deal with them. Consider two programming languages:
[Figure: Language A and Language B, each shown as a small circle]
The relatively small size of those circles is actually a visual lie. Both languages are actually much, much bigger. These circles represent the size of language that a novice should encounter. These are the basic nouns, verbs, and phrases that a beginner learning Language A or Language B should know. I am, you are, we are, they are, he is, she is, … of course the English language is huge, but we start with conjugations of critical verbs and by learning simple phrases: Hello, my name is Matthew.
By analogy, that’s where I’d like to start. Unfortunately, we have a problem with Language A; A was designed for use in industry, writing huge pieces of enterprise software. It has an inconsistent syntax: even though it aims to be self-consistent, rules you learn early on have to be broken to write more expressively. Rules grammar change: English Traditional Replaced To Be New Syntax With. [The Onion] This becomes a problem for us when we want to start novices off with just a small part of the language.
[Figure: Language A and Language B]
In Language A, it turns out that you can’t say anything meaningful without reaching for other parts of the language. While we want to start off with simple bits of code, we must include bits and pieces of the full language to make even the simplest of sentences. Language B, however, is a complete core. We can do simple things simply to start, and if we want to say more complex things, we can. As we grow the language, we don’t have references out into parts of the language we don’t know; instead, we only have references back to the parts of the language we have already learned.
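To make this concrete, here is a minimal sketch, assuming Java plays the role of Language A (an identification this post only makes further down, in the list of environments). The class name Hello and the helper greeting() are made up for illustration; the comments list the parts of the full language a novice has to reach into just to print one sentence.

```java
// Hypothetical first program for a novice; the names Hello and greeting are illustrative.
class Hello {
    // The one simple sentence we actually want the student to say.
    static String greeting() {
        return "Hello, my name is Matthew.";
    }

    // Before a single word is printed, the novice has already met:
    //   class      -> classes and objects
    //   public     -> access control
    //   static     -> class-level versus instance-level members
    //   void       -> return types
    //   String[]   -> arrays and the type system
    //   System.out -> a static field on a library class
    public static void main(String[] args) {
        System.out.println(greeting());
    }
}
```

For contrast, in a small core language such as Scheme the same sentence is a single expression, (display "Hello, my name is Matthew."), which refers to nothing the student has not yet been taught.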
The inability to say simple things simply in Language A is a leaky abstraction. The inability to keep the complexity of the whole language under the covers kills us pedagogically (the blue arrow, below). What should we teach first? Where do we start to build a shared conceptual vocabulary that both the instructor and student can use? Where do we go next if we do manage to find a starting point?
[Figure: Language A and Language B, with the blue arrow]
In an ideal world, Language B would be ever expandable, so we could grow the language as we needed to. All our references would be back to things we have already learned: in learning a new verb, we don’t need to reference a verb tense we’ve never seen before, and once learned, its usage will never change. In time, we might even be able to develop a pedagogical ordering that makes clear to instructors what parts of the language (and the concepts they embody) should come first so as to scaffold the learning of larger and more complex ideas later.
As long as I’m dreaming, I might as well take this to completion. The important thing to remember is that the goal of a first course in computer science is to teach foundational concepts using a particular language, not to teach the language. Carpenters learn how to drive a nail before they learn the basic tenets of sound structure; pianists start with scales, not concerti. In either case, they aren’t learning about hammers and pianos, they’re learning about building and performing, using the tools of their trade.
Language B leads to Language A. Done right, it’s as simple as that. There isn’t anything wrong with Language A; it’s just not appropriate for a first course in computing. Language A was never designed for novices. It is the wrong tool for the task: instructors teaching a first course in programming are trying to teach students to drive nails and do scales, and inspire them to dream of skyscrapers and symphonies.
Language B exists to build a shared vocabulary between the instructor and the student. In learning it, a shared conceptual space is developed in which new ideas can be introduced, and a framework exists for discussing those ideas [Vygotsky]. As students explore Language A, there is a strong foundation (laid in the exploration of Language B) upon which the instructor can draw as they introduce and explain old and new ideas.
There are different methods of teaching a child how to play the piano: some involve reading, some involve listening, but they all start with the same instrument, a collection of ebonies and ivories grouped in twos and threes. The focus for a piano teacher is strictly one of methodology, as they have no option to build the tools they teach with in conjunction with their pedagogy. In computing, we are Gods of our domain; we can create any language to suit our needs, develop it carefully, over time, fully aware of its sole end use: to teach. Despite this simple fact, we insist on keeping up with industrial trends, and must constantly reinvent ourselves and the tools we use in the name of progress, as if the fundamental task has ever changed.
Unfortunately, this implies that we know what that task is, which we don’t. Hence the failing of a discipline that continues to turn a blind eye to the importance of high-quality educational research.
A must read:
- Growing a Language, by Guy Steele.
I saw Guy give this talk at Indiana University; he spoke for 45 minutes on the importance of extensibility in programming languages, constraining himself to words of one syllable. He would only use words of more than one syllable if he had previously defined them using words of one syllable. To date the most riveting talk I’ve seen in Computer Science, and an absolute must read in my opinion.
Pedagogical environments:
Each of these links leads to a list of articles and papers related to the environment or project described.
- BlueJ, a pedagogical environment for Java.
An excellent piece of software. To a certain extent, Java is rife with leaky abstractions (from a learning and teaching perspective); Java is, in my mind, Language A. While it is possible for BlueJ to patch many of those (which it does well, and continues to do better every day), there will always be leaky abstractions.
- DrScheme, a pedagogical environment for Scheme.
Another excellent piece of software. The PLT group have generally managed to build Language B, made possible by the simplicity of Scheme’s nature.
- DrJava, a pedagogical environment for Java in the spirit of DrScheme.
I have only seen one presentation of DrJava, and have never used it. I believe it is fair to say that DrJava does not help patch the leaky abstractions in Java, but instead takes a path of full disclosure: it provides an environment for students to explore the language, warts and all.
- Helium, an environment for learning Haskell.
I’ve never seen it before, and just learned of it yesterday. Given the dates on the documents available, it is much younger than either BlueJ or DrScheme.
- The CS-1 Sandbox, a pedagogical environment for C.
It is unlikely that this particular environment will be developed further; the study of it was the subject of Peter DePasquale’s PhD dissertation.
Sweeping criticism:
In the case of all of these projects, the same fundamental problem exists: none of these projects has been developed hand-in-hand with research. While they are excellent, and driven by smart people, none of them are explicitly (or even implicitly) driven by research. The blue arrows driving all of them are based on experiential suppositions, not research findings. Peter’s work represents the only environment that has been empirically studied, but my dream of such an environment being developed hand-in-hand with research still remains.
- Many other articles
If you are interested in reading more in-depth about pedagogical languages, environments, and related issues, all of these articles dance around the topic to some degree.
6 Responses to “My First Programming Language(TM)”








I’ve always been unsatisfied with teaching Java, namely because of the Language A problems you bring up. Students have to memorize “public static void main(String[] args)” just to get any program to run, and that’s just the start. I think they need to understand logic and loops before they can even start to put objects together.
I’ve been thinking that there should be some toys that kids can play with to get them familiar with the concepts in computer science. There’s hints and such that I like to bring up, like “99 bottles of beer on a wall” or “Bingo” are loops, but there’s not really any Lego-ish toys that let kids create objects and then describe their interactions. Computer languages are so virtual; I’d like to have something concrete to play with and explore. Toys, my ideal CS1 language.
Hey Matt, I’d be interested in your thoughts on the Kernel Language Approach advocated by Van Roy and Haridi? (http://www.mozart-oz.org) It seems like their language would satisfy your criticism of development in hand with research (i.e. I think they would at least claim the blue arrow is natural and not leaky). Could it be done at the first year level? (Their text is aimed at 2nd year level students.)
My other thought here is to wonder what impact a student’s background brings in, even if we build a wonderful non-leaky language. Many of my students have done some sort of programming and that will leak into this (Sapir-Whorf I suppose).
For the record, while PLT may not have done “pedagogic
research” in the sense of departments of education to
construct DrScheme, we have certainly conducted a
number of feedback experiments. Indeed, if we hadn’t
we probably wouldn’t have come up with the language
levels and many other innovations in this environment.
I believe you followed DrScheme for a while and you
must have noticed that the language levels changed.
These changes were the results of classroom and lab
observations based on the existing prototypes.
I believe that you misunderstand the magnitude of the
task if you ask for more or the complexity if you believe
theoretical research helps. Constructing DrScheme is a
huge undertaking. It takes years to explore this thing,
and nothing but an honest (!) feedback-experiment
cycle helps you get to the end.
“Mr Whorf, get down from there, and take off that ridiculous suit, it’s the Sapir Whorf hypothesis, not the Super Whorf hypothesis”
Wow, I made the list. Cool… I’m currently deciding what else to do with CS1 Sandbox, now that I’m done with the PhD. Being that I had to code against what I call a bastardized version of C/C++ that we used for instruction at VT (now taught in Java), I’d have to port the underlying language to support Java. Isn’t someone (Ian?) working on this for BlueJ? Thus, it’s moot for me to do it as well.
I’ve actually got other irons in the fire that I want to get to, so I’m not sure what else I really have the desire to do. On the list however is to get a version released to the web for playing with (that does not include the data collection support which is still currently part of the code base). Also, when I last ran it under 1.4 (which we didn’t do during the experiment), the splash screen hung the app… so a few tweaks need to happen…
In other news…. Matt, how do you draw those nice graphics in your blogs? What program is that?
get more confused actually…
what i want to know is what would be the best Foundational Programming Language to use, especially for those who are neophytes.
Specifically, C against Visual Basic.