The Evolution of CS Papers
This post was prompted by a Facebook interaction regarding Dijktra’s famous “GO TO Statement Considered Harmful” article, a letter he sent to the editor of CACM back in 1968. Seen through the lens of currently accepted research reporting practices, Dijsktra’s article reads like a technical rant, something that we might find in today’s blogs, but that have been essentially abolished from peer-reviewed research publications. Here’s a person stating his strong opinion about a language construct that he dislikes, starting with a strong premise about the negative effects of that construct without presenting any data whatsoever supporting that premise! As far as anyone can tell, the premise was based entirely in his own experience and intuition — which happened to go against opinions of other famous Computer Scientists at the time like Knuth. Granted, this was just a letter to the editor, not a full-blown technical article. However, the influence of this letter is unquestionable! Following Dijsktra’s rage against GO TOs, practically all programming languages that came after it avoided that construct, and the ones that didn’t, work hard at hiding it from programmers, sort of like “oops! sorry! there’s goto, but try to avoid it, ok?” (Also, this letter was single-handedly responsible for starting the meme “X considered harmful” that has been going on in CS since then, although credit for that meme goes, not to Dijkstra, but to the CACM editor, who spiced the title a little bit from its original “A case against the GO TO statement.”)
Whether GO TO is harmful or not, is besides the point. In the process of writing my upcoming book, I spent a considerable amount of time over the past year looking through old CS literature. It’s fascinating! The topic of this post is the evolution of methods in Computer Science research for the past 60 years, and the changing ways by which ideas are considered well-argued for.
Let me start in 1960, because I believe the papers of the time represent a fork in the road.
Back in 1960, Dijkstra published a very influential technical paper with the title “Recursive Programming” in the mathematical journal Numerische Mathematik, where he put forward the idea of using a stack in the implementation of subroutine calls. Up to that point, subroutines in the PLs of the time, specifically Algol, were given fixed blocks of memory, so it was impossible to call them recursively. Using an activation record pushed onto a stack every time a subroutine was called solved this problem.
To set the record straight, Dijkstra was not the first person to suggest that stacks might be a useful data structure for computation. Back in 1946, Turing had already put forward that idea in an internal report, although, because of restrictions from the British Secret service, it’s questionable that anyone had seen it; but it’s likely that Turing talked about it. Using stacks for computing arithmetic expressions was well known at the time. By the mid-1950s some German professors had described using stacks for subroutines calls, including filing for a patent, and the idea was independently developed by 2 Australians and another Professor at ETH. Dijkstra’s paper has the grand total of… 1 reference to a paper by one of his close collaborators. It’s interesting to analyze the context of the citation to that work:
[After presenting arithmetic expressions using stacks]
“The above is well know (see for instance ) and so elegant that we could not refrain from trying to extend this technique by consistent application of its principles(*). […]
(*) Without doubt, in view of the vivid interest in the construction of compilers, at least some of these extensions have been envisaged by others but the author was unable to trace any publication that went further than . The referee was so kind to send him a copy of the report ‘Gebrauchsanleitung fur die ERMETH’ of the Institut fur Angewandte Mathematik der ETH, Zurich, a description by Heinz Waldburger of a specific program organization in which similar techniques as developed by Professor H. Ruthishauser in his lectures are actually incorporated. The author of the present paper thinks, however, that there the principle of recursiveness has not been carried through to this ultimate consequences which leads to logically unnecessary restriction like the impossibility of nesting intermediate returns and the limitation of the order of the subroutine jump (cf. section F 44 of the report).”
This is the entirety of “related work” in Dijkstra’s paper — a footnote! The passage also discloses Dijsktra’s attraction to stacks: a technique “…so elegant that we could not refrain from trying to extend [it]…” (ooook!)
Note this was not a letter to the editor, it was a full-blown technical, peer-reviewed paper published in a Mathematical journal. Seen in the light of current-day peer reviewing practices in CS, even considering the fact that the paper was proposing a clear solution for a clear problem, this paper would likely be shot down. First, the problem for which the solution was proposed was clear but not well motivated: how important was it for programs [of the time] to call routines recursively? — maybe it wasn’t important at all! If it was important, what were other people doing about it? Any paper with only 1 reference in the bibliography goes immediately to the reject pile. Second, given that the concept had already been proposed (in writing) by H. Waldburger, the author of this paper would at least be requested to express his ideas in terms of how they were different from those described in H. Waldburger’s paper rather than spending most of the paper describing how to do procedure calls using stacks. Finally, the work is borderline an “engineering issue” rather than a scientific one.
These are complaints often seen in today’s CS peer reviews.
But this obviously misses the cultural context of the mid-20th century. Things were very different back then!
Let’s go even further back to 1936, an important year for the emergent field of computing. In that year, two foundational papers were published, one by Alonzo Church and the other by Alan Turing (picture on the left: page 28 of Turing’s paper). Church’s paper described the lambda calculus, a model of computation that influenced the design of the Lisp system 2 decades later; Turing’s paper described the Universal Turing Machine, which influenced the architecture of computing machines that were built 1 decade later. Both papers focused on the issue of computability of functions; they both had Gödel’s earlier work as the stepping stone. Turing referred to Church’s paper, which had been published just a few months earlier. Turing’s and Church’s papers are both Mathematical in nature: the ideas are argued with theorems and proofs. They read like today’s Mathematical papers.
Between 1936 and 1955, work in the computing field was published in Mathematical journals, as there were no publications dedicated to it. One of the first publications dedicated to this emerging field was the Journal of the ACM (JACM). Started in 1954, its first issues focused primarily on numerical methods that could be given to a computer for solving algebra problems, differential equations, etc. Those papers were Mathematical, using theorems and proofs as the method of argumentation (the figure on the left shows a page from a paper published in 3(3) 1956). One could call that work Computational Mathematics. Over the years, JACM consistently continued to focus on the Mathematical aspects of Computer Science, although in the earlier years, and from time to time, it also carried non-Mathematical, design-oriented, papers, some of which resulting from work presented orally at the Society’s annual meetings — but those were the exception, and their methods were clearly at odds with the ones in the Mathematical papers.
For example, one short non-Mathematical paper published in 1961 describing a language for symbol manipulation ends with “The examples in the text and appendices are adapted from an ALGOL translator which the author is programming for the UNIVAC 1105 in the notation described. It appears that a translator can be described compactly and precisely in this notation, and that it is not difficult to translate the productions into symbolic assembly language.” Another one published in 1965 describing an architecture for a multiprogramming computer ends with “The ideas presented in this paper have grown from experience gained while implementing general purpose time-shared computer systems at MIT [5, 6], together with and our desire to create a second generation of computer utility with greater power and efficiency of operation. It is our conviction that these concepts are applicable to the design of online systems in general .” These papers had a completely different nature from the Mathematical papers, of which the picture above is one example.
Soon, the few design and implementation papers disappeared from JACM. Fast forwarding to 2014, and browsing through the most recent Table of Contents of JACM, it is not surprising to find papers that are Mathematical in nature, and that could very well have been there in 1955! Advances in human knowledge in the past 60 years not withstanding, the ways of arguing accepted at the JACM have been fairly homogeneous: formal systems, theorems and proofs.
But by the late 1950s, “Computer Science” had grown from pure Mathematics to the various ways of programming computers and of processing programs written in various programming languages. Dijkstra’s work follows this new trend, and his “Recursive Programming” is a good example. He was not the only one. For example, John McCarthy’s “Recursive functions of symbolic expressions and their computation by machine, part I“, published in the same year as Dijsktra’s “Recursive Programming”, describes the Lisp system — a system intended to carry out instructions using “common sense”. Even though the work was based on Church’s lambda-calculus, McCarthy’s paper has no theorems or proofs; also, there aren’t any illustrative applications, as he hoped to do that in another paper (which was never written!). The paper is entirely a [dry] description of the Lisp system, with no further justification for it other than they wanted to build it in order to support automatic deductions.
In order to cope with this non-Mathematical material, in 1957 the ACM started a monthly magazine called Communications of the ACM (CACM). CACM featured a mixture of algorithms, language designs, recipes for using computers efficiently, and opinions on topics that were important at the time, including what to call the emerging field. McCarthy’s Lisp paper was published there in 1960. It’s fascinating to browse through the earlier years’ issues of CACM: the sense of discovery and of chartering uncharted territory is palpable. Most of the articles were of the nature “I did this cool thing, and I think everyone should know about it!” or “I see this problem, and I have a cool idea for how to solve it! (I may or may not have done it)”. Things that were considered “cool” at the time were efficient algorithms, analysis of geeky details of programming languages (Algol seemed to attract a lot of attention for a few years), new language constructs, compilers and operating system structures.
CACM illustrates the fork in the road that happened around 1960: while some papers stayed within Mathematics, more and more material was being communicated that wasn’t Mathematical, but that covered some aspect of design and implementation of computers and programs. This second kind of articles seemed to be valued for exposing some novel idea that might appeal to the readership. A series of algorithms, including Hoare’s Quicksort, published early on is a wonderful example of this (see CACM 4(7), figure on the left). The algorithms were presented as pseudo-code without much text around them other than a comment!
With the exception of JACM, throughout the 1960s and 1970s, at least in the US, novelty and interest to readership seemed to be the main criteria for publishing technical computer science work. Whether the ideas were effective or sound or provable or broadly applicable was not a concern (if it was, the published papers don’t transpire it!). The image on the left is a page from CACM 8(6) 1965 containing 2 short recipes and 1 small observation, the dominant contents of the magazine around that time. There were also longer articles, like McCarthy’s Lisp paper and others, again describing some piece of design or implementation work that seemed novel and interesting, without much of Mathematical or empirical basis for it.
This was the context in which Dijkstra’s “GO TO Statement Considered Harmful” emerged in 1968 and, in this light, it’s clear that his letter to the editor was not an outlier. The computing community was taking the first steps into unknown territory: programs and programming. Never in Human History had these artifacts existed in such scale, and the art of creating them was a whole new endeavor of human intellect! For a long time, it was completely unclear how to argue for or against ideas in this new territory and, to some extent, the methods are still murky today. In the beginning, arguing for or against didn’t even seem to be an issue; the main concern was feasibility — “here’s how we did it” — and that was extremely valuable, because the community was still learning the basics. Opinions also abounded, and taking opinions for fact seemed to be modus operandi. Dijkstra happened to be an extreme case of an opinionated person; his many writings are a testament to the strong beliefs that he was not shy to express, and very amusing too!
Publications in Computer Science grew exponentially since 1960. Starting in the 1970s, special interest groups started to form within the ACM around topic areas, each one with its annual conference. A similar thing happened in other professional societies, such as IFIP. Unlike the conferences in other fields, these conferences featured full-length, highly technical papers selected by committees of peers. The practice of technical publication featuring novel ideas without Mathematical or empirical basis continued throughout the 1970s and way into the 1980s.
By the mid-90s, however, things started to change. I’m old enough to remember this phase as a professional adult — I witnessed this transition. To a large extent, it was my generation that brought this transition about. Questions started to be raised: “This is cool but how are you going to validate your claims that your idea is good?” or “This is cool, but can you prove it?” or “Are you sure this is really novel? How is this different from that other thing?”, etc. One reason for this, I think, was the sheer volume of work being published; with so many ideas being thrown around, with so many words being used for similar concepts, and with so many commercial interests at stake, how could we separate the wheat from the chaff? Another reason, I think, was the realization by many that our field had the word “science” in it, but that very few — with the exception of the Mathematically-inclined — were actually doing it, so people scrambled to converge to methods accepted in other scientific communities that could be used in CS.
And so we came to the current state of affairs. Unevenly but steadily, “idea papers” and “opinion papers” have been essentially banned from CS research. Feasibility is usually not considered enough for arguing for the ideas, the work needs to present validation of those ideas — that being via performance numbers, Mathematical proofs, statistics on collections of data, or people-facing empirical methods. Granted, this is uneven; there are still a lot of good old strong opinions and tastes floating around! — e.g. types in PLs.
I have mixed feelings about this state of affairs, but I already wrote about that in another post. I like some of those old papers a lot! Had they not presented the “what & how to” without much regard for scientific argumentation, progress would likely have been a lot slower. Communities have their ways of separating the wheat from the chaff over time.