Conferences vs. Journals: The Hidden Assumptions
Some conversations I’ve been having over the past year led me to a deeper exploration of the issue of conferences vs. journals in Computer Science. The debate, so far, seems to be missing a few critical observations regarding scientific journals and our own ACM, and therefore it is somewhat incomplete. This essay lays out my thoughts on it.
Warning: No one except academics cares about this!
I’ll start with the historical context. CS is this oddball academic discipline where people publish their most important work in conferences instead of in journals. For several years now, CS faculty have been fighting an uphill battle with promotion committees regarding the workings of our field. Back in 1999, the Computing Research Association (CRA) put out a “best practices memo” called Evaluating Computer Scientists and Engineers for Promotion and Tenure written by three heavyweight computer scientists. The starting point of that document is this:
[…] Relying on journal publications as the sole demonstration of scholarly achievement, especially counting such publications to determine whether they exceed a prescribed threshold, ignores significant evidence of accomplishment in computer science and engineering. For example, conference publication is preferred in the field, and computational artifacts—software, chips, etc.—are a tangible means of conveying ideas and insight. Obligating faculty to be evaluated by this traditional standard handicaps their careers, and indirectly harms the field.
This memo gave ammunition to many CS departments over the past several years, at least in the US. In our School of ICS, for example, tenure and promotion cases are often accompanied by a memo written by the Chair or the Dean conveying essentially the message quoted above, and many letter-writers reinforce this message in their letters.
In spite of this ammunition, the battle has continued to be an uphill one, because, well, we’re the only ones doing this odd thing. About four years ago, Moshe Vardi brought the issue into the public arena in his CACM letter called Conferences vs. Journals in Computing Research, where he wonders if CS is driving on the wrong side of the road:
[…] the prevailing academic standard of “publish” is “publish in archival journals.” Why are we the only discipline driving on the conference side of the “publication road?”
His letter exposes the Achilles Heel of the CS publication process:
My concern is our system has compromised one of the cornerstones of scientific publication—peer review. Some call computing-research conferences “refereed conferences,” but we all know this is just an attempt to mollify promotion and tenure committees. The reviewing process performed by program committees is done under extreme time and workload pressures, and it does not rise to the level of careful refereeing. There is some expectation that conference papers will be followed up by journal papers, where careful refereeing will ultimately take place. In truth, only a small fraction of conference papers are followed up by journal papers.
He’s right, sort of, and we all know this, of course. In the top CS conferences, reviewers are assigned something like 15 to 28 papers to review in about 2 months. I’ve heard of reviewers who basically suspend their jobs for 5 weeks in order to read the papers and produce in-depth reviews. Most reviewers, however, are sane people who don’t, or can’t afford to, do this. They apply all sorts of heuristics, like spending more time with the best papers in their pile and less time with the ones that aren’t so good. Or they give some papers out to colleagues and students whom they know are in a better position to write in-depth reviews. Or, plain and simple, they just spend an hour or so with each paper and produce the best review they can within that time. Given the enormous task we ask ourselves to do, at no pay, we can be unhappy about this practice, but we can’t really condemn it!
The limited time that reviewers allocate to papers shows up in the quality of their reviews. It’s not unusual to see reviews consisting of one short paragraph. At that point, these aren’t reviews anymore, they’re just opinions on whether the paper should or should not be accepted; the review is missing.
But Vardi’s observation doesn’t tell the complete story. Journals often reject papers off the bat without even sending them to review. Top journals do a significant percentage of desk rejects. This is equivalent to 1-paragraph reviews. Many papers submitted to conferences aren’t mature enough to be given a thorough review, and that’s ok. The conference review process is more democratic than the journal review one, in the sense that all papers in scope of the conference are given a chance, and it’s up to the reviewers to decide whether the papers are worth their time or not.
There is no question, however, that when good reviews are provided, their integration into the papers is extremely beneficial for everyone, even if the authors need to do some more work. And this is where journals really have an edge over conferences. In most of our conferences, so far, we produce reviews, but their proper integration into the papers may or may not happen, and no one is there to check, because papers are accepted immediately.
So the debate of monsters vs. zombies continues.
Recently, parts of the CS community have started to come up with interesting hybrids. Grudin, Mark and Riedl described them well in a CACM Viewpoint called Conference-Journal Hybrids that I encourage everyone to read. They boil the hybrids down to three:
- Journal acceptance precedes conference presentation. In this category: PVLDB, TACO-HiPEAC and a few other conference-journal pairs, including TOPLAS-PLDI, that are starting to experiment with this model.
- Shepherded conference papers become journal articles. In this category: SIGGRAPH and Infovis and their journal counterparts ACM TOG and IEEE TVCG.
- Conferences without a journal affiliation that incorporate a revision cycle. In this category: AOSD, CSCW, and now also OOPSLA, of which I’m the PC Chair this year.
This is all very interesting, but here is a fundamental question: does the CS community really know how our colleagues’ journals work? After all, we don’t usually publish in them, and have, for the most part, stayed away from our own journals for 50 years…
In the quest to answer this question, I searched the web, talked to people, and came across what seems to be the core of the CS community’s understanding of journals, something prominently stated in Grudin et al.’s Viewpoint:
Journals encourage more revision and are less deadline driven
Where does this idea come from?
While it’s true that many journals engage in very long, and very many, review cycles — the CS journals being in that category — others don’t. Many excellent journals in other fields have very short cycles and at most 2 rounds of reviews. Here are some well-known examples: Science, PNAS and Physical Review D. Not all journals are like that, but many are. There isn’t just one single journal model; there are many! (And I don’t even want to touch on the fascinating history of peer review, which puts a lot of what we take for granted into question.)
So, again, where does our monochromatic CS journal model come from?
The ACM Publications Board — the powerful board that controls all journals and transactions published by the ACM — felt the need to write down a policy concerning the publication of conference proceedings in ACM journals. I suspect, but I’m not sure, that this explicit policy has something to do with several conference leaders approaching them recently in order to just publish their conference proceedings as journals. And there it is, in that policy, the core of the community’s understanding of journals:
ACM journals and transactions are designed to publish research results which are the gold standard for the profession, i.e., they are of high novelty and interest, technically sound, and well presented. Achieving this level of quality requires a review process that provides the time necessary for careful review by acknowledged experts in the field. In particular, this means selection of reviewers from the widest possible pool, and open-ended review cycles that ensure the most sound and polished result. Such a standard is largely incompatible with conference review procedures which are sharply constrained by deadline.
So this is where the crux of the issue lives! The ACM, which pretty much institutes academic computer science, believes that its journals should have open-ended review cycles — that is, a prolonged ping-pong between authors and reviewers that can last for a year or more. It also believes that the reviewer pool should be as wide as possible, as opposed to the 25-30 people of a conference’s program committee. But this last point is largely moot because, as stated above, conference reviewers have naturally found sane ways of coping with reviewing overload, which include requesting reviews from others. This makes program committee members essentially equivalent to journals’ Associate Editors (some conferences have already switched their terminology, e.g. CHI and now ICSE). So the real difference here is the open-ended nature of the reviewing/revision process: the ACM thinks it’s critical for quality; many excellent journals disagree.
In fact, our best conferences aren’t that far from those excellent, short-cycle journals. This is particularly true for the conferences following the 3rd hybrid model described by Grudin et al., such as CSCW and OOPSLA, with the extra check at the end. The main difference is that we do it in batch, whereas those journals do it continuously. Let me explain: we recruit the community to perform actions en masse under strict deadlines. There are deadlines for submission, so our conferences get 200-400 papers all at once; there are deadlines for reviews, so we get 600-1200 reviews all at once; then we meet and discuss the most promising papers; the entire program committee / associate editors participate in the final decisions, not just the PC Chair / Editor-in-Chief; we select a subset of papers to go to the second round all together; the authors all have the same amount of time to revise their papers and resubmit on a strict deadline; we review a second time and submit the final assessment on a strict deadline; and we finally send the final notification at the same time for all papers in the second round. This is a communal effort; everyone is in it together!
So, yes, it is different from the continuous pipeline that our colleagues’ journals use, but is that bad? I like deadlines. Deadlines make my world go around! If it weren’t for deadlines I would procrastinate like a snail. I have a feeling that the CS community, in general, likes deadlines. The only problem with the batch model is that we have just one cycle per year, and that slows things down. But wait: we already have only one cycle per year, because our conferences happen once a year, so we can’t really point to that as a disadvantage of this model over our current conferences. The way we cope with that is by having many conferences over the year to which we can submit our work. But, granted, more than one cycle per year would be better than just one! We just need to find a way of making that work without losing the community engagement that comes with the batch process.
The very same reasons that made conferences the loci of research work in our field also made journals in our field seriously lag behind in many respects. The community is not used to submitting original research work to journals; most of the journals in our field publish extended versions of conference papers, and it is not unusual for reviewers to expect that as a condition for acceptance of papers in journals. If we disturb the prestige of our conferences, it is unclear where the good, original work in this field will be submitted.
Many of our conferences have brilliant histories, having been the forum for some of the most important innovations over the last several decades. Devaluing that by transforming them into conferences-as-other-fields-have-them would waste all that history. OOPSLA, for example, has been naturally following the evolution and maturation of the software field; it has evolved as we, the community, have evolved. While I see the value of journals in the pursuit of solid knowledge that is backed up by evidence, I don’t necessarily agree that we should simply abandon the conferences in our field that have brought us to where we are today and start publishing in the existing open-ended review journals. Instead, I am inclined to accept that the next step in this evolution is for our best conferences to get more serious about argumentation and validation of the published work by adding a second round of reviews, bringing them level with many excellent journals in other fields. Once that is in place, we can then have the discussion of what to call them: conference proceedings or journals?
From what I understand of academic history, the Humanities, Arts and Social Sciences have traditionally had open-ended review cycles. The “hard sciences” not so much, they want speed (this is an oversimplification). But all this is messy and interesting: the field of scientific publication is wide and mixed, it’s not monochromatic. The ACM, so far, has decided that it wants only one type of journal, and does not accept new journal proposals unless they account for open-ended review cycles. The CS community, on the other hand, by and large prefers short cycles with strict deadlines.
So here’s the parting thought: how much of this debate of journals vs. conferences in CS exists because of ACM’s current open-ended review cycles policy for its journals?
First off, it’s lovely to see a piece on this topic which takes an unexpected turn! I didn’t expect this to end up where you did, but I do agree with you.
Two, I’m really glad to see OOPSLA incorporating the revise-and-resubmit cycle. Having gotten used to it as both author and PC member in CSCW, I now miss it in the other conferences I’m involved with.
And one more thing: you write “In the top CS conferences, reviewers are assigned something like 15 to 28 papers to review in about 2 months. ” I’m not sure I agree with that assessment. I’d like to think that I’m involved in at least *some* of the top CS conferences, like CHI and CSCW (as well as some others) which are, at very least as measured by factors like ACM DL downloads and the like, top CS conferences. Workloads are much more like 10-12 papers per AC, not 15-28. I have heard of rates that high in other conferences — but, frankly, those rates make me doubt the quality of the conferences and reviewing for the very reason you cite above.
@jofish, at POPL last year reviewers were given ~25 papers, which, from what I can tell, made them all go into post-reviewing burn-out, because the peer pressure to produce extensive reviews there is enormous. At OOPSLA this year, it’s ~16. I agree these numbers are too high. The reason for them is that these conferences use a single-level program committee, and each paper is given to 3 reviewers or so.
In the upcoming year, ICSE is switching to a 2-level review committee; the first level is called the “Program Board” and has 24 people; the second is called the “Program Committee” and has 80 people. This will lower the load for everyone.
Thanks for another enjoyable article and picking up an important topic. I don’t quite follow on a couple of aspects but the point I want to make is a different one:
My impression is that an implicit assumption of yours is that it is very important to keep poor work out of our conferences and journals and that expert peer review, if done right, can ensure this.
In my mind, this ignores a whole second dimension of how papers are evaluated: Not only do we have a quality gate in the form of expert peer review before publication, we also see the mass evaluation of a paper over time. Open Access and Google Scholar are the great equalizers: Papers are easily found (more or less, but certainly more easily than before) and properly calculated citation numbers will speak the ultimate truth about the value of a paper.
Conference and journal expert peer review have a lot of flaws, as studies have shown over and over again. Mass peer review by citing has its own problems and there is some good research to be done to improve it, but over time I expect it to be much better in putting a value onto a paper.
Expert peer review in a defined evaluation process and mass peer review in the open over time don’t overlap today, though sometimes folks suggest they should and that mass peer review should be pulled into the defined evaluation process running up to a conference or as part of a journal paper submission process.
I’m not so sure about joining them. You didn’t mention it as an option either. What are your thoughts on this?
Hi @Dirk. Yes, that’s a whole other can of worms… It relates to the history and purpose of peer review. We all assume that it started for purposes of quality control, and that is, roughly, how we use it; history, however, seems to indicate that it exists for purposes of censorship. There’s a fine line between the two… Plus, according to some literature, the systematic use of peer review in scientific publications is a relatively recent practice. For most of the past 2-3 centuries, the decision to accept or not accept papers was purely done by the editors, who delegated the correctness of the papers to the authors themselves.
Relevant post and book: http://michaelnielsen.org/blog/three-myths-about-scientific-peer-review/
I don’t necessarily agree that opening things up to the masses is always a good thing — see the recent episode with crowdsourcing the identification of the Boston bombers. But, as I said, this is a whole other issue that pertains to peer review in general, not just CS …
Crista, I think your distinction between science having fast journals and humanities having slow journals is a bit too coarse, and maybe needs to distinguish experimental sciences from theoretical sciences. At least, the mathematics journals I’ve published in have been indistinguishable from theoretical computer science journals in their review process (i.e. months to years review time, multiple revision cycles).
Hi D. You’re right, it’s an oversimplification. The main point is that there is a variety of journal types, some with long cycles others with short cycles. I don’t really know the reason for this variety.
You might be interested in some analysis of the recent furor over the Reinhart and Rogoff Excel error. They have been going around claiming the work was published in an “AER paper”, when in reality it was a summary article of a talk, not given very thorough peer review: http://andrewgelman.com/2013/04/16/memo-to-reinhart-and-rogoff-i-think-its-best-to-admit-your-errors-and-go-on-from-there/#comment-144888
But to Dirk’s point, clearly it didn’t matter as the paper was widely cited because of its conclusions. Perhaps a full journal paper, with meaningful peer review (and accompanying data!) would have revealed this error.
I always say: this is what happened because Computer Science came of age around the same time as the 747. But, really, I think this comes down to politics.
First, most universities are capable of recognising research across disciplines as different as law, fine arts, music, and science, with widely different publication patterns!
20 years ago, when CS was “cool” and growing rapidly, Deans & tenure & promotions committees had good reasons to retain & reward CS people. Over the last ten years, perhaps not so much. If a Dean wants to build CS, they’ll pay attention to what we say about publication in the discipline: if they don’t, it won’t matter if we publish in journals or conferences – they’ll be able to find all sorts of reasons, “true” or not, to avoid hiring or promoting CS people. If we publish in conferences, they’ll say conferences are useless; if we stop conferences and publish in journals they’ll say everyone knows CS journals are rubbish – or are just newly started – and the real action is in the conferences. Personally, I think there’s no comparison in the quality of a paper, or the quality of the reviews received, between a 2- or 4-page article in Physical Review – where authors spend days tweaking latex and deleting paper titles and author names from citations to get into the page limit, and get two reviews likely done by graduate students anyway – and a 15-20 page OOPSLA paper. Waving a couple of those at the physicists is one way to argue our publications are substantial!
With apparently 30% increases in CS enrolments, and people realising Google is at least as influential as say Genentech, perhaps more, maybe the pendulum will swing back our way.
A couple of comments on the actual post:
The only problem with the batch model is that we have just one cycle per year, and that slows things down
Not really. Most subdisciplines have two or three conferences, often well spaced. OOPSLA and ECOOP. ICSE and ESEC. POPL and ESOP. SIGGRAPH and EuroGraphics and SIGGRAPH Asia. The A in ACM doesn’t stand for American, and if US academics can only get academic credit by publishing in the US then we are all the poorer for it!
A related innovation here – between ECOOP & OOPSLA in the last year or so – is the opportunity for authors to submit a statement describing the revisions made to a paper previously submitted to (and rejected from) a top-tier conference. Frankly, there aren’t many journals with two or three review cycles per year, nor with reviews of the quality ECOOP provides. At that point the models become almost indistinguishable.
Once that is in place, we can then have the discussion of what to call them: conference proceedings or journals?
The biggest irony here is that almost all our conferences – or rather their proceedings – are bibliographically either journal articles or book chapters. Anything in LNCS is a book chapter: no question about that. We have a convention of citing ECOOP as “ECOOP” – we all know what that means – but citing an ECOOP publication as a chapter in a book called Object-Oriented Programming is just as correct. Every OOPSLA, PLDI, and POPL paper is – and always has been – an article in one of the oldest, most established and highest impact journals in computer science, one with at least twice the aggregate citations of TOPLAS: SIGPLAN Notices. Sure, some articles in Notices used to be unreviewed (e.g. selected by editors), but that’s true of many scientific journals – and in Notices, even that has stopped recently.
It would be quite correct for most CS CVs to list every conference paper either as a book chapter or as a journal article – and give a separate list of “conference presentations” listing the actual talks – and that has been the case almost since the foundation of the field. That’s why I think these turf issues have more to do with institutional or interdisciplinary politics than any “objective” standards of How Science Should Be Done.
Hi James. A while ago I talked to some of the members of the ACM Pubs Board and asked them how the information flows from the ACM to Thomson — the company that computes the “official” impact factor of scientific publications. The answer was “with difficulty”… But besides that, the ACM only considers journals/transactions as worthy of Thomson’s citation index for journals. Thomson has a separate index for conference proceedings, and ACM has tried to feed the conference info into that — not entirely successfully, from what they said. In any case, that would send our conference proceedings to the pot of other fields’ conference proceedings, which are completely different from ours. I believe the Notices series isn’t even on the Pubs Board’s radar, because it is produced by the SIGs.
The ACM Pubs Board is making internal decisions about what is an archival publication and what isn’t [within the ACM], and it seems that those decisions are somewhat at odds with how the community actually sees their publications. This is the observation that I think is missing from this debate. If the Pubs Board would open up their policy regarding what is an archival publication within the ACM, we wouldn’t be having this discussion.