Sunday, January 03, 2010

Unshipped Software Does Not Exist

In much of computer science, at least the "systems" variety, an enormous amount of effort is spent designing, developing, and experimenting with software systems. That is, we write programs to make our new ideas concrete, show off our inventions, and validate our claims.
In the world of hard science, this engineering, albeit of the software kind, is more akin to experimental science than theoretical. We are like the physicists who build super-colliders, smash together atoms, and measure the results to validate or invalidate hypotheses posed by ourselves or others.
Despite what the theoretician might tell you, developing a complex software system is non-trivial and is more than "just a matter of engineering." It is often an incredibly complicated endeavor that continuously opens up (and sometimes closes) new research doors, most of which we never publish. In my experience, merely shipping a complex software system takes about the same amount of time as writing a conference paper.
And thus we have our dilemma: we have an experimental science, systems-based computer science, whose sole output, for the vast majority of researchers, is the twelve-page conference paper, in which only the Smallest Publishable Unit (SPU) of the work is described.
Nowhere is the full software system described so that others can replicate the "experiment." Though we are a discipline that thrives on abstraction, you essentially never see a full, or even partial, specification of a research software system.
And obtaining a copy of the concrete system designed and built by researchers over many man-months? Forget it. It is always "not quite done" or "needs to be cleaned up." Or perhaps it is "pending an IP review" by a technology transfer office. Heck, some researchers simply do not answer their emails or return phone calls when I ask them for a copy of their system!
More often, the reason scientists do not ship is more pragmatic and more cynical. Shipping software is simply not directly rewarded at the vast majority of universities. Tenure reviews and promotion panels sometimes even state that developing and shipping software is a waste of time, time better spent on writing peer-reviewed papers.
In my view, this situation is untenable and this behavior is unforgivable. This is not legitimate science or engineering.
If you do not ship a research software system, it does not exist.
As in the physical sciences, where one cannot publish a paper unless the experiment is described in excruciating detail and where data is often made publicly available, I believe that one should not be permitted to publish results based upon an unshipped and undescribed system.
When I review research papers that discuss results coupled to software systems, the first thing I search for in the PDF is "http." If I cannot find a mention of how and where to download the system in question, my warning bells go off. If a Google search turns up nada, I reject the paper, as simple as that. Hollow promises of shipping after publication or at some later date are ignored, as they are so often unfulfilled.
I also know from personal experience how rewarding developing and shipping a software system can be.
You are opening your heart and head to the world by showing everyone exactly what you are made of. Sure, you may have fewer papers than some competitors, but your limited time budget for writing means that you must more tightly prioritize writing goals and publication targets.
The notion of SPU goes out the window as you want to put as much into each research paper as you can fit, rather than as little as will be accepted for publication.
Finally, if you develop systems that are useful and usable, you gain an audience of industry and academic users that is at least as large as the number of people who would have read the conference paper or two you did not write, and typically orders of magnitude larger.
My advice to the young Computer Science PhD student? Ship your software; you won't regret it.



Dermot Cochran said...

I have noticed that the ESC/Java2 documentation appears on Google Scholar and gets citations, so that's one software release that might be counted as a kind of publication.

22 January, 2010 10:37  
