Saturday, January 02, 2010

RSpec

I enjoy discovering when old, good ideas from the research community eventually trickle out into common practice, but sometimes what you find surprises you.
For example, contracts are a great idea that should see wider use, especially in languages that have provided assertions from day one. But after you discover the nth framework or tool for providing contracts in a language like C++ or Python, only to find that it does not support inheritance or visibility, you get a little deflated.
This week I accidentally came across RSpec from the Ruby community. It caught my eye initially because it is described as a "behavior-driven development tool," and one of its popular tutorials states on line one: "Behavior Driven Development is specifying how your application should work, rather than verifying that it works." That sounds like a framework for me. Moreover, since contracts are at the core of behavioral interface specification languages (BISLs) like the Java Modeling Language (JML), and I'm "one of the JML guys," I should get excited about RSpec, or at least learn something from it. But before I dig into RSpec, let's reflect upon Ruby a bit.
I learned Ruby when it had just "leaked" out of Japan many years ago. The only English documentation on the language then was a fragment of the API docs, so I had to learn the language by reading other people's code. I like Ruby. Whether purposefully or accidentally, it intelligently synthesizes some of the best ideas of Smalltalk and object-based languages with a prototype-based feel. By "prototype-based feel" I mean the feel of languages like JavaScript, Tcl, and, I'm told, several of the scripting languages that I mentioned yesterday, namely Io, Logtalk, Lua, Omega, and REBOL. My experience with these kinds of languages derives from the literature, namely Abadi and Cardelli's "A Theory of Objects" and papers about the Self language. The fact that Ruby gives you some access to its metaclass system, a la Smalltalk, through a relatively clean API (unlike, say, the horrid APIs of Python and Perl), is also compelling.
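(To make the metaclass point concrete, here is a tiny sketch of my own, not drawn from any tutorial: per-object methods via the singleton class, and programmatic method definition through the metaobject API.)

    class Dog; end

    # Attach a method to one object only, via its singleton (eigen)class.
    fido = Dog.new
    def fido.speak
      "Woof!"
    end

    # Or generate methods programmatically through the metaobject API.
    Dog.send(:define_method, :legs) { 4 }

    fido.speak              # => "Woof!"
    Dog.new.legs            # => 4
    fido.singleton_methods  # => [:speak] (strings, not symbols, on older Rubies)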
Consequently, I have written a few thousand lines of Ruby, including some of the server-side processing for my research group's website, and thought it would be nice to see a clean OO scripting language like Ruby catch on (as it has, in spades).
So, I hear you say: "Hey, a simple OO language with a clean metaobject framework is ripe for the application of dependable software engineering principles, Joe!" I would say you were right, so let's see what has happened in the world of Ruby... and so, back to RSpec.
The first thing to note is that, while the 'B' in "BDD" means "behavior," it is not "behavior" in the literal sense of BISLs, but rather the "behavior" of the "Agile" community. *sigh* This already starts to worry me, but let's not throw the baby out with the bathwater, because sometimes riding on the coattails of a populist movement like "agile programming" (or "aspects" or "Java," for that matter!) is just a smart mechanism to effect change.
The API and common use of RSpec guide a developer down the path of connecting informal English sentences, using modal verbs like "must" and "should," with code fragments that interpret the informal specification. Thus, "behavior" in this context means the informal, manual specification and linking of traditional requirements and hand-written unit tests.
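For a concrete flavor, here is a minimal sketch of such a specification, written by me in RSpec's "should" style; the Stack class is a hypothetical example of mine, not something from the RSpec documentation.

    require 'rspec'

    # A hypothetical class under specification.
    class Stack
      def initialize; @items = []; end
      def push(x);    @items.push(x); end
      def empty?;     @items.empty?;  end
    end

    describe Stack do
      # The English sentence is the informal specification; the block
      # body is the code fragment that interprets it.
      it "should be empty when created" do
        Stack.new.should be_empty
      end

      it "should not be empty after a push" do
        stack = Stack.new
        stack.push 42
        stack.should_not be_empty
      end
    end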
Now, anyone familiar with my work in BON and verification-centric development will know that I think codifying requirements, domain analysis, and features in structured English is a Good Thing. And we have been developing a formal refinement between informal specifications in English and formal artifacts like requirements, concepts, tests, types, and assertions (look for a paper on this in 2010). So the juxtaposition of English and code is unsurprising to me.
The codification of assertions in the API is also interesting. Methods like "should" and "should_not" are akin to JUnit methods like "assertTrue" and "assertFalse," though they fit better with the vernacular of the domain. Permitting the definition of pre- and postconditions of unit tests via "before" and "after" methods, akin to aspects and straight from the world of CLOS and the MOP, is also nice to see. There is also integrated support for mock objects, and the use of lambda expressions to talk about the pre-state of a method call is cute as well.
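To illustrate these pieces together, here is another sketch of mine (a hypothetical Account class, same "should"-era syntax):

    require 'rspec'

    # A hypothetical class under specification.
    class Account
      attr_reader :balance
      attr_accessor :audit_log
      def initialize(balance); @balance = balance; end
      def withdraw(amount)
        @balance -= amount
        audit_log.record(:withdraw, amount) if audit_log
      end
    end

    describe Account do
      before(:each) { @account = Account.new(100) }  # runs before each example
      after(:each)  { @account = nil }               # runs after each example

      it "should debit the balance on withdrawal" do
        # The lambda wraps the action; `change` compares the pre-state
        # of #balance with its post-state.
        lambda { @account.withdraw(25) }.should change(@account, :balance).by(-25)
      end

      it "should record each withdrawal in its audit log" do
        log = mock("audit log")                      # a mock collaborator
        log.should_receive(:record).with(:withdraw, 25)
        @account.audit_log = log
        @account.withdraw(25)
      end
    end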
So in the end, I think RSpec is a pretty nice framework for specifying the behavior of Ruby code, but only if you are willing to accept the fundamental testing premise of agile programming: that hand-written unit tests should specify the behavior of a system. My criticisms of this approach are not new. Hand-written executable tests are only maintainable at high cost and are expensive to write early if one does not have (1) a fairly solid understanding of a domain and (2) pleasant customers who do not change requirements all of the time.
In other words, I am still unconvinced that, in the key areas where agile programming is supposed to shine, its fundamental tenet, test-driven development, holds true. If you are an agile practitioner and have evidence for this claim, please speak up!
I will write more on RSpec later this year after I get a chance to really take it for a test drive.


7 Comments:

Blogger Marnen said...

Interesting post; I agreed with most of it. However, your assertion that unit tests are "expensive to write early" is only true if you're writing tests for the whole system at once, before coding any of it. But that's not how I or any developer I know does it.

Rather, the point of this style of development is that you write *one* small test, watch it fail, then make it pass and move on to the next, refactoring as you go. This way, you can have emergent design without sacrificing test coverage.

In other words, you only test the little bit you know about, then you move on to the next little bit. If the client changes the requirements, no problem -- you write a different test.

16 October, 2011 17:44  
Blogger Joe Kiniry said...

Hi @Marnen. Thanks for your post.

You'll note that my statement is not quite what you quoted.

It is instead, "Hand-written executable tests are only maintainable at high cost and are expensive to write early if one does not have (1) a fairly solid understanding of a domain and (2) pleasant customers who do not change requirements all of the time."

The cost of one test is, roughly, the cost of authoring, tracing, and maintaining the test.

From what I gather from my research colleagues in the "agile" and "testing" subfields, objective analysis has shown this strategy to be no more cost-effective than previous testing strategies, and much less cost-effective than using declarative specifications.

Moreover, speaking from personal experience (I have really tried both approaches for years), I can safely say that I never, ever want to write tests first and either throw them away or babysit them when requirements are changing. Instead, I'd much rather write a declarative spec that is short, focused, and to the point, and that encapsulates the behaviors of dozens or hundreds of tests.

That being said, having a test-first process at least guarantees that there are tests, which is better than what happens on most projects, which schedule testing last and then cut the whole phase when schedules slip.

17 October, 2011 08:42  
Blogger Marnen Laibow-Koser said...

Are the declarative specs executable? If not, they can't be self-verifying, and so you lose one of the benefits of something like RSpec. Also, I think there's a point to be made that if you only focus on what you think the system does, you introduce confirmation bias. Ideally, executable tests make sure that the system actually does what you think it does.

I'm not sure where your distaste for maintaining tests comes from, but I would think it would be an issue no matter when you write the tests. If the desired behavior of the system changes (and of course it will), the tests need to change with it. Also, if you don't write the tests first, then you're introducing untested code into the project, and I think we would both agree that that's bad. How do you deal with this problem?

Finally, unless I misunderstand what you mean by declarative specs (which is certainly possible), I don't see how you can do any sort of iterative design with them, and I certainly don't see how they help refactoring (which absolutely needs executable tests to be reliable). Iteration and refactoring are important, because the system almost never looks the way you think it will at the start. I don't see how declarative specs give you the flexibility necessary for these necessary changes to occur. What am I missing?

29 September, 2014 00:25  
Blogger Joe Kiniry said...

Hi @Marnen,

Most of our declarative specifications are executable, either in symbolic interpretation or via runtime verification. All of them are verifiable via static analysis, either automatically using extended static checking, or manually via theorem proving.

In other words, one can write a small volume of declarative specifications to generate a large volume of executable specifications, aka tests.
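By way of illustration only, here is a hand-rolled sketch in plain Ruby (not JML, and not any particular contract library) of how a single pre/postcondition pair can stand in for a large family of hand-written tests: exercising the contract over many inputs is effectively test generation.

    # Integer square root with an executable contract.
    def sqrt_int(n)
      raise ArgumentError, "precondition: n >= 0" unless n >= 0  # precondition
      r = Integer(Math.sqrt(n))
      # Postcondition: r is the integer square root of n.
      raise "postcondition violated for #{n}" unless r * r <= n && n < (r + 1) * (r + 1)
      r
    end

    # One declarative spec, thousands of checked cases.
    (0..10_000).each { |n| sqrt_int(n) }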

My distaste for maintaining tests is solely driven through experience in doing very large-scale TDD and very large-scale DBC. Our teams witness remarkable productivity and quality using DBC, so much so that TDD simply doesn't happen unless forced by a client or by a library or tool on which we depend.

When requirements evolve, or a system is refactored for other reasons, certainly the specifications must evolve as well. But refactoring a small amount of declarative specs (which, recall, are useful as documentation, for execution/validation, as well as for static analysis/verification) is much easier than refactoring large amounts of hand-written test code.

Finally, because we do DBC, we do not add untested code to projects. After all, we write the specs before we write the code. Moreover, we are able to perform static analysis on the architecture and its specifications even before the body of routines is written. I.e., we get quality feedback on the design of the system before we waste time writing code.

02 October, 2014 22:27  
Blogger Marnen Laibow-Koser said...

Perhaps I don't understand what you're doing, but your first couple of paragraphs seem to contradict each other. The first says that your declarative specs are executable, whereas the second says that the declarative specs generate executable tests. Perhaps I'm splitting hairs, but those seem like significantly different claims.

Also, from your last paragraph, it seems like you're testing the architecture rather than the code. Static analysis in advance of code is IMHO a smoke test, not "quality feedback". Writing code isn't a waste of time, it's how you know if your theoretically great architecture works in practice. Or do I misunderstand your process?

Also, wouldn't maintaining a contract through changing requirements be about as hard as maintaining tests?

I'll hold off on addressing the rest of your claims until I find out what your declarative specs are like, and where theorem proving fits in. Do you have some description of how this all works in practice? I'd really, really like to see what your specs and development process look like in this system.

19 November, 2014 16:13  
Blogger Joe Kiniry said...

My claim about unit tests relates to any level of abstraction, from tiny methods/functions to whole-system tests. If your design is not done, you will be rewriting your tests. So whether you are refactoring the type signature of a method, or if you are changing the architecture of your system, you'll be chasing the dragon's tail of manually fixing unit tests. The volume of the fixes is at least linearly correlated with the size of the refactoring, but usually it is worse due to test dependencies that percolate across units.

I have witnessed your style and tried to apply it with great diligence to a number of projects. I also taught it to hundreds of students to let them experience it firsthand.

But I also taught them the alternatives and let them use those too, so they could reflect for themselves on which works better, given their project, team, technology, client, etc.

18 December, 2014 07:50  
Blogger Joe Kiniry said...

Also, with regards to your latter question, declarative specs are executable both symbolically and practically through compilation.

Evaluating the system's design, from the architecture level to the module level, via static analysis is important. We evaluate the design using a combination of old-fashioned expertise assisted by static analysis of architecture and design. Only after we green-light pieces of the architecture do we start to write code and (sub)system tests.

Changing architecture, design, and contracts when requirements change does happen. Happily, because we have a relation between our requirements and our design, we can easily determine exactly what needs to change (be it design, contracts, or subsystem tests that are either generated or manually written). Suffice it to say that this takes radically less time than other methodologies we have used over the years, especially agile techniques.

In fact, the only extra ingredient we have found that reduces impact scope, and thus reaction time to requirements changes, is writing the implementation in a pure language (either OO or FP).

The big picture description of how this works in practice is found in several of my research papers written with Dan Zimmerman. We are now using this rigorous engineering methodology for commercial development work and audit work at our new employer, Galois.

18 December, 2014 17:06  
