Don Batory on Category Theory in SPL and MDE
At MODELS 2008 Don Batory gave a talk on how category theory relates to Software Product Lines (SPL) and to Model Driven Engineering (MDE). There's a good paper covering most of the talk so I won't go into much detail here. But let me try to convey the aspects I found the most interesting.
I don't want to pretend I know much about category theory but I believe that one doesn't need to know much more than that it is a bunch of points and a bunch of arrows between them to dig what Don has on his mind (let's see how many categorists read this).
MDE is about refinement: going from highly abstract models, one gradually gets to an implementation via models with increasing level of refinement. Don presented an example of a process which derives Bytecode from a DSL via a more specific DSL and Java code. So here we got:
DSL1 → DSL2 → Java → Bytecode
Now to get a category, consider program representations as points and transformations as arrows between them. Note that identity transformations and composition of transformations exist. However non-groundbreaking this might seem at first sight, it is important to realize that such diagram is an important concept in MDE and it might be worth including such as an artifact in software development (the paper points to relevant work on megamodels and on tool chain diagrams).
Ok, MDE development chain is a category, now what about SPL? A software product line is a means of developing and maintaining a family of similar software artifacts (typically programs). As reuse is the key, these family members are compositions of features (some might want to call them aspects at the software level).
So where's the category hidden in this setup? Imagine programs of a product line as points and features as arrows between them. So a feature is a delta adding functionality to a program which gets us to another program.
As a particular software product line considers only certain compositions of features, a feature model is used to determine which compositions of features, also known as arrows, are valid (or are considered if you wish).
Now this really gets groovy if we combine MDE (refinement) and SPL (program synthesis). Imagine you got a software product line which we want to refine into another product line. Categorically speaking, we need to transform not only points to points but also arrows to arrows — features of the first product line to the features of the more refined one. In effect, a product-line is a directed graph (category), and what we are doing in refining a product line is mapping a directed graph (category) to another directed graph (category).
So let's say we have a product line S in which the program s1 is mapped to the program s2 by the feature fS. Further, the product line S is refined to a product line B in which b1 is mapped to b2 by fB. Now if we know that b1 is a refinement of s1 and b2 a refinement of s2, we obtain the following arrows:
The cool think is that this should commute: If I want to get from s1 to b2, I can either go via b2 and use →fB or, I can go to s2 via →fS and then refine to b2. If the term commuting diagram popped out in your head, it's not by accident (if it hasn't, the diagram above is a commuting diagram).
How's that useful? Don describes two scenarios, one when testing (verifying) this commutativity led to a discovery of bugs in his software and one where taking one route was computationally more efficient than taking the other. So the morale is: "Look for commuting diagrams."
To conclude let me advertise my own work (done with Goetz Botterweck), which investigates the refinement of feature models and I think it would be interesting to investigate how it relates to the concepts by Don. And I encourage anyone who's dabbling or working with Software Engineering to read this article as it's a fruitful ground for ideas and one doesn't need to be scared of math as there's only little of it.