Thursday, October 30, 2008

Don Batory on Category Theory in SPL and MDE

At MODELS 2008 Don Batory gave a talk on how category theory relates to Software Product Lines (SPL) and to Model Driven Engineering (MDE). There's a good paper covering most of the talk so I won't go into much detail here. But let me try to convey the aspects I found the most interesting.

I don't want to pretend I know much about category theory but I believe that one doesn't need to know much more than that it is a bunch of points and a bunch of arrows between them to dig what Don has on his mind (let's see how many categorists read this).

MDE is about refinement: going from highly abstract models, one gradually gets to an implementation via models with increasing level of refinement. Don presented an example of a process which derives Bytecode from a DSL via a more specific DSL and Java code. So here we got:

DSL1 → DSL2 → Java → Bytecode

Now to get a category, consider program representations as points and transformations as arrows between them. Note that identity transformations and composition of transformations exist. However non-groundbreaking this might seem at first sight, it is important to realize that such diagram is an important concept in MDE and it might be worth including such as an artifact in software development (the paper points to relevant work on megamodels and on tool chain diagrams).

Ok, MDE development chain is a category, now what about SPL? A software product line is a means of developing and maintaining a family of similar software artifacts (typically programs). As reuse is the key, these family members are compositions of features (some might want to call them aspects at the software level).

So where's the category hidden in this setup? Imagine programs of a product line as points and features as arrows between them. So a feature is a delta adding functionality to a program which gets us to another program.

As a particular software product line considers only certain compositions of features, a feature model is used to determine which compositions of features, also known as arrows, are valid (or are considered if you wish).

Now this really gets groovy if we combine MDE (refinement) and SPL (program synthesis). Imagine you got a software product line which we want to refine into another product line. Categorically speaking, we need to transform not only points to points but also arrows to arrows — features of the first product line to the features of the more refined one. In effect, a product-line is a directed graph (category), and what we are doing in refining a product line is mapping a directed graph (category) to another directed graph (category).

So let's say we have a product line S in which the program s1 is mapped to the program s2 by the feature fS. Further, the product line S is refined to a product line B in which b1 is mapped to b2 by fB. Now if we know that b1 is a refinement of s1 and b2 a refinement of s2, we obtain the following arrows:

The cool think is that this should commute: If I want to get from s1 to b2, I can either go via b2 and use →fB or, I can go to s2 via →fS and then refine to b2. If the term commuting diagram popped out in your head, it's not by accident (if it hasn't, the diagram above is a commuting diagram).

How's that useful? Don describes two scenarios, one when testing (verifying) this commutativity led to a discovery of bugs in his software and one where taking one route was computationally more efficient than taking the other. So the morale is: "Look for commuting diagrams."

To conclude let me advertise my own work (done with Goetz Botterweck), which investigates the refinement of feature models and I think it would be interesting to investigate how it relates to the concepts by Don. And I encourage anyone who's dabbling or working with Software Engineering to read this article as it's a fruitful ground for ideas and one doesn't need to be scared of math as there's only little of it.

Monday, October 13, 2008

Jeff Kramer on Abstraction

Here are some notes that I've taken down from a keynote speech given by Jeff Kramer at MODELS 2008. The keynote was about how and if to teach abstraction and how abstraction is important for SW development and design. Some students just do not understand clearly certain concepts (concurrent algorithms, modeling) whereas some students are really good. What is the difference between the good and unable students. The answer is abstraction. What is abstraction?
  1. remove detail (simplify) and focus (selection)
  2. finding commonalities
Jeff exemplified abstraction on metro maps. London was the first one to use abstraction to depict the routes, i.e., not adhering to the geographical parameters. Metro maps around the world use this abstraction now. One should keep in mind that each abstraction is fit only for certain purpose. Don't use the abstract metro map instead of a normal map when you are walking around the city. Software is abstract, intangible, by its nature. Hence abstraction is important in its design. This leads us to the following Question: How do we obtain the skill of abstraction? Can we teach that? Answer: It's part of our genes but, no doubt, can be improved by teaching. According to Jean Piaget's cognitive stages, abstractions skills develop between the age of 12 and adulthood. Bad news: abstraction skills reached only by 30% of people. What do we do to improve this?
  • teach enough mathematics
  • teach modeling and analysis, must be tool supported, students must feel the benefit
In the second half of the presentation, Jeff gave a demo of some examples and tools that he's using in his course on concurrent systems. A formal language is used to describe concurrent systems and these can be further analyzed, animated, etc. in a LTS Analyzer. See Jeff's book for more information. An example that I found interesting was how to use deadlock detection to solve the Fifteen puzzle. And a small problem at the end. Let's have the famous N dining philosophers problem. An even-numbered philosopher will always first pick the fork on the left and then the right one whenever he wants to eat. An odd-numbered philosopher will pick the one on the right first. By picking I mean that the philosopher waits if the fork is in use. For which N this strategy does not deadlock?