Sunday, November 09, 2008

Separation logic

A good way to learn separation logic is to read Reynolds’s introductory lecture notes. This is a little bit like saying that a good way to learn programming is to read Knuth’s books. If you are the type of person that likes using iPhones and searching with Google you might need a bit more to whet your appetite.

Separation logic was designed to allow local reasoning. ‘Local’ is a generalization of ‘modular’. You can say ‘local’ when you have some sort of distance and you can say “a is close to b but far from c.” We reason locally when all the stuff that we need to keep in our head at one time is ‘close’. ‘Modular’ is simply ‘local’ specialized with program text distance. The typical distance people have in mind when they talk about separation logic is “the number of pointer indirections I need to do to get from location x to location y.” When you think about a linked list you tend to ignore other linked lists that are at distance ∞. Good, so the central concept of separation logic is locality; or separation, depending on how you look at it.

A logic is a very simple language with very precise semantics. To describe the logic (that is, the one most people know) you need to introduce operators and describe what they mean (for example, p ∧ q evaluates to ⊤ when both p and q evaluate to ⊤); in other words, you introduce a piece of syntax and then you give its semantics. Thinking in terms of semantics is not always efficient. Sometimes you need to get in the symbol crunching mood. That’s why you’d typically present rules for manipulating symbols that are sound (for example, p ∧ (q ∨ r) can be replaced by (p ∧ q) ∨ (p ∧ r)).
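The soundness of such a rule can be checked mechanically. A quick brute-force check in Python, over all eight truth assignments:

```python
from itertools import product

# Check that p ∧ (q ∨ r) and (p ∧ q) ∨ (p ∧ r) agree on every assignment,
# which is exactly what makes the symbol-crunching rule sound.
for p, q, r in product([False, True], repeat=3):
    assert (p and (q or r)) == ((p and q) or (p and r))
```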

There are four pieces of syntax that are new in separation logic: the separating conjunction p ⋆ q, the points-to operator x ↦ v, the separating implication aka the magic wand p −⋆ q, and emp. The semantics are described in terms of a somewhat more complicated model: no longer just booleans plus some abstract domain. Still, the model will be familiar to any programmer. We have addresses, values, a stack (inappropriately named ‘store’ by the old-timer Reynolds), and a heap. (BTW, there’s nothing wrong with ‘store’ per se. It’s just that most programmers prefer the ambiguous terms ‘stack’ and ‘heap’.) So,

  • p ⋆ q holds in a heap that can be split into a part where p holds and a part where q holds,
  • x ↦ v holds in a heap that has exactly one cell whose address is x and whose content is v,
  • p −⋆ q holds in a heap that can be extended with a heap where p holds to get a heap where q holds,
  • emp holds when the heap is empty.
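These semantics are concrete enough to execute. Here is a small model in Python, a sketch under my own naming (emp, points_to, sep, wand are not from any library): a heap is a dict from addresses to values, and an assertion is a predicate on heaps. The one cheat is that the −⋆ check quantifies only over a finite universe of candidate heaps.

```python
from itertools import combinations

# A heap is a dict from addresses to values; an assertion is a predicate on heaps.

def emp(h):
    return len(h) == 0

def points_to(x, v):
    # x ↦ v: the heap has exactly one cell, at address x, holding v
    return lambda h: h == {x: v}

def sep(p, q):
    # p ⋆ q: the heap can be split into disjoint parts satisfying p and q
    def holds(h):
        return any(p({a: h[a] for a in part}) and
                   q({a: h[a] for a in h if a not in part})
                   for r in range(len(h) + 1)
                   for part in combinations(list(h), r))
    return holds

def wand(p, q, universe):
    # p −⋆ q: extending the heap with any disjoint heap satisfying p yields
    # a heap satisfying q (checked over a finite universe of candidate heaps)
    def holds(h):
        return all(q({**h, **h1})
                   for h1 in universe
                   if set(h1).isdisjoint(h) and p(h1))
    return holds

h = {1: 10, 2: 20}
assert sep(points_to(1, 10), points_to(2, 20))(h)   # splits as {1:10} ⋆ {2:20}
assert not sep(points_to(1, 10), emp)(h)            # the leftover cell 2 falsifies it
assert emp({})
assert wand(points_to(2, 20),
            sep(points_to(1, 10), points_to(2, 20)),
            [{}, {2: 20}])({1: 10})                 # adding {2:20} to {1:10} gives both cells
```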

Once we introduce these new operators we also get new laws, such as:

  • from p1 ⇒ p2 and q1 ⇒ q2 we may conclude p1 ⋆ q1 ⇒ p2 ⋆ q2,
  • p1 ⋆ p2 ⇒ p3 can be replaced by p1 ⇒ (p2 −⋆ p3).
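The second law (the currying-like adjunction between ⋆ and −⋆) can be sanity-checked by brute force over a tiny universe of heaps. The predicates p1, p2, p3 below are arbitrary picks, just to exercise one instance:

```python
from itertools import combinations, product

# All heaps over two addresses and two values.
ADDRS, VALS = (1, 2), (0, 1)
HEAPS = [dict(zip(sub, vals))
         for r in range(3)
         for sub in combinations(ADDRS, r)
         for vals in product(VALS, repeat=r)]

def sep(p, q, h):
    # p ⋆ q holds in h if h splits into disjoint parts satisfying p and q
    return any(p({a: h[a] for a in part}) and
               q({a: h[a] for a in h if a not in part})
               for r in range(len(h) + 1)
               for part in combinations(list(h), r))

def wand(p, q, h):
    # p −⋆ q holds in h if every disjoint extension satisfying p yields q
    return all(q({**h, **h1})
               for h1 in HEAPS
               if set(h1).isdisjoint(h) and p(h1))

p1 = lambda h: h == {1: 0}            # 1 ↦ 0
p2 = lambda h: len(h) == 1            # a one-cell heap
p3 = lambda h: len(h) == 2            # a two-cell heap

lhs = all(not sep(p1, p2, h) or p3(h) for h in HEAPS)   # is p1 ⋆ p2 ⇒ p3 valid?
rhs = all(not p1(h) or wand(p2, p3, h) for h in HEAPS)  # is p1 ⇒ (p2 −⋆ p3) valid?
assert lhs == rhs   # the law says the two validities always agree
```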

In fact there are classes of assertions for which extra rules can be used.

  • pure assertions are those that don’t say anything about the heap, so they can be pulled out of the new operators; for example, (p ∧ q) ⋆ r can be replaced by p ∧ (q ⋆ r) if p is pure,
  • strictly exact assertions are those that uniquely identify the heap,
  • precise assertions are those that identify a unique subheap,
  • intuitionistic assertions are those that continue to hold in any extended heap,
  • supported assertions are those for which there is a least subheap on which they hold.
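The pure-assertion rule, for example, can be tested on small heaps. The particular q and r below are arbitrary heap-dependent assertions; a pure p is one that answers the same on every heap:

```python
from itertools import combinations

def sep(p, q):
    # p ⋆ q: the heap splits into disjoint parts satisfying p and q
    def holds(h):
        return any(p({a: h[a] for a in part}) and
                   q({a: h[a] for a in h if a not in part})
                   for r in range(len(h) + 1)
                   for part in combinations(list(h), r))
    return holds

one_cell = lambda h: len(h) == 1                     # heap-dependent
all_pos  = lambda h: all(v > 0 for v in h.values())  # heap-dependent

# A pure assertion ignores the heap, so (p ∧ q) ⋆ r agrees with p ∧ (q ⋆ r).
for pv in (True, False):
    p = lambda h, pv=pv: pv                          # pure: same answer everywhere
    for h in ({}, {1: 10}, {1: 10, 2: 20}, {1: -1, 2: 20}):
        assert (sep(lambda h1: p(h1) and one_cell(h1), all_pos)(h)
                == (p(h) and sep(one_cell, all_pos)(h)))
```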

Now we can annotate programs with descriptions of the state at various intermediate points. If we also have rules on how the state is changed by simple commands we can then derive formal proofs of programs, meaning that the annotations can be shown to agree with the program by pure ‘symbol crunching’, something even a computer should be able to do. People who write such programs have a knack for funny names: Space Invader and Smallfoot. Well, perhaps not all people who write such tools. There is also the Unnamed and Unreleased tool from Matthew Parkinson that handles Java code. Edit: Matthew pointed out that the tool has a name: jStar. (Please write a comment if another tool should be mentioned here.)

The interesting algorithms proved correct (in the sense explained above) using separation logic are Schorr–Waite and Michael’s stack. (Please write a comment if another algorithm should be mentioned here.)

Let me briefly present these two algorithms.

The Schorr–Waite algorithm is used to reduce memory usage during garbage collection. For our purposes ‘garbage collection’ simply means that we want to mark all nodes reachable from a given root. The simple solution is to do a DFS. A simple implementation would use a stack of pointers, which may grow in the worst case to be linear in the number of nodes in the graph. The trick Schorr and Waite use is to store the backpointers in the link fields of the graph itself.
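Here is a sketch of the pointer-reversal marking in Python, for nodes with two link fields (the field names are my own; this is the textbook Schorr–Waite scheme, not any particular collector). The path back to the root lives in the reversed child pointers, plus one flag per node recording which pointer is currently reversed:

```python
class Node:
    def __init__(self, left=None, right=None):
        self.left, self.right = left, right
        self.marked = False   # set on first visit
        self.flag = False     # False: left link is reversed; True: the right one is

def schorr_waite(root):
    # Mark everything reachable from root using O(1) extra space.
    t, p = root, None         # t: current node; p: top of the reversed-pointer path
    while p is not None or (t is not None and not t.marked):
        if t is None or t.marked:
            if p.flag:                # right subtree done: restore link, retreat up
                up = p.right
                p.right = t
                t, p = p, up
            else:                     # left subtree done: swing to the right child
                nxt = p.right
                p.right = p.left      # the right link now points back up
                p.left = t            # the left link is restored
                p.flag = True
                t = nxt
        else:                         # first visit: mark and descend left
            t.marked = True
            t.flag = False
            nxt = t.left
            t.left = p                # the left link now points back up
            p, t = t, nxt

# A small graph with sharing and a cycle.
a, b, c = Node(), Node(), Node()
a.left, a.right = b, c
b.right = c
c.left = a                            # cycle back to the root
schorr_waite(a)
assert a.marked and b.marked and c.marked
# All links are restored after marking.
assert (a.left, a.right, b.left, b.right, c.left, c.right) == (b, c, None, c, a, None)
```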

Michael’s stack is lock-free and works in the presence of memory reclamation. You want to implement push(stack, node) and pop(stack), where stack is a pointer identifying the stack. These functions will be used from multiple threads, and the only synchronization primitive you are allowed to use is compare-and-swap. To push a node you would probably read old := stack->next, set node->next := old, and then, if stack->next still equals old, set stack->next := node. The last step is the compare-and-swap: “if the first element of the stack is still what it was when we looked the first time, then make node the new first element.” The problem with this is that “the same first element” doesn’t necessarily mean “the next pointer of the first element is the same.” No one is supposed to change the next pointer apart from push and pop, but it may very well be that the node is freed, a ‘new’ allocated node happens to occupy the same area in memory, and then there is nothing holding back the change of the next field. (This failure is known as the ABA problem.) That is, unless you keep an array of addresses that shouldn’t be touched, as in the solution given by Michael: hazard pointers.
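The failure is easiest to see in a single-threaded simulation where the two threads’ steps are interleaved by hand. Everything below (mem, cas, pop_prefix, the addresses) is a made-up model, not Michael’s actual code; the key point is that cas compares raw addresses, exactly like the hardware instruction, so it cannot tell a recycled address from the original node:

```python
mem = {}      # simulated memory: address -> {'val': ..., 'next': address or None}
top = [None]  # the stack's top-of-stack cell (a one-element list, so cas can mutate it)

def cas(cell, expected, new):
    # compare-and-swap on raw addresses, as the hardware does it
    if cell[0] == expected:
        cell[0] = new
        return True
    return False

def push(addr):
    while True:
        old = top[0]
        mem[addr]['next'] = old
        if cas(top, old, addr):
            return

def pop_prefix():
    # the first half of pop: read the top address and its next pointer
    old = top[0]
    return old, mem[old]['next']

# Build a stack: top -> A (at address 200) -> B (at address 100).
mem[100] = {'val': 'B', 'next': None}
mem[200] = {'val': 'A', 'next': 100}
top[0] = 200

# Thread 1 starts a pop: it reads top = 200 and next = 100, then stalls.
old, nxt = pop_prefix()

# Meanwhile thread 2 pops A and B, frees them, and pushes a new node that
# happens to be allocated at the recycled address 200 -- with next = None.
top[0] = 100; top[0] = None          # pop A, then pop B
del mem[100], mem[200]               # free both nodes
mem[200] = {'val': 'C', 'next': None}
push(200)                            # address 200 is reused

# Thread 1 resumes. Its CAS succeeds, because the address matches (200 == 200),
# and silently installs the stale next pointer 100 -- a dangling address.
assert cas(top, old, nxt)
assert top[0] == 100 and 100 not in mem   # the stack now points at freed memory
```

Hazard pointers fix this by having thread 1 publish “I am looking at 200” before reading it, so thread 2 would not have reused that address.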

That’s it for now. Oh, and sorry if you were under the impression that I’m trying to ‘whet your appetite’. I was merely trying to draw a child’s sketch of separation logic.



Anonymous ilham said...

Hi, rgrig!

Thanks for your short introduction to separation logic. If only I had found this post earlier... I've been battling for the last couple of weeks with the extremely clear in-place reversal example given in Reynolds's paper. (Only yesterday did I understand how it works, after a drop of a hint from ardiankp)



27 August, 2009 15:16  
