As I did with my first attempt at explaining π-calculus, I think that before getting into any of the deep semantics, it’s good to look at a few examples of things you can build with π-calculus. But before getting to the meat of the post, I’ll give you the answer the puzzle I left in the last post. What’s wrong with the example I gave for “Hello world” yesterday?
As a reminder, the question was: Assume “out” is the name for a channel that is read by a process that prints whatever it receives to the standard output. What’s wrong with the following process as an implementation of “Hello World”?
new(in).{ *(?in(x).!out(x).∅) | !in(hello).!in(world).∅ }
As one commenter correctly answered, there is no guarantee that “hello” will be printed before “world”. Let’s first do the two reduction in the non-replicated subprocess. Then we’ll end up with:
new(in).{ !out(hello).∅ | !out(world).∅ | *(?in(x).∅) }
Now, we can do a reduction with either of the two processes trying to send to out. There’s no guarantee that “!out(hello)
” will get reduced before “!out(world)
“.
The correct way to do it is to explicitly synchronize things: don’t let the message “world” get sent, until after you know “hello” has been printed out:
new(in,sync).{ *(?in(x).!out(x).!sync().∅) | !in(hello).?sync().!in(world).?sync().∅ }
I’m going to use the syntax I introduced in the previous post, with
one addition: named process definitions.
A named process definition is really just a shorthand. A recursive named-process
definition is really just sugar for a replication expression; and a non-recursive one is
obviously just shorthand for a textual replacement of the process name with the process
definition assigned to it. Similarly, when a process definition takes parameters, there’s a
translation of that into messages. I’m not going to explain the expansion of recursive
process definitions or parameters in this post – but I will come back to it in a
future post when we start getting into the detailed semantics. The way that I’ll write a
named process definition is “Name[params]≡Process
“; and for
clarity, I’ll always use names starting with an uppercase letter for processes, and names
starting with a lowercase character for channel names and variables.
One very useful thing to have if we’re going to write real programs using π-calculus is mutable storage cells. In terms of π-calculus, a mutable storage cell is a process
with two channels: one for reading the value of the cell, and one for writing a new value into the cell.
To update the cell, a client will send a message to the cell’s “write” channel containing the new value; to read the value of the cell, a client will receive a message from channel’s “read” channel. Based on that description, here’s a process definition which would be most peoples first attempt at defining Cell:
Cell[val]≡new(read,write).{?write(v).Cell[v] +!read(val).Cell[val]}
What that says is: “Cell” is a shorthand for a process expression which is parametric in a value, called “val”. Cell introduces its own read and write channels, and then either allows a client to read the value in the cell, or update the value in the cell. There’s a bit of trickiness for people who aren’t used to process calculi: writing to replace the value inside the cell is done by reading a message: ?write(v)
a client needs to send a message containing a value to update the cell; and for clients to read the value from the cell, the cell has to write to the read channel: !read(val) – sending a message to the client containing the value in the cell. “read” and “write” are from the perspective of the client; they’re backwards from the perspective of the cell.
This cell implementation is close, but it isn’t correct, and the reason why illustrates some of the subtleties of π-calculus. What’s wrong with it? There are problems with channel naming – and channel naming and the subtleties of managing them are at the heart of how things work in π-calculus.
- The “
new(read,write)
” creates two channels, scoped locally to the Cell process. That means that no one outside of the cell process can see them! They’re initially scoped to Cell, and there’s no method inside of Cell to send the channel name outside of its name-scope. So it’s a storage cell, but no one can ever actually read it or write to it. - The “new” that creates the channel is inside of the part that is invoked recursively. So each time that “Cell” is invoked, it creates new channel names. So even if some client had access to the channel names of the cell at some point in time, any time that the cell was read or written, it would allocate new names, and the client would no longer be able to access it.
To fix it, we need to do two things. First, we need to change the cell definition so that it only allocates its read and write channel names once, and then reuses them. And second, we need to provide some way of making those channel names available to a client.
To remove the allocation of the channel names from the recursive part of the process, we’ll split the process definition into two parts: one which will only be executed once, and one which executes recursively:
NewCell[val]≡new(read,write)Cell[read,write,val] Cell[read,write,val]≡{!read(val).Cell[read,write,val] + ?write(v).Cell[read,write,v]}
Fixing the other problem is also easy. We want to make the names of the
cell’s read and write channels available to a client. How do we do that in π-calculus? There’s only one way of moving values around in a π-calculus process: message passing. So when someone wants to create a cell, they’ll send a message to
a cell factory process containing an initial value for the cell, and the name of
a channel where the new cell’s read and write channel names should be sent:
NewCell[creator,initval]≡new(read,write).{Cell[read,write,initval] | !creator(read,write) } Cell[read,write,val]≡{!read(val).Cell[read,write,val] + ?write(v).Cell[read,write,v]}
To create and use a cell, the client process just invokes NewCell, passing it
a message containing a channel name where it wants to receive the read and write channels of the cell:
new(self).{ NewCell[self,0] | ?(cellread,cellwrite).(...) }
So, for example, we can write a process which allocates a cell, prints out its initial value, updates the cell value, and then prints out the new value like this:
new(self).{NewCell[self,0] | ?(cellread,cellwrite).?cellread(v).!out(v).!cellwrite(7).?cellread(w).!out(w)}
I find something disturbing about these
out
,read
andwrite
channels: I think they are impossible to use in the way you suggest, unless you make them primitives of the language. Because, the synchronization of ! guarantees only that the message has been received on the other end, not that it has been handled.So, you cannot define
read
orwrite
to work as suggested and be sure that their effects are applied in the right order, even with your suggested fix. And surely, storage cells can be defined in π-calculus without being introduced as primitives.For
out
, while you could expect that one to be primitive, you should also be able to define it in terms of an even more primitive channel, say afout
that took a file descriptor argument. And I don’t think you can do that.I think all these I/O/mutation channels need to work like
in
, and invoke async
channel to be tested by the calling process.This makes me think of the Haskell state monads, including the
RealWorld
passing trick some implementations (including ghc) use forIO
. I guess both π-calculus and Haskell need such things because their basic evaluation is unsynchronized.Orjan:
You are correct – I am handwaving my way past some problems here. But I think that the root of the problems that you’re talking about is that you’d like the synchronization behavior of the system to be defined purely in terms of the channel, when it’s actually dependent on the process.
So, for example, I don’t think that the “read” and “write” channels have the problem that you’re suggesting. The storage cell is synchronized, in the sense that a “read” can’t happen until after a write has finished: exactly one of read and write can be chosen, and then the choice isn’t accessable again until the selected read/write operation is complete. There is the problem that other processes can do a read operation on a cell’s “write” channel, and intercept a message intended for the cell – but that’s a scoping issue, which I’ll get to looking at later. The “read” and “write” channels aren’t defining a strict synchronization behavior that guarantees that the cell will behave correctly – but the cell process that uses those channels does.
WRT out, you’re absolutely correct – there is an issue there which I’m handwaving past. Eventually, we’ll get to an implementation of “out”. But you don’t need to add a return sync operation, like we did in the hello world example. The key again is that we aren’t just relying on the fact that there’s an out channel – we’re relying on the fact that there’s a process with a specific behavior listening for messages on the “out” channel – and that out process is behaving in a properly synchronous way.
It’s confusing because I haven’t yet shown the process that’s listening on “out”, but just assumed that there is a process and that it’s implemented to get the synchronization behavior correct.
I hate to be picky but I’d like to warn about some bugs in the definition above, that got me confused for some moments, and may confuse other readers as well.
You forgot Null a few times, and missed a channel name.
Example: !creator(read,write).0
?self(cellread,cellwrite)
I still have a question that I made on the previous post, and is now haunting me a little bit, along with some new questions that have just popped up.
There are some properties that seem obvious:
A|0 = A
(A|B)|C = A|(B|C)
(A+B)+C = A+(B+C)
a.(A+B) = a.A + a.B
and
a.(A|B) != a.A|a.B
P+0 != P
But there doesn’t seem to be a trivial way to derive this properties from the construction rules of the language defined previously.
Daniel:
I actually haven’t defined the π-calculus notion of process equality yet. When we get to that, we’ll be able to show why properties like the ones you mention either hold or fail.
For now, the short version is that two processes are equivalent if there’s no observable way to differentiate between them. a.(A|B)!=a.A|b.B because if A and B each send a message, and you send one “a” to a.(A|B), you’ll see (A|B) send two messages – one from A and one from B. Whereas sending one message on “a” to a.A|a.B, you’ll see either one message from “A” or one message from “B”.
Ouch, I nearly sent another message claiming you were still wrong, but then I saw the light just in time.
However, there is some subtlety. As I understand it now it strongly depends on the fact that P+Q cannot choose P or Q until it knows that the one chosen can be reduced.
However, to know this is more subtle than the current formulation of the choice rule:
If P→Q, then (P+R)→Q
Because what you actually want in this case is “If S|P->Q, then S|(P+R)->Q”.
This does not follow from the rule as written. The problem could be solved if you had an equivalence rule “S|(P+R) = (S|P)+(S|R)”.
I think that the equivalence rules of π-calculus are essential ingredients needed to make the real reduction rules work. For a silly example, you need commutativity of + to be able to conclude “If R→Q, then (P+R)→Q”.
Wait a minute, “S|(P+R) = (S|P)+(S|R)” would break the exact property I was trying to preserve. Because if S’->S, then
S’|(P+R) = (S’|P)+(S’|R) -> S|P
without even looking at the contents of P and R.
New conclusion: Finding the rules to make + work correctly is rather too subtle to do off the top of one’s head.
You’re right Mark, I was mixing up reduction and equivalence, which are different things.
But as Orjan as mentioned, we’ve had to implicitly use some equivalence rules already in order to do the reduction proofs for the examples you’ve shown.
I am still confused with one thing.. the reduction rule for choice:
“if P->Q then (P+R)->Q”
What happens to (P+R) if P->Q and R->S ?
Daniel:
I think there are two things combined in that question. One of them is caused by my laziness: I didn’t explicitly state that “+” is commutative. The other, I think, is a common misunderstanding that I tried to make clearer this time around, but I guess is still fuzzy. Reduction rules don’t say when a reduction must occur – they say when a reduction can occur.
So if you have (P+R) and P→Q and R→S, then you can get either Q or S – because you could reduce P to Q, selecting the P branch of the “+”, or you could reduce R to S, selecting the Q branch. Which one actually gets chosen is deliberately not specified by the reduction rules.
Does that help?
Daniel: P+R is non-deterministic if both parts can reduce.
So, I finally looked up π-calculus on Wikipedia to refresh my memory. As expected, “structural congruence” equivalence is fundamental.
Unfortunately the original version described there doesn’t include +, which is just mentioned as an extension. So I still don’t know the exact rules for it.
Yes, I understand that P+R is a non-deterministic choice.
What I was trying to say was how would we reduce it.
Let me give an example, supose we have the following process:
(!a(x).P + !a(x).Q) | ?a(y).0
We can have both of the following reductions
(!a(x).P + !a(x).Q) | ?a(y).0 -> P
(!a(x).P + !a(x).Q) | ?a(y).0 -> Q
If we want to know all the possible reductions from the initial process, then we will have to branch our reductions sequence everytime we find this scenario, so we wouldn’t get lost.
(!a(x).P + !a(x).Q) | ?a(y).0
-> P -> …
-> Q -> …
But… would it be incorrect to say that:
(!a(x).P + !a(x).Q) | ?a(y).0 -> P + Q
So that we could avoid the branching?
I realized that reducing with + can be very complex. Say we have
(P1+(Q|(P2+?v(x).R))) | (P’1+(Q’|(P’2+!v(t).R’)))
Then it seems reasonable to reduce that in one step to
Q|R[t/x]|Q'|R'
. The nesting could be arbitrarily deep, and maybe contain * andnew
as well.It seems like it’s necessary to recurse on each part separately. To do this we could divide the reduction into two parts, one producing the message v(t) and one consuming it. Most rules like parallelism could treat such half-reductions similarly to ordinary ones. And the fundamental rule could then be changed to:
If P -> P’ producing message v(t), and Q -> Q’ consuming message v(t), then P|Q -> P’|Q’ normally.
I just got how Cell works. Maybe I’m not quite your target audience, but it would’ve helped me if you’d explained why you use “v” in addition to “val”, and/or maybe show its need by expanding Cell once for both !read and ?write.
Let me try to help on this one.
Supose you have a cell storing value 7: Cell[read,write,7]
and you what to replace that with value 8.
In the definition of Cell, “val” is the old value, and “v” is the new value.
Cell[read,write,val]≡{!read(val).Cell[read,write,val]
+ ?write(v).Cell[read,write,v]}
Then you run a process that will write a new value on the cell.
Cell[read,write,7]|!write(8).0
=
{!read(7).Cell[read,write,7]+?write(v).Cell[read,write,v]}|!write(8).0
->
Cell[read,write,v][v/8] | 0
=
Cell[read,write,8]
I’m loving the new explanations of the pi calculus. I was intrigued but confused last time, but this time it all seems to be quite clear.
Question: Say you have a !channel(one,two,three). This *must* be matched to a read-message channel that similarly has three argument, correct (such as ?channel(testone,testtwo,testthree)).
If I had a ?channel(one,two) sitting somewhere waiting to match up, it wouldn’t do anything, right? I believe it has to wait until it sees a !channel(one,two) process somewhere. Similar, it wouldn’t match to a !channel(one) process either.
Xanthir: Note that the version of π-calculus that Mark is showing is an extended one, and as far as I understand extensions differ widely. The original calculus contains neither + (so I am still not sure of the semantics of that in all cases) nor multiple argument lists.
However, matching argument counts seems obvious, although you might add another extension to match variable length lists (like e.g. in Scheme). And given that Mark has decided to use match checking for constant values, it seems only reasonable to do it for argument count as well.
Apologies for dragging this post up a good month after its last post, but I’ve only taken the time to read these posts through properly recently and have a simple question regarding the Hello World example – why not just use one process?
!out(hello).!out(world).∅
I’m sure I’m missing something (possibly just that it was a pedagogical example) but nothing seems to be gained by adding that channel and second process?
Mark H:
Basically, it was just pedagogy. Just !out(hello).!out(world).∅ doesn’t demonstrate much about the fundamental rendezvous-as-reduction. The more complicated example does.