On the 14th of January 2020, Jurriaan Rot will present our paper entitled “Separation and renaming in Nominal Sets” at CSL. You can find the paper here or on arXiv. I had fun writing the paper and doing the research, so here is a bit of its history.
How it came about
The paper solves a performance problem we faced in a previous paper. In that paper, we use nominal techniques to query-learn register automata. The advantage of the nominal techniques is that they provide a clean theory, which allows a straightforward and concise implementation of the learning algorithm. A downside, however, was the performance.
The performance problem is not caused by a bad implementation; rather, it is a theoretical issue. When we go from register automata to nominal automata, we have to inflate or expand the automaton. In terms of register automata: a nominal automaton is a unique-valued automaton. The learning algorithm will always attempt to learn the inflated automaton, as that is simply the model we chose to learn! In many examples, however, the algorithm therefore does far too much work.
When working on the learning paper, we were aware of the problem, but could not solve it at the time. Frits Vaandrager had certain ideas on how to merge states of a unique-valued automaton, but a formal definition for nominal automata seemed impossible (to me). With Alexandra Silva and her group, we also tried to cook up a monad in order to compress the state space. This, too, we could not make precise. I was convinced that there was a categorical adjunction which could be used, but I couldn’t find it. We decided to let it rest for a while.
Later, at the IPA fall days (a sort of summer school), I attended a talk on separation logic. This reminded me of nominal sets, because of the separating conjunction and magic wand: these concepts also arise in nominal sets. We had thought of using these things for nominal automata theory, but only at the summer school did I make the connection to the performance problem.
After the school, I explained the problem to Jurriaan. He liked it and agreed that we should be looking for an adjunction. Better yet, we imagined that the adjunction should behave well in combination with separated products.
One of the difficulties when we started to look for an adjunction was that we did not know the other category yet. (Research is very different from making homework assignments, you know.) We had a rough idea about the morphisms (they should be equivariant with respect to a monoid which extends the group of permutations), but the details were still missing.
In the end we found “the correct adjunction.” There are a few reasons we did not find it the first time we looked. It was clear that we needed Pm-sets (nominal sets, with an action of permutations) and Sb-sets (sets with an action of substitutions), but we also needed:
A relational definition of support. For Pm-sets, we say that C supports x if π · x = x for every permutation π that fixes C pointwise. Unfortunately, the same definition for Sb-sets is not quite useful. Instead we needed: σ · x = τ · x for all substitutions σ and τ that agree on C.
Finite renamings. For nominal sets it doesn’t matter so much whether you consider all permutations or only the finite ones. It was not clear to me whether such a difference matters in the case of Sb-sets. It turns out that we really needed the monoid Sb of finite substitutions, instead of the set of all transformations. One unexpected property I needed in the proof of the adjunction is that two such transformations only differ on a finite part. In particular, we can always choose an atom b such that σ(b) = τ(b) = b.
To generalise the output of our automata. Instead of simply having rejecting and accepting states, we needed to consider general outputs. That is, we should be considering nominal Moore machines. The distinction between input and output is important. (See remark 7 in the paper.)
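To make the relational definition of support concrete, here is a small executable sketch. It models atoms as integers and finite substitutions as dictionaries (identity outside their domain); the function and variable names are my own illustration, not from the paper. It shows that two substitutions agreeing on a support set C act equally on an element, even when neither substitution fixes C pointwise.

```python
# Toy model of finite substitutions (the monoid Sb) acting on atoms.
# Illustrates the relational notion of support: C supports x when any two
# substitutions that agree on C act on x in the same way.

def apply(sigma, a):
    """Apply a finite substitution (identity outside its domain) to an atom."""
    return sigma.get(a, a)

def agree_on(sigma, tau, C):
    """Do sigma and tau act identically on every atom in C?"""
    return all(apply(sigma, a) == apply(tau, a) for a in C)

# The Sb-set of atoms themselves, with action sigma . a = sigma(a).
# Claim: C = {a} supports the atom a.
a = 1
C = {a}
sigma = {1: 5, 2: 7}   # two different substitutions that agree on C = {1} ...
tau   = {1: 5, 3: 9}
assert agree_on(sigma, tau, C)
assert apply(sigma, a) == apply(tau, a)   # ... hence they act equally on a

# Note: the Pm-style condition "every map fixing C fixes x" would demand
# sigma(a) = a, which fails here -- the relational definition is the useful one.
print(apply(sigma, a))  # prints 5
```

Nothing here depends on the substitutions being permutations; that is exactly the point of moving from Pm to Sb.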
Only when these things were clear to us could we describe the desired adjunction. At this point, we also finally found relevant related work: Renaming Sets by Jamie Gabbay and Martin Hofmann. This was very reassuring.
After we had found the adjunction, we could apply it to nominal automata relatively easily. We obtained a general method of describing certain automata much more compactly by a new nominal automaton model. Since this new model is much simpler, we believe that it also simplifies learning a lot. But we haven’t experimented with that yet.
From a categorical perspective, the theory of (pre)sheaves already predicts such an adjunction with nice properties. The contribution, however, lies more in the fact that this adjunction is actually meaningful and useful to us. Even if general theory gives you something, it is still a bit of work to distill it into a workable tool. In a previous submission of the paper, one reviewer complained that the results followed from known category theory (meaning that they follow abstractly from a combination of three different papers). After we added a section to highlight these connections, another reviewer called that section “recreational category theory which can be omitted.” To each their own, I guess.
One thing remains open: applying this adjunction “locally.” Currently, it is an all-or-nothing approach: the whole automaton either admits this compression or it doesn’t. It would be cool to decide per state (or even for specific inputs per state) whether to use the adjunction or not. Ideally, a learning algorithm would do this adaptively: start by learning a compressed model, and inflate a part of the automaton only when counterexamples show that you must.
Have a look at the paper, and perhaps register for the conference to see Jurriaan talk!
I used to have a Disqus comments section here. But everybody seems to simply send me emails instead. I have disabled Disqus, in order to avoid third-party trackers.