In a recent paper with Mikołaj Bojańczyk and Bartek Klin we introduced the notion of vector spaces with atoms. They allowed us to give a decision procedure for equivalence of unambiguous register automata (and more). In this post I will show some of the underlying maths and discuss why, for register automata, weights are nicer than nondeterminism.
I will not formally define vector spaces with atoms here, but I will give two abstract perspectives, which could give an intuition. So I will be assuming some familiarity with sets with atoms or nominal sets.
The (pre)sheaf perspective
If you have learned about nominal sets, you will have seen the correspondence to certain (pre)sheaves. The idea is to associate to each element of a nominal set a context (concretely: the names appearing in that element). We can then collect all elements with the same context into a set . This defines a functor
This is a presheaf. To exactly capture nominal sets, the functor should satisfy additional properties, making it a sheaf. Moreover the category of contexts is the category of finite sets with injective functions. A good resource on this stuff is Sam Staton’s thesis.
To define vector spaces with atoms, we can replace by , the category of vector spaces (over the field of rational numbers, for instance) and linear maps:
The algebraic perspective
The other common description of sets with atoms is an algebraic one. Here we realise that a renaming of contexts induces a renaming in the elements themselves. In fact, we can forget about the contexts and make the renaming a first-class structure of the sets with atoms. For this, we take the group of automorphisms (all bijections) on the set of atoms . A set with atoms is then a set together with a -action (satisfying some more properties):
A good source for learning about this is this paper or Mikołaj’s atom book. Again, this easily generalises to vector spaces:
In fact, this shows that we talk about group representations of ! Or, in other words, these are modules over the group algebra of .
Which perspective did we use?
Neither! We wanted to prove a certain finiteness property of these vector spaces, but neither abstract perspective actually helped us to prove that. So we went for a explicit description in the paper (also allowing ourselves to use finitely-supported spaces instead of only equivariant ones).
The property we cared about is the following. Consider a vector space (with atoms) and consider chains of (equivariant) subspaces:
(Or descending like .) We say that has finite length if all those chains are bounded. Our main result is that has finite length whenever is an orbit-finite set, and we proved this by hard work, as opposed to some abstract machinery.
For finite dimensional vector spaces (without atoms), the length is always equal to the dimension. So it is a notion many people may not have seen.
Let’s see why vector spaces with atoms are not as simple as you might think. We consider the simplest case, we take set of atoms (this is a countably infinite set without further structure) and we consider the free vector space on this set
Each atom is a basis element of this vector space, and the space contains linear combinations such as
There are two things to note:
- As a vector space, it is infinite dimensional (since is an infinite set).
- As a -set, it has infinitely many orbits, for instance the vectors , , , and so on, are all in different orbits.
This means that we don’t have good algorithmic properties out of the box, we will have to work for that. Finite length is one such algorithmic property, as it makes certain fixed point algorithms terminate. Does have finite length?
Consider the linear map defined on each atom as . The kernel of this map is a subspace of . (At this point, you should try to find other equivalent descriptions of this subspace.) Surprisingly, it is the only equivariant subspace of :
(For a proof, see the paper.) This means that has length 2. So although we started with the simples case of , it already gives us non-trivial subspaces. When considering tuples it only gets more complicated.
We used the finite length property to generalise Schützenberger’s algorithm for checking equivalence of weighted automata. Normally, one would argue that the algorithm terminates because has finite dimension. But that is not the case for us. Luckily, termination really follows from finite length.
This algorithm also applies to unambiguous nondeterministic register automata. Normally, when dealing with nondeterministic register automata, many things are undecidable. So what makes weights nicer than nondeterminism?
It has to do with the algebraic structure. With nondeterminism, we will often consider the Boolean algebra (with atoms) or the join-semilattice (with atoms) . These are also algebraic structures, but they do not have finite length. For instance, there is an infinite decreasing chain of equivariant sub-semilattices
defined by . So even though has an orbit-finite “basis”, the structure is too big, as it were. This hinders many fixed point algorithms to terminate for nondeterministic register automata.
Somehow the additive structure (or perhaps specifically subtraction?) of saves the day, and we show in the paper that many constructions, such as tensor products and dual spaces, preserve finite length of these spaces.
I think these vector spaces with atoms are really cool. During the research we thought of many geometric arguments to prove certain things, and unfortunately they make very little appearance in the paper. But working in this infinite dimensional vector space, where you can just add another dimension whenever you like, is really nice. The spaces are much better behaved than things like powersets, and so this is a good reason to do weighted automata theory instead of nondeterministic automata theory.
We did not use the abstract perspective sketched above. I hope that we can, at some point, obtain results abstractly as well, since we do not really know yet where this technique stops working. (In the paper we make some remarks about other atom structures, for instance.) Already using the terminology of sheaves and abstract algebra could be of use. As an example, the space above is not (semi-)simple, and its only submodule does not have an equivariant basis.
There is still a lot of work to do. One thing is the exact length of the spaces , instead of the exponential bound that we currently have.
To learn more about this, there