As a student of mathematics, I’m often interested in how fascinating math works its way into other subjects. In particular, I recently became curious about why computer scientists are talking about complicated categorical machinery, and this post is a quasi-answer to this question. As a disclaimer, I’m neither a computer scientist nor a category theorist, so what ensues will be a layman’s (or lay-mathematician’s rather) approach to understanding their connection.
In the following, we will not actually need any theory, since this post is mostly about developing the requisite language to understand how all of this strangeness came to be. I’m going to routinely defer main definitions to Wikipedia, and instead pass to the level of heuristics throughout this post, so that it does not become bloated and technical. Additionally, I’ll assume working knowledge of any first chapter in a category theory textbook (which by happenstance is about as far as the author can speak with any semblance of authority). I’ll also make the executive decision to not write in the language of categories (diagrams) or the language of computer science (code) but instead just plain old English and math. In short, by the end of this post, I hope that the reader will feel comfortable reading some of the technical definitions online, which often make this blogger feel categorically challenged.
Let’s start with the “mathematics”:
A Monoid is an algebraic structure where it “makes sense” to do multiplication. In particular, given a set $S$, a monoidal structure on $S$ is just a binary operation $S \times S \to S$ that is associative and has a unit (a group without inverses). A good first example is the natural numbers under addition, $(\mathbb{N}, +, 0)$, but a slightly more relevant example is the collection of functions from some set to itself, with function composition as the binary operation (you give me two functions and I’ll spit out a new one) and the identity function as identity (I too regret the last part of that sentence). More generally, given any object $X$ in a category $\mathcal{C}$, we can consider the collection of endomorphisms $\mathrm{End}(X) = \mathrm{Hom}_{\mathcal{C}}(X, X)$, and these will form a monoid under composition as well.
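Breaking the no-code rule for a moment, here is a minimal Haskell sketch of the "functions under composition" monoid. It leans on the standard `Endo` wrapper from `Data.Monoid`; the names `double`, `addOne` and `composed` are made up for illustration.

```haskell
import Data.Monoid (Endo (..))

-- Two endomorphisms of Int, i.e. functions Int -> Int (hypothetical examples).
double, addOne :: Endo Int
double = Endo (* 2)
addOne = Endo (+ 1)

-- (<>) on Endo is function composition; mempty is the identity function.
composed :: Endo Int
composed = addOne <> double   -- behaves like \x -> (x * 2) + 1

main :: IO ()
main = do
  print (appEndo composed 5)            -- 11
  print (appEndo (mempty <> double) 5)  -- 10: the unit does nothing
  -- Associativity, checked at one sample point.
  print (appEndo ((addOne <> double) <> addOne) 5
           == appEndo (addOne <> (double <> addOne)) 5)  -- True
```

The last line only spot-checks associativity at a single input; it is an illustration of the law, not a proof of it.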
A Monoidal Category for our purposes is a category that looks like a monoid, since after all, if it walks like a duck and talks like a duck… never mind. It is a category $\mathcal{C}$ equipped with a bifunctor $\otimes : \mathcal{C} \times \mathcal{C} \to \mathcal{C}$ (a functor in both arguments) that is associative and has an identity “up to natural transformation.” This means, basically, that it may not be literally associative, since that is probably too stringent a requirement, but there is a way to canonically identify $(A \otimes B) \otimes C$ with $A \otimes (B \otimes C)$. A great first example of this is the category of sets under cartesian product. For example, $\mathbb{R}^3$ is just $\mathbb{R} \times \mathbb{R} \times \mathbb{R}$, and the reason we can write this unambiguously is that $(\mathbb{R} \times \mathbb{R}) \times \mathbb{R}$ and $\mathbb{R} \times (\mathbb{R} \times \mathbb{R})$ are the “same thing.” Maybe a good first pass at gaining intuition here is that we want a category where we can form something kind of like a cartesian product (in fact, this alone gives rise to tons of monoidal categories, since any category with finite products carries a monoidal structure). Indeed, the important thing here is that given two sets $A, B$, we can regard their product $A \times B$ as an object of $\mathbf{Set}$.
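One nice Monoidal Category is the category of abelian groups $\mathbf{Ab}$, with the bifunctor $\otimes_{\mathbb{Z}}$. In particular, given two abelian groups $A, B$, we see that $A \otimes_{\mathbb{Z}} B$ is again an abelian group, and $\otimes_{\mathbb{Z}}$ is a bifunctor that is “associative” under natural maps and has an identity, namely $\mathbb{Z}$. (To see the latter statement, consider the map $\mathbb{Z} \otimes_{\mathbb{Z}} A \to A$ given by $n \otimes a \mapsto n \cdot a$, where multiplication is iterated addition.)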
A Monoidal Object (more commonly called a monoid object) is a generalization of monoids that we can form in any monoidal category. It looks like a monoid as well, but we replace $\times$ with $\otimes$ as one might expect, and we replace binary “function” with binary “morphism” in our new category. The main obstruction to the last sentence being the end of this conversation is that we don’t have strict associativity and identity, so there are some diagrams to be drawn here.
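If we take our previous monoidal category $(\mathbf{Ab}, \otimes_{\mathbb{Z}}, \mathbb{Z})$ and fix an abelian group $A$, we can form a monoidal object by declaring a group homomorphism $\mu : A \otimes_{\mathbb{Z}} A \to A$ (together with a unit map $\mathbb{Z} \to A$). Writing $a \cdot b$ for $\mu(a \otimes b)$ and using the fact that the tensor product is bilinear, we have the relations $(a + a') \cdot b = a \cdot b + a' \cdot b$ and $a \cdot (b + b') = a \cdot b + a \cdot b'$. In other words, we actually endow $A$ with a ring structure. So a monoidal object in this monoidal category is just a ring.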
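The Category of Endofunctors of a category $\mathcal{C}$ has functors $F : \mathcal{C} \to \mathcal{C}$ as objects and natural transformations between them as morphisms. Unsurprisingly, this is a monoidal category: the bifunctor is composition of functors and the unit is the identity functor. This construction is a specialization of the earlier one, where we used endomorphisms of an object in a category to make a monoid under composition: an endofunctor of $\mathcal{C}$ is an endomorphism of $\mathcal{C}$ regarded as an object of the category of categories.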
A Monad is a monoidal object in the category of endofunctors. This has a bit going on, but if we fix a particular endofunctor $T : \mathcal{C} \to \mathcal{C}$, then we should have a natural transformation $\mu : T \circ T \to T$ that is “associative” and a natural transformation $\eta : \mathrm{Id} \to T$, where $\mathrm{Id}$ is just the identity functor, and we require these to satisfy the so-called coherence conditions.
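For concreteness, here are those coherence conditions spelled out. The symbols are a choice of notation: $T$ is the endofunctor, $\mu : T \circ T \to T$ is the multiplication, $\eta : \mathrm{Id} \to T$ is the unit, and $X$ is any object.

$$\mu_X \circ T(\mu_X) = \mu_X \circ \mu_{T(X)} \qquad \text{(associativity, as maps } T(T(T(X))) \to T(X)\text{)}$$

$$\mu_X \circ T(\eta_X) = \mathrm{id}_{T(X)} = \mu_X \circ \eta_{T(X)} \qquad \text{(left and right unit, as maps } T(X) \to T(X)\text{)}$$

These are the monoid axioms with “multiply” read as $\mu$ and “unit” read as $\eta$.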
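Here is a near content-less example: we can take the functor $T$ in $\mathbf{Set}$ that sends a set $X$ to the set of its singletons, $T(X) = \{\{x\} : x \in X\}$. Then the unit is the family of maps $\eta_X : X \to T(X)$ given by $x \mapsto \{x\}$. We will take $\mu_X : T(T(X)) \to T(X)$ given by $\{\{x\}\} \mapsto \{x\}$. $\mu$ actually has a slightly different description: take the union of all the sets interior to the outermost brackets, which amounts to deleting one pair of brackets. This is, basically, “concatenation,” and it will play a prominent role in the next example.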
Take the endofunctor $P : \mathbf{Set} \to \mathbf{Set}$, $X \mapsto P(X)$, the powerset of $X$. Take $\eta_X : X \to P(X)$ to be $x \mapsto \{x\}$, and take $\mu_X : P(P(X)) \to P(X)$ as in its second description above, namely union. This also gives a monadic structure, and introduces a broad class of monads: take a functor that sends a set to some object “freely generated” by it, such as all of its subsets, the free group on it, the free vector space on it, etc., and compose that with the forgetful functor to get back to $\mathbf{Set}$. “Concatenation” of some form for $\mu$, and the inclusion of the underlying set for $\eta$, will provide a monadic structure on the composite.
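A small worked instance may help. Write $P$ for the powerset functor, $\eta_X(x) = \{x\}$ for the unit and $\mu_X$ for “take the union” (this notation is a choice made here), and let $X = \{1, 2\}$:

$$\mu_X\big(\{\{1\}, \{1, 2\}\}\big) = \{1\} \cup \{1, 2\} = \{1, 2\}$$

$$\mu_X\big(\eta_{P(X)}(\{1, 2\})\big) = \mu_X\big(\{\{1, 2\}\}\big) = \{1, 2\}$$

$$\mu_X\big(P(\eta_X)(\{1, 2\})\big) = \mu_X\big(\{\{1\}, \{2\}\}\big) = \{1, 2\}$$

The last two lines are the two unit conditions checked on the subset $\{1, 2\}$: wrapping in an extra layer of brackets, from the outside or from the inside, and then taking the union gives back what we started with.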
Okay, on to the Computer Science, where I’m going to focus on Haskell. Let’s first just chat about what the monad formalism means here, and then I’ll speak a little bit about why this formalism matters at all.
The first thing to do will be to describe the Haskell category $\mathbf{Hask}$: we consider the objects in this category to be “types,” so something like floats, lists, tuples, strings, etc., and the morphisms between them to be functions between these different types. Note that there are many arrows between two given objects, since every function with the appropriate (co)domain is one.
Now, an endofunctor in $\mathbf{Hask}$ is going to be a functor that sends one type to another, in a way that preserves maps between them. There is a type constructor that takes some type $a$ and maps it to $[a]$, which is the type of lists that contain elements of type $a$. This type constructor can be made into a functor by specifying what it does to maps: a function $f : a \to b$ goes to the function $\mathrm{map}\,f : [a] \to [b]$ defined by applying $f$ to each element of the list individually. Indeed, $[\,\cdot\,]$ defines an endofunctor on $\mathbf{Hask}$, and we can actually equip this with a monadic structure much like the examples above, by taking $\eta : a \to [a]$ to be $x \mapsto [x]$ (this is called `return` in the language), and by taking $\mu : [[a]] \to [a]$ to be list concatenation (for example, $[[1, 2], [3]] \mapsto [1, 2, 3]$). (This is called `join` in the language, and for lists it is `concat`.)
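As a rough sketch, the three pieces of the list monad can be written down directly. The names `fmapList`, `unitList` and `joinList` are invented here to keep the roles visible; they are just Prelude's `map`, the one-element list, and `concat`.

```haskell
-- Action on morphisms: apply f to every element (Prelude's map).
fmapList :: (a -> b) -> [a] -> [b]
fmapList = map

-- Unit: wrap a value in a one-element list.
unitList :: a -> [a]
unitList x = [x]

-- Multiplication: flatten one level of nesting (Prelude's concat).
joinList :: [[a]] -> [a]
joinList = concat

main :: IO ()
main = do
  print (fmapList (+ 1) [1, 2, 3 :: Int])                -- [2,3,4]
  print (joinList [[1, 2], [3 :: Int]])                  -- [1,2,3]
  -- Unit laws: wrapping and then flattening changes nothing.
  print (joinList (unitList [1, 2, 3 :: Int]))           -- [1,2,3]
  print (joinList (fmapList unitList [1, 2, 3 :: Int]))  -- [1,2,3]
  -- Associativity: flattening outer-first or inner-first agrees.
  let xsss = [[[1], [2, 3]], [[4 :: Int]]]
  print (joinList (joinList xsss) == joinList (fmapList joinList xsss))  -- True
```

The checks at the end are spot checks of the unit and associativity conditions on sample lists, not proofs.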
This is one example of a monad in Haskell, and there is a sense in which it is the example, if we want to understand how monads are used to “unwrap” data.
Here is an unrealistic first example: suppose we have two morphisms $f : a \to [b]$ and $g : b \to [c]$ (which is not uncommon; perhaps you want to translate a number into a string, which is a list of characters). Then we may want a way to compose these morphisms, but there is of course a problem: the codomain $[b]$ of $f$ is not the domain $b$ of $g$. Ultimately, we want a new function $a \to [c]$, but in order to get there we would want a chain $a \to [b] \to [c]$, so we want some way of taking some $[b]$ and a function $b \to [c]$ and combining them (this is the arrow in the middle) to form something of type $[c]$. It turns out this is precisely how we can get a monad to work for us.
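First, note that the functor $[\,\cdot\,]$ also tells us what to do with functions $g : b \to [c]$, namely $\mathrm{map}\,g : [b] \to [[c]]$. Our natural transformation $\mu : [[c]] \to [c]$ allows us to do the following:
$$xs \mapsto \mu\big((\mathrm{map}\,g)(xs)\big),$$
where $xs$ is of type $[b]$, so it gets mapped to type $[[c]]$ by $\mathrm{map}\,g$, and then unwrapped by $\mu$ to $[c]$. In other words, given a $[b]$ and some map $b \to [c]$, we can use the monadic structure to obtain something of type $[c]$. This allows us to compose such functions: for $f : a \to [b]$, the composite is $x \mapsto \mu\big((\mathrm{map}\,g)(f(x))\big)$.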
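The recipe “map, then flatten” can be sketched in a few lines of Haskell. The helper `bindList` and the example functions `digits` and `twice` are hypothetical names chosen for this illustration; `>>=` and `>=>` are the standard library's versions of the same idea.

```haskell
import Control.Monad ((>=>))

-- Bind for lists, built only from the functor action and the multiplication.
bindList :: [b] -> (b -> [c]) -> [c]
bindList xs g = concat (map g xs)

-- Hypothetical functions of the shapes a -> [b] and b -> [c].
digits :: Int -> String   -- a number to its list of characters
digits = show

twice :: Char -> String   -- each character to a two-element list
twice c = [c, c]

-- The composite a -> [c].
composite :: Int -> String
composite n = digits n `bindList` twice

main :: IO ()
main = do
  putStrLn (composite 42)                        -- 4422
  print (composite 42 == (digits 42 >>= twice))  -- True
  print (composite 42 == (digits >=> twice) 42)  -- True
```

The last two lines check, on one input, that this hand-rolled bind agrees with the built-in list monad.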
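From a more practical standpoint, there is the notion of a debuggable function $f : a \to \mathrm{Maybe}\ b$, where $\mathrm{Maybe}\ b$ has two kinds of values: $\mathrm{Just}\ b$ and $\mathrm{Nothing}$. So, composing such functions monadically allows us to avoid the cumbersome nested “if $\mathrm{Nothing}$, then return, or else carry on” checks in the composition of functions (more can be found on the Haskell wiki).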
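Here is a hedged sketch of what that buys us, assuming the two cases in question are the constructors `Just` and `Nothing` of Haskell's standard `Maybe` type. The functions `safeRecip` and `safeSqrt` are invented for the example.

```haskell
-- Hypothetical partial functions: each can fail by returning Nothing.
safeRecip :: Double -> Maybe Double
safeRecip 0 = Nothing
safeRecip x = Just (1 / x)

safeSqrt :: Double -> Maybe Double
safeSqrt x
  | x < 0     = Nothing
  | otherwise = Just (sqrt x)

-- Without the monad: check for failure by hand after every step.
byHand :: Double -> Maybe Double
byHand x = case safeRecip x of
  Nothing -> Nothing
  Just y  -> case safeSqrt y of
    Nothing -> Nothing
    Just z  -> Just z

-- With the monad: (>>=) does the checking for us.
chained :: Double -> Maybe Double
chained x = safeRecip x >>= safeSqrt

main :: IO ()
main = do
  print (chained 4)     -- Just 0.5
  print (chained 0)     -- Nothing
  print (chained (-4))  -- Nothing
  print (map byHand [4, 0, -4] == map chained [4, 0, -4])  -- True
```

With only two steps the nesting in `byHand` is tolerable, but it grows by one level for every extra function in the chain, while `chained` just gains another `>>=`.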
Either way, this was just my piece on how category theory weaseled its way into programming.