範疇論對學習函數式編程有用嗎?


125

我正在學習Haskell,並且對語言很著迷。但是我沒有認真的數學或CS背景。但是我是一位經驗豐富的軟件程序員。

我想學習類別理論,以便在Haskell變得更好。

我應該學會在類別理論中的哪些主題為理解Haskell提供良好的基礎?

31

I'm going to try and keep it short and sweet. There is an informal correspondence between Haskell programs and certain classes of categories, which can be made more formal with some work. This correspondence is known as the Curry-Howard-Lambek correspondence and relates:

  1. Haskell types with objects of the category
  2. Terms of type $A\rightarrow B$ with morphisms $f\colon A\rightarrow B$ (note the similar notations)
  3. Algebraic datatypes with initial objects
  4. Type constructors with functors
  5. etc

The list goes on and on, but one crucial point is that you can define things like monads and algebras in category theory and come up with notions that are both useful to mathematicians but also pervasive in the practice of Haskell programming.

I'm not sure which book to recommend, as I haven't found a completely satisfactory introductory book on categories for computer scientists. You can try Categories, Types and Structues by Asperti and Longo. The idea is to learn basic definitions up to adjunctions, and then maybe try and read some of the excellent blogs out there to try and understand these concepts.


123

In a previous answer in the Theoretical Computer Science site, I said that category theory is the "foundation" for type theory. Here, I would like to say something stronger. Category theory is type theory. Conversely, type theory is category theory. Let me expand on these points.

Category theory is type theory

In any typed formal language, and even in normal mathematics using informal notation, we end up declaring functions with types $f : A \to B$. Implicit in writing that is the idea that $A$ and $B$ are some things called "types" and $f$ is a "function" from one type to another. Category theory is the algebraic theory of such "types" and "functions". (Officially, category theory calls them "objects" and "morphisms" so as to avoid treading on the set-theoretic toes of the traditionalists, but increasingly I see category theorists throwing such caution to the wind and using the more intuitive terms: "type" and "function". But, be prepared for protests from the traditionalists when you do so.)

We have all been brought up on set theory from high school onwards. So, we are used to thinking of types such as $A$ and $B$ as sets, and functions such as $f$ as set-theoretic mappings. If you never thought of them that way, you are in good shape. You have escaped set-theoretic brain-washing. Category theory says that there are many kinds of types and many kinds of functions. So, the idea of types as sets is limiting. Instead, category theory axiomatizes types and functions in an algebraic way. Basically, that is what category theory is. A theory of types and functions. It does get quite sophisticated, involving high levels of abstraction. But, if you can learn it, you will acquire a deep understanding of types and functions.

Type theory is category theory

By "type theory," I mean any kind of typed formal language, based on rigid rules of term-formation which make sure that everything type checks. It turns out that, whenever we work in such a language, we are working in a category-theoretic structure. Even if we use set-theoretic notations and think set-theoretically, still we end up writing stuff that makes sense categorically. That is an amazing fact.

Historically, Dana Scott may have been the first to realize this. He worked on producing semantic models of programming languages based on typed (and untyped) lambda calculus. The traditional set-theoretic models were inadequate for this purpose, because programming languages involve unrestricted recursion which set theory lacks. Scott invented a series of semantic models that captured programming phenomena, and came to the realization that typed lambda calculus exactly represented a class of categories called cartesian closed categories. There are plenty of cartesian closed categories that are not "set-theoretic". But typed lambda calculus applies to all of them equally. Scott wrote a nice essay called "Relating theories of lambda calculus" explaining what is going on, parts of which seem to be available on the web. The original article was published in a volume called "To H. B. Curry: Essays on Combinatory Logic, Lambda Calculus and Formalism", Academic Press, 1980. Berry and Curien came to the same realization, probably independently. They defined a categorical abstract machine (CAM) to use these ideas in implementing functional languages, and the language they implemented was called "CAML" which is the underlying framework of Microsoft's F#.

Standard type constructors like $\times$, $\to$, $List$ etc. are functors. That means that they not only map types to types, but also functions between types to functions between types. Polymorphic functions preserve all such functions resulting from functor actions. Category theory was invented in 1950's by Eilenberg and MacLane precisely to formalize the concept of polymorphic functions. They called them "natural transformations", "natural" because they are the only ones that you can write in a type-correct way using type variables. So, one might say that category theory was invented precisely to formalize polymorphic programming languages, even before programming languages came into being!

A set-theoretic traditionalist has no knowledge of the functors and natural transformations that are going on under the surface when he uses set-theoretic notations. But, as long as he is using the type system faithfully, he is really doing categorical constructions without being aware of them.


All said and done, category theory is the quintessential mathematical theory of types and functions. So, all programmers can benefit from learning a bit of category theory, especially functional programmers. Unfortunately, there do not seem to be any text books on category theory targeted at programmers specifically. The "category theory for computer science" books are typically targeted at theoretical computer science students/researchers. The book by Benjamin Pierce, Basic category theory for computer scientists is perhaps the most readable of them.

However, there are plenty of resources on the web, which are targeted at programmers. The Haskellwiki page can be a good starting point. At the Midlands Graduate School, we have lectures on category theory (among others). Graham Hutton's course was pegged as a "beginner" course, and mine was pegged as an "advanced" course. But both of them cover essentially the same content, going to different depths. University of Chalmers has a nice resource page on books and lecture notes from around the world. The enthusiastic blog site of "sigfpe" also provides a lot of good intuitions from a programmer's point of view.

The basic topics you would want to learn are:

  • definition of categories, and some examples of categories
  • functors, and examples of them
  • natural transformations, and examples of them
  • definitions of products, coproducts and exponents (function spaces), initial and terminal objects.
  • adjunctions
  • monads, algebras and Kleisli categories

My own lecture notes in the Midlands Graduate School covers all these topics except for the last one (monads). There are plenty of other resources available for monads these days. So that is not a big loss.

The more mathematics you know, the easier it would be to learn category theory. Because category theory is a general theory of mathematical structures, it is helpful to know some examples to appreciate what the definitions mean. (When I learnt category theory, I had to make up my own examples using my knowledge of programming language semantics, because the standard text books only had mathematical examples, which I didn't know anything about.) Then came the brilliant book by Lambek and Scott called "Introduction to categorical logic" which related category theory to type systems (what they call "logic"). It is now possible to understand category theory just by relating it to type systems even without knowing a lot of examples. A lot of the resources I mentioned above use this approach to explain category theory.


13

A short answer: no [but this is only an opinion]

Don't go to Category Theory or any other theoretical domain to become good in Haskell. Learn functional programming techniques, such as tail recursion, map, reduce, and others. Read as much code as you can. Implement as many ideas as you can. If you have issues, read and read.

If you want a good theoretical reference to learn Haskell and other functional programming paradigms then have a look at: An Introduction to Functional Programming Through Lambda Calculus, Greg Michaelson (available online). ... There are other similar books.


33

Echoing @AJed advice, I recommend to turn your statement

I want to learn category theory so I can become better at Haskell.

on its head: learn Haskell, building on your programming intuition. Once you are an FP guru, it might be easier to pick up category theory (if you still care).

Category theory is simple for somebody with broad mathematical education (groups, rings, modules, vector spaces, topology etc). Lacking this background, category theory is nearly impenetrable. The beauty of category theory is that it unifies a lot of seemingly unrelated things (e.g. left adjoints of forgetful functors include free groups, universal enveloping algebras, Stone-Cech compactifications, abelianisations of groups, ...), and so reduces complexity. But if you are not familiar with the multiple examples that category theory unifies, category theory is just an additional layer of complexity that makes your life harder.

In my experience, learning is easier by building on things one already knows. As a software developer, you know a lot about programming, and Haskell programming is not that different from other programming, so my recommendation is to approach Haskell from a pragmatic programming point of view, ignoring category theory. The bit of category theory that is in Haskell, e.g. some support for monads, is much easier for a programmer to grasp without taking a detour via category theory. After all, monads are merely generalised composition (and you will have used monads in your programming practise already -- albeit without knowing you did), and Haskell doesn't really support monads for real, as it does not enforce the monadic laws.


1

Here is a (long) blog post, that gives motivation on how category theory ideas are relevant for practical programming: http://cdsmith.wordpress.com/2012/04/18/why-do-monads-matter/


0

Category theory is a very sophisticated branch of mathematics and mastering it will unify most of your previous learnings by making them instances of same abstract objects. So it is very useful and very intuitive. But it is vast and broad, and you will find yourself in a plenty of new concepts that will don't even know which one is suitable for your needs and which one you should skip. So your purposeful approach needs choice among concepts, otherwise mastering in it inevitably needs long time and is really not a self study domain.

By the way, I suggest a very well start point for your purpose to be here.


0

Category theory has certain but limited use for learning functional programming or in actual practice of functional programming. I recently made a presentation to answer that question. https://www.youtube.com/watch?v=Zau8CxsfxOo

Summary:

  • Functional programmers do not require category theory in order to master the main features and design patterns that FP uses to write better code. For example, one can (and should) first learn how to use monads, functors, liftings, map/filter/fold, etc., in a concrete programming language with specific examples. Category theory will not help master these techniques even though the words "functor" and "monad" originally come from category theory.
  • At a certain point, programmers will encouter examples of typeclasses with laws, and understand why those laws are important in practice.
  • There will be lots of laws. To make some order and system among those laws, we can formulate the laws as a generalized "lifting" type signature with "twisted" function types.
  • We can then use the definitions of category and functor as generalizations that cover the laws of functors, contrafunctors, filterable functors, filterable contrafunctors, monads, applicative functors, applicative contrafunctors, comonads, and perhaps other type classes.
  • I show some examples of categories that are used to describe functors, monads, applicatives, and filterable functors.
  • I cover filterable functors in more detail, with backdrop of category theory, because filterable functors are rarely explained as a separate typeclass.
  • Another example where category theory is useful: "type constructor libraries", i.e. libraries with functions parameterized by a type constructor. Examples of these are free functor / free monad / etc., and Church encoding of types (including recursive type constructors, e.g. the Church encoding of a free monad). Programmers who need to implement these libraries will need to understand how these constructions are defined and what laws need to hold. Category theory provides some limited guidance about that.
  • Conclusion 1: programmers need to learn functional programming and not category theory. The special knowledge required in functional programming (e.g., how to implement and use a free applicative functor in your programming language) is not going to be covered by any book in category theory.
  • Conclusion 2: basic definitions of category theory (category, functor) are useful as condensed formulations of general laws for a number of typeclasses. Unless a programmer has experience dealing with all the different laws of those typeclasses, it is unlikely that the appreciation of category theory will be of much help. At the same time, beyond the five or so basic concepts and definitions (category, functor, natural transformation, monoid, Yoneda identities), any further study of category theory is not likely to be of any use for a practicing functional programmer.
  • Conclusion 3: knowledge of category theory will not help us derive or prove laws for specific typeclasses, and will not help us implement those typeclasses correctly in code. The reason is that category theory is so general that it only talks about laws that apply generally to a large number of very different typeclasses (functor, contrafunctor, monad, filterable, applicative, pointed, etc.). For practical coding, e.g. to verify that our implementation of a specific monad is lawful, we need to learn not category theory but the techniques of symbolic derivation and proof. I am writing a new free textbook ("Science of Functional Programming", https://github.com/winitzki/sofp) to develop and explain these techniques with practical programmers in mind. My book is going to be very light on category theory, because I'm not going to use any advanced abstract concepts unless there is a significant gain for practical work.