I tried Haskell for 5 years and here’s how it was

One blogpost style which I find almost completely useless is “I tried Programming Language X for 5 days and here’s how it was.” Most of the time, the first impression is superficial discussing syntax and whether you could get Hello World to run.

This blogpost is I tried Haskell for 5 years and here’s how it was.

In the last few years, I have been (with others) developing ngless, a domain specific language and interpreter for next-generation sequencing. For partly accidental reasons, the interpreter is written in Haskell. Even though I kept using other languages (most Python and C++), I have now used Haskell quite extensively for a serious, medium-sized project (11,270 lines of code). Here are some scattered notes on Haskell:

There is a learning curve

Haskell is a different type of language. It takes a while to fully get used to it if you’re coming from a more traditional background.

I have debugged code in Java, even though I never really learned (or wrote) any Java. Java is just a C++ pidgin language.

The same is not true of Haskell. If you have never looked at Haskell code, you may have difficulty following even simple functions.

Once you learn it, though, you get it.

Haskell has some very nice libraries

You really have very nice libraries, written by people doing really useful things.

Conduit and Parsec are the basis of a lot of ngless code.

Here is an excellent curated list of Haskell library world (added May 4)

Haskell libraries are sometimes hard to figure out

I like to think that you need both hard documentation and soft documentation.

Hard documentation is where you describe every argument to a function and its effects. It is like a reference work (think of man pages). Soft documentation are tutorials and examples and more descriptive text. Well documented software and libraries will have both (there no need for anything in between, I don’t want soft serve documentation).

Haskell libraries often have extremely hard documentation: they will explain the details of functions, but little in the way of soft documentation. This makes it very hard to understand why a function could be useful in the first place and in which contexts to use this library.

This is exacerbated by the often extremely abstract nature of some of the libraries. Case in point, is the very useful MonadBaseControl class. Trust me, this is useful. However, because it is so generic, it is hard to immediately grasp what it does.

I do not wish to over-generalized. Conduit, mentioned above, has tutorials, blogposts, as well as hard documentation.

Haskell sometimes feels like C++

Like C++, Haskell is (in part) a research project with a single initial Big Idea and a few smaller ones. In Haskell’s case, the Big Idea was purely functional lazy evaluation (or, if you want to be pedantic, call it “non-strict” instead of lazy). In C++’s case, the Big Idea was high level object orientation without loss of performance compared to C.

Both C++ and Haskell are happy to incorporate academic suggestions into real-world computer languages. This doesn’t need elaboration in the case of Haskell, but C++ has also been happy to be at the cutting edge. For example, 20 years ago, you could already use C++ templates to perform (limited) programming with dependent types. C++ really pioneered the mechanism of generics and templates.

Like C++, Haskell is a huge language, where there are many ways to do something. You have multiple ways to represent strings, you have accidents of history kept for backwards compatibility. If you read an article from 10 years ago about the best way to do something in the language, that article is probably outdated by two generations.

Like C++, Haskell’s error messages take a while to get used to.

Like C++, there is a tension in the community between the purists and the practitioners.

Performance is hard to figure out

Haskell and GHC generally let me get good performance, but it is not always trivial to figure out a priori which code will run faster and in less memory.

In some trivial sense, you always depend on the compiler to make your code faster (i.e., if the compiler was infinitely smart, any two programs that produce the same result would compile to the same highly efficient code).

In practice, of course, compilers are not infinitely smart and so there faster and slower code. Still, in many languages you can look at two pieces of code and reasonably guess which one will be faster, at least within an order of magnitude.

Not so with Haskell. Even very smart people struggle with very simple examples. This is because the most generic implementation of the code tends to be very inefficient. However, GHC can be very smart and make your software very fast. This works 90% of the time, but sometimes you write code that does not trigger all the right optimizations and your function suddenly becomes 1,000x slower. I have once or twice written two almost identical versions of a function with large differences in performance (orders of magnitude).

This leads to the funny situation that Haskell is (partially correctly) seen as an academic language used by purists obsessed with elegance; while in practice, a lot of effort goes into making the code written as compiler-friendly as possible.

For the most part, though, this is not a big issue. Most of the code will run just fine and you optimize the inner loops at the end (just like in any other language), but it’s a pitfall to watch out for.

The easy is hard, the hard is easy

For minor tasks (converting between two file formats, for example), I will not use Haskell; I’ll do it Python: It has a better REPL environment, no need to set up a cabal file, it is easier to express simple loops, &c. The easy things are often a bit harder to do in Haskell.

However, in Haskell, it is trivial to add some multithreading capability to a piece of code with complete assurance of correctness. The line that if it compiles, it’s probably correct is often true.

Stack changed the game

Before stack came on the game, it was painful to make sure you had all the right libraries installed in a compatible way. Since stack was released, working in Haskell really has become much nicer. Tooling matters.

The really big missing piece is the equivalent of ccache for Haskell.

Summary

Haskell is a great programming language. It requires some effort at the beginning, but you get to learn a very different way of thinking about your problems. At the same time, the ecosystem matured significantly (hopefully signalling a trend) and the language can be great to work with.

Advertisements