I tried Haskell for 5 years and here’s how it was

One blogpost style which I find almost completely useless is “I tried Programming Language X for 5 days and here’s how it was.” Most of the time, the first impression is superficial: it discusses syntax and whether you could get Hello World to run.

This blogpost is “I tried Haskell for 5 years and here’s how it was.”

In the last few years, I have been (with others) developing ngless, a domain-specific language and interpreter for next-generation sequencing. For partly accidental reasons, the interpreter is written in Haskell. Even though I kept using other languages (mostly Python and C++), I have now used Haskell quite extensively for a serious, medium-sized project (11,270 lines of code). Here are some scattered notes on Haskell:

There is a learning curve

Haskell is a different type of language. It takes a while to fully get used to it if you’re coming from a more traditional background.

I have debugged code in Java, even though I never really learned (or wrote) any Java. Java is just a C++ pidgin language.

The same is not true of Haskell. If you have never looked at Haskell code, you may have difficulty following even simple functions.

Once you learn it, though, you get it.

Haskell has some very nice libraries

You really have very nice libraries, written by people doing really useful things.

Conduit and Parsec are the basis of a lot of ngless code.

Here is an excellent curated list of the Haskell library world (added May 4)

Haskell libraries are sometimes hard to figure out

I like to think that you need both hard documentation and soft documentation.

Hard documentation is where you describe every argument to a function and its effects. It is like a reference work (think of man pages). Soft documentation is tutorials, examples, and more descriptive text. Well documented software and libraries will have both (there is no need for anything in between; I don’t want soft serve documentation).

Haskell libraries often have extremely hard documentation: they will explain the details of functions, but little in the way of soft documentation. This makes it very hard to understand why a function could be useful in the first place and in which contexts to use this library.

This is exacerbated by the often extremely abstract nature of some of the libraries. A case in point is the very useful MonadBaseControl class. Trust me, this is useful. However, because it is so generic, it is hard to immediately grasp what it does.

I do not wish to over-generalize. Conduit, mentioned above, has tutorials, blogposts, as well as hard documentation.

Haskell sometimes feels like C++

Like C++, Haskell is (in part) a research project with a single initial Big Idea and a few smaller ones. In Haskell’s case, the Big Idea was purely functional lazy evaluation (or, if you want to be pedantic, call it “non-strict” instead of lazy). In C++’s case, the Big Idea was high level object orientation without loss of performance compared to C.

Both C++ and Haskell are happy to incorporate academic suggestions into real-world computer languages. This doesn’t need elaboration in the case of Haskell, but C++ has also been happy to be at the cutting edge. For example, 20 years ago, you could already use C++ templates to perform (limited) programming with dependent types. C++ really pioneered the mechanism of generics and templates.

Like C++, Haskell is a huge language, where there are many ways to do something. You have multiple ways to represent strings, you have accidents of history kept for backwards compatibility. If you read an article from 10 years ago about the best way to do something in the language, that article is probably outdated by two generations.

Like C++, Haskell’s error messages take a while to get used to.

Like C++, there is a tension in the community between the purists and the practitioners.

Performance is hard to figure out

Haskell and GHC generally let me get good performance, but it is not always trivial to figure out a priori which code will run faster and in less memory.

In some trivial sense, you always depend on the compiler to make your code faster (i.e., if the compiler was infinitely smart, any two programs that produce the same result would compile to the same highly efficient code).

In practice, of course, compilers are not infinitely smart and so there is faster and slower code. Still, in many languages you can look at two pieces of code and reasonably guess which one will be faster, at least within an order of magnitude.

Not so with Haskell. Even very smart people struggle with very simple examples. This is because the most generic implementation of the code tends to be very inefficient. However, GHC can be very smart and make your software very fast. This works 90% of the time, but sometimes you write code that does not trigger all the right optimizations and your function suddenly becomes 1,000x slower. I have once or twice written two almost identical versions of a function with large differences in performance (orders of magnitude).

This leads to the funny situation that Haskell is (partially correctly) seen as an academic language used by purists obsessed with elegance; while in practice, a lot of effort goes into making the code written as compiler-friendly as possible.

For the most part, though, this is not a big issue. Most of the code will run just fine and you optimize the inner loops at the end (just like in any other language), but it’s a pitfall to watch out for.

The easy is hard, the hard is easy

For minor tasks (converting between two file formats, for example), I will not use Haskell; I’ll do it in Python: it has a better REPL environment, there is no need to set up a cabal file, it is easier to express simple loops, &c. The easy things are often a bit harder to do in Haskell.

However, in Haskell, it is trivial to add some multithreading capability to a piece of code with complete assurance of correctness. The line that if it compiles, it’s probably correct is often true.

Stack changed the game

Before stack came on the scene, it was painful to make sure you had all the right libraries installed in a compatible way. Since stack was released, working in Haskell really has become much nicer. Tooling matters.

The really big missing piece is the equivalent of ccache for Haskell.


Haskell is a great programming language. It requires some effort at the beginning, but you get to learn a very different way of thinking about your problems. At the same time, the ecosystem matured significantly (hopefully signalling a trend) and the language can be great to work with.

(Computer-programming) language wars a bit silly, but not irrational

I don’t know where I heard it (and it was probably not first hand), but someone
observed how weird it is that, in the 21st century, computer professionals
segregate by the language they use to talk to the machine. It just seems silly, doesn’t it?

Programming language discussions (R vs Python for data science, C++ or Python
for computer vision, Java or C# or Ruby for webapps, …) are a staple of
geekdom and easy to categorize as silly. In this short post, I’ll argue that,
while silly, they are not completely irrational.

Programming languages are mostly about tooling

Some languages are better than others, but most of what matters is not
whether the language itself is any good, but how large the ecosystem around it
is. You can have a perfect language, but if there is no support for it in your
favorite editor/IDE and no good HTTPS libraries which can handle HTTP2.0, then
working in it will be inefficient, or even less pleasant than working in Java. On
the other hand, PHP is a terrible terrible language, but its ecosystem is (for
its limited domain) very nice. R is a slightly less terrible version of this: not a great language, but a lot of nice libraries and a good culture of documentation.

Haskell is a pretty nice programming language, but working in it got much nicer
once stack appeared on the scene. The
language is the same, even the set of libraries is the same, but having a
better way to install packages is enough to fundamentally change your experience.

On the other hand, Haskell is (still?) enough of a niche language that nobody
has yet written a tool comparable to ccache for
the C/C++ world (instantaneous rebuilds are amazing for a compiled language).

The value of your code increases if you program in a popular language

This is not strictly true: if the work is self-contained, then it may be very
useful on its own even if you wrote it in COBOL, but often the more people can
build upon your work, the more valuable that work is. So if your work is
written in C or Python as opposed to Haskell or Ada, everything else being
equal, it will be more valuable (not everything else is equal, though).

This is somewhat field-dependent. Knowing R is great if you’re a
bioinformatician, but almost useless if you’re writing webserver code. Even
general-purpose languages get niches based on history and tools. Functional
programming languages somehow seem to be more popular in the financial sector
than in other fields (R has a lot of functional elements, but is not typically
thought of as a functional language; probably because functional languages are
“advanced” and R is “for beginners”).

Still, a language that is popular in its field will make your own code more
valuable. Packages upon which you depend will be more likely to be maintained,
tools will improve. If you release a package yourself, it will be more used
(and, if you are in science, maybe even cited).

Changing languages is easy, but costly

Any decent programmer can “pick up” a new language in a few days. I can
probably even debug code in any procedural language even without having ever
seen it before. However, to really become proficient, it often takes much
longer: you need to encounter and internalize the most natural way to do things
in the new language, the quirks of the interpreter/compiler, learn about
different libraries and tools, &c. None of this is “hard”, but it all takes a
long time.

Programming languages have network effects

This is all a different way of saying that programming languages have network
effects. Thus, if I use language X, it is generally better for me if others
also use it. This is not always stated explicitly, but I think it is the rationale for the programming language discussions.

Scipy’s mannwhitneyu function

Without looking it up, can you say what the following code does:

import numpy as np
from scipy import stats
a = np.arange(25)
b = np.arange(25)+4
print(stats.mannwhitneyu(a, b))

You probably guessed that it computes the Mann-Whitney test between two samples, but exactly which test? The two-sided or the one-sided test?

You can’t tell from the code because it depends on which version of scipy you are running and it has gone back and forth between the two! Pre-0.17.0 it used the one-sided test with the side being decided based on the input data. This was obviously the wrong thing to do. Then, the API was fixed in 0.17.0 to do the two-sided test. This was considered a bad thing because it broke backwards compatibility and now it’s back to performing the one-sided test! I wish I was making this up. 

Reading through the github issues (#4933, #6034, #6062, #6100) is an example of how open source projects can stagnate. There is a basic, simple solution to the issue: create a corrected version of the function with a new name and deprecate the old one. This keeps backwards compatibility while allowing the project to fix its API. Once the issue had been identified, this should have been a 20 minute job. Reading through the issues, this simple solution is proposed, discussed, and seemingly agreed to. Instead, something else happens and, at this point, it’d take me longer than 20 minutes just to read through the whole discussion.
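As a rough illustration of the “new name, deprecate the old one” approach, here is a minimal sketch. The two function names are hypothetical (they are not part of scipy), and the body of the old function is only a stand-in for a data-dependent one-sided test, not a reproduction of scipy’s historical code:

```python
import warnings

import numpy as np
from scipy import stats


def mannwhitneyu_two_sided(x, y):
    """Hypothetical new name: always performs the two-sided test."""
    return stats.mannwhitneyu(x, y, alternative="two-sided")


def mannwhitneyu_old(x, y):
    """Hypothetical old name: behaviour unchanged, but callers are warned."""
    warnings.warn(
        "mannwhitneyu_old is deprecated; use mannwhitneyu_two_sided instead",
        DeprecationWarning,
        stacklevel=2,
    )
    # Stand-in for the legacy behaviour: a one-sided test whose side is
    # chosen from the data (here, whichever side gives the smaller p-value).
    less = stats.mannwhitneyu(x, y, alternative="less")
    greater = stats.mannwhitneyu(x, y, alternative="greater")
    return less if less.pvalue <= greater.pvalue else greater


a = np.arange(25)
b = np.arange(25) + 4

# Record warnings so we can see that old callers get a deprecation notice.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    old_result = mannwhitneyu_old(a, b)
new_result = mannwhitneyu_two_sided(a, b)

print(old_result.pvalue, new_result.pvalue)
print([str(w.message) for w in caught])
```

Existing code keeps getting the same numbers as before (plus a warning), while new code can opt into the corrected behaviour by name.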

This is not the first time I have run into numpy/scipy’s lack of respect for backwards compatibility either. Fortunately, there is a solution to this case, which is to use the full version:

stats.mannwhitneyu(a, b, alternative='two-sided')
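To make the difference concrete, the sketch below runs the same two samples through each value of the `alternative` argument (this assumes a scipy version recent enough to accept it). It only compares the p-values with each other; the exact numbers depend on the scipy version:

```python
import numpy as np
from scipy import stats

a = np.arange(25)
b = np.arange(25) + 4

# Spell out the alternative hypothesis instead of relying on the default.
p_two_sided = stats.mannwhitneyu(a, b, alternative="two-sided").pvalue
p_less = stats.mannwhitneyu(a, b, alternative="less").pvalue
p_greater = stats.mannwhitneyu(a, b, alternative="greater").pvalue

print("two-sided:", p_two_sided)
print("less:     ", p_less)
print("greater:  ", p_greater)
```

Since `a` is shifted below `b`, the `less` alternative gives the smallest p-value and the two-sided p-value is larger, which is exactly the ambiguity the bare call hides.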

Anscombe’s Quartet Animated

Anscombe’s Quartet is a set of four 2D datasets which have the same mean and variance in both X & Y as well as the same relationship between the two variables, even though they look very different.

I built a little animation to show all four datasets and a smooth transition between them:

Animation showing Anscombe’s Quartet

The black line is the mean Y value and the two dotted lines represent the mean ± std dev.; the blue line is the least-squares regression between x and y. These are recomputed at each frame. In a sense, all the frames are like Anscombe sets.
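For readers who want to check the shared statistics themselves, here is a small sketch that computes the mean, standard deviation, and least-squares line for the four standard datasets. It is not the animation script; the data values are the commonly published ones for the quartet and `summarize` is a helper name made up for this example:

```python
import numpy as np

# x values shared by datasets I-III, and the separate x values of dataset IV.
x123 = np.array([10, 8, 13, 9, 11, 14, 6, 4, 12, 7, 5], dtype=float)
x4 = np.array([8, 8, 8, 8, 8, 8, 8, 19, 8, 8, 8], dtype=float)

quartet = {
    "I": (x123, np.array([8.04, 6.95, 7.58, 8.81, 8.33, 9.96,
                          7.24, 4.26, 10.84, 4.82, 5.68])),
    "II": (x123, np.array([9.14, 8.14, 8.74, 8.77, 9.26, 8.10,
                           6.13, 3.10, 9.13, 7.26, 4.74])),
    "III": (x123, np.array([7.46, 6.77, 12.74, 7.11, 7.81, 8.84,
                            6.08, 5.39, 8.15, 6.42, 5.73])),
    "IV": (x4, np.array([6.58, 5.76, 7.71, 8.84, 8.47, 7.04,
                         5.25, 12.50, 5.56, 7.91, 6.89])),
}


def summarize(x, y):
    """Mean and sample std dev of y, plus the least-squares line y = slope*x + intercept."""
    slope, intercept = np.polyfit(x, y, 1)
    return {
        "mean_x": x.mean(),
        "mean_y": y.mean(),
        "std_y": y.std(ddof=1),
        "slope": slope,
        "intercept": intercept,
    }


summaries = {name: summarize(x, y) for name, (x, y) in quartet.items()}
for name, s in summaries.items():
    print(name, {key: round(float(value), 2) for key, value in s.items()})
```

All four rows come out essentially the same, even though the scatter plots look nothing alike.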


The script for generating these is on github. I enjoyed playing around with theano for easy automatic differentiation (these types of derivatives are easy, but somehow I always get a sign wrong or a factor of 2 missing on the first try).

The UK medal count really is impressive, the US is just as expected

Repeating my analysis from last week on medal counts. To recap, let’s look for models that predict a country’s medal count based on GDP/population and then check which countries over- or under-perform their size and wealth. In the end, simple total GDP at market rates was the best predictor.

Measured by ratio of obtained medals to predicted medals, Russia was still overperforming (and they got banned from several of the events, so that is very impressive, although we’ll never know which of the other sports they should have been banned from). Interestingly, we also see several Caucasian countries showing up. And the big winner of this year’s Olympics, Great Britain, does show up as getting many more medals than their GDP predicts.

Finally, note that France, which was underperforming at the beginning of the Olympics, not only caught up, but made it to the over-performers table (très bien, la France!).

Over performing countries
                   delta  got  predicted     ratio
Russia         41.352770   56  14.647230  3.823248
Azerbaijan     11.717356   18   6.282644  2.865036
Great Britain  42.256925   67  24.743075  2.707828
New Zealand    10.872079   18   7.127921  2.525280
Kazakhstan      9.783999   17   7.216001  2.355875
Hungary         8.201280   15   6.798720  2.206298
Kenya           6.573411   13   6.426589  2.022846
Uzbekistan      6.550104   13   6.449896  2.015536
Australia      13.899721   29  15.100279  1.920494
France         19.639542   42  22.360458  1.878316

Note that neither the US nor China show up. If anything, they are performing slightly below expectations.
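The delta and ratio columns appear to be simple functions of the other two: delta is medals obtained minus medals predicted, and ratio is obtained divided by predicted. A minimal sketch that recomputes them for three rows copied from the table above (the function name is made up for this example, and the model that produces the predictions is not reproduced here):

```python
# (country, medals obtained, medals predicted), copied from the table above.
rows = [
    ("Russia", 56, 14.647230),
    ("Great Britain", 67, 24.743075),
    ("France", 42, 22.360458),
]


def over_performance(got, predicted):
    """Return (delta, ratio): the excess medals and the obtained/predicted ratio."""
    delta = got - predicted
    ratio = got / predicted
    return delta, ratio


results = {country: over_performance(got, predicted)
           for country, got, predicted in rows}

# Sort by ratio, largest first, as in the over-performers table.
for country, (delta, ratio) in sorted(results.items(),
                                      key=lambda item: item[1][1],
                                      reverse=True):
    print(f"{country:15s} delta={delta:10.6f} ratio={ratio:.6f}")
```

Sorting by ratio rather than delta is what lets smaller countries such as Azerbaijan rank above Great Britain despite a much smaller absolute excess.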

Now, for the bottom half:

Under performing countries
                          delta  got  predicted     ratio
India                -18.620390    2  20.620390  0.096991
Nigeria               -8.497863    1   9.497863  0.105287
Austria               -7.754915    1   8.754915  0.114222
United Arab Emirates  -7.728790    1   8.728790  0.114563
Singapore             -7.190392    1   8.190392  0.122094
Philippines           -7.185019    1   8.185019  0.122174
Finland               -6.753509    1   7.753509  0.128974
Portugal              -6.539120    1   7.539120  0.132641
Qatar                 -6.316772    1   7.316772  0.136672
Puerto Rico           -5.873931    1   6.873931  0.145477

India got two medals (neither of them gold: one silver and one bronze) even though they are on track to becoming one of the world’s largest economies (right now, their GDP is comparable to Italy’s, but growing fast, while Italy is stagnant).

Several oil countries (unearned wealth) are listed there. The 3 richest countries not to win a medal at all are Saudi Arabia, Pakistan, and Chile; another trio of resource rich countries.


You can run the whole analysis on a mybinder repo.

At the Olympics, the US is underwhelming, Russia still overperforms, and what’s wrong with Southern Europe (except Italy)?

Russia is doing very well. The US and China, for all their dominance of the raw medal tables, are actually doing just as well as you’d expect.

Portugal, Spain, and Greece should all be upset at themselves, while the fourth little piggy, Italy, is doing quite alright.

What determines medal counts?

I decided to play a data game with Olympic Gold medals and ask not just “Which countries get the most medals?” but a couple of more interesting questions.

My first guess of what determines medal counts was total GDP. After all, large countries should get more medals, but economic development should also matter. Populous African countries do not get that many medals after all and small rich EU states still do.

Indeed, GDP (at market value) does correlate quite well with the weighted medal count (an artificial index where gold counts 5 points, silver 3, and bronze just 1).
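The weighted index is easy to write down. A minimal sketch, where the function name and the example tally are made up for illustration:

```python
def weighted_medal_count(gold, silver, bronze):
    """Weighted index from the text: gold counts 5 points, silver 3, bronze 1."""
    return 5 * gold + 3 * silver + 1 * bronze


# A made-up tally of 1 gold, 2 silver, and 3 bronze medals.
example_score = weighted_medal_count(gold=1, silver=2, bronze=3)
print(example_score)  # 5*1 + 3*2 + 1*3 = 14
```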

Much of the fit is driven by the two left-most outliers: US and China, but the fit explains 64% of the variance, while population explains none.

Adding a few more predictors, we can try to improve, but we don’t actually do that much better. I expect that as the Games progress, we’ll see the model fits become tighter as the sample size (number of medals) increases. In fact, the model is already performing better today than it was yesterday.

Who is over/under performing?

The US and China are right on the fit above. While they have more medals than anybody else, it’s not surprising. Big and rich countries get more medals.

The more interesting question is: which are the countries that are getting more medals than their GDP would account for?

Top 10 over performers

These are the 10 countries which have a bigger ratio of actual total medals to their predicted number of medals:

                delta  got  predicted     ratio
Russia       6.952551   10   3.047449  3.281433
Italy        5.407997    9   3.592003  2.505566
Australia    3.849574    7   3.150426  2.221921
Thailand     1.762069    4   2.237931  1.787366
Japan        4.071770   10   5.928230  1.686844
South Korea  1.750025    5   3.249975  1.538473
Hungary      1.021350    3   1.978650  1.516185
Kazakhstan   0.953454    3   2.046546  1.465884
Canada       0.538501    4   3.461499  1.155569
Uzbekistan   0.043668    2   1.956332  1.022322

Now, neither the US nor China are anywhere to be seen. Russia’s performance validates their state-funded sports program: the model predicts they’d get around 3 medals, they’ve gotten 10.

Italy is similarly doing very well, which surprised me a bit. As you’ll see, all the other little piggies perform poorly.

Australia is less surprising: they’re a small country which is very much into sports.

After that, no country seems to get more than twice as many medals as their GDP would predict, although I’ll note how Japan/Thailand/South Korea form a little East Asian cluster of overperformance.

Top 10 under performers

This brings up the reverse question: who is underperforming? Southern Europe, it seems: Spain, Portugal, and Greece are all there with 1 medal against predictions of 9, 6, and 6.

France is the country which is missing the most medals (12 predicted vs 3 obtained)! Sometimes France does behave like a Southern European country after all.

                delta  got  predicted     ratio
Spain       -8.268615    1   9.268615  0.107891
Poland      -6.157081    1   7.157081  0.139722
Portugal    -5.353673    1   6.353673  0.157389
Greece      -5.342835    1   6.342835  0.157658
Georgia     -4.814463    1   5.814463  0.171985
France      -9.816560    3  12.816560  0.234072
Uzbekistan  -3.933072    2   5.933072  0.337093
Denmark     -3.566784    3   6.566784  0.456845
Philippines -3.557424    3   6.557424  0.457497
Azerbaijan  -2.857668    3   5.857668  0.512149

The Caucasus (Georgia, Uzbekistan, Azerbaijan) may show up as their wealth is mostly due to natural resources and not development per se (oil and natural gas do not win medals, while human capital development does).

I expect that these lists will change as the Games go on as maybe Spain is just not as good at the events that come early in the schedule. Expect an updated post in a week.

Technical details

The whole analysis was done as a Jupyter notebook, available on github. You can use mybinder to explore the data. There, you will even find several little widgets to play around with.

Data for medal counts comes from the medalbot.com API, while GDP/population data comes from the World Bank through the wbdata package.

Should “we” prefer more expensive medical treatments?

Imagine a disease with two possible treatments: Treatment A works OKish and is cheap (50-500 USD per treatment).

Treatment B works as well as A (studies typically find no statistical difference, if anything B sometimes performs slightly worse), but is 10-100x more expensive than treatment A (1-10k USD per treatment).

Some patients prefer A and others prefer B as the side-effects and compliance requirements are different (none are very serious medically, but they cause some annoyances). Some doctors are agnostic between A and B, while others tend to prefer A or B. The doctors who prefer and recommend B will often get some of the extra cash that B generates (but not always).


  • Should A be preferred to B?
  • Should the extra cost of B be supported by the patient themselves (as opposed to insurance, private or public, covering both A and B at similar reimbursement rates)?
  • Should B be allowed by regulation?

Now, the hard questions (with comments):

  • Does it matter if the disease is “major depression”, A is “anti-depressants”, and B is “cognitive behavioral therapy” (or other forms of talk-therapy)?

This is one specific context in which I keep hearing the argument that “B should be recommended as it’s as good as A”. It just seems like a very weak argument. SSRIs or other pharmacological anti-depressants should be the default treatment as they have lower costs.

Generic SSRIs are about 10$/month; no therapist will give you a comparable rate. In fact, the major cost of generic SSRI therapy is whatever the physician charges to renew the prescription.

Perhaps if the technology for bot-based cognitive behavioral therapy catches up, then CBT may become as cheap as medication.

  • What if A is “physiotherapy” and B is “back surgery”?

Same: the cheaper treatment should be the default.