Grit: a non-existent alternative interface to git

This post describes grit, a vaporware alternative interface to git (grit does not exist).

There are a few alternative interfaces to git, but they typically make the same mistake: they reason that git is complex because it supports many workflows that are only rarely useful, that 10% of the functionality would fill the needs of 90% of users, and that git's separation of mechanism and policy is confusing. They then conclude that a more opinionated tool with built-in support for a small number of common workflows would be better.

I think this throws out the baby with the bathwater: git's flexibility is its strength; its real problem is that the user interface is awful, which is a conceptually simpler thing to fix. Building a less powerful version also ensures that the system is not fully compatible with git and, therefore, I cannot recommend it to students as a "better git".

Here, I describe an alternative, in the form of an imaginary tool called grit (named for the most important quality in a student learning git). Grit could be just an alternative interface to a git repository: completely, 100% compatible, enabling everything that git already does. It would add only one important piece of extra complexity to git's model (the undo history described below).

Here is a part of what is wrong with git and how grit fixes it:

1. The git subcommand names are a mess

There has been progress on this front with the introduction of git switch and friends, which fixes the worst offender (the overloading of git checkout to mean a million different operations), but it's still a mess.

Grit uses a multi-subcommand interface. For example:

  • grit branch create
  • grit branch list
  • grit branch delete and, if there are unmerged commits: grit branch delete --force
  • grit branch rename
  • and so on… In particular, grit branch prints a help message.

Deleting a remote branch in grit is grit branch delete-remote origin/mytopic (as opposed to git push origin :mytopic — seriously, who can ever discover that interface?). Can you guess what grit tag delete-remote does?
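
Since grit does not exist, here is only a sketch of how such a noun-verb command layout could be wired up (Python's argparse is an arbitrary choice here, and every name shown is hypothetical):

import argparse

def build_parser():
    parser = argparse.ArgumentParser(prog="grit")
    nouns = parser.add_subparsers(dest="noun", required=True)

    branch = nouns.add_parser("branch", help="manage branches")
    verbs = branch.add_subparsers(dest="verb")  # no verb: fall through to help below
    verbs.add_parser("create")
    verbs.add_parser("list")
    delete = verbs.add_parser("delete")
    delete.add_argument("--force", action="store_true",
                        help="delete even if there are unmerged commits")
    verbs.add_parser("rename")
    verbs.add_parser("delete-remote")  # e.g. grit branch delete-remote origin/mytopic

    internal = nouns.add_parser("internal", help="plumbing; not for casual use")
    internal.add_subparsers(dest="verb").add_parser("rev-list")
    return parser

parser = build_parser()
args = parser.parse_args()
if getattr(args, "verb", None) is None:  # a bare `grit branch` prints a help message
    parser.parse_args([args.noun, "--help"])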

Git has the concept of porcelain vs. plumbing, which is actually a pretty great concept, but packaged very badly. This is the whole theme of this post: git-the-concept is very good, git-the-command-line-tool is very bad (or, using the convention that normal font git refers to the concept, while monospaced git refers to the tool: git is good, but git is bad).

In grit, all the plumbing commands are a subcommand of internal:

  • grit internal rev-list
  • grit internal ls-remote

Everything is still there, but casual users clearly see that this is not for them.

2. Grit has a grit undo subcommand

Not only is this perhaps the number one issue that people ask about with git, it is absurd that it should be so for a version control system! Fast and easy undo is a major selling point of version control, but with git, undoing an action takes some magical combination of git reflog/git reset and despair. In grit, it's:

grit undo

That's it! If the working space has not been changed since the last grit command, then it brings the working space (and the internal git state) back to where it was before you ran that command. grit anything-at-all && grit undo is always a no-op.

This requires a new concept (similar to stash) in git to store the undo history, but undo is such an obvious sore point that it's worth the extra complexity.

Technicalities: Using the option --no-undo means that the command should not generate an undo history entry. While keeping track of undo history is often cheap, there are a few exceptions. For example, grit branch delete-remote requires one to fetch the remote branch to be able to undo its deletion later and --no-undo skips that step for speed. If commits would be lost, the user is prompted for confirmation (unless --force is used, in which case, the branch is deleted, promptly, forcefully, and forever).
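
For concreteness, here is one way the undo history entry could be recorded before each command. This is only a sketch (grit and record_undo_entry are hypothetical), but the git commands it shells out to are real:

import subprocess

def git(*args):
    """Run a git command and return its stripped stdout."""
    return subprocess.run(["git", *args], check=True,
                          capture_output=True, text=True).stdout.strip()

def record_undo_entry():
    """Snapshot enough state that a later `grit undo` could restore it."""
    return {
        # the current branch name, or "HEAD" if detached
        "head": git("rev-parse", "--abbrev-ref", "HEAD"),
        # which commit every ref (branches, tags, remotes) points at
        "refs": git("for-each-ref", "--format=%(refname) %(objectname)"),
        # a dangling commit capturing the working tree and index,
        # as produced by `git stash create` (empty if there is nothing to save)
        "worktree": git("stash", "create"),
    }

Undoing would then be a matter of pointing each ref back at its recorded commit and re-applying the saved working-tree commit; --no-undo would simply skip record_undo_entry.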

3. Grit has no multi-command wizards by default

Wizards are typically associated with graphical interfaces: a series of menus where the user inputs all the information needed for a complex task.

Amazon’s checkout wizard is one many of us use regularly, but here is an example from the Wikipedia page on Wizards:

[Image: the Kubuntu install wizard]

On the command line there are two ways to build wizards: (i) you open a command-line dialog (do you want to continue? [Y/n] and so on) or (ii) you require multiple command invocations and keep state between them. Regular git has both of these, but it prefers (ii), which is the more complicated of the two.

For an example of (ii): if you rebase and there is a conflict, it will drop you into the shell, expect you to fix the conflict, require a few git add operations, then git rebase --continue. Many git commands take --continue, which is a sure sign of a wizard.

For an example of (i), you can use git add -p: it will query you on the different chunks and, at the end, execute the operation. git add -p is actually great in that, even if you have already accepted some chunks, quitting with CTRL-C cancels the whole operation.

Grit also has both, but prefers (i) whenever possible. If there is a conflict, it will start an interface similar to the existing git add -p and work chunk by chunk. It can start a subshell if you need more time, or leave the state in suspended animation (as happens now with git), but that is not the default. If you abort the operation (CTRL-C, for example), it cancels everything and leaves you in the same situation as before you started the operation.
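
As a sketch of that style-(i) behaviour (entirely hypothetical; the only point is that CTRL-C restores the pre-operation state):

def resolve_conflicts(conflicted_chunks, take_snapshot, restore_snapshot,
                      apply_resolution):
    """Walk through conflicted chunks interactively; CTRL-C undoes everything.

    All four arguments stand in for hypothetical grit internals.
    """
    snapshot = take_snapshot()
    try:
        for chunk in conflicted_chunks:
            print(chunk)
            answer = input("keep ours, theirs, or edit by hand? [o/t/e] ")
            apply_resolution(chunk, answer)
    except KeyboardInterrupt:
        restore_snapshot(snapshot)  # back to exactly where you started
        print("\nAborted: the repository was left as it was.")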

Arguably, the use of the index (staging area) can be seen as a form of a commit wizard, but it's so fundamental to git that grit keeps it.

4. Grit performs no destructive actions without the --force flag!

With git, it’s impossible to know whether a command will destroy anything. For example, when merging, it may or may not work:

If there is a conflict, it will clobber your files with those awful <<<<< lines!

This happens with a lot of git commands: git checkout may or may not overwrite your changes (causing them to be thrown away). In fact, anything that causes a merge may lead to the conflict situation.

With grit, if something is non-trivial to merge or would potentially destroy an existing file, it will either (1) refuse to do it (or require confirmation) or (2) open a wizard immediately. For example, merge conflicts result in a wizard being called to merge them. If you want the old-school clobbering, you can always choose the --conflict-to-file option (on the command line or in the wizard itself).

Final words

Grit does not exist, so we cannot know whether git's problems really are mostly at the surface or if a deeper redesign really is necessary. Maybe something like grit will be implemented, and it will turn out that it is still a usability nightmare. However, one needs to square the git circle: how did it win the version control wars when it is so confusing? It was not on price and, particularly for the open-source world, it was not by management imposition. My answer remains that git's power and flexibility (which derive from its model as a very flexible, enhanced quasi-filesystem under the hood) are a strength worth knowing about and climbing the learning curve for, but git's command-line interface is an atrocious mess of non-design.

Meta-ethics as an empirical question

The fundamental question of meta-ethics (what is the nature of ethical judgements? are they, in some sense, real?) is an empirical one: if independently derived intelligences converge on a set of ethical statements, that will be evidence that those statements are real.

A good expression of this view comes from Iain M. Banks's Culture series, namely The Hydrogen Sonata, in what the author calls the Argument of Increasing Decency:

There was also the Argument of Increasing Decency, which basically held that cruelty was linked to stupidity and that the link between intelligence, imagination, empathy and good-behavior-as-it-was-generally-understood — i.e., not being cruel to others — was as profound as these matters ever got.

In fact, it is not this particular quote that best illustrates the argument, but the Culture series as a whole, which can be seen as a sci-fi rendition of Fukuyama's End of History¹.

A not-so-uncommon knee-jerk reaction to the fear of super-intelligent AI is to quip "well, if the AI does get so intelligent, then it will not be so aggressive." Frankly, I admire your dedication to realist meta-ethics, but I am not sure we should bet our whole civilization on it.

I think most people are strong moral realists. Many who call themselves relativists turn out, with only modest probing, to be rigid realists, who don’t even understand the question and think that the only dimensions along which variation is reasonable are relatively shallow cultural practices. Very few would go as far as to say that “paper-clipping the universe is a goal just as valid as trying to achieve a more egalitarian society where multiple people flourish”.

If super-intelligent AIs, when they inevitably appear², do share some moral intuitions with human scripture, then I think we can say that meta-ethics is empirically solved and the realist side will have won. Otherwise, we'll all be dead and the whole question will be a bit irrelevant.


  1. The author of the series would disagree, partially because Fukuyama includes a lightly regulated free market as part of his End of History, while the Culture series attempts a more Marxist view of history, with communism being the pinnacle of civilization. However, not only should we not trust authors too much when they discuss their own work (given that they are so likely to be biased), but, implicitly, the series agrees with E. O. Wilson's comment about Communism ("Great system, wrong species"; he meant that communism is great for ants, not for humans), as the Culture is run by powerful super-computers (Minds) and humans are, basically, pets (see this quote from Surface Detail: "Though drones, avatars and even humans are one thing; the loss of any is not without moral and diplomatic import, of course, but might be dismissed as merely unfortunate and regrettable, something to be smoothed over through the usual channels. Attacking a ship, on the other hand, is an unambiguous act of war."). The books are also at their most insightful when they argue forcefully that the West (the Culture) may tremble a bit when faced with some violent religious fanatics, but that it is actually more militaristic and less decadent than those religious lunatics believe (and than it itself thinks), so that, in the end, the Islamic State (represented by the Idirans) doesn't really stand a chance.
  2. I am not predicting that this will happen any time soon. But I don't see why it shouldn't happen in the next few centuries: comparing our knowledge and technological abilities today with those of a millennium ago, it seems more reasonable to posit super-human AI by the year 3000 than to deny its possibility.

How Notebooks Should Work

Joel Grus’ presentation on why he does not like notebooks sparked a flurry of notebook-related discussion.

I like the idea of notebooks more than I like actual notebooks. I tried to use them in my analyses for a long time, but eventually gave up as there are too many small annoyances (some that the talk goes over, others that it does not, such as the fact that they do not integrate well with git).

Here is how I think they should work instead:

  1. There is no hidden state. Cells are always run from top to bottom.
  2. If you change a cell in the middle, its output and the output of every cell below it are immediately cleared; when you run again, everything runs from the top.

For example:

[1] : Code
Output

[2] : Code
Output

[3] : Code
Output

[4] : Code
Output

[5] : Code
Output

Now, if you edit Cell 3, you would get:

[1] : Code
Output

[2] : Code
Output

[3] : New Code
New Output

[ ] : Code

[ ] : Code

If you want, you can run the whole thing now and get the full output:

[1] : Code
Output

[2] : Code
Output

[3] : New Code
New Output

[4] : Code
New Output

[5] : Code
New Output

This way, the whole notebook is always up to date.

But won’t this be incredibly slow if you always have to run it from the top?

Yes, if you implement it naïvely, with a kernel that really does always re-run from the top, it is not likely to be usable. But you could do a bit of smart caching and keep some intermediate states alive. It would require some engineering, but I think you could keep a few live kernels in intermediate states to make the experience usable: if you edit cell number 35, the kernel does not need to go back to the first cell; maybe there is a cached kernel that holds the state after cell 30, and only cells 31 and onwards need to be rerun.
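
As a sketch of the caching idea (hypothetical; none of this is an actual Jupyter API): keep a few checkpointed kernel states indexed by how many cells they have executed, invalidate the stale ones on edit, and resume from the closest surviving checkpoint:

def rerun_from_edit(cells, edited_index, cached_states, fresh_kernel, run_cell):
    """Re-run the notebook after cells[edited_index] was edited.

    cached_states maps k -> a kernel state saved after executing cells[0..k-1];
    fresh_kernel() and run_cell(state, cell) stand in for the real machinery.
    """
    # Any checkpoint that already executed the edited cell is now stale.
    for k in list(cached_states):
        if k > edited_index:
            del cached_states[k]

    # Resume from the closest checkpoint at or below the edit, or from scratch.
    usable = [k for k in cached_states if k <= edited_index]
    start = max(usable, default=0)
    state = cached_states[start] if usable else fresh_kernel()

    # Re-run everything from that point down, so no hidden state survives.
    for i in range(start, len(cells)):
        state = run_cell(state, cells[i])
    return state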

It would take a lot of engineering and it may even be impossible with the current structure of jupyter kernels, but, from a human point-of-view, I think this would be a better user experience.

A day has ~10⁵ seconds

This is trivial, but I make use of this all the time when doing some back-of-the-envelope calculation on whether some computation is going to take a few days or a few years:

A day has about 10⁵ seconds (24*60*60 = 86,400 ≈ 100,000).

So, if an operation takes 1 ms of CPU time, then you can do 10⁸ operations per day on one CPU and ~10¹¹ on 1,000 CPUs.
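
The same estimate as a few lines of Python:

SECONDS_PER_DAY = 24 * 60 * 60                # 86,400, i.e. roughly 1e5
ops_per_day = SECONDS_PER_DAY / 1e-3          # at 1 ms per operation: ~1e8
ops_per_day_1000_cpus = ops_per_day * 1000    # ~1e11
print(f"{SECONDS_PER_DAY:,} s/day; {ops_per_day:.0e} ops/day on one CPU; "
      f"{ops_per_day_1000_cpus:.0e} ops/day on 1,000 CPUs")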

Python’s Weak Performance Matters

Here is an argument I used to make, but now disagree with:

Just to add another perspective, I find many “performance” problems in
the real world can often be attributed to factors other than the raw
speed of the CPython interpreter. Yes, I’d love it if the interpreter
were faster, but in my experience a lot of other things dominate. At
least they do provide low hanging fruit to attack first.

[…]

But there’s something else that’s very important to consider, which
rarely comes up in these discussions, and that’s the developer’s
productivity and programming experience.[…]

This is often undervalued, but shouldn’t be! Moore’s Law doesn’t apply
to humans, and you can’t effectively or cost efficiently scale up by
throwing more bodies at a project. Python is one of the best languages
(and ecosystems!) that make the development experience fun, high
quality, and very efficient.

(from Barry Warsaw)

I used to make this argument. Some of it is just a form of utilitarian programming: having a program that runs 1 minute faster but takes 50 extra hours to write is not worth it unless you run it >3000 times, and for code that is written as part of a data analysis, this is rarely the case. However, I now think the argument is not as strong as I previously believed: the fact that CPython (the only widely used Python interpreter) is slow is a major disadvantage of the language, not just a small tradeoff for faster development time.

What changed in my reasoning?

First of all, I'm working on other problems. Whereas I used to do a lot of work that was very easy to map onto numpy operations (which are fast, as they use compiled code), I now write a lot of code that is not straight numerics. And if I have to write it in standard Python, it is slow as molasses. I don't mean slow in the sense of "wait a couple of seconds", I mean "wait several hours instead of 2 minutes."
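
As a generic illustration of the kind of gap I mean (a toy example, not my actual code): summing squares in a plain Python loop versus handing the same work to numpy:

import timeit
import numpy as np

def sum_squares_python(values):
    total = 0.0
    for v in values:  # one interpreter dispatch per element
        total += v * v
    return total

data = np.random.rand(1_000_000)
data_list = data.tolist()

t_loop = timeit.timeit(lambda: sum_squares_python(data_list), number=10)
t_numpy = timeit.timeit(lambda: np.dot(data, data), number=10)
print(f"pure Python: {t_loop:.3f}s; numpy: {t_numpy:.4f}s; "
      f"roughly {t_loop / t_numpy:.0f}x faster")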

At the same time, data keeps getting bigger and computers come with more and more cores (which Python cannot easily take advantage of), while single-core performance is only slowly getting better. Thus, Python is a worse and worse solution, performance-wise.

Other languages have also demonstrated that it is possible to get good performance with high-level code (using JITs or very aggressive compile-time optimizations). Looking from afar, the core Python development group seems uninterested in these ideas. They regularly pop up in side projects: psyco, unladen swallow, stackless, shedskin, and pypy, the last being the only one still in active development. But for all the buzz they generate, they never make it into CPython, which still uses the same basic bytecode stack-machine strategy it used 20 years ago. Yes, optimizing a very dynamic language is not a trivial problem, but Javascript is at least as dynamic as Python and it has several JIT-based implementations.

It is true that programmer time is more valuable than computer time, but waiting for results to finish computing is also a waste of my time (I suppose I could do something else in the meanwhile, but context switches are such a killer of my performance that I often just wait).

I have also sometimes found that, in order to make something fast in Python, I end up with complex, almost unreadable, code. See this function for an example. The first time we wrote it, it was a loop-based function, directly translating the formula it computes. It took hours on a medium-sized problem (it would take weeks on the real-life problems we want to tackle!). Now, it's down to a few seconds, but unless you are much smarter than me, it's not trivial to read the underlying formula out of the code.

The result is that I find myself doing more and more things in Haskell, which lets me write high-level code with decent performance (still slower than what I get if I go all the way down to C++, but with very good libraries). In fact, part of the reason that NGLess is written in Haskell and not Python is performance. I still use Jug (Python-based) to glue it all together, but it is calling Haskell code to do all the actual work.

I now sometimes prototype in Python, then run a kind of race: I start running the analysis on the main dataset, while at the same time reimplementing the whole thing in Haskell. Then, I start the Haskell version and try to make it finish before the Python analysis completes. Many times, the Haskell version wins (even counting development time!).

Update: Here is a "fun" Python performance bug that I ran into the other day: deleting a set of 1 billion strings takes >12 hours. Obviously, this particular instance can be fixed, but this is exactly the sort of thing that I would never have done a few years ago. A billion strings seemed like a lot back then, but now we regularly discuss multiple terabytes of input data as "not a big deal". This may not apply to your settings, but it does to mine.

Update 2: Based on a comment I made on hackernews, this is how I summarize my views:

The main motivation is to minimize total time, which is TimeToWriteCode + TimeToRunCode.

Python has the lowest TimeToWriteCode, but a very high TimeToRunCode. TimeToWriteCode is roughly fixed, as it is a human factor (after the initial learning curve, I am not getting that much smarter). However, as datasets grow and single-core performance does not get better, TimeToRunCode keeps increasing, so it is more and more worth it to spend more time writing code to decrease TimeToRunCode. C++ would give me the lowest TimeToRunCode, but at too high a cost in TimeToWriteCode (not so much because of the language itself as because of the lack of decent libraries and package management). Haskell is (for me) a good tradeoff.
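
A back-of-the-envelope version of that tradeoff, with numbers that are entirely made up for illustration:

def total_hours(time_to_write, runs, time_per_run):
    return time_to_write + runs * time_per_run

# Small dataset: the quick-to-write, slow-to-run language wins easily.
print(total_hours(time_to_write=2, runs=5, time_per_run=0.2))    # e.g. Python: 3 hours
print(total_hours(time_to_write=20, runs=5, time_per_run=0.01))  # e.g. Haskell: 20.05 hours

# 100x more data: the run time dominates and the balance flips.
print(total_hours(time_to_write=2, runs=5, time_per_run=20))     # e.g. Python: 102 hours
print(total_hours(time_to_write=20, runs=5, time_per_run=1))     # e.g. Haskell: 25 hours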

This is applicable to my work, where we do use large datasets as inputs. YMMV.

The Dark Looking Glass (Black Mirror fan fiction)

I'm currently in the middle of Black Mirror's Season 4 and I feel that, despite it being probably the best show on TV, it is starting to repeat the same themes in a way that makes it predictable: brain implants, immersive video games, Siri's sister, &c.

Thus, I decided to think up a few episode ideas myself. I felt that the show has not really explored the possibility of better mind-altering drugs, so a couple of random ideas follow.

§

The Dark Looking Glass Episode 1
Title: What happens in Vegas, stays in Vegas.

A new drug has the following effect: you take it and it has no effects until you go to sleep. Then, it erases the memory of what happened that day, except for vague feelings and emotional affect.

People start taking it for parties. Las Vegas hotels have these organized parties where everyone has to take the drug so that nobody will remember what happens (except for hotel security, who stays sober, but is sworn to secrecy). Society becomes more and more conservative, with wild behaviour confined to these forgetful parties, which themselves become wilder. You fly to Vegas for drug-fueled, sexual bacchanals every few months. Then you forget the details of everything that happened, but still feel liberated and relaxed so you live a conservative lifestyle the rest of the year.

At one of these parties, a group of friends (in their 50’s, typical middle-class Americans) have their regular sexual orgy (“do you think this is what we do every time?” asks a woman to a man who is not her husband while they have sex).

Then, a freak accident kills the sober security so that everyone at the party is under the forgetful drug.

Once they realize that they are now truly and completely free, a husband kills a wife; a wife kills her lover. Another man is left for dead, but survives. People run, scramble, hide. Eventually, everyone collapses of exhaustion, thus triggering the drug's effect. The next day, they wake up afraid, terrified, but nobody knows why. The bodies are discovered and the police are called. Nobody knows who killed whom, or who beat up the barely-alive man on the floor. The police cannot make any headway either: the whole house is a mess, everyone was there, the murder weapons are at the bottom of the pool.

The orgy participants know that some of them killed others, that somebody beat somebody up, but nobody knows who the killers are (if there is more than one). As local police are both stuck and embarrassed (they were supposed to provide security and failed), the survivors are let go from Vegas and fly back to suburban Atlanta. To help their teenage kids with their homework.

The Dark Looking Glass, Episode 2
Title: Japan

Open with Lithium, by Nirvana.

Society adds a drug to the water that makes everyone nicer. Crime becomes almost non-existent, wars disappear (diplomatic solutions are sought and found), things are good. A group of natural-water advocates, however, starts drinking normal, non-drugged water. Think organic-eating yoga mom, not gun-wielding libertarian. They are accepted by society, as everyone is so nice and tolerant.

As this group of hippies lives without the drug, they talk about having more complete feelings and the euphoria. It takes months to years for the drug to fully wash out of your system, though. As it does, it becomes clear that adults having strong emotions without a lifetime of learning how to manage them is not a good idea and they become violent. Society is completely unprepared for it. The police are not armed, even with a stick, as disputes were always dealt with by reasoning before. Psychologically, nobody knows what to do.

The only solution is to force everyone to take the drug. After thousands more deaths, the makeshift police force does overwhelm the rebels and drug them.

Peace is restored.

No, research does not say that you produce more when working 40 hours per week

Last week, a debate flared up on twitter about working hours in academia, including the claim that it is irrational to work over 40 hours as output actually goes down. I do not believe this claim.

A few starting notes:

  1. I am happy to be contradicted with data, but too often I see this issue being discussed with links to web articles citing other web articles, finally citing studies which suffer from the issues listed below.
  2. Maximum output at work is traded off against other valid personal goals. It is fine to argue that you prefer to produce less and spend more time with family or have more hobbies. Seriously, it’s a good argument. I just want people to make it instead of claiming a free lunch.
  3. I'm using mIF (milli-Impact Factor points) as the unit of academic output below. This is a joke. If you want to talk about the impact factor, we can talk about it, but this is not what this post is about.
  4. I agree that presentialism (i.e., measuring or valuing how long people are present at a job) is an idiotic system (or cultural trait). This is an even worse system than measuring impact factor points. Again, this is not what this post is about.

I mostly think that every time a scientist says “Research shows…” and they’re wrong or using it to boost their political/personal beliefs, then anti-science activists deserve a point.

Measurement is hard

People lie about how much they work. They lie to conform to expectations and lies go in multiple directions. Thus, even though I do think that Americans (on average) work more than Europeans, I also think that Americans exaggerate how much they work and some workaholic Europeans exaggerate how much time they take off.

Cross-country studies will also often impute the legal work hours to workers in different countries even though these may not correspond to hours worked (officially, I work less now than during my PhD, but I actually work way more now).

Even well-meaning self-reports are terribly inaccurate. People count time spent at work even though they spent a lot of it on non-productive activities. It can even be hard to define the boundary between work and non-work. There is obvious work (me, writing a rebuttal letter to reviewers). There is obvious non-work (me, spending 30 minutes in the morning reading the newspaper online at my work desk). But there is a vast grey zone: me, reading about Haskell bioinformatics libraries, or me, writing a utility package in my free time that I end up using intensively at work. Often the obviously productive work ends up using ideas from the not-so-obviously productive bits.

This should lead us down the path of distrusting empirical studies. Not completely throwing them out the window, but being careful before claiming that “research shows …”.

It should also lead us to distrust the anecdotal reports of people who say they work 60 hours per week or those who have impressive CVs and claim to work only 35 hours and take long holidays.

What do you mean by productivity?

Often a game is played in these discussions with the word productivity, as it is not always clear whether it refers to output per hour or output per week. For the moment, let's be strict and use it in the output-per-hour sense.

Marginal productivity starts going down well before it turns negative. Thus, if you are optimizing for average productivity, you end up at a lower number of hours than if you are optimizing for total output. Here is what I mean (see an earlier post on the shape of this curve):

[Figure: marginal productivity as a function of hours worked]

Let's say that academics produce impact factor points (the example carries over to most other knowledge work). Because there are fixed time costs in academia (as in almost all knowledge work), the first hours of the week produce 0 mIF. It will depend on the exact situation, but 10 hours a week can easily be spent on maintenance work (up to 20 or 30 if one is not careful). Then, the very productive hours produce 15 mIF/hour. As more hours are worked, one becomes tired, and the additional hours produce less than 15 mIF (thus, marginal productivity is diminishing). Taken to the extreme, our academic becomes so tired that he cannot produce anything at all, or even produces negative mIF (for example, by disrupting other people's projects).

If you are hiring people by the hour, you want them to work to the point where output/hour is optimized, which is the traditional justification for why companies should have shorter work weeks. However, this can be well below the point at which output is maximal.
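
To make the distinction concrete, here is a toy numerical sketch of the curve described above (all numbers are invented; only the shape matters):

def marginal_output(hour):
    """mIF produced by the n-th hour of the week (a made-up curve)."""
    if hour <= 10:
        return 0.0                    # fixed costs / maintenance work
    return 15.0 - 0.35 * (hour - 10)  # diminishing, eventually negative

def total_output(hours):
    return sum(marginal_output(h) for h in range(1, hours + 1))

weeks = range(1, 81)
best_average = max(weeks, key=lambda h: total_output(h) / h)
best_total = max(weeks, key=total_output)
print(f"output per hour peaks at ~{best_average} hours/week")
print(f"total output peaks at ~{best_total} hours/week")

With this particular made-up curve, output per hour peaks in the low thirties while total output keeps growing into the fifties; the exact numbers are meaningless, but the gap between the two optima is the point.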

Looking at some empirical work, it does seem that while output per hour peaks at just about 40 hours per week, the point of maximum output is above 50 hours/week.

[Figure: empirical data on output as a function of weekly hours worked]

Thus, if you are managing a widget factory, you may not want your workers working more than 40-45 hours for your own selfish reasons. But this does not mean that this is the point of maximum output.

Anecdotally, it does seem that many people work 40 hours at their main jobs and still engage in either a second lower-paying job or in non-leisure cost-saving activities (with lower implied wages than their main job, although these are untaxed).

Averages hide variances

Again, work that is directed at managers of widget factories is not necessarily a guide to your own behaviour. Perhaps some workers peak (in their average productivity) at 30 hours, others at 40, still others at 50. If you are managing a group, go for the average (look at the spread in the empirical plot above).

Maybe this is not where your maximum is. Maybe too, one can train to increase one’s maximum. Maybe your maximum this week is at 20 hours and the next week at 60.

Also, as I write above, many people either take a formal second job or undertake secondary cost-saving activities. Often these can be scheduled more flexibly than their main jobs. For example, someone who regularly does a longer trip to a cheaper grocery store to save a few bucks may skip that "second job" in the weeks when they are tired or have good leisure alternatives. Or they may only get around to fixing their own washing machine when they have a few hours without better things to do.

As free-range knowledge workers, we get all of this flexibility already (remember the old joke that in academia you can work whichever 80 hours of the week you want). Perhaps this already alleviates many of the drawbacks of going above the widget-makers optimum. I certainly know that I enjoy the flexibility and that, while on average, I do work longer weeks, this is not true of every single week.

In a competition, payoffs can be heavily non-linear

It remains a great injustice that even though I can run 100m in just twice as much time as Usain Bolt, I cannot get even a tenth of his pay.

Sports are the extreme case as they are almost pure competition, but they do make the point clear: in competitive fields, just a bit more output can make a huge difference. In science, getting a project finished in 10 months instead of 11 months may be the difference between getting or not getting scooped. A paper that is just slightly better may get accepted while one that neglected that one extra experiment does not. A grant that scores two percentage points higher gets funding. And so on.

Unfortunately, in most cases, we cannot know what would have happened if we had just added that one extra experiment to the paper or submitted the grant without that bit of preliminary data we collected just before submission. But saying that we can never know is an epistemological argument; the reality remains that a little extra effort can have a big payout.

Conclusion

I keep reading/hearing the claim that "research shows that you shouldn't work as much" or that "research shows that 40 hours per week is best". It would be good if it were true: it would be a free lunch. But I just do not see that in the research. What I often see is a muddling of the term "productivity" that does not appreciate the difference between maximum average output per hour and maximum output per week.

I am happy to be corrected with the right citations, but do make sure that they address the points above.

 

I tried Haskell for 5 years and here’s how it was

One blogpost style which I find almost completely useless is "I tried Programming Language X for 5 days and here's how it was." Most of the time, the first impression is superficial, discussing syntax and whether you could get Hello World to run.

This blogpost is I tried Haskell for 5 years and here’s how it was.

In the last few years, I have been (with others) developing ngless, a domain-specific language and interpreter for next-generation sequencing. For partly accidental reasons, the interpreter is written in Haskell. Even though I kept using other languages (mostly Python and C++), I have now used Haskell quite extensively for a serious, medium-sized project (11,270 lines of code). Here are some scattered notes on Haskell:

There is a learning curve

Haskell is a different type of language. It takes a while to fully get used to it if you’re coming from a more traditional background.

I have debugged code in Java, even though I never really learned (or wrote) any Java. Java is just a C++ pidgin language.

The same is not true of Haskell. If you have never looked at Haskell code, you may have difficulty following even simple functions.

Once you learn it, though, you get it.

Haskell has some very nice libraries

There really are some very nice libraries, written by people doing really useful things.

Conduit and Parsec are the basis of a lot of ngless code.

Here is an excellent curated list of the Haskell library world (added May 4).

Haskell libraries are sometimes hard to figure out

I like to think that you need both hard documentation and soft documentation.

Hard documentation is where you describe every argument to a function and its effects. It is like a reference work (think of man pages). Soft documentation is tutorials, examples, and more descriptive text. Well-documented software and libraries will have both (there is no need for anything in between; I don't want soft-serve documentation).

Haskell libraries often have extremely hard documentation: they will explain the details of functions, but little in the way of soft documentation. This makes it very hard to understand why a function could be useful in the first place and in which contexts to use this library.

This is exacerbated by the often extremely abstract nature of some of the libraries. A case in point is the very useful MonadBaseControl class. Trust me, this is useful. However, because it is so generic, it is hard to immediately grasp what it does.

I do not wish to over-generalize. Conduit, mentioned above, has tutorials, blogposts, as well as hard documentation.

Haskell sometimes feels like C++

Like C++, Haskell is (in part) a research project with a single initial Big Idea and a few smaller ones. In Haskell’s case, the Big Idea was purely functional lazy evaluation (or, if you want to be pedantic, call it “non-strict” instead of lazy). In C++’s case, the Big Idea was high level object orientation without loss of performance compared to C.

Both C++ and Haskell are happy to incorporate academic suggestions into real-world computer languages. This doesn’t need elaboration in the case of Haskell, but C++ has also been happy to be at the cutting edge. For example, 20 years ago, you could already use C++ templates to perform (limited) programming with dependent types. C++ really pioneered the mechanism of generics and templates.

Like C++, Haskell is a huge language, where there are many ways to do something. You have multiple ways to represent strings, you have accidents of history kept for backwards compatibility. If you read an article from 10 years ago about the best way to do something in the language, that article is probably outdated by two generations.

Like C++, Haskell’s error messages take a while to get used to.

Like C++, there is a tension in the community between the purists and the practitioners.

Performance is hard to figure out

Haskell and GHC generally let me get good performance, but it is not always trivial to figure out a priori which code will run faster and in less memory.

In some trivial sense, you always depend on the compiler to make your code faster (i.e., if the compiler was infinitely smart, any two programs that produce the same result would compile to the same highly efficient code).

In practice, of course, compilers are not infinitely smart, and so there is faster and slower code. Still, in many languages you can look at two pieces of code and reasonably guess which one will be faster, at least within an order of magnitude.

Not so with Haskell. Even very smart people struggle with very simple examples. This is because the most generic implementation of the code tends to be very inefficient. However, GHC can be very smart and make your software very fast. This works 90% of the time, but sometimes you write code that does not trigger all the right optimizations and your function suddenly becomes 1,000x slower. I have once or twice written two almost identical versions of a function with large differences in performance (orders of magnitude).

This leads to the funny situation that Haskell is (partially correctly) seen as an academic language used by purists obsessed with elegance, while, in practice, a lot of effort goes into making the code one writes as compiler-friendly as possible.

For the most part, though, this is not a big issue. Most of the code will run just fine and you optimize the inner loops at the end (just like in any other language), but it’s a pitfall to watch out for.

The easy is hard, the hard is easy

For minor tasks (converting between two file formats, for example), I will not use Haskell; I'll do it in Python: it has a better REPL environment, there is no need to set up a cabal file, and it is easier to express simple loops, &c. The easy things are often a bit harder to do in Haskell.

However, in Haskell, it is trivial to add some multithreading capability to a piece of code with complete assurance of correctness. The line that if it compiles, it’s probably correct is often true.

Stack changed the game

Before stack came on the scene, it was painful to make sure you had all the right libraries installed in a compatible way. Since stack was released, working in Haskell really has become much nicer. Tooling matters.

The really big missing piece is the equivalent of ccache for Haskell.

Summary

Haskell is a great programming language. It requires some effort at the beginning, but you get to learn a very different way of thinking about your problems. At the same time, the ecosystem matured significantly (hopefully signalling a trend) and the language can be great to work with.

(Computer-programming) language wars are a bit silly, but not irrational

I don't know where I heard it (and it was probably not first hand), but there is an
observation about how weird it is that, in the 21st century, computer professionals
segregate by the language they use to talk to the machine. It just seems silly, doesn't it?

Programming language discussions (R vs Python for data science, C++ or Python
for computer vision, Java or C# or Ruby for webapps, …) are a staple of
geekdom and easy to categorize as silly. In this short post, I'll argue
that, while silly, they are not completely irrational.

Programming languages are mostly about tooling

Some languages are better than others, but most of what matters is not
whether the language itself is any good, but how large the ecosystem around it
is. You can have a perfect language, but if there is no support for it in your
favorite editor/IDE and no good HTTPS libraries that can handle HTTP/2.0, then
working in it will be less efficient and even less pleasant than working in Java. On
the other hand, PHP is a terrible, terrible language, but its ecosystem is (for
its limited domain) very nice. R is a slightly less terrible version of this: not a great
language, but a lot of nice libraries and a good culture of documentation.

Haskell is a pretty nice programming language, but working in it got much nicer
once stack appeared on the scene. The
language is the same, even the set of libraries is the same, but having a
better way to install packages is enough to fundamentally change your
experience.

On the other hand, Haskell is (still?) enough of a niche language that nobody
has yet written a tool comparable to ccache for
the C/C++ world (instantaneous rebuilds are amazing for a compiled language).

The value of your code increases if you program in a popular language

This is not strictly true: if the work is self-contained, then it may be very
useful on its own even if you wrote it in COBOL, but often the more people can
build upon your work, the more valuable that work is. So if your work is
written in C or Python as opposed to Haskell or Ada, everything else being
equal, it will be more valuable (not everything else is equal, though).

This is somewhat field-dependent. Knowing R is great if you’re a
bioinformatician, but almost useless if you’re writing webserver code. Even
general-purpose languages get niches based on history and tools. Functional
programming languages somehow seem to be more popular in the financial sector
than in other fields (R has a lot of functional elements, but is not typically
thought of as a functional language; probably because functional languages are
“advanced” and R is “for beginners”).

Still, a language that is popular in its field will make your own code more
valuable. Packages upon which you depend will be more likely to be maintained,
tools will improve. If you release a package yourself, it will be more used
(and, if you are in science, maybe even cited).

Changing languages is easy, but costly

Any decent programmer can "pick up" a new language in a few days. I can
probably even debug code in any procedural language without ever having
seen it before. However, to really become proficient, it often takes much
longer: you need to encounter and internalize the most natural way to do things
in the new language, the quirks of the interpreter/compiler, learn about
different libraries and tools, &c. None of this is “hard”, but it all takes a
long time.

Programming languages have network effects

This is all a different way of saying that programming languages have network
effects. Thus, if I use language X, it is generally better for me if others
also use it. Not always explicitly, but I think this is the rationale for the
programming language discussions.

Scipy’s mannwhitneyu function

Without looking it up, can you say what the following code does:

import numpy as np
from scipy import stats
a = np.arange(25)
b = np.arange(25)+4
print(stats.mannwhitneyu(a, b))

You probably guessed that it computes the Mann-Whitney test between two samples, but exactly which test? The two-sided or the one-sided test?

You can’t tell from the code because it depends on which version of scipy you are running and it has gone back and forth between the two! Pre-0.17.0 it used the one-sided test with the side being decided based on the input data. This was obviously the wrong thing to do. Then, the API was fixed in 0.17.0 to do the two-sided test. This was considered a bad thing because it broke backwards compatibility and now it’s back to performing the one-sided test! I wish I was making this up. 

Reading through the github issues (#4933, #6034, #6062, #6100) is an example of how open source projects can stagnate. There is a basic, simple solution to the issue: create a corrected version of the function with a new name and deprecate the old one. This keeps backwards compatibility while allowing the project to fix its API. Once the issue had been identified, this should have been a 20-minute job. Reading through the issues, this simple solution is proposed, discussed, and seemingly agreed to. Instead, something else happens, and at this point it would take me longer than 20 minutes just to read through the whole discussion.
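
For illustration, the proposed fix would have looked roughly like this: a new, unambiguous function plus a deprecation shim (the names here are hypothetical, not what scipy actually ships):

import warnings
from scipy import stats

def mann_whitney_u(x, y, alternative='two-sided'):
    """Hypothetical corrected API: the alternative is always explicit."""
    return stats.mannwhitneyu(x, y, alternative=alternative)

def mannwhitneyu_old(x, y):
    """Hypothetical deprecated wrapper that keeps the old name alive."""
    warnings.warn("use mann_whitney_u(x, y, alternative=...) instead",
                  DeprecationWarning, stacklevel=2)
    return mann_whitney_u(x, y)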

This is not the first time I have run into numpy/scipy’s lack of respect for backwards compatibility either. Fortunately, there is a solution to this case, which is to use the full version:

stats.mannwhitneyu(a, b, alternative='two-sided')