Modernity

The Bourne shell was first released in 1977 (37 years ago).

The C Programming Language book was published in 1978 (36 years ago), describing the C language which had been out for a few years.

Python was first released in 1991 (23 years ago). The first version that looks very much like the current Python was version 2.2, released in 2001 (13 years ago), but the code from Python 1.0.1 is already pretty familiar to my eyes.

The future might point in the direction of functional languages such as Haskell, which first appeared in 1990 (24 years ago), although the first modern version is from 1998 (Haskell 98).

Vim was first released in 1991, based on vi, which was released in 1976. Some people prefer Emacs, released a bit earlier (GNU Emacs, however, is fairly recent: it was only released in 1985, so 29 years ago).

The web was first implemented in 1989 with some preliminary work in the early 1980s (although the idea of hypertext had been around for longer).

§

The only really new software I use regularly is distributed version control systems, which are less than 20 years old in both implementation and ideas.

Edit: the original version of this post had a silly arithmetic error, which was pointed out in the comments. Thanks.

Friday Links

1. A nice talk on category theory

2. Reevaluating electroshock therapy

3. It’s world homeopathy awareness week! I’ll quote @mocost on twitter:

It’s World Homeopathy Awareness Week, apparently. Are you aware that homeopathy is pseudoscientific bullshit?

4. Angst in Germany over English invasion. I liked this bit about the lounge at train stations:

“I’m not sure if calling it a ‘lounge’ is better than using the German word ‘warteraum,’ ” Renner says. “I guess it’s more modern or hip.”

Warteraum just means waiting area (both literally and not-so-literally).

Recently, I got enough points to get “frequent travel status” on German trains. One of the perks is free access to the lounge area. Somehow, that perk would not feel like such a perk if they had written free access to the waiting area.

Quote of the Day

Shipping is a feature. A really important feature. Your product must have it. — Joel Spolsky

Papers are the same. Submission is an important argument. Your paper must have it.

Denmark

I was in Denmark last week, teaching software carpentry. The students were very enthusiastic, but they had very different starting points, which made teaching harder.

For a complete beginners’ programming course, I typically rely heavily on the Python Tutor created by Philip Guo, which is an excellent tool. My goal is then to get the students to understand names, objects, and the flow of control.

I don’t use the term variable when discussing Python as I don’t think it’s a very good concept. C has variables, which work like little boxes you put values in. If you’re thinking of little boxes in Python, things get confusing. If you try to think of little boxes plus pointers (or references), it’s still not a very good map of what Python is actually doing.
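A minimal sketch of the names-and-objects view, using only built-in lists. The function names `append_in_place` and `rebind_locally` are made up for this illustration:

```python
# Names are labels attached to objects, not boxes that hold values.
a = [1, 2, 3]
b = a            # b is now a second name for the *same* list object
b.append(4)
print(a)         # [1, 2, 3, 4]: the change is visible through both names
print(a is b)    # True: one object, two names

b = [0]          # rebinding b attaches the name to a new object
print(a)         # [1, 2, 3, 4]: the original object is untouched


def append_in_place(items):
    # Mutates the object that the caller's name refers to.
    items.append(99)


def rebind_locally(items):
    # Only rebinds the local name; the caller's object is not affected.
    items = []
    return items


append_in_place(a)
rebind_locally(a)
print(a)         # [1, 2, 3, 4, 99]
```

With a little-boxes model, the first `print(a)` is the surprising one: nothing was ever "put into" `a` after its creation, yet its contents changed.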

For more intermediate students (the kind who have used one programming language), I typically still go through this closely. I find that many still have major faults in their mental model of how names and objects work. However, for these intermediate students, this can go much faster [1]. If it’s the first time they are even seeing the idea of writing code, then it naturally needs to be slow.

Last week, because the class was so mixed, it was probably too slow for some and too fast for others.

§

A bit of Danish weirdness:

(Image: a sausage display at a local pub)

[1] I suppose if students knew Haskell quite well but no imperative programming, this may no longer apply, but teaching Python to Haskell programmers is not a situation I have been in.

2048 is just cow clicking in binary.

How to win at 2048 or why 2048 is a stupid game

(This post contains a 2048 spoiler)

I too was caught up in the craze about 2048. Unfortunately, after a while I realised it’s a stupid game. It’s a stupid game because it’s actually a puzzle. Once you figure out how to play it [1], there is no challenge; it’s just trivial and boring.

2048 becomes cow clicking in binary.

How to win at 2048: play without ever pressing the UP key and keep your largest values on the bottom row, aligned left-to-right [2]. Just make sure you keep your highest tile on the corner and keep feeding the bottom row as best as you can. The newly born 2s & 4s will appear on the top portion and you will always be able to put them together. The only dangerous moment is when you score at the bottom as you risk destroying the organization if you’re not careful.

(Image: a 2048 board)

Above is how the board looks just before scoring the 2048 tile. Now go left, down, right, right, right, and you will have clicked all your cows.
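The "never press UP" rule can be sketched in a few lines of Python. This is an illustration only, not the game's code: the function names and the preference order (down, then right, then left) are assumptions, and the board is a list of four rows with 0 for an empty cell.

```python
def slide_left(row):
    """Slide one row to the left, merging each pair of equal neighbours once."""
    tiles = [t for t in row if t != 0]
    merged = []
    i = 0
    while i < len(tiles):
        if i + 1 < len(tiles) and tiles[i] == tiles[i + 1]:
            merged.append(tiles[i] * 2)
            i += 2
        else:
            merged.append(tiles[i])
            i += 1
    return merged + [0] * (len(row) - len(merged))


def move(board, direction):
    """Return a new board after moving 'left', 'right', 'up' or 'down'."""
    if direction == "left":
        return [slide_left(r) for r in board]
    if direction == "right":
        return [slide_left(r[::-1])[::-1] for r in board]
    # Vertical moves: transpose, move horizontally, transpose back.
    columns = [list(c) for c in zip(*board)]
    moved = move(columns, "left" if direction == "up" else "right")
    return [list(r) for r in zip(*moved)]


def choose_move(board, preference=("down", "right", "left")):
    """Pick the first preferred move that changes the board; UP is a last resort."""
    for direction in preference + ("up",):
        if move(board, direction) != board:
            return direction
    return None  # no move changes the board: game over


board = [
    [2, 0, 0, 0],
    [2, 0, 0, 0],
    [0, 0, 0, 0],
    [4, 0, 0, 0],
]
print(choose_move(board))   # down
print(move(board, "down"))  # the two 2s merge into a 4 above the existing 4
```

The sketch leaves out the random 2s and 4s that the game spawns after each move, so it shows the decision rule rather than a full game.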

[1] Or at least one way of playing it.
[2] This strategy can, of course, be applied in rotated form: play to the top-left corner instead of bottom-right.

Friday Links

1. Childhood obesity rates are not dropping in the US, or: on the importance of multiple testing correction.

2. Printing in Garamond saves ink because Garamond letters are smaller.

(Also, it likely won’t save a lot of money, because big organizations don’t buy ink at Kinko’s.)

3. Via Retraction Watch, a funny correction:

    In the original publication, the author Mr. Joseph Lee’s educational qualification (M. Theol) was inadvertently published as a co-author name. This erratum is published to correct the author group, where Mr. Joseph Lee is the single author of the publication.

The importance of unit testing & version control for scientific software

This post is inspired by the fact that I’m teaching software carpentry in Denmark this week, but I have had this conversation a few times, so I thought I should write it down.

Often the reaction to teaching things like version control or unit testing to scientists is of the sort “aren’t these things more appropriate for professional software developers who can put in the effort to learn them?” I strongly disagree.

In fact, I’ll argue that unit testing and version control are more important for science than for commercial software.

§

Let’s say you are running a web-based business. Unfortunately, your website’s code is a mess. Many of the features were implemented by someone who left a while back and none of your new hires really understands what that code does. Fortunately, however, the code works, the site is pleasing to the eye and customers are happily paying for your services. Even these old code bases can have their lives stretched for far longer than you’d expect. Life is not that bad.

Let’s say, on the other hand, you are running a computer-based science enterprise. Unfortunately, your code is a mess. Many of the features were implemented by someone who left a while back and none of your new hires really understands what that code does. Fortunately, the code produces pretty plots. Unfortunately, you cannot explain what the plots represent beyond a vague idea. You can adapt the code to a new dataset, but you are never really sure why it’s working like it is, and sometimes the outputs are downright mysterious. Life is pretty bad. You need to start over.

§

The difference is that in many commercial settings, only the final output matters. If a website is pretty, it won’t much matter whether the CSS behind it is a mess. If the search engine gives the customers what they want, both customer and vendor are happy and nobody will say “I’ll buy, but first can we go over the methodological details?” There are solid reasons to make the code clean and well-tested (minimizing the negative impact of individual members leaving the team, or avoiding increasing costs of maintenance & extension), but it’s not required for success.

In science, however, it is not enough to have a pretty output plot. You also need to be able to explain the details behind the plot and be certain that the plot was produced the way you think it was produced. If the code gives a mysterious result, then it’s not OK to just add a hacky line of code adding 10 to the result to make it work. Similarly, the ability to go back in time in your code is a nice thing in business, but can be essential in science because we value reproducibility.
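To make "being certain the plot was produced the way you think" concrete, here is a minimal sketch of unit tests for a small analysis step. The function `normalize` and its tests are hypothetical examples, not taken from any real project; the tests are plain functions of the kind a runner such as pytest would collect.

```python
def normalize(counts):
    """Convert raw counts to fractions that sum to 1 (hypothetical analysis step)."""
    total = sum(counts)
    if total == 0:
        raise ValueError("cannot normalize: counts sum to zero")
    return [c / total for c in counts]


def test_fractions_sum_to_one():
    fractions = normalize([3, 1, 4])
    assert abs(sum(fractions) - 1.0) < 1e-12


def test_ratios_are_preserved():
    assert normalize([1, 3]) == [0.25, 0.75]


def test_all_zero_input_is_rejected():
    # A silent division by zero (or a hacky "+ 10") would hide a real problem.
    try:
        normalize([0, 0])
    except ValueError:
        pass
    else:
        raise AssertionError("expected ValueError for all-zero counts")


# Run the tests directly, without any test runner.
test_fractions_sum_to_one()
test_ratios_are_preserved()
test_all_zero_input_is_rejected()
```

Each test states one property that the plotted numbers are supposed to have, so a mysterious output can be traced to a specific failed expectation instead of being patched over.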

§

This is why I think it’s very good that unit testing & version control are both part of the software carpentry core curriculum.