Haven’t They Suffered Enough?

Every time I read about a plan to have more women and minorities in science careers, I think of that famous New Yorker quip about gays getting married: “Gays getting married? Haven’t they suffered enough?”

Women in tenure-track positions? Haven’t they suffered enough?

§

I read this lament yesterday:

I was the lucky kid who never had to study for tests. I always scored in the 99% percentile on the annual state assessments.

[… Now I don’t make that much money.]

The national average at the time was that for every one faculty position, there were 200 applications. For our department, there were 300 applications for every one faculty position.

[…]

Science will fail because the System is running the scientists out of it.

This is like “nobody goes there anymore, it’s too crowded.” In one sense it expresses a truth, but it is actually nonsensical.

The problem with science cannot simultaneously be that scientists are not sufficiently paid and that there are too many of them for the same position. And, if you argue that too many scientists are leaving academia, you also need to explain how this fits in with all the other complaints about academia that focus on how hard it is to get a job.

§

If you want to make the argument that there should be more science funding, go ahead; I’ll support you 100%.

If you want to make the argument that postdoc salaries are so low that it’s hard to get a qualified candidate, go ahead; I’ll mostly disagree.

If you want to make the argument that the current system leads to sub-optimal science, go ahead; I might support or disagree depending on the details. In the comments to that article, someone points out that in the current system PIs are incentivized to be overly conservative and focused on the short term, unlike the private sector, which has a longer time horizon (and perhaps more tolerance for failure). This sort of argument is much more interesting, as it implies that there could be better mechanisms for funding.

§

But, reading these “poor me” laments, I actually conclude that the taxpayer is getting a great deal: very smart people working 80-hour weeks for so little money that they cannot afford to go to the movies [1], and they even produce a lot of nice results. Man, your tax dollars are hard at work!

The goal of public science funding is to get as much science as possible. Scientists are a cost to the public to be minimized. It seems that this is working pretty well.

Can we structure the rest of the public sector to be like this? [2] We’d get excellent public services for much lower taxes (we could surely lower the Council Tax which seems to take such a big chunk of this poor fellow’s salary).

[1] I have to say I don’t fully believe that this guy has it this bad.
[2] Joking aside, I actually think that science funding is, in general, better than other types of funding at getting bang-for-public-buck. Tenure comes late in your career (and it is not enough to sit on your ass and not get fired for 2 years), the grant system is competitive, &c. In spite of the fact that public funding dominates, very few people would argue that there is no competition in science.

Friday Links

1. On Schekman’s pledge to not publish in high-profile journals. I almost called this a balanced view, but then realized that I have probably used that phrase to refer to Derek Lowe’s work at least twice in the past. The man is smart and balanced, what can I say?

2. An interesting meeting report (closed access, sorry). Two highlights:

While discussing mutations that predispose to cancer, Nazneen Rahman (Institute of Cancer Research, UK) rightly reminded us that people make big decisions and have parts of their anatomy removed based on their genotype.

[…]

Jeanne Lawrence (University of Massachusetts Medical School, USA) convincingly showed that her lab was able to silence one entire copy of chromosome 21 in stem cells in vitro. Trisomy 21 or Down’s syndrome is caused by an extra copy of chromosome 21. […] Lawrence and colleagues inserted XIST (human X-inactivation gene) into chromosome 21 in stem cells with trisomy 21. They then showed using eight different methods that a single copy of the chromosome had indeed been silenced.

3. A good explanation of Bitcoin, the protocol

4. Interesting article about wine & technology in The Economist (which is one of the few mainstream magazines whose science coverage is worth reading [1]).

[1] Actually, I think it’s the only one that can be consistently trusted, but I enjoy anything by Ed Yong wherever he publishes and have been reading some excellent articles by Carl Zimmer in The Atlantic.

Seeing is Believing. Which is Dangerous.

One of the nice things about being at EMBL is that, if you just wait, eventually you can hear the important people in your field speak. Today, I’m quite excited about the Seeing is Believing conference.

But ever since I saw this advertised, I dislike the name Seeing is Believing.

[Image: grey square optical illusion]

  1. Seeing is believing. This is unquestionable.
  2. But seeing is not always justified believing. Our seeing apparatus will often lead us astray. This is especially true of images which do not look like the ones we evolved for (and grew up looking at).
  3. The fact that seeing is believing is actually often a cognitive problem which needs to be overcome!

§

I can no longer find who said it at BOSC, but someone pointed out, insightfully, that a visualization is already an interpretation of the data, and it may be wrong.

When I show you a picture of a cell, it is rarely raw data. The raw data is a big pixel array. By the time I’m showing it to you, I’ve done the following:

  1. Chosen an example to show.
  2. Often projected the data from 3D to a 2D representation.
  3. Tweaked contrast.

Point 1 is the biggest culprit here: the selection of which cell to image and show can be an incredibly biased process (even unconsciously biased, of course).

However, even tweaks to the way that the projection is performed and to the contrast can highlight or hide important details (as someone with a lot of experience playing with images, I can tell you that there is a lot of space for “highlighting what you want to show”). In the newer methods (super-resolution type methods), this is even worse: the “picture” you see is already the output of a big processing pipeline.
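To make the point about projection and contrast concrete, here is a minimal sketch in plain Python. Everything in it is an assumption for illustration: the tiny `stack` of made-up pixel values, the helper names `project` and `stretch`, and the two contrast windows. It is not the pipeline of any particular microscope or software package.

```python
# Toy 3D "image stack": 2 z-slices, each 2 rows x 3 columns (made-up values).
stack = [
    [[10, 12, 11],
     [10, 90, 11]],
    [[10, 12, 80],
     [10, 13, 11]],
]

def mean(values):
    """Arithmetic mean of a list of numbers."""
    return sum(values) / len(values)

def project(stack, reduce_fn):
    """Collapse the z axis pixel by pixel using reduce_fn (e.g. max or mean)."""
    rows, cols = len(stack[0]), len(stack[0][0])
    return [[reduce_fn([z[r][c] for z in stack]) for c in range(cols)]
            for r in range(rows)]

def stretch(image, low, high):
    """Linear contrast stretch: map [low, high] to [0, 255], clipping outside."""
    def scale(v):
        v = min(max(v, low), high)
        return round(255 * (v - low) / (high - low))
    return [[scale(v) for v in row] for row in image]

# Two projections of the same stack: the bright spots are halved by averaging.
max_proj = project(stack, max)
mean_proj = project(stack, mean)

# Same projection, two contrast windows.
wide = stretch(max_proj, 0, 100)    # background nearly uniform, spots distinct
narrow = stretch(max_proj, 10, 15)  # background differences pop, spots saturate

print(max_proj)
print(mean_proj)
print(wide)
print(narrow)
```

With the narrow window, both bright spots saturate to the same value while small background differences become prominent; with the wide window it is the other way around. Neither rendering is the raw data.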

§

I’m not even thinking about the effects of the tagging protocols, which introduce their own artifacts. But we humans often make the mistake of saying things like “this is an image of protein A in cell type B” instead of “this is an image of a chimeric protein which includes the sequence of A, with a strong promoter in cell type B”.

§

We know that these artifacts and biases are there, of course. But we believe the images. And this can be a problem because humans are not actually all that great at image analysis.

Seeing is believing, which too often means that we suspend our disbelief (or, as we scientists like to say, we suspend our skepticism). This is not a recipe for good science.

Update: On Twitter, Jim Procter (@foreveremain) points out a great example, the story of the salmon fMRI: we can see it, but we shouldn’t believe it.

Why Science is a Third World Economy

Because people are cheap and things are expensive.

§

To a large extent, it is easier to get money to pay for people (salaries [1]) than to pay for things. Other times, people show up who are willing to work without being paid (they are self-funded). But then you need to get them materials to work with. For that, you need to actually spend some money.  And sometimes you actually have money, but it can only pay for things of type X, but not of type Y, which is what you wanted.

So, it often feels very much like the third world: a lot of people standing around a few physical resources, and replacement of capital by labour.

§

A while back I read a review which was comparing several technologies for the same measurement task [2]. There were two high-quality methods in terms of the output. One was very automated but required you buy some kit (~$400), the other was artisanal.

The authors wrote that the first one was good because it was very fast, but expensive. The other one took a long time, but was cheap. They didn’t even price in the cost of labour! They didn’t even ask how many hours of graduate student time you can get for $400.

Which, of course, makes some sense in the publicly funded bureaucratic world where money is not fungible. You often cannot reallocate money from stipends to materials.
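The comparison the review skipped is simple arithmetic. The sketch below shows it under loudly stated assumptions: only the ~$400 kit price comes from the text above. The hourly cost of a graduate student, the hours each method takes, and the materials cost of the artisanal method are all hypothetical numbers picked for illustration.

```python
def total_cost(materials, hours, hourly_cost):
    """Materials plus the cost of the labour spent on the method."""
    return materials + hours * hourly_cost

def break_even_hours(kit_cost, hourly_cost):
    """Hours of labour that cost as much as the kit."""
    return kit_cost / hourly_cost

KIT_COST = 400      # from the review discussed above (approximate)
HOURLY_COST = 20    # hypothetical cost of one graduate-student hour

# Hypothetical workloads: the kit takes 2 hours of hands-on time,
# the artisanal method takes 30 hours and $20 of materials.
kit_total = total_cost(KIT_COST, hours=2, hourly_cost=HOURLY_COST)
manual_total = total_cost(20, hours=30, hourly_cost=HOURLY_COST)

print(break_even_hours(KIT_COST, HOURLY_COST))
print(kit_total, manual_total)
```

Under these made-up numbers, the "cheap" method is the expensive one as soon as labour is priced in. With non-fungible budgets, of course, that comparison never gets made.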

§

And then there is that expensive piece of equipment that is not really used: there was a specific half-a-million grant to buy it, but then enthusiasm petered out, and the person who was going to use it had gotten a different job by the time the thing was delivered, so nobody here really cared to pick it up.

Yep, that’s a third world thing too.

[1] Or stipends, which are exactly like a salary, except for tax purposes.
[2] I could probably find it now if I looked, but I don’t actually want to lose track of the main point.

Is Cell Segmentation Needed for Cell Analysis?

Having just spent some posts discussing a paper on nuclear segmentation (all tagged posts), let me ask the question:

Is cell segmentation needed? Is this a necessary step in an analysis pipeline dealing with fluorescent cell images?

This is a common FAQ whenever I give a talk on my work which does not use segmentation, for example, using local features for classification (see the video). It is a FAQ because, for many people, it seems obvious that the answer is that Yes, you need cell segmentation. So, when they see me skip that step, they ask: shouldn’t you have segmented the cell regions?

Here is my answer:

Remember Vapnik’s dictum [1]: do not solve, as an intermediate step, a harder problem than the problem you really need to solve.

Thus the question becomes: is your scientific problem dependent on cell segmentation? In the case, for example, of subcellular location determination, it is not: all the cells in the same field display the same phenotype, your goal being to find out what it is. Therefore, you do not need to have an answer for each cell, only for the whole field.

In other problems, you may need to have a per-cell answer: for example in some kinds of RNAi experiment only a fraction of the cells in a field display the RNAi phenotype and the others did not take up the RNAi. Therefore, segmentation may be necessary. Similarly, if a measurement such as distance of fluorescent bodies to cell membrane is meaningful, by itself (as opposed to being used as a feature for classification), then you need segmentation.

However, sometimes you can get away without segmentation.
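As a toy illustration of a field-level answer without segmentation, here is a minimal sketch in plain Python. It is an assumption-laden stand-in, not the local-features method from my work nor the method of any published paper: the helpers `patch_features`, `field_descriptor`, and `nearest_label` are hypothetical, the "features" are just patch mean and standard deviation, and the 4×4 fields are made-up values.

```python
import math

def patch_features(field, size):
    """Yield (mean, std) for each non-overlapping size x size patch of a 2D list."""
    for r in range(0, len(field) - size + 1, size):
        for c in range(0, len(field[0]) - size + 1, size):
            pixels = [field[r + i][c + j] for i in range(size) for j in range(size)]
            m = sum(pixels) / len(pixels)
            var = sum((p - m) ** 2 for p in pixels) / len(pixels)
            yield (m, math.sqrt(var))

def field_descriptor(field, size=2):
    """Average the patch features over the whole field: one vector per field."""
    feats = list(patch_features(field, size))
    return tuple(sum(f[k] for f in feats) / len(feats) for k in range(2))

def nearest_label(descriptor, centroids):
    """Assign the field to the label whose centroid is closest (Euclidean)."""
    return min(centroids, key=lambda label: math.dist(descriptor, centroids[label]))

# Made-up "training" fields: a diffuse pattern and a punctate one.
diffuse = [[50] * 4 for _ in range(4)]
punctate = [[0, 0, 0, 0],
            [0, 200, 0, 0],
            [0, 0, 0, 0],
            [0, 0, 0, 200]]
centroids = {"diffuse": field_descriptor(diffuse),
             "punctate": field_descriptor(punctate)}

# A new field with spots in different places; no cell boundaries are computed.
query = [[0, 0, 0, 0],
         [0, 0, 180, 0],
         [0, 0, 0, 0],
         [190, 0, 0, 0]]
print(nearest_label(field_descriptor(query), centroids))
```

The classifier never needs to know where one cell ends and the next begins, which is the point: the label is a property of the field.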

§

An important point to note is the following: while it may be good to have access to perfect segmentation, imperfect segmentation (i.e., the type you actually get) may not help as much as the perfect kind.

§

Just to be sure, I was not the first person to notice that you do not need segmentation for subcellular location determination. I think this is the first reference:

Huang, Kai, and Robert F. Murphy. “Automated classification of subcellular patterns in multicell images without segmentation into single cells.” Biomedical Imaging: Nano to Macro, 2004. IEEE International Symposium on. IEEE, 2004. [Google scholar link]

[1] I’m quoting from memory. It may be a bit off. It sounds obvious when you put it this way, but it is still often not respected in practice.

Friday Links

1. The vast majority of statistical analysis is not performed by statisticians

Let me fish out one paragraph:

[I]n 1967 Stanley Milgram did an experiment to determine the number of degrees of separation between two people in the U.S. In his experiment he sent 296 letters to people in Omaha, Nebraska and Wichita, Kansas. The goal was to get the letters to a specific person in Boston, Massachusetts. The trick was people had to send the letters to someone they knew, and they then sent it to someone they knew and so on. At the end of the experiment, only 64 letters made it to the individual in Boston. On average, the letters had gone through 6 people to get there. This is where the idea of “6-degrees of Kevin Bacon” comes from. Based on 64 data points. A 2007 study updated that number to “7 degrees of Kevin Bacon”. The study was based on 30 billion instant messaging conversations collected over the course of a month or two with the same amount of effort

What really jumps out at me is how close the values were between the 1967 experiment (with so few datapoints, immensely biased: they only took the ones that got there!) and the 2007 version (whose conclusion is actually 6.6).

2. Odds ratio vs. risk ratio

Scientists being misleading, tabloids being misled.

I assume that the author’s question of “why is this still allowed?” is rhetorical. His analysis answers the question: if we only allowed honest reporting in epidemiology, epidemiological papers would be much less interesting to the tabloids.

3. A bit old, but interesting: Peer reviews on PLoS One paper take reviews public

4. Speaking of scientists (particularly public health “scientists”) behaving badly: one of my top scientific peeves is the over-selling of weak results in public health, especially in nutrition. I think this is more damaging to the cause of evidence-based policy than almost any anti-science group. Many people will say things like “I don’t trust scientists: first it was don’t eat olive oil, now olive oil is good. No peanuts, yes to peanuts, now no to peanuts again; science is just whatever is fashionable, really.” [1]

So, I was happy to see Nature telling a Harvard Medical School nutritionist to shut up and stop mangling the science for “public benefit”.

5. Please stop putting the figures at the end of the manuscript

I have never heard anyone defend the current system of figures at the end of the manuscript (except on “that’s the way it always was” grounds).

Computers & networks normally have a two-step impact on systems: (1) reproduce the old paper-based procedures in digital form, and (2) reshape the procedures to be native. Science publishing is still stuck on step 1.
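Going back to link 2 (odds ratio vs. risk ratio), a small worked example shows why reporting one as if it were the other misleads. This is a minimal sketch: the function names and the counts are hypothetical and are not taken from the linked article.

```python
def risk_ratio(exposed_cases, exposed_total, control_cases, control_total):
    """Ratio of the probability of the outcome in the two groups."""
    risk_exposed = exposed_cases / exposed_total
    risk_control = control_cases / control_total
    return risk_exposed / risk_control

def odds_ratio(exposed_cases, exposed_total, control_cases, control_total):
    """Ratio of the odds (cases / non-cases) in the two groups."""
    odds_exposed = exposed_cases / (exposed_total - exposed_cases)
    odds_control = control_cases / (control_total - control_cases)
    return odds_exposed / odds_control

# Hypothetical rare outcome: 2 of 100 exposed vs. 1 of 100 controls.
rare = (2, 100, 1, 100)
# Hypothetical common outcome: 60 of 100 exposed vs. 40 of 100 controls.
common = (60, 100, 40, 100)

for name, counts in [("rare", rare), ("common", common)]:
    print(name, round(risk_ratio(*counts), 3), round(odds_ratio(*counts), 3))
```

For the rare outcome the two numbers nearly coincide. For the common one, the risk ratio is 1.5 while the odds ratio is 2.25, so presenting the odds ratio as “2.25 times more likely” overstates the effect.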

[1] One really good comment from a non-scientist friend: “until I met you and your scientist friends, I was mostly exposed to science through news reports of the sort of studies that now I realise all the other scientists sneer at.” We need to sneer more. (Yes, I have non-scientist friends; who’d have known?)

Friday Links

1. False Positives from Next-Generation Sequencing

2. What I look for in software papers

I [frequently review] software papers which I define as publications whose primary purpose is to publicize a piece of scientific software and provide a traditional research product with hopes that it will receive citations and recognition from other researchers in grant and job reviews. To me this feels very much like hacking the publication recognition system rather than the ideal way to recognize and track the role of software in research communities, but a very practical one in the current climate.

3. On the organic movement and charlatanism

Unfortunately, charlatanism sells. When I was last in Portugal, I was disappointed to find out that one of the organic stores I used to patronize for their premium produce and hard to find food items had gone over to mostly selling small bottles of holy water at €1000/litre and “natural pills”. Their salespeople went from scruffy to dressing in white coats as “pretend doctors”. Ugh.

4. I read Mind and Cosmos: Why the Materialist Neo-Darwinian Conception of Nature Is Almost Certainly False by Thomas Nagel.

Someone wrote about What Money Can’t Buy that it was wrong, but wrong in a way that many people are wrong, so articulating exactly that argument is a useful contribution. [1] The same goes for this book. In fact, Nagel writes:

I would like to defend the untutored reaction of incredulity of the reductionist neo-Darwinian account of the origin and evolution of life.

For example, here he is again:

In the available geological time since the first life forms appeared on earth, what is the likelihood that, as a result of physical accident, a sequence of viable genetic mutations should have occurred that was sufficient to permit natural selection to produce the organisms that actually exist?

In the conclusion he calls neo-Darwinism a heroic triumph of ideological theory over common sense.

Perhaps it is the task of philosophy (which we may as well call the science of the gaps) to articulate common sense. But the celebration of ignorance that is behind these claims is a bit silly.

5. Here is a good comment on the book. I think this is very much in line with “if you care about winning, belittle your opponents’ arguments; if you care about truth, you improve their arguments for them” (I think this was originally a Milton Friedman quote).

Neo-Darwinism perfectly explains why there are zombies. Once you have RNA, zombies are just a matter of time. To explain consciousness is a harder problem.

[1] Of course, they are wrong in different ways. Michael Sandel is perhaps morally wrong in that his ideas cause a lot of unnecessary suffering and death (although, in another moral conception, those deaths are necessary and just—this is not something we can decide scientifically by looking at the world, but morally by deciding whether it is better that we preserve purity in some conception or that we avoid the deaths of others). Or he is philosophically wrong in that his ideas are contradictory. On the other hand, Thomas Nagel is scientifically wrong.