Interview with João Carriço

Happy New Year! I am starting what will hopefully become a (semi-)regular feature: interviews! The first one is with João Carriço.

I’ve known João Carriço for a few years, but last year we were both at the Instituto de Medicina Molecular in Lisbon, where we got to interact more and talked a lot about science and non-science.

Below is a little conversation we had on science, twitter, and bioinformatics. My questions are bolded:

You started at the bench and moved to the laptop. I’d say that this is the opposite of the typical path for a computational biologist. Do you agree?

As I see it, there are two possible paths for a computational biologist: you are either someone with a Computational Science/Physics background or someone with a Biology/Biochemistry background. I am closer to this second category since my degree was in Applied Chemistry, Biotechnology. However, at this point, I don’t believe there is a “typical” path. If you like computer science and biology, you can become a computational biologist with more or less work in either (or both) areas.

Does this impact the way you work when compared with someone who came in the other way, from pure computer science?

Indeed, it does. There are huge differences in how these two communities tackle problems. Coming from a biology-like background, I just focus on how I can get the answer and visualize it, and try not to worry too much about how the algorithm is implemented. My first goal is to get an answer out, and only then worry about the details of the algorithm.

On a related note, how do you see computational biology evolving? Will it remain “interdisciplinary” or become itself a discipline, so that people are trained as “computational biologists” as college students?

I believe that the second scenario you put forward is what is happening already. I can only hope that it won’t be the only way for the field to take in new people, but it will surely be the largest contributor.

Is this a good development or would losing the interdisciplinary flavour be a loss?

There will be losses for sure. Fresh ideas from both fields will take longer to get integrated in bioinformatics/computational biology as the majority of the people will be taught the ropes of 2 different fields and how they can be knotted together and will be focused on that learning. Having something seemingly unrelated and, at first look, very specific for a field and somehow applying it to the other field will be more difficult. On the other hand it will create the much needed number of practical problem solvers for everyday tasks.

Have you already seen changes in how institutions handle computational biology (and bioinformatics)? How do you rate these changes?

Unfortunately I can’t say that I have seen it, at least with a serious commitment level. Now, with the new next generation sequencing approaches some people are starting to realize the need to have specialised personnel and infrastructure, but the institutions (at least the ones oriented in biological research) still don’t recognise the scientific merit or research needs of the field.

Let’s switch tracks a bit and talk about science per se, namely your projects: Your work seems to be a mixture of method development and applications that are very close to the clinic. Is this a fair characterisation?

You can say that. Not as close to the clinical practice as I would like though.

I work basically in data analysis and management of microbial typing methods, which are used to identify bacterial pathogens at strain level. This is important since we know that for several bacteria only a few lineages in a certain species are the cause of the majority of diseases while others are commensal to us. The aim of my research is to understand what is the best way to do this classification based on the available methods and how can that be useful to track and predict the appearance of problematic strains, i.e., the ones that can be spread faster or be resistant to antibiotics.

How does one go about managing a project where you may have someone who is a bench biologist, another person who lives inside the command line, and a medical doctor?

Well, as always, the first step is finding a common language and trying to restrict our conversations to that. In the beginning, we don’t want a computer scientist explaining algorithmic details to an MD, or an MD explaining in detail how the infection advances in a patient to a computer scientist, as both will be at a loss. Eventually this will come: the questions will arise and the common language will encompass them. The important thing is to remind everyone to have an open mind about the subjects to be discussed.

You always have to be the “hinge” person, as someone told me some time ago. For the biologists I am the computer scientist and for the computer scientists I am the biologist, thus being the hinge between two fields. But in the end I can say that we can get very interesting and productive results!

Do you think hospitals will start using next-generation-sequence-based tools to either analyse samples or monitor for antibiotic resistant outbreaks? If so, what’s the timeline: in 5 years or 20 years?

This will happen for sure in less than 5 years. The problem in that example is that the presence of an antibiotic resistance gene doesn’t always correlate with the strain being non-susceptible to that antibiotic. So in the next 2-5 years, if the price of sequencing continues to drop, I can see NGS being done routinely as a first approach, as you can get results in less than 12 hours. That could eventually guide or advise the antimicrobial therapy that should be prescribed, or even detect some relevant virulence factors/toxins that the strain can produce.

Tell us about an interesting recent project of yours (published or unpublished).

In the last few years I have been working a lot with ontologies and RESTful interfaces applied to microbial typing and NGS.

As I usually say, I’m not a big fan of this type of work, as I prefer working on algorithms and visualisation, but I see it as the base for all my other work. Again, without a common language with which databases can communicate, or one that simply reduces the overhead of data integration, having the best algorithms and result visualisation can be meaningless if you don’t get the data integrated correctly, or if you don’t get the data at all. It is the part of the big puzzle that most people recognise as necessary but shy away from doing. I decided to bite the bullet and push (slowly) forward in this field, and I’m being rewarded with some good collaborations and interest from the community. Hopefully, next year we will have a couple of publications to show the results.

You are an active science twitterer [João is @jacarrico]. Is this something that’s an addition to your work, a distraction, or both?

Being a very important addition to my work largely outweighs the distraction! Twitter gives me the ability to stay in contact with the worldwide bioinformatics community, since most of them are active twitterers/bloggers. Twitter gives me the ability to post a question and often minutes after the fact I get a couple of answers that save me several hours of reading through software manuals, FAQs or papers!

Do you think this is a fad that’ll go away or something that will stay? Perhaps in a modified form, but would you say that the idea of fast and unfiltered science “gossip” is here to stay?

The science gossip with all its juicy bits through twitter is here to stay. As I see it, the filtering of twitter gossip happens at blog level as tweets get blogged and commented upon, so the good parts will stay afloat from the noisy background.

Thank You!

João Carriço is a researcher at the Molecular Microbiology and Infection Unit of the Instituto de Medicina Molecular. He can be found on twitter as @jacarrico. A full list of his publications can be found in his Google Scholar profile.


Non-Computational Thinking

There is a lot of talk about Computational Thinking, typically followed by the observation that we don’t know what it is or cannot define it.

Which I think is true, but it is perhaps easier if we try to watch out for non-computational thinking instead.


Recently, the MLA defined how to cite a tweet in their (widely used) style:

Begin the entry in the works-cited list with the author’s real name and, in parentheses, user name, if both are known and they differ. If only the user name is known, give it alone.

Next provide the entire text of the tweet in quotation marks, without changing the capitalization. Conclude the entry with the date and time of the message and the medium of publication (Tweet). For example:

Athar, Sohaib (ReallyVirtual). “Helicopter hovering above Abbottabad at 1AM (is a rare event).” 1 May 2011, 3:58 p.m. Tweet.


I think this is an example of non-computational thinking: tweets have unique numeric IDs, so you can link to any of them directly, but the guidelines never mention the IDs. They do discuss that the time stamp is time-zone dependent.

(Although, truth be told, twitter does not make it super-obvious how to get at the tweet ID; not sure why. You need to click on the tweet date to get to the tweet link and read the URL. [I updated this parenthetical paragraph in response to a comment below by Cheng H. Lee])
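To make the point concrete, here is a minimal Python sketch of what the ID buys you. It assumes the tweet link has the usual `https://twitter.com/<user>/status/<id>` shape; the function names and the user name and ID in the usage note are made up for illustration, not taken from any real tweet.

```python
import re
from urllib.parse import urlparse

# Assumed link shape: /<user>/status/<numeric id> (older links use /statuses/).
_STATUS_PATH = re.compile(r"^/(?P<user>[^/]+)/status(?:es)?/(?P<id>\d+)/?$")


def parse_tweet_url(url):
    """Return (user, tweet_id) from a tweet link, or raise ValueError."""
    path = urlparse(url).path
    match = _STATUS_PATH.match(path)
    if match is None:
        raise ValueError(f"not a tweet link: {url!r}")
    return match.group("user"), int(match.group("id"))


def tweet_url(user, tweet_id):
    """Rebuild a link from the user name and the numeric ID."""
    return f"https://twitter.com/{user}/status/{tweet_id}"
```

For a hypothetical link such as `https://twitter.com/example_user/status/1234567890`, `parse_tweet_url` gives back the user name and the integer ID, and `tweet_url` turns them into the link again. Unlike "3:58 p.m.", the ID means the same thing in every time zone.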