NG-meta-profiler & NGLess paper published

(I wanted to write about this earlier, but June was a crazy month with manuscript submissions, grant submissions, and a lot of travel.) The first NGLess manuscript was finally published. See this Twitter thread for a summary, which I will not rehash here as I have already written extensively about NGLess and the ideas behind it here. I also wrote …

NGLess preprint is up

We have posted a preprint describing NG-meta-profiler and NGLess in general:

NG-meta-profiler: fast processing of metagenomes using NGLess, a domain-specific language
Luis Pedro Coelho, Renato Alves, Paulo Monteiro, Jaime Huerta-Cepas, Ana Teresa Freitas, Peer Bork
bioRxiv 367755

My initial goal was to develop a tool that (1) used a domain-specific language to describe computation and (2) was actually used in production. I did not want a proof-of-concept as one of …

Why NGLess took so long to become a robust tool (but now IS a robust tool)

Titus Brown posted that good research software takes 2-3 years to produce. As we are close to submitting a manuscript for our own NGLess, which took a bit longer than that, I will add some examples of why it took so long to get to this stage. There is a component of why it took so long …

Quick followups: NGLess benchmark & Notebooks as papers

A quick follow-up on two earlier posts: We finalized the benchmark for ngless that I had discussed earlier: As you can see, NGLess performs much better than either MOCAT or htseq-count. We tried to use featureCounts too, but that completely failed to produce results for some of the samples (we gave it a whopping 1TB …

Bug-for-bug backwards compatibility in NGLess

Recently, I found a bug in NGLess. In some rare conditions, it would mess up and reads could be lost. Obviously, I fixed it. If you’ve used NGLess before (or read about it), you’ll know that every ngless script starts with a version declaration:

    ngless "x.y"

This indicates which version of NGLess should be running …

How NGLess uses its version declaration

NGLess is my metagenomics tool, which is based on a domain-specific language. So, NGLess is both a language and a tool (which implements the language). Since the beginning, ngless has had a focus on reproducibility, and one of the small ways in which this was implemented was that ngless requires a version declaration. Every ngless script …
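As a rough illustration of what requiring a version declaration can look like, here is a minimal Python sketch. It is not NGLess's actual implementation: the function name parse_version_declaration and the regular expression are assumptions made for this example, and it only accepts a declaration of the form ngless "x.y" on the first non-blank line.

```python
import re

# Hypothetical pattern for this sketch: ngless "x.y" with straight double quotes.
_DECLARATION = re.compile(r'^ngless\s+"(\d+)\.(\d+)"\s*$')


def parse_version_declaration(script_text):
    """Return (major, minor) from the first non-blank line, or raise ValueError."""
    for line in script_text.splitlines():
        stripped = line.strip()
        if not stripped:
            continue  # skip leading blank lines
        match = _DECLARATION.match(stripped)
        if match is None:
            raise ValueError("script must start with a version declaration")
        return int(match.group(1)), int(match.group(2))
    raise ValueError("empty script: no version declaration found")
```

An interpreter built this way could use the returned pair to decide which behaviour to apply to the rest of the script.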

Eager Error Detection in Ngless: A big advantage of a DSL

One of the advantages of ngless is its error detection. For example, consider the following ngless script:

    ngless "0.0"
    input = fastq("input.fq.gz")
    mapped = map(input, ref='hg19')
    write(mapped, ofile="output/mapped.bam")

If the directory output does not exist (maybe you meant to write outputs; I know I make this sort of mistake all the time), then ngless will …
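The excerpt is cut off here, so what follows is not a description of what ngless does. It is only a minimal Python sketch of the general idea named in the title: check what can be checked up front, before any expensive work starts. The names check_output_paths and run_pipeline, and the expensive_step callback, are hypothetical.

```python
import os


def check_output_paths(output_paths):
    """Return error messages for outputs whose parent directory is missing."""
    errors = []
    for path in output_paths:
        # A bare filename has no directory part; treat it as the current directory.
        directory = os.path.dirname(path) or "."
        if not os.path.isdir(directory):
            errors.append(
                f"output directory does not exist: {directory!r} (for {path!r})"
            )
    return errors


def run_pipeline(output_paths, expensive_step):
    """Validate all output paths first; only then run the expensive step."""
    errors = check_output_paths(output_paths)
    if errors:
        # Fail before any expensive computation has been started.
        raise FileNotFoundError("; ".join(errors))
    return expensive_step()
```

With this ordering, a mistyped output directory is reported immediately instead of after a long mapping step has already finished.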