Micropia – a microbe museum

This month I had the chance to visit the ‘smallest’ museum in the world: Micropia in Amsterdam, The Netherlands. The goal of Micropia, opened in 2014, is to spread knowledge about microbes among the general public. The museum is part of the Artis zoo but has a separate entrance and can be visited independently. It offers a great introduction to the wonderful world of microorganisms. Below is an impression of the exhibition.

The tree of life at the entrance shows a ‘representative selection of 1500 species, 500 of each domain’; the data comes from NCBI. A neat feature: the species lighting up in UV light are only visible by microscope, whereas the non-illuminated branches (e.g. mammals in the bottom right corner) are visible to the naked eye.

A tardigrade enlarged ~6,000x; living tardigrades are also present and visible under the microscope. I’m wondering whether its genome is also contaminated?

Micropia also features an in-house lab used to maintain the living components of the collection.

In a separate room a stir flask with Photobacterium phosphoreum produced a beautiful glow.

‘Wall of fame’ with more than 100 microorganisms in large petri dishes

Close-up on the wall of fame, Aspergillus oryzae (used to ferment soybeans to produce soy sauce), Aspergillus arachidicola (discovered on peanuts), Klebsiella (this one was only named by genus) and a specimen just named ‘yeast’

Downstairs several products were featured that could not exist without microorganisms, such as yoghurt, kimchi and ‘delicious’ pickled herring.

Overall the museum does a great job of showing the presence and use of microbes in daily life. For example, the ‘wall of fame’ contains all kinds of household items together with the microorganisms that are commonly found on them. Furthermore there is a nice collection of examples of microorganisms that are useful for breaking down waste or producing medicine. All this is vividly illustrated with a wealth of interactive installations.

I was a bit time constrained so I might have missed it, but there was little emphasis on the potential of engineered microbes. With museum sponsors such as BASF, DSM, Galapagos and MSD, I would expect a significant portion of the exhibition to be dedicated to GMOs and the endless possibilities of synthetic biology and metabolic engineering, for example by showcasing the bio-production of insulin, artemisinin or biofuel using microbes. I think the museum would be a great platform to continue the discussion in society on the use of GMOs and highlight the positive aspects.

In conclusion, a great way to spend a few hours and get to know more about the less visible forms of life.

Filed under Exhibition

Background on the poreFUME pre-print

Last week our pre-print on nanopore sequencing came online at bioRxiv. Nanopore sequencing is a relatively new sequencing technology that is starting to come of age. As part of this process we started playing with the ONT MinION sequencer last year. This post summarizes a bit of the background behind the pre-print.

Previously I covered the London Calling 2015 event, where a lot of progress on the development of the MinION was showcased. We were keen to find out how the MinION could contribute to our daily lab work, but also to see what new ground could be covered with this new sequencing technology.

One of the aspects colleagues in the lab are working on is the dissemination of antibiotic resistance genes, as a major healthcare challenge is the emergence of pathogens that are resistant to antibiotics. Therefore we thought of combining the MinION with antibiotic resistance gene profiling; more specifically, coupling functional metagenomic selections with nanopore sequencing.

Previous work in this field, for example by Justin O’Grady and colleagues, showed the use of the MinION [$] to identify the structure and chromosomal insertion site of a bacterial antibiotic resistance island in Salmonella Typhi.

Instead of going after single isolates, we set out to map the antibiotic resistance genes that are present in the gut (the resistome) of a hospitalized patient. The resistome can influence the outcome of antibiotic treatment and it is therefore highly interesting to get insights into this complex network. Through a collaboration under the EvoTAR programme with Willem van Schaik of the University of Utrecht we had a clinical fecal sample of an ICU patient available, which we used in the experiments.

Typical functional metagenomic workflow where metagenomic DNA is isolated from a (complex) environment, in this case a fecal sample. The DNA is sheared, ligated and transformed into E. coli. When profiling for antibiotic resistance genes, the cells are plated on agar containing various antibiotics. Finally the metagenomic inserts are sequenced and annotated.

Key in the whole experimental setup to capture the resistome is the use of functional metagenomic selections. In contrast to culturing individual microorganisms directly from a fecal sample, metagenomic DNA is extracted from the sample. This metagenomic DNA is subsequently sheared, ligated and transformed in E. coli and finally plated out on solid agar containing various antibiotics. Only E. coli cells that harbor a metagenomic DNA fragment that encodes for an antibiotic resistant phenotype can survive. With these functional metagenomic selections in hand, the complexity of the resistome can be rapidly mapped.

And this is where the MinION comes in. Although other sequencing technologies, such as the Illumina and PacBio platforms, are available, they do not provide both long reads and low capital requirements.



After some initial failed attempts to get the MinION sequencer running in our lab, we started to see >100 Mbase runs in October last year. Also PoreCamp last December in Birmingham provided, on top of a great experience and nice people, some useful data (next week a new round of PoreCamp takes place).

In order to analyze the sequencing data that Metrichor generates we developed the poreFUME pipeline, which automates the process of barcode demultiplexing, error correction (using nanocorrect) and antibiotic resistance gene annotation (using CARD). The poreFUME software is available on GitHub as a Python script. The subsequent analysis is also available on GitHub in a Jupyter notebook.

The Jupyter notebook with the analysis in the pre-print is available here.

In order to benchmark the nanopore sequencing data we also Sanger and PacBio sequenced the sample. These results showed that we could achieve a >97% sequence accuracy, and we were able to identify all 26 antibiotic resistance genes in both the PacBio and nanopore sets.
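To make an accuracy figure like this concrete: one common way to express sequence accuracy is the percent identity between a read and a reference over an alignment. The snippet below is only a minimal sketch of that idea, assuming the two sequences have already been aligned (with `-` marking gaps); the function name and the toy sequences are made up for illustration and this is not necessarily how the numbers in the pre-print were computed.

```python
def percent_identity(aligned_ref: str, aligned_read: str) -> float:
    """Percentage of alignment columns in which both sequences carry the same base."""
    if len(aligned_ref) != len(aligned_read):
        raise ValueError("aligned sequences must have equal length")
    if not aligned_ref:
        raise ValueError("alignment is empty")
    # A column counts as a match only if the bases agree and it is not a gap.
    matches = sum(
        1 for ref_base, read_base in zip(aligned_ref, aligned_read)
        if ref_base == read_base and ref_base != "-"
    )
    return 100.0 * matches / len(aligned_ref)


# Toy alignment (hypothetical data): 10 columns, one mismatch and one deletion.
ref = "ACGTACGTAC"
read = "ACGTTCG-AC"
print(f"{percent_identity(ref, read):.1f}% identity")
```

Note that gaps are counted as errors here; other definitions of identity exclude gap columns from the denominator, which gives slightly higher numbers.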

Since the whole workflow can be performed relatively quickly, it would be really interesting to move these techniques to the next stage and do in-situ resistome profiling. Especially integrating Matt Loose’s read-until functionality could open up new avenues. Furthermore, these experiments were done with the R7 chemistry, but it seems that the new R9 chemistry is able to deliver even higher accuracies and faster turn-around.

The fasta files and poreFUME output used in the analysis are already online; the raw PacBio and MinION data are available at ENA.

Update 2016-11-01: Added the ENA link to the raw data

Filed under Publications

SynBioBeta ’16 packed with innovation

Last Wednesday the SynBioBeta conference kicked off at Imperial College. The central topic was the current state of synthetic biology and how (commercial) value can be gained by supplying tools and platforms. In the keynote by Tom Knight from Ginkgo Bioworks, and the subsequent chat with his former PhD student Ron Weiss (now professor at MIT), a few interesting points came up that illustrate the path synbio has taken over the last two decades.

Ginkgo Bioworks founder Tom Knight (Photo courtesy of Twist Bioscience)

Tom started off with a quote from Douglas Adams, “if you try and take a cat apart to see how it works, the first thing you have on your hands is a non-working cat”, to illustrate the current (or not so distant past) state of biology in general. He used the classic ‘systems engineering’ example of the Boeing 777 to highlight where synbio should be going in his opinion: 1. design using CAD, 2. build, 3. it works. So no more tinkering and endless design-build-test cycles. To get there he argued for an extra step in the cycle, the simulate component, which would allow the end user to design and simulate a layout before actually building and testing it. However, he was quick to note that we currently lack a lot of insight into the biology of even a single simple cell, for example the minimal Mycoplasma mycoides, of which 149 of the 473 genes remain of unknown function but are essential for cell survival.

An improved version of the design cycle proposed by Tom Knight

The VLSI analogy was also brought up, and the panel noted that Voigt’s group came a step closer to this paradigm last week by rationally designing circuits and building them.

On the question of whether synbio is progressing fast enough, Ron Weiss replied that it is not “as fast as we want”. He recalled the last chapter of his thesis describing a synthetic biology programming language, which he laughingly categorized as “completely useless back then”. However, the state of mind back in the 2000s was “that within a year or 5” we would be able to build circuits with at least 30 gates (Voigt’s paper from last week showed a ‘Consensus circuit’ containing 12 regulated promoters). Tom was a bit more optimistic, saying that “You overestimate what is going to happen in 5 and underestimate what happens in 10 years”. The bottom line was the central need to make robust systems that work in the real world, and to do so more information is needed, such as whole-cell models. The session ended with a spot-on question from riboswitch pioneer Justin Gallivan, now at DARPA: “who is going to fund this research to gain basic knowledge?”. For example, who is going to elucidate the function of the 149 proteins of unknown function? One suggestion was that Venter should just pull out his checkbook again…

The investors’ perspective

Next on the program was the investors’ round table, geared towards the commercialization aspect of synthetic biology. It was debated whether the use of the term ‘synbio’ would negatively affect your final product or whether it would boost sales. Veronique de Bruijn from IcosCapital argued that the “uneducated audience will definitely judge you”, so she suggested using the term ‘synbio’ cautiously. Business models, an ever-debated topic, struck more consensus among the investors: they all agreed that it is difficult for a platform technology to go to market on its own, since it can be extremely difficult to pick the optimal specific product to apply the technology to. Karl Handelsman from Codon Capital noted that when you do have a product company it is important to engage with customers early, so you build something they really want. Related to this, he recalled that a product company on the West Coast typically exits for 60-80 million USD, so you should be aware that you can never raise more than ~9 million USD throughout the lifetime of a company. When it came to engaging with Corporate Venture Capital, the panel unanimously praised them for their expertise, but care should be taken that your exit strategies do not get limited by partnering up with them. The session was rounded off with a yes/no on the positive impact of Trump as president on synbio; only Karl was positive, because this would definitely direct lots and lots of funding towards Life-On-Mars projects.

Applications of synbio by the industry

In the ‘Application Stack’ session five companies pitched their take on synbio and how it can be used to create value, ranging from bacterial vitamin production by Biosyntia to harnessing the power of Deinococcus. Particularly interesting was Darren Platts’ talk, which showed one of Amyris’ in-house developed tools for language specification in synthetic biology. The actual challenge here was not writing the software (“pretty straightforward”); it was more difficult to get the users engaged in the project and adopting the tool. Their paper was published recently in ACS Synthetic Biology and the code will soon be released on GitHub.

Is there place for synbio in big pharma?

The final session of the first day was titled ‘Synthetic Biology for Biopharmaceuticals’, and here I found the talks of Marcelo Kern from GSK and Mark Wigglesworth from AstraZeneca especially interesting. They gave their ‘big pharma’ view on how to incorporate synthetic biology into established workflows. GSK, for example, focused on reducing the carbon footprint by replacing chemical synthesis with enzyme catalysis. Another great example was the use of CRISPR to generate drug-resistant cell lines for direct use by the in-house screening department.

The first day was rounded off by Emily Leproust from Twist Bioscience, announcing that they would be happy to take new orders from June (!) on.

The future of synbio

The second day started off with a discussion on ‘Futures: New technologies and Applications’ by Gen9 CEO Kevin Munnelly and Sean Sutcliffe from Green Biologics. Both showed examples of their companies partnering with academic institutions to get freedom to operate (FTO) in place. Sean also made the interesting comment that it took them about 4 years to commercialize “technology from the ’70s”, so he estimated it would take around 12 years before the CRISPR technology, now trickling into the labs, can be used at production scale in the fermenters.

A fun and fast-paced ‘Lightning Talks’ session gave industry and non-profit captains a platform of exactly 5 minutes to pitch their vision. Randy Rettberg gave a fabulous speech about the impact of iGEM on the synbio sector and concluded that iGEM helps cultivate the future leaders of the field. Gernot Abel from Novozymes highlighted a ‘citizen science’ project where the ‘corporate’ Novozymes worked together with the biohacker space Biologigaragen in Copenhagen to successfully construct an ethanol assay. Along these lines Ellen Jorgensen from the non-profit Genspace pitched “why a new generation of bio-entrepreneurs are choosing community labs over incubators/accelerators”, at a price point of $100/month versus $1000/month. Dek Woolfson (known for his computationally designed peptides and cages) gave an academically flavoured talk about BrisSynBio but finished his pitch by saying that they are looking for a seasoned business person to help make their tools available to a broader public.


Dek Woolfson was one of the few (still) academics on stage. (Photo by: Edinburgh iGEM)

What happens when synthetic biology and hardware meet?

The hardware and robot session showcased, among others, Biorealize, who are constructing a tabletop device to transform, incubate and lyse cells; Synthace, who just released the open source data management platform Antha; and Bento Lab (currently running a very successful Kickstarter campaign), highlighting their mobile PCR workstation. An interesting question was posed at the end as to how much responsibility Bento Lab was putting on the DNA oligo synthesis companies by democratizing PCR and making it available to the general public. Bento Lab responded that they supply an extensive ethical guide with their product and that they don’t supply any reagents. Unfortunately this very interesting discussion was cut short due to a tight conference schedule.


Tabletop transformations, incubations and lysis in one go using Biorealize

A healthy microbiome using GMOs?

In the final session of SynBioBeta a few examples of synbio applied to the microbiome came up. Boston-based Synlogic is planning to start the IND (Investigational New Drug) process for their E. coli equipped with ammonia-degrading capabilities to combat urea cycle disorder. Xavier Duportet showed an example of Eligo Bioscience using CRISPR systems delivered by phages that selectively kill pathogens, such as Staphylococcus aureus. Part of this exciting work, using mouse models, was published in Nature Biotech in 2014.


Eligo Bioscience and their CRISPR-delivered-by-phage technology (Photo by: Edinburgh iGEM)

After all these dazzling applications of synthetic biology, captain John Cumbers wrapped up SynBioBeta by announcing the next event in San Francisco on 4-6 October, and again in London next year around April.

Personally I think the conference did a great job of bringing together the industrial synthetic biology community, from early start-ups to big pharma. Although the sentiment is that we are not as far as we want to be, there have been some considerable advancements over the last 15 years. From an investor’s perspective there is still a lot of uncertainty surrounding the run-time (and the inherently coupled rate of return) of synbio projects, but the recent numbers on VC funding indicate that there is an eagerness to take the leap. Taken together, these were two jam-packed days with high-end, exciting synthetic biology applications, and it will be very interesting to see if Moore’s law also applies to synbio.

Disclaimer: The above write-up is strongly biased by my own interests, so refer to the Twitter hashtag #SBBUK16 to get a more colorful overview of the past two days.


Filed under Talk

How many new drugs does the FDA approve?

Recently a question was floated on Twitter as to how many drugs the FDA has approved. A few answers quickly came in, and it turns out the number depends heavily on what one counts as a ‘new drug’: is a registered generic molecule also a new drug, or is this not innovative enough?

For the sake of doing statistics on these numbers I’ve extracted the data points from a few data sources. First the FDA itself: they have a funky table showing the number of new drug applications (NDAs) approved and received, and the number of new molecular entities. I’ve extracted the data and plotted it below for 1944-2011; it can be downloaded here.

FDA NDA Approvals & Receipts from 1944-2011 (data)

Another interesting categorisation is the source of the molecules. In 2009 John Vederas published a highly cited article on the origin of the FDA-approved drugs between 1981 and 2007. Unfortunately the raw data behind this plot is not available, so I’ve interpolated the numbers from the figure in the article and plotted them below; again the data can be downloaded here. It is pretty clear that the number of natural (derived) molecules is declining.

 Number of drugs approved in the US split up by source from 1981 to 2007 interpolated from Vederas et al. (data)
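Reading numbers off a published figure, as was done for the plot above, boils down to a linear mapping between pixel coordinates and axis values. The snippet below is a minimal sketch of that idea, assuming a linear axis and two calibration points picked by hand from the axis ticks; the function name and all pixel values are hypothetical and are not the ones used for the data set offered for download.

```python
def make_axis_mapper(pixel_a, value_a, pixel_b, value_b):
    """Return a function that maps a pixel coordinate to a data value on a linear axis."""
    if pixel_a == pixel_b:
        raise ValueError("calibration pixels must differ")
    scale = (value_b - value_a) / (pixel_b - pixel_a)
    return lambda pixel: value_a + (pixel - pixel_a) * scale


# Hypothetical calibration: the tick for 0 drugs sits at pixel row 400,
# the tick for 50 drugs at pixel row 100 (image rows count from the top).
to_count = make_axis_mapper(400, 0.0, 100, 50.0)

# Hypothetical measurement: the top of one bar sits at pixel row 250.
bar_top_pixel = 250
print(round(to_count(bar_top_pixel)))
```

Rounding to whole drugs at the end keeps small pixel-picking errors from showing up as fractional approvals, but an error of one or two counts per data point is still to be expected with this approach.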

A feature in Drug Discovery Today by Kinch et al. shows an extensive analysis of new molecular entities as well as of who paid for the R&D. As a commenter notes on PubMed Commons, it is too bad the underlying data is not available. Therefore the graph shown below is an interpolation of their figure.

Number of new molecular entities (NME) approved by the FDA from 1930-2013, interpolated from Kinch et al. (data)

A quick comparison shows that the FDA NME numbers and the numbers by Kinch et al. are in the same ballpark. Deviations can be due to my interpolation or to a difference in counting NMEs; for example, Kinch et al. are “excluding imaging and diagnostic agents”.

Comparison of new molecular entities as reported by the FDA and  Kinch et al. 

If anyone has a more comprehensive article or publicly available numbers, I would greatly appreciate hearing about it.


Filed under Science Article

porecamp: a great week of nanopore sequencing

This week I attended porecamp at the University of Birmingham, focused on the use of the MinION nanopore sequencer. The workshop was hosted by Nick Loman and included interactive sessions with Matt Loose, Mick Watson, Josh Quick, John Tyson, Justin O’Grady and Jared Simpson. So pretty much every aspect of nanopore sequencing, from library preparation to assembly polishing, was covered. Below is a brief overview of the activities; a detailed account will soon be written up in an F1000 article by the participants.

Everyone had the opportunity to bring some DNA samples to try in the new ‘native barcoding protocol’. This pre-release protocol allows for the pooling of multiple samples on one flow cell by attaching, in an extra ligation step, a barcode to the individual samples. The initial results looked pretty good in the sense that it should be possible to obtain an equal distribution of DNA from a pooled library. It also became evident that the use of high quality DNA improves the output from the MinION. When working with genomic DNA the best strategy is to start with a fresh culture, extract directly with phenol-chloroform and not freeze the DNA before the library prep.

Josh explaining the library prep protocol

John Tyson and Matt Loose thoroughly demonstrated the use of software add-ons to improve the process. John’s scripts optimize the way the sequencer selects the correct pore to sequence from, and Matt’s minoTour software lets you analyse the data in real time as it comes off the sequencer. He also showed some pretty cool initial results of the read-until feature, for example to balance the reads of a pooled sample.

Matt performing a -1 G nanopore run

On the bioinformatics side we gave, after diving into the fast5 file format, Heng Li’s new assembler miniasm a try, resulting in very rapid genome assembly. It will be interesting to see how miniasm finds its way into the assembly pipelines.

In conclusion, this was an extremely valuable week to get to know everyone and exchange knowledge on the latest practices in the nanopore sequencing world. So again a big thanks for the perfect organization.

The course material is available on GitHub and additional information can be found on Twitter under #porecamp.

Filed under Course

deFUME webserver paper published last week!

Last week we published our deFUME paper in the open access journal BMC Research Notes. The aim is an easy-to-use web-based interface for processing, annotation and visualization of functional metagenomics sequencing data, specifically targeting wet-lab scientists (or non-bioinformaticians).
A quick intro to functional metagenomics: it’s a subfield of the more widely known metagenomics. The term metagenomics was first introduced by Handelsman and Clardy in 1998 and describes a method to extract DNA from the environment (the metagenome) and study it by either sequencing or functional analysis. The first approach does what the name says: extract and sequence as much DNA as possible and use bioinformatics tools to try to determine the function. In this way Hess et al. [2] were able to computationally identify 27,755 putative carbohydrate-active genes in cow rumen. However, a drawback of this method is that these genes need to be experimentally validated.

Different phenotypes that can be observed when expressing a metagenomic library, for example halo formation, pigmentation or morphological changes.

Functional metagenomics works, in that sense, the other way around: a metagenomic library is transformed into a laboratory host (for example E. coli) and cultured while monitoring for a phenotypic change. For example, if one is looking for proteases, the agar plate can be supplemented with milk, and colonies creating a halo can be deemed positive for proteolytic activity. These colonies can subsequently be sequenced and the predicted genes functionally annotated. For this last process we created the deFUME webserver, which integrates the whole process from vector trimming to domain annotation into one pipeline.

The workflow of deFUME is visualized in the figure below where processes are depicted in red and (intermediate) files in black:

deFUME web server flowchart, processes are in red and files/objects in black. From [1]

As input, deFUME takes either Sanger chromatograms (as .ab1 files) or, in case of a next-generation sequencing run, the assembled nucleotide sequences in FASTA format. In the next steps the data is processed and annotated with BLAST and InterPro data. The user can then interact with the data in an interactive table, for example to filter on e-value, remove hypothetical proteins or show more or less detail. Finally the annotations can be exported in FASTA or GenBank format or as a simple csv file.
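The kind of filtering offered by the interactive table can be sketched in a few lines. This is only an illustration under stated assumptions: the record layout, the field names (`orf`, `hit`, `evalue`) and the function `filter_annotations` are hypothetical and do not reflect the actual data format or code of deFUME.

```python
# Hypothetical annotation records, loosely modelled on BLAST hits per predicted ORF.
annotations = [
    {"orf": "orf1", "hit": "beta-lactamase", "evalue": 1e-50},
    {"orf": "orf2", "hit": "hypothetical protein", "evalue": 1e-30},
    {"orf": "orf3", "hit": "transposase", "evalue": 0.5},
]


def filter_annotations(rows, max_evalue=1e-5, drop_hypothetical=True):
    """Keep hits at or below the e-value cutoff, optionally dropping hypothetical proteins."""
    kept = []
    for row in rows:
        if row["evalue"] > max_evalue:
            continue  # hit is not significant enough
        if drop_hypothetical and "hypothetical" in row["hit"].lower():
            continue  # uninformative annotation
        kept.append(row)
    return kept


for row in filter_annotations(annotations):
    print(row["orf"], row["hit"], row["evalue"])
```

With the toy records above only the beta-lactamase hit survives both filters; relaxing either setting brings the other rows back, which mirrors the "show more or less detail" idea of the table.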

Why would you use the webserver?

  1. It’s free for academic users
  2. It saves time compared to, for example, running the same workflow in CLC
  3. It’s easy because you don’t spend time on intermediate files, for example vector trimming the contigs and pushing those to BLAST.

Screenshot of deFUME showing the functional annotations (A) and the interactive toolbox (B). From [1]

So where did this idea originate from?

It actually started out in the summer of 2013 with a small project at the CIID (Copenhagen Institute of Interaction Design), where we designed all kinds of interactive visualizations. In the lab we had a functional metagenomic data set lying around, but some colleagues found it challenging to analyze the data and interact with it. So out of curiosity I made the following sketch (on GitHub) in Processing that would, based on InterPro data, give a quick overview of the sequences and annotated InterPro domains.

Screenshot of the initial sketch made in Processing

This small Processing sketch was a direct hit, and the idea arose to make this kind of interaction more widely available. One basic necessity would be to also include the data processing in the visualization, so the user only has to push one button in order to get an interactive visualization.
Therefore we implemented a backend that runs on the Center for Biological Sequence Analysis (CBS) servers at the Technical University of Denmark (DTU) and handles the data pipeline, from basecalling to BLASTing. Another quick realization was that a Processing sketch is not very portable or user-friendly, whereas a web interface would be. Therefore we built a table-based module (using jqGrid) to display the functional annotations and used the HTML5 canvas to draw a visual representation of the data. We used JavaScript to let the different components talk to each other and some D3.js to display a histogram of GO terms. On the backend the pipeline is implemented in Perl, and all the data is structured and stored in a single JSON object that is delivered to the client using PHP.

What is next?
We are very happy with the current version, but while developing we already came across a number of features that would make a great appearance in version 2, for example EcoCyc integration, reporting of GC content over the stretch of the contig, exporting the InterPro annotations in the GenBank file and optimizing the coloring scheme. So in case you are a student and interested in working on deFUME, you can drop me an email.

The deFUME paper can be found here, the webserver here with a working example here. Contributions can be made to the deFUME GitHub repository.

[1] van der Helm, E., Geertz-Hansen, H. M., Genee, H. J., Malla, S. & Sommer, M. O. A. deFUME: Dynamic exploration of functional metagenomic sequencing data. BMC Res. Notes 8, 328 (2015).

[2] Hess, M. et al. Metagenomic discovery of biomass-degrading genes and genomes from cow rumen. Science 331, 463–7 (2011).

Filed under Publications

Ultimaker replacing temperature sensor

Last week the Ultimaker 2 gave an ominous ERROR – STOPPED TEMP SENSOR message.

The Ultimaker 2 temperature error

After consulting Ultimaker support and measuring the resistance over the Pt100 sensor in the printer head (only 138 Ohm when heated up, which would correspond to only about 100 °C), the culprit was quickly identified. Luckily the Ultimaker support page contains a very elaborate step-by-step instruction on how to replace the Pt100 sensor. Although the instruction is very clear, it takes quite some time to perform all of the disassembly and subsequent assembly steps to replace the Pt100. Also be sure to replace the temperature sensor and not the heating element: they have the same shape, and the heating element is only slightly bigger.

Heating element on the left and new Pt100 temperature sensor on the right
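The resistance check mentioned above can be reproduced with the standard Pt100 characteristic, which for temperatures at or above 0 °C is the quadratic R(T) = R0(1 + A·T + B·T²). The snippet below is a minimal sketch using the standard IEC 60751 coefficients; these values and the function names are assumptions of this example and are not taken from Ultimaker documentation or firmware.

```python
import math

R0 = 100.0      # Pt100 resistance at 0 °C, in ohm
A = 3.9083e-3   # IEC 60751 coefficient, 1/°C
B = -5.775e-7   # IEC 60751 coefficient, 1/°C^2


def pt100_resistance(temp_c: float) -> float:
    """Resistance in ohm of an ideal Pt100 at temp_c (valid for 0 °C and above)."""
    return R0 * (1 + A * temp_c + B * temp_c ** 2)


def pt100_temperature(resistance_ohm: float) -> float:
    """Invert the quadratic above: temperature in °C for a measured resistance."""
    discriminant = A ** 2 - 4 * B * (1 - resistance_ohm / R0)
    return (-A + math.sqrt(discriminant)) / (2 * B)


# The reading from the broken sensor: 138 ohm while the nozzle was heated up.
print(f"138 ohm corresponds to {pt100_temperature(138.0):.1f} °C")
```

A healthy sensor in a nozzle at printing temperature should read well above 138 Ohm, so a value that maps to just under 100 °C points at the sensor rather than at the heater.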

After removing the temperature sensor from the heatblock with the help of some WD-40, it was pretty clear that the sensor was, for an unknown reason, completely destroyed. Replacing the Pt100 with a fresh one from the factory directly solved the problem and we are happily printing again.

The broken Pt100 temperature sensor

Filed under 3Dprinting

Wrapup of Visualizing Biological Data ’15

From the 24th till the 27th of March I visited the Broad Institute of Harvard and MIT in Boston to attend the VizBi 2015 conference. The aim of this conference is to advance knowledge of the visualization of biological data; the 2015 iteration was the 6th international meeting. Here is a long-overdue recap of two talks that I thought were particularly interesting.

On Wednesday John Stasko kicked off as keynote speaker with some very interesting notions about the different applications of visualization: it is either for presentation (=explanatory) or for analysis (=exploratory). This difference is important since each has its own goals. When presenting results the goals are to clarify, focus, highlight, simplify and persuade. When analyzing data, however, the goals are to explore, make decisions and use statistical descriptors.

A good quote also came by here: “If you know what you are looking for, you probably don’t need visualizations”.

So when you do decide you need a visualization it is most useful for analysis (=exploratory), in this case it can help you:

  • If you don’t know what you are looking for
  • If you don’t have a priori questions
  • If you want to know what questions to ask

So typically these kinds of visualizations show all variables, illustrate overview and detail, and facilitate comparison. A result of this setup is that “analysis visualizations” are difficult to understand: because the underlying data is complex, the visualization is probably complex too. This is not a bad thing, but the user needs to invest time to decode the visualization.

A perfect example of an exploratory visualization is the Attribute Explorer from 1998[1]. Here the authors used the notion of compromise to analyze a dataset. For example, when searching for a new house you might look at the price, the commuting time and the number of bedrooms. However, when setting a hard limit on each of these attributes you might miss the house that has a perfect price and number of bedrooms but just a 5-minute longer commute. The paper shows that with coupled histograms the user is still able to see these “compromise solutions”. The PDF of the article is available here, showing some old school histograms.

The concepts of the Attribute Explorer from 1998 are nowadays still relevant
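
To make the idea of “compromise solutions” concrete, here is a minimal Python sketch. It is not the Attribute Explorer itself: the houses, the limits and the function names are all made up for illustration. Instead of throwing away every house that breaks a limit, it records which limits each house fails, so near-misses stay visible.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class House:
    name: str
    price: int        # in thousands (hypothetical data)
    commute_min: int  # commuting time in minutes
    bedrooms: int


# Hypothetical limits a house hunter might set; not taken from the paper.
LIMITS = {
    "price": lambda h: h.price <= 300,
    "commute_min": lambda h: h.commute_min <= 30,
    "bedrooms": lambda h: h.bedrooms >= 3,
}


def failed_limits(house):
    """Return the names of the limits this house does not satisfy."""
    return [name for name, ok in LIMITS.items() if not ok(house)]


def misses_histogram(houses):
    """Count houses by the number of limits they fail (0 = full match)."""
    return Counter(len(failed_limits(h)) for h in houses)


HOUSES = [
    House("A", 280, 25, 3),
    House("B", 250, 35, 4),  # only the commute is 5 minutes too long
    House("C", 400, 50, 2),
    House("D", 310, 20, 3),
]

if __name__ == "__main__":
    for h in HOUSES:
        print(h.name, failed_limits(h))
    print(dict(misses_histogram(HOUSES)))
```

A strict filter would return only house A. Keeping the list of failed limits shows that house B misses on the commute alone, which is exactly the kind of candidate the hard limits would have hidden.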


The takeaway: a visualization is radically different depending on whether one presents the data or analyses it.

An often encountered problem with visualization is high data complexity; too high to visualize in one go. There are a few options to tackle this:

  • pack all the data in one complex representation
  • spread the data into multiple coordinated views (pixels are John’s friend)
  • use interaction to reveal different subsets of the data

When interacting with data, users have different intents. A 2007 InfoVis paper co-authored by Stasko [2] describes 7 of them:

  1. Select
  2. Explore
  3. Reconfigure
  4. Encode
  5. Abstract/Elaborate
  6. Filter
  7. Connect

However, 95% of the intents are made up by Tooltip & Selection (in order to get details), Navigation and Brushing & Linking. This gives rise to a chicken-and-egg problem: why are only those 4 intents used so extensively, and how can one make a visualization more effective?

An example Stasko showed was the use of a tablet[3] where there is a whole wealth of new gestures available, as is best illustrated in this video:

As a conclusion Stasko gives his own formula that captures the value of visualization.

Value of Visualization = Time + Insight + Essence + Confidence:

  • T: Ability to minimize the total time needed to answer a wide variety of questions about the data
  • I: Ability to spur and discover insights or insightful questions about the data
  • E: Ability to convey an overall essence or take-away sense of the data
  • C: Ability to generate confidence and trust about the data, its domain and context


On Friday Daniel Evanko (@devanko) from Nature Publishing Group spoke about the future of visualizations in publications. There is currently a big gap between the rich data sets that people publish and the way these are incorporated in scientific articles. Evanko made some interesting points from a publisher’s perspective.

The current “rich” standards such as PDF are probably good for a dozen years to come; however, newer formats such as D3, Java and R can break or become unsupported at any time. On the other hand, basic print formats such as paper or microfilm can be kept for 100 years. Although this is a conservative standpoint, in my opinion it indeed makes sense to keep the long-term perspective in mind when releasing new publication formats, because who says Java will be supported in 20 years? However, I think that with thorough design the community should be able to come up with defined standards that have the lifetime of microfilm.

Another argument Evanko used was that the few papers published with interactive visualizations do not generate a lot of traffic, from which the conclusion was drawn that the audience doesn’t want this kind of visualization, so publishers will not offer them. Again I feel we may be dealing with a chicken-and-egg problem here.

I’m grateful to the Otto Mønsteds Fond for providing support to attend VizBi ’15.



  1. Spence R, Tweedie L: The Attribute Explorer: information synthesis via exploration. Interact Comput 1998, 11:137–146.
  2. Yi JS, Kang YA, Stasko JT, Jacko JA: Toward a Deeper Understanding of the Role of Interaction in Information Visualization. IEEE Trans Vis Comput Graph 2007, 13:1224–1231.
  3. Sadana R, Stasko J: Designing and implementing an interactive scatterplot visualization for a tablet computer. Proc. 2014 Int Work. 2014:265–272.



Filed under Talk

Recap of the Nanopore sequencing conference ‘London Calling’ by ONT

Last Thursday and Friday Oxford Nanopore Technologies (ONT) hosted its first conference, ‘London Calling’, where participants of the MinION Access Program (MAP) presented their results and experiences after 11 months of the program. The CTO of ONT also delivered a session in which the future directions were outlined. Below is a quick recap of the two days of London Calling.

There were about 20 talks (agenda) by a broad range of scientists, from microbiologists to bioinformaticians. A few observations I found interesting to share:

  • John Tyson (University of British Columbia) wrote a script that slightly alters the voltage during the run to keep the yield curve linear; he uses this method as standard for all of his runs
  • The majority of the presenters use only the 2D reads
  • A nice month-by-month overview of the MAP program can be found in Nick Loman’s talk here
  • Miles Carroll (Public Health England), Josh Quick (University of Birmingham) and Thomas Hoenen (NIH/NIAID) went to Africa last year to sequence the Ebola virus outbreak and were able to map the outbreak on a phylogenetic timescale; they used RT-PCR to generate the input material. The main conclusions were that field sequencing with the MinION works, that the Ebola mutation rate is not higher than that of other viruses, and that key drug targets are not mutating.
  • People are exploring a lot of options to use it in a clinical setting, for example for rapid identification of bacterial infections (Justin O’Grady, University of East Anglia) or for pharmacogenomics (Ron Ammar, University of Toronto): in short, which drugs not to prescribe to patients because their liver cannot metabolise them due to a genetic modification; read the paper here.
  • A detailed account of how to assemble a bacterial genome with only Nanopore data by Jared Simpson can be found on Slideshare; it’s an interactive version of this pre-print
  • Currently MinION + MiSeq data is the way to go for genome assembly in the short-term future (according to Mick Watson). Alistair Darby (University of Liverpool) argued for using just 1 sequencing technology to perform the whole genome assembly, because too much time can be (and is) wasted integrating the different sequencing methods with different algorithms.

DNA sequencing becomes really personal now

During the talks some requests were put forward:

  • More automation for lib prep / a faster lib prep protocol (this will be tackled with VolTRAX and/or a bead protocol for low input material, and a 10-minute protocol for 1D reads announced by CTO Clive Brown)
  • More stable performance between individual flow cells
  • Base calling off-line so no need to connect to the cloud
  • Tweaking the base caller for base pair modifications (for example methylation)

On Thursday afternoon Clive Brown, the CTO of ONT, gave his talk. On Twitter it was compared with a “Steve Jobs style” reveal of new products.

A few points he presented:

  • At the end of the year or next year there will be a new MinION release that has the ASIC electronics not in the flow cell but in the MinION itself; this would drastically cut the price of the flow cells (from $1000 to $25). Another big change is that the chip will contain 3000 channels instead of 512. Furthermore, the runtime of this device will be around 2 weeks.
  • All shipments should be at room temperature soon
  • A “fast mode” will be available within the next 3 months, in which a typical run will generate not 2 Gbase but 40 Gbase of data.
  • VolTRAX is being developed: it can be clicked onto a flow cell and will automate the full lib prep process. They imagine users loading a mL of blood sample on the VolTRAX and having it prepped automatically.
  • At the same time ONT will implement a different price structure where you pay per hour of sequencing instead of per flow cell, so you can just run a MinION for 3 hours, pay, say, $270 and not pay anything else.
  • The PromethION (roughly 48 MinIONs in 1 machine, with more channels per chip) will be launched with sequencing core facilities as the main customer in mind; however, they will create a MAP for this (PEAP) as well. The PromethION will include the above improvements too, making it potentially more productive than a HiSeq.

Oxford Nanopore Technologies CTO Clive Brown showcasing the automatic sample preparation device VolTRAX.
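
To put the announced numbers side by side, here is a small back-of-the-envelope calculation in Python. The inputs are the figures quoted in the talk as recapped above; the derived ratios are plain arithmetic on those figures, not ONT specifications, and the $270 for 3 hours was only a “say” example, so the hourly rate below is the rate that example implies.

```python
# Figures as quoted in the recap of Clive Brown's talk.
FLOW_CELL_PRICE_NOW_USD = 1000
FLOW_CELL_PRICE_NEW_USD = 25
CHANNELS_NOW = 512
CHANNELS_NEW = 3000
YIELD_NOW_GBASE = 2
YIELD_FAST_MODE_GBASE = 40
EXAMPLE_RUN_HOURS = 3
EXAMPLE_RUN_PRICE_USD = 270


def ratio(a, b):
    """How many times larger a is than b."""
    return a / b


def hourly_rate(total_price, hours):
    """Price per hour implied by a total price and a run length."""
    return total_price / hours


if __name__ == "__main__":
    print("flow cell price drops by a factor of",
          ratio(FLOW_CELL_PRICE_NOW_USD, FLOW_CELL_PRICE_NEW_USD))
    print("channel count grows by a factor of",
          round(ratio(CHANNELS_NEW, CHANNELS_NOW), 2))
    print("fast mode yield grows by a factor of",
          ratio(YIELD_FAST_MODE_GBASE, YIELD_NOW_GBASE))
    print("implied price per hour (USD):",
          hourly_rate(EXAMPLE_RUN_PRICE_USD, EXAMPLE_RUN_HOURS))
```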

In conclusion the conference atmosphere was very upbeat with a lot of enthusiasm for the future of nanopore sequencing. Can’t wait to get this MinION started.




Filed under Talk

3D printing: Prevent warping with ABS

A common problem when 3D printing with ABS is the warping that occurs when printing larger objects. Warping is the bending of the outer edges of the printed object due to shrinkage of the ABS as it cools down. There are already plenty of solutions around (e.g. this Instructable, where Kapton tape is used), but I found this one works particularly well without any Kapton tape.

I found it works best to put leftover pieces of ABS in acetone and let them dissolve for about an hour (preferably use a glass jar, since acetone dissolves several common plastics). The resulting solution becomes pitch black and is a bit viscous. Next, heat up the glass bed of the printer (110 °C) and apply a thin layer of the ABS solution around the outline of your print. Watch out for the acetone fumes, since acetone is a very flammable liquid! Use a ventilated room.

When the first layer of the brim is printed, add drops of the ABS solution on the corners of the brim. This will partly dissolve the brim but makes it stick even better to the plate.


Note that this ONLY works with ABS and not with PLA because PLA does not dissolve in acetone.


Filed under 3Dprinting