I don’t think there was a single session I attended at BIO 2017 that didn’t mention Big Data. One fact that has stuck with me is that Craig Venter (who led the privately-funded version of the Human Genome Project and now leads Human Longevity, Inc.) is in the top one percent of users of Amazon servers – putting him up there with Netflix and Zynga. That is a LOT of data!!
The data output of DNA sequencing technologies is growing faster than computing power, which means that our ability to generate this type of data has outstripped our ability to store and analyse it.
Venter’s Human Longevity has sequenced over 45,000 genomes since it launched, and one of its rate-limiting factors is computer storage and CPU usage. This costs the company about $1 million a month.
In a beautiful piece of circularity, Microsoft announced earlier in 2017 that within the next year or two they plan to be able to store data on strands of DNA and expect to have an operational storage system up and running by 2020. This will be achieved through DNA synthesis and is in fact already possible, as illustrated over five years ago when a Harvard geneticist encoded one of his textbooks on strands of DNA. Currently it’s just a bit too slow and expensive to be practical.
So, what can we do with all this data? One of the most fascinating talks I heard at BIO 2017 was given by Atul Butte – the Priscilla Chan and Mark Zuckerberg Distinguished Professor at the University of California, San Francisco (along with a long list of other impressive appointments). He is a champion for the use of public data to promote science (he was recognised by former US president Barack Obama for this commitment), and his passion is obviously infectious. More than half of his students go on to create startups in this space, even those who go into academia. My hands-down favourite quote of the conference was his comment that “If we want to change the world, we can’t sit around writing papers about it.”
Atul shared a long list of startups born to create value and improve health outcomes using big public data. He started with Carmenta Bioscience, which he co-founded in 2013 with seed investment of US$2 million. Using data mining techniques and publicly-available data from women who had suffered preeclampsia to identify biomarkers associated with this life-threatening condition, Carmenta developed an extremely accurate diagnostic for preeclampsia and in less than two years was sold for an undisclosed amount to Progenity. This is how you make money from science! And the best thing is, as Atul pointed out, data is totally reusable – it’s not like oil or water.
Unlike IT startups, one does not typically think of drug discovery companies starting up in garages or mothers’ basements. But NuMedii could have done exactly that. They are a four-person company using computational prediction and big public datasets to try to work out which currently-available drugs can be used to treat other diseases. And they aren’t the only ones doing this: see “9 Computational Drug Discovery Startups Using AI”.
Atul emphasised the importance of tools which can analyse and manage this data. Machine learning, a subset of artificial intelligence (AI), has some particularly interesting applications. It is great for non-linear things, like pictures and pattern recognition, which means it can be used on medical data like X-rays. For a great infographic on AI see the image below, or view the full PDF online.
Interestingly, at a recent AI meet-up in Auckland, a surgeon from Auckland Hospital noted that he spends a lot of time looking at X-rays as part of the triage process – he was keen to know if this is something that could be done by AI. The first question the crowd asked was “how much data do you have?”, i.e. how many X-rays. “Heaps!” was the response… and spontaneously individuals in the room with the right expertise started mapping out potential solutions to this problem. I can’t help but wonder what goldmines of data might be lying around in our district health boards.
The convergence of AI and biotechnology offers an exciting opportunity for New Zealand – we have a wealth of the life science expertise which is essential for these types of companies, an increasing number of data scientists, and investors who love IT startups. This could be a match made in heaven.
At the end of the day, we have the data and we have the tools. In some ways the hardest part is still figuring out what question you want to ask: what is the unmet need? The answer to this question is where the opportunity lies.
Contact Kimberlee Jordan on Twitter @kimberlee_j
Callaghan Innovation is a New Zealand government innovation agency that works with Kiwi companies to accelerate commercialisation of their new technology ideas. Our National Technology Networks team supports businesses via our four technology platforms – Advanced Manufacturing, Advanced Materials, Biotechnology and Data & IoT, with the aim of helping companies rapidly connect to new and advanced technologies.