What do self-driving cars and preventive genomics have in common? Both are projects within Google X, a clandestine facility and offshoot of Google Inc. that aims to advance technology in significant ways. Google X Life Sciences is its latest endeavor, and the Baseline Study is its first project. The study focuses on medical genetics and aims to develop a baseline model of the healthy human body, rather than mapping diseases, which has been the aim of many previous studies. The researchers are collecting a broad spectrum of data, ranging from how study participants digest food and how they react to and metabolize drugs to how fast their hearts beat, along with other biological data. Participants' DNA will be sequenced, and body fluids will be collected and analyzed. While Google X Life Sciences builds the infrastructure, it is starting out small, enrolling only 175 individuals in the study. But make no mistake, this will be a major advancement in the use of Big Data in medical research.
So what is Big Data, and why should you pay attention to it? In March, Harvard Magazine ran an article titled “Why ‘Big Data’ Is a Big Deal.” Experts use the term Big Data to refer to significant improvements in statistical and computational methods. The end goal is to create knowledge by linking datasets and visualizing patterns. Humans have been doing this for a long time on limited amounts of information, but computers are getting increasingly better at it. In the past, medical research was often designed to minimize variables in order to observe direct effects. The innovation here is the ability to accommodate vast, multifactorial datasets and extract meaningful findings and insights in a new way.
Biotech has been leveraging fledgling efforts in Big Data. These range from something as small as influencing scientific sales to something as big as reframing how we develop scientific hypotheses. Multinational life science research providers are most likely already deploying algorithms that sort through massive electronic records to predict which product advertisements individual scientists should see as they surf the web. On a grander scale, Big Data has provided fundamentally new approaches to experimental design. Instead of studying which specific pathways lead to cancer, we can ask how we can avoid getting cancer. Calico, backed by Google, is definitely on to this idea and is sequencing the genomes of healthy 100-year-olds to help find the fountain of youth.
Big Data is also contributing to career development in biotech. First, for job seekers and executive recruiters, we are already seeing LinkedIn’s role. The company is deploying algorithms to predict links between similar people, skills and job opportunities. So far the results can be inexact, and the correlations are sometimes mismatched, but they will get better with time. Second, Big Data will drive new and more numerous career opportunities in what, for lack of a better term, might be called Big Science. Existing fields that are already based on high-throughput methods, such as synthetic biology, sequencing efforts and genomics in general, may benefit most readily. Gone are the days of one gene, one protein and one pathway. And lastly, Big Data will usher in new opportunities for people who thrive on asking big questions and getting insightful answers.