The Next Generation of Proteome Analysis

The proteome is an incredibly rich and dynamic source of biological information, key to understanding how cells function in sickness and health. Genes and genomes give hints as to what might happen in a cell, but proteins do the work and reflect what is happening in real time. However, it is challenging to measure the breadth and detail of the proteome at scale. Consequently, scientists are constantly trying to understand how the proteome affects physiological behaviors without a complete picture.
Parag Mallick, PhD
Co-founder and Chief Scientist
Nautilus Biotechnology
In this Innovation Spotlight, Parag Mallick, co-founder and chief scientist at Nautilus Biotechnology, explains how unlocking the proteome can lead to breakthrough solutions for crises in healthcare and beyond.
What challenges do scientists face when studying the proteome?
First off, proteins are highly dynamic. They change in abundance, location, and modification state every second of every day. These changes can greatly alter a protein’s impact on biology. For instance, modifications such as phosphorylation and ubiquitination might activate an enzyme or cause a protein to degrade. Additionally, protein abundances span an enormous range, and reliably measuring proteins across such a range of concentrations is extremely challenging. The analytical ranges of most measurement techniques cover only 1-3 orders of magnitude, but protein concentrations potentially vary across more than 10 orders of magnitude. Finally, proteins are extremely different from each other biophysically: some are large, others are small; some are positively charged, others are hydrophobic. Tolerating such a wide variety of physical characteristics is extremely challenging for most measurement modalities.
What would a deeper exploration of the proteome mean for our understanding of biology?
Biology is the coordinated action of billions of individual molecules within 30 trillion cells working together to sense and respond to changes in their environment, and we believe such sense and response systems are driven by changes in protein abundance, location, and modification state. Thus, investigating the proteome helps us better understand our biology. This deeper understanding could transform all areas of scientific research, from basic biology to drug development: enabling earlier disease detection, novel biomarker discovery, and more precise therapeutic interventions.
How did you become interested in developing a next-generation protein analysis platform?
I have spent years at the intersection of computation and large-scale biology, taking a multi-scale biology approach to understand how diseases start and progress, as well as to identify diagnostic and prognostic biomarkers. These studies have made it incredibly clear that existing proteomics tools are powerful and can be applied to ask many questions. However, they have also made it clear that current approaches are insufficient.
After decades of working to improve existing workflows, I began to see an opportunity to start with a fresh slate and build something totally new that still complements existing approaches. That led me to co-found Nautilus with tech leader Sujal Patel to realize this vision of democratizing proteomics.
What can Nautilus’s platform measure and how does it work?
At its core, the Nautilus Proteome Analysis Platform is based on the concept of massive-scale “Iterative Mapping” of protein molecules at the single-molecule level. Here, individual protein molecules from a sample are denatured and then deposited onto individual landing pads within a nanofabricated flow cell such that each protein molecule resides at a unique coordinate. Following conjugation, we sequentially introduce a series of fluorescently labeled affinity reagent probes into the flow cell and detect which protein molecules they bind.
To achieve unprecedented scale, our platform iteratively probes billions of individual, intact, and uniformly distributed protein molecules deposited on massive flow cells. The resulting binding pattern maps either short amino acid sequences or specific modifications to each single protein molecule on each landing pad. Finally, our machine learning framework uses these binding patterns to identify the specific proteins or proteoforms found on each landing pad, comprehensively profiling the landscape of the proteome and targeted proteoforms.
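To make the decoding idea concrete, here is a minimal toy sketch of how a binding pattern could be matched to a protein identity. It is not Nautilus's algorithm: the candidate sequences, the probe motifs, the error rates (`p_miss`, `p_false`), and all function names are assumptions invented for illustration, and a simple per-probe likelihood score stands in for the machine learning framework described above.

```python
import math

# Hypothetical toy database: names and sequences are made up for illustration.
CANDIDATES = {
    "protA": "MKTAYIAKQR",
    "protB": "MSDEQLLKTA",
    "protC": "GAVLIPFWMG",
}

# Hypothetical probes, each assumed to recognize one 3-residue motif.
PROBES = ["KTA", "AYI", "EQL", "PFW", "LLK"]


def expected_pattern(sequence, probes=PROBES):
    """Return 1 for each probe whose motif occurs in the sequence, else 0."""
    return [1 if motif in sequence else 0 for motif in probes]


def log_likelihood(observed, expected, p_miss=0.3, p_false=0.05):
    """Score an observed binding pattern against an expected one.

    p_miss:  assumed chance a probe fails to bind a motif that is present.
    p_false: assumed chance a probe binds although its motif is absent.
    """
    total = 0.0
    for obs, exp in zip(observed, expected):
        p_bind = (1.0 - p_miss) if exp == 1 else p_false
        total += math.log(p_bind if obs == 1 else 1.0 - p_bind)
    return total


def identify(observed, candidates=CANDIDATES):
    """Return the candidate whose expected pattern best explains the data."""
    scores = {
        name: log_likelihood(observed, expected_pattern(seq))
        for name, seq in candidates.items()
    }
    return max(scores, key=scores.get)


# One binding event per probe cycle, recorded for a single landing pad.
# The last probe (LLK) was missed, yet protB remains the best match.
print(identify([1, 0, 1, 0, 0]))
```

Because each probe is only semi-specific and individual binding events can be missed, the sketch scores every candidate against the whole pattern rather than relying on any single probe.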
This foundational technology can be applied in two distinct, but complementary modes of operation. In our broadscale mode, we use a completely novel class of affinity reagents. These “multi-affinity probes” are designed in-house to semi-specifically recognize short amino acid sequences (3-4 amino acids long). The output is a list of protein identities and abundances. In our targeted mode, we use off-the-shelf affinity reagents that are highly specific to individual modification sites on targeted proteins. By counting the number of times we observe a given combination of modifications, we can quantitatively measure the proteoform composition of the sample across thousands of protein variants.
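The counting step in the targeted mode can be sketched in a few lines. This is an illustrative example only: the per-molecule readouts, the site labels (`site1`, `site2`, `site3`), and the helper names are hypothetical, and the sketch assumes each landing pad has already been reduced to the set of modification sites detected on it.

```python
from collections import Counter

# Hypothetical readout: the set of modification sites detected on each
# landing pad (one set per single protein molecule). Labels are made up.
molecules = [
    {"site1", "site2"},
    {"site1"},
    set(),                # unmodified molecule
    {"site1", "site2"},
    {"site3"},
    {"site1"},
    {"site2", "site1"},
]


def count_proteoforms(readouts):
    """Count how often each combination of modifications is observed."""
    # Sorting makes {"site2", "site1"} and {"site1", "site2"} the same key.
    return Counter(tuple(sorted(sites)) for sites in readouts)


def relative_abundance(counts):
    """Convert raw molecule counts to fractions of all molecules counted."""
    total = sum(counts.values())
    return {form: n / total for form, n in counts.items()}


counts = count_proteoforms(molecules)
fractions = relative_abundance(counts)

for form, n in counts.most_common():
    label = "+".join(form) if form else "unmodified"
    print(f"{label}: {n} molecules ({fractions[form]:.0%})")
```

Each distinct combination of sites is treated as one proteoform, so the tally gives a direct molecule count per proteoform rather than an averaged signal per modification.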
In Iterative Mapping, the Nautilus Proteome Analysis Platform repeatedly interrogates billions of single, intact protein molecules on its massive, nanofabricated protein arrays using probes that bind either to short protein sequences (~3 amino acids) or to protein features, such as post-translational modifications and isoform-specific sequences, that combine to define proteoforms. Nautilus’s machine learning-powered algorithms use the observed binding patterns to identify each protein or proteoform at the single-molecule level, and identifications are summed to provide protein or proteoform counts. This method thus provides quantitative maps of the proteome.
Nautilus Biotechnology
What are some examples of applications for the Nautilus Proteome Analysis Platform?
Novel proteomic data generated by our platform has the potential to improve medicine in many ways, including the discovery of disease biomarkers and drug targets. We recently shared data at the US Human Proteome Organization (HUPO) meeting and in a preprint demonstrating that our platform can measure hundreds of proteoforms of the Alzheimer’s disease-associated Tau protein, with abundances varying over about 3 orders of magnitude, in neuronal organoids, multicellular integrated brains, mouse brains, and, for the first time, human brains. We are encouraged by this data and the adaptability of our technology to assess other proteins of interest.
What excites you most about next-generation proteomics?
I believe we are in the midst of a proteomics revolution, with everyone from die-hard proteomics researchers to opinion leaders in the broader biopharma community beginning to recognize this revolution. Being able to decipher the full complexity of the proteome will serve as a rising tide for all of science, and help researchers answer biological questions that were previously out of reach.
We are proud to continue making early progress toward understanding mysterious proteins and proteoforms with leading partners such as Genentech. This progress is a testament to the origin of our name, “Nautilus”, which recalls a vessel of newfound discovery. Like us, many in this space are pioneering a convergence of biology, math, and technology that will only accelerate our exploration of the treasures hidden in the proteome.