Rutgers‒Camden Researchers Publish Paper Examining the Structure of Proteins Linked to Diseases
Rutgers researchers used university resources to study oily regions of proteins and determine which suspected mutations can contribute to diseases
Rutgers‒Camden researchers have written a study bridging the divide between genomics (the study of genetics) and proteomics (the study of proteins).
The study, titled “Contiguously hydrophobic sequences are functionally significant throughout the human exome,” was published in the Proceedings of the National Academy of Sciences (PNAS). The research centers on the connection between gene mutations in protein sequences and diseases.
The research found that disease-linked mutations tend to occur in oily regions of proteins. This is typical of Mendelian diseases such as Alzheimer’s, Parkinson’s disease, and epilepsy. For example, there is a known mutation in the protein apolipoprotein E, which packages fats. That mutation increases a person’s risk of late-onset Alzheimer’s. The team used this information, and information about many other mutations, to detect patterns.
“It can be tough to figure out which differences in DNA are important, and what components of our own DNA as humans affect our traits, our health, and disease,” said Grace Brannigan, lead principal investigator, associate professor in the physics department, and program director for the Center for Computational and Integrative Biology at Rutgers‒Camden.
Brannigan and her team first observed the striking pattern while using Caliburn, a supercomputing platform formerly run by the Rutgers Office of Advanced Research Computing, to identify how mutations in “structureless” proteins could still cause disease. They then checked for the same pattern across all human proteins and found that mutations in the pattern were more likely to cause disease. Their work was made possible by the Busch Biomedical Grant program, designed to enhance biomedical research at Rutgers, which awarded the team a bridging grant in 2019 to fund their research.
During the COVID-19 pandemic, Brannigan, a computational biophysicist whose work focuses on the central nervous system, collaborated with her husband of nearly 20 years, Matthew Hansen, a computational population geneticist at the University of Pennsylvania, while the two were working from home. They combined their knowledge of their respective fields, which proved difficult at times because each was working from their own field’s scientific vocabulary. To Brannigan, “interaction,” “function,” and “structure” meant one thing, and to Hansen, they meant another.
Partnering with Brannigan’s Ph.D. student, Ruchi Lohia, they worked through these barriers in terminology and techniques to combine their knowledge and better understand how variations and mutations in DNA are associated with different diseases.
The success of this approach inspired Brannigan to lead a team from the Center for Computational and Integrative Biology (CCIB) at Rutgers‒Camden on the same theme of bridging the divide between genomics and proteomics. The team received a $2 million research traineeship grant from the National Science Foundation for an innovative biology program titled Codes4Life (C4L). The initiative uses software engineering and artificial intelligence to advance graduate STEM training models, preparing the next generation of innovators and pioneers.
In the future, the team may use genomic data from the All of Us Researcher Workbench to look at the oily regions of proteins in more detail. The program is a workbench of the National Institutes of Health (NIH) and enables principal investigators at Rutgers to access electronic health records, physical measurements, genomic data, surveys, and more from a large, diverse database.
Brannigan, Hansen, and Lohia looked at “hydrophobic blobs” – the stretches of amino acids in a row that make up oily regions of a protein – and found that the longer and oilier the regions were, the more likely they were to harbor mutations linked to disease.
“A protein is like a chain of beads, and each bead is an amino acid with its own properties. Some are sticky (or oily) beads, while others are not. When you mutate an amino acid, you are changing the property of an individual “bead” on the chain – its stickiness, size, or shape. Some of those changes don’t matter and others do, and we don’t know why. It’s very natural to consider each bead individually, which is what many people have done in the past, but we found that it’s more useful to look for stretches of the same bead type in a row. What we found is that many very sticky beads (i.e. oily amino acids) in a row signal a region that is likely to have disease-causing mutations,” said Brannigan.
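As a rough illustration of the “beads in a row” idea, the short Python sketch below scans a protein sequence for stretches of consecutive oily amino acids and checks whether a mutated position falls inside one. It is not the team’s method or their software: the set of amino acids treated as oily, the function names, and the minimum stretch length are all assumptions chosen for illustration.

```python
from itertools import groupby

# Assumption: a simplified set of one-letter codes treated as "oily"
# (hydrophobic) amino acids. A real analysis would use a hydropathy scale.
OILY = set("AVILMFWC")


def oily_stretches(sequence, min_length=4):
    """Find runs of consecutive oily residues at least min_length long.

    Returns a list of (start, end, residues) tuples, where start is a
    0-based index and end is exclusive.
    """
    stretches = []
    position = 0
    # groupby splits the sequence into alternating oily / non-oily runs.
    for is_oily, group in groupby(sequence.upper(), key=lambda aa: aa in OILY):
        run = "".join(group)
        if is_oily and len(run) >= min_length:
            stretches.append((position, position + len(run), run))
        position += len(run)
    return stretches


def in_oily_stretch(sequence, mutation_index, min_length=4):
    """Return True if the 0-based mutation_index lies inside an oily stretch."""
    return any(
        start <= mutation_index < end
        for start, end, _ in oily_stretches(sequence, min_length)
    )


if __name__ == "__main__":
    example = "MKTLLVVAAGSDE"  # made-up sequence, not a real protein
    print(oily_stretches(example))
    print(in_oily_stretch(example, 5))
```

In this toy version, a researcher with a list of suspected mutation positions could pass each one to `in_oily_stretch` to flag those that sit in a long oily region for closer attention.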
The team’s work could help researchers who have a list of suspected mutations that might be contributing to a disease. Any of the mutations that occur in a very oily region of the protein are particularly likely to be disease-causing.
Brannigan and Hansen also worked with a team of students and postdocs to develop software called the Blobulator, which is intended to help researchers simplify the sequence they are studying and home in on certain regions in order to easily identify mutations.
“There's so much information to look at and the information can be noisy. Our method of conducting research allows us to simplify that information a lot, and when you can simplify information, the statistics become much more practical,” said Brannigan.
Brannigan hopes that the research will help raise confidence in the scientific community about which mutations are likely to cause certain diseases. Genetic testing companies are not allowed to inform patients about risky mutations unless a certain confidence level has been reached. It’s possible that these insights will help patients obtain more high-quality information from their DNA, so they can find out what diseases they are likely to develop in the future. Having this prior knowledge could enable people to take steps to prevent the onset of those diseases.