Dissecting specificity of short linear motifs in protein quality control
Reuse License
Description of rights: InC-1.0
Abstract
Cellular processes hinge on protein interactions with genetic material, enzymes, and modifiers. Proteins can interact with other proteins via structured domains or short linear motifs (SLiMs). SLiMs, typically 3 to 10 amino acids long, govern diverse functions, including protein quality control (PQC). By ensuring correct localization, degradation, and protein complex assembly, PQC preserves protein homeostasis (proteostasis), a critical determinant of cellular health. Despite their significance in maintaining proteostasis, SLiMs and their roles in the various PQC pathways remain incompletely characterized. In this work, I built tools to identify SLiMs in PQC that act as localization signals and as degrons in degradation pathways.
Degrons are found extensively at protein termini, from bacteria to mammals. Although they have been studied extensively, our understanding of the prevalence and specificity of terminal degrons is still incomplete. Here, I built a pipeline to analyze deep mutational scanning (DMS) experiments that help dissect the specificity of degrons. The pipeline first assesses the quality of the DMS experiments. It then performs downstream analysis to infer degron motifs, using simple visualization for shorter peptides and interpretable machine learning for longer peptides.
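As an illustration of what a quality-assessment step for a DMS experiment might look like, the following minimal sketch checks the reproducibility of stability scores between two replicates. The abstract does not specify which quality metrics the pipeline uses, so the choice of Pearson correlation, the function names `pearson` and `replicate_qc`, the threshold `min_r`, and the example scores are all assumptions made for illustration.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length score lists."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length lists with at least 2 values")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant scores")
    return cov / sqrt(var_x * var_y)


def replicate_qc(rep1, rep2, min_r=0.8):
    """Compare two replicates given as {peptide: stability score} dicts.

    Only peptides measured in both replicates are compared.
    min_r is a hypothetical pass threshold, not a value from the thesis.
    """
    shared = sorted(set(rep1) & set(rep2))
    r = pearson([rep1[p] for p in shared], [rep2[p] for p in shared])
    return {"n_shared": len(shared), "pearson_r": r, "passes": r >= min_r}


# Made-up stability scores for four hypothetical peptides.
replicate_1 = {"AAK": 0.10, "GGF": 0.50, "LLR": 0.90, "WWD": 0.40}
replicate_2 = {"AAK": 0.15, "GGF": 0.45, "LLR": 0.85, "MMM": 0.30}
report = replicate_qc(replicate_1, replicate_2)
print(report["n_shared"], round(report["pearson_r"], 3), report["passes"])
```

An experiment whose replicates correlate poorly would be flagged before any motif inference is attempted, which is the general purpose of placing quality assessment first.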
To systematically study N-terminal amino acid specificity, I employed simple visualization techniques after assessing the quality of the experiment. This analysis was performed on the stability profile from DMS of N-terminal diresidue constructs in an H. sapiens cell line. Furthermore, to study the eukaryotic C-degron pathway in an unbiased fashion, I applied interpretable deep learning to the stability profile of over 40k random peptides in yeast C-terminomes to infer degron motifs. From the ~10% of putative C-degron peptides in this library, I found 21 potential C-degron motifs. Combining the results from machine learning, mutagenesis, and genetic screens reveals that Das1, an F-box substrate receptor of the SCF ubiquitin ligase, targets ~40% of degrons and recognizes at least five distinct but overlapping motifs.
SLiMs also play a vital role as protein localization signals. The biophysical properties of localization signals contribute to correct protein localization. However, which features of these signals are crucial for targeting is poorly understood. In this work, I investigate how various biophysical properties of transmembrane domains (TMDs) in tail-anchored proteins contribute to localization. Analyzing the localization of TMD variants reveals combinations of biophysical properties that favor compartment-specific localization. For example, short TMD variants of low hydrophobicity with positively charged C-termini are more prone to mitochondrial localization. In contrast, hydrophobic TMD variants with positively charged N-termini tend to localize to the plasma membrane (PM). To extract general but quantitative rules for localization, I trained a random forest model, which revealed the importance of the combined effect of hydrophobicity, flanking charge, length, and cysteine content of TMD variants in distinguishing between organelle-specific localizations.
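The four properties named above (hydrophobicity, flanking charge, length, and cysteine content) can be computed directly from sequence. The sketch below is one possible featurization, not the one used in this work: it assumes the Kyte-Doolittle hydropathy scale, counts only K/R as positive and D/E as negative (histidine is ignored), and uses a hypothetical function name `tmd_features` with made-up sequences.

```python
# Kyte-Doolittle hydropathy values per amino acid (assumed scale; the
# abstract does not state which hydrophobicity scale was used).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def net_charge(seq):
    """Simplified net charge: +1 for K/R, -1 for D/E, histidine ignored."""
    return sum((aa in "KR") - (aa in "DE") for aa in seq)


def tmd_features(tmd, n_flank, c_flank):
    """Return a feature dict for one TMD variant and its flanking regions.

    tmd, n_flank, c_flank are one-letter amino acid strings; how the
    flanks are delimited is an assumption left to the caller.
    """
    if not tmd:
        raise ValueError("TMD sequence must not be empty")
    return {
        "length": len(tmd),
        "mean_hydrophobicity": sum(KYTE_DOOLITTLE[aa] for aa in tmd) / len(tmd),
        "n_flank_charge": net_charge(n_flank),
        "c_flank_charge": net_charge(c_flank),
        "cysteine_fraction": tmd.count("C") / len(tmd),
    }


# Made-up example: a short TMD with a positively charged C-terminal flank.
features = tmd_features(tmd="LLLLC", n_flank="SG", c_flank="RKK")
print(features)
```

Feature dictionaries of this kind can be stacked into a table and passed to a random forest classifier (for example, scikit-learn's `RandomForestClassifier`) with the observed compartment as the label; feature importances from such a model are one way to rank which properties matter most.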
Overall, this work presents pipelines for dissecting the specificity of SLiMs in PQC. The computational pipeline for dissecting degrons scales to different protein lengths and can accommodate any biophysical features at either protein terminus. It could be used to analyze future DMS experiments. Furthermore, the pipeline for dissecting localization signals could be used to design new organelle-specific targeting signals.