
dc.contributor.author: Wang, Shaokai
dc.date.accessioned: 2024-04-26 20:42:03 (GMT)
dc.date.available: 2024-04-26 20:42:03 (GMT)
dc.date.issued: 2024-04-26
dc.date.submitted: 2024-04-24
dc.identifier.uri: http://hdl.handle.net/10012/20508
dc.description.abstract [en]:

This thesis explores deep learning methods for protein identification and property prediction, encompassing two primary areas: mass spectrometry-based protein sequence identification and protein property prediction. We introduce a method that enhances the identification rate of MHC-I peptides and facilitates the discovery of novel mutated MHC-I peptides. In the domain of property prediction, we present three novel approaches for the early diagnosis of amyloidosis, the discovery of anticancer peptides, and the classification of anticancer peptide functional type.

Identification of Novel MHC-I Peptides with Tandem Mass Spectrometry: The study of immunopeptidomics requires the identification of both regular and mutated MHC-I peptides from mass spectrometry data. For the efficient identification of MHC-I peptides with either one or no mutation from a sequence database, we propose a novel workflow: NeoMS. It employs three main modules: generating an expanded sequence database with a tagging algorithm, a machine learning-based scoring function to maximize the search sensitivity, and a careful target-decoy implementation to control the false discovery rates (FDR) of both the regular and mutated peptides. Experimental results demonstrate that NeoMS both improved the identification rate of the regular peptides over other database search methods and identified hundreds of mutated peptides that have not been identified by any current methods. Further study shows the validity of these novel peptides.

Deep learning boosted amyloidosis diagnosis: Amyloid light chain (AL) amyloidosis is a disorder characterized by the deposition of antibody light chains in organs. The importance of early and accurate diagnosis in AL amyloidosis cannot be overstated, as it enables timely implementation of appropriate treatment strategies and improves patient outcomes. Therefore, developing a highly accurate method using antibody sequencing and computational techniques is crucial to address this urgent need. While several computational methods have been developed to predict AL amyloidosis, they depend heavily on manually extracted features, and their performance falls short of satisfactory levels. We present DeepAL, a deep learning-based approach to predict AL amyloidosis with high precision. DeepAL utilizes a pre-trained model to extract light chain features and is then trained with AL amyloidosis knowledge. In evaluations conducted on two benchmark datasets, DeepAL surpasses the performance of previous approaches. Additional experiments demonstrate that features extracted from the pre-trained model significantly enhance overall performance.

Anti-cancer peptide identification and activity type classification with protein sequence pre-training: Cancer remains a significant global health challenge, responsible for millions of deaths annually. Addressing this issue necessitates the discovery of novel anti-cancer drugs. Anti-cancer peptides (ACPs), with their unique ability to selectively target cancer cells, offer new hope in discovering low side-effect anti-cancer drugs. We introduce DUO-ACP, a model serving dual roles in ACP prediction: identification and functional type classification. DUO-ACP employs two embedding modules to acquire knowledge about global protein features and local ACP characteristics, complemented by a prediction module. When assessed on two publicly available datasets for each task, DUO-ACP surpasses all existing methods. We further interpret the contribution of each part of our model, including the two types of embeddings as well as ensemble learning. On a newly curated dataset, the prediction results of DUO-ACP closely match existing literature, highlighting DUO-ACP's generalization capabilities on previously unseen data and its potential for discovering novel ACPs.

Novel fine-tuning strategy on pre-trained protein model enhances ACP functional type classification: Cancer remains one of the most formidable health challenges globally. ACPs have recently emerged as a promising new therapeutic strategy, recognized for their targeted and efficient anti-cancer properties. To fully leverage the potential of ACPs, computational methods that can accurately discover and predict their functional types are indispensable. We present ACP-FT, a deep learning model that is fine-tuned from a pre-trained protein model specifically for predicting the functional types of ACPs. Employing a novel fine-tuning approach alongside an adversarial model training technique, our model surpasses existing methods in classification performance on two public datasets. Additionally, we provide a thorough analysis of our training strategy's effectiveness. The experimental results demonstrate that our two-step fine-tuning approach effectively prevents catastrophic forgetting in the pre-trained model, while adversarial training enhances the model's robustness. Together, these techniques significantly increase the accuracy of ACP functional type predictions.

dc.language.iso: en [en]
dc.publisher: University of Waterloo [en]
dc.subject: bioinformatics [en]
dc.subject: deep learning [en]
dc.subject: mass spectrometry [en]
dc.subject: pre-trained model [en]
dc.title: Deep Learning Methods for Novel Peptide Discovery and Function Prediction [en]
dc.type: Doctoral Thesis [en]
dc.pending: false
uws-etd.degree.department: David R. Cheriton School of Computer Science [en]
uws-etd.degree.discipline: Computer Science [en]
uws-etd.degree.grantor: University of Waterloo [en]
uws-etd.degree: Doctor of Philosophy [en]
uws-etd.embargo.terms: 0 [en]
uws.contributor.advisor: Ma, Bin
uws.contributor.affiliation1: Faculty of Mathematics [en]
uws.published.city: Waterloo [en]
uws.published.country: Canada [en]
uws.published.province: Ontario [en]
uws.typeOfResource: Text [en]
uws.peerReviewStatus: Unreviewed [en]
uws.scholarLevel: Graduate [en]
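
Illustrative note on the abstract: the abstract mentions a target-decoy implementation that controls the FDR of regular and mutated peptides. The record does not describe how NeoMS implements this, so the following is only a minimal sketch of the generic target-decoy idea, assuming the FDR above a score cutoff is estimated as (decoy matches) / (target matches) and that each peptide group gets its own cutoff. The function names, the tuple layout, and the group labels are hypothetical and are not taken from the thesis.

```python
from typing import Dict, List, Tuple


def fdr_threshold(matches: List[Tuple[float, bool]], max_fdr: float) -> float:
    """Return the lowest score cutoff whose estimated FDR is <= max_fdr.

    matches: (score, is_decoy) pairs for one peptide group.
    The FDR at a cutoff is estimated as decoys / targets among all
    matches scoring at or above that cutoff (an assumed, generic estimator).
    Returns +inf when no cutoff satisfies the bound.
    """
    ranked = sorted(matches, key=lambda m: m[0], reverse=True)
    targets = 0
    decoys = 0
    cutoff = float("inf")
    for score, is_decoy in ranked:
        if is_decoy:
            decoys += 1
        else:
            targets += 1
        if targets > 0 and decoys / targets <= max_fdr:
            cutoff = score
    return cutoff


def accept_by_group(
    matches: List[Tuple[str, str, float, bool]], max_fdr: float
) -> List[str]:
    """Accept target peptides using a separate cutoff for each group.

    matches: (peptide, group, score, is_decoy); group is a hypothetical
    label such as "regular" or "mutated".
    """
    by_group: Dict[str, List[Tuple[float, bool]]] = {}
    for _, group, score, is_decoy in matches:
        by_group.setdefault(group, []).append((score, is_decoy))
    cutoffs = {g: fdr_threshold(m, max_fdr) for g, m in by_group.items()}
    return [
        peptide
        for peptide, group, score, is_decoy in matches
        if not is_decoy and score >= cutoffs[group]
    ]
```

Estimating the cutoff per group keeps a large, high-scoring group from masking a high error rate in a smaller one; whether NeoMS does exactly this is not stated in this record.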


All items in UWSpace are protected by copyright, with all rights reserved.
