Barešić, Anja (2012) Structural analysis of single amino acid polymorphisms. Doctoral thesis, University College London, Faculty of Life Sciences.
PDF (PhD thesis), Published Version, 13MB. Available under License Creative Commons Attribution Non-commercial No Derivatives.
Abstract
Understanding genetic variation is the basis for prevention and diagnosis of inherited disease. In the ‘next generation sequencing’ era with rapidly accumulating variation data, the focus has shifted from population-level analyses to individuals. This thesis is centred on the problem of gathering, storing and analysing mutation data to understand and predict the effects single amino acid mutations will have on protein structure and function. I present analysis of a subset of mutations and a new predictive method implemented to expand the coverage of the structural effects by our pipeline.

I characterised a subset of pathogenic mutations: ‘compensated pathogenic deviations’. These are mutations which cause disease in humans, but the mutant residues are found as native residues in other species. During evolution, they are presumed to spread through populations by coevolving with another, neutralising mutation. When compared with uncompensated mutations, they often cause milder structural disruptions, prefer less conserved structural environments and are often found on the protein surface.

I describe the development of a new analysis to test the effects of mutations by predicting residues involved in protein-protein interfaces where the structure of the complex is unknown. Two machine learning methods (multilayer perceptrons and, in particular, random forests) show an improvement over previously published protein-protein interface predictors. This new method further increases the ability of the SAAPdb analysis pipeline to show the effects of mutations on protein structure and function. Furthermore, it is a template for building prediction-based structural analysis methods for the pipeline, where available structural data are insufficient.

In summary this thesis examines mutations from both an evolutionary and a disease perspective. In addition, a novel method for predicting protein interaction regions is developed thus expanding the existing pipeline and furthering our ability to understand mutations and use them in a predictive context.
Item Type: | Thesis (Doctoral thesis) |
---|---|
Uncontrolled Keywords: | single amino acid polymorphism ; protein structure ; compensated pathogenic deviation ; protein-protein interface prediction ; random forest |
Subjects: | NATURAL SCIENCES > Biology > Biochemistry and Molecular Biology ; BIOTECHNICAL SCIENCES > Biotechnology > Bioinformatics |
Divisions: | UNSPECIFIED |
Depositing User: | Anja Barešić |
Date Deposited: | 28 Jul 2020 11:20 |
URI: | http://fulir.irb.hr/id/eprint/5880 |