Computational study of the molecular bases of functional specificity in protein families (Estudio computacional de las bases moleculares de la especificidad funcional en familias de proteínas)
- RAUSELL DE FRIAS, ANTONIO
- Alfonso Valencia Herrera Supervisor
University of defense: Universidad Autónoma de Madrid
Date of defense: 13 June 2011
- Federico Gago Badenas Chair
- Ramón Díaz Uriarte Secretary
- Ugo Bastolla Member
- Juan Antonio García Ranea Member
- Patrick Aloy Calaf Member
Type: Thesis
Abstract
Throughout evolution, homologous proteins diverge in sequence as a result of the evolutionary pressure exerted on the variability arising from mutation, duplication, speciation and deletion events. On an evolutionary scale, this divergence is usually reflected in an internal organization of the proteins of a family into subfamilies. This organization assumes that the different subfamilies represent functional features that are specific within the context of the common function of the family. Several computational studies have shown the relationship between subfamily structure, residues that are differentially conserved among the subfamilies (SDPs) and key aspects of functional specificity. The main goal of this thesis is to deepen the current understanding of the functionally driven divergence of protein families through a novel study that, for the first time, analyzes on a large scale the relationship between subfamilies and differential interaction patterns among homologous proteins, while also taking into account the involvement of SDPs in protein-protein interfaces. The study thus combines the implications of functional sites with those of ligand-interacting sites. To carry out this large-scale study, a novel sequence-based computational method for analyzing protein families was developed, the S3det method, which identifies both the internal subfamily structure and the differentially conserved residues in a coherent and simultaneous manner. The method can be applied to a large set of proteins, making it possible to obtain a representative number of subfamilies and SDPs. The results show that the organization of protein families into subfamilies generally reflects differential functional features related both to specific enzymatic activity and to distinctive sets of interacting proteins.
Moreover, positions that are differentially conserved in subfamilies (SDPs) appear to be structurally associated with functional regions corresponding to catalytic sites, ligand-binding sites and protein-protein interfaces. These associations are observed both in the spatial distance distributions and in the relative enrichments. Most importantly, the involvement of SDPs in protein-protein interfaces is especially clear in the case of heterocomplex interfaces. These observations suggest that binding specificity evolves by selecting key residues differentially conserved in the subfamilies as pivotal points indicative of binding with their effectors. In a complementary manner, two other computational methods were developed (Xdet and Mcdet). The former exploits quantitative functional information while the latter makes use of supervised classifications, and both are used to predict residues that determine functional specificity. Their predictive ability has been demonstrated in protein alignments where sequence similarities do not correspond to functional similarities. Taken together, the results of this thesis provide generality and quantitative support to the hypothesis that the sequence divergence accumulated in protein families is driven by functional divergence.
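
To make the notion of a differentially conserved position concrete, the following is a minimal Python sketch, not the S3det, Xdet or Mcdet algorithms themselves. It assumes that the subfamily assignment is already known (whereas S3det infers subfamilies and SDPs simultaneously) and uses the strictest possible criterion: a column counts as an SDP only if it is invariant within every subfamily and differs between at least two of them. The function name `find_sdps`, the toy alignment and the subfamily labels are hypothetical and serve only as illustration.

```python
def find_sdps(alignment, subfamily_of):
    """Return 0-based alignment columns that are invariant inside every
    subfamily but carry different residues in at least two subfamilies.

    alignment:    dict mapping sequence name -> aligned sequence (equal lengths)
    subfamily_of: dict mapping sequence name -> subfamily label (assumed known)
    """
    # Group the aligned sequences by their subfamily label.
    groups = {}
    for name, seq in alignment.items():
        groups.setdefault(subfamily_of[name], []).append(seq)

    length = len(next(iter(alignment.values())))
    sdps = []
    for col in range(length):
        consensus = []
        for seqs in groups.values():
            residues = {s[col] for s in seqs}
            # Reject the column if a subfamily is variable or gapped here.
            if len(residues) != 1 or "-" in residues:
                break
            consensus.append(residues.pop())
        else:
            # Every subfamily is invariant; keep the column only if the
            # conserved residue differs between subfamilies.
            if len(set(consensus)) > 1:
                sdps.append(col)
    return sdps


# Toy alignment (hypothetical data): two subfamilies, A and B.
alignment = {
    "a1": "MKDLA",
    "a2": "MKDLS",
    "b1": "MRELA",
    "b2": "MRELT",
}
subfamily_of = {"a1": "A", "a2": "A", "b1": "B", "b2": "B"}

print(find_sdps(alignment, subfamily_of))  # [1, 2]
```

In this toy case columns 1 and 2 are reported: each is fully conserved within a subfamily (K/D in A, R/E in B) but differs between subfamilies, while column 0 is conserved across the whole family and column 4 is variable within subfamilies. Real alignments would call for a softer scoring scheme than this all-or-nothing rule.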