A Computational Approach for Identifying the Chemical Factors Involved in the Glycosaminoglycans-Mediated Acceleration of Amyloid Fibril Formation

Elodie Monsellier 1¤, Matteo Ramazzotti 1, Niccolò Taddei 1, Fabrizio Chiti 1,2*

1 Dipartimento di Scienze Biochimiche, Università di Firenze, Firenze, Italy, 2 Consorzio interuniversitario "Istituto Nazionale Biostrutture e Biosistemi" (I.N.B.B.), Roma, Italy

Abstract

Background: Amyloid fibril formation is the hallmark of many human diseases, including Alzheimer's disease, type II diabetes and amyloidosis. Amyloid fibrils deposit in the extracellular space and generally co-localize with the glycosaminoglycans (GAGs) of the basement membrane. GAGs have been shown to accelerate the formation of amyloid fibrils in vitro for a number of protein systems. The high number of data accumulated so far has created the grounds for the construction of a database on the effects of a number of GAGs on different proteins.

Methodology/Principal Findings: In this study, we have constructed such a database and have used a computational approach that uses a combination of single parameter and multivariate analyses to identify the main chemical factors that determine the GAG-induced acceleration of amyloid formation. We show that the GAG accelerating effect is mainly governed by three parameters that account for three-fourths of the observed experimental variability: the GAG sulfation state, the solute molarity, and the ratio of protein and GAG molar concentrations. We then combined these three parameters into a single equation that predicts, with reasonable accuracy, the acceleration provided by a given GAG in a given condition.

Conclusions/Significance: In addition to shedding light on the chemical determinants of the protein:GAG interaction and to providing a novel mathematical predictive tool, our findings highlight the possibility that GAGs may not have such an accelerating effect on protein aggregation under the conditions existing in the basement membrane, given the values of salt molarity and protein:GAG molar ratio existing under such conditions.

Citation: Monsellier E, Ramazzotti M, Taddei N, Chiti F (2010) A Computational Approach for Identifying the Chemical Factors Involved in the Glycosaminoglycans-Mediated Acceleration of Amyloid Fibril Formation. PLoS ONE 5(6): e11363. doi:10.1371/journal.pone.0011363

Editor: Colin Combs, University of North Dakota, United States of America

Received March 16, 2010; Accepted May 18, 2010; Published June 29, 2010

Copyright: © 2010 Monsellier et al. This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.

Funding: This work was partially supported by grants from the Italian Ministero dell'Istruzione, dell'Università e della Ricerca (projects PRIN 2007B57EAB, PRIN 20083ERXWS and FIRB RBNE03PX83) and the European Union (Project EURAMY). The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.

Competing Interests: The authors have declared that no competing interests exist.

* E-mail: fabrizio.chiti@unifi.it

¤ Current address: Laboratoire d'Enzymologie et Biochimie Structurales, CNRS, Gif-sur-Yvette, France

Introduction

Aggregation of proteins in the form of extracellular amyloid fibrils is a consistent mechanism underlying a group of diverse human diseases, including neurodegenerative disorders and non-neuropathic conditions [1].
These disorders differ in the type of protein undergoing aggregation, in the organs involved in amyloid deposition and, consequently, in the clinical profile featured in each case. Among the most prominent neurodegenerative conditions are Alzheimer's and Creutzfeldt-Jakob diseases, which affect the central nervous system via extracellular deposits of the amyloid β peptide and prion protein, respectively [1]. Examples of non-neuropathic conditions are light chain amyloidosis and hemodialysis-related amyloidosis, where deposits are found in joints, skeletal tissue, heart, kidney, etc. In these two cases the proteins involved are the immunoglobulin light chain and β2-microglobulin, respectively [1].

Amyloid fibrils are often localized in close proximity to basement membranes, a specialized component of the extracellular matrix that is mainly built of collagen and glycosaminoglycans (GAGs) [2–4]. GAGs are long unbranched polysaccharides that often occur as O- or N-linked side chains of proteoglycans, with the exception of hyaluronic acid, which exists in a free form. Naturally occurring GAGs include heparin, heparan sulfate, dermatan sulfate, keratan sulfate, chondroitin sulfate and hyaluronic acid. Other non-physiological derivatives of natural GAGs have been used for studies in vitro, such as fully-O-desulfated heparin and dextran sulfate [5–6]. GAGs have been found intimately associated with all types of amyloid deposits in vivo so far analyzed [7–14], leading to the hypothesis that they have fundamental relevance in amyloidogenesis [2,4,15]. More importantly, GAGs have been attributed an active role in amyloidogenesis, as they display an ability to promote fibrillogenesis in vitro for a number of protein or peptide systems [5,6,16–26]. The proteoglycan perlecan, in particular, has been implicated as an important factor determining amyloid fibril formation [2–4].
The active role of GAGs and proteoglycans in amyloid fibril formation in vivo has also been supported by the observation that inhibitors of heparan sulfate proteoglycan synthesis can reduce amyloid formation [27,28].

PLoS ONE | www.plosone.org | June 2010 | Volume 5 | Issue 6 | e11363

Studies on the effect of GAGs on amyloid fibril formation have so far consisted of investigations focusing on a single protein, and on one or a limited number of GAGs. This has allowed the effect of one or more GAGs to be studied only on one particular system and in well defined experimental conditions. Nevertheless, the generic ability of GAGs to influence the process of amyloid fibril formation, independently of the GAG used, protein studied and solution conditions employed, encourages a systematic study using a heterogeneous database reporting different GAGs and protein systems and a variety of solution conditions. In this study we have collected all the experimental data so far published on the effect of GAGs on amyloid fibril formation in vitro. The data include different GAGs, proteins and experimental conditions and have been reported by different investigators. Using a number of single parameter studies, as well as a multivariate analysis, we have studied the database as a whole. We have identified the generic chemical determinants responsible for the GAG-mediated acceleration of amyloid fibril formation, and have used this knowledge to build a predictive equation of the effect of GAGs on protein aggregation.

Methods

Data collection

Articles were collected from PubMed using the keywords "(protein OR peptide) AND (aggregation OR amyloid OR fibrillation) AND (GAG OR glycosaminoglycan OR proteoglycan OR heparin OR heparan)". Among the articles retrieved, only those presenting both kinetic data of aggregation in vitro and a clear explanation of the experimental conditions used to obtain such data were kept for further analysis.
Experimental conditions include the nature and molar concentrations of the protein and GAG, and the precise characteristics of the milieu (composition, pH and temperature). Experiments performed in the presence of additional factors likely to have important effects on the aggregation kinetics in the absence and in the presence of GAGs, such as metal ions, were discarded.

We chose the aggregation half-time (t_1/2), that is the time at which the specific signal used to follow the aggregation reaction reaches half of its final value, to describe the kinetics of protein aggregation. t_1/2 was preferred to the rate constant of elongation (k_agg) or the lag phase duration (t_lag) because the latter parameter cannot be compared in different experiments if the lag phase is absent. When only k_agg and t_lag were mentioned in the article, we used the following equation to calculate the t_1/2 value [29]:

$$t_{1/2} = t_{\mathrm{lag}} + \frac{2}{k_{\mathrm{agg}}} \qquad (1)$$

For each set of experimental conditions, we calculated G, the natural logarithm of the ratio between the t_1/2 values in the absence and in the presence of the GAG:

$$G = \ln\left[\frac{t_{1/2}(0)}{t_{1/2}(\mathrm{GAG})}\right] \qquad (2)$$

Thus, G is positive if a GAG accelerates the aggregation process and negative if it decelerates it. In the absence of a lag phase, G is equal to ln[k_agg(GAG)/k_agg(0)] (compare equations 1 and 2).

In cases where the authors of the original articles did not mention any kinetic parameters, but showed only kinetic traces, the in-house developed software plot2data was used to extract the data. The software allows the user to map a Cartesian 2-D space on a computer image containing a graph, in order to extract the coordinates of interesting points and make them available as text values. The extracted data were then manually re-plotted, and the resulting plots were fitted to equations 3 or 4, depending on the absence or presence of a detectable lag phase, respectively [29]:

$$A_t = A_\infty + (A_0 - A_\infty)\, e^{-t\, k_{\mathrm{agg}}} \qquad (3)$$

$$A_t = A_0 + \frac{A_\infty - A_0}{1 + e^{k_{\mathrm{agg}} (t_{1/2} - t)}} \qquad (4)$$

where A_0, A_t and A_∞ are the signal intensities of the techniques used to monitor aggregation at time 0, t, and ∞, respectively. A_0, A_∞, k_agg and t_1/2 were used as floating parameters in the procedure of best fit.

The resulting dataset, summarizing the G values and the corresponding experimental conditions in which they were collected, is presented in Table S1 (see Supplementary Information).

Multivariate analysis

For the multivariate analyses, G was set as the single dependent variable. Different parameters describing the GAGs, polypeptide chains and experimental characteristics were set as independent variables. These include, for the GAG, the number of sulfates per disaccharide unit, the number of negative charges per disaccharide unit, the chemical nature of the uronic acid (iduronic or glucuronic acid), the position of the sulfate (N- or O-sulfates), and the molecular weight; they also include, for the protein, the length, charge, composition in lysine and arginine residues, folding status (globular or natively unfolded proteins) and association with disease (disease-related or model proteins); finally they include the solute molarity and the protein:GAG molar ratio for the experimental conditions. All the independent variables that were dichotomous (nature of the GAG uronic acid; position of the sulfates on the GAG; folding status of the protein; protein associated or not with disease) were recoded into dummy variables and their interaction terms with other variables were taken into account. We also systematically looked for the presence of possible quadratic effects for each continuous variable.

The multivariate analyses were performed with the Microsoft Excel add-on software PHStat2 [30], a tool that allows a statistically coherent construction and optimization of multivariate regression models.
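
The recoding of variables described above can be illustrated with a minimal sketch. It only shows how one dataset entry might be expanded into regressors (a dummy variable, an interaction term and a quadratic term); the function name, the choice of variables and the input values are hypothetical and are not taken from Table S1 or from the PHStat2 workflow used in the study.

```python
def design_row(sulfates, solute_molarity, protein_gag_ratio, uronic_acid):
    """Expand one (hypothetical) dataset entry into regression variables.

    uronic_acid is a dichotomous variable ("iduronic" or "glucuronic")
    and is recoded into a 0/1 dummy variable.
    """
    iduronic = 1.0 if uronic_acid == "iduronic" else 0.0
    return {
        "sulfates": float(sulfates),
        "solute_molarity": float(solute_molarity),
        "protein_gag_ratio": float(protein_gag_ratio),
        "iduronic": iduronic,                          # dummy variable
        "iduronic_x_sulfates": iduronic * sulfates,    # interaction term
        "sulfates_squared": float(sulfates) ** 2,      # quadratic term
    }


# Illustrative values only, not experimental data.
row = design_row(sulfates=2, solute_molarity=0.15,
                 protein_gag_ratio=10.0, uronic_acid="iduronic")
```

Each such row would then feed a multiple regression with G as the dependent variable; which terms survive model selection is a result of the analysis, not of this encoding.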
Both stepwise and best-subset model construction methods were used to reduce the number of significant variables. The final model was the one that best fulfilled the following characteristics: significance of each independent variable (p_variable < 0.05); significance of the model (p_model < 0.05); adjusted coefficient of determination (R²_adj) as close to 1 as possible; absence of collinearity between the different independent variables, detected with the variance inflation factor (VIF); homogeneous distribution of the residuals (homoscedasticity).

Bootstrap and jackknife tests

Two statistical approaches were used to verify the significance and robustness of the chosen model. In the bootstrap test 100 subsets of the original dataset comprising 39 entries were randomly created, each time using 2/3 of the 39 entries (training sets, 26 entries each). Each of the 100 training sets was used to perform the same multivariate analysis previously performed on the whole dataset and to obtain a set of regression parameters. Each of the resulting 100 sets was then used in the predictive equation detailed below (equation 5, see Results) to calculate G values on the remaining subset of 1/3 entries (test set, 13 entries). This led to the creation of 100 different sets of predicted and observed G values, which were evaluated by linear regression analysis to record correlation coefficients and p-values through the goodness of fit F-statistic.

In the jackknife test, single entries were systematically removed from the full dataset of 39 entries and the multivariate analysis was repeated on shortened datasets of 38 entries (for a total of 39 steps), to obtain regression parameters with which we computed the predicted G value for the removed entry using equation 5 (see Results).
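
The two resampling schemes can be sketched as follows. This is only an illustration under stated assumptions: a one-variable least-squares line stands in for the multivariate model (equation 5 is not reproduced here), the training entries are assumed to be drawn without replacement, and all function names are hypothetical.

```python
import random


def fit_line(xs, ys):
    """Ordinary least squares for y = a + b*x; returns (a, b)."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b = sxy / sxx
    return my - b * mx, b


def jackknife_predictions(xs, ys):
    """Leave each entry out in turn, refit, and predict the left-out entry."""
    preds = []
    for i in range(len(xs)):
        a, b = fit_line(xs[:i] + xs[i + 1:], ys[:i] + ys[i + 1:])
        preds.append(a + b * xs[i])
    return preds


def train_test_split(n_entries, n_train, rng):
    """Random split of entry indices into a training set and a test set."""
    idx = list(range(n_entries))
    rng.shuffle(idx)
    return sorted(idx[:n_train]), sorted(idx[n_train:])


# One of the 100 random splits: 26 training entries, 13 test entries.
train, test = train_test_split(39, 26, random.Random(0))
```

In both tests the predicted values are then compared with the observed ones by a linear regression, as described in the text.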
After the analysis was completed for the 39 removed entries, the 39 predicted G values were plotted against the corresponding experimental values and the resulting plot was analyzed by means of a linear regression.

Results

General strategy

The general strategy adopted for this study is presented in Figure 1 (see also the Methods section). Briefly, experimental data reporting the effect of GAGs on the kinetics of amyloid fibril formation were collected from previously published articles using a precise and rigorous method, after an extensive search of the literature (Figure 1, step 1). The resulting dataset summarizes the effects of different GAGs on the aggregation kinetics of different proteins, together with the precise experimental conditions in which these effects were recorded in each case (GAG and protein types and concentrations; composition, ionic strength, total solute concentration, pH and temperature of the milieu). The effect of a GAG on protein aggregation was described by G, that is the natural logarithm of the ratio between the aggregation half-time t_1/2 in the absence and in the presence of the GAG (see Methods). The resulting dataset comprises 39 sets of data, representing 8 different proteins, 16 different GAGs and a variety of experimental conditions (see Table S1 in Supplementary Information). The 8 proteins include both globular proteins, such as the immunoglobulin light chain variable domain, and natively unfolded proteins, such as α-synuclein. Some proteins are directly involved in disease, such as the β-amyloid peptide, while others are model proteins, like human muscle acylphosphatase. The 16 GAGs are either existing GAGs from different families, such as heparin or dermatan sulfate, or chemically modified GAGs such as fully desulfated heparin or dextran sulfate.

To identify the determinants responsible for the accelerating effects of GAGs on protein aggregation, we analyzed the influence of different parameters on G.
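
Because G is the quantity analyzed throughout, a minimal sketch of its calculation may be useful. It implements equation 1 (t_1/2 = t_lag + 2/k_agg) and the definition of G as the natural logarithm of the ratio of half-times without and with GAG; the function names and numerical values are hypothetical and do not come from Table S1.

```python
import math


def half_time(t_lag, k_agg):
    """Equation 1: aggregation half-time from lag time and rate constant."""
    if k_agg <= 0:
        raise ValueError("k_agg must be positive")
    return t_lag + 2.0 / k_agg


def acceleration(t_half_without_gag, t_half_with_gag):
    """Equation 2: G = ln[t_1/2(0) / t_1/2(GAG)]."""
    return math.log(t_half_without_gag / t_half_with_gag)


# Illustrative values only (time in hours, rate constants in 1/hour).
t_half_0 = half_time(t_lag=10.0, k_agg=0.5)    # 10 + 2/0.5
t_half_gag = half_time(t_lag=2.0, k_agg=1.0)   # 2 + 2/1.0
G = acceleration(t_half_0, t_half_gag)         # positive: GAG accelerates
```

With no lag phase (t_lag = 0 in both conditions) the same functions give G = ln[k_agg(GAG)/k_agg(0)], in line with the remark made in the Methods.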
This was done by performing in parallel single parameter fittings, through a search of correlations between G and a variety of parameters analyzed one by one (Figure 1, step 2a), and a multivariate analysis, that is a combination of different parameters as independent variables in a single equation to describe G as a function of all analysable parameters simultaneously (Figure 1, step 2b). The parameters that appeared from both steps 2a and 2b to play a significant role in the GAG-mediated acceleration of protein aggregation were then combined into a single predictive equation yielding G as a function of the key parameters only (Figure 1, step 3). Finally, the validity and the robustness of the model and predictive equation were assessed by statistical tests (Figure 1, step 4).

Single parameter analysis: characteristics of the GAGs

We first looked at the influence of the GAG sulfation state on G. When the G value was plotted against the number of sulfate moieties per GAG disaccharide unit for all the 39 entries of the dataset, a significant linear positive correlation was observed (Figure 2A, r = 0.52, p = 7 × 10⁻⁴). The analysis was repeated by plotting average G values, where each average G value is the mean of the G values related to the same sulfation state (Figure 2B). Again, the average G value was found to correlate significantly with the number of sulfates per disaccharide unit (Figure 2B, r = 0.97, p = 0.001). To limit the complications arising from the heterogeneity of proteins used in the study, we restricted the analysis to a single protein type, i.e. α-synuclein (Figure 2C) and the 173–243 fragment of gelsolin (Figure 2D), two polypeptides for which enough data were available for a statistical analysis. The correlation was found to be significant in both cases (Figure 2C,D, r = 0.89 and p = 2 × 10⁻⁴ in both cases).
The high significance of the correlations shown in Figure 2A–D confirms the dependence of the G value on the sulfate state of the GAG and suggests that the sulfate moieties have comparable effects in the aggregation of the various proteins analyzed. Importantly, in all cases the straight line of best fit passes through the origin of the graph, where both the x and y variables have values of 0. This observation indicates that in the absence of sulfates the GAGs have no effects on the kinetics of protein aggregation. While these data demonstrate that the sulfation state of the GAG is a key determinant of the GAG-induced acceleration of protein aggregation, they also show that it is not the only one, as GAGs with the same number of sulfate groups per disaccharide unit can have very different effects on protein aggregation (Figure 2A).

Figure 1. Scheme of the general strategy used in this study. doi:10.1371/journal.pone.0011363.g001

We then looked at the importance of the GAG negative charge in determining G (Figure 2E–H). The number of sulfates and the number of negative charges per disaccharide unit of a GAG are two highly correlated parameters, as each sulfate moiety brings 1 negative charge. However they are not identical, as most of the GAGs have one additional negative charge per disaccharide unit due to the presence of a carboxylate group. Significant correlations were observed between the G value and the number of charges per disaccharide unit whatever dataset was considered (Figure 2E–H). The slopes of the lines of best fit were found to be identical when G values are plotted versus the number of either sulfate moieties or negative charges (Figure 2A–H).
However, in the latter plots the lines of best fit do not pass through the origins of the graphs, but have G values of 0 when the number of negative charges is ca. 1 (Figure 2E–H). This implies that the absence of effect on protein aggregation is observed when the GAGs carry one negative charge per disaccharide unit (i.e. only the carboxylate group) and no sulfates. Therefore, the correlation between the G value and the negative charge per disaccharide unit arises from the GAG sulfation state, with the carboxylate group appearing to have no effect.

The sulfate moieties in GAGs can be N- or O-sulfates. It has been proposed that N- and O-sulfates can have different effects on protein aggregation [31]. In our dataset, we did not observe any significant difference between the effects of N- or O-sulfated GAGs on protein aggregation kinetics (not shown). GAGs can also differ in terms of the type of the hexuronic acid, which can be either iduronic or glucuronic acid. It has been suggested that GAGs containing iduronic acid could be more active, due to the greater conformational flexibility of the iduronic pyranose ring with respect to the glucuronic pyranose ring [32]. However, we could not identify any significant difference between the effect of GAGs with iduronic or glucuronic acid on protein aggregation, either when all the data with the same GAG sulfation state were considered (Figure 3A) or when the analysis was restricted to data with the same sulfation state of GAG and only the 173–243 fragment of gelsolin as a polypeptide (Figure 3B). Finally, the G value was not found to correlate with the molecular weight of the GAG. Therefore, it seems that the sulfation state is the only GAG characteristic that has a significant effect on the GAG-mediated acceleration of amyloid fibril formation.

Single parameter analysis: characteristics of the proteins

In a second step, we studied the influence of different parameters of the polypeptide chains.
We looked at the effect of the protein length, charge, and composition in lysine and arginine residues, described in some cases to be responsible for GAG binding [6,33]. We also divided the proteins of our dataset into globular or natively unfolded proteins, or into disease-related or disease-unrelated. We could not identify any significant correlation between G and any of these parameters in any of the datasets used. This result could be due to the small number and heterogeneity of proteins in the database.

Single parameter analysis: characteristics of the experimental conditions

We thoroughly analyzed the importance of the experimental conditions in determining the G value. Most of the experiments reported in our dataset were carried out at physiological temperature and pH, and under identical conditions of ionic strength (see Table S1). As a consequence, the influence of these three parameters could not be analyzed. To have an estimator of buffer composition that could be used as a descriptive parameter for our database, we analyzed the influence of the total solute concentration of the buffer. A significant negative correlation was found between the G value and the solute molarity when considering the entire dataset (Figure 4A, r = 0.47, p = 0.003). A higher solute molarity is associated with a less pronounced accelerating effect of the GAG on protein aggregation (Figure 4A). The analysis was repeated by plotting average G values, each calculated over a range of solute molarity, for the entire dataset; the analysis confirmed the presence of a correlation (Figure 4B, r = 0.84, p = 0.04). In order to limit the problems arising from the heterogeneity of the GAGs used, only data of the GAG heparin were considered in a subsequent analysis.
A correlation was still observed when all G values obtained with heparin were plotted against solute molarity (Figure 4C, r = 0.63, p = 0.01), as well as when average G values, each calculated over a range of solute molarity, were plotted versus solute molarity (Figure 4D, r = 0.73, p = 0.09).

The next studied parameter was the ratio of molar concentrations of the GAG and protein used in the experiments. A clear positive correlation existed between the G value and the

Figure 2. Influence of the number of sulfates and negative charges per GAG disaccharide unit on protein aggregation. A–D: dependence of the G value on the number of sulfates per GAG disaccharide unit; E–H: dependence of the G value on the number of negative charges per GAG disaccharide unit; A and E: different GAGs, proteins and experimental conditions; B and F: same as A and E, but each G value in the plot is the mean of all the G values obtained with a GAG with the same number of sulfates or negative charges; C and G: only G data of α-synuclein in identical experimental conditions are plotted; D and H: only G data of the 173–243 fragment of gelsolin in identical experimental conditions are plotted. In all plots the solid lines represent the lines of best fit; the r and p values of the linear regression and the slope of the line of best fit are reported in each plot. doi:10.1371/journal.pone.0011363.g002

Figure 3. Influence of the chemical nature of the uronic acid present in the GAG on protein aggregation. A: different GAGs, proteins and experimental conditions; B: different GAGs, only the 173–243 fragment of gelsolin in identical experimental conditions. In both cases only GAGs with 2 sulfates per disaccharide unit are considered. Experimental errors indicate standard deviations. The high p values indicate lack of statistical significance. doi:10.1371/journal.pone.0011363.g003
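
The single parameter analyses of the experimental conditions combine two simple operations: a linear correlation of G against one variable, and the same correlation after averaging G over ranges of that variable. The sketch below illustrates both with a hand-written Pearson coefficient and a binning helper. The function names, bin edges and data points are hypothetical and were chosen only to show the mechanics; they are not values from Table S1.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equally long sequences."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def binned_means(xs, ys, edges):
    """Mean x and mean y within each half-open bin [edges[i], edges[i+1])."""
    means = []
    for lo, hi in zip(edges[:-1], edges[1:]):
        pts = [(x, y) for x, y in zip(xs, ys) if lo <= x < hi]
        if pts:  # skip empty bins
            means.append((sum(p[0] for p in pts) / len(pts),
                          sum(p[1] for p in pts) / len(pts)))
    return means


# Illustrative data only: solute molarity (M) and G for six entries.
molarity = [0.02, 0.05, 0.10, 0.15, 0.20, 0.30]
g_values = [3.0, 2.6, 2.1, 1.2, 1.0, 0.3]

r_all = pearson_r(molarity, g_values)              # all entries
bins = binned_means(molarity, g_values, edges=[0.0, 0.1, 0.2, 0.4])
r_binned = pearson_r([b[0] for b in bins], [b[1] for b in bins])
```

For data in which G falls as solute molarity rises, both coefficients come out negative; averaging reduces the number of points, so a stronger correlation after binning does not by itself imply higher statistical significance.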