Protein Engineering vol.10 no.8 pp.895–903, 1997

Modelling protein unfolding: hen egg-white lysozyme

M.A.Williams¹,², J.M.Thornton¹ and J.M.Goodfellow³

Laboratory of Molecular Biology, Department of Crystallography, Birkbeck College, Malet Street, London, WC1E 7HX, UK

¹Also at the Biomolecular Structure and Modelling Unit, Department of Biochemistry and Molecular Biology, University College London, Gower Street, London WC1E 6BT, UK
²Present address: Molecular Structure Division, NIMR, The Ridgeway, Mill Hill, London NW7 1AA, UK
³To whom correspondence should be addressed

© Oxford University Press
Downloaded from http://peds.oxfordjournals.org/ by guest on September 9, 2014

A novel modelling procedure, which rapidly unfolds a protein by enhancing solvent penetration of its core, was used to investigate the unfolding pathway of hen egg-white lysozyme. Early on the unfolding pathway there is a dramatic disruption of the tertiary contacts within the protein, which decouples its domains. Subsequently, the helical domain slowly loses its compactness and the helices fluctuate rapidly. The protein then adopts a ‘molten globule-like’ structure in which the native β-sheet is essentially intact. The modelled structures have properties similar to those of lysozyme’s experimentally characterized partially folded states and provide insight into its complex (un)folding process. The sequence of unfolding events shows how the unfolding pathway of a multidomain protein may be most similar to its fastest, but not necessarily its dominant, folding pathway.

Keywords: folding intermediate/hen lysozyme/molecular modelling/molten globule

Introduction

Considerable progress has been made recently in obtaining structural information about intermediates along the folding pathways of proteins. A variety of experimental techniques, principally H–D exchange NMR, circular dichroism and fluorescence spectroscopy and protein engineering, have provided information about the changing environments and configurations of individual residues during the folding process (Baldwin, 1993; Fersht, 1993; Evans and Radford, 1994; Serrano, 1994). Structural models of the intermediate states have then been built which are consistent with these data. Unfortunately, the experimental information is insufficient to determine high-resolution structures of the folding intermediates and we have therefore turned to molecular simulation techniques in an attempt to model them.

An atomic resolution simulation of a protein folding from an initial random coil state is currently computationally impracticable. Consequently, several researchers have recently carried out simulations of the unfolding of proteins in aqueous solution in order to derive models of the folding intermediates (Daggett and Levitt, 1994). Modelling the unfolding process is a tractable problem, as the system starts from a well defined state whose stability can be artificially disrupted, precipitating rapid structural change. This disruption is usually produced by increasing the temperature of the simulation to very high levels (225 and 327°C have both been used in several studies), which dramatically increases the rate of unfolding, reducing the total unfolding time to about 1 ns, compared with the several milliseconds expected under experimental conditions.

We have developed a novel, alternative, method for unfolding of a protein at experimentally attainable temperatures by enhancing solvent penetration of the protein core. The thermodynamics of solvent–solvent and solvent–protein interactions are vitally important in determining the structure and stability of proteins (Dill, 1990; Levitt and Park, 1993).
Experimentally, protein unfolding is initiated by a change in the relative thermodynamics of solvent–protein, solvent–solvent and protein–protein interactions which results in an increased propensity for the solvent molecules to interact with protein rather than other solvent molecules. We aim to mimic the protein unfolding process using molecular dynamics-based modelling of the protein–solvent system at 80°C and encouraging the protein to become more solvated. As the protein structure fluctuates during the dynamics, relatively large cavities form transiently within its core. To promote unfolding, we periodically interrupt the dynamics and insert water molecules into some of these cavities. We have previously shown that a cavity within a protein is increasingly likely to be hydrated as its size and the number of polar groups which line the cavity increase (Williams et al., 1994). Thus we chose to insert water molecules into those cavities which are sufficiently large and polar that they could be hydrated under conditions in which proteins unfold, but which would not necessarily be occupied in the native structure at 25°C. In an effort to mimic the diffusional process, we also require that a new water molecule may only be inserted at a position less than 4.5 Å from an existing water molecule. During the subsequent period of molecular dynamics, the protein reacts to the disturbance created by each new set of water molecules. Each individual water molecule can stabilize or destabilize the local structure or simply diffuse out of the cavity into the surrounding bulk solvent. However, the net effect of this enhanced solvent penetration protocol (detailed in Figure 1 and under Methods) is to cause the protein to unfold rapidly at 80°C. This protocol can model the early stages of protein unfolding using an order of magnitude less computational effort than previously reported procedures. 
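The two insertion criteria described above (a cavity that is sufficiently likely to be hydrated, and a position less than 4.5 Å from an existing water molecule) can be illustrated with a minimal sketch. This is not the authors' INSERT program: the function name, the data layout and the idea of representing each cavity by a single centre point are assumptions made for illustration. The hydration probability is taken as a given input here (in the paper it comes from the PRO_ACT analysis), and the 20% threshold is the value quoted under Methods.

```python
import math

def select_insertion_sites(cavities, waters, p_min=0.20, max_dist=4.5):
    """Return the centres of cavities that qualify for water insertion.

    cavities: list of (centre_xyz, hydration_probability) tuples
              (hypothetical layout; probabilities are supplied by the caller)
    waters:   list of xyz positions of existing water molecules
    p_min:    hydration-probability threshold (>20% in the reported runs)
    max_dist: maximum distance in Å to the nearest existing water
    """
    sites = []
    for centre, p_hydrated in cavities:
        if p_hydrated <= p_min:
            continue  # cavity is too small or too apolar to be hydrated
        # Diffusion-like constraint: the site must lie close to an existing water.
        if any(math.dist(centre, w) < max_dist for w in waters):
            sites.append(centre)
    return sites
```

For example, with one existing water at (3, 0, 0), a cavity at the origin with probability 0.5 qualifies, whereas an equally polar cavity 7 Å from that water, or a nearby cavity with probability 0.1, does not.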
We chose to study the unfolding of hen egg-white lysozyme using this novel methodology. Hen egg-white lysozyme (HEWL) is an α + β protein with a large α-domain containing four α-helices and a 3₁₀-helix and a smaller β-domain consisting of a triple-stranded anti-parallel β-sheet, an irregular loop containing two disulphide bridges and a 3₁₀-helix (Figure 2). The protein contains a total of four disulphide bonds, all of which remain intact in both the reported experimental and our theoretical studies. Extensive experiments show that HEWL has multiple folding pathways, that on the dominant folding pathway the α-domain becomes substantially folded prior to the β-domain and that a significant minority of molecules follow a much faster route on which the domains fold at similar rates (Radford et al., 1992; Dobson et al., 1994; Kiefhaber, 1995). This protein has also been investigated by other workers using several modelling protocols (Hunenberger et al., 1995), including one of the pioneering applications of the very high temperature method (Mark and van Gunsteren, 1992).

Methods

The modelling procedure is summarized in the flowchart shown in Figure 1. An X-ray crystal structure of HEWL (1hel, Wilson et al., 1992) is placed in a periodic box of water molecules and all those within 2.3 Å of a protein atom are removed. The AMBER/OPLS potential (Weiner et al., 1984; Jorgensen and Swenson, 1985) is used to model the interactions of the protein and the TIP3P potential (Jorgensen et al., 1983) those of the water molecules. The charge states of the protein residues are set at those expected for pH 7 (States and Karplus, 1987). Interactions are set to zero for residues or molecules separated by more than 8 Å. The dielectric constant is set equal to 1. All molecular dynamics and minimization are carried out using AMBER (Pearlman et al., 1991).
Fig. 2. Crystal structure of hen egg-white lysozyme (Wilson et al., 1992). (a) Ribbon representation showing two α-helices (A and B) at the N-terminal end of the protein (pale blue) leading into a three-stranded β-sheet and a large loop (green) which is followed by a helical segment (dark blue) consisting of a 3₁₀-helix, two more α-helices (C and D) and the C-terminal 3₁₀-helix. The protein has four disulphide bonds (C6–C127, C30–C115, C64–C80, C76–C94), which are shown in yellow. The active site of the enzyme lies in the cleft at the top right of the structure. (b) The molecular surface of the crystal structure of HEWL in the same orientation.

The protein–water system is initially simulated in this periodic manner for 10 ps of molecular dynamics at 300 K and 1 atm pressure, using the constant NPT algorithm of Berendsen et al. (1984) with a coupling constant of 0.2 ps. The time step for each iteration of the dynamics is 2 fs and SHAKE is used to fix the length of covalent bonds involving hydrogen. Following this ‘relaxation’ phase, those water molecules further than 5 Å from a protein atom are removed from the system, using the program SHELL (M.A.W.), and the periodic boundary conditions are no longer applied. The subsequent modelling of the protein surrounded by its shell of waters is carried out with a distance-dependent dielectric at constant temperature. Although this particular ‘solvent shell’ approach does not precisely mimic the behaviour of a rigorous and computationally demanding periodic box simulation with long cut-off distances or Ewald sum and does somewhat reduce the mobility of surface residues, it has been shown to give good structural agreement with both those more exact simulations and crystal structures, producing molecular systems which are conformationally stable, conserve energy and exhibit very good hydrogen bonding (Guenot and Kollman, 1992, 1993; Arnold and Ornstein, 1994).
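The shell-trimming step described above (removal of every water molecule further than 5 Å from any protein atom) amounts to a simple distance filter. The sketch below is only an illustration of that criterion, not the SHELL program itself: the function name is hypothetical, each water is assumed to be represented by a single point (for instance its oxygen position), and a brute-force search is used where a production tool would use a spatial grid.

```python
import math

def trim_shell(protein_atoms, waters, shell_radius=5.0):
    """Keep only the waters lying within shell_radius (Å) of a protein atom.

    protein_atoms: list of xyz positions of protein atoms
    waters:        list of xyz positions, one point per water (assumption)
    """
    return [
        w for w in waters
        if any(math.dist(w, a) <= shell_radius for a in protein_atoms)
    ]
```

With a single protein atom at the origin, a water at (3, 0, 0) is retained and one at (6, 0, 0) is discarded.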
We believe that the solvent shell protocol is appropriate and adequate for our purpose of rapid generation of structural models of partially unfolded proteins.

Following the relaxation phase, the system undergoes a number of cycles of modelling, each involving a period of dynamics followed by repair of the solvent shell and solvent insertion into selected cavities within the protein. Preliminary studies showed that the protein responds very rapidly to the insertion of the new water molecules. The expansion of the protein framework, the convergence of the total energy and the formation of new cavities occur substantially in the first picosecond of dynamics, with little further change in the period between 1 and 5 ps. Consequently, in line with our aim of producing a protocol capable of rapidly unfolding proteins, we decided that the dynamic phase of each cycle should consist of 200 steps of steepest descents minimization followed by only 1 ps of molecular dynamics.

After the dynamics the movement of protein and water often leaves gaps in the shell. The SHELL program removes any waters that have moved more than 5 Å from a protein atom, then immerses the protein and its shell of waters in a large bath of water. Any of these bath water molecules within 5 Å of a protein atom and more than 2.3 Å from every protein atom or shell water molecule is added to the shell. This repair process is vital as it maintains a uniform shell of water around the protein as it expands during the unfolding process. No water molecules are placed in cavities at this stage.

The structure of the protein and its shell of water molecules is then analysed using PRO_ACT (Williams et al., 1994), which identifies and characterizes the empty cavities within the structure. When a cavity is found which has a sufficiently high probability of being hydrated, a water molecule is placed in an arbitrary orientation in that cavity by the program INSERT (M.A.W.). Analysis of protein crystal structures has shown that the probability of a cavity being hydrated increases as its size and polarity increase (Williams et al., 1994). In the modelling reported in detail here, we chose to place water molecules in any cavity with a probability of hydration of >20%. The new structure, with repaired shell and possibly additional water molecules in its interior cavities, is then simulated as described above and a new cycle begins.

Secondary structures were assigned using SSTRUC (D.Smith, UCL), which implements the algorithm of Kabsch and Sander (1983). The stereochemistry of the structures was analysed using PROCHECK (Laskowski et al., 1993). Solvent accessibilities were calculated using ACCESS (S.Hubbard, UCL) and contact maps and cavity volumes (i.e. the total volume of probe spheres with a diameter >1.9 Å that can be inserted into the protein) using PRO_ACT. Black and white illustrations were created with ROMPLOT (R.Laskowski, UCL) and colour illustrations with GRASP (Nicholls et al., 1991).

Fig. 1. Flow diagram of the modelling procedure used to unfold the protein.

Results and discussion

Change in global properties during unfolding

The extent of the structural change undergone by HEWL during unfolding is most simply monitored by the Cα root mean square (r.m.s.) deviation from the crystal structure (Figure 3a) and changes in the radius of gyration Rg (Figure 3b). The deviation of a ‘control’ modelling study of HEWL, carried out at 25°C with no solvent insertion, remains below 2.0 Å r.m.s. and the protein Rg increases by only 4% over 100 cycles of modelling. Such changes are in line with those expected for a protein which is effectively transferred from a crystal to a solution environment (Smith et al., 1993). In contrast, under unfolding conditions, carried out at 80°C with new water molecules inserted into suitable cavities, the protein structure undergoes dramatic punctuated change. The protein remains near its starting structure for the first 12 cycles (structures A to B) then expands extremely rapidly (structures B to C) to a state which is stable over the following 30 cycles (structures C to D). It then undergoes a second slower expansion (structures D to E), finally reaching another structural plateau (structures E to F). Changes in the principal moments of inertia reveal that the protein expands roughly equally in all directions during unfolding.

The β-sheet is maintained throughout the unfolding, although there is some fraying at the ends of strands (Figure 3c). In contrast, the α-helical content fluctuates more and falls to 40–50% of the value for the crystal structure, with the most rapid loss of helical content coinciding with the first major expansion phase. The loss of helical structure is not uniformly distributed, with helices C and D and the 3₁₀-helices being much more disrupted than helices A and B.

The robustness of this model of the unfolding pathway was investigated by repeating the unfolding procedure, as detailed in Methods, using an alternative starting crystal structure (4lyz; Diamond, 1974). The stages in the resulting alternative pathway are very similar to those described above and the structures at each stage have similar global properties. The only distinct differences between the alternative and original pathways are that the helices, particularly helix B, are less structured in the alternative pathway following the first rapid expansion and that consequently the alternative final structure is slightly more expanded. Our modelling procedure appears to produce a consistent unfolding pathway from different initial structures and consequently we shall only describe the original pathway in detail in this paper. However, partially unfolded structures along the simulated pathway differ in detail, reflecting the expected diversity of such structures in a real population.

The values of parameters used in our protocol, as detailed in Methods, are not definitive, but were chosen to give relatively rapid unfolding and produce well defined intermediate structures. If the modelling procedure is applied to our original crystal structure, but using a different temperature and/or with water insertion at different depths, the unfolding path also passes through a similar sequence of stages, i.e. rapid expansion followed by a structure with fluctuating helices and a relatively stable sheet, with higher temperatures and deeper insertion increasing the rate of unfolding. Inserting water molecules into smaller and/or less polar cavities modifies the unfolding path, as this disrupts more stable regions of the protein.

Solvent penetration of the protein core

At the end of the first cycle of the unfolding procedure, only two new water molecules are inserted, occupying cavities in the protein core between the domains. The relaxation which follows this insertion causes the protein to expand, opening up new cavities into which water molecules may be inserted in subsequent cycles. Cavities that are suitable for water insertion do not occur on every cycle and few or no new water molecules are inserted during each of the first 10 cycles (Figure 4a). The water molecules that are inserted disrupt the protein structure slightly so that some other molecules are able to diffuse from the surrounding solvent shell into the space between the domains during the molecular dynamics phase. Despite this, the structure at cycle 10 is still well packed, with only a small increase in cavity volume (Figure 3d).

During the next few cycles, the protein undergoes rapid structural change (structures B to C). The protein expands dramatically, losing many close packing interactions within the core and creating a large amount of empty space inside the protein (Figure 3d) into which many new water molecules are inserted (Figure 4a). The protein responds to these insertions by expanding still further such that at the end of cycle 15, 44 new water molecules are inserted into the cavities in the protein, principally between the two folding domains (Figure 5). The separation of the secondary structure elements in the protein core makes many residues solvent accessible and SHELL (Figure 1 and Methods) increases the amount of surface water accordingly (Figure 4b). The principal core of the protein, formed by the interaction of the first two helices with each other and with the structure between the start of the β-sheet and the end of the C helix, is disrupted and becomes substantially solvated, as indicated indirectly by the reduction in interior cavity volume in the protein at this time (Figure 3d).

The protein structure then remains stable for some time before a second set of water insertions is made (Figure 4a), precipitating a gentler structural transition. The main effect of these later insertions is to disrupt what are apparently two smaller well packed cores within the protein at the interface of the B and D helices and the interface of the C helix plus the preceding 3₁₀-helix and the β-sheet plus loop structure.

The rate-limiting transition and a late-folding intermediate?

Despite the burial of 10 water molecules within the protein (cf.
six in the crystal structure), the domain interface is still fairly well packed at cycle 10 of the modelling procedure. Then, eight water molecules are inserted at the end of cycle 11, which together with the waters already present so disrupt the structure’s integrity as to precipitate a catastrophic expansion of the structure, creating a poorly packed protein interior which is quickly penetrated by water molecules (Figure 6a and b). The rapidity of the change in the protein structure suggests that the structural integrity of the protein in its native state is highly cooperative and that the disruption of a few contacts has a domino effect, which results in the loss of interactions between the two domains of the protein and between the A + B helices. Changes in the backbone conformation of the protein are very much localized to the loop between the A and B helices with lesser changes occurring in the loops between other secondary structural elements (Figure 7a). The secondary structure elements are only slightly disrupted, with some distortion of the region which formed the C-terminal 3₁₀-helix in the crystal structure and some fraying of the ends of the α-helices.

The similarity of this catastrophic ‘unlocking’ of the protein to the period of rapid expansion on the unfolding pathway of chymotrypsin inhibitor II immediately following its transition state (Li and Daggett, 1994) suggests to us that changes occurring during the first transition (B–C) represent the rate-limiting step in the unfolding of HEWL. The transition state itself, the state with the highest free energy during this transition, is difficult to identify definitively since we have no way of accurately determining the free energy of any of our model structures.
However, it is most likely similar to the structures obtained in the early part of the transition (Figure 6a), in which previously favourable packing interactions are disrupted but not yet fully compensated by increased side-chain entropy or hydration.

Fig. 3. Changes in structural parameters during unfolding. (a) Cα r.m.s. deviation of the model from the crystal structure as a function of the number of cycles of modelling. The grey triangles represent the control procedure (25°C, no solvent insertion) and the black the unfolding procedure (80°C, shallow insertion). (b) Radius of gyration of the models from the control and unfolding procedures distinguished as in (a). (c) Proportion of each model structure forming α-helices (black) and β-sheet (grey). (d) Cavity volume of the model structures from the unfolding procedure.

Fig. 5. Ribbon representation of model structure C (cycle 16 of the unfolding procedure), illustrating the 44 new water molecules (blue) that were inserted at the end of the previous cycle.

Fig. 6. Domain interface viewed from above the active site, (a) at cycle 12 and (b) at cycle 14 during the first transition. Cavities are represented by the red mesh and buried or cleft water molecules by the blue mesh.

In the metastable state (C–D) which follows this transition, the two lobes of the molecule are essentially decoupled and the active site is substantially disrupted (Figure 6b). The A helix and C-terminus have moved away from the rest of the protein (i.e. some Cα atoms of the A and C helices have moved more than 10 Å further apart) and much of the sheet through to the end of the C helix has moved away from the rest of the protein, with the C-terminal end of the C helix moving furthest.
The loss of specific side-chain packing and consequent increased solvent accessibility of many residues (Figure 7b) will result in a more disordered side-chain structure than in the native state. In particular, two (W28 and W108) of the three tryptophan residues which were buried in the crystal structure become partially solvent exposed and only W111 retains a similar degree of burial to that in the native state, including complete burial of the indole NH group. (This pattern of tryptophan burial is also found on the alternative pathway derived using a different starting crystal structure, despite it having somewhat different secondary structure at this point.) A consequence of the increased solvent exposure of side chains is that the protein surface is more hydrophobic than in the native state (e.g. the non-polar solvent-accessible surface area of structure D is 60% greater than that of A).

A late intermediate on the folding pathway has been characterized by spectroscopic studies of its tryptophan residues and interactions with ligand molecules (Itzhaki et al., 1994). The experiments suggest that the late-folding intermediate is a substantially collapsed state which buries the W111 indole, though it lacks many fixed tertiary interactions including those in the domain interface, and which attains most of the secondary structure of the molecule. The set of structures from the metastable state (C–D) is consistent with these observations and might be a good model for this late-folding intermediate.

Fig. 4. Changes in solvation during unfolding. (a) Average number of water molecules found buried (black), in clefts (grey) and newly inserted into the protein core (cross hatched) per model for each five cycle segment of the unfolding procedure. (b) Number of water molecules covering the surface of the protein as a function of the number of cycles of modelling.
Fig. 7. Structural parameters of the late-folding intermediate and molten globule-like states. (a) Local structural deviations from the crystal structure of structures C (black triangles) and F (grey triangles) defined as the r.m.s. fit of every five residue segment of each structure to the corresponding segment of the crystal structure. The values for structure F are shifted up by 1 Å for clarity. (b) Solvent accessibility differences of each residue of structures C (black triangles) and F (grey triangles) from those of the crystal structure. The solvent accessibility of the residues in the crystal structure (shifted up by 200%) is shown as the black line. The values for the solvent accessibility change on going from the crystal structure to structure C are shifted up by 100%.

Fluctuating α-helices and a stable β-sheet

After the first transition (structures B to C), the lengths of the helices fluctuate. In the compact but solvated state which follows this transition the main chain C=O and H–N groups in helices can quickly alternate between bonding with each other and with nearby water molecules. When further tertiary contacts are lost in the second transition, the protein becomes less compact and the average rate of reforming main-chain helical hydrogen bonds is reduced, so that broken bonds tend to persist and helices consequently disappear altogether. The stability of the helices seems to be sequence and perhaps tertiary structure dependent. The helices A and B are essentially intact, though some of their side chains change conformation, whereas only the central part of C remains and the D and 3₁₀-helices have entirely disappeared. However, even in the C and D helices, despite the loss of the main-chain hydrogen bonds, only one residue at the beginning of each of the D and C helices has moved out of the α-region of the Ramachandran plot.
Thus, ‘refolding’ of these helices might well occur on a longer time-scale than that of our modelling procedure. Such fluctuating helices have been proposed as the explanation of the discrepancy between the rate of formation of helical structure measured by circular dichroism and the slower rate measured by the protection from exchange of amide hydrogens in several proteins (Chaffotte et al., 1992; Evans and Radford, 1994).

In contrast to the helices, the triple-stranded sheet remains intact throughout, despite transiently losing a few hydrogen bonds. Relatively persistent β-sheets have been observed in other unfolding simulations (Daggett and Levitt, 1993; Vijayakumar et al., 1993; Hunenberger et al., 1995) and it has been suggested that this stability is due to burial of a large amount of non-polar surface in the sheet and to the cooperative nature of the structure. Also, in lysozyme the β-sheet is rather polar and its cooperative resistance to dynamic fluctuations is probably enhanced by its side chain–side chain hydrogen bonds and salt bridge.

A transient molten globule?

The final 35 structures obtained by the unfolding procedure (structures E to F) have very similar global properties (e.g. Rg, shape and solvent-accessible surface area) and define a second metastable state for the protein under our modelling conditions. This state is substantially more expanded, more solvated and has less secondary structure than the earlier metastable state (C–D). This state (E–F) has many of the properties of a ‘classical molten globule’: a compact, mobile, partially structured state, which can be stabilized for many proteins under conditions of low pH and which has been proposed as a general intermediate on protein folding/unfolding pathways (Ptitsyn et al., 1991; Ptitsyn, 1995). The radius of gyration of our model of HEWL increases by 18% after 100 cycles (Figure 3b), which is within the range (15–35%) reported for other simulations (Daggett and Levitt, 1992).
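As an illustration of how a figure such as the 18% increase in Rg can be obtained from model coordinates, the following sketch computes a radius of gyration and its percentage change. It rests on assumptions not stated in the text: the function names are hypothetical, and the paper does not say which atoms were included or whether the calculation was mass weighted, so equal weights are used unless masses are supplied.

```python
import math

def radius_of_gyration(coords, masses=None):
    """Root mean square distance of the atoms from their (weighted) centre.

    coords: list of xyz positions in Å
    masses: optional list of weights; equal weights are assumed if omitted
    """
    if masses is None:
        masses = [1.0] * len(coords)
    total = sum(masses)
    # Weighted centre of the coordinate set
    centre = [
        sum(m * c[i] for m, c in zip(masses, coords)) / total
        for i in range(3)
    ]
    # Weighted mean of the squared distances from that centre
    mean_sq = sum(
        m * sum((c[i] - centre[i]) ** 2 for i in range(3))
        for m, c in zip(masses, coords)
    ) / total
    return math.sqrt(mean_sq)

def percent_change(initial, final):
    """Percentage change of `final` relative to `initial`."""
    return 100.0 * (final - initial) / initial
```

Applied to the crystal structure and to the final model, `percent_change(rg_crystal, rg_final)` would give the kind of relative expansion quoted above; the two variable names are placeholders for values computed with `radius_of_gyration`.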
The structure has lost many of its native tertiary contacts (Figure 8) and the majority of those residues which were buried in the native structure have become substantially solvent exposed, including all of the tryptophan residues (Figures 7b, 9a and 9b). The increase in solvent accessibility of many of the previously buried residues will inevitably lead to their having greater mobility. This high side-chain mobility is compatible with experimental observations of chemical shift broadening in NMR spectra of molten globules (Alexandrescu et al., 1993). The amount of non-polar surface exposed to solvent is 100% greater in structure F than in the crystal structure (the total surface area has increased by only 60%), in accord with the observations of increased binding of the hydrophobic probes, lower solubility, increased likelihood of aggregation and higher heat capacity of molten globules (Kim and Baldwin, 1990).

Fig. 8. Cα–Cα distance maps for the crystal structure (upper left triangle) and structure F (lower right triangle). The radius of the circle representing each residue pair depends on their separation. The largest circles represent a distance of <5 Å and the radius is reduced linearly with separation to 15 Å, which is the largest separation shown.

Fig. 9. Final ‘molten globule-like’ model structure (F) from the unfolding procedure. (a) Atomic representation with tryptophan residues shown in purple and the backbone ribbon coloured according to the local r.m.s. deviation from the crystal structure (red representing the least and blue the greatest deviation). (b) Molecular surface of structure F of HEWL in the same orientation as in (a).
Hen lysozyme has not been found to form an equilibrium molten globule, although the homologous proteins α-lactalbumin and human and equine lysozyme do (Haezebrouck et al., 1995; Morozova et al., 1995; Schulman et al., 1995) and a transient state similar to these molten globules is thought to exist on the hen lysozyme folding pathway (Ikeguchi et al., 1986). In general terms, the properties of the structures E to F resemble those of the well characterized α-lactalbumin molten globule. Most of the protein’s residues remain in their native region of the Ramachandran plot in accord with CD observations (Chyan et al., 1993). Only some of the secondary structure is completely stable (in our case 90% of the β-sheet and 40% of the helix), while some helices fluctuate and allow amide hydrogen exchange with solvent, in accord with H–D exchange data (Chyan et al., 1993). However, it is difficult to make a more detailed comparison as the helices which are relatively resistant to H–D exchange in the experimental studies of the equilibrium molten globules of α-lactalbumin (B and C) and human (A, B and C) and equine (A, B and D) lysozymes are different from each other and from those which we find to be stable for our HEWL molten globule-like structures (A and B only, see Figures 7a and 9a). Qualitatively these differences are easily rationalized since the stability of helices is strongly dependent on sequence and local environment, particularly the presence of disulphide bonds (Yang et al., 1995); however, only speculative inferences as to the detailed unfolding pathway of HEWL can be drawn from either the experimental data on these molten globules or our models. The fact that helices A and B are stable in both lysozyme molten globules and our model structures could be taken as support for our models. The greater flexibility of helices C and D seen in our models of HEWL is in contrast to the observed stability of at least one of the two in the known experimental molten globules and could be the reason why HEWL lacks an equilibrium molten globule state, as has been suggested previously (Haezebrouck et al., 1995).

Conclusions

The properties of the structures generated during unfolding are similar to those observed experimentally for the folding/unfolding intermediates of many proteins. In particular, the model pathway we have presented exhibits many features found in the experimental studies of the folding of HEWL. The first step in unfolding is an ‘unlocking’ of the two independent folding/unfolding domains, the reverse of the final step of folding (Dobson et al., 1994). In common with the experimentally characterized late-folding intermediate, our compact ‘unlocked’ structure loses many fixed tertiary interactions and the integrity of the sugar binding site, buries tryptophan residue W111 completely and maintains much of the native secondary structure. However, the ‘molten globule-like’ structures, which we observe subsequently on the unfolding pathway, do not correlate with the dominant kinetic intermediates observed early on the folding pathway.

What is the reason for the discrepancy between the observed structures of the majority of early-folding molecules and our late-unfolding structures? The experimentally characterized folding pathway has several branches. The dominant route involves prior folding of the α-domain, followed by formation of the β-sheet domain. Although the majority of folding molecules go down this slow pathway, a small minority take a ‘fast track’ in which the two domains form at similar rates (Dobson et al., 1994; Kiefhaber, 1995). This provides an explanation for our observation of ‘molten globule-like’ structures in which a disrupted α-domain coexists with a somewhat less disrupted β-domain. The majority of unfolding proceeds by the quickest route, along the reverse of the ‘fast track’, on which the α- and β-domains form, and presumably are disrupted, at similar rates. It seems plausible that it is generally true that the different stability and kinetic accessibility of α [...] of its unfolding pathway persist despite changes to the details of the modelling protocol.

The relative rapidity of the unfolding protocol that we have described, involving a self-repairing solvent shell and enhanced solvent penetration of the protein core, opens the way for extensive modelling of the (un)folding intermediates of many other proteins (including very large proteins) for which experimental data are available. We hope to find further correlation with the experiments on these proteins and enable a productive dialogue between theory and experiment. Perhaps we could attempt to begin that dialogue by pointing out that unfolding experiments on mutant hen egg-white lysozymes, particularly mutations of tryptophan, β-sheet and domain interface residues, could potentially confirm or deny key aspects of our model pathway.

Acknowledgements

We thank Drs C.Dobson, P.Evans, P.Kim and L.Smith for encouragement and valuable discussions and are particularly grateful for the comments of Chris Dobson and Phil Evans on earlier versions of this paper. This work was supported by grants from the UK Biotechnology and Biological Sciences Research Council (GR/H37051 and GR/H63678).

References

Alexandrescu,A.T., Evans,P.A., Pitkeathly,M., Baum,J. and Dobson,C.M. (1993) Biochemistry, 32, 1707–1718.
Arnold,G.E. and Ornstein,R.L. (1994) Proteins: Struct. Funct. Genet., 18, 19–33.
Baldwin,R.L. (1993) Curr. Opin. Struct. Biol., 3, 84–91.
Berendsen,H.J.C., Postma,J.P.M., van Gunsteren,W.F., DiNola,A. and Haak,J.R. (1984) J. Chem. Phys., 81, 3684–3690.
Caflisch,A. and Karplus,M. (1994) Proc. Natl Acad. Sci. USA, 91, 1746–1750.
Chaffotte,A.F., Guillou,Y. and Goldberg,M.E. (1992) Biochemistry, 31, 9694–9703. Chyan,C.-L., Wormald,C., Dobson,C.M., Evans,P.A. and Baum,J. (1993) Biochemistry, 32, 5681–5691. Daggett,V. and Levitt,M. (1992) Proc. Natl Acad. Sci. USA, 89, 5142–5146. Daggett,V. and Levitt,M. (1993) J. Mol. Biol., 232, 600–619. Daggett,V. and Levitt,M. (1994) Curr. Opin. Struct. Biol., 4, 291–295. Diamond,R. (1974) J. Mol. Biol., 82, 371–390. Dill,K.A. (1990) Biochemistry, 29, 7133–7155. Dobson,C.M., Evans,P.E. and Radford, S.E. (1994) Trends Biochem. Sci., 19, 31–37. Evans,P.E. and Radford,S.E. (1994) Curr. Opin. Struct. Biol., 4, 100–106. Fersht,A.R. (1993) FEBS Lett., 325, 5–16. Guenot,J. and Kollman,P.A. (1992) Protein Sci., 1, 1185–1205. Guenot,J. and Kollman,P.A. (1993) J. Comput. Chem., 14, 295–311. Haezebrouck,P., Joniau,M., Vandael,H., Hooke,S.D., Woodruff,N.D. and Dobson,C.M. (1995) J. Mol. Biol., 246, 382–387. Hunenberger,P.H., Mark,A.E. and van Gunsteren,W.F. (1995) Proteins, 21, 196–213. Ikeguchi,M., Kuwajima,K., Mitani,M. and Sugai,S. (1986) Biochemistry, 25, 6965–6972. Itzhaki,L.S., Evans,P.A., Dobson,C.M. and Radford,S.E. (1994) Biochemistry, 33, 5212–5220. Jorgensen,W.L. and Swenson,C. (1985) J. Am. Chem. Soc., 107, 1489–1496. Jorgensen,W.L., Chandrasekar,J., Madura,J.D., Impey,R.W. and Klein,M.L. (1983) J. Chem. Phys., 79, 926–935. Kabsch,W. and Sander,C. (1983) Biopolymers, 22, 2577–2637. Kiefhaber,T. (1995) Proc. Natl Acad. Sci. USA, 92, 9029–9033. Kim,P.S. and Baldwin,R.L. (1990) Annu. Rev. Biochem., 59, 631–660. Laskowski,R.A., MacArtur,M.W., Moss,D.S. and Thornton,J.M., (1993) J. Appl. Crystallogr., 26, 283–291. Levitt,M. and Park,B.H. (1993) Structure, 1, 223–226. Li,A. and Daggett,V. (1994) Proc. Natl Acad. Sci. USA, 91, 10430–10434. Mark,A.E. and van Gunsteren,W.F. (1992) Biochemistry, 31, 7745–7748. Morozova,L.A., Haynie,D.T., Arico-Muendel,C., van Dael,H. and Dobson,C.M. (1995) Nature Struct. Biol., 2, 871–875. Nicholls,A., Sharp,K.A. and Honig,B. 
(1991) Proteins: Struct. Funct. Genet., 11, 281–296. Pearlman,D.A. et al. (1991) AMBER 4.0. University of California, San Francisco. Downloaded from http://peds.oxfordjournals.org/ by guest on September 9, 2014 and β structures means that the unfolding pathways of α 1 β and more complex multidomain proteins will often not simply be the reverse of the dominant folding pathway, but will follow the fastest path avoiding the ‘kinetic traps’ encountered by most folding molecules. Our novel unfolding protocol is a valuable alternative to those published protocols which require the use of very high temperatures to increase the unfolding rate. There is always the concern with high-temperature protocols that the large kinetic energy present in the protein allows it to cross energy barriers which it could not usually cross under experimental conditions and consequently produce unrealistic structures. In practice, this potential problem has not prevented hightemperature studies from providing many useful insights into the unfolding/folding of several proteins (Daggett and Levitt, 1994). One such early study of HEWL at high temperature (Mark and van Gunsteren, 1992) also showed that unfolding proceeds via a metastable molten globule-like state, though with relatively little secondary structure. Later theoretical studies involving the forced expansion of the molecule (Hunenberger et al., 1995) additionally showed separate domains and relatively stable β-structure, though the structures differed in detail from ours and reservations were expressed in respect of the artificiality of the perturbations applied to the protein in these studies in order to produce rapid unfolding (Hunenberger et al., 1995). Our protocol for enhancing solvent diffusion into the protein is intended to mimic a natural process and we believe it provides a relatively gentle way to unfold proteins in a reasonable time. 
However, we are clearly making some significant assumptions about the unfolding process and the modelling protocol. Primarily we are assuming that changes in protein solvation are the most important events in the unfolding process and that enhancement of the diffusion of water into the protein without a corresponding enhancement of the rate of internal structural rearrangement of the protein itself does not significantly alter the structures that are observed on the folding pathway. It has been observed in some high-temperature studies that, because of the relative rates of solvent penetration and protein structural rearrangement, some cavities may persist for a substantial length of time in unfolding proteins without becoming hydrated and that these cavities allow a rearrangement of the protein’s internal structure, which in turn precipitates further unfolding. Our protocol is unlikely to allow polar cavities (in particular) to persist and this may influence the unfolding pathway. However, the observation of persistent unhydrated cavities is not universal [e.g. rapid solvation of cavities was observed by Caflish and Karplus (1994)] and may itself be dependent on particular simulation protocols. Even if, as we assume, the different relaxation rates of water and protein have no substantial effect on the main features of the pathway, there will be minor effects. In particular, there is limited scope for changes in side-chain internal geometry in the short time-scale of each modelling cycle in our current protocol. Consequently, we have restricted our analysis to a level of detail that we believe is likely to be adequately modelled by this current protocol and which is equivalent to that currently offered by experiment. As future experiments provide more detailed information, it will be necessary for theoretical models of all kinds to improve, if they are to continue to make a contribution to knowledge in this field. 
Fortunately, at this stage, our investigations of HEWL show good agreement with experimental data and that many features Modelling HEWL unfolding Ptitsyn,O.B. (1995) Curr. Opin. Struct. Biol., 5, 74–78. Ptitsyn,O.B., Pain,R.H., Semisotnov,G.V., Zerovnik,E. and Razgulyaev,O.I. (1991) FEBS Lett., 262, 20–24. Radford,S.E., Dobson,C.M and Evans,P.A. (1992) Nature, 358, 302–307. Schulman,B.A., Redfield,C., Peng,Z., Dobson,C.M. and Kim,P.S. (1995) J. Mol. Biol., 253, 651–657. Serrano,L. (1994) Curr. Opin. Struct. Biol., 4, 107–111. Smith,L.J., Sutcliffe,M.J., Redfield,C. and Dobson,C.M. (1993) J. Mol. Biol., 229, 930–944. States,D.J. and Karplus,M. (1987). J. Mol. Biol., 197, 123–130. Vijayakumar,S., Vishveshwara,G., Ravishanker,G. and Beveridge,D.L. (1993) Biophys. J., 65, 2304–2312. Weiner,S.J., Kolman,P.A., Case,D.A., Singh,U.C., Ghio,C., Alagona, G., Profeta,S. and Weiner,P. (1984) J. Am. Chem. Soc., 106, 765–784. Williams,M.A., Goodfellow,J.M. and Thornton,J.M. (1994) Protein Sci., 3, 1224–1235. Wilson,K.P., Malcolm,B.A. and Matthews,B.W. (1992) J. Biol. Chem., 267, 10842–10849. Yang,J.J, Buck, M., Pitkeathly,M., Kotik,M., Haynie,D.T., Dobson,C.M. and Radford,S.E. (1995) J. Mol. Biol., 252, 483–491. Received June 5, 1996; revised April 23, 1997; accepted April 29, 1997 Downloaded from http://peds.oxfordjournals.org/ by guest on September 9, 2014 903