Free energy calculations of the wild-type and the variant human T cell lymphotropic virus type 1 Tax peptide presented by the MHC to the TCR have been performed using large scale massively parallel molecular dynamics simulations. The computed free energy difference (−1.86 ± 0.44 kcal/mol) using alchemical mutation-based thermodynamic integration agrees well with experimental data (−2.9 ± 0.2 kcal/mol). Our simulations exploit state-of-the-art hardware and codes whose algorithms have been optimized for supercomputing platforms. This enables us to simulate larger, more realistic biological systems for longer durations without the imposition of artificial constraints.
The recognition of antigenic epitopes by the TCR is the key molecular event at the heart of the immune response. Small peptide fragments derived from both host and pathogen proteins bind to MHC and are subsequently recognized by TCR. The binding of the peptide-MHC (pMHC)3 complex to the TCR ultimately results in various protective functions, including release of cytokines to cause local inflammation and specific killing of virus-infected cells (1). Presented by the same MHC molecule, peptides with single amino acid substitution have different binding affinities with the same TCR, thereby stimulating different physiological activity (2). Some peptides with single amino acid substitution become immunodominant epitopes, others weaker epitopes, while others show no T cell activity. Thus, exploring the molecular principles underlying the selectivity of TCRs is of central interest in the quest for new vaccines.
The formation of the peptide, MHC, and TCR (TCR-pMHC) complex (see Fig. 1⇓) has been studied using appropriate biochemical and biophysical techniques including x-ray crystallography (2, 3) and surface plasmon resonance binding studies (4). However, the molecular recognition mechanisms that underlie TCR-mediated T cell activation are not fully understood. To address this question we need deeper understanding of the molecular interactions between peptide, MHC, and TCR. Experimental methods give only partial insights by measuring macroscopic properties such as rates and equilibrium constants complemented by the static picture provided by x-ray crystallography under far from physiological conditions. Theoretical approaches offer us the molecular insights we crave. A broad spectrum of sophisticated methods for the prediction of T cell epitopes has now evolved (5). One of the best routes to such prediction is via atomistic molecular dynamics (MD) simulations. Atomistic MD simulations can, in principle, provide the ultimate details of these dynamic phenomena. A variety of biomolecular systems of practical interest have been simulated to obtain a microscopic picture and thus understand the relationship between structure, function, and dynamics. One of the most promising approaches for achieving such an understanding is provided by free energy simulations (6).
In the present study, we seek to analyze the well-characterized HLA-A*0201 peptide A6 TCR system described by Ding and others (2, 3) and calculate the binding free energy difference of the wild-type Tax peptide (LLFGYPVYV) and its mutant Tax P6A (LLFGYAVYV). In the work of Ding et al. (2), interactions between three peptide variants of the human T cell lymphotropic virus type 1 Tax peptide bound to HLA-A2 and the A6 TCR were studied using a variety of experimental techniques: T cell assays, kinetic and thermodynamic measurements, and x-ray crystallography. Our aim is to supplement this using cutting-edge atomistic MD simulation of long duration. Our approach exploits recent novel developments in both hardware and software which permit increasingly large biologically interesting systems to be studied over relatively long time intervals using atomistic MD simulations. The HLA-A*0201-A6-TCR system has also been studied recently by Michielin and Karplus (7) who included only the mutated peptide residue and its local environment in their simulation, and which was also constrained using arbitrary stochastic boundary conditions (8). Most of the complexities of the system were ignored by forcing regions far from the mutated residue to be rigid. The agreement of the calculated and experimental (9) binding free energy difference between the wild-type and mutant peptides was remarkably good. However, despite the excellent agreement reported by these authors, the constraints used have an overt and substantial ad hoc character. The use of constraints is primarily for reasons of “computational efficiency”, as the full system may otherwise be out of reach with available computational resources. The general usefulness of such constraints remains open to question, and we will investigate it in the present paper. Recently, we showed that the binding of peptides within the MHC-binding groove can be accurately predicted only when the full, unconstrained peptide-MHC model is considered (10). This confirms the earlier general conclusions of Hansson et al. (11) that the properties of “established models” may change when using different methods in terms of system sizes, boundary conditions, and the treatment of longer range interactions. Our present purpose is to simulate the dynamic behavior of the HLA-A*0201-A6-TCR complex using minimal approximations or simplifications, allowing us to predict measured affinities for TCR-pMHC systems using very large scale nanosecond time scale MD simulations without constraints. Our paper is fundamentally a modeling study of TCR-pMHC systems. The rigorous approach we have adopted makes the scientific results of the present work a benchmark for future theoretical studies of this and related TCR complexes.
Materials and Methods
Free energy cycle
The binding free energy difference between two molecules can be obtained from the thermodynamic cycle (see Fig. 2⇓).
Free energy simulation
The free energy calculations model the “alchemical” transformation of the native peptide substrate into a mutated peptide, in both the pMHC and TCR-pMHC complexes. In this study, sequence differences between peptides are modeled as mutations. The sixth residue of the peptide is mutated from Pro to Ala. The transformation of one residue into another is performed alchemically by “phasing in” one part while “phasing out” the other. A single peptide is introduced with a hybrid residue at the sixth position. Atoms that are different with respect to any of their parameters are duplicated. They include the N and Cα backbone atoms of the sixth residue, and hydrogen atoms bonded to them, whose charges are changed during the mutation, as well as the side chains of the Pro and Ala residues, which appear/disappear during the mutation. The system is divided into three blocks (12) and the hybrid potential function is written in the form: where i = 1,2,3. The transformation is performed from the native peptide (Pro) to the mutated one (Ala). In this case, the subscript 2 and 3 represent Ala and Pro of the peptides, respectively, while 1 represents the rest of the system. As the alchemical transmutation parameter λ is changed from 0 to 1, the potential energy changes from that describing the proline-containing peptide to that describing the alanine-containing one. The term Ub represents all of the bonding interactions in the system, including torsional terms; they are not scaled. The Uijnb term corresponds to nonbonded interactions within each block (i = j) or between different blocks (i ≠ j). Only the nonbonded interactions between blocks 1 and 2, and between blocks 1 and 3 are scaled (13). In the thermodynamic integration (TI) method, the ensemble average of the derivative of the Hamiltonian with respect to λ, 〈∂U/∂λ〉, is calculated and then integrated to yield ΔG, where the integral is evaluated numerically over a number of thermodynamically equilibrated intermediate points or windows.
Free energy component analysis
The potential energy used in these simulations is the sum of pairwise interactions; hence it is possible to evaluate the contributions to the free energy change of individual atoms or groups of atoms: where i is an atom or group of atoms belonging to block 1 (Equation 2). Equation 4 shows that the free energy can be decomposed into specific components, ΔGi. These arise from the corresponding terms in the potential energy function representing specific atomic interactions. Because the ensemble average involves the total potential energy, each component has an implicit dependence on all the others. Although the contributions of individual atoms or groups of atoms are dependent on the path of the alchemical mutation (14), they have significant meanings when that path is carefully chosen and clearly defined (15). In practice, the path is determined by the hybrid potential energy in which the coupling parameter λ is changed, and by the relevant conformations of the system, when sampled in an adequate fashion. In this study, a single coupling parameter is used for the nonbonded interactions, as already used by Michielin and Karplus (7). Because no structural constraints are added in our simulations (see below), the molecular conformation is expected to be more flexible in our study than that reported in Michielin and Karplus’ work (7). Due to this and differences in the evaluation of the alchemical process, the free energy components are not expected to be the same as those determined by Michielin and Karplus (7), even though the same potential form is used.
In this study, we used NAMD (16), a highly scalable massively parallel MD code (17), running on 256 processors of Pittsburgh Supercomputing Center’s LeMieux, the most powerful supercomputer system in the world for open research. Our simulations show very substantial acceleration relative to single processor runs, reducing the time needed to simulate a nanosecond of a system of ∼100,000 atoms to 10 h. It required ∼75,000 CPU hours and 35 gigabytes of disk space for all the simulations.
We performed reference simulations of four systems (TCR-Tax(P6)-MHC, TCR-Tax(A6)-MHC, Tax(P6)-MHC, and Tax(A6)-MHC) to assess simulation stability. Explicit solvent molecular dynamics simulations and free energy calculations were performed using NAMD. The x-ray structures of the TCR-pMHC (2, 3) and pMHC (18) complexes were used as the initial structures for the simulations. There is no x-ray structure available for the free Tax(A6)-MHC complex. Instead its structure was built from the x-ray structure of the Tax(P6)-MHC, i.e., by taking the Cβ of the Ala residue to be equal to that of the Pro. Atomic charges, van der Waals (vdW), and stereochemical force-field parameters for the proteins were taken from the CHARMM22 all-atom force field (19). Periodic boundary conditions were applied. The XPLOR (X-PLOR version 3.1, Yale University) and CHARMM (20) packages were used for preparing the initial molecular models and analyzing the final results. The complexes were neutralized by adding counterions (Na+) and then solvated using TIP3P water molecules (21). The distance between the edges of the solvent box and the closest atom in the protein was 10 Å. This procedure resulted in a final atom count of 53,032 and 53,034 for the Tax(P6)-MHC and Tax(A6)-MHC models, and 102,143 and 102,990 for the TCR-Tax(P6)-MHC and TCR-Tax(A6)-MHC systems, respectively.
The electrostatic interaction was divided into two parts: a local component which was calculated every time step with a local interaction distance of 12 Å, and a long-range component which was evaluated every four time steps. Particle mesh Ewald (PME) (22) was used for the full electrostatic calculations. The vdW interactions were truncated using the switch method with a switch range of 10–12 Å. Hydrogen to heavy atom bonds, and the internal geometry of the water molecules, were constrained using SHAKE (23) and the equations of motion were integrated using a Verlet algorithm with a 2 fs time step. The systems were maintained at 300 K and one atmosphere pressure using a weak-coupling algorithm (24) in the isothermal-isobaric ensemble (NPT).
To properly prepare the models for production runs, the following protocol was adopted to thermally equilibrate the x-ray structures. First, potential energy minimizations were run with position restraints on heavy protein atoms to remove any specious high energy contacts. Then a series of simulations were performed with the backbone atoms, and initially also the side-chain heavy atoms, harmonically restrained at their x-ray positions. This was followed by free dynamics, assigning random velocities based on a Maxwellian distribution at 100 K as the initial temperature. The systems were heated to the final temperature of 300 K by periodically reassigning the velocities. Then a 2 nanosecond (ns) production simulation was run for each complex.
Free energy simulations
Essentially the same procedure was used for the alchemical simulations. The hybrid solutes were solvated using periodic boundary conditions, as described above. In this case the final simulation cells had 53,040 atoms for the hybrid Tax(P6A)-MHC, and 102,151 atoms for the hybrid TCR-Tax(P6A)-MHC, respectively. Simulations were run with the alchemical transmutation parameter of Equation 2 λi = 0.02, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, and 0.98, increasing from the initial value λ1 = 0.02. The x-ray structures were energy minimized, then gradually heated to the desired temperature of 300 K as above. The resulting structure was used as the starting point for the simulation at λ1 = 0.02. For the other values of λ, the initial structure used was that given by equilibration from 200 picosecond (ps) simulation performed at λi−1. For each λi value, the system was first equilibrated for 200 ps using the potential U(λi), followed by a 1 ns production simulation for the TCR-pMHC system, and 1.8 ns for the pMHC system. The configurations of trajectories were saved every picosecond, i.e., every 500 time steps. The derivatives of the Hamiltonian (Equation 3) at the end points (λ→0 and λ→1) were extrapolated using two approximations for the vdW interactions. The first scheme assumed that the mean second derivatives of the free energy are identical (linear approximation) for λ = 0→0.02 and λ = 0.02→0.1, and for λ = 0.9→0.98 and λ = 0.98→1.0. The second scheme assumed that the free energy derivative is proportional to λ−3/4 or (1 − λ)−3/4 at the end points (25). The average value of these two approximations was used as the vdW contribution to the free energy (13). The end points of the electrostatic component were extrapolated using the linear approximation only because the electrostatic interaction changed smoothly at these points. The block average method (26) was used to calculate the average properties and their uncertainties. The data was divided into blocks and deviations of the block averages from the total trajectory average were calculated to obtain the variance.
The root-mean-square (RMS) deviations were analyzed after least square fitting of main chain atoms to their x-ray-defined coordinates at the Ag-binding site. This site consists of two long α-helices and an eight-stranded antiparallel β-sheet. The atoms in other domains and atoms within the loops of the α1-α2 domain were not included in the fitting because of the interdomain motions and the flexibility of the loops. The RMS deviations/fluctuations of the complexes which bind Tax(A6) and Tax(P6) have similar behavior for the α1-α2 domain, peptide and complementarity determining regions (CDRs) of the TCR. Hence only those for Tax(P6) are plotted. Fig. 3⇓ shows the time evolution of the RMS deviations for the α1-α2 domain and the peptide from the x-ray structures. In the pMHC system, the RMS deviations of the peptide are comparable with those of the atoms within the α1-α2 domain, while in the TCR-pMHC system the RMS deviations of the former are smaller than those of the latter. For the TCR-pMHC system, it takes ∼400 ps for the α1-α2 domain to attain equilibrium, and 800 ps for the peptide to equilibrate. For the pMHC, the RMS deviation increased slowly during the first 1 ns of the simulation, while the peptide appears to have reached equilibrium as a result of the original thermal equilibration process itself. In the following analyses, trajectories from the final 1 ns of both simulations were used.
In Fig. 4⇓ the RMS deviations of heavy atoms within each residue in the MD trajectories are compared with those from corresponding x-ray structures. The two long α-helices and the β-sheets within the α1-α2 domain have relatively small deviations (1.14 Å for bound and 1.00 Å for unbound, on average), while the loop regions have large deviations (1.69Å for bound and 1.81Å for unbound, on average) from the x-ray structures. Most of the residues in the α-helices have obviously larger deviations in pMHC than those in TCR-pMHC because these residues in pMHC are directly exposed to the solvent, while they are buried away from solvent within the TCR-pMHC complex. The residues P50 to E63 are an exception, being located around the “high point” of the α1-helix. By contrast, most of residues in the β-sheets have slightly smaller RMS deviations in the pMHC complex than those in the TCR-pMHC complex. The Tax residues have comparable RMS deviations in bound and unbound states, except for the first (1.51 Å vs 2.68 Å) and fifth (1.09 Å vs 3.73 Å). In the unbound state, main chain and side chain atoms in the first residue undergo conformational changes because it is partially exposed to the solvent. These conformational changes are not expected to affect the rest of the peptide or the MHC because experimental evidence indicates that there is no global shift or other conformational change even after the first residue is removed (18). The fifth residue is also exposed to the solvent and exhibits a high RMS conformational change from the x-ray structure. This residue is clearly a good candidate for establishing strong interactions with the TCR. The CDR3s of the TCR, which dominate the interactions between the TCR and the pMHC, also have small RMS deviations (1.39 Å for CDR3α and 1.40 Å for CDR3β). The other CDRs have reasonable RMS deviations (<3.03 Å). The RMS deviations of the regions far from the binding site (the constant domains and parts of the variable domains of TCR, α3, and β2m domains; see Fig. 1⇑) are relatively large because of interdomain motions. If we isolate each domain and then fit it to its x-ray structure, the calculated RMS deviation for each domain is relatively small. Although a 2 ns simulation is not long enough for a complete sampling of the global motions of such a large system, local motion around the binding site is well-sampled on this time scale.
It is instructive to compare structural differences in the various simulations with different peptides, especially because only one path from Tax(P6) to Tax(A6) was used in the alchemical mutation of pMHC and TCR-pMHC. The averaged RMSDs of different regions are listed in Table I⇓, both for backbone atoms and for all heavy atoms. The α-helices and β-sheets are very similar in the different simulations. The backbone RMSDs for the peptide and MHC are comparable: the difference is equivalent to that between structures determined by x-ray crystallography and by nuclear magnetic resonance spectroscopy (27). Previous studies have shown that differences on the order of 1 Å produce only small changes in calculated free energies (28). Thus, the agreement between the simulations is sufficiently close to ensure meaningful data on binding free energies. The RMSD values are smaller in the unbound simulations compared with those of the bound simulations, except for the averaged RMSD of all heavy peptide atoms. This is not unexpected because the flexibility of peptide is induced by exposure of the side chains to the solvent in the unbound state, while it is buried by the TCR in the bound state.
Comparison of the deviations of all the main chain atoms with those of the secondary structures indicates, unsurprisingly, that the loops have larger deviations than the secondary structures. The RMS differences between the mean structures of Tax(P6)-MHC and Tax(A6)-MHC, and between TCR-Tax(P6)-MHC and TCR-Tax(A6)-MHC, are shown in Fig. 5⇓. The backbone deviations are <1.0 Å for the atoms within the α-helices, except for Q54, P57, and E58 of the α1-helix and H151 of the α2 helix in the unbound state, and most of the residues from P50 to T64 of the α1-helix and A149, A150 of the α2-helix in the bound state. These residues lie around the breaks in the two helical regions, which form the “high points” on the binding surface. The “high points” impose a critical constraint which causes the TCR to dock with a pMHC ligand in “diagonal mode” (29). E58 is at the “high point” and from P57 to P50 the chain runs away from the TCR-binding surface. It is therefore unsurprising that these residues have large RMS deviations because they are exposed to solvent. Although these residues are associated with significant RMS deviations, none of them is implicated in direct contact with the single mutated residue of the peptide.
The initial structure of Tax(A6)-MHC was taken from the crystal structure of Tax(P6)-MHC, because there is no crystal structure available for the former. To assess whether the mutation leads to modifications of the conformational and/or dynamical properties, it is interesting to compare the residue-by-residue deviations and fluctuations of two unbound simulations, as suggested by Michielin and Karplus (7). The deviation per residue is the mean deviation, with respect to the Tax(P6)-MHC x-ray structure, of each conformation during the final 1 ns trajectory of the simulation while the fluctuation per residue is the deviation with respect to the average structure of the final 1 ns simulation.
Most of the deviations are <1.4 Å for atoms within the α-helices, except for the P57 and E58 residues of the α1-helix and H151 of the α2-helix (see Fig. 6⇓). These three residues have large deviations because they are at the breaks of the helices and are exposed to the solvent. As they point away from the peptide, they are not expected to interact with the peptides directly. The atoms in the β-sheet are much more static in both simulations with deviations <1.0 Å, because none of the atoms in the β-sheet are exposed to the solvent. The correlation coefficients were calculated separately for α-helices, β-sheets, and loops, using all the atoms in each region. They varied from 0.45 for the α-helix to 0.64 for the β-sheet. Although the correlations are not as high as those reported in constrained MD simulations (7), there are no large structural or dynamical differences observed in the binding site of MHC when the wild-type peptide is replaced by the mutant. The secondary structures are stable during the simulations with fluctuations <0.6 Å for the β-sheets and 1.0 Å for the α-helices. Some residues in the loops are flexible, with fluctuations up to 2 Å. The correlation coefficients are 0.68, 0.73, and 0.91 for α-helices, β-sheets, and loops, respectively.
In Fig. 7⇓, the deviations and fluctuations of the two peptides are shown for two simulations. The side chain of L1 has larger fluctuations in the Tax(P6)-MHC than that in the Tax(A6)-MHC. This residue also has large deviations from the x-ray structure in both simulations. The side chain of Y5 has large deviations because of the length of the residue and because its orientation is different in the two simulations. The distances of the Cβ atoms of the P6 and A6 (which are almost identical) from the Y5 ring are 5.77 Å and 7.95 Å, respectively. Such a difference is not unexpected owing to the length of the Y5 side chain and its exposure to the solvent. Although the deviations are reasonably correlated (correlation coefficients 0.85 for the backbone and 0.74 for all heavy atoms), the fluctuations have essentially no correlation (correlation coefficients <0.1). However, high correlations of the deviations and fluctuations are not necessary to obtain meaningful binding free energies, provided there are no large structural rearrangements in the binding site (28).
Free energy calculations.
To obtain quantitative thermodynamic predictions, we have conducted alchemical free energy simulations where the sixth residue of the peptide is mutated from Pro into Ala, in both the pMHC and TCR-pMHC complexes in explicit solvent. The derivative <∂U/∂λ>λ was integrated using a trapezoidal method, except at the end points of each run. Two extrapolation methods for the free energy derivative to λ = 0 and λ = 1 were described above. The free energies reported use the average result from the two extrapolation methods, whose results differ typically by ∼2–3 kcal/mol. The nonlinear extrapolations at endpoints were obtained by fitting functions λ−3/4 or (1-λ)−3/4 to the first and last two 〈∂U/∂λ〉 values (vdW terms), respectively. The uncertainties are calculated by block average methods using 10 blocks.
The calculated free energy difference (Table II⇓) is in reasonable agreement with the experimental free energy difference (−2.9 ± 0.2 kcal/mol) (9). The vdW contribution stabilizes the wild-type peptide, while the electrostatic contribution stabilizes the mutated peptide. The total relative binding free energy shows that the wild-type peptide is more stable than the mutated one because the vdW contribution dominates the free energy change. This contradicts Michielin and Karplus’ calculation (7), who reported that both the vdW and the electrostatic contributions were favorable to the wild-type peptide. In the next section we provide a more detailed quantitative analysis of various contributions to the computed relative binding free energy difference.
Free energy component analysis.
The role of different atoms or groups of atoms in determining binding affinity can be characterized in more detail by their contributions to the P6 to A6 free energy differences. As noted previously, individual contributions from different groups, unlike the total free energy difference, are not state functions (30), and therefore depend on the transformation pathway used in the simulations. In particular, free energy components calculated along an alchemical transmutation pathway, such as that used here, can provide insight into those interactions that are important for binding and specificity (31).
The vdW contributions from MHC, TCR, and water favor the wild-type peptide, while their electrostatic contributions favor the mutated one (Table III⇓). The peptide (excluding the mutated residue) exhibits converse stabilities, favoring the mutated peptide by virtue of the vdW interactions. At the endpoint (λ→1) of the bound simulation, two methylene groups of the proline residue disappear. The hydrogen atom of the NH group of the alanine residue then forms a hydrogen bond with the oxygen of the glycine residue at the fourth residue of the peptide, thereby introducing large vdW interactions between the vanishing atoms and the carbonyl group of glycine. The contributions from the peptide are almost compensated by those from the MHC protein. All of the residues in the α1-α2 domain with deviations larger than 1 Å (see above) make negligible contributions to the free energy. The P1 residue of the peptide make a small contribution (<0.1 kcal/mol) while Y5 contributes significantly (∼−1.34 kcal/mol). It is interesting to note that although our calculated free energy difference is in reasonable agreement with Michielin and Karplus’ calculation, the component analysis is different. Our calculation indicates that the MHC, TCR, and water favor the wild-type peptide, while the peptide itself favors the variant; in Michielin and Karplus’ calculation all four groups favor the wild type.
The contributions from TCR are further decomposed to different CDRs and the results are listed in Table IV⇓. The vdW contributions from all CDRs favor the wild-type peptide. The electrostatic contributions from CDRs of the Vα domain prefer the wild-type peptide, while contributions from CDRs of the Vβ domain favor the mutated peptide. The CDR1 loops of the TCR are positioned over the termini of the peptide, while the CDR2 loops are over the MHC molecule. None of them enjoy any contacts with the sixth residue, and all make small or negligible contributions to the free energy difference between epitopes. The CDR3 loops of both Vα and Vβ meet at the centre of the binding site over the bound epitope and they dominate the contributions to the free energy difference from the TCR. The principal contribution is from CDR3α whose electrostatic and vdW contributions both favor the wild-type peptide. The electrostatic and vdW contributions from the CDR3β have different signs, reducing its overall energetic contribution which is favorable to the mutant and second largest in magnitude.
Free energy derivative behavior.
The λ dependence of the free energy derivatives is shown in Fig. 8⇓ for the alchemical transmutation of Tax(P6) to Tax(A6). It should be noted that the curves are indeed dependent on the mutation pathways, even though the free energy difference is independent of path. Nevertheless, insight into the mutation process may be obtained from the λ dependence (28). The vdW free energy derivative steadily decreases from λ = 0.0 to 0.9, followed by a sharp decrease at the endpoint λ→1. At this endpoint, there is a significant repulsive potential caused by the rapidly increasing vdW interactions. This results in the largest changes in the thermodynamic quantities. Hence the contributions from λ = 0.98→1 are important (−1.74 kcal/mol for bound and −2.24 kcal/mol for unbound states). In the unbound state, the volume previously occupied by two methylene groups is expected to be occupied by water molecules, with concomitantly large changes in the free energy; while in the bound state the volume is already taken up by the TCR loops, although there are some adjustments when the original atoms disappear. The occupation of water molecules is significantly different around the sixth residue of the wild-type and mutant peptides when bound to the TCR. The Cδ atom of proline of the Tax is expected to be replaced by one water molecule when the proline residue is mutated to alanine. During the alchemical transmutation of Tax to Tax P6A, no water is observed to enter the binding site; rather, the local conformation adjusts to fill the cavity left by the side chain of proline. The release of the water molecule, which should be easier to observe in the mutation from alanine to proline, is a process that manifestly contributes to the binding free energy (32). Unlike the vdW λ-derivative, the electrostatic free energy λ-derivative decreases smoothly for increasing λ (Fig. 8⇓), in a roughly linear fashion.
Convergence of the potential energy derivatives.
The behavior of 〈∂U/∂λ〉λ as a function of the simulation time is shown in Fig. 9⇓ for the bound and unbound simulations. Each curve represents the cumulative average computed over the course of a simulation for each value of λ. The convergence of the free energy derivatives is excellent. The cumulative averages all converge within 200 ps except for the endpoint windows (λ = 0.9 and 0.98 for bound and 0.98 for unbound), which require ∼400 ps for λ = 0.9 and 0.98 in the bound simulations, and ∼700 ps for λ = 0.98 in the unbound simulations. Even in previously reported MD-based TI studies, convergence is often not convincingly attained, to say nothing of the requirements for very large scale simulations like the present one.
The use of scalable MD codes running on state-of-the-art high performance computers enables us to compute biomolecular properties deemed inaccessible only a few years ago. Thus, we can tackle problems head on, using increasingly realistic representations of the system in terms of greater size and duration of simulation. In this study, we have revisited the TCR-pMHC system previously studied by Michielin and Karplus (7), using a less tendentious representation of the TCR-pMHC system: we used a full atomistic model of the solvated TCR-pMHC to mimic the surroundings of the molecule; three-dimensional periodic boundary conditions were applied to represent an infinite system, together with the particle mesh Ewald summation method to compute the long-range electrostatic interactions accurately. It is encouraging to note that our MD simulations of such complex molecular assemblies produce stable trajectories over the multinanosecond time scale without the need for artificial restraints which, in turn, sheds new light on dynamic and energetic properties of the system.
The analysis of the binding free energy difference between wild-type (Tax) and mutant (Tax P6A) peptides indicates that the wild-type (Tax) peptide has a larger affinity for the TCR than the latter. Our calculated binding free energy difference is in reasonable agreement with the experimental value. The total calculated free energy difference also agrees with previous calculations. However, the four contributions from different contributing domains (TCR, MHC, water, and the unchanged part of peptide) exhibit large differences from those computed in previous work. For example, while the self-interaction of the peptide is a relatively stabilizing contribution to the wild-type complex in the calculation of Michielin and Karplus (7), it makes the opposite contribution in our calculations. To compare the results of the two papers, properly and objectively, one must take into account the different treatment of the environment of the system in the two simulations. Although the same form of hybrid potential for thermodynamic integration was used in the two set of calculations, in most other respects, our improved simulation technology has allowed us to dispense with many of the arbitrary constraints and approximations used by Michielin and Karplus. The major differences in the models lead to different results. In particular, it is clear that component contributions to the free energy difference are more sensitive to structural differences than is the difference in total free energy. When one knows the outcome required of a simulation, it is possible to select parameters, constraints, and approximations that allow such a result to be obtained. To use a simulation methodology to examine novel systems, and to thus make true predictions, one must seek–as we have here–a situation in which as few approximations are made as possible. In this way we can hope, through the use of supercomputing, to address realistic systems and time scales while avoiding random bias, any residual systematic bias being due to remaining flaws in the simulation methodology itself; these include our use of SHAKE constraints as well as lack of incorporation of quantum mechanical effects, both of which can be expected to contribute to inaccuracies in the thermodynamic behavior of these systems.
Although our model is more realistic than those used previously, it nevertheless remains limited in its description of the system. The system is still small, with many other coreceptors and accessory molecules, such as CD4 and CD8 proteins, being involved in T cell recognition, as part of what is sometimes referred to as the “immunological synapse” (33). Including these in the simulations is not straightforward as there is no crystallographic data currently available. Moreover, current computational resources limit both the system size and the simulation time. Some conformational changes in the protein occur over long time scales: for example, the interdomain motions and the motion of the loops, especially the CDR loops. A well-known technical problem arises at the endpoint of the alchemical transmutation produce, which makes a large contribution to the free energy difference in the present calculations. A modified formulation of the interaction potential energy function can be used to avoid possible endpoint singularities when atoms are created or annihilated (34).
The agreement of the computed free energy difference with the experimental data is encouraging. However, the alchemical transmutation calculations are computationally very expensive and most of the computer time is spent on uninteresting configurations that correspond to nonphysical superpositions of the initial and final physical states. Alternative but more approximate approaches to the calculation of binding affinities focus on computing the binding energetics from simulations of the physical states of the ligand. In a related study (35), we have performed free energy calculations on the same system, using molecular mechanics Poisson-Boltzmann surface area and linear interaction energy methods. These approaches are more computationally expedient than the TI method, albeit relying on some quite stringent assumptions. Our present paper has the additional benefit of being an excellent benchmark for testing various approximations such as these that can speed up free-energy calculations.
The presentation of peptides by the MHC is often viewed as the most discriminatory step of the overall presentation process. Although we define recognition as the interaction of TCR and pMHC, many subsequent steps are involved in the actual activation of T cells. These involve a number of complicated, concurrent, and multifaceted signaling events. Indeed, recognition is not, itself, an isolated event: when peptide-bound MHCs (or pMHC complexes) are recognized by TCRs, many other coreceptors and accessory molecules, in addition to CD4 and CD8 molecules, are also involved. Ligation of TCRs by pMHC complexes (pMHC) leads to a variety of downstream events, including clustering of TCRs, rapid compartmentalization of the signaling machinery into lipid rafts, and a whole group of complex intracellular signaling events. Our overall understanding of the recognition process remains poor. Nonetheless, it has emerged that the process involves the formation of the so-called immunological synapse (36), a highly organized, spatiotemporal arrangement of receptors and accessory molecules of many types. The most crucial event is the phosphorylation of several residues in the CD3 chain resulting in activation of ZAP70, which initiates further signaling. The activated T cell down-regulates surface TCR expression and up-regulates several surface markers, among them CD69 and CD25. In the case of CTL, this process is accompanied by the production of effector molecules such as perforin and IFN-γ or other cytokines as well as surface expression of CD95. Ultimately, the accurate modeling of all these complex processes will be required to gain full understanding of the process of epitope presentation. This is true systems biology (37). Aspects are currently being investigated using transcriptomic analysis, but beyond this there is a pressing need for all underlying molecular mechanisms to be studied in detail. Molecular information is necessary, not only about the isolated structures of the molecules involved but also on their interactions, as is data on the kinetics and thermodynamics of their organization in time and space. Large scale molecular dynamics will play a pivotal role in deciphering the mechanistic processes underlying these events.
The authors have no financial conflict of interest.
The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked advertisement in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.
↵1 This work was supported by the Engineering and Physical Science Research Council through RealityGrid Grant GR/R67699, by the National Science Foundation under National Resource Allocation Committee Grant MCA04N014, and under Partnership for Advanced Computational Infrastructure Grant ASC030006P. Our local SGI Onyx2 computer was funded by Partnership for Advanced Computational Infrastructure/Higher Education Funding Council for England. The Edward Jenner Institute for Vaccine Research is sponsored by: Biotechnology and Biological Sciences Research Council/Medical Research Council, GlaxoSmithKline, and the U.K. Department of Health.
↵2 Address correspondence and reprint requests to Dr. Peter V. Coveney, Centre for Computational Science, Department of Chemistry, University College London, 20 Gordon Street, London WC1H 0AJ, U.K. E-mail address:
↵3 Abbreviations used in this paper: pMHC, peptide-MHC; MD, molecular dynamics; TI, thermodynamic integration; vdW, van der Waals; RMS, root-mean-square; CDR, complementarity-determining region; ps, picosecond; ns, nanosecond.
- Received November 5, 2004.
- Accepted May 18, 2005.
- Copyright © 2005 by The American Association of Immunologists