Introduction to Analysis of REMD simulations
The goal here is to perform initial processing of a set of Replica Exchange Molecular Dynamics (REMD) simulations. CPPTRAJ can process multiple runs using the 'ensemble' command, which reads in and processes trajectories as an ensemble. It is similar to 'trajin remdtraj', except that instead of processing one frame at a target temperature, it processes all frames. This means that action and trajout commands apply to the entire ensemble; note, however, that not all actions currently function in 'ensemble' mode.
This example uses the same system as the clustering tutorial, the RNA GACC tetranucleotide, simulated here with Temperature Replica Exchange Molecular Dynamics.
The files for this tutorial consist of the AMBER topology file and eight separate trajectory files, each generated at a different temperature:
- rGACC.nowat.parm7: Topology file, rGACC + 3 Na+ ions.
- rGAAC.nowat.001
- rGAAC.nowat.002
- rGAAC.nowat.003
- rGAAC.nowat.004
- rGAAC.nowat.005
- rGAAC.nowat.006
- rGAAC.nowat.007
- rGAAC.nowat.008
Note that although input is provided in files, users are encouraged to use the interactive mode to become more familiar with the CPPTRAJ workflow and command options.
The following CPPTRAJ script will read the REMD set of trajectories and perform some analysis on the data:
parm rGACC.nowat.parm7
ensemble rGAAC.nowat.001
strip :Na+
rms RNA first :1-4&!@H= mass out rmsd.dat
average avg.pdb :1-4
The 'ensemble' command is given only the first trajectory file. It then reads the rest of the series automatically, provided that the file names end in a serial number (here .001 through .008).
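The naming convention described above can be illustrated with a short Python sketch. This only mimics the idea of collecting files that share a base name and end in a numeric serial; the function name `ensemble_members` is hypothetical, and this is not how CPPTRAJ itself is implemented.

```python
import re
from pathlib import Path


def ensemble_members(first_file, directory="."):
    """List files sharing first_file's base name and ending in a serial number.

    Illustration of the naming convention only (hypothetical helper);
    CPPTRAJ's own file discovery may differ in its details.
    """
    base, _, serial = first_file.rpartition(".")
    if not base or not serial.isdigit():
        raise ValueError(f"{first_file!r} does not end in a serial number")

    # Same base name, followed by a numeric extension of the same width.
    pattern = re.compile(re.escape(base) + r"\.(\d{%d})" % len(serial))
    members = []
    for path in Path(directory).iterdir():
        match = pattern.fullmatch(path.name)
        if match and int(match.group(1)) >= int(serial):
            members.append((int(match.group(1)), path.name))

    # Return the names ordered by serial number.
    return [name for _, name in sorted(members)]
```

Calling `ensemble_members("rGAAC.nowat.001")` in the tutorial directory should list the eight trajectory files in order, which is a quick way to check that no replica is missing before starting CPPTRAJ.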
We will briefly look at each command in more detail:
- parm: Read the AMBER7-formatted topology file.
- ensemble: Read the trajectory file named rGAAC.nowat.001, then continue reading each remaining file in the series.
- strip: Remove the Na+ ions.
- rms: Calculate the mass-weighted ('mass') RMSD of the atoms selected by the mask :1-4&!@H=, using the first frame ('first') as the reference. The data set is named 'RNA'. The mask selects residues 1 through 4 and ignores every atom whose name begins with "H" (so no hydrogens are considered in the analysis). The output is written to a file called rmsd.dat.
- average: Create an average structure for each trajectory, considering residues 1 through 4 and all the frames available. Each file is saved in PDB format.
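To make the mask :1-4&!@H= more concrete, the following Python sketch applies the same selection logic to a handful of atoms. The atom list is hypothetical (it is not read from the topology file), and the function `in_mask` is an illustration of the mask's meaning, not CPPTRAJ's mask parser.

```python
def in_mask(resnum, atom_name):
    """Mimic the selection :1-4&!@H= for a single atom (illustration only)."""
    in_residues = 1 <= resnum <= 4            # :1-4  residues 1 through 4
    is_hydrogen = atom_name.startswith("H")   # @H=   names beginning with "H"
    return in_residues and not is_hydrogen    # &!    combine, negating the second part


# Hypothetical (residue number, atom name) pairs chosen for illustration.
atoms = [(1, "O5'"), (1, "HO5'"), (2, "C4'"), (4, "N1"), (4, "H8"), (5, "Na+")]

selected = [atom for atom in atoms if in_mask(*atom)]
print(selected)
```

Only the heavy atoms in residues 1 through 4 survive the filter; the hydrogens and the atom in residue 5 are excluded.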
Once your input has been read in, type run to begin trajectory processing and analysis. CPPTRAJ will create the files requested by the script. The file rmsd.dat contains 9 columns: the first holds the frame number, and the other 8 hold the RMSD values for each of the trajectories read by the ensemble command. There should also be 8 PDB files named avg.pdb.X (where X = 0 to 7), each containing the average structure of one individual trajectory.
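As a quick check of the output, the per-trajectory mean RMSD can be computed from rmsd.dat with a few lines of Python. This is a minimal sketch that assumes the file is whitespace-separated, that header lines begin with "#", and that the first column is the frame number; the function name `replica_means` is hypothetical.

```python
def replica_means(lines):
    """Return the mean of every data column that follows the frame column.

    Assumes whitespace-separated values and '#'-prefixed header lines.
    """
    sums = None
    count = 0
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and the header
        values = [float(field) for field in line.split()[1:]]  # drop frame number
        if sums is None:
            sums = [0.0] * len(values)
        if len(values) != len(sums):
            raise ValueError("inconsistent number of columns")
        for i, value in enumerate(values):
            sums[i] += value
        count += 1
    if count == 0:
        raise ValueError("no data rows found")
    return [total / count for total in sums]
```

For example, `with open("rmsd.dat") as fh: means = replica_means(fh)` should give a list of 8 values for this tutorial, one per trajectory.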
Copyright Thomas E. Cheatham III, Christina Bergonzo, Daniel Roe & Rodrigo Galindo-Murillo, 2015