NASSAM: a server to search for and annotate tertiary interactions and motifs in three-dimensional structures of complex RNA molecules

Email for queries and to report errors.

Graph theory, RNA structural motifs, RNA base arrangements

Input file type:

  • NASSAM accepts ASCII PDB formatted coordinate files of RNA structures.
  • PDB files with protein/amino acid chains such as RNA bound to proteins are also accepted and processed.
  • Files of structures solved by NMR that contain multiple conformations will be processed, but each hit will likely be reported multiple times, once for each conformation.
  • NASSAM has been tested with crystallographic RNA structures from the PDB solved to a minimum resolution of 3 Å, including the prokaryotic and eukaryotic ribosomal subunits available up to December 2011.
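
As a rough illustration of the input handling described above, the sketch below reads PDB-formatted text, keeps only RNA residues (so protein chains are skipped), and groups atoms by model so that the conformations in an NMR file stay separate. It is not NASSAM's own parser, which is not described on this page; the function name `read_rna_atoms` and the restriction to the residue names A, C, G and U are assumptions made for the example.

```python
from collections import defaultdict

# Assumption: only the four standard RNA residue names are kept;
# modified bases would need to be added to this set.
RNA_RESIDUES = {"A", "C", "G", "U"}


def read_rna_atoms(pdb_text):
    """Group RNA atom records by model number.

    Returns {model: [(chain, residue_number, residue_name, atom_name, (x, y, z)), ...]}.
    """
    models = defaultdict(list)
    model = 1  # files without MODEL records are treated as a single model
    for line in pdb_text.splitlines():
        record = line[:6].strip()
        if record == "MODEL":
            model = int(line[10:14])  # model serial number, columns 11-14
        elif record in ("ATOM", "HETATM"):
            residue_name = line[17:20].strip()
            if residue_name not in RNA_RESIDUES:
                continue  # skip protein residues, water and ligands
            xyz = (float(line[30:38]), float(line[38:46]), float(line[46:54]))
            models[model].append(
                (line[21], int(line[22:26]), residue_name, line[12:16].strip(), xyz)
            )
    return dict(models)
```

Searching each model separately is one way to see why a multi-conformation file can report the same hit once per conformation.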

The NASSAM search approach
The Nucleic Acid Search for Substructures and Motifs (NASSAM) program accepts three-dimensional RNA crystallographic structures formatted as PDB files as input queries and searches them against a database of 3D base arrangements that includes base pairings, base triple arrangements, A-minor motifs, kink-turns, ribose-zippers, tetraloops and T-loop motifs. The motif database consists of graph representations of base arrangements designed from patterns reported in the literature.

The input file is processed into a matrix containing the distances (edges) between pseudoatom vectors (nodes), which effectively represents the orientation of the bases with respect to each other as a graph relationship (Figure 1). The matrix representing the query structure is then compared against a database of pattern matrices that represent arrangements and motifs of RNA bases.
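
The matrix construction can be sketched as follows. This is a minimal illustration, not NASSAM's implementation: it assumes each base has already been reduced to one vector with a start and an end pseudoatom, and that each edge stores the four start/end distances between two such vectors. The pseudoatom placement NASSAM actually uses is not specified on this page, and the names `edge_distances` and `build_distance_matrix` are hypothetical.

```python
import math
from itertools import combinations


def edge_distances(base_a, base_b):
    """Return the (start-start, end-end, start-end, end-start) distances
    between two base vectors, each given as a (start_xyz, end_xyz) pair."""
    (start_a, end_a), (start_b, end_b) = base_a, base_b
    return (
        math.dist(start_a, start_b),
        math.dist(end_a, end_b),
        math.dist(start_a, end_b),
        math.dist(end_a, start_b),
    )


def build_distance_matrix(bases):
    """bases: {label: (start_xyz, end_xyz)}.

    Returns {(label_i, label_j): edge_distances} for every unordered pair,
    with labels in sorted order, i.e. the upper triangle of the matrix.
    """
    return {
        (a, b): edge_distances(bases[a], bases[b])
        for a, b in combinations(sorted(bases), 2)
    }


# Two parallel base vectors, 4 Å apart and 3 Å long.
bases = {
    "A1": ((0.0, 0.0, 0.0), (0.0, 0.0, 3.0)),
    "U2": ((4.0, 0.0, 0.0), (4.0, 0.0, 3.0)),
}
matrix = build_distance_matrix(bases)
```

Because the distances are invariant under rotation and translation, two structures can be compared through such matrices without first superimposing them.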

The search, a comparison of the query graph (the RNA structure) against graphs representing RNA base arrangements, is done using the Ullmann subgraph isomorphism algorithm. Since totally rigid matches between the base arrangements of the query and the database patterns are not realistic, the program incorporates a distance tolerance parameter that permits a controlled amount of deviation from the distances set in the patterns. Raising the distance tolerance can also reveal correctly arranged motifs that fall outside the default search parameters, as well as novel motifs: patterns absent from the database that are retrieved when a high distance tolerance lets an available pattern evolve into a new one (see Firdaus-Raih et al. 2011). In general, the default distance tolerance is set at 30% because this value was determined to give the best balance between recall and precision (see Table 1 and Harrison et al. 2003).
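
The effect of the distance tolerance can be illustrated with a small matcher. The sketch below uses plain backtracking rather than the Ullmann algorithm (it omits Ullmann's refinement step for pruning candidates), stores a single distance per edge for brevity, and interprets the tolerance as a fraction of the pattern distance. All three choices, along with the function names, are assumptions made for the example and not a description of NASSAM's code.

```python
def within_tolerance(pattern_d, query_d, tolerance=0.30):
    """True if query_d deviates from pattern_d by at most tolerance * pattern_d."""
    return abs(query_d - pattern_d) <= tolerance * pattern_d


def find_matches(pattern_labels, pattern_dist, query_labels, query_dist, tolerance=0.30):
    """Find every assignment of pattern nodes to distinct query nodes whose
    base types agree and whose pairwise distances agree within the tolerance.

    pattern_labels / query_labels: lists of base types, e.g. ["A", "U"].
    pattern_dist / query_dist: {(i, j): distance} with i < j node indices.
    Returns a list of tuples; position k holds the query node matched to
    pattern node k. Symmetric patterns yield one tuple per symmetric mapping.
    """
    def query_distance(a, b):
        return query_dist[(a, b) if a < b else (b, a)]

    matches = []

    def extend(assigned):
        k = len(assigned)  # index of the next pattern node to place
        if k == len(pattern_labels):
            matches.append(tuple(assigned))
            return
        for q, label in enumerate(query_labels):
            if q in assigned or label != pattern_labels[k]:
                continue
            # The new node must agree with every node already placed.
            if all(
                within_tolerance(pattern_dist[(i, k)], query_distance(assigned[i], q), tolerance)
                for i in range(k)
            ):
                extend(assigned + [q])

    extend([])
    return matches


# Pattern: an A and a U 10 Å apart. Query: one A and two U bases.
pattern_labels, pattern_dist = ["A", "U"], {(0, 1): 10.0}
query_labels = ["A", "U", "U"]
query_dist = {(0, 1): 12.0, (0, 2): 14.0, (1, 2): 5.0}

strict = find_matches(pattern_labels, pattern_dist, query_labels, query_dist, tolerance=0.30)
relaxed = find_matches(pattern_labels, pattern_dist, query_labels, query_dist, tolerance=0.50)
```

In this toy query, the second U lies 14 Å from the A, so it is rejected at a tolerance of 30% but accepted at 50%. This mirrors how a higher tolerance can retrieve arrangements that lie outside the usual parameters of a pattern.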

By default, each NASSAM run also includes calculations that identify the hydrogen bonding interactions between bases in the query RNA structure (Figure 2). Since the definitions of many RNA patterns and motifs also include hydrogen bonding interactions, users can use the generated list of hydrogen bonds to check whether the retrieved hits conform not only to the 3D arrangement of the bases but also to any additional requirements, such as the presence of specific hydrogen bonds.

NASSAM output

  • Hits to patterns in the database are provided as a list (Figure 3).
  • The NASSAM output presents a list of the motifs and base interactions found in an RNA structure using common nomenclature as reported in the literature.
  • The NASSAM hits can be further selected to be viewed using a Jmol molecular viewer browser plugin window (Figure 4).

Utility of NASSAM

  • Currently, the primary users of NASSAM are expected to be RNA crystallography and RNA bioinformatics groups.
  • NASSAM is expected to be highly useful for RNA structure annotation and analysis for not only currently available structures but also other RNA complex structures as they become available.

Potential search strategies
When using NASSAM to annotate the base interactions and motifs in an RNA structure, a quick way to get a general annotation of the motifs present is to run a NASSAM search at the default search tolerances for a specific motif of interest, for all interactions available in the database, or for all interactions except base pairs. Once a baseline of all the interactions and motifs present has been annotated, users can further increase the distance tolerance parameter. This extension of the search could reveal novel motifs or identify arrangements that fall outside the parameters usually defined for a particular motif but are still valid examples of it.

Demonstration run:

This is a demonstration NASSAM run searching for base triples in 3q1r.pdb, the structure of a bacterial ribonuclease P holoenzyme in complex with tRNA.

Precomputed example:
Click here to view the precomputed results of a NASSAM search using 3Q1R.pdb as input.

References for NASSAM
  1. Firdaus-Raih, M., Harrison, A-M., Willett, P., and Artymiuk, P.J. (2011) Novel base triples in RNA structures revealed by graph theoretical searching methods. BMC Bioinformatics, 12(S13), S2.
  2. Harrison, A-M., South, D.R., Willett, P., and Artymiuk, P.J. (2003) Representation, searching and discovery of patterns of bases in complex RNA structures. Journal of Computer-Aided Molecular Design, 17, 537-549.

Computational resources provided by the Genome Computing Centre, Malaysia Genome Institute