RNA. 2012 May;18(5):900-14.
doi: 10.1261/rna.029041.111. Epub 2012 Mar 26.

LocARNA-P: accurate boundary prediction and improved detection of structural RNAs

Sebastian Will et al. RNA. 2012 May.

Abstract

Current genomic screens for noncoding RNAs (ncRNAs) predict a large number of genomic regions containing potential structural ncRNAs. The analysis of these data requires highly accurate prediction of ncRNA boundaries and discrimination of promising candidate ncRNAs from weak predictions. Existing methods struggle with these goals because they rely on sequence-based multiple sequence alignments, which regularly misalign RNA structure and therefore do not support identification of structural similarities. To overcome this limitation, we compute columnwise and global reliabilities of alignments based on sequence and structure similarity; we refer to these structure-based alignment reliabilities as STARs. The columnwise STARs of alignments, or STAR profiles, provide a versatile tool for the manual and automatic analysis of ncRNAs. In particular, we improve the boundary prediction of the widely used ncRNA gene finder RNAz by a factor of 3 from a median deviation of 47 to 13 nt. Post-processing RNAz predictions, LocARNA-P's STAR score allows much stronger discrimination between true- and false-positive predictions than RNAz's own evaluation. The improved accuracy, in this scenario increased from AUC 0.71 to AUC 0.87, significantly reduces the cost of successive analysis steps. The ready-to-use software tool LocARNA-P produces structure-based multiple RNA alignments with associated columnwise STARs and predicts ncRNA boundaries. We provide additional results, a web server for LocARNA/LocARNA-P, and the software package, including documentation and a pipeline for refining screens for structural ncRNA, at http://www.bioinf.uni-freiburg.de/Supplements/LocARNA-P/.

Figures

FIGURE 1.
STAR profile plots with annotations. In each profile plot, the dark regions indicate structure reliability, the light regions represent sequence reliability, and the thin line shows the combined column-reliability. The thick lines on top of B and C show the automatic prediction based on the STAR profile; below we indicate the known annotation by thinner lines. (A) STAR plot of an alignment of nine ncRNAs from the 7SK ncRNA family projected to the X. laevis sequence. The profile is annotated with a mountain plot of the consensus structure. (B) STAR plot for the LocARNA-P alignment of the miRNA cluster hg18, chr13, positions 90,800,800–90,801,699, projected to the human sequence; the known microRNAs are easily detected using our method. (C) STAR plot for the LocARNA-P alignment of the human gene gas5 (hg18, chr1, 172,099,662–172,103,748); the gene is aligned with four other mammalian sequences; the introns of human gas5 host 10 C/D-box snoRNAs.
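To illustrate how a columnwise reliability profile can be turned into predicted regions, here is a minimal Python sketch. It assumes a simple fixed-threshold rule, which is not necessarily the procedure LocARNA-P uses; the function name `call_regions`, the threshold of 0.5, and the minimum length are hypothetical choices made only for illustration.

```python
from typing import List, Tuple


def call_regions(profile: List[float],
                 threshold: float = 0.5,
                 min_len: int = 1) -> List[Tuple[int, int]]:
    """Return half-open (start, end) column intervals whose reliability
    stays at or above `threshold` for at least `min_len` columns.

    Illustrative thresholding only; an assumption, not the published method.
    """
    regions = []
    start = None  # start column of the currently open region, if any
    for i, value in enumerate(profile):
        if value >= threshold and start is None:
            start = i  # a reliable stretch begins here
        elif value < threshold and start is not None:
            if i - start >= min_len:
                regions.append((start, i))
            start = None  # the reliable stretch has ended
    # Close a region that runs to the end of the profile.
    if start is not None and len(profile) - start >= min_len:
        regions.append((start, len(profile)))
    return regions


# Toy profile with two reliable stretches (made-up values).
toy_profile = [0.1, 0.8, 0.9, 0.2, 0.7, 0.7, 0.7, 0.1]
regions = call_regions(toy_profile, threshold=0.5, min_len=2)
```

A fixed threshold is sensitive to noise in the profile, so a real pipeline would likely need smoothing or a fitted model; the sketch only shows the idea of reading boundaries off a profile.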
FIGURE 2.
Accurate ncRNA boundaries for Drosophilids RNAz screen. (A) Deviation from annotated boundaries. We compare the deviation of RNAz (red) with the deviation of the boundaries as determined with our method (green). When the notches around the medians do not overlap, there is strong evidence that the medians differ. We show results of our method in three variants, since the alignment quality could be expected to depend on the sequence orientation: first, always aligning the sequences in forward orientation (+); second, in reverse orientation (−); third, in the orientation of the ncRNA annotation (annotated). (B–D) STAR plots with LocARNA-P predictions (thick green lines on top), RNAz predictions (red lines below), and annotated regions. (B) LocARNA-P precisely locates the snoRNA:U5:38ABa annotated in FlyBase. (C) For tRNA:H:48F, our prediction is well correlated with the precursor (cyan line) as described by Frendewey et al. (1985) (FlyBase annotation). (D) In the case of tRNA:N5:42Af, the magenta line shows the tRNA precursor, including the flanking region given by Lofquist and Sharp (1986). Here, RNAz indicates a 3′ extension, whereas LocARNA-P indicates the structure in the 5′ part of the precursor. As shown by Lofquist and Sharp (1986), the 5′-flanking regions of the tRNA5Asn genes differentially arrest RNA polymerase III.
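The boundary deviation compared in panel A can be illustrated with a short Python sketch. It assumes that deviation is measured as the absolute distance in nucleotides between each predicted boundary and the corresponding annotated boundary, pooled over both ends of every locus; the exact definition used in the paper may differ, and the function name and coordinates are hypothetical.

```python
from statistics import median
from typing import List, Tuple

Interval = Tuple[int, int]  # (start, end) coordinates in nt


def median_boundary_deviation(pairs: List[Tuple[Interval, Interval]]) -> float:
    """Median absolute deviation (in nt) between predicted and annotated
    boundaries, pooling the start and end of every locus.

    Assumed definition for illustration, not taken from the paper.
    """
    deviations = []
    for (pred_start, pred_end), (ann_start, ann_end) in pairs:
        deviations.append(abs(pred_start - ann_start))
        deviations.append(abs(pred_end - ann_end))
    return median(deviations)


# Made-up (predicted, annotated) loci.
toy_pairs = [((10, 50), (12, 50)), ((100, 180), (90, 185))]
toy_median = median_boundary_deviation(toy_pairs)
```

Using the median rather than the mean keeps a few badly mispredicted loci from dominating the summary, which matches the notched box plots described in the caption.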
FIGURE 3.
(A) Distribution of predicted lengths of 5′ and 3′ flanking regions for tRNAs. The figure omits four outliers with 3′-trailers longer than 100. (B) Discriminating ncRNAs. ROC curves for discriminating RNAz loci, which are positives of an RNAz screen, by RNAz itself (using RNAz max P) and after rescoring with LocARNA-P by the STAR discriminator.
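The area under a ROC curve such as those in panel B equals the probability that a randomly chosen true positive scores higher than a randomly chosen false positive. The following Python sketch computes the AUC from that rank-based definition. The scores are made up and the function name `auc` is hypothetical; it does not reproduce the paper's data or its evaluation code.

```python
from typing import Sequence


def auc(pos_scores: Sequence[float], neg_scores: Sequence[float]) -> float:
    """Area under the ROC curve via pairwise comparison.

    Counts the fraction of (positive, negative) pairs in which the
    positive scores higher; ties count as half a win.
    """
    if not pos_scores or not neg_scores:
        raise ValueError("need at least one positive and one negative score")
    wins = 0.0
    for p in pos_scores:
        for n in neg_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos_scores) * len(neg_scores))


# Made-up discriminator scores for true and false positive loci.
toy_auc = auc([0.9, 0.8, 0.4], [0.5, 0.3])
```

The pairwise loop is quadratic in the number of loci, which is fine for illustration; a sort-based rank computation would be preferable for large screens.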
FIGURE 4.
Inside and outside decomposition by the recursions. (A) Inside. The gray inside regions correspond to the matrix ZM and the white inside region to ZD. (B) Outside. The gray outside regions correspond to entries in ZM; the white outside region represents an entry of ZD.

References

    1. Bauer M, Klau GW, Reinert K 2007. Accurate multiple sequence–structure alignment of RNA sequences using combinatorial optimization. BMC Bioinformatics 8: 271. doi: 10.1186/1471-2105-8-271
    2. Bertone P, Stolc V, Royce TE, Rozowsky JS, Urban AE, Zhu X, Rinn JL, Tongprasit W, Samanta M, Weissman S, et al. 2004. Global identification of human transcribed sequences with genome tiling arrays. Science 306: 2242–2246.
    3. Bompfünewerer AF, Backofen R, Bernhart SH, Hertel J, Hofacker IL, Stadler PF, Will S 2008. Variations on RNA folding and alignment: Lessons from Benasque. J Math Biol 56: 129–144.
    4. Bradley RK, Pachter L, Holmes I 2008. Specific alignment of structured RNA: Stochastic grammars and sequence annealing. Bioinformatics 24: 2677–2683.
    5. Chambers JM, Cleveland WS, Kleiner B, Tukey PA 1983. Graphical methods for data analysis. Wadsworth/Cengage Learning, Florence, KY.
