Full-length 16S microbiome classification.

Updated: Dec 13, 2018

With short-read NGS, it has traditionally been impossible to obtain full-length sequences of individual molecules. For microbial classification, this has forced the science community to rely on sequencing just one or two of the 16S rDNA variable regions. Add the high sequencing error rates of short reads, and the result can be ambiguous classification (not to mention ambiguity in the methodology itself, such as how sequences are grouped into OTUs).

To overcome these obstacles, the LoopSeq Microbiome kit uses synthetic long reads: each molecule is tagged with a barcode, and the cluster of short reads sharing that barcode is computationally assembled into a single long read. This enables comprehensive phylogenetic classification, even for complex microbial communities, with a lower error rate and coverage of all nine variable regions.

Better classification

To demonstrate the effectiveness of LoopSeq Microbiome in delivering a detailed and comprehensive phylogenetic classification of microbes in a mixed community, the Loop team analyzed a complex soil sample using two different methods: traditional short-read sequencing versus LoopSeq Microbiome.

For the short-read-only analysis, we sequenced two different spans of the 16S variable regions: V3 and V4-V5. For the LoopSeq Microbiome analysis, we sequenced the entire span of all nine variable regions: V1-V9. The LoopSeq Microbiome method classified >99% of the unique 16S molecules down to the species or genus level, whereas the traditional approach classified only ~65% of unique 16S molecules down to the species or genus level (Figure 1).

Figure 1. LoopSeq Microbiome enables more comprehensive phylogenetic classification of an environmental sample than traditional short-read-only NGS alone.

Full-length 16S

When sequencing one or two 16S variable regions (e.g., V3 or V4-V5), read lengths are at most 250 bp. With LoopSeq Microbiome, however, 96% of all reads run the full length of the 16S gene (~1500 bp; see Figure 2). This means a team can be more confident in its classification, especially in the clustering used to define OTUs.

Figure 2. Greater than 96% of LoopSeq Microbiome reads are assembled into full-length 16S sequence.

Lower error rate

LoopSeq Microbiome works by barcoding each 16S molecule with a unique ID, then clustering the short reads by barcode and assembling each cluster into a single long read. As a result, each base position is covered by multiple overlapping short reads, which allows consensus calling to determine the true base independent of individual sequencing errors. This increased coverage per base position leads to an error rate of <0.005%, compared to the short-read-only error rate of ~1%. The increased accuracy provides additional confidence in species classification by reducing the number of species mistakenly identified as novel because of random sequencing errors (Figure 3).
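The barcode-then-consensus idea described above can be illustrated with a minimal Python sketch. This is not the LoopSeq pipeline. It assumes, for simplicity, that every short read already carries its barcode and a known 0-based start position on its source molecule (real assembly has to infer the overlaps). The function name `consensus_by_barcode`, the barcodes, and the toy reads are all hypothetical.

```python
from collections import Counter, defaultdict


def consensus_by_barcode(reads):
    """Build one consensus sequence per barcode by per-position majority vote.

    reads: iterable of (barcode, start, sequence) tuples, where start is the
    assumed-known 0-based offset of the short read on its source molecule.
    """
    # piles[barcode][position] counts the bases observed at that position.
    piles = defaultdict(lambda: defaultdict(Counter))
    for barcode, start, seq in reads:
        for offset, base in enumerate(seq):
            piles[barcode][start + offset][base] += 1

    consensus = {}
    for barcode, columns in piles.items():
        length = max(columns) + 1
        # Majority base per position; 'N' where no read covers the position.
        consensus[barcode] = "".join(
            columns[pos].most_common(1)[0][0] if pos in columns else "N"
            for pos in range(length)
        )
    return consensus


# Toy data: two barcoded molecules; one BC1 read has an error at position 4.
reads = [
    ("BC1", 0, "ACGTAC"),
    ("BC1", 0, "ACGTTC"),  # sequencing error: T instead of A at position 4
    ("BC1", 2, "GTACGG"),
    ("BC1", 4, "ACGGTT"),
    ("BC2", 0, "TTGACA"),
    ("BC2", 0, "TTGACA"),
    ("BC2", 3, "ACAGG"),
]
result = consensus_by_barcode(reads)
print(result["BC1"])  # ACGTACGGTT
print(result["BC2"])  # TTGACAGG
```

In this toy case, position 4 of the BC1 molecule is covered by four reads, three of which agree, so the single erroneous base is outvoted. The same principle is why deeper coverage per base position lowers the consensus error rate.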

Figure 3. LoopSeq’s increased sequencing accuracy (error rate <0.005%) leads to reduced numbers of misclassified species. Expected number of species in this collector’s curve of a defined sample is eight.
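To put the quoted error rates in per-read terms, the short Python sketch below computes the expected number of errors per read and the chance that a read is entirely error-free. It assumes errors are independent and uniform across positions, which is a simplification, and it treats the article's figures (~1% over 250 bp, <0.005% over ~1500 bp) as fixed values. The function names are illustrative only.

```python
def expected_errors(read_length_bp, per_base_error_rate):
    """Expected number of erroneous bases in one read."""
    return read_length_bp * per_base_error_rate


def prob_error_free(read_length_bp, per_base_error_rate):
    """Probability that a read has no errors, assuming independent errors."""
    return (1.0 - per_base_error_rate) ** read_length_bp


# Figures quoted in the article; 0.005% is used as an upper bound.
short_read = (250, 0.01)          # 250 bp at ~1% error
synthetic_long = (1500, 0.00005)  # ~1500 bp at <0.005% error

for label, (length, rate) in [("short read", short_read),
                              ("synthetic long read", synthetic_long)]:
    print(label,
          round(expected_errors(length, rate), 3),
          round(prob_error_free(length, rate), 3))
```

Under these assumptions, a 250 bp short read carries about 2.5 errors on average and is error-free only about 8% of the time, while a full-length read at the lower rate carries about 0.075 errors and is error-free roughly 93% of the time.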

Overall, classifying microbial communities from one or two variable regions using short-read sequencing has pitfalls. With LoopSeq Microbiome, a science team can have greater confidence in, and greater clarity about, its results.

To learn more, visit our shop.