Introduction
Most next-generation sequencing (NGS) libraries are prepared and run once, so accuracy on the first pass is critical for time- and cost-effective research. The question that matters most is therefore simple: how accurate do my reads have to be before I trust the answer? Unique Molecular Identifiers (UMIs) were created to push accuracy beyond the native error rate of an Illumina® run by tagging every DNA molecule before PCR amplification. More than a decade of method development has produced two main flavors: simplex and duplex workflows. Simplex workflows tag each strand once and build a single-strand consensus sequence (SSCS). Duplex workflows tag the complementary strands with coordinated barcodes and call a variant only if both strands agree, producing a duplex consensus sequence (DCS). In this article we explain how the chemistries differ, when the extra accuracy of duplex matters, and why simplex sequencing remains the default for most applications, including RNA-seq.
From raw reads to digital molecules
Kinde and colleagues first showed that collapsing reads by molecular barcodes suppresses sequencing errors almost 100-fold, cutting the per-base error rate to about 1 × 10⁻⁴ [1]. The logic is straightforward. All true duplicates that arise during PCR share the same tag, so they can be collapsed into one consensus read. The consensus removes random polymerase or base-calling mistakes that appear only in a subset of daughter reads. Duplex sequencing pushes this principle further. Schmitt et al. added a second, strand-specific tag and required that a variant be present on both strands, reducing the theoretical error floor to below 1 × 10⁻⁶ [2]. Kennedy et al. later formalized the duplex protocol and tool chain, demonstrating detection of a single mutation among ten million wild-type bases [3].
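The collapsing logic can be illustrated with a minimal sketch. It assumes reads arrive as hypothetical `(umi, sequence)` pairs of equal length within a family, and it uses a simple per-position majority vote. The function name `collapse_by_umi` and the thresholds `min_family_size` and `min_agreement` are illustrative choices, not parameters of any published pipeline.

```python
from collections import Counter, defaultdict

def collapse_by_umi(reads, min_family_size=2, min_agreement=0.6):
    """Collapse reads that share a UMI into one consensus sequence.

    reads: iterable of (umi, sequence) tuples (hypothetical input format).
    Families smaller than min_family_size are dropped; positions where the
    most common base falls below min_agreement are masked with 'N'.
    """
    families = defaultdict(list)
    for umi, seq in reads:
        families[umi].append(seq)

    consensus = {}
    for umi, seqs in families.items():
        if len(seqs) < min_family_size:
            continue  # a singleton cannot be error-corrected
        bases = []
        for column in zip(*seqs):  # iterate position by position
            base, count = Counter(column).most_common(1)[0]
            bases.append(base if count / len(column) >= min_agreement else "N")
        consensus[umi] = "".join(bases)
    return consensus

reads = [
    ("ACGTACGT", "TTGACCA"),
    ("ACGTACGT", "TTGACCA"),
    ("ACGTACGT", "TTGTCCA"),  # random error in one daughter read
    ("GGCATTAC", "TTGACCA"),  # singleton family, dropped
]
print(collapse_by_umi(reads))  # {'ACGTACGT': 'TTGACCA'}
```

The error in the third read is outvoted by the two matching reads, which is how random mistakes are removed from the consensus.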
Mechanics of simplex and duplex UMI tagging
In a simplex library, each DNA fragment is ligated to adapters carrying a single random barcode, typically 8–12 bp, which is sequenced in one index read. After alignment, reads that share the same UMI and genomic start position are merged. In duplex libraries, the top strand might receive the tag pair A-B while the bottom strand receives B-A. Reads are first collapsed per strand to form two SSCS records and then compared: only positions supported by both halves are retained in the DCS.
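The two-stage duplex comparison can be sketched as follows. This is an illustration under stated assumptions: reads are hypothetical `(tag1, tag2, start, sequence)` tuples already oriented to the reference, strand families are paired by swapping the tag order, and disagreeing positions are masked with `N`. The helper names are invented for the example.

```python
from collections import Counter, defaultdict

def majority_consensus(seqs):
    """Per-position majority vote across equal-length sequences."""
    return "".join(Counter(col).most_common(1)[0][0] for col in zip(*seqs))

def duplex_consensus(reads):
    """Build SSCS per strand family, then keep only positions both strands support.

    reads: iterable of (tag1, tag2, start, sequence) tuples (hypothetical format).
    A family tagged (A, B) is paired with the family tagged (B, A) at the same start.
    """
    strands = defaultdict(list)
    for tag1, tag2, start, seq in reads:
        strands[(tag1, tag2, start)].append(seq)
    sscs = {key: majority_consensus(seqs) for key, seqs in strands.items()}

    dcs = {}
    for (tag1, tag2, start), top in sscs.items():
        partner = sscs.get((tag2, tag1, start))
        # Skip unpaired families; tag1 >= tag2 visits each pair only once
        # and ignores identical tags, which cannot distinguish the strands.
        if partner is None or tag1 >= tag2:
            continue
        dcs[(tag1, tag2, start)] = "".join(
            a if a == b else "N" for a, b in zip(top, partner)
        )
    return dcs

reads = [
    ("AAC", "GGT", 100, "ACGATGCA"),  # top strand carries a lesion at index 3
    ("AAC", "GGT", 100, "ACGATGCA"),
    ("GGT", "AAC", 100, "ACGTTGCA"),  # bottom strand is unaffected
    ("GGT", "AAC", 100, "ACGTTGCA"),
    ("CCA", "TTG", 250, "GGATCC"),    # no partner strand, so no DCS
]
print(duplex_consensus(reads))  # {('AAC', 'GGT', 100): 'ACGNTGCA'}
```

The single-strand change survives SSCS collapsing because every read in that family carries it, but it is masked once the two strands are compared.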
Residual error and depth requirements
The practical consequence of the extra confirmation step is a tradeoff between accuracy and usable depth. The table below summarizes the typical numbers reported in the literature.
| Metric | Simplex | Classic duplex | Duplex with CODEC | 
|---|---|---|---|
| Residual error floor | 1 × 10⁻⁴ to 1 × 10⁻⁵ | 1 × 10⁻⁷ to 1 × 10⁻⁶ | 1 × 10⁻⁶ to 1 × 10⁻⁵ | 
| Extra raw reads vs. no UMI | ≈ 2–3 × | 5–15 × | 1.5–3 × | 
Single-strand collapsing already cuts the error rate enough for most variant-calling panels. Duplex methods like CODEC reduce the depth penalty by concatenating both strands into a single read pair while maintaining near-duplex accuracy [4].
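To make the tradeoff concrete, the ranges in the table can be turned into a rough calculation. The sketch below simply multiplies a no-UMI read budget by the read multiplier and multiplies a number of consensus bases by the error floor. The function name `tradeoff` and the example inputs are assumptions for illustration, and the output is only as good as the literature ranges quoted above.

```python
# Ranges copied from the table above: (low, high)
METHODS = {
    "simplex": {"error_floor": (1e-5, 1e-4), "read_multiplier": (2, 3)},
    "classic duplex": {"error_floor": (1e-7, 1e-6), "read_multiplier": (5, 15)},
    "duplex with CODEC": {"error_floor": (1e-6, 1e-5), "read_multiplier": (1.5, 3)},
}

def tradeoff(method, baseline_reads, consensus_bases):
    """Return (low, high) estimates of raw reads and residual erroneous bases.

    baseline_reads: reads you would sequence without UMIs (hypothetical input).
    consensus_bases: total consensus bases examined (hypothetical input).
    """
    lo_err, hi_err = METHODS[method]["error_floor"]
    lo_mult, hi_mult = METHODS[method]["read_multiplier"]
    return {
        "raw_reads": (baseline_reads * lo_mult, baseline_reads * hi_mult),
        "expected_errors": (consensus_bases * lo_err, consensus_bases * hi_err),
    }

for name in METHODS:
    print(name, tradeoff(name, baseline_reads=10_000_000, consensus_bases=1_000_000))
```

Read this way, classic duplex buys roughly two orders of magnitude fewer residual errors than simplex at several times the raw read cost.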
Why simplex UMIs are usually enough
Simplex UMIs comfortably detect variants down to about 0.1% allele frequency when sequenced to 15,000× raw depth. That threshold is sufficient for typical solid tumor panels, germline confirmation tests, copy number assays and even cfDNA applications that stop at 0.1% VAF. Importantly, simplex sequencing is also the backbone of modern RNA-seq. Bulk RNA-seq library prep kits such as Revvity’s NEXTFLEX™ Rapid Directional RNA-Seq Kit 2.0 used with UMIs collapse PCR duplicates into digital transcript counts. Duplex confirmation would simply discard half the molecules without benefiting expression accuracy. Simplex UMIs therefore maximize usable molecules, shorten run time, and halve data storage compared with duplex UMIs.
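The idea of a digital transcript count can be shown in a few lines. This sketch assumes alignments have already been reduced to hypothetical `(gene, umi)` pairs and counts distinct UMIs per gene. Production tools typically also key on mapping position and tolerate sequencing errors within the UMI, which is omitted here.

```python
from collections import defaultdict

def digital_counts(alignments):
    """Count distinct UMIs per gene so PCR duplicates are counted once.

    alignments: iterable of (gene, umi) tuples (hypothetical input format).
    """
    umis_per_gene = defaultdict(set)
    for gene, umi in alignments:
        umis_per_gene[gene].add(umi)
    return {gene: len(umis) for gene, umis in umis_per_gene.items()}

alignments = [
    ("GAPDH", "AACGT"),
    ("GAPDH", "AACGT"),  # PCR duplicate of the read above
    ("GAPDH", "TTGCA"),
    ("ACTB", "CCGTA"),
]
print(digital_counts(alignments))  # {'GAPDH': 2, 'ACTB': 1}
```

Three GAPDH reads become two molecules, because two of them share a UMI and so derive from the same original transcript.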
When duplex UMIs make the difference
There are, however, questions that simplex UMIs cannot answer. Measurable residual disease assays aim for 0.01% VAF or lower. Oxidative lesions in FFPE tissue create strand-biased artifacts that simplex sequencing cannot distinguish from real variants. Duplex sequencing filters those artifacts because a damage event usually occurs on only one strand. Regulatory guidance for in vivo mutagenesis tests now cites duplex sequencing as the preferred method when the target error rate must drop below 1 × 10⁻⁶.
Case studies
Solid tumor panel (1 Mb, 0.1% LoD). Sequencing 250 ng of DNA to 800× SSCS depth yields roughly 15 million raw reads. Simplex calling achieves the required sensitivity and specificity, while duplex sequencing would raise cost two- to four-fold without changing the answer.
MRD liquid biopsy (50 kb, 0.01% LoD). At 30,000× raw coverage even duplex sequencing retains only about 20% of molecules yet still delivers an observed error rate near 1 × 10⁻⁷. Simplex sequencing cannot reach that floor and would leave too many false positives.
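The link between consensus depth and limit of detection in these case studies can be explored with a simple binomial model. The sketch below treats each consensus molecule at a site as an independent draw that carries the variant with probability equal to the VAF. The function name, the `min_supporting` threshold, and the example depths are illustrative assumptions, not values taken from the case studies.

```python
from math import comb

def detection_probability(consensus_depth, vaf, min_supporting=2):
    """Probability of seeing at least min_supporting variant molecules.

    consensus_depth: number of consensus molecules covering the site.
    vaf: variant allele fraction as a proportion (0.0001 means 0.01%).
    """
    p_below = sum(
        comb(consensus_depth, k) * vaf**k * (1 - vaf) ** (consensus_depth - k)
        for k in range(min_supporting)
    )
    return 1.0 - p_below

# Hypothetical consensus depths at a 0.01% VAF target
for depth in (1_000, 6_000, 30_000):
    print(depth, round(detection_probability(depth, vaf=0.0001), 3))
```

The model ignores sequencing error, so it describes only the sampling limit: if too few molecules survive consensus building, a rare variant may simply not be present in the data.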
Choosing the right strategy
The decision reduces to four questions:
- What is the lowest allele frequency you must report?
- How damaged is the input DNA or RNA?
- How many genomic bases are you interrogating?
- What is your budget per sample?
If the answer to the first question is 0.1% or higher, simplex sequencing is almost always sufficient. If the answer dips below 0.05% or the DNA is heavily damaged, duplex sequencing or a duplex-like chemistry such as CODEC becomes mandatory. Illumina’s DRAGEN® v4.3 pipeline now integrates UMI collapsing and optional duplex UMI consensus in a single software step, making validation easier for laboratories that decide to adopt the higher standard.
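The two thresholds above can be encoded as a small helper. This sketch covers only the allele-frequency and damage questions. Panel size and budget are not quantified in this article, so they are left out, and the range between 0.05% and 0.1% is returned as undetermined rather than guessed. The function name and return labels are hypothetical.

```python
def choose_umi_strategy(min_vaf_percent, heavily_damaged=False):
    """Suggest a UMI strategy from the lowest VAF to report and input quality.

    min_vaf_percent: lowest allele frequency to report, in percent (0.1 means 0.1%).
    heavily_damaged: True for heavily damaged input DNA.
    """
    if heavily_damaged or min_vaf_percent < 0.05:
        return "duplex"  # duplex or a duplex-like chemistry such as CODEC
    if min_vaf_percent >= 0.1:
        return "simplex"
    return "undetermined"  # between 0.05% and 0.1%: not covered by the rule

print(choose_umi_strategy(0.1))                        # simplex
print(choose_umi_strategy(0.01))                       # duplex
print(choose_umi_strategy(0.5, heavily_damaged=True))  # duplex
print(choose_umi_strategy(0.07))                       # undetermined
```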
Conclusion
Simplex UMIs convert noisy next-generation reads into digital molecules and push the error rate low enough for most research and clinical assays. Duplex methods suppress errors two orders of magnitude further and are necessary for tasks like minimal residual disease detection, ultra-low-frequency mutagenesis studies, or heavily damaged DNA. Knowing your assay’s variant frequency requirement, sample quality, and depth budget will tell you which method to choose.
References
- [1] Kinde I, Wu J, Papadopoulos N, Kinzler KW, Vogelstein B. Detection and quantification of rare mutations with the Safe-Sequencing System. Proc Natl Acad Sci USA. 2011;108(23):9530-9535. doi:10.1073/pnas.1105422108
- [2] Schmitt MW, Kennedy SR, Salk JJ, Fox EJ, Hiatt JB, Loeb LA. Detection of ultra-rare mutations by next-generation sequencing. Proc Natl Acad Sci USA. 2012;109(36):14508-14513. doi:10.1073/pnas.1208715109
- [3] Kennedy SR, Schmitt MW, Fox EJ, et al. Detecting ultralow-frequency mutations by Duplex Sequencing. Nat Protoc. 2014;9(11):2586-2606. doi:10.1038/nprot.2014.170
- [4] Bae JH, Liu R, Roberts E, et al. Single duplex DNA sequencing with CODEC detects mutations with high sensitivity. Nat Genet. 2023;55(5):871-879. doi:10.1038/s41588-023-01376-0