Nov 17th 2025

Optimal selection of read depth and length for bulk RNA-Seq: an updated perspective.

Over the past decade, RNA-Seq has evolved from a discovery tool into a cornerstone of clinical and translational genomics. As the field matures, “best practice” no longer means following a single recipe. Study goals and sample quality should drive choices.

As a baseline, the ENCODE long-RNA data standards¹ remain the most widely referenced public specification for bulk RNA-Seq. They accept single- or paired-end data, set a read length of ≥50 bp for uniform processing, and recommend sequencing depths of ≥30 million mapped reads for typical poly(A)-selected RNA-Seq, with the rest of the design left to the experiment’s aims. This baseline is adequate when budgets are tight or organisms are simple, but deeper and longer runs are warranted for complex questions such as isoform usage, fusion discovery, or allele-specific expression, and for samples with degraded RNA.

Recent multi-center benchmarking across dozens of labs² shows how widely real-world read depths vary: ~40–420 million total reads per library. Depth, duplication, and GC bias varied markedly between centers, especially when RNA input and quality differed.

These benchmarking datasets provide practical guidance for tailoring read length and depth to RNA quality, input amount, and study objectives. Below we distill what they mean for gene-level differential expression, isoform detection, fusion detection, and allele-specific expression.

Differential expression

When RNA quality is high (RIN or RQS ≥ 8; DV200 > 70%) and the focus is on gene-level differential expression, short reads and moderate depth remain cost-effective. Several technical reviews and manufacturer guidelines converge on ~25–40 million paired-end 2 × 75 bp reads per human sample as a sweet spot for robust gene quantification. This depth stabilizes fold-change estimates across expression quantiles without wasting reads on already-well-sampled transcripts.

Isoform detection

As analytical goals shift toward isoform detection and alternative splicing, both depth and read length must increase. Comprehensive isoform coverage typically requires ≥ 100 million paired-end reads with 2 × 75 or 2 × 100 bp read length⁴. A 2024 benchmarking study demonstrated that conventional depths used for differential expression capture only a fraction of splice events². Long-read sequencing is gaining ground for full-length transcript reconstruction, but short-read paired-end sequencing remains unmatched for sensitive detection of low-abundance junctions across cohorts.

Fusion detection

Fusion detection sits between the two previous use cases. Most established fusion callers depend on paired-end libraries to anchor breakpoints. Current best practice favors 2 × 75 bp reads as a baseline, with 2 × 100 bp reads providing cleaner junction resolution, and a depth of at least 60–100 million reads to ensure sufficient split-read support.

Allele-specific expression

For allele-specific expression (ASE) and expressed-variant analysis, higher depth is essential to accurately estimate variant allele frequencies and minimize sampling error. Oncology-oriented pipelines such as VarRNA show that ~100 million paired-end reads are required for reliable ASE profiling, with further increases advisable when tumor purity is low or RNA integrity is compromised⁵.
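The depth and read-length figures quoted for the four use cases above can be collected into a small lookup. The sketch below is only an illustration: the names `Design`, `GUIDELINES`, and `most_demanding` are hypothetical, and planning a multi-goal study around its most demanding goal is an assumption made here, not a rule stated by the cited studies.

```python
from dataclasses import dataclass
from typing import Iterable, Optional, Tuple


@dataclass(frozen=True)
class Design:
    min_reads_m: int             # lower bound, in millions of reads
    max_reads_m: Optional[int]   # upper end of the quoted range, if one is given
    read_layout: str             # read configuration as quoted in the text


# Values transcribed from the use-case sections above.
GUIDELINES = {
    "differential_expression": Design(25, 40, "2 x 75 bp paired-end"),
    "isoform_detection": Design(100, None, "2 x 75 or 2 x 100 bp paired-end"),
    "fusion_detection": Design(60, 100, "2 x 75 bp baseline, 2 x 100 bp preferred"),
    # The text gives no read length for ASE, only that reads are paired-end.
    "allele_specific_expression": Design(100, None, "paired-end"),
}


def most_demanding(goals: Iterable[str]) -> Tuple[str, Design]:
    """Return the goal with the highest minimum depth, and its design.

    Assumption: a study with several goals is planned around the one
    that needs the most reads. Ties go to the goal listed first.
    """
    goals = list(goals)
    if not goals:
        raise ValueError("at least one goal is required")
    unknown = [g for g in goals if g not in GUIDELINES]
    if unknown:
        raise KeyError(f"no guideline recorded for: {unknown}")
    goal = max(goals, key=lambda g: GUIDELINES[g].min_reads_m)
    return goal, GUIDELINES[goal]


if __name__ == "__main__":
    goal, design = most_demanding(["differential_expression", "fusion_detection"])
    print(goal, design.min_reads_m, design.read_layout)
```

A study combining differential expression with fusion detection would, under this assumption, be budgeted at the fusion-detection depth, since the shallower goal is covered automatically.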

Additional considerations

RNA integrity metrics (RIN or RQS and DV200) strongly influence design choices. A recent study using old FFPE blocks confirmed that high-quality RNA-Seq is possible from archival tissue⁶. Current recommendations can be summarized as follows:

DV200 > 50% → Poly(A) or rRNA-depletion protocols, 2 × 75 to 2 × 100 bp read lengths, standard depth.
DV200 30–50% → Prefer rRNA depletion or capture; add 25–50% more reads.
DV200 < 30% → Avoid poly(A); use capture or rRNA depletion with higher input and ≥ 75–100 million reads.

These findings reaffirm that degraded RNA inflates duplication rates and reduces effective complexity; sequencing deeper is the simplest corrective measure.
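The three DV200 tiers listed above translate naturally into a small decision function. The following is a minimal sketch: `Tier` and `plan_for_dv200` are hypothetical names, `standard_depth_m` stands for whatever depth the study would use with intact RNA, and treating the values 30 and 50 as part of the middle tier is one reading of the "30–50%" range.

```python
from typing import NamedTuple, Tuple


class Tier(NamedTuple):
    library_prep: str
    depth_range_m: Tuple[float, float]  # (low, high) in millions of reads


def plan_for_dv200(dv200_percent: float, standard_depth_m: float) -> Tier:
    """Map a DV200 value to the library prep and depth range from the tiers above."""
    if not 0 <= dv200_percent <= 100:
        raise ValueError("DV200 is a percentage between 0 and 100")
    if standard_depth_m <= 0:
        raise ValueError("standard depth must be positive")

    if dv200_percent > 50:
        # Good integrity: keep the standard depth.
        return Tier("poly(A) selection or rRNA depletion",
                    (standard_depth_m, standard_depth_m))
    if dv200_percent >= 30:
        # Moderate degradation: add 25-50% more reads.
        return Tier("rRNA depletion or capture",
                    (1.25 * standard_depth_m, 1.5 * standard_depth_m))
    # Severe degradation: at least 75-100 million reads, never below standard.
    return Tier("capture or rRNA depletion with higher input (avoid poly(A))",
                (max(standard_depth_m, 75.0), max(standard_depth_m, 100.0)))


if __name__ == "__main__":
    for dv200 in (80, 40, 20):
        print(dv200, plan_for_dv200(dv200, standard_depth_m=40))
```

For the lowest tier, the sketch keeps the study's own standard depth whenever it already exceeds the 75–100 million floor, so the recommendation never lowers a planned depth.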

When input amount is limited (≤ 10 ng RNA), additional PCR cycles inflate duplication, reducing usable complexity. Recent work recommends incorporating unique molecular identifiers (UMIs) to collapse duplicates when sequencing deeply (> 80 M reads). In FFPE applications, combining UMIs with capture or rRNA-depletion protocols and modestly increasing total reads by 20 – 40% restores quantitative precision⁷.
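To make the cost of duplication concrete, the arithmetic can be sketched as follows. The function names are hypothetical, and the second function assumes the duplication rate measured in a pilot stays constant as depth increases. That is a simplification, since duplication usually rises as a low-complexity library is sequenced more deeply, so the result is best read as a lower bound.

```python
def unique_reads_m(total_reads_m: float, duplication_rate: float) -> float:
    """Reads (in millions) left after duplicates are collapsed."""
    if not 0 <= duplication_rate < 1:
        raise ValueError("duplication_rate must be in [0, 1)")
    return total_reads_m * (1 - duplication_rate)


def total_reads_needed_m(target_unique_m: float, duplication_rate: float) -> float:
    """Total reads (in millions) needed to retain a target number of unique reads.

    Assumes the duplication rate does not change with depth (see lead-in).
    """
    if not 0 <= duplication_rate < 1:
        raise ValueError("duplication_rate must be in [0, 1)")
    return target_unique_m / (1 - duplication_rate)


if __name__ == "__main__":
    # An 80 M read library with half its reads duplicated keeps 40 M unique reads.
    print(unique_reads_m(80, 0.5))
    # Retaining 30 M unique reads at 50% duplication needs at least 60 M total reads.
    print(total_reads_needed_m(30, 0.5))
```

Measuring the duplication rate in a small pilot and feeding it into a calculation like this gives a rough first estimate of how many extra reads a low-input or FFPE library will need.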

Conclusion

Community benchmarks and new analyses have quantified how read length, sequencing depth, and RNA quality interact to determine data usability. The guiding principle that emerges is simple: match your sequencing strategy to your biological question and sample quality, not to generic norms.

For high-integrity RNA and gene-level studies, short reads and moderate depth remain efficient. For isoforms, fusions, or expressed variants, both read length and depth must rise, ideally within stranded, paired-end designs. When RNA is degraded or scarce, adopt rRNA depletion or capture, use UMIs if possible, and budget extra reads to offset reduced complexity.

Finally, validate every new workflow with a pilot that measures duplication, exonic fraction, and junction detection before scaling. As sequencing costs plateau and analysis costs dominate, these design optimizations ensure that every read you generate contributes real interpretive value.

References:

1. Bulk RNA-Seq Data Standards and Processing Pipeline – ENCODE.
2. Wang, D., et al. (2024). A real-world multi-center RNA-Seq benchmarking study using the Quartet and MAQC reference materials. Nat Commun 15, 6167. doi:10.1038/s41467-024-50420-y.
3. Considerations for RNA Seq read length and coverage | Illumina Knowledge.
4. Li, H., et al. (2025). Improving gene isoform quantification with miniQuant. Nat Biotechnol. doi:10.1038/s41587-025-02633-9.
5. Bollas, A., et al. (2025). Variant calling from RNA-Seq data reveals allele-specific differential expression of pathogenic cancer variants. Commun Med 5, 202. doi:10.1038/s43856-025-00901-y.
6. Frederick, M.J., et al. (2025). Reliable RNA-Seq analysis from FFPE specimens as a means to accelerate cancer-related health disparities research. PLoS One 20(4), e0321631. doi:10.1371/journal.pone.0321631.
7. Verma, R., et al. (2025). Commentary: a review of technical considerations for planning an RNA-Sequencing experiment. BMC Genomics 26, 918. doi:10.1186/s12864-025-12094-8.