Contrasting and Combining Transcriptome Complexity Captured by Short and Long RNA Sequencing Reads
Confirmed Presenter: Seong Woo Han, University of Pennsylvania, United States
Track: HiTSeq
Room: 517d
Format: In Person
Authors List:
- Seong Woo Han, University of Pennsylvania
- San Jewell, University of Pennsylvania
- Andrei Thomas-Tikhonenko, University of Pennsylvania
- Yoseph Barash, University of Pennsylvania
Presentation Overview:
High-throughput short-read RNA sequencing has given researchers an unprecedented ability to detect and quantify splicing variations across biological conditions and disease states. However, short-read technology is limited in its ability to identify which isoforms are responsible for the observed sequence fragments and how splicing variations across a gene are related. In contrast, more recent long-read sequencing technology offers improved detection of underlying full or partial isoforms but is limited by high error rates and low throughput, hindering its ability to accurately detect and quantify all splicing variations in a given condition.
To better understand the underlying isoforms and splicing changes in a given biological condition, it is important to combine the results of both short- and long-read sequencing with the annotation of known isoforms. To address this need, we develop MAJIQ-L, a tool to visualize and quantify splicing variations from multiple data sources. MAJIQ-L combines transcriptome annotation, the output of long-read-based isoform detection tools, and MAJIQ-based (Vaquero-Garcia et al., 2016, 2023) short-read RNA-Seq analysis of local splicing variations (LSVs). We analyze which splice junctions are supported by which type of evidence (known isoforms, short reads, long reads), followed by an analysis of matched short- and long-read human cell line datasets. Our software can be used to assess any future long-read technology or algorithm and to combine it with short-read data for improved transcriptome analysis.
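The idea of asking which splice junction is supported by which type of evidence can be illustrated with a small sketch. This is not MAJIQ-L's implementation or API. The junction representation (chromosome, strand, donor, acceptor), the function names, and the toy coordinates are all hypothetical, chosen only to show the set logic.

```python
from collections import Counter

# Hypothetical junction representation: (chromosome, strand, donor, acceptor).
# This is an illustrative assumption, not the format used by MAJIQ-L.

def classify_junctions(annotated, short_read, long_read):
    """Map each junction to the set of evidence sources that support it."""
    sources = {
        "annotation": set(annotated),
        "short": set(short_read),
        "long": set(long_read),
    }
    all_junctions = set().union(*sources.values())
    return {
        junction: frozenset(
            name for name, members in sources.items() if junction in members
        )
        for junction in all_junctions
    }

def summarize(support):
    """Count how many junctions fall into each combination of sources."""
    return Counter(support.values())

# Toy inputs (made-up coordinates, for illustration only).
annotated = [("chr1", "+", 100, 200), ("chr1", "+", 300, 400)]
short_read = [("chr1", "+", 100, 200), ("chr1", "+", 300, 500)]
long_read = [
    ("chr1", "+", 100, 200),
    ("chr1", "+", 300, 400),
    ("chr1", "+", 300, 500),
]

support = classify_junctions(annotated, short_read, long_read)
counts = summarize(support)

for combo, n in sorted(counts.items(), key=lambda kv: sorted(kv[0])):
    print(sorted(combo), n)
```

In this toy case, a junction seen in short and long reads but absent from the annotation would be a candidate novel junction with support from both technologies, while one seen only in a single source would warrant more caution.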