N. A. Koltunov, E. P. Guguchkin, E. A. Karpulevitch, “Optimization of short reads alignment with indels in whole-genome sequencing”, Proceedings of ISP RAS, 2025, Volume 37, Issue 6(2), Pages 211

Optimization of short reads alignment with indels in whole-genome sequencing

N. A. Koltunov, E. P. Guguchkin, E. A. Karpulevitch

Ivannikov Institute for System Programming of the RAS

Abstract: We present a novel method for aligning reads in whole-genome sequencing (WGS) that aims to improve both alignment accuracy and the practical efficiency of this stage of genomic analysis. Unlike graph-based approaches, the proposed algorithm integrates known genetic variants directly into the alignment process, so reads are mapped to the reference genome more accurately without constructing complex graph structures. On real sequencing data, the method consistently improved read alignment quality in highly variable and difficult-to-map regions of the genome. In particular, variant information allows more precise alignment of reads that carry alternative alleles, which reduces the number of mapping errors in these regions. The required computational resources remain at an acceptable level and the alignment speed is comparable to that of traditional solutions, so the method can be integrated into standard WGS pipelines without a significant increase in workload. Its practical value lies in the improved alignment accuracy, which directly affects the quality of downstream variant calling and other analyses. The proposed approach can serve as an effective alternative to current graph-based alignment methods, providing comparable improvements in alignment quality with lower implementation complexity. Future work includes optimizing the algorithm’s performance, expanding the set of genetic variants accounted for, and conducting in-depth comparisons with other tools, with the goal of further increasing the method’s efficiency and reliability.

Keywords: short-read alignment, indels, alternate contigs, liftover, variant calling

DOI: 10.15514/ISPRAS-2025-37(6)-30
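
The abstract does not spell out the algorithm, but the keywords (alternate contigs, liftover) suggest one common way to use known indels without a graph: build a short alternate contig that carries the variant, align reads against it, and lift the resulting positions back to reference coordinates. The following Python sketch illustrates that general idea only. It is an assumption-based illustration, not the authors' implementation, and the names `Variant`, `build_alt_contig`, and `lift_to_reference` are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Variant:
    """A known variant in VCF-like form (hypothetical helper type).

    pos is 0-based and relative to the start of the reference window.
    """
    pos: int
    ref: str
    alt: str


def build_alt_contig(window: str, var: Variant) -> str:
    """Return the reference window with the ALT allele substituted for REF."""
    if window[var.pos:var.pos + len(var.ref)] != var.ref:
        raise ValueError("REF allele does not match the reference window")
    return window[:var.pos] + var.alt + window[var.pos + len(var.ref):]


def lift_to_reference(alt_pos: int, var: Variant, window_start: int) -> int:
    """Map a 0-based position on the alt contig to a reference coordinate.

    Bases inside an inserted sequence have no reference base of their own,
    so they are clamped to the last reference base of the REF allele.
    """
    if alt_pos < var.pos:
        # Before the variant: coordinates are identical.
        offset = alt_pos
    elif alt_pos < var.pos + len(var.alt):
        # Inside the ALT allele: map onto the REF allele, clamping overflow.
        offset = var.pos + min(alt_pos - var.pos, len(var.ref) - 1)
    else:
        # After the variant: shift by the length difference of the alleles.
        offset = alt_pos - len(var.alt) + len(var.ref)
    return window_start + offset
```

For example, with the window `ACGTACGTAC` and a known deletion `Variant(pos=3, ref="TAC", alt="T")`, the alternate contig is `ACGTGTAC`. A read such as `GTGTA` matches that contig exactly at offset 2 but has no exact match in the reference window, and `lift_to_reference` maps the base that follows the deletion (alt offset 4) back to reference offset 6.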