Abstract
CRISPR-Cas9 technology has transformed genetic engineering and holds considerable potential for therapeutic interventions. However, off-target mutations and the system's tolerance of mismatches pose significant challenges to its safe and precise use. In this study, we examine the implications of off-target effects on critical gene regions, including exons, introns, and intergenic regions. Using a benchmark dataset and dedicated data preprocessing techniques, we demonstrate the advantages of categorical encoding over one-hot encoding in training machine learning (ML) classifiers. We then use latent class analysis (LCA) to uncover subclasses within the off-target range, revealing distinct patterns of gene region disruption. Our approach highlights the role of model complexity in CRISPR applications and offers an off-target scoring procedure based on ML classifiers and LCA. By bridging the gap between traditional off-target scoring and comprehensive model analysis, our study advances the understanding of off-target effects and opens new avenues for precision genome editing in diverse biological contexts. This work is a step toward ensuring the safety and efficacy of CRISPR-based therapies and underscores the importance of responsible genetic manipulation in future therapeutic applications.