Artificial Intelligence for Single-Cell Biology: From Representation Learning to Predictive Modeling
DOI: https://doi.org/10.71373/3nmqre81
Abstract
Single-cell sequencing technologies provide a high-resolution view of cellular heterogeneity, but the resulting data are inherently high-dimensional, sparse, and noisy, and are strongly affected by batch effects and limited annotations. These properties make artificial intelligence (AI), particularly deep generative and probabilistic models, well suited to single-cell data analysis. Recent AI frameworks, including the variational inference–based model scVI and its extensions, support unified pipelines for normalization, representation learning, batch correction, multimodal integration, and downstream analyses. Learning transferable latent representations further extends these pipelines to specialized downstream tasks, such as scalable cell-type annotation, trajectory and dynamic inference, cell–cell communication analysis, spatial mapping, and the prediction of responses to genetic or pharmacological perturbations. Emerging self-supervised foundation models promise reusable cellular representations across tasks, tissues, and species. Remaining challenges in benchmarking, interpretability, uncertainty quantification, and robust generalization define the future frontiers in the development of predictive and causal single-cell models.
