BioPE-RV: A New Paradigm for Biomedical Relation Extraction Integrating Prompt Engineering and Explicit Rule Verification
DOI: https://doi.org/10.71373/yf07bz53
Keywords: Entity-Relationship Extraction; LLM; Prompt Engineering; Logical Constraint Verification
Abstract: Infectious diseases remain a major public health challenge and a threat to socioeconomic development, so understanding viral infection mechanisms is critical for effective disease control. Biomedical literature describes a vast number of regulatory relationships between genes and proteins, and extracting these relationships accurately is a prerequisite for constructing molecular regulatory networks and supporting downstream biological research.
In recent years, large language models have demonstrated considerable potential in biomedical information extraction tasks. However, when applied to regulatory relationship extraction, they still show systematic limitations: they misclassify co-expression as regulation, over-extend causal chains beyond what the text states, and misread experimental context. These errors substantially compromise the reliability of the extracted results.
To address these challenges, this study proposes BioPE-RV, a regulatory relationship extraction framework that integrates large language model-based candidate generation with explicit biological logic verification. The framework first uses a large language model to generate candidate regulatory relationships from text, then applies multi-level logical constraints (entity validity, explicit regulatory agency, semantic consistency, causal chain integrity, and experimental context rationality) to iteratively verify and filter the candidates.
Experimental results on COVID-19-related biomedical literature demonstrate that combining generative models with interpretable biological logic constraints significantly improves both the reliability and reproducibility of relationship extraction.
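The generate-then-verify idea described above can be illustrated with a minimal sketch. This is not the BioPE-RV implementation: the `Candidate` type, the lexicons, and the check functions are hypothetical names introduced here for illustration, the candidates are assumed to have already been produced by a language model, and only three of the five constraint types are sketched (causal chain integrity and experimental context rationality are omitted). It also runs a single filtering pass using naive substring matching, whereas the framework is described as verifying iteratively.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    """A candidate regulatory relation, assumed to come from an LLM."""
    regulator: str
    target: str
    relation: str  # "activates" or "inhibits" in this sketch
    evidence: str  # the sentence the candidate was extracted from


# Hypothetical lexicons for illustration only; a real system would use
# curated gene/protein vocabularies and a much richer trigger list.
KNOWN_ENTITIES = {"ACE2", "TMPRSS2", "IL6", "STAT3"}
TRIGGERS = {
    "activates": ("activates", "induces", "upregulates"),
    "inhibits": ("inhibits", "suppresses", "downregulates"),
}
COEXPRESSION_CUES = ("co-expressed", "coexpressed", "correlated with")


def entity_validity(c: Candidate) -> bool:
    # Both entities must be known, and a relation needs two distinct entities.
    return (
        c.regulator in KNOWN_ENTITIES
        and c.target in KNOWN_ENTITIES
        and c.regulator != c.target
    )


def explicit_agency(c: Candidate) -> bool:
    # Reject evidence that only reports co-expression or correlation;
    # otherwise require at least one explicit regulatory trigger word.
    text = c.evidence.lower()
    if any(cue in text for cue in COEXPRESSION_CUES):
        return False
    return any(t in text for words in TRIGGERS.values() for t in words)


def semantic_consistency(c: Candidate) -> bool:
    # The predicted relation label must match a trigger of the same polarity.
    text = c.evidence.lower()
    return any(t in text for t in TRIGGERS.get(c.relation, ()))


CHECKS = [
    ("entity_validity", entity_validity),
    ("explicit_agency", explicit_agency),
    ("semantic_consistency", semantic_consistency),
]


def verify(candidates):
    """Split candidates into accepted ones and (candidate, failed checks) pairs."""
    accepted, rejected = [], []
    for c in candidates:
        failed = [name for name, check in CHECKS if not check(c)]
        if failed:
            rejected.append((c, failed))
        else:
            accepted.append(c)
    return accepted, rejected
```

Calling `verify` on a list of candidates returns the ones that pass every check, plus the names of the failed checks for each rejected candidate. Recording why a candidate was rejected is what keeps such a filter interpretable.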
