Published on 05/12/2025

Using Advanced Analytics and Artificial Intelligence in Real World Evidence for FDA Submissions

Post updated on 24/06/2026

The integration of advanced analytics, artificial intelligence (AI), and machine learning (ML) into real-world evidence (RWE) is transforming how pharmaceutical and medical technology companies approach regulatory submissions to the U.S. Food and Drug Administration (FDA). In this tutorial, we will discuss the regulatory landscape, methodologies, and best practices for leveraging these technologies to support FDA submissions.

Understanding Real-World Evidence and its Regulatory Context

Real-world evidence is derived from data gathered outside of traditional clinical trials. This data encompasses various sources, such as electronic health records (EHR), claims and billing activities, patient registries, and even patient-reported outcomes. The FDA has emphasized the importance of RWE in its Framework for using Real-World Evidence to Support Regulatory Decision-Making, allowing for more comprehensive evaluations of the safety and efficacy of medical products.

The FDA’s guidance documents, particularly “Real-World Evidence: Assessing the Evolving Landscape,” provide a roadmap for how RWE can be incorporated into regulatory submissions. RWE can be especially valuable in areas such as:

Post-market surveillance
Assessing long-term safety and effectiveness
Expanding indications for approved therapies

When planning to leverage advanced analytics and AI within RWE, it is vital to align your approach with regulatory expectations outlined in 21 CFR Parts 56 (Institutional Review Boards) and 312 (Investigational New Drug Application).

The Role of Advanced Analytics and AI in RWE

Advanced analytics, powered by AI and machine learning, enhances RWE by providing insights that traditional statistical methods may overlook. These technologies enable more sophisticated analyses, such as:

ML phenotyping: This involves the identification of distinct patient subgroups that may respond differently to a treatment based on their characteristics. Using machine learning algorithms, researchers can analyze large datasets to delineate various phenotypes that influence treatment outcomes.
Natural Language Processing (NLP): NLP techniques are utilized to extract insights from unstructured data sources, such as clinical notes within EHR systems. This allows for richer datasets that include qualitative data from healthcare encounters.
Causal ML: These methods estimate the effect of a treatment on an outcome rather than a mere association between the two, typically by adjusting for confounders that influence both treatment assignment and results. This matters for regulatory submissions because associations found through standard statistical tests cannot, on their own, show that the treatment produced the observed result.
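
Of the three techniques above, NLP on clinical notes is the simplest to illustrate in a few lines. The following is a minimal sketch, not a production pipeline: it uses only regular expressions, and the cue list, the function name find_mentions, and the sample note are all hypothetical. It shows why extraction must handle negation, since "denies chest pain" and "chest pain" mean opposite things for a safety analysis.

```python
import re

# Hypothetical cue phrases that negate a finding within the same sentence.
NEGATION_CUES = re.compile(r"\b(no|denies|denied|without|negative for)\b", re.IGNORECASE)


def find_mentions(note: str, term: str) -> list[dict]:
    """Return one record per sentence that mentions `term`, flagged as negated or not."""
    term_pattern = re.compile(rf"\b{re.escape(term)}\b", re.IGNORECASE)
    mentions = []
    # Naive sentence split on '.', ';' or newline; real notes need a clinical tokenizer.
    for sentence in re.split(r"[.;\n]+", note):
        match = term_pattern.search(sentence)
        if match is None:
            continue
        # A cue counts only if it appears before the term in the same sentence.
        negated = NEGATION_CUES.search(sentence[: match.start()]) is not None
        mentions.append({"sentence": sentence.strip(), "negated": negated})
    return mentions


# Illustrative note text, not real patient data.
note = "Patient denies chest pain. Reports nausea since Tuesday; no vomiting. History of nausea."
nausea = find_mentions(note, "nausea")
chest_pain = find_mentions(note, "chest pain")
```

Here nausea holds two affirmed mentions and chest_pain holds one negated mention. A rule-based approach like this is easy to explain to reviewers but will miss many phrasings, so any extraction step should be validated against manually reviewed notes.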

However, it is essential to ensure robust data governance and comprehensive AI governance frameworks, which focus on bias and explainability. The FDA is increasingly attentive to these concerns, particularly as they relate to AI systems used in healthcare. By maintaining transparency in data handling and model decisions, companies can fortify their regulatory submissions against potential scrutiny.

Framework for Developing RWE Submissions Using AI and Advanced Analytics

To effectively use advanced analytics and AI in RWE for FDA submissions, consider the following step-by-step framework:

1. Define the Research Question

The first step in any successful submission is to establish a clear research question that aligns with regulatory objectives. This question should focus on a specific outcome that can be evaluated through RWE analyses, such as long-term drug efficacy or safety profiles in diverse populations.

2. Data Source Identification and Validation

Compiling a robust dataset is critical. Identify appropriate data sources tailored to your research question. Sources may include:

Electronic health records (EHR)
National patient registries
Insurance claims data

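After identifying potential data sources, perform validation checks to ensure data integrity, quality, and relevance. Engage in thorough data cleaning and preprocessing to handle missing values and to reduce, as far as possible, bias introduced by data errors. Cleaning cannot remove bias that stems from how the data were collected, so document any limitations that remain.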
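
The validation checks described above can be sketched as a small report that measures problems without silently fixing them. This is an illustration under stated assumptions: the records, the field names (patient_id, birth_date, index_date, hba1c), and the function quality_report are hypothetical, and a real study would define its checks in the protocol.

```python
from collections import Counter
from datetime import date

# Hypothetical extract: field names and values are illustrative only.
records = [
    {"patient_id": "P001", "birth_date": date(1961, 4, 2), "index_date": date(2023, 1, 10), "hba1c": 7.9},
    {"patient_id": "P002", "birth_date": date(1975, 9, 30), "index_date": date(2023, 2, 14), "hba1c": None},
    {"patient_id": "P002", "birth_date": date(1975, 9, 30), "index_date": date(2023, 2, 14), "hba1c": None},
    {"patient_id": "P003", "birth_date": date(2024, 1, 1), "index_date": date(2023, 3, 3), "hba1c": 6.4},
]


def quality_report(rows, required_fields):
    """Summarise missingness, duplicate IDs, and implausible dates without changing the data."""
    n = len(rows)
    missing = {f: sum(1 for r in rows if r.get(f) is None) / n for f in required_fields}
    id_counts = Counter(r["patient_id"] for r in rows)
    duplicates = sorted(pid for pid, count in id_counts.items() if count > 1)
    # An index date before the birth date signals a linkage or data-entry error.
    bad_dates = sorted({r["patient_id"] for r in rows if r["index_date"] < r["birth_date"]})
    return {
        "n_rows": n,
        "missing_fraction": missing,
        "duplicate_ids": duplicates,
        "implausible_dates": bad_dates,
    }


report = quality_report(records, ["birth_date", "index_date", "hba1c"])
```

In this toy extract the report flags half of the hba1c values as missing, P002 as duplicated, and P003 as having an index date before birth. Keeping the report separate from any cleaning step leaves a record of what was found before records were excluded or corrected.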

3. Select Advanced Analytics Techniques

Once the data is prepared, select suitable analytics techniques that are best aligned with your research objectives. The methodology may include:

Machine learning algorithms for predictive analytics and supportive evidence generation
NLP for deriving insights from unstructured narratives in EHRs
Causal inference techniques to robustly assess the impact of interventions compared to control populations

These techniques should be chosen based on their ability to capture the complexities of real-world settings while being comprehensible to regulatory reviewers.
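
To make the causal inference item above concrete, here is a minimal sketch of standardization (stratified adjustment), a classical technique that many causal ML methods generalize. It assumes a single measured confounder (disease severity) and no unmeasured confounding; the counts and function names are hypothetical and chosen so that the crude and adjusted estimates disagree.

```python
# Hypothetical aggregated cohort: (stratum, treated, n_patients, n_events).
cohort = [
    ("severe", True, 80, 40),
    ("severe", False, 20, 12),
    ("mild", True, 20, 2),
    ("mild", False, 80, 16),
]


def naive_risk_difference(rows):
    """Crude risk in treated minus crude risk in untreated, ignoring the confounder."""
    def crude(treated):
        n = sum(r[2] for r in rows if r[1] == treated)
        events = sum(r[3] for r in rows if r[1] == treated)
        return events / n
    return crude(True) - crude(False)


def standardized_risk_difference(rows):
    """Weight each stratum-specific risk difference by the stratum's share of the cohort."""
    total = sum(r[2] for r in rows)
    strata = {r[0] for r in rows}
    diff = 0.0
    for s in strata:
        weight = sum(r[2] for r in rows if r[0] == s) / total
        risk = {r[1]: r[3] / r[2] for r in rows if r[0] == s}
        diff += weight * (risk[True] - risk[False])
    return diff


naive = naive_risk_difference(cohort)
adjusted = standardized_risk_difference(cohort)
```

With these counts the crude comparison suggests the treatment raises risk by about 0.14, while the severity-adjusted estimate is about -0.10, because sicker patients were more likely to be treated. The reversal is the kind of confounding that reviewers will expect an RWE analysis to address explicitly.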

4. Conduct the Analysis

Using the selected methodologies, perform the analysis and document each step taken, including model parameters, assumptions, and limitations. This documentation is crucial, as the FDA expects a comprehensive presentation of your analytical approach.
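
One way to keep that documentation reproducible is to write a machine-readable manifest for every analysis run. The sketch below is an assumption about how this could be done, not an FDA-prescribed format: the function build_manifest, its field names, and the example values are all hypothetical.

```python
import hashlib
import json


def build_manifest(dataset_bytes: bytes, model_name: str, parameters: dict,
                   assumptions: list[str], limitations: list[str]) -> str:
    """Return a JSON record tying an analysis run to the exact data and settings used."""
    manifest = {
        # The hash changes if even one byte of the analysis dataset changes.
        "dataset_sha256": hashlib.sha256(dataset_bytes).hexdigest(),
        "model": model_name,
        "parameters": parameters,
        "assumptions": assumptions,
        "limitations": limitations,
    }
    # sort_keys makes the output stable, so two identical runs produce identical text.
    return json.dumps(manifest, indent=2, sort_keys=True)


# Illustrative values only.
manifest_json = build_manifest(
    dataset_bytes=b"patient_id,exposure,outcome\nP001,1,0\n",
    model_name="logistic_regression",
    parameters={"penalty": "l2", "C": 1.0, "random_seed": 20250512},
    assumptions=["No unmeasured confounding after adjustment"],
    limitations=["Outcome ascertained from claims codes only"],
)
```

Storing the manifest alongside the results lets a reviewer, or your own team months later, confirm which dataset version and parameters produced a given table.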

5. Address Bias and Ensure Explainability

Prior to finalizing your findings, it is crucial to assess the presence of bias within your models and data. Ensuring that models provide explainable outcomes is pivotal for regulatory acceptance. This involves detailing how certain variables contribute to model predictions, thereby facilitating a better understanding of the analyses by regulatory reviewers.
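
A simple starting point for the bias assessment is to compare model performance across patient subgroups. The sketch below computes sensitivity per subgroup; the predictions, the subgroup labels, and the function sensitivity_by_group are hypothetical, and which subgroups and metrics matter depends on the study.

```python
from collections import defaultdict

# Hypothetical model outputs: (subgroup, true_label, predicted_label).
predictions = [
    ("female", 1, 1), ("female", 1, 1), ("female", 1, 0), ("female", 0, 0),
    ("male", 1, 1), ("male", 1, 0), ("male", 1, 0), ("male", 0, 0),
]


def sensitivity_by_group(rows):
    """Share of true positives the model catches, computed separately per subgroup."""
    hits = defaultdict(int)
    positives = defaultdict(int)
    for group, truth, predicted in rows:
        if truth == 1:
            positives[group] += 1
            hits[group] += int(predicted == 1)
    return {g: hits[g] / positives[g] for g in positives}


sensitivity = sensitivity_by_group(predictions)
# The spread between the best- and worst-served subgroup is one summary of disparity.
gap = max(sensitivity.values()) - min(sensitivity.values())
```

In this toy example the model detects two of three true cases among women but only one of three among men. A gap of that size would warrant investigation and disclosure before the model's outputs are used as evidence.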

6. Prepare Submission Documentation

This stage involves compiling all the data, analyses, and conclusions into a structured submission. Follow the FDA’s regulations in 21 CFR Part 314, which set out the content and format of New Drug Applications; Investigational New Drug applications are governed separately by 21 CFR Part 312. Additionally, include an RWE-specific section that explains the role of advanced analytics in your submission.

7. Engage with Regulatory Bodies

Consider engaging with the FDA early in your submission process, particularly through meetings designed to clarify issues related to RWE applications. Opportunities to discuss your study design, data analyses, or results with the FDA can provide valuable guidance and increase the likelihood of a favorable outcome.

Key Considerations for AI Governance in RWE

AI governance encompasses policies and procedures for the ethical use of AI technologies in RWE. Key aspects include:

Bias Management: Identify potential biases in datasets or algorithms that could influence results. Regular audits and assessments of the datasets and model outputs are crucial for identifying and mitigating biases.
Explainability: Maintain transparency in AI algorithms to assure stakeholders of the reliability of results. Techniques such as feature importance analysis or SHAP (SHapley Additive exPlanations) can help in explaining model predictions.
Regulatory Compliance: Remain compliant with applicable regulations, including adherence to 21 CFR Part 11 concerning electronic records and signatures during submissions.
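
To show what the explainability item above means in practice, the following sketch computes exact Shapley values for a tiny model by enumerating every feature subset. It does not use the SHAP library, which approximates these values for larger models; the risk model, the patient, and the reference profile are hypothetical, and a single baseline patient stands in for the background dataset a real analysis would use.

```python
from itertools import combinations
from math import factorial


def shapley_values(predict, instance: dict, baseline: dict) -> dict:
    """Exact Shapley attribution of predict(instance) - predict(baseline) to each feature."""
    features = list(instance)
    n = len(features)

    def value(subset):
        # Features in the subset take the patient's value; the rest stay at baseline.
        mixed = {f: (instance[f] if f in subset else baseline[f]) for f in features}
        return predict(mixed)

    contributions = {}
    for f in features:
        others = [g for g in features if g != f]
        total = 0.0
        for size in range(n):
            # Standard Shapley weight for a coalition of this size.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                total += weight * (value(set(subset) | {f}) - value(set(subset)))
        contributions[f] = total
    return contributions


def risk_score(x):
    # Hypothetical risk model with an interaction between age and diabetes.
    return 0.02 * x["age"] + 0.5 * x["diabetes"] + 0.01 * x["age"] * x["diabetes"]


patient = {"age": 70, "diabetes": 1}
reference = {"age": 50, "diabetes": 0}
attributions = shapley_values(risk_score, patient, reference)
```

For this patient the attributions are about 0.5 for age and 1.1 for diabetes, which together equal the 1.6 difference between the patient's score and the reference score. That additivity is what makes Shapley-based explanations straightforward to present, although exact enumeration is only feasible for a handful of features.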

Conclusion

As pharmaceutical and biotechnology companies increasingly integrate advanced analytics and machine learning into their submission processes, it is crucial to stay within the regulatory framework set by the FDA and to keep data governance, bias mitigation, and explainability central to your methodology. Following the step-by-step framework in this tutorial will not guarantee acceptance, but it can help you present RWE and AI-based analyses in a form that regulatory reviewers can evaluate.

For additional information, feel free to explore the FDA’s dedicated page on RWE where resources and guidance documents are available to assist in navigating complex submissions.
