
Creating a Pipeline of Clean Clinical Data

Today, most clinical data teams manage external data ad hoc using spreadsheets and manual methods to bring it into their EDC system.


Patient-centricity is an important focus for clinical trials, driving improvements in patient access to studies and convenience. This has resulted in more innovative approaches to patient data collection and convinced more people to participate in trials.¹ The changes also created a flood of lab, imaging, and other data captured outside of a research site. Sponsors and contract research organizations (CROs) are challenged to ensure the quality and validity of third-party patient data that’s inaccessible in an electronic data capture (EDC) system. 

Today, most clinical data teams manage external data ad hoc using spreadsheets and manual methods to bring it into their EDC system. It’s a high-stakes relay race where each player transfers a small amount of data to the next relay point—spreadsheet to spreadsheet or person to person—and hopes nothing is lost along the way.

A new vision for clinical data. Photo: Veeva.

Data organizations need to improve data quality and the overall speed of clinical data management, said Leianne Ebert, head of global data operations at Alcon. “We don’t want to be the rate-limiting obstacle to product approval.” 

Ebert added that bringing clean and timely data to trial stakeholders as early as possible will improve analysis and decision-making and help lead to successful submissions. Today, reliance on manual approaches increases cost, variability, and risk to compliance and patient safety. This can cause significant delays in database locks. As trial designs become more complex, development timelines shrink, and data volumes increase, robust methods are needed to aggregate and clean clinical data for today’s trials.

Leveraging Value from Patient Data

Sponsors and CROs must gain more value from the data they collect if they plan to utilize technologies such as artificial intelligence (AI) or machine learning (ML). A clinical data science executive at a diversified CRO believes the industry needs to automate patient data flow. 

The executive envisions an integrated data-cleaning model where cross-functional teams can work on a single platform, examine the same data, and use it to make decisions and track activities—all within an integrated data quality plan. This change can move clinical data management toward a progressive delivery of submission-ready evidence that allows all stakeholders access to the most relevant data. 

Over the past few years, several vendors have developed data management solutions, typically called clinical data workbenches. Sometimes developed as data hubs or platforms, clinical data workbenches can help bring data management teams closer to meeting these goals. The latest clinical data workbenches aim to establish a pipeline of clean data from case report forms (CRFs) and external sources that stakeholders, including third-party data providers, can access and manage in one place.

Clinical data workbenches simplify data transformation and listing creation by focusing more closely on data ingestion, cleaning, and extraction. They can also help automate more data checks and queries, reduce reconciliation workload, and speed database lock. Improved collaboration between clinical trial stakeholders creates a deeper context for clinical data, helping to establish a foundation for the successful application of AI and ML.

Most workbenches provide data review and cleaning in one system and export the data without the delays that manual approaches can create. Some also allow relevant prior work like past queries to be reused, saving critical time and effort.

A diversified CRO, a specialized medical devices company, and a leading biopharma recently shared their experiences working with data workbenches, connecting these efforts with EDC and clinical modernization. Following are summaries of what they’ve learned and the results they are seeing from using clinical data workbenches so far. 

The CRO adopted a clinical data workbench as part of a broader effort to implement a cloud-based clinical data management system (CDMS) and unified EDC. So far, the new tool has allowed the CRO to pass more than 30% in clinical data management cost savings to its biopharma sponsor clients.

A major contributor to those pass-along savings is a 50% reduction in the time required to create listings. The typical process for creating listings involves multiple steps, including having the EDC vendor extract data, putting that data on a server, creating and running SAS programs, and exporting results to data management.

Most listings are pre-generated with the clinical data workbench, which results in a one-step process. No additional steps are needed when standard listings are used, since they are generated once at the start of the study and reused throughout the trial. The workbench has helped the CRO get common listings out ahead of the critical path. 

The workbench has also reduced overall data cleaning time with automatic query checks. A rules engine checks the data based on explicit criteria that the user establishes. Queries can be generated automatically when discrepancies are found, cutting the average time per query from three minutes to just under 60 seconds. These benefits multiply when large numbers of queries are generated automatically.
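To make the idea of rule-driven query generation concrete, here is a minimal Python sketch. It is an illustration only, not the CRO's or any vendor's implementation: the `Rule` and `Query` structures, the field names, and the blood pressure limits are all hypothetical.

```python
from dataclasses import dataclass
from datetime import date
from typing import Callable


@dataclass
class Rule:
    """One explicit check defined by a data manager (hypothetical structure)."""
    rule_id: str
    field: str
    is_valid: Callable[[dict], bool]
    message: str


@dataclass
class Query:
    """A discrepancy raised against one field of one record."""
    subject_id: str
    visit: str
    field: str
    rule_id: str
    message: str


def run_checks(records: list, rules: list) -> list:
    """Apply every rule to every record and open a query for each failure."""
    queries = []
    for record in records:
        for rule in rules:
            if not rule.is_valid(record):
                queries.append(Query(record["subject_id"], record["visit"],
                                     rule.field, rule.rule_id, rule.message))
    return queries


# Illustrative rules; the field names and limits are assumptions.
RULES = [
    Rule("R001", "systolic_bp",
         lambda r: r.get("systolic_bp") is not None,
         "Systolic BP is missing."),
    Rule("R002", "systolic_bp",
         # A missing value is left to R001, so it passes here.
         lambda r: r.get("systolic_bp") is None or 60 <= r["systolic_bp"] <= 250,
         "Systolic BP is outside the expected range of 60 to 250 mmHg."),
    Rule("R003", "visit_date",
         lambda r: r["visit_date"] >= r["consent_date"],
         "Visit date precedes consent date."),
]

# Made-up records for three subjects.
RECORDS = [
    {"subject_id": "1001", "visit": "Week 1", "systolic_bp": 128,
     "visit_date": date(2024, 2, 1), "consent_date": date(2024, 1, 15)},
    {"subject_id": "1002", "visit": "Week 1", "systolic_bp": None,
     "visit_date": date(2024, 2, 3), "consent_date": date(2024, 1, 20)},
    {"subject_id": "1003", "visit": "Week 1", "systolic_bp": 300,
     "visit_date": date(2024, 1, 10), "consent_date": date(2024, 1, 20)},
]

queries = run_checks(RECORDS, RULES)
for q in queries:
    print(f"{q.subject_id} / {q.visit} / {q.field}: [{q.rule_id}] {q.message}")
```

In this sketch the rules are written once and then applied to every incoming record, which is why the saving per query compounds as data volumes grow.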

Through API connections with third-party data providers, the CRO can bring patient data into its workbench as soon as patients are enrolled in a trial, which speeds data aggregation and access. The workbench has also reduced the time and number of steps required for protocol amendments. Instead of revising all the protocol data before extracting it, the team can handle publishing and data updates simultaneously. As a result, the company has reduced the time required for data locks by 40%, completing the process in under three weeks.

Transforming Third-Party Data

Alcon’s clinical data workbench has made it much easier to ingest third-party patient data. “External data has always been a bigger challenge because it’s less standardized and isn’t something we can control at a study-to-study level. Because we’re more dependent on external data vendors, there can be more variability,” said Ebert.

So far, the workbench is helping Alcon’s data management team to collaborate more closely with data vendors to prevent variability. As Ebert puts it, the team can now simply let vendors be experts in their areas and focus on bringing their data in and managing it holistically. “We can now join external data very easily with EDC data to see the full picture,” she said. 
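As a rough illustration of what joining external data with EDC data involves, the Python sketch below matches records on subject and visit and reports what fails to line up. It is not Alcon's process or any product's API. The field names are hypothetical, and the sketch assumes at most one lab record per subject and visit.

```python
def record_key(row: dict) -> tuple:
    """Subject and visit together identify a record in this sketch."""
    return (row["subject_id"], row["visit"])


def join_edc_with_external(edc_rows: list, lab_rows: list):
    """Join EDC rows to external lab rows and report what does not match.

    Returns (joined, missing_lab, orphan_labs):
      joined      - EDC rows merged with their matching lab values
      missing_lab - keys the EDC expects but the vendor has not delivered
      orphan_labs - keys the vendor delivered that the EDC does not know
    """
    labs_by_key = {record_key(r): r for r in lab_rows}
    edc_keys = {record_key(r) for r in edc_rows}

    joined, missing_lab = [], []
    for row in edc_rows:
        lab = labs_by_key.get(record_key(row))
        if lab is None:
            missing_lab.append(record_key(row))
        else:
            # Copy only the lab-specific fields onto the EDC row.
            extra = {k: v for k, v in lab.items()
                     if k not in ("subject_id", "visit")}
            joined.append({**row, **extra})

    orphan_labs = [k for k in labs_by_key if k not in edc_keys]
    return joined, missing_lab, orphan_labs


# Made-up data for illustration.
EDC = [
    {"subject_id": "1001", "visit": "Screening", "sample_collected": True},
    {"subject_id": "1001", "visit": "Week 4", "sample_collected": True},
    {"subject_id": "1002", "visit": "Screening", "sample_collected": True},
]
LABS = [
    {"subject_id": "1001", "visit": "Screening", "hba1c": 6.1},
    {"subject_id": "1002", "visit": "Screening", "hba1c": 7.4},
    {"subject_id": "1003", "visit": "Screening", "hba1c": 5.9},
]

joined, missing_lab, orphan_labs = join_edc_with_external(EDC, LABS)
print("joined:", joined)
print("missing lab results:", missing_lab)
print("lab results with no EDC record:", orphan_labs)
```

The two exception lists are the reconciliation workload: each entry is a candidate for follow-up with the site or the vendor.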

The new tool has also simplified data transformation, eliminating costs for custom programming. The data workbench enables a unified approach to data cleaning, review, and reconciliation, increasing the certainty that decisions are based on correct assumptions. 

“Getting data final, clean, and accurate as close to the patient visit date as possible allows us to lock down and secure data faster. We can also push it for analytical outputs and, ultimately, to submission-related deliverables,” said Ebert.

Embracing Holistic Data Management 

A top 20 biopharma has also implemented a data workbench as it embarks on its vision for “clinical data on demand.” A new CDMS with an improved EDC and data workbench has delivered efficiencies, shifting the focus from data checking to strategic activities that add value.

The new data workbench initially focused on ingestion to harmonize clinical data and ensure all teams can access it from one place. EDC and third-party data are consolidated and aligned to a study-specific backbone or data model for every trial.
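One simple way to picture aligning sources to a common data model is a per-source column mapping, sketched below in Python. This is an assumption-laden illustration, not the company's data model: the `STUDY_MODEL` fields, the source names, and the column names are all hypothetical.

```python
# Target fields every record must carry (hypothetical study data model).
STUDY_MODEL = ("subject_id", "visit", "test_code", "value", "unit")

# Per-source mapping from source column name to study model field.
SOURCE_MAPPINGS = {
    "edc": {"SUBJ": "subject_id", "VISIT": "visit", "TEST": "test_code",
            "RESULT": "value", "UNIT": "unit"},
    "central_lab": {"PatientID": "subject_id", "VisitName": "visit",
                    "Analyte": "test_code", "Value": "value",
                    "Units": "unit"},
}


def harmonize(source: str, row: dict) -> dict:
    """Rename one source row's columns to the study model and tidy values."""
    mapping = SOURCE_MAPPINGS[source]
    missing = [col for col in mapping if col not in row]
    if missing:
        raise ValueError(f"{source} row is missing columns: {missing}")

    out = {target: row[col] for col, target in mapping.items()}
    out["subject_id"] = str(out["subject_id"]).strip()
    out["value"] = float(out["value"])
    out["source"] = source  # keep provenance for later review
    return out


# Made-up rows from two sources describing the same kind of result.
edc_row = {"SUBJ": "1001", "VISIT": "Screening", "TEST": "HBA1C",
           "RESULT": 6.0, "UNIT": "%"}
lab_row = {"PatientID": " 1001 ", "VisitName": "Screening",
           "Analyte": "HBA1C", "Value": "6.1", "Units": "%"}

harmonized = [harmonize("edc", edc_row), harmonize("central_lab", lab_row)]
for record in harmonized:
    print(record)
```

Once every source is expressed in the same fields, the records can be stored, reviewed, and compared in one place regardless of where they originated.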

The company’s clinical data team has also reduced the manual effort needed for data transformation and review. So far, automation, including automatic change detection, edit checks, and queries, has reduced manual processes by 30% to 50%. Data is now reviewed without spreadsheet trackers, and both data managers and providers can review queries in the same system.
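Automatic change detection can be thought of as comparing two deliveries of the same data and surfacing only what differs. The Python sketch below shows one minimal way to do that. It is a hypothetical illustration rather than the company's method, and it assumes each record is keyed by subject and visit.

```python
def detect_changes(previous: dict, current: dict) -> dict:
    """Compare two snapshots keyed by (subject_id, visit).

    Returns the keys that were added or removed, plus the names of the
    fields that changed for records present in both snapshots.
    """
    added = sorted(current.keys() - previous.keys())
    removed = sorted(previous.keys() - current.keys())

    modified = []
    for key in sorted(current.keys() & previous.keys()):
        old, new = previous[key], current[key]
        changed_fields = sorted(
            f for f in old.keys() | new.keys() if old.get(f) != new.get(f)
        )
        if changed_fields:
            modified.append((key, changed_fields))

    return {"added": added, "removed": removed, "modified": modified}


# Made-up snapshots from two successive vendor transfers.
PREVIOUS = {
    ("1001", "Screening"): {"hba1c": 6.1, "unit": "%"},
    ("1002", "Screening"): {"hba1c": 7.4, "unit": "%"},
}
CURRENT = {
    ("1001", "Screening"): {"hba1c": 6.3, "unit": "%"},
    ("1003", "Screening"): {"hba1c": 5.9, "unit": "%"},
}

changes = detect_changes(PREVIOUS, CURRENT)
print(changes)
```

Reviewers then look only at the added, removed, and modified records instead of rechecking the full transfer, which is where the manual effort falls away.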

The ability to generate queries automatically has reduced data cleaning workloads. Most auto-generated queries can now be managed and closed without any help from the data management team. Overall, the workbench saves the company over 60 hours per week on query management and data review.

The latest clinical data workbenches generate ROI by reducing the effort and risk of managing larger streams of more diverse patient data. Companies are already noticing improvements in speed and efficiency.

Data workbenches bring clinical trials closer to a vision of clean, timely, and accessible data that can drive more patient-centric approaches. They pave the way for the industry’s use of more innovative solutions that can help speed patient access to new treatments.

Reference

  1. Deloitte, Broadening Clinical Trial Participation to Improve Health Equity, 2022



Pavel Burmenko developed clinical data expertise as a statistical programmer. He found success in clinical data collection and analysis, where processes and technology meet. As team lead for Veeva CDB, Burmenko is now helping companies transform clinical data management.
