Data Cleaning/Profiling Tools
Description
Improving the quality of healthcare data for research use is a priority. Data profiling software tools can assist this process by evaluating data sets across a range of quality dimensions. Many software tools with data profiling capabilities are freely available, but healthcare organizations often have limited data engineering capability and expertise. We aimed to evaluate freely available data profiling software tools using healthcare data.
Twenty-eight freely available data profiling tools were evaluated for their capabilities based on publicly available information and data sheets. Based on this assessment, nine tools were selected for further detailed evaluation using a synthetic data set of 1000 patients and associated health data in a common health data model, and these tools were scored on their functionality with this data set.
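To make "evaluating data sets across a range of quality dimensions" concrete, the following is a minimal sketch of the kind of per-column summary a profiling tool might produce. It is not taken from any of the evaluated tools. The records, the column names (`patient_id`, `birth_year`, `sex`), the `profile` and `is_missing` helpers, and the choice of completeness and uniqueness as the dimensions are all assumptions made for illustration.

```python
from collections import Counter

# Hypothetical records; column names and values are illustrative only
# and do not come from the synthetic data set described above.
records = [
    {"patient_id": "P001", "birth_year": "1980", "sex": "F"},
    {"patient_id": "P002", "birth_year": "", "sex": "M"},
    {"patient_id": "P003", "birth_year": "1975", "sex": None},
    {"patient_id": "P003", "birth_year": "1990", "sex": "F"},
]


def is_missing(value):
    """Treat None and blank strings as missing (an assumed convention)."""
    return value is None or str(value).strip() == ""


def profile(rows):
    """Return completeness, distinct count, and repeated values per column."""
    columns = sorted({key for row in rows for key in row})
    report = {}
    for col in columns:
        values = [row.get(col) for row in rows]
        present = [v for v in values if not is_missing(v)]
        counts = Counter(present)
        report[col] = {
            # Share of rows with a non-missing value in this column.
            "completeness": len(present) / len(values) if values else 0.0,
            # Number of different non-missing values.
            "distinct": len(counts),
            # Values that occur more than once; relevant for identifier columns.
            "duplicated_values": sorted(v for v, n in counts.items() if n > 1),
        }
    return report


if __name__ == "__main__":
    for column, stats in profile(records).items():
        print(column, stats)
```

In this sketch a repeated `patient_id` would be flagged as a possible uniqueness problem, while a repeated `sex` value is expected. Deciding which columns each check applies to is left to the user, which is part of the expertise the description above refers to.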