Creating a Data Quality Scorecard: Motivation and Mechanics
Motivation
A data quality scorecard presents a picture of data quality levels across the organization, showing where poor data is impacting the business and where the impact is less significant. The data quality rules provide the framework for measuring how closely the data meets business expectations.
Validation
Validating data against the defined data quality rules establishes its level of conformance. The nature of the rules being validated determines what is actually being measured. Most major data quality vendors support rule-based validation, so that data can be audited and monitored for validity on an ongoing basis.
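As a rough illustration, the sketch below (Python with pandas) validates a small data set against a few rules and reports the conformance level for each. The rule names, column names, and sample values are invented for the example and are not taken from any particular vendor tool.

```python
import pandas as pd

# Hypothetical rule set: each rule is a name plus a row-level predicate.
rules = {
    "customer_id_not_null": lambda df: df["customer_id"].notna(),
    "email_has_at_sign":    lambda df: df["email"].str.contains("@", na=False),
    "age_in_valid_range":   lambda df: df["age"].between(0, 120),
}

def validate(df: pd.DataFrame) -> pd.DataFrame:
    """Return the conformance level (fraction of rows passing) per rule."""
    results = []
    for name, predicate in rules.items():
        passed = predicate(df)
        results.append({
            "rule": name,
            "rows_checked": len(df),
            "rows_passed": int(passed.sum()),
            "conformance": passed.mean(),  # share of rows meeting the rule
        })
    return pd.DataFrame(results)

# Small illustrative data set with a few deliberate flaws.
sample = pd.DataFrame({
    "customer_id": [1, 2, None, 4],
    "email": ["a@x.com", "bad-email", "c@x.com", None],
    "age": [34, 29, 151, 42],
})
print(validate(sample))
```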
Thresholds of Conformance
Acceptability thresholds are set to measure different levels of expected conformance, because different data flaws have different business impacts and different degrees of business criticality. The simplest method uses a single threshold: a conformance score below the threshold indicates unacceptable data quality, while a score at or above it represents acceptable data. A two-tier approach sets two thresholds, classifying the data as acceptable, questionable but usable, or unacceptable.
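A minimal sketch of the two-tier approach, assuming illustrative threshold values of 95% for acceptable and 85% for questionable-but-usable; in practice each rule would carry thresholds that reflect its own business criticality.

```python
def classify(conformance: float,
             acceptable: float = 0.95,
             usable: float = 0.85) -> str:
    """Map a conformance score to a quality band using two thresholds."""
    if conformance >= acceptable:
        return "acceptable"
    if conformance >= usable:
        return "questionable but usable"
    return "unacceptable"

for score in (0.99, 0.90, 0.70):
    print(score, "->", classify(score))
```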
Ongoing Monitoring and Process Control
Tracking overall data quality over time gives a historical perspective on how much the data quality has improved. Statistical process control shows whether current data quality falls within an acceptable range relative to historical control bounds; when it does not, it can notify the data stewards that an exception event is occurring and point them toward the process that is causing it. Historical charts are therefore an important part of the scorecard.
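As one possible sketch, Shewhart-style control limits can be derived from historical conformance scores and used to flag an exception when the current score falls outside them. The history values and the three-sigma width below are assumptions made for the example.

```python
import statistics

def control_limits(history, sigmas=3.0):
    """Compute simple control limits (mean +/- sigmas * stdev) from historical scores."""
    mean = statistics.mean(history)
    sd = statistics.stdev(history)
    return mean - sigmas * sd, mean + sigmas * sd

# Hypothetical daily conformance scores for one rule.
history = [0.96, 0.97, 0.95, 0.96, 0.98, 0.97, 0.96]
lower, upper = control_limits(history)

today = 0.88
if not (lower <= today <= upper):
    # In a real deployment this would alert the data steward
    # responsible for the upstream process.
    print(f"Exception: today's score {today:.2f} is outside [{lower:.2f}, {upper:.2f}]")
```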
Mechanics
A dashboard framework can be used to present the data quality scorecard. A data set's conformance to data quality rules is grouped within dimensions of data quality and presented at the highest level, with drill-down capability to explore each individual rule's contribution to each score.
By integrating the various data profiling and cleansing tools with the dashboard application, levels of acceptability can also be reviewed at lower levels of drill-down. This makes it possible to show the historical levels of conformance for a set of rules, the current state of validity at the rule level, and a drill-down to the specific records that did not meet expectations. A rough sketch of this roll-up and drill-down follows.
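In the sketch below, each rule is tagged with a data quality dimension (the dimension names and scores are hypothetical): rule scores are aggregated into dimension scores at the top level of the scorecard and can be filtered back down to rules, and from there to failing records, on demand.

```python
import pandas as pd

# Hypothetical rule-level results, each tagged with a data quality dimension.
rule_results = pd.DataFrame({
    "rule":        ["customer_id_not_null", "email_has_at_sign", "age_in_valid_range"],
    "dimension":   ["completeness",         "validity",          "validity"],
    "conformance": [0.98,                   0.91,                0.99],
})

# Top level of the scorecard: one score per data quality dimension.
dimension_scores = rule_results.groupby("dimension")["conformance"].mean()
print(dimension_scores)

# Drill-down: from a dimension to the rules that feed it ...
print(rule_results[rule_results["dimension"] == "validity"])

# ... and from a rule to the specific failing records, e.g. by reusing the
# predicates from the validation sketch above:
# failing_rows = sample[~rules["email_has_at_sign"](sample)]
```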
Participants in the data quality framework can quickly grasp the quality of enterprise data, understand the most critical data quality issues, and take action to eliminate the sources of poor data quality effectively.