Data Architecture Trends Part 1: How to Improve Data Quality
Improve the Quality of Your Data by Observing It Every Step of the Way
Poor Data Quality (DQ) is a nightmare for organisations, leading to many failed projects and the loss of millions in revenue.
In the development of the Modern Data Stack, there has been a clear use case for improving DQ by taking the onus off the humans involved. Enter Data Observability: the practice of observing the health of your data, a concept borrowed from application observability in the Software Engineering world.
In this first part of our deep dive into Data Architecture trends, we will focus on improving Data Quality using Data Observability measures.
Let's go!
So — What Is Data Observability?
Observe your data's health as it flows through various layers of your architecture.
Data Quality checks can be divided into two main categories: technical and functional. Technical checks usually ask three questions. How up-to-date is the data, i.e. did the last job run? Is the table schema still the same, i.e. did someone add a column and break the flow? Does the table have all its records, i.e. did some records get dropped in the pipeline?
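To make the three technical checks concrete, here is a minimal Python sketch. It is only an illustration, not how any particular observability tool works: the function names (`is_fresh`, `schema_drift`, `rows_missing`), the six-hour freshness threshold, and the sample column names and row counts are all assumptions made up for this example.

```python
from datetime import datetime, timedelta, timezone


def is_fresh(last_loaded_at, max_age, now=None):
    """Freshness check: did the last job run recently enough?"""
    now = now or datetime.now(timezone.utc)
    return now - last_loaded_at <= max_age


def schema_drift(expected_columns, actual_columns):
    """Schema check: report columns that were added or removed."""
    expected, actual = set(expected_columns), set(actual_columns)
    return {
        "added": sorted(actual - expected),
        "removed": sorted(expected - actual),
    }


def rows_missing(source_count, target_count):
    """Completeness check: how many records were dropped in the pipeline?"""
    return max(source_count - target_count, 0)


# Hypothetical values, for illustration only.
now = datetime(2024, 1, 2, 12, 0, tzinfo=timezone.utc)
last_run = datetime(2024, 1, 2, 9, 0, tzinfo=timezone.utc)

print(is_fresh(last_run, timedelta(hours=6), now=now))            # True
print(schema_drift(["id", "amount"], ["id", "amount", "notes"]))  # {'added': ['notes'], 'removed': []}
print(rows_missing(1000, 990))                                    # 10
```

In practice, the inputs would come from your warehouse's metadata (last load timestamps, column lists, row counts) rather than hard-coded values, and a failed check would raise an alert instead of printing.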