If you’re a senior IT leader, I’d like to make a prediction. You have faced a key data governance challenge at some time. Probably quite recently. In fact, there is a good chance that you’re facing one right now. I know this to be true, because clients approach us frequently with this exact issue.
However, it’s not a single issue. In fact, over time we have come to realise that data is a slippery term that means different things for different people. Which is why we felt that deeper investigation into the subject was needed, to gain clarity and understanding around this overloaded term and to establish how we can talk to clients who see data governance as a challenge.
So, what is data governance? And what motivates an organisation to be interested in it?
Through a series of surveys, discussions and our own experiences, we have come to the conclusion that client interest in data governance is motivated by the following wide range of reasons.
1. Data Security/Privacy
I want to be confident that I know the right measures are in place to secure my data assets and that we have the right protections in place.
2. Compliance – To meet industry requirements
I have specific regulations to meet (e.g. health, insurance, finance) such as:
- Storage – I need to store specific data items for specified periods of time (or I can only store for specific periods of time).
- Audit – I need to provide access to specified data for audit purposes.
- Data lineage/traceability – I have to be able to show where my data came from or why a decision was reached.
- Non-repudiation – I have to be able to demonstrate that the data has not been tampered with.
3. Data quality
My data is often of poor quality, it is missing data points, the values are often wrong, or out of date and now no-one trusts it. This is often seen in the context of central data teams charged with providing data to business functions such as operations, marketing etc. Sometimes data stewardship is mentioned as a means of addressing this.
4. Master/Reference Data Management
When I look at data about the same entities in different systems I get different answers.
5. Preparing my data for AI and automation
I am using machine learning and/or AI and I need to know why decisions are being made (as regulations around the use of AI and ML mature this is becoming more pressing – see for example https://ico.org.uk/for-organisations/guide-to-data-protection/key-data-protection-themes/explaining-decisions-made-with-ai/).
6. Data Access/Discovery
I want to make it easier for people to find data or re-use data – it’s difficult for our people to find and/or access data which would improve our business. I want to overcome my data silos. I want data consumers to be able to query data catalogues to find what they need.
7. Data Management
I want to know what data we have e.g. by compiling data dictionaries. I want more consistency about how we name data items. I want to employ schema management and versioning.
8. Data Strategy
I want to know what strategy I should take so my organisation can make better decisions using data. And how do I quantify the benefits?
9. Creating a data-driven organisation
I want to create an operating model so that my business can manage and gain value from its data.
I think it’s clear from this that there are many concerns covered by the term data governance. You probably recognise one, or maybe even several, as your own. So what do you need to do to overcome these? Well, now we understand the variety of concerns, we can start to address the approach to a solution.
Understanding Lean Data Governance
Whilst it can be tempting for clients to look for an off-the-shelf solution to meet their needs, in reality, they are too varied to be met by a single product. Especially as many of the concerns are integral to the data architecture. Take data lineage and quality as examples that need to be considered as you implement your data pipelines – you can’t easily bolt them on as an afterthought.
Here at Equal Experts, we advocate taking a lean approach to data governance – identify what you are trying to achieve and implement the measures needed to meet them.
The truth is, a large proportion of the concerns raised above can be met by following good practices when constructing and operating data architectures – the sorts of practices that are outlined in our Data Pipeline and Secure Delivery playbooks.
We have found that good data governance emerges by applying these practices as part of delivery. For example:
- Most Data security concerns can be met by proven approaches – taking care during environment provisioning, implementing role-based access control, implementing access monitoring and alerts and following the principles that security is continuous and collaborative.
- Many Data Quality issues can be addressed by implementing the right measures in your data pipelines – incorporating observability through the pipelines – enabling you to detect when changes happen in data flows; and/or pragmatically applying master and reference data so that there is consistency in data outputs.
- Challenges with data access and overcoming data silos are improved by constructing data pipelines with an architecture that supports wider access. For example our reference architecture includes data warehouses for storing curated data as well as landing zones which can be opened up to enable self-service for power data users. Many data warehouses include data cataloguing or data discovery tools to improve sharing.
- Compliance challenges are often primarily about data access and security (which we have just addressed above) or data retention which depends on your pipelines.
Of course, it is important that implementing these practices is given sufficient priority during the delivery. And it is critical that product owners and delivery leads ensure that they remain in focus. The tasks that lead to good Data Governance can get lost when faced with excessive demands for additional user features. In our experience this is a mistake, as deprioritising governance activities will lead to drops in data quality, resulting in a loss of trust in the data and in the end will significantly affect the user experience.
Is Data Governance the same as Information Governance?
Sometimes we also hear the term Information Governance. Information Governance usually refers to the legal framework around data. It defines what data needs to be protected and any processes (e.g. data audits), compliance activities or organisational structures that need to be in place. GDPR is an Information Government requirement – it specifies what everyone’s legal obligations are in respect of the data they hold, but it does not specify how to meet those obligations. Equal Experts does not create information governance policies, although we work with client information governance teams to design and implement the means to meet them.
The field of data governance is inherently complex, but I hope through this article you’ve been able to glean insights and understand some of the core tenets driving our approach.
These insights and much more are in our Data Pipeline and Secure Delivery playbooks. And, of course, we are keen to hear what you think Data Governance means. So please feel free to get in touch with your questions, comments or additions on the form below.