Thorben Louw

Data/Machine Learning Engineer
Data & AI

March 5, 2026

The human element in data quality and GenAI: Culture, ownership and change

For years, organisations managed with imperfect data foundations.  A dashboard might be wrong. A report might need correction. But we coped because we could contain the consequences: errors were usually localised and we could count on human judgement and experience to compensate.

Today, data quality isn’t just a concern for humans. Agentic AI use of data removes that safety buffer of capable humans who compensate for quality issues. Natural language interfaces expand self-serve access to data well beyond the specialist teams who understand the data’s quirks. Critical agentic workflows increasingly trigger downstream actions automatically. When data quality is bad in this environment, errors propagate faster and further.

Addressing organisational data quality is therefore no longer optional for leaders. Further pressure comes from regulations like the EU AI Act, which mandates measuring data quality and having traceability and accountability in AI systems.

Many leaders are tempted to treat this as just a tooling challenge, hoping to find a magic fix for organisational data quality by procuring the latest data quality tool. But at its heart, data quality is a cultural challenge.

Everyone’s job — but someone’s responsibility

Culture determines whether issues get surfaced and dealt with holistically, or silently worked around.

It’s tempting to start by proclaiming that data quality is “everyone’s job” now, and focus on making it possible for anyone to safely and easily raise an issue when problems are discovered.

However, “everyone’s job” too often means “no one’s responsibility”. In practice, critical datasets must have named owners and stewards who are clearly visible in the data catalogue and easy to contact. Ownership cannot be tribal knowledge, and owners need to know what they are responsible for.

In mature organisations, owners are incentivised to genuinely improve data quality, not just mask errors. They’re accountable for proactively measuring and reporting on the quality of the data they own, and for having a plan to address problems. This requires leaders to reward transparency rather than the hitting of arbitrary quality targets.

Start where you are

You don’t need a grand all-or-nothing Data Quality Programme to make real progress quickly. It can begin with pragmatic steps: publishing owners for the most business-critical datasets; introducing simple (often free, open-source) tooling for profiling and rule-based data quality tests to measure the current baseline; introducing visible issue logging; and then prioritising quality improvements for data feeding AI use cases.

First make quality visible

Measuring data quality can’t be a one-off or manual activity. Use modern tooling and engineering practices that allow continuous quality measurement, with repeatable checks running as part of data pipelines, so that data observability becomes a standard component of every data product.
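As a minimal sketch of what a repeatable, rule-based check inside a pipeline step might look like (the dataset, column names and thresholds below are hypothetical, and real pipelines would typically use a dedicated framework):

```python
# Sketch: rule-based quality checks run as a pipeline step.
# Dataset, column names and thresholds are illustrative only.

def check_completeness(rows, column):
    """Fraction of rows where `column` is present and non-empty."""
    if not rows:
        return 0.0
    ok = sum(1 for r in rows if r.get(column) not in (None, ""))
    return ok / len(rows)

def run_quality_gate(rows, rules):
    """Evaluate each rule and return (metrics, overall pass/fail).

    `rules` maps a column name to its minimum acceptable completeness.
    """
    metrics = {col: check_completeness(rows, col) for col in rules}
    passed = all(metrics[col] >= threshold for col, threshold in rules.items())
    return metrics, passed

# Example: two of three order records have a customer_id
orders = [
    {"order_id": 1, "customer_id": "C-17"},
    {"order_id": 2, "customer_id": None},
    {"order_id": 3, "customer_id": "C-42"},
]
metrics, passed = run_quality_gate(orders, {"order_id": 1.0, "customer_id": 0.9})
# customer_id completeness is ~0.67, below the 0.9 threshold, so the gate fails
```

Running a gate like this on every pipeline execution is what turns quality from a periodic audit into a continuously observed property of the data product.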

Explicitly “advertise” the measured state of quality for each data product in your data platform, even if it starts off looking bad (remembering that the culture should incentivise transparency!). Demand that every data product publishes its intended use, freshness expectations, known limitations and service levels into an accessible data catalogue. Data quality frameworks make it easy to also expose reports with additional, dataset-specific checks for inspection. Such transparent, simple quality indicators (almost like a “nutritional label” for datasets) enable consumers to make informed decisions about whether the data is suitable for their use case, rather than making unfounded assumptions based on aspirational data quality policies.
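The “nutritional label” idea can be as simple as a structured record published to the catalogue alongside the data product itself. The field names and values below are illustrative, not a standard:

```python
# Illustrative "nutritional label" for a data product, published to the
# data catalogue alongside the data itself. All field names and values
# are hypothetical.
from dataclasses import dataclass, asdict

@dataclass
class DataProductLabel:
    name: str
    owner: str                 # a named, contactable owner
    intended_use: str
    freshness_sla_hours: int   # how stale the data is allowed to become
    completeness: float        # latest measured completeness (0..1)
    known_limitations: str

label = DataProductLabel(
    name="orders_daily",
    owner="sales-data-team@example.com",
    intended_use="Daily sales reporting; not for real-time decisioning",
    freshness_sla_hours=24,
    completeness=0.97,
    known_limitations="Refunds before 2021 are not captured",
)
# asdict(label) yields a plain dict, ready to serialise and publish
```

The important design point is that measured values (like completeness) sit next to declared expectations (like the freshness SLA), so consumers can compare the two at a glance.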

Measuring and publishing quality indicators works well for traditional structured reporting data, but the ideas also apply to the unstructured datasets that are the backbone of GenAI applications, although what we measure differs. Building GenAI applications often means evaluating these systems against context-sensitive metrics like recency, truthfulness, relevance or accuracy, and here too, transparently publishing metrics helps consumers make informed choices about their use.
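Recency is one of the easier of these metrics to automate. As a sketch (the corpus, dates and staleness threshold are made up, and metrics like truthfulness or relevance usually need human or model-based evaluation, which isn’t shown), a freshness measure over a RAG document corpus might look like:

```python
# Sketch: measuring the recency of an unstructured document corpus that
# feeds a GenAI application. Documents and threshold are hypothetical.
from datetime import date

def recency_rate(docs, as_of, max_age_days):
    """Fraction of documents updated within `max_age_days` of `as_of`."""
    if not docs:
        return 0.0
    fresh = sum(
        1 for d in docs if (as_of - d["last_updated"]).days <= max_age_days
    )
    return fresh / len(docs)

corpus = [
    {"id": "policy-001", "last_updated": date(2026, 2, 20)},
    {"id": "policy-002", "last_updated": date(2024, 6, 1)},
    {"id": "faq-017", "last_updated": date(2026, 1, 5)},
]
rate = recency_rate(corpus, as_of=date(2026, 3, 1), max_age_days=365)
# Two of three documents were updated within the last year
```

Published alongside the corpus, a number like this tells consumers how stale the knowledge behind their GenAI answers may be.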

Encouraging visibility creates important discipline by “shifting quality left”. When teams transparently quantify quality by building in data observability from the start, they are incentivised to fix problems as they happen, rather than letting them stay hidden until someone notices later.

Encourage appropriate quality — and acknowledge its cost

This cultural shift, reframing quality as an explicit, measured characteristic of a product rather than a fuzzy implicit goal, also encourages pragmatic use of imperfect data. Not every data product needs to be top quality to be useful to someone today!

Transparency forces us to confront the fact that quality takes effort and cost, so a trade-off must be made. The objective is not maximising quality everywhere; it’s ensuring appropriate quality where it is needed. Financial reporting, regulatory submissions and datasets driving automated decisions clearly demand rigorous controls. Exploratory sentiment analytics on product reviews may not. Encourage teams to calibrate investment in quality improvements according to risk and use case.

Measure progress over time

Transparent platform-wide quality metrics are as important for leaders as individual data product measures are for data consumers. Track leading indicators like the proportion of critical datasets with named owners, defined SLAs and traceable lineage, or the time to detect and resolve incidents. Over time, reduced rework, fewer AI-related errors (such as hallucination rates in RAG applications) and greater stakeholder trust become visible outcomes.
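A leading indicator like ownership coverage can be computed directly from catalogue metadata. The catalogue entries and field names below are hypothetical:

```python
# Sketch: a platform-level leading indicator, the proportion of critical
# datasets that have both a named owner and a defined SLA.
# Catalogue entries and field names are illustrative only.

def ownership_coverage(catalogue):
    """Fraction of critical datasets with both an owner and an SLA."""
    critical = [d for d in catalogue if d.get("critical")]
    if not critical:
        return 0.0
    covered = sum(1 for d in critical if d.get("owner") and d.get("sla"))
    return covered / len(critical)

catalogue = [
    {"name": "orders", "critical": True, "owner": "sales-data", "sla": "24h"},
    {"name": "clickstream", "critical": True, "owner": None, "sla": None},
    {"name": "scratch_tmp", "critical": False, "owner": None, "sla": None},
]
coverage = ownership_coverage(catalogue)  # 1 of 2 critical datasets covered
```

Trended over time on a leadership dashboard, a simple ratio like this shows whether the ownership culture is actually taking hold.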

In the GenAI era, data quality is the bedrock of operational resilience, not a background hygiene factor. The organisations that treat quality as a cultural discipline will be the ones able to scale AI use of data and drive real business value with confidence.

About the author

Thorben is a data and software engineering specialist with over 18 years’ experience delivering scalable, pragmatic data and machine learning products across cloud platforms. He helps teams adopt modern data practices to rapidly build, test, and deliver value from their data.

