HM Revenue & Customs

Eliminating cyber crime with event-driven architecture

Using events to better predict and prevent financial fraud with Her Majesty’s Revenue and Customs.

In the United Kingdom alone, cybercrime causes billions of pounds’ worth of damage each year. Beyond the immediate economic impact, which continues to grow, online fraud creates significant distress for individuals, as well as a massive resourcing strain on both the public and private sectors.

As cybercrime continues to rise in scale and sophistication around the world, the situation is only getting worse. The only valid response is to explore new and improved ways to predict, preempt, and protect against fraud. Learn how Her Majesty’s Revenue and Customs (HMRC) is leveraging event-driven architecture, data pipelines, and big data processing to manage and mitigate the threat of cyber fraud.

This case study will help you to understand:

How legacy events, stored in data lakes, can be used for evolving organisational needs.

The importance of data pipelines in storing and managing information for large organisations.

The value of event-driven architecture in predicting and preventing fraud in real-time.

01

About HMRC and the Customer Insights Platform

HMRC is the UK’s tax, payments and customs authority.

The organisation performs a range of sophisticated, vital functions, but its primary roles can be summarised as:

  • Collecting money that pays for the UK’s public services and infrastructure
  • Supporting disadvantaged families and individuals with targeted financial support
  • Helping the honest majority to make accurate and valid tax submissions
  • Preventing the dishonest minority—cyber criminals—from cheating the system for illegal financial gains

In performing these roles, HMRC is dedicated to digital transformation with a view towards ‘making tax easy online’.

Each year, HMRC serves over 50 million business and individual customers while generating hundreds of billions of pounds in revenue.

To support and facilitate this digital activity, HMRC relies on:

  • The Multi-channel Digital Tax Platform (MDTP), a cloud platform hosted on Amazon Web Services. The MDTP is home to HMRC’s online self-service tax applications: 130 digital services composed of more than 950 decoupled microservices. Learn more about cloud-based platforms in our Digital Platforms playbook.
  • The Customer Insights Platform (CIP). The CIP performs a protective, legal function by collating and collecting customer data related to interactions that occur within the MDTP. This data is primarily captured through digital channels via web-facing applications like self-assessment, VAT filing and more.

Key figures:

  • 300TB of data monitored to date.
  • 9.6 billion transactions audited in January 2021.
  • 450 million transactions audited per day.
  • 800+ services monitored and audited.

02

How events provide complete, real-time visibility of customers

With sophisticated customer journeys spanning multiple tax applications and departments, real-time visibility of user behaviour is invaluable.

Using an event-driven architecture, the CIP monitors every interaction within the Tax Platform to establish detailed user profiles. The monitoring is designed to ensure compliance across legitimate submissions—and the best possible user experience for genuine users—while protecting against prospective instances of fraud and identity theft.

The monitoring occurs in two tiers:

  1. Gathering general user metadata and information generated through historical use of the MDTP (for example, the devices a person uses to access the platform, their IP addresses, and so on).
  2. Using event-driven processing to plot sophisticated customer journeys in real time (for example, diverting people through different pathways based on certain behaviours exhibited while they attempt to complete a tax submission).
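
To make the two tiers concrete, here is a minimal sketch in Python. It is not HMRC's implementation: the `Event` fields, the `UserProfile` shape, and the pathway names are all hypothetical and chosen only for illustration. The first tier folds historical events into a per-user profile, and the second uses that profile to choose a pathway for a live event.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass(frozen=True)
class Event:
    """One audited interaction. All field names are hypothetical."""
    user_id: str
    event_type: str      # e.g. "login_attempt", "page_view", "submission"
    device_id: str
    ip_address: str
    occurred_at: datetime


@dataclass
class UserProfile:
    """Tier 1: metadata accumulated from historical use of the platform."""
    user_id: str
    known_devices: set = field(default_factory=set)
    known_ips: set = field(default_factory=set)
    event_count: int = 0

    def absorb(self, event: Event) -> None:
        self.known_devices.add(event.device_id)
        self.known_ips.add(event.ip_address)
        self.event_count += 1


def build_profiles(history):
    """Fold a sequence of past events into one profile per user."""
    profiles = {}
    for event in history:
        profile = profiles.setdefault(event.user_id, UserProfile(event.user_id))
        profile.absorb(event)
    return profiles


def choose_pathway(profile, event):
    """Tier 2: pick a journey for a live event using the historical profile."""
    if profile is None or event.device_id not in profile.known_devices:
        return "extra_verification"   # unfamiliar device (hypothetical pathway)
    return "standard"


# Example: a known user appears on a device the platform has not seen before.
t = datetime(2021, 1, 4, 9, 0, tzinfo=timezone.utc)
history = [Event("user-a", "login_attempt", "laptop-1", "203.0.113.5", t)]
profiles = build_profiles(history)
live = Event("user-a", "submission", "phone-9", "203.0.113.5", t)
print(choose_pathway(profiles.get("user-a"), live))  # extra_verification
```

The point of the split is that the slow-moving profile can be rebuilt from stored events at any time, while the pathway decision only needs a cheap lookup at the moment an event arrives.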

A consistent and detailed view of all customers, for all key stakeholders

Every customer interaction is tracked and audited as an event: from attempting a login or clicking on a content page to submitting a self-assessment. These comprehensive user profiles can be surfaced throughout the organisation to provide continuity and a single, up-to-date source of truth for:

  • Case Workers

    Prioritise cases using analytics generated from event metadata, and interactively explore events on a case-by-case basis to reach investigative outcomes.

  • Customer service teams

    Use events for performance analytics and understanding customer journeys to improve their service.

  • Call centres

    Get a view of a caller’s previous web journey so they can see, for example, which page of a tax assessment the user is stuck on.

  • Solicitors

    Use customer journey event data as an audit log for legal purposes when investigating and prosecuting complaints.

  • Finance teams

    Use events for BI reporting, such as the number of tax submissions or fraudulent repayments blocked.
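
As a rough illustration of how one event stream can serve several of these audiences, the sketch below derives a call-centre view and a reporting view from the same list of events. The event fields and page names are invented for the example and do not reflect HMRC's actual schema.

```python
from collections import Counter

# Hypothetical events as plain dicts, ordered by time of occurrence.
events = [
    {"user_id": "user-a", "type": "page_view", "page": "sa/income"},
    {"user_id": "user-a", "type": "page_view", "page": "sa/expenses"},
    {"user_id": "user-b", "type": "submission", "page": "vat/confirm"},
    {"user_id": "user-a", "type": "submission", "page": "sa/confirm"},
    {"user_id": "user-b", "type": "page_view", "page": "vat/start"},
]


def last_page_viewed(events, user_id):
    """Call-centre view: the most recent page a caller reached."""
    pages = [e["page"] for e in events
             if e["user_id"] == user_id and e["type"] == "page_view"]
    return pages[-1] if pages else None


def count_by_type(events):
    """Reporting view: how many events of each type were recorded."""
    return Counter(e["type"] for e in events)


print(last_page_viewed(events, "user-a"))   # sa/expenses
print(count_by_type(events)["submission"])  # 2
```

Neither view changes the underlying events, which is what allows every team to read from the same source of truth.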

The profiles offer invaluable context for departments throughout HMRC and create significant efficiencies by eliminating double handling of information. They also trigger tailored user journeys when certain behaviours are identified through a combination of event processing and metadata.

Capturing events is always useful, even before you determine meaningful use-cases.

Since its inception, HMRC’s Multi-channel Digital Tax Platform has collected events and placed them on a messaging queue, even before it had a native event-processing tool.

With the implementation of Apache Kafka in 2017, the CIP is now able to push data into a batched analytical data lake.

As a result, the CIP preserves the notion of markable events within the data lake, while leveraging a range of other tools to perform big data processing functions across those captured events.

This approach is only possible because events were captured and stored before the CIP was able to use them for real-time processing. It creates two crucial benefits:

  1. With Kafka essentially functioning as a data pipeline, the data lake can be used for analytical, big-data processing thanks to the breadth of information captured as markable events. This information can be used to surface customer profiles based on legacy interactions and metadata generated through the Tax Platform. Learn more about data pipelines in our Data Pipeline playbook.
  2. The information can be used for real-time event processing, which is critical in identifying and blocking fraudulent transactions before they can occur.
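
The batch side of this pipeline can be sketched as follows. This is a simplified stand-in written in plain Python, not Kafka or HMRC code: the date-based path layout, the batch size, and the event fields are assumptions made for the example. It groups a stream of events into date partitions and writes each partition as small newline-delimited JSON batches, keeping every event intact as its own record.

```python
import json
from collections import defaultdict
from datetime import datetime


def partition_key(event):
    """Derive a date-based prefix from the event timestamp (assumed layout)."""
    ts = datetime.fromisoformat(event["occurred_at"])
    return f"events/year={ts.year:04d}/month={ts.month:02d}/day={ts.day:02d}"


def batch_events(events, max_batch_size=2):
    """Group events by partition, then split each partition into batches.

    Returns a dict mapping an object path to newline-delimited JSON content.
    """
    by_partition = defaultdict(list)
    for event in events:
        by_partition[partition_key(event)].append(event)

    objects = {}
    for prefix, items in by_partition.items():
        for start in range(0, len(items), max_batch_size):
            chunk = items[start:start + max_batch_size]
            path = f"{prefix}/batch-{start // max_batch_size:05d}.jsonl"
            objects[path] = "\n".join(json.dumps(e, sort_keys=True) for e in chunk)
    return objects


stream = [
    {"user_id": "user-a", "type": "login_attempt", "occurred_at": "2021-01-04T09:00:00+00:00"},
    {"user_id": "user-b", "type": "page_view", "occurred_at": "2021-01-04T09:01:00+00:00"},
    {"user_id": "user-a", "type": "submission", "occurred_at": "2021-01-04T09:05:00+00:00"},
    {"user_id": "user-c", "type": "login_attempt", "occurred_at": "2021-01-05T08:00:00+00:00"},
]
for path in sorted(batch_events(stream)):
    print(path)
```

Because each line in a batch is still a complete event, the same stored data can later be replayed for analytics or for building user profiles.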

[Diagram: a user’s journey shown as a line that intersects points representing events]

Using events and machine learning for behavioural analysis, HMRC plots dynamic, real-time user journeys based on a person’s historical context and current activity. Certain actions by ‘User B’ may trigger additional monitoring or automated security protocols, for example, and notify risk assessment or fraud detection teams.

The CIP is fed from a microservices-based architecture, running in Amazon Web Services (AWS). The platform leverages Kafka Connect to feed into S3 (Amazon’s Simple Storage Service), which facilitates the transition of information from Kafka to the data lake. A range of big data processing tools perform analytical functions on the information stored within the lake.
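
For readers unfamiliar with Kafka Connect, the sketch below builds the kind of connector definition that moves a topic into S3. It assumes the Confluent S3 sink connector, which the case study does not name, and every value (connector name, topic, bucket, region, flush size) is a placeholder rather than HMRC's configuration.

```python
import json


def s3_sink_connector_config(name, topics, bucket, region, flush_size=1000):
    """Build a Kafka Connect connector definition as a JSON-serialisable dict.

    Assumes the Confluent S3 sink connector. All argument values are
    placeholders supplied by the caller.
    """
    return {
        "name": name,
        "config": {
            "connector.class": "io.confluent.connect.s3.S3SinkConnector",
            "tasks.max": "1",
            "topics": ",".join(topics),
            "s3.bucket.name": bucket,
            "s3.region": region,
            "storage.class": "io.confluent.connect.s3.storage.S3Storage",
            "format.class": "io.confluent.connect.s3.format.json.JsonFormat",
            # Number of records written to each S3 object before it is committed.
            "flush.size": str(flush_size),
        },
    }


definition = s3_sink_connector_config(
    name="example-events-to-s3",
    topics=["example-audit-events"],
    bucket="example-data-lake",
    region="eu-west-2",
)
print(json.dumps(definition, indent=2))
```

A definition like this is typically submitted to the Kafka Connect REST API with a `POST` to `/connectors`, after which the connector writes batches of records to the bucket without any custom consumer code.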

One example is Apache Hudi, an open-source library that adds a transactional storage layer to the data lake and allows the data to be processed with Apache Spark. This configuration enables incremental-style event processing, which creates two key benefits:

  1. This approach allows the CIP to preserve the informational and conceptual structure of events within the data lake.
  2. In turn, this provides far greater flexibility and specificity in analysing targeted datasets, rather than treating all information as one general set of data.
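
The idea behind incremental processing can be shown without Spark or Hudi. The sketch below is plain Python and does not use Hudi's API: the `commit_time` field, its timestamp format, and the record layout are assumptions for illustration. Each run reads only the records committed since the previous checkpoint, optionally narrowed to one type of event.

```python
def read_incremental(records, since_commit, event_type=None):
    """Return records committed after `since_commit` and the new checkpoint.

    `commit_time` is assumed to be a sortable string such as "20210104090000".
    If `event_type` is given, only records of that type are returned, but the
    checkpoint still advances past every new commit that was scanned.
    """
    fresh = [r for r in records if r["commit_time"] > since_commit]
    checkpoint = max((r["commit_time"] for r in fresh), default=since_commit)
    if event_type is not None:
        fresh = [r for r in fresh if r["type"] == event_type]
    return fresh, checkpoint


lake = [
    {"commit_time": "20210104090000", "type": "login_attempt", "user_id": "user-a"},
    {"commit_time": "20210104100000", "type": "submission", "user_id": "user-a"},
    {"commit_time": "20210104110000", "type": "login_attempt", "user_id": "user-b"},
]

# First run starts from the beginning; the second run sees nothing new.
batch, checkpoint = read_incremental(lake, since_commit="0")
print(len(batch), checkpoint)        # 3 20210104110000
batch, checkpoint = read_incremental(lake, since_commit=checkpoint)
print(len(batch), checkpoint)        # 0 20210104110000
```

Filtering by event type at read time is what makes it possible to analyse a targeted dataset, such as login attempts only, instead of reprocessing the whole lake.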

03

Using events to predict and prevent fraud in real-time

When it comes to digital crime, the best defence is undoubtedly predictive prevention.

Once a transaction is processed, it is incredibly difficult to recover the funds. Stopping illegal transactions before they are processed is crucial.

Event-processing plays a vital role in the CIP’s ability to conduct behavioural analysis and identify potentially fraudulent activity in real-time.

For example, credential stuffing and other criminal practices can be detected within seconds of the very first attempts being made. Once a problematic account, transaction, or behaviour is identified, HMRC can trigger a vast array of corrective measures. These range from increased monitoring throughout a user’s journey to completely blocking their account.

Let’s consider a detailed practical use case.

Identifying fraud, instantly

Among many other things, the CIP is configured to monitor for events that signify multiple login attempts for different users from the same device.

Through real-time event-processing the CIP will provide immediate visibility of this behaviour. Rather than take a singular or definitive course of action, we monitor for additional events to establish more clarity around the user and divert them through different journeys based on a profile of behaviours.

Fraud detection requires nuance and sophistication to ensure we don’t penalise legitimate users. Multiple login attempts on a single device are common practice for accountants working on behalf of a range of clients, for example.
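
A rule of this kind can be sketched as a sliding-window check. The code below is an illustration only: the class name, the five-minute window, and the threshold of three distinct users are assumptions, not HMRC's actual rules. It raises a flag when one device attempts logins for more distinct users than the threshold within the window, and the flag is treated as a signal for further monitoring, not as a verdict.

```python
from collections import defaultdict, deque


class SharedDeviceDetector:
    """Flag devices that attempt logins for many distinct users in a short window.

    The window length and threshold are illustrative values.
    Timestamps are seconds and are assumed to arrive in order for each device.
    """

    def __init__(self, window_seconds=300, max_distinct_users=3):
        self.window_seconds = window_seconds
        self.max_distinct_users = max_distinct_users
        self._attempts = defaultdict(deque)   # device_id -> (timestamp, user_id)

    def observe(self, device_id, user_id, timestamp):
        """Record one login attempt; return True if the device is flagged."""
        attempts = self._attempts[device_id]
        attempts.append((timestamp, user_id))
        # Drop attempts that have fallen out of the sliding window.
        while attempts and attempts[0][0] <= timestamp - self.window_seconds:
            attempts.popleft()
        distinct_users = {user for _, user in attempts}
        return len(distinct_users) > self.max_distinct_users


detector = SharedDeviceDetector()
flags = [
    detector.observe("device-1", user, ts)
    for ts, user in [(0, "user-a"), (10, "user-b"), (20, "user-c"), (30, "user-d")]
]
print(flags)  # [False, False, False, True]
```

Because the check runs on each event as it arrives, the fourth attempt is flagged immediately instead of being discovered in a later batch report.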

In this example, an event might query an API to serve additional security questions as part of the sign-in process. Alternatively, if the bank account, IP address, or device has a historical record of criminal activity or red-flag behaviours in the platform—as identified in the metadata associated with that user’s profile—the submission may be blocked entirely.
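
The branching described above might be expressed as a small decision function. This is a hedged sketch: the `Action` names, the `known_bad` set, and the order of the checks are assumptions for illustration and not the CIP's actual logic.

```python
from enum import Enum


class Action(Enum):
    ALLOW = "allow"
    CHALLENGE = "challenge"   # e.g. serve additional security questions
    BLOCK = "block"


def decide(flagged_device, identifiers, known_bad):
    """Choose a response for a sign-in or submission.

    `identifiers` holds the bank account, IP address, and device ID seen on
    the request. `known_bad` is a hypothetical set of identifiers that have a
    historical record of criminal activity or red-flag behaviour.
    """
    if any(identifier in known_bad for identifier in identifiers):
        return Action.BLOCK
    if flagged_device:
        return Action.CHALLENGE
    return Action.ALLOW


known_bad = {"ip:198.51.100.7"}
print(decide(True, ["device:device-1", "ip:203.0.113.5"], known_bad))    # Action.CHALLENGE
print(decide(True, ["device:device-1", "ip:198.51.100.7"], known_bad))   # Action.BLOCK
print(decide(False, ["device:laptop-1", "ip:203.0.113.5"], known_bad))   # Action.ALLOW
```

Keeping the challenge step separate from the block step is what lets a legitimate accountant, who logs in for many clients from one device, pass through with extra verification instead of being locked out.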

Thanks to event-driven architecture and detailed user profiles generated from historical interactions with the platform, HMRC can decide which processes to adopt or alter based on a real-time portrait of each customer.

The result? Improved experiences for legitimate users, and far more effective protection against would-be criminals.

04

About the tech stack

The technical infrastructure of the Customer Insights Platform has evolved over time.

Using an emergent design approach, the team has been able to flexibly build in new capabilities, integrations, and ancillary services to meet evolving needs quickly. Over time, some of these solutions and third-party integrations have included:

[Image: logos of companies whose products were used in the development of the platform, including AWS and Kafka]