Sharing and Learning at the Technology Leaders Network: Enterprise Data Hub

This was originally posted on the Government technology blog

The Technology Leaders Network was established in 2013 to help departments work collaboratively, share good practice and drive the technology agenda across government. Over the summer we have been working on plans to breathe new life into the Network, looking at how we can make the most of bringing Technology Leaders together.

Traditionally Tech Leaders have met for monthly board meetings with a restricted attendee list and a formal set of minutes circulated afterwards. We will continue to have those meetings where specific decisions are needed, but want to make the Network more transparent and open it up to the wider pool of technologists we have across government. We’re going out to departments more, learning from external speakers and seeing demonstrations of tools in action.

In September we held the first session since our refresh: a live demo of the HMRC Enterprise Data Hub (EDH). Nigel Green, Director of IT Delivery at HMRC, describes the hub below.

The hub of all knowledge

Can you imagine storing and managing over 10 million filing cabinets’ worth of customer data? Then working out how to keep that data relevant, correct and usable? It’s a bit of a conundrum, but since 2014 we’ve been working on a solution called the Enterprise Data Hub (EDH) and we’ve just reached a major milestone in its development.

A bit of background

At the moment our customer data is spread across 11 separate data warehouses, with some information only available to parts of HMRC, so cross-referencing can be tricky.

EDH will bring all our customer data into one usable place. It’ll let us use technological advances to store and analyse customer data using freely available open source tools and commodity hardware. It will save money and give us new ways of interrogating data. We’ll be able to work smarter, quicker and ultimately increase our tax revenues.

Finding a solution

Our technical solution is Apache Hadoop, an open-source software framework for storage and large-scale processing of data sets on clusters of commodity hardware. Instead of relying on expensive, proprietary hardware and disparate systems to store and process data, it enables distributed parallel processing of huge amounts of data across relatively inexpensive, industry-standard servers. These servers both store and process the data, and the cluster can scale to accommodate very large data volumes.

It can handle all types of data, including structured, unstructured, log files, pictures, audio files, communications records and email.
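To give a feel for the distributed processing model described above, here is a minimal sketch of the map, shuffle and reduce pattern in plain Python. It does not use the Hadoop API and is not HMRC's code: the record layout, the `type` field and the function names are all assumptions made for illustration, and the two "partitions" simply stand in for data held on separate servers.

```python
from collections import defaultdict


def map_partition(records):
    """Map step: runs independently on each node's share of the data.

    Emits a (key, 1) pair for every record in the partition.
    """
    return [(record["type"], 1) for record in records]


def shuffle(mapped_outputs):
    """Shuffle step: group all emitted values by key across partitions."""
    grouped = defaultdict(list)
    for output in mapped_outputs:
        for key, value in output:
            grouped[key].append(value)
    return grouped


def reduce_counts(grouped):
    """Reduce step: combine the values for each key into a single total."""
    return {key: sum(values) for key, values in grouped.items()}


# Hypothetical data: each inner list stands in for records stored on one server.
partitions = [
    [{"type": "email"}, {"type": "log"}],
    [{"type": "log"}, {"type": "audio"}, {"type": "log"}],
]

# In a real cluster the map step would run in parallel on each server;
# here it is run in a simple loop to keep the sketch self-contained.
mapped = [map_partition(partition) for partition in partitions]
counts = reduce_counts(shuffle(mapped))
print(counts)
```

Because each map step only touches its own partition, adding more servers lets the same job cover more data without changing the logic.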

Maintaining security

To protect our customers’ data we’ve incorporated a world-leading approach to a technique called ‘Tokenisation’.

Tokenisation is a reversible method for replacing sensitive data with non-sensitive ‘tokens’. It provides similar security benefits to encryption, but retains the vital usability of data for our business processes. This gives us unprecedented capability to securely manage customer information from a single point of control, and to tightly control access to de-tokenised data, giving greater protection to vulnerable or at-risk customer groups.
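As a rough illustration of the general idea, the sketch below shows a simple vault-style tokeniser in Python. It is not HMRC's technique, whose details are not described here: the `TokenVault` class, the `tok_` prefix, the `authorised` flag and the sample value are all hypothetical, and a real system would use a hardened, persistent store with proper access control rather than in-memory dictionaries.

```python
import secrets


class TokenVault:
    """Hypothetical in-memory token vault, for illustration only."""

    def __init__(self):
        self._token_by_value = {}
        self._value_by_token = {}

    def tokenise(self, value: str) -> str:
        """Replace a sensitive value with a random, non-sensitive token."""
        # Reuse an existing token so the same value always maps to the same
        # token, which keeps matching and joining on tokenised data possible.
        if value in self._token_by_value:
            return self._token_by_value[value]
        token = "tok_" + secrets.token_hex(8)
        # Regenerate in the very unlikely event of a collision.
        while token in self._value_by_token:
            token = "tok_" + secrets.token_hex(8)
        self._token_by_value[value] = token
        self._value_by_token[token] = value
        return token

    def detokenise(self, token: str, authorised: bool = False) -> str:
        """Reverse the mapping, but only for authorised callers."""
        if not authorised:
            raise PermissionError("caller is not authorised to de-tokenise")
        return self._value_by_token[token]


vault = TokenVault()
token = vault.tokenise("AB123456C")  # made-up sample identifier
original = vault.detokenise(token, authorised=True)
```

The token carries no information about the original value, so analysts can work with tokenised records while access to the reverse lookup is controlled in one place.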

This tokenisation technique is a world first, and we know that a number of banks are very interested in what HMRC are doing, and want to use a similar solution themselves.

What is the major milestone?

Well, we’ve just reached the point where we can securely upload customer data from anywhere within the department. This may not sound like much, but given the complexity of how our data was stored before and our determination to develop a secure approach, it’s a massive step forward.

This means we can move on to the important tasks of migrating data and services and using EDH for our key transformation projects. We can start bringing data tools together across HMRC and identify future opportunities to analyse and use our customer data more intelligently. We’ll be able to combine data and customer analysis to improve services and customer experience.

And on a practical level we’ll start making significant savings as we decommission each of the data warehouses.

Transforming our analytical capability will be a game changer for HMRC’s digital ambition and for achieving our revenue generation objective. There’s also potential for cross-government transparency with UK economic benefits and shared use of cloud storage. The possibilities are endless.

Stay up to date

You can find out more about the work being done by HMRC by following their blog, HMRC digital.

We’ll make sure we blog about the work of the Tech Leaders Network in the future and showcase opportunities to get involved.

Nigel Green is Director of IT Delivery at HMRC
