#148: Legacy Data - why you should be thinking about it

Over the last few episodes, I've focused on legacy software - what it is, how it occurs, and various strategies to deal with it.

Alongside that legacy software is the legacy data - which is arguably more important than the software.

In this episode, I want to talk about why thinking about that legacy data is so important.

Published: Wed, 07 Sep 2022 15:56:57 GMT


Hello and welcome back to the Better ROI from Software Development Podcast.

Over the last few weeks, I've been running a mini-series looking at Legacy Software - what it is, how it occurs, and the various strategies to deal with it.

I want to add on to that mini-series by looking at Legacy Data. In this episode, I want to talk about why thinking about legacy data is so important, and next week I'll provide some advice on how to deal with it.

While we may understand the need to fix legacy systems, we often overlook the data.

When I've talked about approaches to fixing legacy application software, I've talked about:

  • Evolution - the continuous improvement to the existing system to bring it out of its legacy state
  • Revolution - the wholesale replacement of the system with something new
  • And Outsourcing - using a third party to either maintain or replace the legacy system.

But what happens to the data we have accrued? What happens to that?

Depending on which approach you've taken to fix your legacy system, the likelihood is that you will have to migrate the data in some way - be it to transform it in some manner or move it in its entirety.

The importance of that data to be migrated can depend on many factors, including the content, the age and a wide variety of needs of your organisation, your customers, or even regulatory bodies.

I'd argue that in many cases the data is more important than the actual software. We often have considerable value locked away in that data.

We want to have confidence in that data before, during and after any work to resolve any legacy software problems. We want to ensure that we are not losing, exposing or corrupting valuable data to the organisation, its customers or regulatory bodies. We don't want to:

  • duplicate account balances
  • lose order details
  • lose customer details
  • or expose organisation or customer sensitive data.

Doing so can result in material impact to the organisation - from unhappy customers to financial irregularities to substantial fines for insufficient security.

For the most critical data, I would suggest you start with how you will validate it after any changes are made. And you should do this either before, or as one of the first stages of, any work on the legacy software.

And if the data is something you will expect to be audited, especially if financial or part of some regulatory governance, involve the auditors before you start. It surprises me greatly that organisations fail to involve auditors in this kind of activity and are then surprised when an auditor pulls them up for not doing it correctly.

Most auditors will want to see evidence that the data remained valid through any form of migration activity. Some activities may be as simple as a before and after reconciliation of balances. Other activities may require evidence that sensitive data remained secure at all stages, and that any chance of exposure was negated.
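To make the idea of a before and after reconciliation a little more concrete, here is a minimal sketch in Python. It is purely illustrative: the function name `reconcile_balances`, the account identifiers and the balances are all hypothetical, and it assumes you can export balances per account from both the old and the new system. What an auditor will actually accept as evidence is something to agree with them up front.

```python
from decimal import Decimal


def reconcile_balances(before, after):
    """Compare per-account balances taken before and after a migration.

    Both arguments are dicts of account id -> Decimal balance
    (a hypothetical export format, assumed for this sketch).
    """
    common = set(before) & set(after)
    return {
        # Accounts that existed before but are absent afterwards
        "missing": sorted(set(before) - set(after)),
        # Accounts that appear only after the migration
        "unexpected": sorted(set(after) - set(before)),
        # Accounts present in both, but with a different balance
        "mismatched": sorted(a for a in common if before[a] != after[a]),
        # Control totals for a quick overall comparison
        "total_before": sum(before.values(), Decimal("0")),
        "total_after": sum(after.values(), Decimal("0")),
    }


# Made-up sample data: one lost account, one doubled balance, one new account
before = {
    "ACC-1": Decimal("100.00"),
    "ACC-2": Decimal("250.50"),
    "ACC-3": Decimal("0.00"),
}
after = {
    "ACC-1": Decimal("100.00"),
    "ACC-2": Decimal("501.00"),
    "ACC-4": Decimal("10.00"),
}

report = reconcile_balances(before, after)
for key, value in report.items():
    print(f"{key}: {value}")
```

Using `Decimal` rather than floating point numbers is a deliberate choice here, so that monetary values compare exactly. The printed report is the kind of output you might keep alongside the migration records as evidence.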

Again, I can only repeat, involve the auditor before you make any change. Finding that an audit fails for lack of evidence can have serious ramifications to an organisation.

So this raises the question, do you know how important the data is within those legacy systems?

Now this is obviously going to vary wildly from system to system and organisation to organisation.

Thus, if you're not sure, then I would recommend building a risk matrix similar to the one I introduced in episode 143 for understanding the software systems within your organisation and just how legacy they are.

Again, this can be a simple spreadsheet of the types of data held along with an indicative score across a number of categories - for example:

  • Value to your organisation
  • Value to your customer
  • Sensitivity - such as if it contains PII data or would fall under GDPR
  • Or will the data be needed as evidence for any audit or investigation?
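If a spreadsheet isn't to hand, the same matrix can be sketched in a few lines of Python. This is only an illustration under my own assumptions: the `DataRisk` class, the 1 to 5 scoring scale, the simple unweighted total and the sample entries are all hypothetical, and you may well want different categories or weightings for your organisation.

```python
from dataclasses import dataclass


@dataclass
class DataRisk:
    """One row of the matrix; scores run from 1 (low) to 5 (high)."""
    name: str
    org_value: int       # Value to your organisation
    customer_value: int  # Value to your customer
    sensitivity: int     # For example PII, or data falling under GDPR
    audit_need: int      # Needed as evidence for audit or investigation

    def total(self) -> int:
        # Simple unweighted sum; an assumption made for this sketch
        return (self.org_value + self.customer_value
                + self.sensitivity + self.audit_need)


def rank(entries):
    """Return the entries ordered from highest to lowest total score."""
    return sorted(entries, key=lambda e: e.total(), reverse=True)


# Made-up sample rows
entries = [
    DataRisk("Account balances", 5, 5, 4, 5),
    DataRisk("Marketing survey answers", 1, 1, 2, 1),
    DataRisk("Customer contact details", 4, 4, 5, 3),
]

ranked = rank(entries)
for entry in ranked:
    print(f"{entry.total():>2}  {entry.name}")
```

The highest scoring rows are the ones to validate most carefully during any migration; the lowest scoring rows are candidates for asking whether the data needs to be kept at all.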

This can be an interesting exercise to do regardless of any plans to migrate the data. Again, like the risk matrix for legacy software, it can be an eye-opening experience.

In some cases it can highlight data which simply has no value to the organisation. Many organisations in years past will have collected data on the off-chance it may be useful in the future. I've certainly heard many justifications from sales or marketing that some piece of data would be really helpful if we collect it - but then it never gets used.

In those situations, I would generally consider deleting it. We should no longer consider data an asset. We should consider it a liability - and as such, it has to justify its reason for existence. Simply having data because we might need it is opening the organisation to increased security risks, increased costs and increased complexity. Removing it helps to reduce those. It helps to reduce the noise to signal ratio and allows us to see the wood for the trees.

And where the data does have value, then this approach can help define guardrails around how it should be managed and worked with. And again, what those guardrails look like will vary between organisations and with the value of that piece of data to the organisation. It is very much a case of one size does not fit all, but it should be in line with how you want to validate that any data transformation has been performed successfully.

In this episode, I wanted to encourage you to consider the importance of any legacy data when working with legacy software.

Data is critical to the success of any organisation. Having the right data, which is correct and accurate, is table stakes for running a modern organisation. Without the appropriate level of discipline and rigour over our organisational data, we open ourselves up to exposing sensitive data, failing to meet audit requirements for financial accounting and regulatory activities, and simply being unable to know what is happening in our organisation.

Thus, while data should be considered critical in the normal day to day, anything that is likely to change, transform or move it should be approached with additional rigour.

We should make sure in advance that we have the appropriate guardrails and the audit processes in place to ensure success. To ensure that we don't:

  • duplicate account balances
  • lose order details
  • lose customer details
  • or expose organisation or customer sensitive data.

In next week's episode, I want to take a look at some practical advice when migrating data.

Thank you for taking your time to listen to this episode. I look forward to speaking to you again next week.