Virtualization Technology News and Information
What to Know Before Migrating Bad Data

Many organizations are working on moving their data to the cloud, which may seem like a 1:1 process, but that's certainly not the case. For starters, it depends on the format and application you were using in the past.

You may need to convert or translate the data for more modern use cases. Assuming a 1:1 process also assumes that all data is totally accurate and in its appropriate format.

Unfortunately, nearly all data suffers from some form of corruption or inconsistency, and the resulting information is often referred to as bad data.

What Is Bad Data?

Experienced professionals and data experts alike should already know the definition of bad data. For those who don't, the term refers to any kind of information or dataset that is missing details, in the wrong format, duplicated unnecessarily or even rife with errors.

What some may not know is what goes into cleaning bad data, which must be done before it can be migrated to a new system or platform. The same is true when migrating to the cloud, because you want the resulting data to be available for use by any other systems and applications that have access to it.

To clean data, organizations and teams must first agree on a standard for how their information should be organized. This doesn't just affect existing data - it should also cover data-entry processes for the collection and use of new information.
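
As an illustration, such a standard can be expressed as a small set of normalization rules applied both to existing records and at the point of entry. The field names and formats below are hypothetical, not a prescribed schema:

```python
import re
from datetime import datetime

def normalize_record(record):
    """Apply a shared standard to one record (hypothetical customer schema)."""
    clean = {}
    # Names: trim whitespace and use consistent capitalization.
    clean["name"] = record.get("name", "").strip().title()
    # Phone numbers: keep digits only, so "(555) 010-2000" and
    # "555.010.2000" end up stored the same way.
    clean["phone"] = re.sub(r"\D", "", record.get("phone", ""))
    # Dates: accept a few legacy formats, store ISO 8601.
    raw_date = record.get("signup_date", "")
    for fmt in ("%m/%d/%Y", "%Y-%m-%d", "%d-%b-%Y"):
        try:
            clean["signup_date"] = datetime.strptime(raw_date, fmt).date().isoformat()
            break
        except ValueError:
            continue
    else:
        clean["signup_date"] = None  # unrecognized format: flag for manual review
    return clean
```

Running the same rules during data entry keeps new information from drifting back into the formats you just cleaned up.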

Dysfunctional or bad data is migrated quite often, which causes even bigger issues later.

How to Deal With Bad Data During a Migration

Before a migration begins, data planning should be a central focus of the entire operation. Part of that strategy includes factoring in the assessment and cleaning of broken or dysfunctional data.

While planning, it's important to:

  • Outline the data migration process and all steps required leading up to the actual event
  • Identify each dataset or collection's source and what that means for its current organization
  • Dedicate teams or individual members to each subset of data so it gets the proper focus
  • Assess volumes that will be migrated for readiness and prioritize them based on importance and accuracy ratings
  • Complete trial migrations to ensure the clean data will be moved appropriately
  • Create a contingency plan for the loss of existing data or corruption that may occur during the move
  • Develop acceptance criteria and a standard, and ensure any data being migrated meets them before the move
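
The assessment and acceptance-criteria steps above can be sketched as a simple pre-flight gate. The required fields here are hypothetical examples of criteria a team might choose:

```python
def meets_criteria(record, required_fields=("id", "name", "email")):
    """Return the reasons a record fails the (hypothetical) acceptance criteria."""
    failures = []
    for field in required_fields:
        if not record.get(field):
            failures.append(f"missing {field}")
    return failures

def gate(records):
    """Split records into migrate-ready and needs-cleaning, dropping exact duplicates."""
    seen, ready, rejected = set(), [], []
    for rec in records:
        key = tuple(sorted(rec.items()))
        if key in seen:
            continue  # unnecessary duplicate - one of the hallmarks of bad data
        seen.add(key)
        (rejected if meets_criteria(rec) else ready).append(rec)
    return ready, rejected
```

Only the `ready` bucket moves; everything in `rejected` goes back through cleaning first.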

These steps will help mitigate the risks associated with bad data migrations. More importantly, you can cut down on the corruption and destruction of mission-critical information.

Always Have Backups

Before, during, and after a migration, you should have the appropriate data and information backups handy. This allows you to quickly restore information in the event of a major failure. Backups are a standard practice for all forms of data management and facilitation, and you should already have a robust plan, process and application in place.
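
A backup is only useful if it actually matches the source. One common way to verify that, sketched here with standard-library hashing, is to compare checksums of the original and the copy:

```python
import hashlib

def checksum(path):
    """SHA-256 of a file, for verifying a backup matches its source."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        # Read in chunks so large data files don't need to fit in memory.
        for chunk in iter(lambda: f.read(8192), b""):
            h.update(chunk)
    return h.hexdigest()
```

Comparing `checksum(source)` against `checksum(backup)` before and after the migration gives you a quick integrity check on top of whatever your backup application already provides.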

Develop and Test Migration Tools

Before activating an actual migration system, you'll need to develop the necessary tools and functions that will allow you to move content accurately. Even if you're using third-party tools, you may still need applications of your own to prepare your data - such as a tool that converts existing data to a new format.

Whatever the case, it's important that you and your teams test all the necessary applications and systems to ensure optimal performance. Make sure you're testing with real data, not fabricated samples, so you can verify that it comes out correct on the other side.
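
A trial run of such a conversion tool can be verified with a round-trip check on a sample of real records. The CSV-to-JSON direction below is just an assumed example of a format conversion:

```python
import csv
import io
import json

def csv_to_json(csv_text):
    """Convert CSV text to a JSON array of row objects (assumed target format)."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    return json.dumps(rows)

# Trial migration on a small real sample: confirm nothing is lost or mangled.
sample = "id,name\n1,Ada\n2,Grace\n"
converted = json.loads(csv_to_json(sample))
assert len(converted) == 2
assert converted[0] == {"id": "1", "name": "Ada"}
```

The same pattern - convert, read back, compare against the source - scales up to the trial migrations recommended earlier.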

You should also document the entire process from start to finish - not just to adhere to data-mapping rules, but also as a reference should you run into unforeseen issues later. At minimum, you'll have a jumping-off point that helps you discern the underlying problem and find a solution.

Fix Data Before the Move

Ultimately, however, it's essential that you take the time to assess and fix the quality of your data before it ever moves or exchanges platforms. As a general rule, any data that is bad on-premises will be even worse when uploaded to the cloud.

If you only take away one thing, let it be this: Develop a standard or common data model that all your information can be adapted to. This ensures it all matches up and is accurate before, during and after the move.
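
In practice, a common data model means writing a small adapter for each source system so every record arrives in one agreed shape. The two source schemas below are hypothetical stand-ins:

```python
# One common model that every source system is adapted to (hypothetical fields).
COMMON_FIELDS = {"customer_id", "full_name", "email"}

def from_crm(rec):
    """Adapter for a hypothetical CRM export."""
    return {"customer_id": rec["CustID"], "full_name": rec["Name"], "email": rec["Email"]}

def from_billing(rec):
    """Adapter for a hypothetical billing-system export."""
    return {"customer_id": rec["account"],
            "full_name": f'{rec["first"]} {rec["last"]}',
            "email": rec["contact_email"]}

def to_common(source, rec):
    mapped = {"crm": from_crm, "billing": from_billing}[source](rec)
    # Every adapter must produce exactly the common fields, so downstream
    # systems can rely on one shape before, during and after the move.
    assert set(mapped) == COMMON_FIELDS
    return mapped
```

Because each adapter is checked against the same field set, mismatched records fail loudly before the migration rather than silently afterward.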

Finally, create a process to fix any dysfunctional data before the move takes place. Once it's been migrated to the cloud, the problems get exponentially worse, especially when the content is accessed by multiple systems, users and platforms. This will save you a lot of headaches in the long run.


About the Author

Kayla Matthews is a tech-loving blogger, writer and editor. Follow her on Twitter @productibytes to read all of her latest posts!
Published Monday, January 21, 2019 10:53 AM by David Marshall