Data

Data Creates Meaning Out of Chaos

Data Archeology

Moving data to a new system is like moving out of a house you have lived in for decades.  You are forced to clean out the attic, the basement, and the corners of every closet.  You decide what’s valuable, what’s trash, what needs to be repaired, and how everything you keep will fit into the new house.

Data migration reveals the past, including mistakes you do not want to make again.

Getting the right things right

Along the way, you will encounter strategic, technical, business, and political challenges.

Together, these challenges make data a wicked problem, and a wicked problem cannot be managed like a tame one.

Navigating Data

There is more to data than data migration, but if you get data migration right, and avoid the mistakes data migration reveals, you’ll be most of the way there.

Seems Easy Enough ...

Everyone can see the technical aspects of data migration, the ETL:

  • Extract – gathering the data needed from legacy sources
  • Transform – mapping the extracted data
  • Load – getting the data into the target system.
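The three steps above can be sketched in miniature. This is a minimal, illustrative Python sketch, not any vendor’s tooling; the legacy field names (CUST_NO, CUST_NAME, STATUS), the target model, and the mapping rule are all hypothetical.

```python
# Hypothetical end-to-end ETL in miniature: field names and rules are invented.

def extract(legacy_rows):
    """Extract: gather raw records from a legacy source, skipping empty rows."""
    return [row for row in legacy_rows if row]

def transform(rows):
    """Transform: map legacy fields onto the target model."""
    return [
        {
            "customer_id": row["CUST_NO"].strip(),
            "name": row["CUST_NAME"].title(),
            "active": row["STATUS"] == "A",   # a sample business rule
        }
        for row in rows
    ]

def load(rows, target):
    """Load: write the mapped records into the target system (a dict here)."""
    for row in rows:
        target[row["customer_id"]] = row
    return len(rows)

legacy = [{"CUST_NO": " 1001 ", "CUST_NAME": "ACME CORP", "STATUS": "A"}]
target = {}
loaded = load(transform(extract(legacy)), target)
```

Even in this toy form, the Transform step is where the business rule lives, which is why that step dominates real projects.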

Data migration is almost always more difficult than anticipated.  The illustration provides a framework for considering four categories of data migration challenges:

Data Strategy

  • Data Governance
  • Creating a Single Source of Truth
  • Analytics & Reporting
  • Security
  • Historical Data

Technical Challenges

  • Legacy data chaos
  • Security during the ETL process
  • Run times measured in days, not hours
  • Cutover Complexity

Business Challenges

  • Cleaning
  • Harmonization
  • Validation

Political Challenges

  • Ownership
  • Governance 
  • Resistance 

A cohesive data strategy should address critical topics from day one. These areas, when considered holistically, ensure a solid foundation.

Data Governance

Establish a Data Governance Council comprising senior business leaders. This group assigns data owners and data stewards, who will create a shared business vocabulary, own data cleaning, determine the meaning and standard names for data, and own clean data for their domain areas after go-live.

Quality & Migration

Data should be ‘cleaned’ prior to migration. This includes profiling, cleansing, deduplicating, and validating legacy data prior to load. It also defines the migration scope: what data is essential for day-one operations and what historical data will be archived or discarded.
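The cleaning steps named above can be made concrete with a small sketch. This is an illustrative Python example, assuming a simple record layout; the field names, the deduplication key, and the validation rules are hypothetical.

```python
# Hypothetical pre-migration cleaning: deduplicate on a chosen key and
# flag records that fail a required-field check. All rules are illustrative.

def deduplicate(rows, key):
    """Keep the first record seen for each (normalized) key value."""
    seen, clean = set(), []
    for row in rows:
        k = row[key].strip().lower()
        if k not in seen:
            seen.add(k)
            clean.append(row)
    return clean

def validate(rows, required):
    """Return rows that are missing any required field."""
    return [r for r in rows if any(not r.get(f) for f in required)]

rows = [
    {"email": "a@example.com", "name": "Ann"},
    {"email": "A@Example.com", "name": "Ann"},   # duplicate, differs only in case
    {"email": "b@example.com", "name": ""},      # fails validation
]
clean = deduplicate(rows, "email")
bad = validate(clean, ["email", "name"])
```

Note that even this toy version embeds business judgment: deciding that e-mail case is insignificant, or that a blank name is an error, is a business rule, not a technical one.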

Avoiding Data Migration Security Risks

Carelessly extracting sensitive data, such as PII or compensation, into uncontrolled spreadsheets or insecure staging areas is common and represents exceptional risk. To protect sensitive information, organizations must use secure, auditable ETL platforms without exception, enforce mandatory data masking during all test cycles, and ensure data is always encrypted, at rest and in transit.
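One common masking approach for test cycles is deterministic hashing: the sensitive value is replaced with a stable, unreadable token, so repeated test loads stay consistent. This is a sketch of that idea only; the field names and salt are illustrative assumptions, not a prescription for any particular platform.

```python
# Illustrative deterministic masking for test data.
# Field names and the salt value are hypothetical.
import hashlib

SENSITIVE_FIELDS = {"ssn", "salary"}

def mask(record, salt="test-cycle-1"):
    """Replace sensitive fields with a short, stable one-way token."""
    masked = dict(record)
    for field in SENSITIVE_FIELDS & record.keys():
        digest = hashlib.sha256((salt + str(record[field])).encode()).hexdigest()
        masked[field] = digest[:12]
    return masked

employee = {"id": 7, "name": "Ann", "ssn": "123-45-6789", "salary": 95000}
safe = mask(employee)
```

Because the same input always yields the same token, joins and duplicate checks still work in test cycles, while the real values never leave the secure environment.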

System of Record (SoR) vs. Single Source of Truth (SSOT)

System of Record (SoR)

In a cloud ERP environment, the ERP system acts as the transactional System of Record (SoR), storing and processing business transactions.

Single Source of Truth (SSOT)

A cloud ERP system does not serve as the sole enterprise-wide Single Source of Truth (SSOT) for analytics and reporting, since direct database access is restricted and data often resides in multiple systems.

You need a modern, vendor-agnostic data store—such as an enterprise data warehouse—as the SSOT to provide access, performance, and integration across different systems (such as the ERP and CRM).

Reporting Strategy

Start your reporting and analytics strategy early. Review legacy reports for business value. Set two modes: use the ERP’s tools for real-time operations and the enterprise data warehouse for strategic, cross-functional business intelligence.

Secure Access

This strategy defines data access and retention through a clear security model, such as Role-Based Access Control (RBAC), and formal Data Retention Policies for the secure handling, archiving, or destruction of historical data, ensuring compliance and manageability.
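Role-Based Access Control is simple to state: permissions attach to roles, users are assigned roles, and every access check goes through the role. The sketch below illustrates only that shape; the role names, permission names, and users are all invented for the example.

```python
# Minimal RBAC sketch: all role, permission, and user names are hypothetical.

ROLE_PERMISSIONS = {
    "hr_admin": {"read_pii", "write_pii"},
    "analyst": {"read_aggregates"},
}

USER_ROLES = {"ann": ["hr_admin"], "bob": ["analyst"]}

def can(user, permission):
    """A user may act only if some assigned role grants the permission."""
    return any(permission in ROLE_PERMISSIONS.get(role, set())
               for role in USER_ROLES.get(user, []))
```

The design point is that access is never granted to an individual directly; changing what an analyst may see means changing one role, not auditing every user.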

Historical Data

Decades of historical data should not be migrated into a new ERP. Establish Data Retention Policies that define destruction timelines and implement a Data Archiving Strategy to securely store inactive data in low-cost, accessible formats, reinforcing security, compliance, and manageability.

Technical Challenges

  • Legacy Data Chaos
  • Data Transformation
  • Load Times

Legacy Data Chaos

Garbage-in, garbage-out (GIGO) is such a cliché that I am embarrassed to repeat it, but there it is.  What you put into the system is what you will get out of it.

Challenge

Data is not in one place. It’s spread across decades of “dirty, disconnected legacy data” in siloed, incompatible systems like CRMs, SCMs, and “critical Excel spreadsheets”.

Impact

Migrating chaos without a plan carries existing problems into the new system, leading to data that cannot be trusted.

Solution

Expose the chaos. A successful migration should begin with a comprehensive discovery and profiling phase to understand what data exists, where it resides, and how the business actually uses it.  Expectations on this effort should be set accordingly. 

Extract, Transform, and Load (ETL)

The "T" in ETL is a Business-Logic Minefield

The “Transform” (T) phase is a business challenge disguised as a technical one. This is where “business rules” are applied to the data, and where projects can grind to a halt. For example:

  • Different systems have conflicting information for the same customer.
  • The sales department defines “Active Customer” differently than the finance department.
  • Legacy data structures do not align with the new ERP’s model.
  • The meaning of data elements does not exactly align within systems.

Solution

A detailed exploration phase resulting in a data mapping and transformation plan that involves the right people and sets the right expectations.
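The output of that exploration phase is often one explicit, agreed rule where departments previously disagreed. Here is a sketch of what resolving the “Active Customer” conflict from the list above might look like once negotiated; the 365-day threshold, field names, and write-off flag are hypothetical assumptions for illustration.

```python
# A hypothetical agreed transformation rule, recorded as executable logic.
# Thresholds and field names are illustrative, not a real company's rule.
from datetime import date

def is_active(customer, today=date(2024, 6, 1)):
    """Agreed rule: ordered within the last 365 days AND not written off."""
    recent = (today - customer["last_order"]).days <= 365
    return recent and not customer["written_off"]

c1 = {"last_order": date(2024, 1, 15), "written_off": False}
c2 = {"last_order": date(2022, 3, 1), "written_off": False}
```

Writing the rule down as code (or as a formal mapping spec) forces the ambiguity into the open before the load, instead of during UAT.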

Load Times: Unseen Project Killer

Data volumes can overburden the migration process. A full data load can take days, even weeks to run. This is not just a technical inconvenience; it has a major downstream effect on the entire project:

  • Network Latency: Moving large volumes of data can slow networks to a crawl.
  • Slow Test Cycles: If a test load takes 72 hours to run, the business team has only a few hours left to conduct validation.
  • Late Failure Discovery: A load that took 72 hours and returns an extensive list of errors may need to be re-run in full.
  • Go-Live Migration Window: Long load times can create unacceptable system lock-out periods.

Solution

  • Work with small subsets of data during iterative development and validation.
  • Break large batches of data into smaller batches when full loads are required.
  • Adopt and test a ‘delta-load’ strategy:
    • Load all of the master data,
    • Load the bulk of the transaction data in a batch a week before go-live,
    • Load the modified data just before go-live.
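The heart of a delta-load is a partition: everything modified before a chosen cutoff goes in the bulk load, everything modified after it goes in the short final load. This sketch shows only that split; the record shape and timestamps are illustrative.

```python
# Illustrative delta-load split: record layout and dates are hypothetical.
from datetime import datetime

def split_delta(rows, bulk_cutoff):
    """Partition rows into the bulk load and the late delta load."""
    bulk = [r for r in rows if r["modified"] <= bulk_cutoff]
    delta = [r for r in rows if r["modified"] > bulk_cutoff]
    return bulk, delta

rows = [
    {"id": 1, "modified": datetime(2024, 5, 1)},
    {"id": 2, "modified": datetime(2024, 5, 9)},  # changed after the bulk load
]
bulk, delta = split_delta(rows, bulk_cutoff=datetime(2024, 5, 3))
```

The payoff is that the go-live window only has to absorb the small delta set, not the multi-day full load.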

Business Challenges

  • Clean Data 
  • Harmonizing Data
  • Validation

Clean Data Requires Business Judgement

Trust and Data Debt

The dirty data in your legacy systems is a form of data debt accumulated over years. The ERP project is like a bank calling this debt due. Cleaning is the repayment, and it is paid with the time and effort of skilled business users, not IT. This is a real resource demand on managers, who must now dedicate their best people to this unfunded mandate rather than to their ‘actual’ jobs.

Challenge

IT developers lack the business knowledge to determine what to clean (e.g., whether a “duplicate” customer record is actually a required separate account).

Impact

IT-led cleansing destroys value and can lead to billing chaos and damaged customer relationships. When users see the same old errors, trust in the ERP system diminishes and adoption fails.

Solution

A “Map-First, then Clean” approach. Use the new ERP’s data map to identify specific data issues that will impact migration, then empower skilled business users to make the final judgment on how to fix them.

Harmonizing to A Single Source of Truth

Challenge

Identifying the source of truth (SSoT). When the Sales CRM and the Finance system have conflicting data for the same customer, who wins? This is a business negotiation, not a technical task.

Impact

Deadlock. Departments with siloed organizational structures (e.g., Sales vs. Procurement) fight over a business partner definition, delaying the ETL transform phase.

Solution

The SSoT is not found, it is created. It requires a Governance body with the executive authority to resolve cross-domain issues and enforce a binding, unified standard.

Data Harmonizing is the business process of creating a single, unified golden record for your core master data. In a typical company, customer, vendor, and product data is duplicated across multiple legacy systems, leading to conflicting information. The goal of harmonization is to consolidate this data so that everyone in the organization—from sales to finance to the warehouse—is working with the same, accurate, and up-to-date information. This single source of truth (SSoT) is the foundation for all accurate analytics, operational efficiency, and regulatory compliance.
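Once governance has negotiated which system wins for each field, harmonization can be expressed as a survivorship rule that assembles the golden record. The sketch below assumes a per-field priority; the systems, fields, and winners shown are illustrative only.

```python
# Hypothetical golden-record assembly: the survivorship rules (which system
# wins for each field) are invented for illustration.

FIELD_PRIORITY = {"name": "crm", "credit_limit": "finance"}

def golden_record(records_by_system):
    """Take each field from the system designated as its source of truth."""
    return {field: records_by_system[winner][field]
            for field, winner in FIELD_PRIORITY.items()}

records = {
    "crm": {"name": "Acme Corporation", "credit_limit": 50000},
    "finance": {"name": "ACME CORP", "credit_limit": 75000},
}
customer = golden_record(records)
```

The table of winners is the business negotiation made explicit: the code is trivial, but agreeing on FIELD_PRIORITY is the work the Governance body must do.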

Validation: The Business Sign-Off

Challenge

Mistaking data validation for an IT task. The project team’s goal is to get the sign-off to meet the cutover date, while the business’s job is to withhold it until the system is proven.

Impact

Skipping or rushing user acceptance testing (UAT) is fatal. It can result in costly fixes, reduced user adoption, and damaged trust when errors are inevitably found after go-live.

Solution

Formal UAT, which is the final and most crucial stage. It must be performed by real end-users validating end-to-end business flows—like running a real payroll—not just checking data fields.

Data validation is the final and most crucial stage of the migration process before go-live. It is the formal process of verifying that migrated data is accurate, complete, and formatted correctly. A common and fatal mistake is to treat this as an IT-led task of checking row counts. True validation is a business function called user acceptance testing (UAT).

UAT

The IT team does not perform UAT. Real end-users perform it—the accountants, procurement clerks, and warehouse managers who will live in this system every day. The goal of UAT is not just to check data fields, but to validate end-to-end business flows using the migrated data. The questions being answered are not whether the customer address field is correct, but:

  • Can I create a purchase order for this migrated vendor?
  • Can I run our real-world, complex payroll using the migrated employee data?
  • Can I ship this product from inventory, and does it generate the correct financial journal entry?

Validating data through specific business scenarios is the only way to find context-dependent errors that a purely technical check would miss.
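The contrast between a technical check and a business-flow check can be shown in a few lines. In this illustrative sketch, row counts match perfectly, yet a payroll-style business flow still surfaces a bad record; the data and the payroll rule are invented for the example.

```python
# Hypothetical validation contrast: data and payroll logic are illustrative.

source = [{"emp": 1, "rate": 30.0}, {"emp": 2, "rate": None}]  # bad rate slipped in
migrated = list(source)

# Technical check: passes.
row_counts_match = len(source) == len(migrated)

# Business-flow check ("can I run payroll?"): fails on the bad record.
def run_payroll(rows, hours=40):
    errors = [r["emp"] for r in rows if r["rate"] is None]
    total = sum(r["rate"] * hours for r in rows if r["rate"] is not None)
    return total, errors

total, errors = run_payroll(migrated)
```

This is why UAT must exercise real end-to-end flows: the row count says the migration succeeded, while the payroll run says employee 2 cannot be paid.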

Political Challenges

  • Ownership Mindset
  • Governance and Ownership
  • Resistance

It’s Not an IT Problem

Challenge

A pervasive (and incorrect) belief among business leaders that because data migration involves a digital platform, it is an IT problem and not a business effort.

Impact

Accountability avoidance. Business managers, who are the only ones with the context to clean or validate the data, deflect this difficult work onto IT, guaranteeing failure.

Solution

The recognition that data migration is an enterprise project. Success is impossible without establishing formal data ownership and accountability within the business, not IT.

Accountability Avoidance: A Political Strategy

The “it’s an IT problem” mentality is often not a simple mistake – conscious or not – it is a political strategy of avoidance. Accountability for data means work. It means taking responsibility for the difficult, time-consuming cleaning and validation tasks. Business managers, already stressed and under-resourced, have no incentive to volunteer for the difficult, unpaid, and thankless new job of data owner.

Governance and Ownership

Data Governance

Data Governance is the set of policies, processes, roles, and responsibilities an organization creates to ensure its data is accurate, consistent, secure, and used responsibly. If ownership is the role, governance is the system that gives that role power. This framework is the only mechanism that can overcome siloed organizational structures and fragmented data ownership.

Challenge

Ownership creates political deadlocks. Departments in siloed organizational structures (like Sales and Procurement) disagree on harmonizing rules, and no one has the authority to break the tie.

Impact

Fragmented data ownership and conflicting business rules grind the migration to a halt. The project is stuck waiting for a decision that never comes.

Solution

A formal Data Governance framework that establishes Data Stewards (business-level experts) and a Data Governance council with the political power to resolve cross-domain issues and enforce a final decision.

The Two-Part Structure of Governance

Data Stewards

The Business Experts: These are not new hires; they are existing Subject Matter Experts (SMEs) appointed from the business. These data stewards are the people who understand what the data means and are given formal responsibility for defining data standards, monitoring data, and resolving any data-related issues within their domain.

Governance Council

The Executive Power: This is a data governance committee or strategic advisory group composed of senior executives. Their job is not to know the data, but to resolve cross-domain issues.

Resistance

Change Management and Data Migration

Resistance can, by itself, determine the success or failure of the entire project. It is not a single problem to be solved; it is the symptom of all the friction created by the other challenges.

The Challenge

User resistance is a leading cause of data failure. It is a human reaction driven by insecurity (fear of job loss, fear of incompetence) and a political reaction driven by power (the loss of power-user status).

The Impact

A disengaged user—the spreadsheet master—nods along in meetings while actively withholding the tribal knowledge the project needs, sabotaging it.

The Solution

A comprehensive change management plan that proactively addresses the human and political anxieties. Reframe resistance not as defiance, but as misaligned passion from users trying to protect their work.

Quiet Resistance

The greatest danger is not open defiance. It is organizational change resistance wrapped in silence. This ‘quiet resistance’ is essentially sabotage by a disengaged user.

This is the spreadsheet master who knows the undocumented business rules that the ETL and Cleaning teams are begging for. This person nods along in meetings while actively withholding their needed expertise. They watch the project team build the wrong transformation logic, knowing it will fail. This is why a comprehensive Change Management plan that addresses the entire hidden mass is the only path to success.

The Taxonomy of Data Success

Getting data right requires getting a hierarchy of things right.  Explore the taxonomy of data critical success factors below.

  • Data
    • Strategy and Governance
      • Strategic Intent
        • Defining Objectives
        • Business Case and Sponsorship
        • The Cross-Functional Team
      • Architecture
        • Charter and Framework
        • Roles and Responsibilities
        • Data Policies and Standards
      • Data Quality and Lifecycle
        • Standards and Metrics
        • Data Lifecycle
        • Current State and Gap Analysis
    • Pre-Migration Readiness and Transformation
      • Source Data Analysis
        • Identifying Data Sources
        • Data Profiling
        • Prioritizing Data for Migration
      • Data Cleansing and Enrichment
        • Data Cleansing
        • Data Enrichment
      • Data Transformation and Mapping
        • Data Mapping
        • Transformation Rules
        • Legacy System Complexities
    • Execution and Validation of Data Migration
      • Data Loading
        • Migration Approach
        • Technical Procedures
      • Testing and Validation
        • Rigorous Testing
        • Multi-Stage Testing
        • User Acceptance Testing (UAT)
      • Reconciliation and Validation
        • Methods and Metrics
      • Go-Live Preparation and Cutover
    • Post Go-Live
      • Transition to Continuous Governance
      • Quality Monitoring and Maintenance
      • The Data Lifecycle

Data in the Project Lifecycle

An overview of activities for Data success, by phase, from the beginning of the project.

Compliance vs. Engagement

  Aspect       Compliance                       Engagement
  The “Why”    “I have to”                      “I want to”
  Motivation   External (reward & punishment)   Internal (pursues purpose)
  Focus        Rules and policies               Goals and mission
  Results      Meets standards                  Exceeds standards
  Behavior     Follows instructions             Takes initiative & innovates
  Outcome      Stability                        Growth, innovation, higher productivity