Who should be involved when defining product data cleansing rules?

Quality: why data cleansing matters

In large product-centric companies, B2B and B2C alike, data cleansing needs to be closely linked to product data quality. It is therefore crucial to bring stakeholders into the cleansing process early on. In fact, many organisations appoint a single individual to advocate for data quality and liaise with users and other stakeholders throughout the cleansing process. This could also mean working with third-party subject matter experts, software solution vendors and board-level parties to educate stakeholders on the business value that clean data brings. This article examines how to develop a business case for cleansing product data, who the key stakeholders may be, and how to engage with them so that the process runs smoothly and the outcomes add genuine value.

A failure to establish clear criteria for cleansing rules is likely to impede effective application of such rules. This runs the risk of not eradicating problems like:

  • Misunderstood data
  • Violations of security or regulatory norms
  • Failure to establish data lineage
  • ‘Information hoarding’ by various parts of the organisation (often due to a failure to address the ‘silo’ mentality)

 

So, how can we set up a framework to ensure that the right people are involved in making the right decisions about the rules and constraints applied to product data cleansing?

Roles and responsibilities: should senior management be involved?

Data champions (C-suite management) and data citizens (users) will be effective drivers for project sponsors and owners to involve the right stakeholders in developing data cleansing rules. This, in turn, will greatly inform the shape and extent of responsibilities within the data governance framework.

A number of these individuals will be subject matter experts who possess business expertise and a capacity for detailed analysis. This group of stakeholders (and most probably, end users) are the “go-to” personnel for many non-technical end users, as they tend to know where data resides and what relationships it has with other parts of the organisation. That makes them very useful to have as members of the team governing standards and rules for product data cleansing.

The business case

Building the business case for data cleansing within your organisation requires a clear understanding of your strategic business goals. It also means identifying the KPIs used to measure the performance of your cleansing initiatives. That includes not only an estimate of how much it will cost to improve your data quality, but also the standard you expect your data quality to reach.

Product data quality is at the core

A data cleansing plan has quality at its root. When you are identifying which types of data to target, you need to know what the biggest quality issues are in those data sets, as this will inform the data cleansing tactics and techniques applied, as well as which software tools are needed for the process. The input and perspective of users and experts in the areas where these data are located is essential: a top-down approach to a cleansing initiative runs the risk of omitting precisely the categories and formats which could come back to haunt you once the new PIM or MDM solution is up and running. Hence, any plan should also establish roles and responsibilities, along with a clear definition of success for your data cleansing initiative.

Establishing common criteria

You must have universal agreement on two key areas:

Common definitions: There need to be clear and unambiguous definitions for metrics. You do not want a scenario where different business functions around the company provide multiple definitions of, for example, what a ‘new customer’ is, consequently reporting different numbers among them almost every month.

Quality thresholds: Too often, it is not clear to what degree data needs to be cleansed. Setting parameters for quality thresholds requires the active participation of users and stewards across the organisation. Therefore, at least in the phases of setting up a rules-based cleansing strategy, all interested parties should have input into the final set of rules. This may well involve fierce disagreement at first, but that is precisely why an external subject matter expert (with no departmental axe to grind) can help to guide the stakeholder group towards a consensus-based set of rules.
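As an illustration of what an agreed quality threshold might look like in practice, the sketch below checks attribute completeness against minimum levels. The attribute names, threshold values and sample records are all hypothetical, chosen only for this example; your stakeholder group would define its own.

```python
# Hypothetical thresholds agreed by the stakeholder group: the minimum share
# of records (0.0 to 1.0) that must carry a non-empty value per attribute.
COMPLETENESS_THRESHOLDS = {
    "sku": 1.00,
    "title": 0.95,
    "weight_kg": 0.80,
}


def completeness(records, attribute):
    """Return the share of records with a non-empty value for `attribute`."""
    if not records:
        return 0.0
    filled = sum(1 for r in records if r.get(attribute) not in (None, ""))
    return filled / len(records)


def attributes_below_threshold(records, thresholds):
    """Return {attribute: measured completeness} for attributes that miss their threshold."""
    failures = {}
    for attribute, minimum in thresholds.items():
        measured = completeness(records, attribute)
        if measured < minimum:
            failures[attribute] = measured
    return failures


# Illustrative sample data (not taken from any real catalogue).
products = [
    {"sku": "A-100", "title": "Steel bolt M8", "weight_kg": 0.02},
    {"sku": "A-101", "title": "", "weight_kg": None},
    {"sku": "A-102", "title": "Hex nut M8", "weight_kg": 0.01},
    {"sku": "A-103", "title": "Washer M8", "weight_kg": 0.005},
]

failures = attributes_below_threshold(products, COMPLETENESS_THRESHOLDS)
print(failures)
```

Keeping the thresholds in one shared structure, rather than buried in each department's own checks, makes the agreed standard visible and easy to revise when the group revisits it.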

Governing the process

Governance provides context, perspective, and priority to issues of data cleaning, which would otherwise remain unresolved, causing quality problems further down the line (for example, at the migration phase).

Departments frequently have their own understanding and opinion of how the same data should be regarded. However, no one department or key contact has the big picture that the organisation-wide community provides, as long as that community communicates across its full scope. Moreover, what one department considers an acceptable standard of data quality (DQ) may well be below what another department requires.

Planning and applying data cleansing rules

Knowing where and why most data errors occur helps to identify the root causes and to develop protocols for managing them. Effective practices for data cleansing will have a positive impact across the company, which is why it is crucial to put a premium on open and transparent communication.

  • Responsibility: A C-Level manager, the Chief Data Officer (CDO), and those responsible for business and technology needs should play a role in defining rules.
  • Metrics: different data sets have different data quality attributes, so applying a universal quality ‘score’ will help a company to measure the outcomes of its data cleansing phase. This overall number (a simple 1-100 scale is often used) can also give more weight to data which matter more to achieving the organisation’s goals.
  • Action planning: before setting out on the cleansing journey, you should outline a clear set of actions to kick off the data cleansing plan. At later stages in the PIM or MDM project, these rules and norms will need to be amended as part of an overarching data quality strategy (as business priorities will inevitably change). Thus, the rules determined by stakeholders for data cleaning become a significant part of the data quality framework within a data governance program.
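One possible way to compute the weighted quality score mentioned under Metrics is a weighted average: multiply each dimension's score by its weight, add the results, and divide by the sum of the weights. The sketch below assumes each dimension has already been scored on the 1-100 scale; the dimension names and weights are hypothetical and would be set by your stakeholders.

```python
def weighted_quality_score(dimension_scores, weights):
    """Combine per-dimension scores (each on a 1-100 scale) into one weighted average."""
    total_weight = sum(weights[name] for name in dimension_scores)
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive number")
    weighted_sum = sum(
        score * weights[name] for name, score in dimension_scores.items()
    )
    return weighted_sum / total_weight


# Hypothetical dimension scores and weights, for illustration only.
scores = {"completeness": 90, "accuracy": 70, "consistency": 80}
weights = {"completeness": 3, "accuracy": 2, "consistency": 1}

overall = weighted_quality_score(scores, weights)
print(round(overall, 1))
```

Because the result is divided by the total weight, the overall score stays on the same 1-100 scale as its inputs, whatever weights are agreed.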

 

Data quality affects all areas of business activity (for good and, if not addressed, for bad), so the involvement of departments with key users and decision-makers is essential: marketing, sales, logistics, product listing and eCommerce teams should all form part of that group.

ROI from setting rules for data cleaning

Keeping on top of data cleansing is an essential ongoing endeavour. Inputs should be maintained at accurate and consistent levels: although your existing data will be clean once it has gone through your rules-based processes, incoming data (from suppliers, for example) may not be. Having done the initial rule-setting for data cleansing, you have laid the groundwork for using AI-based tools to automate many cleansing processes. That is why knowing who is involved in data cleansing will ensure a degree of organisational consistency and quality in cleansed data and will enhance performance from then on.

Find out more

If you would like to find out more about how product data management, PIM and MDM can create value for your business, we’d love to hear from you – Ben Adams, CEO Start with Data
