As data consultants, we have worked with organizations big and small across numerous industries over the years. While the scope of our work varies widely from one client engagement to another, there is one issue that we see in nearly all organizations: dirty data.
To be fair, some companies are much better off than others, but regardless of the size of the company, most analysts within the organization are eager to commiserate about how difficult it is to get to what they need. It could be that the data isn’t well organized, perhaps it’s hard to get to and make sense of, or worse, it simply can’t be trusted.
Dirty data can mean a lot of different things. It can come from a free-form text box, where the state might be entered as TN, Tenn, Tennessee, or even a misspelling like ‘Tenesee’. It can also take the form of differing metric definitions. No doubt you’ve been part of, or overheard, this conversation:
“Why don’t my revenue numbers match your revenue numbers?”
“Well, did you exclude X?” . . . and then a 15-minute discussion ensues about the best way to pull and interpret the metric.
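For the free-form state field mentioned above, one common cleanup approach is a lookup table that maps every known variant to a single canonical value. The snippet below is only a minimal sketch in Python: the `STATE_VARIANTS` table and the `normalize_state` function are hypothetical names we made up for illustration, and a real table would need to cover far more variants.

```python
# Hypothetical lookup table: known variants (lowercased) -> canonical code.
# Only the Tennessee variants from the example above are included.
STATE_VARIANTS = {
    "tn": "TN",
    "tenn": "TN",
    "tennessee": "TN",
    "tenesee": "TN",  # known misspelling
}


def normalize_state(raw):
    """Return the canonical state code, or None if the value is unknown."""
    # Trim whitespace, ignore case, and drop a trailing period ("Tenn.").
    key = raw.strip().lower().rstrip(".")
    return STATE_VARIANTS.get(key)
```

Returning `None` for unrecognized values, rather than guessing, lets someone who knows the data review them before they reach a report.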
Just the other day at a client meeting, I was asked, “Of all the companies that you’ve worked with, how bad is our data?” At least they’re self-aware!
We all might look back and say, “If only we had a process in place 5 years ago, we wouldn’t have to deal with this now.” The truth is that dirty data must be confronted, and the sooner the better.
Here are some simple steps to help get your organization on the right path:
1. Begin regular meetings with key stakeholders to talk data.
The key is to identify a representative from every major area of the business who knows their data and can speak to the unique needs of their department. If your organization doesn’t already do this, it’s time to start! If a process is already in place, it may be called Data Governance or Data Stewardship, and if your organization is really on top of things, they have it listed as part of your job description.
2. Develop an organizational knowledge base.
Do all your company’s discussions and decisions live in email or on the shared drive? It’s time to start documenting them in an organized fashion so that team members, new and old, know where to go to get the information they need. A SharePoint site, or any number of similar tools, can serve this purpose.
3. Develop a data dictionary.
This is similar to #2 above, except this is specific to the metrics that are commonly used in your business. How is net revenue calculated? How do you count new customers/patients/accounts? A common problem we hear from companies is that they don’t have a single source of truth. It turns out that even companies that have a single reporting platform will get different truths because analysts are pulling the numbers differently. By coming to a consensus on how a metric should be defined through stakeholder meetings, and then documenting that metric in a data dictionary, you can reduce the multiple-truths problem.
Remember that you are never alone in your data struggles, and it’s never too late to get back on track. Contact us today to inquire how Think Data Insights can help you ensure your organization’s data is clean, trustworthy, and adding value to your business!
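A data dictionary doesn’t have to be elaborate to be useful. The sketch below shows one possible shape for an entry in Python, with the agreed calculation stored right next to its description. Everything in it is an assumption for illustration: the `MetricDefinition` fields, the `Finance` owner, and especially the net revenue formula, which your own stakeholders would need to agree on.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MetricDefinition:
    """One data dictionary entry (hypothetical structure)."""
    name: str
    description: str
    formula: str  # human-readable version of the agreed calculation
    owner: str    # department responsible for the definition


# Illustrative entry only; the real definition comes from stakeholder consensus.
DATA_DICTIONARY = {
    "net_revenue": MetricDefinition(
        name="net_revenue",
        description="Gross revenue minus refunds and discounts.",
        formula="gross_revenue - refunds - discounts",
        owner="Finance",
    ),
}


def net_revenue(gross_revenue, refunds, discounts):
    """Compute net revenue exactly as documented in DATA_DICTIONARY."""
    return gross_revenue - refunds - discounts
```

When every analyst calls the same shared function instead of writing their own query logic, the “did you exclude X?” conversation has a documented answer.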