Tips for Data Teams – The Consistency Check

Have you ever delivered an analysis, only to hear from your client that “these numbers can’t be right”? It’s hard to convince someone that your results are credible when they don’t even pass the first 5 seconds of review. As much as we may not want to admit it, sometimes the numbers are indeed wrong, so how do we prevent these situations? One type of check that a Data Team can adopt is the “Consistency Check”. Here are some questions that you can ask yourself when doing a consistency check:

Question 1) Are the numbers consistent with themselves?
When building complicated analyses, different sections of the analysis can fall “out of sync” with each other if they are not all updated in the same way. When this happens it can produce inconsistent summary results (e.g. the cover page reports 255 conversions per hour, but the supporting details on other pages show 237 conversions per hour). Sometimes we place too much faith in our reporting tools and assume that they will report exactly as intended. In other situations it’s just a matter of being too close to the work. After a while the numbers are burned into your short-term memory and you lose the ability to review them critically with an objective eye. Suggested work-arounds include:

  • Have another member of your team do a consistency check on the results, preferably someone who hasn’t been involved in the work.
  • Take an old school approach. Print out the results, and use different colored highlighters for each type of metric. Highlight the summary numbers that represent the same result, and confirm that they are indeed consistent. Continue until you’ve highlighted all summary numbers.
  • Take another old school approach. Get your calculator out or use a separate spreadsheet, and confirm that you can replicate the summary numbers based only on the results that are being presented. You may be surprised by how many of your clients are already doing this with your results.
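The calculator check in the last bullet can also be scripted. Here is a minimal sketch, assuming the supporting details are available as rows with conversions and hours. The function name `check_summary`, the field names, and the sample rows are all made up for illustration; only the 255 and 237 figures come from the example above.

```python
from math import isclose

def check_summary(reported, detail_rows, rel_tol=0.005):
    """Recompute conversions per hour from the detail rows and compare
    the result to the figure reported on the cover page."""
    total_conversions = sum(row["conversions"] for row in detail_rows)
    total_hours = sum(row["hours"] for row in detail_rows)
    recomputed = total_conversions / total_hours
    # rel_tol allows for rounding differences; 0.5% is an assumed tolerance.
    return recomputed, isclose(reported, recomputed, rel_tol=rel_tol)

# Hypothetical supporting details, as they might appear on the inner pages.
detail = [
    {"page": "A", "conversions": 1200, "hours": 5},
    {"page": "B", "conversions": 1170, "hours": 5},
]

recomputed, consistent = check_summary(255, detail)
print(f"reported 255, recomputed {recomputed:.0f}, consistent: {consistent}")
```

The tolerance is a judgment call: set it too tight and rounding will trigger false alarms, too loose and a real discrepancy slips through.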

Question 2) Are the numbers consistent with your previous analyses?
When a client receives a new set of results they often pull up the previous results that you gave them. They are asking the question “how much have things changed?” You can beat them to the punch by doing this consistency check yourself. To be more specific:

  • Start with the previous result that was presented or released. Compare the summary numbers from the previous results to your current summary numbers.
  • Assess whether the changes are interpretable. If they are, then this interpretation will likely be part of what you communicate when you release the new results. If the changes are not interpretable, then it’s time to go back into your current results or your previous results to diagnose why the changes can’t be explained.
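If the summary numbers from both releases are on hand, the first pass of this comparison is easy to automate. The following is only a sketch under assumptions: the metric names, the values, and the 10% threshold are invented for illustration, and `compare_releases` is a hypothetical helper rather than part of any reporting tool.

```python
def compare_releases(previous, current, threshold=0.10):
    """Return the metrics whose relative change exceeds the threshold,
    plus any metric that appears in only one of the two releases."""
    flagged = {}
    for name in sorted(set(previous) | set(current)):
        old, new = previous.get(name), current.get(name)
        if old is None or new is None:
            flagged[name] = "missing in one release"
        elif old == 0:
            # A relative change from zero is undefined, so flag any movement.
            if new != 0:
                flagged[name] = "changed from zero"
        else:
            change = (new - old) / abs(old)
            if abs(change) > threshold:
                flagged[name] = f"{change:+.1%}"
    return flagged

# Hypothetical summary numbers from the previous and current releases.
previous = {"conversions_per_hour": 237, "visits": 10400, "bounce_rate": 0.42}
current = {"conversions_per_hour": 255, "visits": 10550, "bounce_rate": 0.55}

flags = compare_releases(previous, current)
print(flags)
```

Anything the script flags still needs a human explanation. The code only tells you where to look before your client does.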

Question 3) Are the numbers consistent with other reports?
Stepping into the shoes of your audience, you can think about the other reports that they refer to on an ongoing basis. It doesn’t matter if the other reports that they use came from a completely different source – from their perspective all data from all sources is supposed to tell the same story. As with Question 2, you can do some additional homework so that your results are as valuable to your audience as possible. For example you could:

  • Ask your clients if they have any other reports that they use frequently, and if they would be willing to share them with you. You can frame it honestly – you want to make sure that your results are valid, and if they are different from other sources, you want to be able to explain why.
  • Do a little research on your own, in particular by reviewing any routine corporate reporting or industry reporting. Sometimes a skeptic can be won over by proving that you did your homework. Again, if your numbers line up with other sources, it becomes something you can report as proof of consistency. If the numbers don’t line up and you can’t explain the difference, then it may be an indication that you need to review your analysis.

Question 4) Are you telling the right story?
Taking all of the above into account, you should be able to deliver your results confidently. You should now know that the numbers in the report are consistent amongst themselves, that the analysis is consistent with previous analyses, and that the results are interpretable in comparison to other sources. This can now become part of your summary and presentation of your stunning new work. Or at least it can form an addendum to the email or the presentation, showing your audience the efforts that you went through to ensure that the numbers are the right numbers. Then you have the foundation to begin telling the actual story of the analysis (the “so what” message).

These are just a few tips, but I’m sure there are many experts out there who have many more great ideas. If you have suggestions, or alternate points of view, please weigh in.

Note: What is a Data Team?
When we refer to “Data Teams” it’s a catch all for groups of technical, statistical, and subject-matter domain experts that are involved in providing information to support their organization. These teams are sometimes called “Business Intelligence”, “Decision Support”, or “Information Management”, but they can also be internal consultants such as “Operations Analysts”, “Strategic Information” or “Research”. Many of these concepts equally apply to teams of Data Scientists.


Reducing Rework in a Data Team

As much as we’d all like to get things done right the first time, with analysis and modeling it’s not always possible.

When delivering results, it’s fairly common to receive requests for minor revisions – and most of that we can all handle. But every so often the situation catches you by surprise. You’re delivering what you think is a great piece of work only to learn that it missed the mark completely. You hear statements like “This isn’t what I asked for!” or “You misunderstood what I asked for!” and you wonder where things went wrong.

Sometimes you can rightfully blame the person who requested the analysis, and then conveniently changed their mind. But more often the breakdown happens around communication and agreeing on expectations.

So what do you do? Here are some coping strategies:

1) Ask the question “What does a job well done look like?”
The next time you’re asked to run a major analysis where you feel that you don’t have an adequate understanding of what is being asked, try this script:

“I want to make sure that I give you what you want. Would you mind if I grabbed a couple of minutes to clarify a few things?”

Then ask your clarifying questions. For example:

  • What’s the business question that this analysis is supporting you with?
  • Do you just want the summary, or did you want the supporting details?
  • Is this analysis just for your reference, or is it going to be distributed?
  • How accurate does this need to be?

The answers to these questions can make a big difference in determining the final deliverable. If you only have time for one question, the first question is the best one to ask.

If you’re lucky enough that the person making the request is willing to spend more than a couple minutes with you, then you can try to get crystal clear on “What does a job well done look like?” The following are some of the statements that you might hear:

  • It will help me answer this question …
  • The numbers will be consistent with our annual report
  • The summary of results will be jargon-free
  • The results will be delivered by Friday morning at 10 am, both by email as well as a color print out on my desk

2) Put your understanding in writing
With your heightened clarity, you can now put your understanding into writing. A short follow-up email of the form “Thanks for clarifying. So, just to recap I will …” will provide one more opportunity for corrective feedback.

In many situations you won’t be able to do the first step (getting clear on “what a job well done looks like”) because the person making the request is too busy. But even in these situations it’s still worthwhile putting your understanding into writing. You can write the same short email, but this time it will have an opening line of the form “I know you’re too busy to discuss the analysis, so I’ll make the following assumptions when I do it …” And then you can add a closing line: “Hopefully that captures it. If I don’t hear otherwise from you, I’ll deliver results based on this understanding.”

3) When delivering your result, include the original request
You’ve done the hard work of clarifying expectations, you’ve done the analysis, and now this is the easy part. When summarizing the results, make sure that you attach your analysis to the clarifying email. If you’re delivering it in hard copy, you can attach a print out of the clarifying email to the top.

Using this approach the person making the request will be able to see their role in the entire process. It won’t take long for people to see the value of slowing down and spending a few minutes getting clear on the request.

4) Follow up after the fact
The worst situations are when you’ve put in the hard work, but it wasn’t really what the requester wanted, and so they don’t use it. They’ve wasted their time and your time, and they still didn’t get what they wanted. Because they feel embarrassed about not using the work, they will often not bother giving you feedback.

So, it’s up to you to solicit feedback after each major deliverable. A brief check-in after the fact can yield great feedback. If you’re not getting rave reviews about the great work you did, you can ask “What could I have done to make it even better?” This seemingly innocent question prompts the requester to give candid feedback, and demonstrates that you really care about the value of your work.

These coping strategies are not for everyone, and are not needed in every situation (especially the quick and easy analyses). But it’s the times when we get it wrong where we really appreciate the value of clarifying expectations. If you have your own coping strategies, please weigh in.


The Science of Data Scientists

The concept of the Data Scientist may very well be the next big thing in the field of analytics. Recently several industry leaders have weighed in on the question “What is a Data Scientist?”, but another way of looking at this is to ask the question “What is the Science of Data Scientists?”

A dictionary definition of science is a “systematic knowledge of the physical or material world gained through observation and experimentation”. So let’s look at the use of science in three activities that all Data Scientists carry out in their basic work:

  1. They transform the data into a format and structure that is conducive to analysis
  2. They carry out some kind of descriptive, interpretative, or predictive analysis
  3. They communicate their results

Using Science in Data Transformation:

Anyone who’s worked with data for a while knows that the data you have available is usually less than perfect. Missing data, inconsistently formatted data, and duplicate data are fairly routine obstacles, and then linking data from different sources is even more challenging. Data Scientists are also often required to work with “secondary data” that has been generated through an operational system or process. The data was originally designed to meet a functional requirement, rather than with the intention of it being analysed in the future. Even if the data is clean and error-free, there is a requirement to reorganize the data into a structure that is conducive to the analysis that needs to be performed.

So, in response, most Data Scientists develop skills in transforming data, and are quite good at it too. They use tools ranging from statistical analysis software to standard database technologies. Where the science comes in is that there is often a lot of experimentation along the way, as the Data Scientist figures out how best to transform the data while introducing little to no error.

Many Data Scientists have learned the hard way that using a scientific method to prove that the data transformation has been done correctly ultimately saves time and reduces rework in the end.
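One way to “prove” a transformation is to re-derive a few invariants by a second, independent route and confirm that they survive. The sketch below assumes a simple de-duplicate-and-total step. The order records, field names, and both helper functions are hypothetical, and the sketch also assumes that rows sharing an `order_id` are true duplicates.

```python
from collections import defaultdict

def dedupe_and_total(rows):
    """Drop rows whose order_id was already seen, then total amount per customer."""
    seen = set()
    totals = defaultdict(float)
    for row in rows:
        if row["order_id"] in seen:
            continue
        seen.add(row["order_id"])
        totals[row["customer"]] += row["amount"]
    return dict(totals), len(seen)

def verify_transformation(rows, totals, kept):
    """Re-derive the row count and grand total a different way and
    return a list of problems (an empty list means the checks passed)."""
    problems = []
    amounts_by_order = {row["order_id"]: row["amount"] for row in rows}
    if kept != len(amounts_by_order):
        problems.append("row count after de-duplication does not match distinct order ids")
    if abs(sum(totals.values()) - sum(amounts_by_order.values())) > 1e-9:
        problems.append("grand total changed during transformation")
    return problems

# Hypothetical raw extract containing one duplicated order.
raw = [
    {"order_id": 1, "customer": "acme", "amount": 100.0},
    {"order_id": 2, "customer": "acme", "amount": 50.0},
    {"order_id": 2, "customer": "acme", "amount": 50.0},
    {"order_id": 3, "customer": "zenith", "amount": 75.0},
]

totals, kept = dedupe_and_total(raw)
problems = verify_transformation(raw, totals, kept)
print(totals, kept, problems)
```

These checks are deliberately simple. They won’t catch every error, but a row count or grand total that moves unexpectedly is usually the first sign that a join or filter misbehaved.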

            
Using Science in Performing Analysis:

Here the use of scientific method is more obvious. It is taken as a given that Data Scientists conduct their analysis and modeling systematically, and that the essence of the work involves observation and experimentation. In carrying out the work, often “the proving” is a key component of what the Data Scientist does, so that they know they are drawing the right conclusions.

However, there is a wide range of scientific tools that Data Scientists can use to understand and interpret massive amounts of complex data. Data Scientists are not unlike other skilled experts, and can sometimes be like a carpenter with a hammer who sees every problem as a nail. For example, some Data Scientists are truly exceptional when it comes to logistic regression modeling (making the best guess of a “yes/no” variable), but are complete novices when it comes to multivariate analysis (such as condensing information captured in 1,000 correlated variables into 10 summary variables). As is often the case with niche skills, it takes a while to really get good at using them effectively, and it’s rare to find Data Scientists that are truly effective in all domains. The scientific connection here is that Data Scientists sometimes have to come to grips with the limits of their own skill set, and have to experiment in new directions to expand their knowledge base.

Using Science in Communicating Results:

This angle is less intuitive, but ultimately what’s the point of doing high-brow analysis, if nobody is able to understand the result, or even worse, if they can’t use the result to support a key decision?

Data Scientists that are in high demand are those that are able to truly understand the business question being asked, and why it’s being asked. Then they communicate their complex findings in a way that the decision-makers can actually do something with the result.

This important skill takes a while to develop, often through experimentation (e.g. what happens when I present it this way?), and then observation (e.g. what did the CFO do with the last findings I sent her?). Even better is when Data Scientists adopt basic market research approaches to their own work, specifically by following up with their clients and/or the end-users of their work and discovering how the results could be even more useful. Or, taking a more traditional approach, they can literally post their results with online reporting tools and run analytics to see how often and how deeply their results are being viewed.

The concept of the Data Scientist is still relatively new and will be shaped by those of us who work in and around the industry. Please offer your own comments and feedback, even if you disagree with any of these ideas.