Coronavirus: What are the real numbers?

Objective of this article:

  • To show the difference between the “real numbers” and the reported numbers
  • To highlight why we should be doing randomized testing
  • To remind you that knowing the “real numbers” doesn’t change how we should manage this global pandemic

There are a lot of numbers and charts out there. A lot! Some of them look scary. Some of them give us optimism. Some of the charts just plain look cool. But what’s the point of all these numbers? To inform us and drive us to take the right actions.

Sneak peek of the findings:

  • Unless we test everybody, or start doing randomized testing on the general population, we are never going to know the “real numbers”. Given the scarcity of tests currently available, our only realistic option is randomized testing.
  • The claim that “our numbers are higher because we’re testing more” isn’t really true. A given region can do a better job than expected at testing its population, while that population (the community) does a worse job than expected at limiting the spread of the virus.
  • The mortality rate is the only “real number”. Anyone with coronavirus symptoms who is in critical condition is very likely to get properly tested. The problem with this hard number is that it tells us the rate of the virus in the community two or more weeks ago … not what it is right now.
  • It doesn’t matter anyway. What matters is having fewer cases each day in every region. Each of us must do everything we can to reduce the transmission of the virus in the community.

Chart 1: The relationship between confirmed cases per million and deaths per million

One of the most reliable numbers we have is “deaths”, and as shown there is a correlation between confirmed cases per million and deaths per million. But the context is relevant too … British Columbia has had a larger proportion of outbreaks in long-term care homes (19 around Vancouver), which drives its mortality rate up. A 1% mortality rate is considered “average” for regions that have fared well. Ontario, Alberta, Saskatchewan and Manitoba have mortality rates of 1.1% to 1.5%. The mortality rate for British Columbia is 2.4% as of the most recent data.
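To make the arithmetic explicit, here is a minimal sketch in Python (the counts are illustrative, not the actual provincial figures):

    def mortality_rate(deaths, confirmed):
        """Deaths as a share of confirmed cases (the case fatality rate)."""
        return deaths / confirmed

    # Illustrative counts only -- not real provincial data
    print(f"{mortality_rate(24, 1000):.1%}")  # 2.4%, a BC-like rate
    print(f"{mortality_rate(13, 1000):.1%}")  # 1.3%, a prairie-province-like rate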

Confirmed Cases versus “Real” cases

In the previous article we showed the number of cases per million, and highlighted how Canada and the US are at a crossroads between a worst case and a best case scenario.

Chart 2: Confirmed cases per million since the date of the 100th confirmed case

A problem with confirmed cases is that it only represents the cases that were actually tested. What about all of the coronavirus cases that are out there in the community that aren’t tested yet?

The impact of different testing approaches

I’m going to show you three examples of how different testing approaches can skew the numbers. Imagine we had a hypothetical village of 100 people with a similar pattern to coronavirus (30% with no symptoms, 55% with mild to moderate symptoms and the rest with severe symptoms), and imagine that we (somehow) knew that 10% were infected. The village would look like the following visual.

Graphic 1: 100 residents, 10% “true” infection rate

I’m showing this twice … on the left side (organized) everyone is lined up so that it’s easy to count. The right side is what it’s probably like in the community (randomly spread).

This is a trivially small village, so maybe we could test all 100 residents. But what would the situation look like if we didn’t test everybody? Typically we would test some of the individuals who had symptoms, along with those who were close to people with symptoms.

Graphic 2: 100 residents, unknown infection rate with testing, 25% testing coverage

Again, the left side (organized) shows the 100 people in the village, nicely arranged so that we can count them. You’ll see that there are now diamond shapes, which represent the 25 villagers that we tested. The circles are the ones that have not been tested (but we secretly know who has the virus and who doesn’t).

In this demonstrative example there was just 1 confirmed case out of the 25 residents who were tested, which means the infection rate among those tested would be 4%. The infection rates (of those who are tested) reported in Canada range between 2% and 6%.

The right side (randomly spread) shows how we might test the villagers who happened to be close to the one positive test we identified.

The above testing would conclude that there is 1 confirmed case per 100 residents, so a 1 in 100 rate. But we secretly know that the true rate in the community is 1 in 10. We are underestimating by 90% in this scenario.

Graphic 3: 100 residents, 25% testing coverage focused on those with symptoms

In this demonstrative example all 7 of the cases with symptoms were tested. As a result there were 7 confirmed cases out of 25 tested, which means the infection rate among those tested would be 28%, about 5 times higher than what we’re seeing in the data. The above testing would conclude that there are 7 confirmed cases per 100 residents, so a 7% rate in the community. But we still secretly know that the true rate in the community is 10%. We are still underestimating by 30% in this scenario.

Graphic 4: 100 residents, randomized testing

In this final demonstrative example, we’ve randomly selected 2 out of every 10 people and tested them, for a total of 20 tests across the 100 residents. 2 of the tests came back positive, for a sample-based infection rate of 10%. Because we did a random selection we can scale the numbers up to the full population of the village: instead of calculating 2 over 100 residents, we calculate 2 over 20 samples = 10%, and we can then accurately project that there must be about 8 more cases in the community that have not been tested.
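Here is a minimal sketch of that logic in Python, using the hypothetical village from the graphics above (any single random draw will bounce around the true 10% … that’s ordinary sampling error):

    import random

    random.seed(1)  # repeatable example

    N = 100                                      # residents in the village
    infected = set(random.sample(range(N), 10))  # the "secret" 10% true rate

    # Randomized testing: sample 20 residents at random (2 out of every 10)
    tested = random.sample(range(N), 20)
    positives = sum(1 for person in tested if person in infected)

    sample_rate = positives / len(tested)   # estimate of the community rate
    projected_cases = sample_rate * N       # scale up to the full village
    print(f"{sample_rate:.0%} -> about {projected_cases:.0f} cases village-wide")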

Take away message: We can’t know the real numbers without randomized testing

If we want to have the “real numbers” we have two options:

Option 1: Test everybody, which is impossible given the scarcity of tests

Option 2: Design a randomized test of the general population
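If we went the randomized route, basic survey statistics give a rough idea of how many random tests would be needed. A sketch using the standard normal-approximation formula (the 10% prevalence guess and the 2-point margin are assumptions for illustration):

    import math

    def sample_size(p_guess=0.10, margin=0.02, z=1.96):
        """Random tests needed so the prevalence estimate lands within
        +/- margin of the true rate, 95% of the time (normal approximation)."""
        return math.ceil(z**2 * p_guess * (1 - p_guess) / margin**2)

    print(sample_size())  # -> 865 randomly selected people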

There must be a relationship between testing and confirmed cases!

There isn’t great data out there on the number of tests that are performed, but recently Canada has started reporting the number of tests in each province. As of April 1, 2020 the number of tests per 1,000 population ranges from 3.5 to 11.1, and is higher for the remote provinces with smaller populations. South Korea is at about 7.9 tests per 1,000 population, and they are considered to have tested extensively.

Chart 3: Tests per 1,000 population by province in Canada

If there were a relationship between the number of tests per population and the confirmed cases per population, then we should be able to see a correlation in the following chart.

Chart 4: Relationship between tests per population and confirmed cases per population

The chart doesn’t appear to show a correlation between these two factors. But some of the outliers to the right are very small populations, which might be distracting us from the bigger story.

Chart 5: Relationship between tests per population and confirmed cases per population, by size of population

This helps us see what we should be focusing on. If we zoom into the range of 0 to 12 tests per 1,000 population we see the start of what looks like a correlation.

Chart 6: Relationship between tests per population and confirmed cases per population, by size of population

But, it’s definitely not the type of correlation that you would see in a statistics textbook. Why is that? There are two factors at play here:

Factor 1: How a given region approaches testing. Do they test randomly? Do they test above-average or below-average amounts per capita? Do they save their tests for specific situations? Do they avoid spending tests on individuals whose clinical management wouldn’t change based on the result?

Factor 2: The ability of a region to contain the spread. How seriously do they approach containment? Do they follow the guidelines? Did they act early?

Chart 7: Relationship between tests per population and confirmed cases per population, by size of population, highlighted

The above chart shows a blue region that is subjectively chosen as the “normal” relationship of tests per population and confirmed cases per population. Ideally this blue region would be based on comparable data from around the world, but such a dataset does not yet exist, so we do what we can with our small handful of data points.

Specific provinces have been highlighted. Quebec stands out as having a much higher number of confirmed cases per population than the rest of the provinces. It would be difficult to explain these high numbers as simply the result of doing much more testing per population. Alberta stands out as having a remarkably high number of tests per population, while still showing a relatively low number of confirmed cases per population. The same could be said for Manitoba.

The only “real number” is deaths

The hard truth of the numbers is in the mortality rate. We’ve seen that the general mortality rate in the regions that have managed this for a longer period is approximately 1%. We’ve also seen situations like Spain, Italy and Hubei, where the demand on the healthcare system far surpassed its capacity, resulting in mortality rates of 5% to 10%.

Chart 8: Relationship between confirmed cases per population, and deaths per population

As shown in this chart there is a relationship between the confirmed cases per million population and the deaths per million population. We can see that even though Quebec had higher than expected confirmed cases per population in Chart 7, they are showing in the “normal” range just below the 1% mortality rate. There are some smaller provinces such as New Brunswick and Nova Scotia that are so early in the process that they have not reported any deaths yet. Ontario has a 1.5% mortality rate as of this data, whereas Manitoba, Saskatchewan and Alberta have a 1.1% to 1.2% mortality rate. British Columbia is showing a much higher mortality rate at 2.4% … this may just be reflective of how the virus impacts long-term care homes, with elderly residents. There is a clear relationship between age and mortality.

The “real numbers” don’t change our actions

Data is not reality. It is a representation of reality. The point of data is to guide us to the right decisions and the right actions. Even if we don’t have perfect data we can hopefully all agree that:

  • It’s a good thing if any of our numbers (confirmed cases, deaths) reduce from one day to the next.
  • If we’re “doing well” then we should all do what we can to keep it that way … which means following the social distancing guidelines.
  • If we’re not doing well in a given region, then each of us needs to do everything to help bend the curve, including following the guidelines to the letter.

I’m inclined to agree with the following statement:

“If it were possible to wave a magic wand and make all Americans freeze in place for 14 days while sitting six feet apart, epidemiologists say, the whole epidemic would sputter to a halt.”

Main take away messages

These are the main things that I learned from carrying out this analysis:

  • There is a relationship between tests and confirmed cases … if you do more testing, you do get more confirmed cases … but it’s not as simple as that. Cases can be higher than expected, but have a “normal” mortality rate. Cases can be lower than expected even if there are high levels of testing provided that the region follows the recommended social distancing guidelines.
  • We should really consider doing randomized testing of the general population if we want to get at the real numbers. A little bit of testing with some basic statistical design would go a long way.
  • Mortality is a “hard number”, but even this metric must be taken in context. Virus outbreaks in long term care homes are likely to result in above-expected mortality rates.
  • With a bit of concerted effort over a short period of time the growth rates can be reduced. This situation doesn’t get any easier for any of us if we drag it out.

Data sources:

Much of the data that we need is publicly available on github. The following link includes daily data of confirmed cases for each country in the past several weeks:

https://github.com/datasets/covid-19/blob/master/data/time-series-19-covid-combined.csv

Please keep in mind that each day it shows the numbers that were reported across the world as of the previous day.
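For anyone following along at home, here is a quick sketch of loading the file with pandas (the column names below are my assumption about the dataset’s layout … check the repository for the current schema):

    import pandas as pd

    URL = ("https://raw.githubusercontent.com/datasets/covid-19/"
           "master/data/time-series-19-covid-combined.csv")

    df = pd.read_csv(URL, parse_dates=["Date"])
    # Assumed columns: Date, Country/Region, Province/State, Confirmed, Deaths
    canada = df[df["Country/Region"] == "Canada"]
    print(canada.groupby("Date")["Confirmed"].sum().tail())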

In addition, the Epidemiological Summary from the government of Canada has provided much of the data related to testing:

https://www.canada.ca/en/public-health/services/diseases/2019-novel-coronavirus-infection.html?topic=tilelink

Coronavirus: Canada and US at the crossroads

(Note: Data updated as of March 31, 2020)

Objective of this article:

  • To show where Canada and the US are right now, relative to other areas of the world
  • To highlight what we can do to slow the curve

In my previous article I showed how we can line up different regions to see if we’re doing better or worse than expected. This article is going to expand on the previous one, including some adjustment for the different population sizes of different regions. Please note that you don’t have to read the previous article for this to make sense.

Sneak peek of the findings:

  • Confirmed cases per million population is quite simply the number of confirmed cases divided by the population, multiplied by one million. 1 case per million is just like it sounds: “one in a million”. Hubei is at about 1,000 coronavirus cases per million. Canada is at just over 100 confirmed coronavirus cases per million, and the US is at about 250 cases per million.
  • Canada is at a crossroads right now. As a country we can still follow the growth curve of South Korea and keep this situation manageable. But we could just as easily fall down the pathway of Spain, Italy and Hubei.
  • The US was late to the party but they are catching up quickly! The US needs to step things up drastically if they want to bend their curve.
  • The only thing that makes a difference now is that each one of us does what we can to reduce the number of new cases today. If every day we can reduce the spread of the disease to a smaller number than the day before, then we’re winning the war. It’s as simple as that.

Chart 1: Confirmed cases per million since the day of the 100th confirmed case

Confirmed cases

Many of us have seen the following chart of rapidly growing confirmed cases.

Chart 2: Confirmed cases since the date of the 100th confirmed case

It tells us that the US is in a worrisome pattern of rapid growth, mostly driven from the New York and New Jersey numbers. This chart also tells us that Canada is still in a relatively low number of confirmed cases.

The key shape we are looking for is the point of inflection on the S curve. This is the day when there are fewer new confirmed cases than on the previous day. The main places where we can see this full curve are the regions in China that have been through this for five or more weeks.

Chart 3: The S Curve

When we look at Italy, Spain and the US we can see that they are still on the exponentially increasing side of the curve; we just don’t know if they’re close to the inflection point or not.
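As a rough illustration of how that day could be flagged in a cumulative series (real data is noisy, so in practice you would smooth over several days first … the numbers below are made up):

    def inflection_day(cumulative):
        """First day where daily new cases fall below the previous day's --
        a crude, noise-sensitive proxy for the S curve's inflection point."""
        new = [b - a for a, b in zip(cumulative, cumulative[1:])]
        for day in range(1, len(new)):
            if new[day] < new[day - 1]:
                return day
        return None  # still accelerating

    # Toy cumulative counts: growth accelerates, then slows
    print(inflection_day([100, 150, 230, 360, 540, 700, 820]))  # -> 4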

If there’s one thing you probably remember about anything “exponential”, it’s that it’s hard to really see the first part of the curve. The y-axis scale can be changed to a logarithmic scale to make it easier to read.
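In matplotlib, for instance, it’s a one-line switch (the case counts here are made up for illustration):

    import matplotlib.pyplot as plt

    cases = [100, 210, 450, 900, 1800, 3500]  # made-up exponential growth
    plt.plot(cases)
    plt.yscale("log")  # a constant growth rate now appears as a straight line
    plt.ylabel("Confirmed cases (log scale)")
    plt.show()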

Chart 4: Confirmed cases since the 100th confirmed case, using a logarithmic scale

This helps us see how things are changing over time. We can see from this perspective how similar all of the countries are in the beginning. We can see how the Hubei region started reducing their rate of growth in their third week after the 100th confirmed case, but that the US is actually accelerating in their third week. Looking at South Korea it is clear that they were able to reduce their growth rate in their second week and substantially slow the growth afterwards. The Canada curve is lower than these “worst case scenario” regions, but it won’t stay that way if Canada doesn’t hit its inflection point soon.

Taking into account the size of each region

Each of these regions has a substantially different population, so to put things in perspective we divide the number of confirmed cases by the population of each region. To avoid reading small decimal numbers it’s calculated as cases per one million population. So, if a country has 100 cases and a population of 100 million then it would be at 1 case per million.
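The calculation is as simple as it sounds; in Python:

    def cases_per_million(confirmed, population):
        return confirmed / population * 1_000_000

    # The example from the text: 100 cases in a population of 100 million
    print(cases_per_million(100, 100_000_000))  # -> 1.0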

Chart 5: Cases per million since the 100th confirmed case, using a logarithmic scale

The chart shows that Canada should be worried at this point, as it’s in the zone of Spain, Italy, and Hubei. Canada just passed the South Korea curve in the most recent few days. The US, on the other hand, has a very large population and as a result has stayed under the curve. This would not be true if we were looking at New York and New Jersey specifically. The US is catching up with the pack rapidly and is on a trajectory to overtake Italy and Spain within another week.

The war is won and lost in the local regions

Even though this is a global pandemic it’s important to think locally. In Canada there are a number of provinces with very small numbers of confirmed cases, which makes the country look better than it might actually be.

Chart 6: Cases per million since the 100th confirmed case, Canadian provinces

As shown, when we look at the cases per million for Quebec, things look much worse than what is happening in Spain. British Columbia, Alberta and Ontario were also looking worrisome, but we can see that their growth curves are not as steep as the curve for Spain. The point here is that each local region needs to do what it can to minimize its local growth curve.

In a future analysis I’ll be looking at the impact of different approaches to testing. Again, each region is handling it differently. For example, Quebec started using their own labs instead of the national lab to maximize the speed of results for large volumes, and that drove the big jump towards the end of week one. In contrast, British Columbia continues to do very targeted testing.

Make every day count

Every single day we see the number of new confirmed cases reported through our local media channels. What matters most is that every single day we do everything in our power to make tomorrow’s number lower than today’s number. How we do that is:

  • Don’t be lax about the guidelines … if you don’t really have to go out then don’t go out. If it’s 6 feet of separation then keep it to 6 feet or more. Don’t sing “happy birthday” just once when washing your hands … do it twice!
  • Many people are bending the above rules a bit because they are convinced that they don’t have the coronavirus. To play it safe, assume that you’re one of the 30% who have it and don’t have symptoms.
  • If you’re sick then treat yourself like you are patient zero in a global pandemic … you have one job: don’t pass it on to anyone else. That’s the biggest way you can help. Read and re-read the very helpful best practices guides and don’t overreact: https://www.cdc.gov/coronavirus/2019-ncov/if-you-are-sick/steps-when-sick.html

As of March 31, 2020 there are 168 countries that have data in the public dataset that also have reported at least 100 confirmed cases. The following charts show some examples of how different regions have been doing over the past few days.

Chart 7: New confirmed cases per day in Quebec

Quebec was doing great for a while … with the numbers decreasing more and more every day. Unfortunately their numbers bounced back up on March 27th and have generally increased since then. Of the 168 regions that are reporting more than 100 confirmed cases, only 22 are showing signs of improvement like the first four days in the above chart.

Chart 8: New confirmed cases per day in Canada

Canada’s pattern was much more up and down, but the past couple of days have shown significant growth in new confirmed cases per day. About 82 of the regions in the data show an “unstable” pattern like this.

Chart 9: New confirmed cases per day in the US

The new confirmed cases per day in the US continue to show a generally increasing pattern. The overall situation in the US is dire, and they need to do something drastic to slow their curve. Unfortunately there are over 40 other regions in the world that are following this worsening pattern.

Main take away messages

In summary, these are the things that I did not know until I did this analysis:

  • In general things are getting worse, with most reporting countries showing signs of “up and down” patterns of new cases per day, or increasing numbers of new confirmed cases per day.
  • Canada can still turn this around, but only if we hit our inflection point in the next few days.
  • The US will have to do something substantial, and do it quickly or this will far outpace what we’ve seen in Italy and Spain.
  • What matters most is that we all do whatever we can to make the number of new confirmed cases for our own region lower tomorrow than they were today. This is now a community-based disease, which means that it’s all up to us at this point.

Data sources:

Much of the data that we need is publicly available on github. The following link includes daily data of confirmed cases for each country in the past several weeks:

https://github.com/datasets/covid-19/blob/master/data/time-series-19-covid-combined.csv

Please keep in mind that each day it shows the numbers that were reported across the world as of the previous day.

Coronavirus: How each country is riding the bell curve

Objective of this article:

  • To make it easier to draw comparisons of coronavirus growth rates across countries
  • To assess how each country is doing … are they ahead or behind the curve?

As we all try and make sense of the current situation we’re all wondering if our own country will follow the growth rates observed in Italy and Spain, or if perhaps, all of this social distancing will help us be more like Taiwan and Singapore.

We’re all wondering how long this will last, and how long it will take for this to work its way through our system.

It’s hard to compare growth rates when they start at different times in different regions. We also need to think about how to compare regions with 100+ million residents to smaller regions like Canadian provinces and US states.

Sneak peek of the findings

  • It generally takes about 1 to 2 weeks for a country, state or province to go from having 10 confirmed cases to 100 confirmed cases
  • It generally takes another week to go from 100 to 1,000 and generally another week to go from 1,000 to 10,000 confirmed cases
  • Countries like Canada, Australia and the UK have now missed their chance to manage the coronavirus as well as Taiwan and Singapore, but they are currently faring reasonably well
  • The US, and in particular New York and New Jersey, are on track to fare as poorly as Spain, Italy, and Germany unless they do something quickly

Chart: Growth rate in confirmed coronavirus cases, lining up the start of the clock to the day of the 100th confirmed case

How I figured it out

Much of the data that we need is publicly available on github. The following link includes daily data of confirmed cases for each country in the past several weeks:

https://github.com/datasets/covid-19/blob/master/data/time-series-19-covid-combined.csv

We have to keep in mind that each day it shows the numbers that were reported across the world as of the previous day.

If we want to be better able to line up each region with the growth rates of other regions, we will need to do a few things:

Step 1: Look at some countries that are further along in their patterns

Step 2: Normalize the starting time so that it’s comparable

Step 3: Separate the countries that have slow growth rates (our best case scenario) from the countries that have fast growth rates (our worst case scenario)

Step 1: Learning from countries that have been at this a while

Chances are that you’ve already seen the China Hubei charts that show the cumulative confirmed cases and the new confirmed cases:

Chart 1: Cumulative confirmed cases for China Hubei

Chart 2: New confirmed cases for China Hubei

Even though this could be considered a “worst case scenario” pattern, the most positive aspect is that the pattern has stabilized, and it happened in approximately 7 weeks. The shape of the cumulative curve is an “S” shape, meaning it has an exponentially increasing pattern in the beginning, an inflection point in the middle (mid-February), and a decelerating pattern at the end.

It is widely reported that the growth rates in other China provinces were substantially improved based on the learnings from the Hubei region. As an example, the same curves are shown for the Chongqing province.

Chart 3: Cumulative confirmed cases for China Chongqing

Chart 4: New confirmed cases for China Chongqing

Visually we can see the S curve play out quickly, with the first week showing an exponentially increasing growth rate, an inflection point in the second week, and the third week already showing a decreasing growth rate. So, we could consider curves like this to be our “best case scenario”. You can also see from Chart 4 that the inflection point happens when the number of new confirmed cases per day starts to decrease from one day to the next. It actually happened twice … once in early February (in week 2) and again in the second week of February (in week 3).

For another “worst case scenario” we look to Spain.

Chart 5: Cumulative confirmed cases for Spain

Chart 6: New confirmed cases for Spain

We can see that Spain has not hit its inflection point yet and that the number of new confirmed cases is still zigzagging upwards.

Now we need a way of lining things up, so we can see if this is better or worse than expected.

Step 2: Normalize the starting time so that it’s comparable

A simple and easy way of normalizing the starting time is to take note of the first day where there were 10 confirmed coronavirus cases in a given region. This is the virus’s first toehold in the region. We know that if there are 10 confirmed cases then chances are there are a lot more cases in the community that have no symptoms or have not yet been tested and reported.

Chart 7: Cumulative confirmed cases for Spain and Chongqing

Now that the confirmed cases and the time axes are on the same scales, we can see that the Spain situation on day 14 (March 10, 2020) is remarkably worse than what was observed in Chongqing on day 14 (February 6, 2020). The Chongqing province was already showing signs of slowing down at that point, whereas the country of Spain was still in the period of exponentially increasing growth. Spain had 39,885 confirmed cases as of March 24, 2020.

Similarly, we can normalize the starting time by taking note of the first day where there were 100 confirmed coronavirus cases in a given region, and again for the first day there were 1,000 confirmed cases, and the first day there were 10,000 confirmed cases.

We can use these numerical milestones to measure and compare the growth rates. We can look at how long it takes for a region to increase from 10 confirmed cases to 100 confirmed cases, then how long it takes to go from 100 to 1,000, and from 1,000 to 10,000.
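A small sketch of how these milestone durations can be pulled out of a cumulative series (the series at the bottom is a toy example, not real data):

    def days_between_milestones(cumulative, milestones=(10, 100, 1_000, 10_000)):
        """Days taken to move between successive case-count milestones,
        given a list of daily cumulative confirmed cases."""
        first_day = {}
        for day, count in enumerate(cumulative):
            for m in milestones:
                if count >= m and m not in first_day:
                    first_day[m] = day
        return {f"{a}->{b}": first_day[b] - first_day[a]
                for a, b in zip(milestones, milestones[1:])
                if a in first_day and b in first_day}

    series = [3, 8, 12, 30, 70, 120, 300, 700, 1_200]
    print(days_between_milestones(series))  # {'10->100': 3, '100->1000': 3}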

Because not all regions have stabilized and matured, we only have so many milestone transitions to count for each. The following table shows us the range of time it takes to go from 10 confirmed cases to 100 confirmed cases.

Table 1: Number of days between 10 confirmed cases in a region and 100 confirmed cases in a region

There are clearly some regions that showed a faster initial growth rate of less than 7 days including Italy, Spain and Iran. Interestingly, the Chongqing province had a fast initial growth rate, but we know that they stabilized the growth so rapidly that they never hit the next milestone of 1,000 confirmed cases.

There are several regions that had a very slow initial growth rate including Japan, Singapore, Hong Kong, Australia and Taiwan. We also know that some of the regions with slow initial growth rates lost their gains later.

Chart 8: How long it takes to go from 10 confirmed cases to 100 confirmed cases in a region

The chart shows that, in general, this first part of the growth curve typically takes 1 week with some regions able to stretch it past 3 weeks.

If we go through the same exercise for the transition from 100 confirmed cases to 1,000 we observe that it takes about a week for that to happen too. Iran and the Hubei province of China had the fastest growth rates, whereas Canada, Australia and Japan were able to slow their growth rate down during this period to 11, 12, and 30 days respectively.

There are fewer countries that have transitioned from 1,000 to 10,000 confirmed cases, but the pattern is about the same again … it takes about 1 week for this transition to happen.

Step 3: Separate the countries that have slow growth rates (our best case scenario) from the countries that have fast growth rates (our worst case scenario)

As shown previously, we are able to use these milestones to identify the countries that represent our “best case scenario” and our “worst case scenario”.

Chart 9: Best case scenario growth curves for cumulative confirmed cases

The best case scenario would be to follow the same slow initial growth in cases observed in Hong Kong, Singapore, Taiwan and Australia. Note that the y-axis scale tops out at 300 cases.

Chart 10: Worst case scenario growth curves for cumulative confirmed cases

The worst case scenario would be to follow the rapid initial growth observed in Italy, Spain and Iran. Note that the y-axis scale tops out at 30,000 cases, 100 times larger than in Chart 9.

So, now that we’ve done that, how are we doing?

Many regions are just in the starting phase of their growth pattern, and now we can use the above curves to see if we’re on the trajectory of Spain or if we’ve got this managed like in Taiwan.

I’m from Canada so, in a very un-Canadian way, I’m going to show our country first.

Chart 11: New confirmed cases for Canada

We’re increasing our cases every day, so it looks like we’re still in the exponentially increasing part of the curve. Our first day with 10 confirmed cases was Feb 24, 2020, and you can see we’ve done a good job of stretching out this initial period.

Chart 12: Cumulative confirmed cases for Canada, plus best and worst case scenarios

If we compare ourselves against the best and worst cases, we’ve done very well since 10 confirmed cases. If we recalibrate the worst case scenario based on the start date of 100 confirmed cases, we can see that Canada is roughly in the middle of the best and worst case scenarios. Clearly we don’t have a shot at achieving the low case counts observed in Taiwan and Singapore, but things could be worse.

Turning our attention to our neighbors to the south we can see a much more concerning picture.

Chart 13: New confirmed cases for the US

The numbers are bigger, as would be expected for a larger country, and it looks like they are still in the exponentially increasing part of their curve too.

Chart 14: Cumulative confirmed cases for the US, plus best and worst case scenarios

The more concerning pattern is the growth rate relative to the worst case scenario. The US is increasing at a much faster rate than expected.

Obviously we’re treating each region as if they were comparable, which they are not. The populations and the population densities are different. But to put it in perspective, if we filter on just the state of New York, which has a smaller population than Canada, and apply the same growth curves, we get the following chart.

Chart 15: Cumulative confirmed cases for the State of New York, plus best and worst case scenarios

New York state has approximately one third the population of Hubei, but they already have 39% of the confirmed cases reported in Hubei (26k versus 69k). On a positive note, the new cases reported on March 24, 2020 were the first ones to decrease in more than 10 days.

Looking back to Canada we can see that there are regional differences here too.

Chart 16: Cumulative confirmed cases by Canadian province

The province of Quebec is following the “worst case scenario” pattern, whereas many provinces have such small counts that they are not even registering in this analysis. Most provinces are well between the best and worst case scenarios.

The best and worst case scenarios provide us with a relative guidepost. They don’t tell us where we’re necessarily going to be next; rather, they give us an idea of where we are now.

Main take away messages

In summary, these are the things that I did not know until I did this analysis:

  • All of the regional growth patterns of confirmed cases generally follow the same “S curve”
  • The regions that are doing well are the ones who have been able to nip the problem in the bud … they get to the inflection part of the “S curve” and never get to 1,000 confirmed cases
  • Some regions like France and Germany did well at reducing the growth at the beginning but then lost their gains later … this may indicate the importance of not getting over-confident
  • Many regions have reached the slowing of the growth curve by 3 to 4 weeks … the worst may be yet to be seen, but it looks like the upper end has been 6 to 7 weeks for those regions that have been dealing with this for a while

The Cost of Knee Replacements in Canada

Healthcare costs represent a significant part of the Canadian budget, and every year they grow through increases in inflation, population, and utilization.

What most Canadians don’t realize is that it’s actually quite difficult to figure out what it costs to deliver different treatments in different hospitals.

Luckily the Canadian Institute for Health Information collects information from across the system and compiles it into their Patient Cost Estimators.

A new infographic by AnalysisWorks summarizes what we know and what we don’t know about the cost of knee replacement surgeries in Canada.

[Infographic: The cost of a knee replacement in Canada]

Please feel free to connect

Via our websites: http://www.analysisworks.com and http://www.light-house.ca

Via LinkedIn: http://www.linkedin.com/pub/jason-goto/2a/bb/a5a

Via Twitter: #analysisworks

How to Create KPIs that Clinicians Care About

Healthcare organizations around the world face the challenge of creating Key Performance Indicators (KPIs) that clinicians actually care about. The problem is, it’s usually the administrators who lead these efforts, and they usually use administrative data to drive the metrics and … surprise, the KPIs end up being very administrator-focused.

But that’s rarely the intent. Healthcare leaders often introduce KPIs with a goal of driving improvement through awareness and understanding.

[Image: Clinician atlas KPI]

So how can an organization create KPIs that clinicians care about? Here are 4 steps:

Step 1: Engage clinicians in the design process
When introducing a change, most would agree that a winning strategy is to involve the people who will experience the change in the design process. Clinicians are no exception. They are often very numbers-driven, but they want the numbers to mean something to them. The consultation and engagement doesn’t have to be extensive … it can be seeking advice from a few key clinical leaders and influencers, or it could be a full-blown focus group with stickies. What’s important is that the clinicians have had the opportunity to give input and feedback.

Generally speaking this is best done with a group of 5 to 10 representative clinicians, led by a facilitator who is good at condensing big ideas and focusing on the biggest issues. It’s also ideal to have a “data person” present, though the facilitator should take special care not to limit the KPIs to the available data. Try to encourage this group to brainstorm their top 10 KPIs … it’s important to avoid what Brad Hams (author of Ownership Thinking) describes as going to the “KPI Buffet”, where people select so many KPIs that the information becomes overwhelming.

Step 2: Think about what the clinicians will do differently based on the KPI
After brainstorming the top 10 KPI candidates, the next exercise is to focus on ensuring the KPIs are actionable. If a KPI is actionable it will:

  • Tell you how you are performing on something that you can influence, and
  • Tell you in a time frame that you can do something about it

The “something you can influence” is a key concept. There is no point in monitoring KPIs that people have no control over. In complicated environments like healthcare, where many different people contribute to performance, it’s fairly common for everyone to think the problem is because of someone else. Borrowing again from Brad Hams, you can identify the clinician that has the most influence on a given KPI and then put their name next to the KPI as the KPI owner. This will go a long way towards making at least a few clinicians care about the KPIs!

Ideally, through this lens of making the KPIs actionable, you will have narrowed your list from 10 KPI candidates to 5 or fewer.

Step 3: Keep it to 3 KPIs
As much as we would all like to fix everything right away, the reality is that most people, departments and healthcare organizations can only really focus on a few key areas at one time. So, push the design team to select the 3 best, most actionable, most relevant, and most impactful KPIs. By the time that you’ve charted these KPIs over time relative to their targets, there will still be lots to look at.

Once these first three KPIs are well on target and under management, they can be removed from your top-of-mind monitoring reports and replaced with other actionable KPIs that didn’t make the list.

An alternate possibility is to allow 3 KPIs in each clinical area. For example, there are KPIs that are relevant to Surgery (such as post-operative infection rates, or compliance with the surgical checklist) that would not be relevant to Medicine.

Step 4: Put the KPIs into action, and stick with it
There will likely be some work in actually getting the data for these KPIs in a clean, accurate and reliable format. As mentioned before, some of these KPIs might not have data sources yet, and some ground work will need to be done to start collecting the data prospectively. Bear in mind that it’s much more valuable to have manually collected data feeding into an actionable KPI than pre-existing data sources feeding into a KPI that can’t be influenced.

To put the KPIs into action:

  • Post them publicly
  • Build a habit of reviewing them as a routine part of standing clinician meetings
  • Publicly celebrate when targets are met, making sure to recognize each individual who contributed to the success
  • Retire the KPIs when they are under management, and when there are more important KPIs for the top 3 positions. Resist the urge to grow the KPI list.

If you have stories about how you’ve created KPIs that resonated with clinicians, please share them. And as always, please feel free to connect

Via our website: http://www.analysisworks.com

Via LinkedIn: http://www.linkedin.com/pub/jason-goto/2a/bb/a5a

Via Twitter: #analysisworks


4 Tips to Unclog Your Data Team

A common problem that good Data Teams face is that they are significantly backlogged. They are pulled in many different directions by different leaders with different priorities. It’s a good sign that they are a valued asset to your organization, but it can be frustrating waiting for them to get to your urgent requests. Sometimes it’s like they are clogged up like bad plumbing …
[Image: Unclog your data team]

So, what can an organization do to unclog their Data Team? Here are four tips:

Tip 1: Get crystal-clear on the outcomes of the Data Team
Data Teams often spend a lot of time talking about their efforts and the resources they feel they need. But instead it’s better to focus on the outcomes of the team … how will your organization know that the Data Team is doing a good job? Until everyone in the organization (including the Data Team) is clear on the outcomes that they need to achieve, the demands on the Data Team will continue to grow unchecked. Some example Data Team outcomes could be:

  • To ensure top-level management has the reports they need to maintain profitability
  • To provide management insights on market competitiveness
  • To trigger management alerts on operational areas that require attention
  • To detect patterns related to decreasing customer satisfaction
  • To support improvement projects in the organization

… and so on (Hint: It would be unrealistic for most Data Teams to attempt to meet all of these outcomes.)

Once the outcomes of the Data Team are clear, the next step is …

Tip 2: Calculate the ROI for different Data Team efforts
Most Data Teams hold responsibilities for maintaining reports and analyses … some of which are easy and some of which are very hard. Rarely do the users of these deliverables appreciate the effort that goes into them, particularly when there is a lot of interpretation required, or a lot of extra data cleaning that can’t be automated.

In these situations it may make sense to assess whether the value of the information is commensurate with the effort involved in generating it. This is especially true if there is a suspicion that the information isn’t really being used for decision-making. More tips on this topic are described in the blog post Turning Analysis Into Action, but generally speaking the Data Team’s efforts should be fully aligned with the outcomes of the Data Team.

If the ROI on a difficult analysis isn’t there then …

Tip 3: Give your Data Team permission to purge
Data Teams typically find themselves in situations where they don’t have enough capacity to meet all of the demands imposed on them. And every week requests for new analyses and reports come up.

So, if they are working on difficult things that are clogging them up, empower the team with a business process to periodically review the ROI and popularity of each analysis. Set a bar for minimum expectations, and discontinue anything that doesn’t meet it. For example, if a report is only being used by one or two people, that’s a pretty good sign that it could be discontinued. The whole power of reporting is creating a common measurement of performance that everyone can get behind. So, if a complex analysis is only interesting to one or two people, then chances are it isn’t aligned with the rest of the organization.

A sure-fire way to test the popularity of a periodic report is to just let the report take a vacation. If you don’t provide the report, does anybody come asking for it? If not, then you’ve just liberated some bandwidth for your Data Team.

But you don’t have to stop there … you can unclog your Data Team even further with the next tip …

Tip 4: Hold some reserve capacity for emergent work
Important and urgent things come up, and when they do, Data Teams often drop everything to respond. So why not maintain some reserve capacity for this? You can even review your past urgent and important requests to get a sense of the timing of these requests … year end, month end, just before planning sessions, etc.

As a Data Team when you plan out your week, and assign responsibilities, try as best as you can to not schedule every last hour. Build a couple hours of flex into every day, or plan for “catch up” days. Worst case scenario, your team members can get ahead on some neglected projects with this flex time. Best case scenario, when your CEO calls needing something urgent, you’ll be able to impress them with your ability to respond quickly.

If you have stories about how you’ve unclogged your Data Team, please share them. And as always, please feel free to connect

Via our website: http://www.analysisworks.com

Via LinkedIn: http://www.linkedin.com/pub/jason-goto/2a/bb/a5a

Via Twitter: #analysisworks

Note: What is a Data Team?
When we refer to “Data Teams” it’s a catch-all for groups of technical, statistical, and subject-matter domain experts that are involved in providing information to support their organization. These teams are sometimes called “Business Intelligence”, “Decision Support”, or “Information Management”, but they can also be internal consultants such as “Operations Analysts”, “Strategic Information” or “Research”. Many of these concepts equally apply to teams of Data Scientists.

3 Tips for Turning Analysis Into Action

Today’s analytics tools offer deeper insights into your data than ever before. But if you were to take an objective look at your reports, how many of them actually drive action? How often do you spend time looking at your analytics and have the following thoughts:

  • “So, what am I supposed to do with this?”
  • “Ok, our KPIs are up, but how much of it is because of things we did?”
  • “Sometimes we’re above target and sometimes we’re below target … at what point should I start paying attention?”

If you find that you are asking yourself these questions, then you are probably not realizing a return on investment from your analytics. The whole point of having better analytics is to give you, your department and your company the tools to be successful. If your analytics don’t directly support you in taking action, then the entire investment may be wasted.

While organizations are very keen to invest in the latest technologies, they often invest little to no effort in ensuring that the analytics trigger the right actions. This is often the weakest link in the analytics chain.

[Image: Weakest Link Analytics]
These are some basic tips that your organization can use to turn your analytics into action.

Tip 1: Find the critical few metrics that matter
Many new adopters of analytics are tempted to show all of the data and metrics, assuming that more information means more power. But this approach is often counter-productive, because different people in the organization end up focusing on different things … instead of everyone pushing towards a common goal.

In Mastering the Rockefeller Habits Verne Harnish describes the top metrics as “the critical few” … and he means it. He encourages organizations to find their one, two or three metrics that show whether things are going well or not. For organizations with a wide suite of performance metrics, some consideration could be given to building an aggregate measure, just like a GPA score for a student. More tips to improve your metrics are described in my post on Escaping KPI Hell.

Tip 2: Assign responsibility to each metric … take names
The expression “when everyone is in charge, nobody is in charge” rings true here. When metrics are reported, and it’s unclear who is responsible for taking action, more often than not people will assume that someone else is taking care of it.

The simple practice of designating individual responsibility to each metric can create drastic change in performance. Once the “metric owner” has accepted this responsibility, their name is written next to their metric in the reports. That individual now feels a personal responsibility to make sure the metric looks good, and will start taking action to make it happen.

There will be situations where the metric owner doesn’t have all of the authority and resources to drive the metric, but they can at least report on the factors that are most affecting performance.

Tip 3: Walk the talk … keep score and show that it counts
Another common reason why companies don’t realize a return on investment on their analytics is that they don’t build their metrics into their daily, weekly, monthly and quarterly dialogue.

By focusing on a few critical measures (and making sure they are the right ones), leaders should see the review of the analytics as something that helps them, rather than another new task to squeeze into the day. During these reviews the main questions are:

  • How are we doing?
  • Why?
  • Who needs to do what to achieve our performance targets?

This isn’t about finger-pointing … it’s about having a common score card, with specific people in charge of the few key metrics, and empowering them to take the right actions that help the organization as a whole succeed. These very basic tips can go a long way towards increasing the value of your analytics.

If you have stories about how you turned analysis into action, please share them. And as always, please feel free to connect

Via our website: http://www.analysisworks.com

Via LinkedIn: http://www.linkedin.com/pub/jason-goto/2a/bb/a5a

Via Twitter: #analysisworks



Outsourcing Analytics vs DIY – Tips for Executives

If your organization has not yet embraced analytics, you may be wondering “what’s the best way to get started?” A key decision at the beginning is whether or not to bring in outside expertise to kick start the process, versus the traditional approach of recruiting an internal team. Another key decision is “which analytics software should we buy?” This post outlines some tips that executives can use to move forward.

[Image: Analytics DIY vs Outsourcing]

Tip 1: Buying analytics software shouldn’t be your first step
There’s an incredible array of analytics software available in the market, much of which is marketed as a turn-key solution. The idea of an off-the-shelf solution appeals to a lot of business leaders … they are drawn towards the idea of having a tangible asset that works right out of the box, without having to worry about the pesky people issues.

But, there are a lot of negatives that come with this approach:

  • The tool is only as good as the strategic thinking that goes into how it will be used. If you run an analytics tool on poor or incomplete metrics the tool doesn’t have a chance of creating business value.
  • The tool is only as good as the analyst running it. The analyst is the interface between the real business problem, and how that business problem is translated into the data and metrics in your system. If that translation is poor, then the tool is unlikely to generate powerful results.
  • The tool will quickly be discarded if it’s not generating business value. This will create a belief within the organization of “been there, done that … we tried analytics and it doesn’t work for us”. This can mean that your organization will fall behind the competition.

Tip 2: Think hard before recruiting from within
It’s not uncommon for an organization to build their analytics team with their existing staff. This approach increases the chance that your analytics team will get what your business is about, and hopefully they also represent the culture of your organization.
[Image: Hello I'm the VP of Analytics]
A challenge with this approach is that the team members who are recruited from within are often not able to give full attention to their new position, because they are still holding responsibilities related to their old roles. Another challenge with this approach is that there’s a risk of missing a big opportunity to take a fresh look at how the organization uses analytics to drive their key decisions. For example, if you recruit from your finance department, chances are that your analytics will be very financially focused. These concerns can be overcome, but it certainly helps to think about these considerations before making a decision.

Tip 3: Find a recruiter you can trust
If you’re building up a new team with external hires, getting the ball rolling can be tricky. Most organizations start by hiring the team leader, and then ask the team leader to do all of the subsequent recruiting. A challenge with this approach is that whoever is hired first often sets the possibilities and the limitations of the team. For example, if the first hire is a fan of traditional multivariate statistical approaches, chances are they will pursue analytics applications in that area, while leaving other opportunities behind. They will create demand for their favorite analytics applications, and therefore hire other team members with that same skill set (i.e. “he who has a hammer sees everything as a nail”).

So, the first hire with this approach is a crucial one, and given the specialized, niche nature of analytics, this is a hire where you’ll likely do best working with a recruiter you trust. If you are successful in hiring a strong team leader, think about using the Who Method to set targeted outcomes for the first 90, 180, and 365 days. These outcomes should reflect the business value that your organization wants to get out of having its own analytics team.

Tip 4: Find an analytics consulting firm you can trust
The alternative approach would be to start off with an external consulting firm that specializes in analytics, and do a demonstration project with them. This approach is especially useful, as it allows you to start off with an experienced team and make progress quickly. This both increases the range of analytics that can be considered, and increases the chance of having a successful first project.

To get even more value out of working with an analytics consulting firm you can look at options for them to help you move towards building your own team. You can ask them:

  • Based on the work they do with you, can they build a “leave behind” tool that allows you to update the results yourself?
  • What insights do they have on your local job market for analytical talent?
  • Could they support you in building a recruiting plan?

Often leaders are hesitant to bring in an outside consulting firm because they don’t know what to look for, and they are worried about hiring the wrong firm, and/or asking for the wrong type of support. But what is less risky … hiring a consulting firm to do a “prove yourself” demonstration project, or building up a team of full-time staff with a completely new area of expertise?

Either way it’s generally better to focus on your people and processes first, and then afterwards, figure out the analytics software they need to do their job. Building an analytics capability in an organization takes a while. There are more things that can go wrong than go right. If you take a long term view, it makes sense to begin small (both with people and projects), realize some early wins, and gradually build the team based on the business value that they generate.

If you have stories about how you built your analytics team, please share them. And as always, please feel free to connect

Via our website: http://www.analysisworks.com

Via LinkedIn: http://www.linkedin.com/pub/jason-goto/2a/bb/a5a

Via Twitter: #analysisworks



3 Simple Checks to do Before Expanding your Data Team

If you’re the leader of a Data Team, chances are your clients are constantly demanding more and more services as time goes on. Your team members might be working longer days to keep up, and still you might not be able to meet all of the needs of your customers.

While the typical approach would be to try and get funding for more team members, there are other things that could be done first.

[Image: Grow your team]
Here are 3 simple checks that you might want to perform before trying to grow your Data Team:

Check 1: What are the patterns of demand for your Data Team?
Despite being a numbers-driven group of people, it’s uncommon for Data Teams to actually analyze their own pattern of demand. Some questions to think about are:

  • How often do new requests come in?
  • Who do they come in from?
  • What is the nature of the requests?
  • What is the urgency and target turn-around time for the requests?
  • How long does it take to clarify the request?
  • How long does it take to deliver a result?

Getting a picture of your demand patterns will help you better understand what’s driving the level of busy-ness in your Data Team. It may point you in the direction of converting repeat requests into automated self-service reports. Or it may highlight those customers that have a chronic pattern of last-minute urgent requests, and in these situations you might benefit from proactively checking in with them once a week to see what might be coming up.

Or, at minimum, having this information will be your first point of evidence that your Data Team could benefit from having more team members.

Check 2: Is your team working efficiently?

As the leader of the Data Team you might be convinced that the team is working as efficiently as possible. But think about how you might go about convincing others. Some questions to consider are:

  • How many work hours does it take to respond to requests from your customers?
  • Does it take some team members less time than others to get things done? If so, which skills are teachable and transferable between team members? Or are there team members who just aren't pulling their weight?
  • How much time does your team spend doing "disaster recovery", meaning situations where bad numbers have been released and the team is scrambling to correct them? If this is significant, implementing quality-control measures like the Consistency Check can help.
  • What percentage of your Data Team's time goes to mundane, repetitive work? This may point to the need to further streamline and automate your processes, and/or offer standardized self-serve reports for frequently requested information.
  • How many iterations (back and forth with the customer) does it take to complete a request? Are there opportunities to increase efficiency by slowing down at the beginning and getting clear on the what, when, why, and how of the request?
  • What is the pattern of work hours for your Data Team members? Are they constantly working late, and if so, is that tracked anywhere?

By attempting to answer these questions, you, as the Data Team leader, may find easy opportunities to pursue before seeking funding to grow your team. Or, alternatively, answering them will give you the evidence to show that your team is already working as efficiently as possible.
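Continuing the sketch above, a few of these efficiency questions can be answered from the same hypothetical request log. Again, the column names (assigned_to, iterations, was_rework) are illustrative assumptions, not a prescribed schema.

```python
import pandas as pd

log = pd.read_csv("request_log.csv")

# Hours of effort per request, by team member: big gaps can flag
# teachable skills (or workload imbalances) rather than blame.
print(log.groupby("assigned_to")["hours_to_deliver"]
         .agg(["median", "mean", "count"]))

# Iterations (back-and-forth with the customer) per request: high
# counts may mean the what/when/why/how wasn't clear at the start.
print(log["iterations"].value_counts().sort_index())

# Share of total effort spent on rework ("disaster recovery").
total_hours = log["hours_to_deliver"].sum()
rework_hours = log.loc[log["was_rework"], "hours_to_deliver"].sum()
print(f"Share of hours spent on rework: {rework_hours / total_hours:.0%}")
```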
Check 3: When the Data Team can’t respond to requests, what does this cost your organization?

It’s very rare for a Data Team to keep track of the requests they can’t get to, which is a shame, because this can be invaluable information when thinking about expanding the team.

At minimum, set up a central log to track all requests made of your Data Team. The log should capture 1) requests that were accepted and delivered, 2) requests that the Data Team couldn't respond to, and 3) requests that were accepted but delayed by more than a couple of weeks.
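As a sketch, the log doesn't need to be anything more than a simple table. Here's one hypothetical way to structure it in Python, with the three status values mirroring the categories above; the field names and example entries are made up.

```python
from dataclasses import dataclass
from datetime import date

@dataclass
class Request:
    received: date
    requester: str
    description: str
    status: str  # "delivered", "declined", or "delayed" (late by 2+ weeks)

request_log = [
    Request(date(2020, 3, 2), "Finance", "Monthly overtime report", "delivered"),
    Request(date(2020, 3, 9), "Operations", "Ad hoc staffing forecast", "declined"),
    Request(date(2020, 3, 16), "HR", "Turnover dashboard refresh", "delayed"),
]

# Count the unmet or late requests: the starting point for Check 3.
unmet = [r for r in request_log if r.status != "delivered"]
print(f"{len(unmet)} of {len(request_log)} requests were declined or delayed")
```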

Then, as the leader of the Data Team, you can follow up with a sample of the customers who had unmet needs in the past year and work with them to estimate the cost of not having that information. It doesn't have to be anything fancy; even a back-of-the-envelope calculation is better than nothing. You may need to be creative, but you could consider the cost of making the wrong decision, or the cost of an adverse situation unfolding because of a lack of awareness.
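For example, a back-of-the-envelope estimate might look like the following sketch, where every figure is a placeholder to be replaced with your own numbers.

```python
# Every number below is a made-up placeholder; substitute your own.
unmet_requests_per_year = 25         # from your central log
share_with_material_cost = 0.4       # say 40% had a real decision riding on them
avg_cost_when_flying_blind = 15_000  # e.g. a wrong decision, or an unnoticed adverse event

estimated_annual_cost = (unmet_requests_per_year
                         * share_with_material_cost
                         * avg_cost_when_flying_blind)
print(f"Rough annual cost of unmet requests: ${estimated_annual_cost:,.0f}")
```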

Pulling it all together, you should now know your pattern of demand and the efficiency of your Data Team, and you should have a rough idea of what it costs your organization when you can't respond to requests. This is the kind of evidence your leadership team can use to make an informed decision about whether to expand your Data Team.

Note: What is a Data Team?
When we refer to "Data Teams", it's a catch-all for groups of technical, statistical, and subject-matter experts who are involved in providing information to support their organization. These teams are sometimes called "Business Intelligence", "Decision Support", or "Information Management", but they can also be internal consultants such as "Operations Analysts", "Strategic Information", or "Research". Many of these concepts apply equally to teams of Data Scientists.

 

Are you reporting what you can? … Or reporting what you should?

Too many organizations are missing opportunities to use their data and analytics as a competitive advantage. Leaders often believe that having an abundance of data, reports, and colorful charts is what it means to be data-driven. But gaining a competitive advantage through data and analytics means making sure leaders have the information they need, when they need it.

But it’s extremely common for people in an organization to become complacent with the information that’s always been available, instead of demanding the data that will serve them better. The idea of “if all you have is a hammer, everything looks like a nail” is often evident in performance reporting … with too many indicators describing the same concept, while ignoring other critical parts of the organization.
To spark some "out of the box" thinking, it can help to do an assessment every so often to make sure the data and information are actually generating meaningful value for the organization.

Here are four questions to consider:

1) Do the current analytics help your leaders make decisions?
And do they support them in taking the right action? If not, why not? Take a hard look at the data and information available to users with respect to:

  • Timeliness: Is information provided in a timeframe where users are actually able to take action? Or is the information so far out of date by the time it’s reported, that it provides little value for performance management?
  • Trustworthiness: Are leaders having difficulty trusting the data, because of accuracy and reliability issues?
  • Relevance: Are leaders having difficulty in seeing the relevance of the existing data? For example, seeing your current performance relative to past levels, or relative to industry standards.
  • Understanding: Do leaders actually understand the data and metrics being reported? Do they know where the data comes from and what it represents? Are some of your metrics needlessly complicated? Is the data presented and charted in an immediately intuitive, visual way? Is it clear who is responsible for what? As best you can, figure out what your leaders really should know before taking actions based on the data and analytics.

If you have any of these issues, you'll need to resolve them if you want to increase the competitive value of your data and analytics.

2) Do you have too many metrics?
What are the few metrics (3 to 5) that cover the majority of what's important to the majority of people in your organization? For example, if you have 20 metrics, think about how you can group them into topics and create aggregate metrics (the way a GPA summarizes a student's letter-grade performance across courses). Reporting dozens of metrics is easy but not very actionable, because at any given time some metrics will look good and some will look bad. Slimming them down to a few key metrics is a tough job, but when done right it can be an incredibly powerful tool for communicating "what good performance looks like" throughout the organization.
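As an illustration, here is a minimal sketch of a GPA-style roll-up. The metric names, topic groupings, and scores are all hypothetical, and the sketch assumes each metric has already been scored from 0 to 100 against its target.

```python
import pandas as pd

# Each metric scored 0-100 against its target for the current period.
scores = pd.Series({
    "wait_time": 72, "throughput": 88, "backlog": 65,   # Access
    "error_rate": 91, "rework": 83,                     # Quality
    "cost_per_case": 77, "overtime": 60,                # Cost
})

# Map each metric to a topic, then report one aggregate score per topic.
topics = {
    "wait_time": "Access", "throughput": "Access", "backlog": "Access",
    "error_rate": "Quality", "rework": "Quality",
    "cost_per_case": "Cost", "overtime": "Cost",
}
print(scores.groupby(topics).mean().round(1))
# Access: 75.0, Cost: 68.5, Quality: 87.0
```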

3) What do your leaders want?
What would help them do their job better? Carry out working sessions with your leaders to figure out what the ideal data and analytics would look like, and more importantly, have them describe the actions they would take with this information. Ask them to think outside the box and not restrict themselves to the data they've seen before. Then ask your leaders to set priorities using the question: if you had to choose one metric, what would it be? If you do this as a group working session, you will generate a lot of informative discussion and a solid set of candidate metrics.

4) What is the ROI on the right metrics?
For your top-ranked metric that requires new data (i.e. data you don't currently collect or have available), do a cost-benefit analysis. How much is it worth to your organization to have this information? Taking this approach can reveal opportunities to make a strategic investment in order to gain a competitive advantage. See Tip 3 from How to Get the Data You Need for more details.
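As a sketch, the cost-benefit calculation can be as simple as the following; all figures are placeholders that show the shape of the arithmetic, not real estimates.

```python
# All figures are placeholders showing the shape of the calculation.
annual_cost_to_collect = 40_000      # instrumentation, storage, analyst time
decisions_informed_per_year = 6
value_per_better_decision = 20_000   # avoided cost or captured upside

annual_benefit = decisions_informed_per_year * value_per_better_decision
roi = (annual_benefit - annual_cost_to_collect) / annual_cost_to_collect
print(f"Annual benefit: ${annual_benefit:,}, ROI: {roi:.0%}")
# -> Annual benefit: $120,000, ROI: 200%
```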


If done right, this exercise will generate a burning need to make the investments that give leaders the information they need, when they need it. In doing so, it empowers them to take meaningful actions based on truly relevant data and analytics. Translating analysis into action will create a competitive advantage for your whole organization.