How to Misuse Global COVID-19 Statistics

Have you heard about Somalia’s COVID-19 policy?

In the official statistics, Somalia has just 3,362 confirmed infections and 97 confirmed deaths from COVID-19. On a per capita basis, the country has a death toll just barely above New Zealand, and below other widely reported success stories like South Korea, Japan, Norway, Germany, etc.

Yet, my guess is you probably have not heard much about how Somalia defeated the coronavirus. Why is that?

The reason is that, implicitly, no one thinks Somalia’s statistics reflect the underlying reality. Still the recurring target of US drone strikes and dealing with internal conflict, Somalia–like many other impoverished or conflict-torn countries around the world–has little capacity to deal with COVID-19.

Thus, their numbers probably aren’t low because they came up with a uniquely successful strategy. They’re low because they aren’t counting.

The Implications of Incomplete Data

All of this may seem obvious. However, the implications of this observation are routinely forgotten in reports on the coronavirus.

The problem is two-fold: First, Somalia is not an isolated example. There are numerous countries around the world with extremely low case numbers that are almost certainly caused by a lack of testing and counting, not actual policy success. Second, all of these same statistics roll up into the official global COVID-19 totals.

While everyone seems to recognize that numbers from Somalia and others are not real, they forget that this necessarily means that the global totals are also compromised.

We see versions of this error constantly, but here are a couple of examples to watch out for.

CNN: The US has 4% of the world’s population but 25% of its coronavirus cases

Depending on the date of the article and metric chosen (cases vs. deaths), the second percentage in this claim will fluctuate somewhat. The point is to show that the US accounts for a disproportionate share of damage caused by the coronavirus. That general claim is valid, but the uncritical use of global statistics greatly exaggerates the disparity.

According to Worldometers, approximately 1.5 billion people live in countries where very limited testing (less than 1% of population) has been done. Additionally, some large countries like India only recently ramped up their testing program, and still have totals well below the US for the moment.

This line is a standard inclusion in many stories on the coronavirus. The article used above is unusual only because it led with the claim in the headline.

NYT: America’s Death Gap

Here, The New York Times can be commended for at least making a passing reference to the old missile gap canard. Hopefully, this tipped readers off to the fact that what they were about to read was not true.

In the piece, the Times offers the following thought experiment:

If the United States had done merely an average job of fighting the coronavirus — if the U.S. accounted for the same share of virus deaths as it did global population — how many fewer Americans would have died?

The answer: about 145,000.

That’s a large majority of the country’s 183,000 confirmed coronavirus-related deaths.

The problem with this is not that their math is wrong; the problem is that they’re relying on figures that everyone–including them–knows or should know to be unreliable.

As of this writing, the official global average COVID-19 death toll is 114 per 1M people. But this figure is severely diluted by all the countries that have large populations and limited testing.

If we were to treat this global average as a real number, we are left with some rather implausible conclusions.

Yes, the US is much worse than the official average with 583 deaths per 1M. Fair enough.

But the same goes for countries like Canada and Switzerland, which have 242 and 232 deaths per 1M, respectively. Does anyone think Canada has performed twice as badly as the average country in this pandemic?

Likewise, oft-praised countries like Germany (112) and Denmark (108) appear to be only marginally better than average. Are we to believe these countries weren’t very successful after all?

No one thinks these other conclusions are true, but they follow from the Times’ line of analysis. The absurdity of the global average only becomes obvious when it’s compared to the results of other countries that are known as success stories.
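To make the comparison concrete, the per capita figures quoted above can be expressed as multiples of the official global average. What follows is a rough sketch in Python that uses only the numbers cited in this piece. The variable names are mine, and the last step assumes that the 183,000 US deaths and the 583 per 1M rate describe the same moment in time, which is only approximately true.

```python
# Deaths per 1M people, as quoted in the text above.
GLOBAL_AVERAGE = 114
deaths_per_million = {
    "US": 583,
    "Canada": 242,
    "Switzerland": 232,
    "Germany": 112,
    "Denmark": 108,
}


def ratio_to_average(rate, average=GLOBAL_AVERAGE):
    """Return a country's rate as a multiple of the official global average."""
    return rate / average


ratios = {country: ratio_to_average(rate) for country, rate in deaths_per_million.items()}
for country, ratio in ratios.items():
    print(f"{country}: {ratio:.2f}x the official global average")

# Rough re-creation of the Times' thought experiment: scale the US death
# count down to what it would be at the official global average rate.
# Assumption: both US figures refer to the same date and population.
US_CONFIRMED_DEATHS = 183_000
expected_at_average = US_CONFIRMED_DEATHS * GLOBAL_AVERAGE / deaths_per_million["US"]
excess = US_CONFIRMED_DEATHS - expected_at_average
print(f"US deaths 'above average': about {excess:,.0f}")
```

Taken at face value, the same arithmetic that puts the US at roughly five times the average also puts Canada and Switzerland at about twice the average, and leaves Germany and Denmark only a few percent below it. The excess figure lands in the same ballpark as the Times’ 145,000, which is the point: the math is fine, the inputs are not.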


The point here is not to defend the US’s track record on the coronavirus. It has performed abysmally on every conceivable metric, albeit not literally the worst. (That dubious honor belongs to Peru, with Belgium a close second.)

Rather, this is an argument for basic data literacy. If the details of a data set are unreliable, they don’t magically become reliable when you sum them up or pass them through an econometric model. Garbage in means garbage out.

We live in a time when almost every politician and pundit says we need to “follow the data”. Perhaps we should start by understanding its limitations.

Is Sweden’s COVID-19 Response a Cautionary Tale or a Model to Follow? It’s Complicated

In the ongoing debate about lockdowns in the US, Sweden has become the battleground.

To mainstream media outlets, Sweden’s experience is cited as a cautionary tale. CBS writes that Sweden has become “an example of how not to handle COVID-19”.

Meanwhile, to those who have been skeptical of the lockdown policy all along, Sweden’s results are occasionally cited in glowing terms. For instance, Jeffrey Tucker of AIER tweeted this out last week, showing that Sweden’s daily death toll has slowed to a crawl:

[Image: Jeffrey Tucker tweet showing the trend in Sweden’s daily deaths, 2020-07-20]

So which version is true?

Did Sweden’s less restrictive approach to COVID-19 usher in the hellscape that US public health officials have warned us about? Or is it actually a model for the rest of us to follow?

It’s too soon to know for sure. But the data we have so far suggests the answer is not black-and-white.

The Problem of Cherry-Picking

The CBS article points to Sweden’s per capita COVID-19 death toll to declare its policy a failure. Writing last month on July 17, CBS notes:

…the death toll from Sweden’s outbreak is now the fifth-worst in the world, per capita. The country’s mortality rate from the coronavirus is now 30% higher than that of the United States when adjusted for population size.

On that date, this was true. It is missing some important context, however.

For starters, if Sweden was fifth-worst in the world, why is the article about Sweden? If the point is to identify some COVID-19 policies that clearly failed, there would seem to be at least four candidates just as worthy of criticism.

Excluding the tiny nation-states of Andorra and San Marino, the four European countries that had experienced higher per capita death tolls than Sweden at the time of CBS’s piece were Belgium, the UK, Spain, and Italy. In the last two weeks, Peru has also overtaken Sweden in terms of per capita death toll.

All of these countries imposed lockdown policies, and still experienced death tolls higher than Sweden. Italy might have an excuse as the first major hotspot in Europe, but what accounts for the others?

This selective criticism of Sweden by mainstream media has been called out elsewhere with good reason. This is classic cherry-picking–finding facts to fit a predetermined narrative.

To be fair, Sweden’s proponents can also be guilty of omitting context.

In the tweet noted above, Tucker points to very low rates of new deaths as a sign of success. This is good news, but it doesn’t tell us much. It’s widely understood that viruses will burn themselves out eventually. The lockdown debate is about how best to mitigate the damage in the meantime.

In another example, this tweet from Yinon Weiss favorably compares the experience of Sweden to New York. The comparison is correct–Sweden has fared far better than New York on a per capita basis. However, this is better evidence of New York’s extreme failure than of Sweden’s success. If we draw sweeping conclusions from this data point, then it’s just cherry-picking in the other direction.

[Image: Yinon Weiss tweet comparing Sweden and New York on COVID-19, 2020-07-12]

Obviously, writing a tweet is different than writing an article. Twitter isn’t exactly built for nuance.

The point is that, so far, Sweden’s results are mixed. They don’t warrant a victory lap for anyone.

Voluntary Versus Coercive

While Sweden seems to be viewed by the rest of the world as a radical experiment when it comes to COVID-19, that’s not how Sweden sees itself.

Speaking to Nature magazine early on in the pandemic, Sweden’s state epidemiologist Anders Tegnell explained bluntly, “I think it has been overstated how unique [Sweden’s] approach is.”

For Tegnell, Sweden’s policy objective is the same as for most other Western countries–flatten the curve to avoid overrunning the healthcare system.

The primary difference is in Sweden’s laws. As he explained in the interview (emphasis added):

The Swedish laws on communicable diseases are mostly based on voluntary measures — on individual responsibility…This is the core we started from, because there is not much legal possibility to close down cities in Sweden using the present laws.

By itself, this almost implies that Sweden would have been just as coercive as other countries if they had the authority. (And as an American, it sounds extremely odd to hear a national government official acknowledge any legal constraints on their power. But I digress.)

However, in a separate April interview with Haaretz, Tegnell argued that the voluntary approach has strategic advantages over coercion. In particular, Tegnell noted that the voluntary measures could be kept in place for an extended period of time. In his words, “We believe that what we are doing is more sustainable and effective in the long term.”

The Importance of Sustainability

Like most other countries, Sweden’s experts share the view that the virus will only stop being a threat once herd immunity is reached or an effective vaccine is developed. “Every other solution is temporary,” Tegnell told Haaretz.
Since both of those solutions are likely months away, the sustainability of the policy response is critical. This is why Sweden’s approach might ultimately prove more successful than its peers.
Although Tegnell doesn’t say this outright, the subtext of Sweden’s approach seems to be that all countries’ COVID-19 policies will look like Sweden’s eventually.
The lockdowns cannot eradicate the virus on their own and cannot be kept in place indefinitely. That means that when the lockdowns are inevitably relaxed, the virus is still around and able to spread.
When the virus starts to spread anew, the authorities in most democratic countries won’t have the political ability to reimpose lockdowns. So their only real option is to impose lighter, mostly voluntary measures like Sweden has done from the start. 
Unfortunately, this is how things have played out in many places.
Consider the United States. Many states closed down before they had significant spread and reopened while new infections remained at low, but nonzero, levels. Now cases have surged in several states, and the lockdown measures being reimposed are far less strict than those enforced early on. Noncompliance is also on the rise. 
Today, most states’ lockdown policies are still more restrictive than Sweden’s. But this does show the unsustainable nature of the prior approach. Given that the states have landed on less restrictive policies anyway, the utility of the initial authoritarian policies is unclear. The collateral damage of those policies, on the other hand, is visible everywhere.

Premature Conclusions

While Sweden states that its policy goal is to flatten the curve like most other countries, it’s clear that it has taken a less aggressive approach.
It follows that Sweden’s virus curve should be steeper than the curve seen in lockdown jurisdictions. We can see this visually in the chart below (adapted from PBS):
(Since Sweden is taking some measures to slow the spread, it’s likely that the relative steepness of their curve wouldn’t be as radically different as what this graphic implies. But it does illustrate the nature of the difference we should expect.)
This presents a major challenge for gauging the success of the different approaches while the pandemic is still underway.
The total deaths experienced by the countries in the chart above would be found by taking the area underneath the curve. Because Sweden has accepted a steeper curve, it should experience more total deaths early on. But the number of new daily deaths in Sweden should also drop to near zero earlier than it will elsewhere.
That’s what these curves would suggest, and it’s consistent with what has actually happened.
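The area-under-the-curve reasoning can be illustrated with a small Python sketch. Everything in it is hypothetical: the two bell-shaped curves, their peak days, their widths, and especially the assumption that both end with the same total (1,000 deaths) are chosen only to show the shape of the argument, not to describe any real country.

```python
import math


def bell_curve(day, peak_day, width, total):
    """Daily deaths on a given day for a bell-shaped outbreak.

    The area under the whole curve is approximately `total`.
    """
    scale = total / (width * math.sqrt(2 * math.pi))
    return scale * math.exp(-0.5 * ((day - peak_day) / width) ** 2)


def steep(day):
    # Hypothetical lighter-measures curve: earlier, taller, narrower.
    return bell_curve(day, peak_day=40, width=10, total=1000)


def flat(day):
    # Hypothetical lockdown curve: later, lower, wider.
    return bell_curve(day, peak_day=80, width=25, total=1000)


def cumulative_deaths(curve, through_day):
    """Approximate the area under the curve by summing the daily values."""
    return sum(curve(day) for day in range(through_day + 1))


for day in (60, 100, 200):
    print(
        f"day {day}: steep total {cumulative_deaths(steep, day):.0f}, "
        f"flat total {cumulative_deaths(flat, day):.0f}, "
        f"steep daily {steep(day):.1f}, flat daily {flat(day):.1f}"
    )
```

At the early checkpoint the steep curve has accumulated far more deaths, and by the middle checkpoint its daily count has fallen to essentially zero while the flat curve is still producing deaths. Only at the final checkpoint do the totals converge, and that convergence is an assumption built into the sketch rather than a finding.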
Now we can see the problems with some of the condemnation and praise of Sweden’s results. Yes, Sweden has experienced more deaths than many of its peers. And yes, for now, the pandemic in Sweden seems to be mostly over even as it rages on elsewhere. Neither of these outcomes should come as a surprise.
To consider the question settled right now is rather like declaring victory based on the score at halftime. It’s not the end of the story.
[Figure: illustration of pandemic curves under different approaches, adapted from PBS]

Where We Go From Here

As I write this, the virus looks to be mostly contained in Europe, but continues to spread significantly in the US. Based on the data we have so far, it is unlikely that Sweden or any other large country has reached true herd immunity. Promising vaccine headlines get published regularly. But even in the most optimistic scenario, we’re a few months away from a vaccine being proven safe and effective, let alone mass-produced.
If current trends persist, the nationwide US per capita death toll is likely to catch and surpass Sweden’s in the coming months. In just the last two weeks, the gap between Sweden’s per capita death toll and that of the US has fallen from 30% to 19%. Virus cases have shot up in the most populous states (California, Texas, and Florida) that had been largely spared until this point. Increasingly, it looks like the US shutdown caused massive collateral damage without any lasting containment benefit. In a strictly US context, the Swedish approach is looking pretty good.
This conclusion is less obvious when looking at the results of countries in Asia, Europe, or the South Pacific. Countries like South Korea and Taiwan managed to slow the spread of COVID-19 with targeted quarantines instead of all-encompassing lockdown restrictions. Europe has several countries like Austria and Switzerland that locked down and then reopened quickly without reigniting a major new outbreak of the virus so far. In the South Pacific, New Zealand’s more comprehensive lockdown and travel restrictions managed to eliminate COVID-19 locally, and a major new outbreak has not yet occurred.
It’s too early to say which strategy will look optimal in the long run. The final analysis will also need to consider more holistic data points such as excess mortality and economic outcomes. That data isn’t available in real-time like the official COVID-19 statistics, but it will be necessary to properly compare the costs and benefits across countries.
For now, Sweden’s policy isn’t a panacea or a disaster. It remains a crucial control group for the lockdown experiments of 2020.
NPR Report on Florida’s Record Infections Misleads Its Audience

This Monday, NPR consumers woke up to this alarming report on the coronavirus:

Florida Smashes U.S. State Record Of Daily New Cases: More Than 15,200

Not content to scare readers in the headline, the dire framing continues in the body of the report, which was discussed on the Up First podcast. Some illustrative quotes below:

Florida reported 15,299 new coronavirus cases on Sunday, marking the largest single-day increase of any state since the start of the pandemic.
Sunday’s number exceeds New York’s peak of more than 12,200 new cases in one day back in April, when it was the epicenter of the outbreak…

As of Saturday, 7,186 people were hospitalized in Florida, according to the COVID Tracking Project.
Florida started reopening in early May and has continually shattered state records for single-day increases since cases began surging there in June.

Reading this, our discerning audience comes away with a couple of things:

  • Florida is shattering records for new infections
  • Things in Florida are so bad right now that they’re even worse than New York at its peak

The problem with the statistics they’re using here is not that they are explicitly incorrect. They are correctly citing the official case numbers.

The problem is that they fail to provide any of the context needed for someone to understand what those case numbers actually mean, and what they can tell us about the severity of Florida’s situation.

First things first: Reporting case numbers without adjusting for population is just confusing.

Yes, Florida really has set the state record for new positive confirmed cases in a single day. Florida is also the third most populous state in the country–behind Texas and California, but 10% larger than New York.

So Florida took the lead today. And if current trends continue, Texas and California will claim the mantle in a few weeks’ time. Those records will be trumpeted too, but they will matter just as little.

A better way to compare totals among states is per capita, usually per 100,000 people. On this score, Florida still would have beaten out New York with Sunday’s case count (71 per 100k vs. 63 per 100k), but that comparison is at least superficially meaningful. It also has the virtue that it can be compared to other states and counties.
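The per capita adjustment is simple enough to sketch in a few lines of Python. The case counts come from the NPR report quoted above; the state populations are approximate values I am assuming for illustration and are not part of the report.

```python
def per_100k(count, population):
    """Convert a raw count into a rate per 100,000 residents."""
    return count / population * 100_000


# Approximate state populations (assumed values, not from the NPR report).
POPULATION = {"Florida": 21_480_000, "New York": 19_450_000}

# Florida's record day versus New York's April peak ("more than 12,200").
florida_rate = per_100k(15_299, POPULATION["Florida"])
new_york_rate = per_100k(12_200, POPULATION["New York"])

print(f"Florida: {florida_rate:.0f} per 100k")
print(f"New York: {new_york_rate:.0f} per 100k")
```

Under those population assumptions, the results round to the 71 and 63 per 100k cited above. The same function works for hospitalizations or deaths, which is what makes per capita figures comparable across states and counties.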

A more urgent problem with this analysis is that we know the official case numbers dramatically understate the true number of infections. This remains a problem today, but it was much worse early in the pandemic.

The reason for the understatement is the lack of testing capacity. When New York was at its peak, it only had enough tests to use them on healthcare workers and people with the most severe symptoms. It couldn’t test everyone with symptoms, let alone everyone who had contact with a COVID-positive person. As a result of this necessary strategy, many infections went undetected, and the percentage of tests coming back positive–the positivity rate–was extremely high.

Notably, this wasn’t really New York’s fault. It’s more a by-product of them being first (and the CDC and FDA botching the testing rollout).

In Florida today, the total number of infections is still being undercounted. But the testing shortage is less acute, so more people can be tested and the confirmed number is going to be closer to the real thing.

We can see this reflected in the data in two ways. The first is the positivity rate mentioned before. The second is per capita hospitalizations. (Since severely ill patients were able to get tested in New York, we can assume the hospitalization number there is reasonably accurate.)

At the height of its outbreak, New York was reporting a weekly positivity rate over 40% and 97 out of every 100,000 people were hospitalized.

But as of July 13, in Florida, the weekly positivity rate stands just below 19% and 37 out of every 100,000 people are hospitalized. (Note that my data source here–the COVID Tracking Project–only recently started getting hospitalization data for Florida, so that data point doesn’t exist earlier in the crisis.)

To be sure, Florida is definitely having an outbreak right now. No one is suggesting these numbers are a cause for celebration.

But it’s completely misleading to suggest that Florida’s situation today is worse than things were in New York a few months ago. The only metric that supports this notion right now is the one we know, with certainty, is unreliable.

The point here is that we want people to have a realistic understanding of how bad or benign things truly are. It’s okay if people are worried, but we want the level of concern to be proportionate to what’s actually happening.

So yes, it’s a problem when Trump says the coronavirus is just like the flu and will magically disappear. It’s also a problem when hyperbolic media reports make people so afraid of COVID-19 that they refuse to go to the ER when they have a heart attack.

Three Observations on the Second Wave of COVID-19

The long-feared second wave of COVID-19 in the United States appears to have arrived. National case numbers are making new records and two states have started to move back towards quarantine.

Since news reports on the virus continue to emphasize the wrong metrics, some important facts about the new wave often get missed. Here are three things to know about the new rise in cases.

1. The recent rise in cases cannot be explained by increases in testing.

This is an important point because the onset of a second jump in cases has been declared prematurely several times before now. These types of reports came in different flavors. Sometimes, they calculated percentage growth rates off extremely small numbers (at a county level for instance) to report an eye-popping rate of growth. More commonly, they failed to highlight the fact that total testing was increasing faster than new cases–suggesting the virus was probably just as common as it had been days earlier, but the state was able to confirm more cases.

This time is different. Many states are experiencing both a rise in absolute cases and a rise in the percentage of tests coming back positive (the positivity rate). This is a clear sign that things in these places are getting worse with respect to the coronavirus.

As of June 27, these are all the states that were seeing both a positivity rate above 10% and a weekly rise in that rate. The original data for the table below comes from The COVID Tracking Project:

So, while not all of these states are in crisis right now, it is correct to say we are seeing a pronounced jump in cases in many places. It’s not just a figment of the data / reporting like it had been before.
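The distinction drawn in this section, between a rise that merely reflects more testing and one that reflects more virus, can be sketched in Python as follows. The function names, the tuple layout, and the weekly figures for the two states are all made up for illustration; only the 10% threshold comes from the text.

```python
def positivity_rate(positives, tests):
    """Share of tests that came back positive."""
    return positives / tests


def rise_explained_by_testing(prev, curr):
    """True if cases rose but the positivity rate did not.

    `prev` and `curr` are (positives, tests) tuples for consecutive weeks.
    """
    cases_rose = curr[0] > prev[0]
    positivity_rose = positivity_rate(*curr) > positivity_rate(*prev)
    return cases_rose and not positivity_rose


def flagged(prev, curr, threshold=0.10):
    """True if positivity is above the threshold and rising week over week."""
    current_rate = positivity_rate(*curr)
    return current_rate > threshold and current_rate > positivity_rate(*prev)


# Made-up weekly (positives, tests) figures for two hypothetical states.
state_a = ((1_000, 20_000), (1_500, 37_500))  # 5% -> 4%: more tests, not more virus
state_b = ((1_000, 20_000), (3_000, 25_000))  # 5% -> 12%: a genuine rise

for name, (prev, curr) in {"A": state_a, "B": state_b}.items():
    print(
        f"State {name}: testing explains rise = {rise_explained_by_testing(prev, curr)}, "
        f"flagged = {flagged(prev, curr)}"
    )
```

In this sketch both states report more cases than the week before, but only the second one would make a list like the one described above.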

2. The rise in cases is primarily occurring in places that were not hit hard before.

When you look at a chart of the national numbers of new cases, you can clearly see the second wave that’s occurring. Using the data through June 27, our weekly figure for new cases is rapidly approaching the previous peaks set in April.

The trend below shows rolling 7-day positive cases for the US (all states plus DC and Puerto Rico) per 100,000 people:

On a national basis, the second wave description looks appropriate. But this obscures very different trends at the state level. In reality, the places where cases are rising did not experience much of a peak earlier in the crisis.
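For readers who want to reproduce this kind of trend for a single state, here is a minimal Python sketch of a rolling 7-day per capita figure. It assumes the metric is a 7-day sum of new cases divided by population; the daily counts and the population of 1,000,000 are made up for illustration.

```python
def rolling_7day_per_100k(daily_cases, population):
    """Rolling 7-day total of new cases per 100,000 people.

    Returns one value for each day that has a full 7-day window behind it.
    """
    rates = []
    for end in range(7, len(daily_cases) + 1):
        window_total = sum(daily_cases[end - 7:end])
        rates.append(window_total / population * 100_000)
    return rates


# Made-up daily new-case counts for a hypothetical region.
daily = [100, 120, 90, 110, 130, 150, 100, 200, 220]
trend = rolling_7day_per_100k(daily, population=1_000_000)
print([round(rate, 1) for rate in trend])
```

Running the same function separately on each state’s series, rather than on the national total, is what exposes the diverging state-level trends described above.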

Consider the trends below for four states that have been making headlines in the past few days: Texas, Florida, California, and Arizona.

In this data, we see that Texas experienced a slight uptick in April, but it was much smaller than what we’re seeing now. For Arizona, Florida, and California, the recent rise they are experiencing is their first serious increase when adjusted for population. Conversely, states like New York and New Jersey saw their large increase in March and April, but are now seeing stable or declining cases.

This pattern demonstrates one of the many problems with demanding a nationwide lockdown in a country as large as the US. In effect, all states shut down (to varying degrees) based on the experience of New York, New Jersey, and a couple other hotspots. They did this without regard to whether the outbreaks they were experiencing could possibly warrant such dramatic action.

Now that some of these same states are facing a real outbreak close to home, they’re starting to impose a new round of limited restrictions. In the face of an economy on life support, widespread social unrest, and an election year, it’s unclear how much people will tolerate or comply with another aggressive attempt at quarantine.

3. So far, the second wave appears to be less lethal than the first wave.

Another important characteristic of the second wave is that, at least so far, it looks to be less deadly than the earlier spikes.

Some commentators have made this point by looking at the trends in death counts for the new hotspots. However, this is not a good way to evaluate the lethality of the second wave at this point.

Deaths are a lagging indicator in this data. According to facts summarized by Our World in Data, death typically occurs between 2 and 8 weeks after the onset of symptoms, which in turn show up several days after initial infection. Many of the cases being discovered now will eventually prove fatal, but they won’t be counted for several weeks. Looking at death rates today in the new hotspots risks providing a false sense of reassurance.

A better way to evaluate the likely lethality of this second wave in real time is to look at the trends in hospitalized COVID-19 cases.

We saw previously that the rise in new cases is now above the levels seen in April. Fortunately, for now the hospitalization data is not following the same trajectory.

In the chart below, we see the national trend in per capita positive cases combined with per capita hospitalization: