COVID-19 - Queens, et al

This is a heads-up, and a warning, for anyone using the Johns Hopkins data.

To be clear, the COVID-19 plotter application uses data supplied by Johns Hopkins via github. They claim that this is the same data used in their World Map, but this analysis (05-05-2020) shows that it is not.

To be absolutely specific, for several New York reporting entities (boroughs), the data on github shows a cumulative value of zero, whereas today's values on their map are shown in the table below.

Spot checks of various counties in Virginia and Florida show that the github data and the map match exactly there. Only New York seems to have a problem.

Currently, there are 275 US counties reporting zero confirmed cases and 1,636 US counties reporting zero deaths. (The dataset has a total of 3,147 US counties, territories, cities, and other entities.)
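A count like this can be reproduced in a few lines of Python. The following is only a minimal sketch, not the plotter's actual code. It assumes a CSV with one row per reporting entity in which the last column holds the latest cumulative value; the column names and the three sample rows are made up for illustration.

```python
import csv
import io

# Hypothetical sample: the layout is an assumption and the values are made up.
SAMPLE = """Admin2,Province_State,5/4/20,5/5/20
Alpha,Virginia,10,12
Beta,New York,0,0
Gamma,Florida,0,3
"""

def count_zero_latest(csv_text):
    """Return (rows whose latest cumulative value is zero, total rows)."""
    reader = csv.reader(io.StringIO(csv_text))
    next(reader)  # skip the header row
    rows = [row for row in reader if row]  # ignore blank lines
    zeros = sum(1 for row in rows if float(row[-1]) == 0)
    return zeros, len(rows)

zeros, total = count_zero_latest(SAMPLE)
print(f"{zeros} of {total} entities report zero")
```

Running the same function once on the confirmed file and once on the deaths file would give the two counts separately.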

Initial Analysis | Number of cases does not add up | Population Issues | Would 5,000 make a difference


Initial Analysis

I originally thought that the data was not "missing" but rather reported in a misleading way.

The github data shows New York, New York as

Their map has

Since the map omitted the population, I simply computed it.

I assumed that the github files grouped the data for the 5 boroughs under New York City, New York, but provide the population per borough. Thus, to get correct per 100,000 results, you need to include all 5.
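The effect of that assumption on a per 100,000 figure can be sketched as follows. All names and numbers here are hypothetical and chosen only to make the arithmetic easy to follow; they are not taken from the github files or the map.

```python
def per_100k(count, population):
    """Express a cumulative count per 100,000 residents."""
    if population <= 0:
        raise ValueError("population must be positive")
    return count * 100_000 / population

# Hypothetical case: every case is filed under one entry,
# but the population is listed separately for five sub-areas.
populations = {"A": 1_000_000, "B": 2_000_000, "C": 500_000,
               "D": 1_000_000, "E": 500_000}
city_cases = 50_000

# Dividing by a single sub-area's population overstates the rate.
rate_one_area = per_100k(city_cases, populations["A"])

# Dividing by the combined population gives the rate for the whole city.
rate_combined = per_100k(city_cases, sum(populations.values()))

print(rate_one_area, rate_combined)
```

With these made-up numbers the single-area rate is five times the combined rate, which is the kind of distortion the grouping would cause.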

When the 5 boroughs are combined,

whereas, using just the data as provided

However, it turned out that there were bigger problems than that.


Number of cases does not add up

Working on the assumption that the sum of the World Map numbers might equal what is provided via github, I decided to check.

So, apparently, there is data missing, or double counted, or who knows what.

One plausible theory is that the github data includes deaths that are reported as probably COVID-19 related but have no test to confirm it. (There are several reports of hospitals doing just that - perhaps to get the government payments, perhaps not. News reports about this keep disappearing from the web.) However, in that case, the number of unexplained deaths should equal the number of unexplained confirmed cases. The fact that the two differ by 331 argues against that theory.

At any rate, it is hard to trust a data source with these types of issues - but it is the best available.


Population Issues

The github data has population by US county. The Johns Hopkins map has confirmed cases and confirmed cases per 100,000, which together can be used to compute the population. The table below compares the two population values. Whether the difference is small or large depends on what you are using it for.
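The back-calculation is simple arithmetic: population = confirmed x 100,000 / (confirmed per 100,000). A minimal sketch follows. The function names and the sample numbers are hypothetical, and if the rate shown on the map is rounded, the result is only approximate.

```python
def implied_population(confirmed, confirmed_per_100k):
    """Back out the population from a count and its per-100,000 rate."""
    if confirmed_per_100k <= 0:
        raise ValueError("rate must be positive")
    return confirmed * 100_000 / confirmed_per_100k

def percent_difference(listed, implied):
    """Difference of the implied population relative to the listed one, in percent."""
    return (implied - listed) / listed * 100

# Hypothetical values, for illustration only.
listed_population = 1_984_000   # as it might appear in the github file
map_confirmed = 2_500           # as it might appear on the map
map_rate = 125.0                # confirmed per 100,000 on the map

implied = implied_population(map_confirmed, map_rate)
diff = percent_difference(listed_population, implied)
print(f"implied population {implied:,.0f}, difference {diff:.2f}%")
```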

Since the total number of COVID-19 deaths in New York State is just under 25,000, that might be a big difference. On the other hand, it is only a 0.8% error (uncertainty).

However, since Johns Hopkins advertises the github data as being the same as what they use on their map, the discrepancy could be very significant: it indicates a lack of quality control in what they are providing.

But that choice is yours.


Would 5,000 make a difference

For New York state as a whole

Well, I think that is a big difference.

To put this in perspective, the normal rate appears to be between 2 and 14 deaths per 100,000. Omitting New York and New Jersey, the US has a rate of about 12; including them pushes it up to just over 20.


Author: Robert Clemenzi
URL: http://mc-computing.com/Science_Facts/COVID-19/Queens.html