To be clear, the COVID-19 plotter application uses data supplied by Johns Hopkins via GitHub. They claim that this is the same data used in their World Map, but this analysis (05-05-2020) shows that it is not.
To be specific, for several New York reporting entities (boroughs), the data on GitHub shows a cumulative value of zero, whereas today's values on their map are shown in the table below.
New York, New York Data as of 05-05-20

Borough | Confirmed | Deaths |
---|---|---|
Queens | 52,845 | 4,102 |
Kings | 45,341 | 4,080 |
Bronx | 38,973 | 2,936 |
Richmond | 12,169 | 636 |
Currently, there are 275 US counties reporting zero confirmed cases and 1,636 US counties reporting zero deaths. (The dataset has a total of 3,147 entries for US counties, territories, cities, and other entities.)
Initial Analysis
The GitHub data shows New York, New York as

Confirmed 175,651; Deaths 19,057; Population 5,803,210

Confirmed 21,125; Deaths 1,774; Population 1,432,130 (computed from 2,721.33 confirmed per 100,000)
I assumed that the GitHub files grouped the data for the 5 boroughs under New York City, New York, but provided the population per borough. Thus, to get correct per 100,000 results, you need to include the populations of all 5.
When the 5 boroughs are combined,

For all 5: 1,403.94 confirmed per 100,000; 152.32 deaths per 100,000

New York City, New York, US: 3,026.79 confirmed per 100,000; 328.39 deaths per 100,000
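The per 100,000 figures above can be reproduced with a short calculation. This is only a minimal sketch, not the plotter's actual code, and the function name `per_100k` is hypothetical. It assumes that the "all 5" population is the 5,803,210 listed for New York City plus the four borough populations listed on GitHub (2,253,858 + 2,559,903 + 1,418,207 + 476,143), since that combination matches the quoted rates.

```python
def per_100k(count, population):
    """Return the rate per 100,000 people."""
    return count / population * 100_000

# Populations as listed in the GitHub data (assumed combination, see text)
nyc_population = 5_803_210
borough_populations = [2_253_858, 2_559_903, 1_418_207, 476_143]
combined_population = nyc_population + sum(borough_populations)

# Cumulative values GitHub lists for New York City
confirmed, deaths = 175_651, 19_057

print(f"NYC only: {per_100k(confirmed, nyc_population):,.2f} confirmed, "
      f"{per_100k(deaths, nyc_population):,.2f} deaths per 100,000")
print(f"All five: {per_100k(confirmed, combined_population):,.2f} confirmed, "
      f"{per_100k(deaths, combined_population):,.2f} deaths per 100,000")
```

Dividing the same counts by the larger combined population is what cuts the rates roughly in half.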
Number of cases does not add up
New York, New York Data as of 05-05-20

Borough | Confirmed | Deaths |
---|---|---|
Queens | 52,845 | 4,102 |
Kings | 45,341 | 4,080 |
Bronx | 38,973 | 2,936 |
Richmond | 12,169 | 636 |
New York City | 21,125 | 1,774 |
Sum | 170,453 | 13,528 |
New York City via github | 175,651 | 19,057 |
Difference | 5,198 | 5,529 |
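The sums and differences in the table can be checked with a few lines of code. This is a minimal sketch using only the values from the table; the variable names are mine and are not part of any Johns Hopkins file.

```python
# (confirmed, deaths) per reporting entity, copied from the table above
map_values = {
    "Queens": (52_845, 4_102),
    "Kings": (45_341, 4_080),
    "Bronx": (38_973, 2_936),
    "Richmond": (12_169, 636),
    "New York City": (21_125, 1_774),
}
github_total = (175_651, 19_057)  # New York City via GitHub

# Add up each column across all five rows
sum_confirmed = sum(c for c, _ in map_values.values())
sum_deaths = sum(d for _, d in map_values.values())

# How much the GitHub totals exceed the sum of the parts
diff_confirmed = github_total[0] - sum_confirmed
diff_deaths = github_total[1] - sum_deaths

print(f"Sum:        {sum_confirmed:,} confirmed, {sum_deaths:,} deaths")
print(f"Difference: {diff_confirmed:,} confirmed, {diff_deaths:,} deaths")
print(f"Gap between the two differences: {diff_deaths - diff_confirmed:,}")
```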
So, apparently, there is data missing, or double counted, or who knows what.
One likely theory is that the GitHub data includes deaths that are reported as probably COVID-19 related but were never confirmed by a test. (There are several reports of hospitals doing just that - perhaps to get the government payments, perhaps not. News reports about this keep disappearing from the web.) However, in that case, the number of unexplained deaths should equal the number of unexplained confirmed cases. The difference of 331 between the two (5,529 versus 5,198) argues against that theory.
At any rate, it is hard to trust a data source with these types of issues - but it is the best available.
Population Issues
New York, New York Population Data as of 05-05-20

Borough | From github | Computed |
---|---|---|
Queens | 2,253,858 | 2,278,902 |
Kings | 2,559,903 | 2,582,826 |
Bronx | 1,418,207 | 1,432,131 |
Richmond | 476,143 | 476,179 |
New York City | - | 1,432,130 |
Sum | - | 8,202,168 |
New York City via github | 8,140,241 | - |
Difference | 61,927 |
Since the total number of COVID-19 deaths in New York State is just under 25,000, a population discrepancy of 61,927 might be a big difference. On the other hand, it is only a 0.8% error (uncertainty) in the population.
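The population sum, the difference, and the 0.8% figure can be checked the same way. This is a minimal sketch using the "Computed" column and the GitHub total from the population table; taking the percentage relative to the GitHub value is my assumption.

```python
# "Computed" populations from the table: Queens, Kings, Bronx, Richmond, New York City
computed_populations = [2_278_902, 2_582_826, 1_432_131, 476_179, 1_432_130]
github_population = 8_140_241  # New York City via GitHub

computed_sum = sum(computed_populations)
difference = computed_sum - github_population

# Relative error, measured against the GitHub value (assumed baseline)
percent_error = difference / github_population * 100

print(f"Sum of computed populations: {computed_sum:,}")
print(f"Difference from GitHub:      {difference:,}")
print(f"Relative error:              {percent_error:.1f}%")
```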
However, since Johns Hopkins advertises the GitHub data as being the same as what they use on their map, this could be very significant, since it indicates a lack of quality control in what they are providing.
But, that choice is yours.
Would 5,000 make a difference
24,999 / 23,628,065 = 105.8 deaths per 100,000

20,000 / 23,628,065 = 84.6 deaths per 100,000
To put this in perspective, the normal rate appears to be between 2 and 14 deaths per 100,000. Omitting New York and New Jersey, the US has a rate of about 12; including them pushes it up to just over 20.
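To see how much roughly 5,000 deaths move the rate, the two calculations above can be compared directly. This is a minimal sketch using the death counts and the population of 23,628,065 from the calculation above; the variable names are hypothetical.

```python
population = 23_628_065  # population used in the calculation above

deaths_high = 24_999  # just under 25,000
deaths_low = 20_000   # roughly 5,000 fewer

rate_high = deaths_high / population * 100_000
rate_low = deaths_low / population * 100_000

print(f"{deaths_high:,} deaths: {rate_high:.1f} per 100,000")
print(f"{deaths_low:,} deaths: {rate_low:.1f} per 100,000")
print(f"Change: {rate_high - rate_low:.1f} per 100,000")
```

The change by itself is larger than the 2 to 14 range quoted above, so 5,000 deaths would make a visible difference.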
Author: Robert Clemenzi