Follow The Data! (Covid-19)

We keep hearing FOLLOW THE DATA and as a data nerd I say “Well ok, where the hell is the data?”

It turns out it’s on the CDC website, organized and displayed badly, exportable into nasty CSV files, and with data normalization standards that would not only get you flunked out of an Introduction to MySQL course, the professor would probably risk his career beating the shit out of you in front of the entire class so this never, ever happens again.


Example: deaths are tallied for an 18-29 years old demo, while comorbidities are broken out by a 0-24 demo (or 25-34!). The age bins don’t line up, which is an obvious problem for anyone who wants to combine these datasets (a join, in database terms) and output something meaningful and interesting.

Me: Hi CDC Website, what # or % of all infections does the 18-29 demo comprise?

CDC Website: well, we ain’t got that…but you interested in infection data on dat fire 20-29 demo, son?!

WHY CDC?!
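To make the bin mismatch concrete, here is a quick Python sketch. It is only an illustration: the function name `can_rebuild` and the bin list are my own, not anything the CDC publishes, and the open-ended 85+ bin is left out to keep it simple. The idea is that you can only rebuild a target age range from a dataset if the range is made up of whole, back-to-back bins.

```python
# Age bins as (low, high), inclusive. These are the comorbidity bins
# mentioned in this post; the 85+ bin is omitted for simplicity.
COMORBIDITY_BINS = [(0, 24), (25, 34), (35, 44), (45, 54),
                    (55, 64), (65, 74), (75, 84)]


def can_rebuild(target, source_bins):
    """Return True if `target` is exactly a union of whole, contiguous source bins."""
    lo, hi = target
    # Keep only the source bins that fit entirely inside the target range.
    inside = sorted(b for b in source_bins if lo <= b[0] and b[1] <= hi)
    if not inside or inside[0][0] != lo or inside[-1][1] != hi:
        return False
    # Every bin must start right where the previous one ended.
    return all(a[1] + 1 == b[0] for a, b in zip(inside, inside[1:]))


print(can_rebuild((18, 29), COMORBIDITY_BINS))  # False: 18-29 straddles 0-24 and 25-34
print(can_rebuild((25, 44), COMORBIDITY_BINS))  # True: 25-34 plus 35-44
```

So the 18-29 death counts can’t be matched to the comorbidity data without guessing how to split the 0-24 and 25-34 bins.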

So most rational people want to know how badly this virus affects their demographic and what are some of the health factors involved, so they can protect themselves and adjust their anxiety level accordingly. This is totally normal.

But some people won’t like this data being arranged logically and clearly and will probably mention the unknown long-term effects of Covid, a few anecdotal nightmare cases, etc. Anything to crank the anxiety levels back up.

There is a certain level of anti-science/anti-logic superstition at play here, and it’s enforced by the “follow the science/data!” mask nazi types. That superstition is basically: you must give Covid your full attention and respect, much like a tribute or penance, otherwise you run the risk of a very embarrassing and ironic Covid illness or death, which WE WILL celebrate!

However, the data is pretty clear. Here are some top-level numbers (ND = no data):

Deaths By Age Group & Comorbidities                Deaths   # Comorbidities   Avg Comorbidities
All Ages                                          249,570         1,896,103                7.60
Under 1 year                                           29                ND                  ND
0-17 years                                            127                ND                  ND
1-4 years                                              16                ND                  ND
5-14 years                                             44                ND                  ND
15-24 years                                           439                ND                  ND
0-24 Compiled Due to Data Normalization Issues        655             3,011                4.60
18-29 years                                         1,059                ND                  ND
25-34                                               1,852            12,404                6.70
30-49 years                                        10,916                ND                  ND
35-44                                               4,771            33,627                7.05
45-54                                              12,701            95,715                7.54
50-64 years                                        38,625                ND                  ND
55-64                                              30,875           240,061                7.78
65-74                                              53,579           424,497                7.92
75-84                                              67,305           518,834                7.71
85+                                                77,959           567,954                7.29
Median CMs                                                                                 7.41
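For anyone who wants to check the math outside of Excel, here is a rough Python sketch. It assumes the Avg Comorbidities column is just # Comorbidities divided by Deaths, and that the Median CMs figure is the median of the eight per-group averages; that is one reading of the table, not a documented CDC method. The variable names are made up for illustration.

```python
from statistics import median

# (deaths, comorbidities) for each age group that has comorbidity data,
# copied from the table above.
ROWS = {
    "0-24": (655, 3_011),
    "25-34": (1_852, 12_404),
    "35-44": (4_771, 33_627),
    "45-54": (12_701, 95_715),
    "55-64": (30_875, 240_061),
    "65-74": (53_579, 424_497),
    "75-84": (67_305, 518_834),
    "85+": (77_959, 567_954),
}

# Average comorbidities per death for each group.
averages = {group: cms / deaths for group, (deaths, cms) in ROWS.items()}

# Median of the per-group averages (assumed meaning of "Median CMs").
median_avg = median(averages.values())

for group, avg in averages.items():
    print(f"{group:>6}  {avg:.2f}")
print(f"Median of group averages: {median_avg:.2f}")
```

Under those assumptions the per-group averages and the 7.41 median in the table come out the same.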

If you don’t live in hospice or long-term care your chances of surviving Covid go way up.

If you’re not a 45-85 year old person with an average of 7 (median of 4) comorbidities your chances of surviving Covid go WAY up.

If you’re a very young healthy person Covid should hardly be on your radar (unless you’re visiting old sick people often). Literally be more worried about lightning storms and unsafely driven buses.

I think we can all agree that these institutions have failed us, public and private. I’m just a regular guy with above average MS Excel skills; there are people galaxies better than I am at this sort of thing, with far more resources (I spent 2 hours this afternoon on this). Why the absolute lack of curiosity and the substandard data quality? It almost feels like some people don’t really want us to FOLLOW THE DATA.

For my fellow Excel nibbas the link to this workbook is here:

Google Sheets hobos can click here:
