Tuesday, April 27, 2021
Mapping the virus in Michigan
Over the last week I took on a little project. The Michigan government website where I download COVID data each week has a map with each county colored according to the severity of the disease in that county. My problem with the map is that the color is based on the entire duration of the pandemic, so it doesn’t indicate which county was hot in November and which was hot this past March.
So I put my computer programming skills to work, skills developed while working as a programmer for 27 years in the auto industry. I had the virus data already – I could use the same data I download every week. Over the last few years I had been writing programs to draw various diagrams and charts for my genealogy research – also the virus chart I generate each week. This should be straightforward.
What I didn’t have was map data. Various websites said upload your data with us and we’ll draw maps for you. Others offered a program to buy, but I didn’t want to learn yet another programming language (I’ve already been learning Python, which wasn’t a thing when I did this for pay).
A bit of digging led me to the US Census website. They have maps to better look at census data (and they offer a lot more than the number of people in a census tract). I could download geometric data for all the counties in America.
This was in shapefile format. This format has both geometry and associated data. So I had to find a Python shapefile reader, then struggled to use it when it didn’t install as expected.
Even so, the evolving program could soon draw a map! That came once I figured out how to find the Michigan counties (strangely, they are not consecutive in the file) and how the data represented islands. Online examples were a big help. I hadn’t known there was data for sixty little islands around Isle Royale.
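For anyone curious about the islands: a shapefile polygon stores all of a shape's points in one flat list, plus a "parts" list giving the index where each ring (mainland, island, and so on) starts. Here is a minimal sketch of splitting the flat list back into rings. The function name and the sample coordinates are made up for illustration, and this is not necessarily how my program does it.

```python
def split_parts(points, parts):
    """Split a flat point list into one ring per part.

    points: list of (lon, lat) tuples for the whole shape.
    parts:  start index of each ring within points, in ascending order.
    """
    # Each ring ends where the next one starts; the last ends at len(points).
    ends = list(parts[1:]) + [len(points)]
    return [points[start:end] for start, end in zip(parts, ends)]


# Made-up shape: a "mainland" square followed by a small "island" triangle.
points = [
    (0.0, 0.0), (0.0, 1.0), (1.0, 1.0), (1.0, 0.0), (0.0, 0.0),  # mainland
    (2.0, 2.0), (2.0, 2.5), (2.5, 2.0), (2.0, 2.0),              # island
]
parts = [0, 5]

rings = split_parts(points, parts)
for ring in rings:
    print(len(ring), "points, starting at", ring[0])
```

Each ring can then be drawn as its own closed outline, so an island is not joined to the mainland by a stray line.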
This first data set had a big problem. It showed the county boundary at the state boundary. That would be no problem for a state like Arizona, but the state boundary of Michigan runs up the middle of Lake Michigan, which is a long way from shore. This also affected counties along Lake Erie, Lake St. Clair, Lake Huron, and Lake Superior.
The second data set had some errors. One example was a lake that is in two counties. This data showed the entire lake in one county. It looked like that county reached into its neighbor.
So on to a third set. This one was much lower resolution, which was just fine for my needs.
But the map looked fat. Oh, yeah: the data points were latitude and longitude, and one degree north-south is a different number of miles than one degree east-west. I added a correction. Much better.
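One common way to make that kind of correction is to shrink the east-west coordinate by the cosine of a latitude near the middle of the map. The sketch below assumes that approach and assumes a middle latitude of about 45 degrees for Michigan. Both the function name and that value are illustrative, not taken from my program.

```python
import math


def project(lon, lat, mid_lat):
    """Scale longitude by cos(mid_lat) so x and y are in comparable units."""
    return lon * math.cos(math.radians(mid_lat)), lat


MID_LAT = 45.0  # assumed: roughly the middle latitude of Michigan

# A box one degree wide and one degree tall, before correction.
x0, y0 = project(-84.0, 44.0, MID_LAT)
x1, y1 = project(-83.0, 45.0, MID_LAT)

width = x1 - x0   # about 0.71 after correction
height = y1 - y0  # still 1.0
print(round(width, 2), height)
```

Without the correction the box would be drawn square, which is why the uncorrected map looks stretched sideways.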
Here’s the map before I added virus data. Even with sixteen colors I ended up with adjacent counties with the same color because the counties in the file aren’t in any particular order.
I read in the COVID spreadsheet and selected colors for my data scale. Then a big problem. What data do I represent?
Do I choose the color based on the number of cases in a county? If so, Wayne County with 1.75 million residents will always have a high number of cases and Keweenaw County, with 2116 residents, won’t. Oakland County had 21839 cases one month (out of 1.2 million residents). That’s larger than the population of 22 counties.
So I chose the number of cases as a percent of population. That works better. The highest value was Baraga County in November where 3.8% of the population became infected.
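The arithmetic is simple. As a small sketch, here it is with the Oakland County figures from above (21839 cases, roughly 1.2 million residents); the function name is just for illustration.

```python
def percent_infected(cases, population):
    """Return new cases as a percent of the county's population."""
    return 100.0 * cases / population


oakland = percent_infected(21839, 1_200_000)
print(f"{oakland:.2f}%")  # prints 1.82%
```

So a month that looks enormous by raw count comes out at under 2% of the county, which is what makes large and small counties comparable.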
I first tried one color scale based on one maximum value so that I could compare one monthly map to another. That meant most of the maps showed most counties at the lowest level.
I finally settled on the percent of population with the scale recomputed for each map. There is lots of color in these maps. Even so, that isn’t completely satisfactory. The maps for July 2020 and March 2021 have a similar number of brightly colored counties, but the July max is 0.7% and the March max is 3.0%. Then there’s the February 2021 map that looks quite severe with big areas of orange, but at 0.7% its peak is less than a quarter of the peak in March.
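To show the tradeoff between the two scales, here is a minimal sketch that turns a percentage into a color number, first against one fixed maximum (the 3.8% from Baraga County in November) and then against each map's own maximum. The eight-color scale and the function name are assumptions for the example, not what my program uses.

```python
def color_index(value, max_value, n_colors):
    """Map a value in [0, max_value] to a color index in [0, n_colors - 1]."""
    if max_value <= 0:
        return 0
    idx = int(value / max_value * n_colors)
    return min(idx, n_colors - 1)  # the maximum itself gets the top color


N_COLORS = 8       # assumed size of the color scale
OVERALL_MAX = 3.8  # highest monthly value in any county (percent)

july_peak, march_peak = 0.7, 3.0  # each month's highest county (percent)

# One fixed scale: maps are comparable, but July barely leaves the bottom.
july_fixed = color_index(july_peak, OVERALL_MAX, N_COLORS)
march_fixed = color_index(march_peak, OVERALL_MAX, N_COLORS)

# Scale recomputed per map: both months reach the top color.
july_own = color_index(july_peak, july_peak, N_COLORS)
march_own = color_index(march_peak, march_peak, N_COLORS)

print(july_fixed, march_fixed, july_own, march_own)
```

With the fixed scale July's worst county lands near the bottom of the scale, while with the per-map scale July and March both reach the top color even though March's peak is more than four times higher.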
The county on the eastern edge in the second row of counties from the south is Wayne County. Within it is the city of Detroit. I added Detroit because the virus data reported Detroit and Wayne County as two separate areas. Adding it meant clicking on the border to read off latitude and longitude.
Last Saturday I downloaded Michigan’s COVID data again. The number of cases per day appears to be leveling off. In the week before the week of the download the number of deaths per day hit 65 twice.