Our business is helping companies turn data into insights. And we can’t help but wonder: what does the data show when it comes to where coronavirus is most prevalent?
To attempt to answer that (admittedly highly complex) question, we ran a simple regression analysis of COVID-19 cases per population against 3,000 demographic variables.
Before we share our findings, a few notes about our methodology.
- We sourced COVID-19 case counts from the USAFacts.org 4/23/2020 update. USAFacts provides government data, drawing from over 70 government agencies.
- Our demographic data was sourced from STI (Synergos Technologies Inc.) PopStats.
- All data was summarized at a county level and COVID-19 cases were normalized by population. Thus, we are not looking at the number of cases of COVID-19, but rather the number of cases per population. Otherwise, highly populated counties would dominate the results.
- Our findings show that there is a relationship between the demographic variables and COVID-19 cases per population, but no attempt was made to control for other factors or prove causation. In other words, a strong relationship means that the variables are likely to be related, but it does not necessarily mean that a certain group is more likely to contract COVID-19.
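For readers who want to see the mechanics, the normalization and one-variable-at-a-time comparison described above can be sketched roughly as follows. This is a minimal illustration, not our production workflow: the county records, the field names (`density`, `pct_over_55`) and the per-100,000 scaling are all assumptions made for the example, and the numbers are invented. It uses the Pearson correlation coefficient as a stand-in for a simple regression, since with a single predictor the regression's R-squared is just the square of that coefficient.

```python
from math import sqrt

# Hypothetical county records. Every value is made up for illustration
# and does not come from USAFacts or STI PopStats.
counties = [
    {"name": "A", "population": 1_000_000, "cases": 5_000, "density": 8_000.0, "pct_over_55": 22.0},
    {"name": "B", "population": 250_000, "cases": 500, "density": 1_200.0, "pct_over_55": 27.0},
    {"name": "C", "population": 50_000, "cases": 40, "density": 90.0, "pct_over_55": 33.0},
    {"name": "D", "population": 600_000, "cases": 1_800, "density": 3_500.0, "pct_over_55": 25.0},
    {"name": "E", "population": 20_000, "cases": 10, "density": 30.0, "pct_over_55": 36.0},
]


def cases_per_100k(cases, population):
    """Normalize a raw case count so large counties do not dominate."""
    return cases / population * 100_000


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Outcome: cases per 100,000 residents in each county.
rates = [cases_per_100k(c["cases"], c["population"]) for c in counties]

# Correlate each (assumed) demographic variable with the outcome on its own.
variables = ["density", "pct_over_55"]
results = {v: pearson_r([c[v] for c in counties], rates) for v in variables}

# Rank variables by strength of relationship, regardless of direction.
ranked = sorted(results.items(), key=lambda kv: abs(kv[1]), reverse=True)
for name, r in ranked:
    print(f"{name}: r = {r:+.3f}, R^2 = {r * r:.3f}")
```

A positive coefficient means the variable rises with cases per population and a negative one means it falls. Because each variable is examined in isolation, nothing is controlled for, which is why a ranking like this can suggest relationships but cannot establish causation.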
Where (and With Whom) COVID-19 is More Prevalent
Now for the data. Our results show that several variables correlate strongly with COVID-19 cases per population. These variables include:
Urbanized populations
Higher population density is associated with more COVID-19 cases per population, as are other urban indicators such as fewer vehicles, more renters and shorter commute times.
Group quarters and institutionalized populations
Areas with higher numbers of people living together show higher cases of COVID-19. This includes elder care facilities, correctional institutions, and group homes. Even counties with larger household sizes show higher rates of coronavirus cases compared with counties with smaller household sizes.
Ancestry
Several ancestries correlate highly with COVID-19 cases. These include Dominican, Puerto Rican, Central and South American, Italian, Middle Eastern, Chinese and Korean. It is unclear how much of this relationship results from the prevalence of these groups in large, dense metropolitan areas like New York, and how much results from these groups being affected earlier because community members traveled to places of early exposure such as China, Italy and South Korea.
Race/Ethnicity
Race and ethnicity have a moderate correlation with COVID-19 cases, as Black and Asian populations show a higher rate of COVID-19 compared to White and Hispanic populations.
Working-age Adults
Areas with higher COVID-19 cases tend to have a larger share of residents aged 30-55. Despite the potential increased risk of COVID-19-related death with increased age, the data shows that counties with a larger population over 55 are likely to have lower COVID-19 cases.
Income and Education
COVID-19 rates are higher in counties with higher education and income. It is unclear how much differences in socioeconomic factors and lifestyle affect the likelihood of COVID-19 exposure, but this result may be at least partially explained by population density. Dense urban markets often have a higher percentage of high-income and highly educated residents, as well as a higher cost of living, than smaller and more rural markets. It is also unclear whether these urban, higher-income markets have better access to COVID-19 testing, and whether lower-income areas show lower rates because of less testing.
Occupation
Perhaps not surprisingly, the amount of human contact in one’s employment is related to exposure. Lower COVID-19 cases are found in areas with higher percentages of agriculture, fishing, forestry and mining jobs, as well as transportation and construction workers. White collar professionals, sales personnel and personal care jobs show higher rates. Interestingly, the number or percent of health care workers did not correlate highly with COVID-19 cases.
Unemployment Claims
Areas with higher COVID-19 cases are more likely to have higher unemployment claims. In our upcoming blog on unemployment rates, we’ll show how the areas hit hardest by both COVID-19 and unemployment claims are those that rely heavily on tourism and service jobs, such as restaurants, hotels, and entertainment.
Get Reliable Data
As most of us know, news providers report stories, not just facts. It can be hard to understand what is actually happening with the coronavirus pandemic and which sources can be trusted. With limited testing and new data published daily, it can also be difficult to keep up with the changing landscape and plan for the future.
The SiteSeer team strives to keep people informed. We hope this information is useful and helps you understand what is happening in different areas of the country a little better. Knowledge is power when it comes to market planning, expanding the smart way and site selection. If we can help you understand the “new” future better, contact us to learn more about our powerful site selection software and professional services for retailers, economic developers, real estate professionals and other chain businesses.