!pip install geopandas

import geopandas
import pandas as pd
import re

import matplotlib.pyplot as plt

import plotnine
from plotnine import *

Focus on Municipalities

For the latest round of Covid19 restrictions in the Madrid region, the focus has switched from basic health areas to municipalities.

This notebook explores the municipalities of the region and the current rates of infection.

The Madrid region has 179 municipalities, with populations ranging from 48 people in Madarcos to 3.3 million in Madrid capital (based on the 2019 population register). There are 10 municipalities with more than 100_000 inhabitants: Madrid, Móstoles, Alcalá de Henares, Fuenlabrada, Leganés, Getafe, Alcorcón, Torrejón de Ardoz, Parla, Alcobendas.

The regional authority provides weekly figures for Covid19 infections for 178 municipalities and the 21 districts of Madrid capital.

The health ministry has imposed new restrictions on municipalities with a population of over 100_000 that have an accumulated incidence rate for the last 14 days of over 500 per 100_000 inhabitants (and meet additional criteria).
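
As a minimal sketch of the two thresholds stated above, the check can be written as a small function. The function name and constants are illustrative, and the ministry's additional criteria are not modelled. The sample figures are the ones reported for 29 September 2020 in the tables later in this notebook.

```python
# Illustrative sketch only: covers the population and 14-day rate thresholds
# described above, not the health ministry's additional criteria.
POPULATION_THRESHOLD = 100_000   # inhabitants
RATE_THRESHOLD = 500             # cases per 100_000 inhabitants, last 14 days

def meets_stated_criteria(population, rate_14d):
    """Return True if both stated thresholds are exceeded."""
    return population > POPULATION_THRESHOLD and rate_14d > RATE_THRESHOLD

# (population, 14-day rate) as reported for 29 September 2020
examples = {
    'Alcalá de Henares': (195_649, 524.92),
    'Humanes de Madrid': (19_743, 1423.29),
}
for name, (population, rate) in examples.items():
    print(name, meets_stated_criteria(population, rate))
```

Humanes de Madrid has a far higher rate than Alcalá de Henares but falls below the population threshold, so these two criteria do not cover it.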

The 10 municipalities in the Madrid region with a population of over 100_000 had a rate of between 525 (Alcalá de Henares) and 1_166 (Fuenlabrada) at 29 September 2020. Madrid capital had a rate of 776.

72% of the population of the Madrid region live in these municipalities.

These municipalities now have Covid19 restrictions based on the criteria of the health ministry.

At the same date, an additional 19 municipalities had rates over 1_000, with a total population of 67_330: Humanes de Madrid, Cobeña, Villa del Prado, Cubas de la Sagra, Moraleja de Enmedio, Torrelaguna, Casarrubuelos, Villaconejos, Navalagamella, Fuentidueña de Tajo, Villar del Olmo, Orusco de Tajuña, El Berrueco, Rozas de Puerto Real, Canencia, Braojos, Cervera de Buitrago, El Atazar, Puebla de la Sierra.

The regional authority has imposed restrictions on 3 basic health areas outside of the 10 large municipalities that have restrictions: Humanes de Madrid, Reyes Católicos, Villa del Prado.

Read and prepare the data

df=pd.read_excel('./maps/Municipios/20codmun28.xls', header=2) 
# source https://www.ine.es/daco/daco42/codmun/codmunmapa.htm
df.head()
CPRO CMUN DC NOMBRE
0 28 1 4 Acebeda, La
1 28 2 9 Ajalvir
2 28 3 5 Alameda del Valle
3 28 4 0 Álamo, El
4 28 5 3 Alcalá de Henares
munis_df=pd.read_csv('./fastpages/covid19_tia_muni_y_distritos_s.csv', delimiter=';', encoding='latin')
# source http://datos.comunidad.madrid/catalogo/dataset/covid19_tia_muni_y_distritos
for col in ['tasa_incidencia_acumulada_activos_ultimos_14dias', 'tasa_incidencia_acumulada_ultimos_14dias', 'tasa_incidencia_acumulada_total']:
    munis_df[col] = pd.to_numeric(munis_df[col].apply(lambda x: re.sub(',', '.', x)))
munis_df.rename(columns={'tasa_incidencia_acumulada_ultimos_14dias':'tasa', 'casos_confirmados_ultimos_14dias':'casos', 'codigo_geometria':'codigo_geo'}, inplace=True)
munis_df.fecha_informe=munis_df.fecha_informe.apply(lambda x: x[5:10])
munis_df.casos_confirmados_activos_ultimos_14dias=munis_df.casos_confirmados_activos_ultimos_14dias.fillna(0).astype('int')
munis_df.casos=munis_df.casos.fillna(0).astype('int')
munis_df.casos_confirmados_totales=munis_df.casos_confirmados_totales.fillna(0).astype('int')
munis_df.codigo_geo=munis_df.codigo_geo.fillna(0).astype('int')
munis_df.head()
municipio_distrito fecha_informe casos_confirmados_activos_ultimos_14dias tasa_incidencia_acumulada_activos_ultimos_14dias casos tasa casos_confirmados_totales tasa_incidencia_acumulada_total codigo_geo
0 Madrid-Retiro 09/29 253 212.04 745 624.39 3940 3302.13 79603
1 Madrid-Salamanca 09/29 404 276.52 1120 766.58 4770 3264.80 79604
2 Madrid-Centro 09/29 347 257.33 1016 753.44 4820 3574.39 79601
3 Madrid-Arganzuela 09/29 310 201.56 1051 683.34 5293 3441.42 79602
4 Madrid-Chamartín 09/29 259 177.58 915 627.36 4543 3114.87 79605
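
The cleaning steps in the cell above can be illustrated on single values without pandas. This is only a sketch: the helper names are hypothetical, and the raw date string is an assumed example of the `YYYY/MM/DD hh:mm:ss` layout that the `x[5:10]` slice implies; the actual file may differ.

```python
def parse_decimal_comma(text):
    """Convert a number written with a decimal comma, e.g. '624,39', to a float."""
    return float(text.replace(',', '.'))

def month_day(report_date):
    """Keep characters 5 to 9 of the date string, i.e. the 'MM/DD' part."""
    return report_date[5:10]

print(parse_decimal_comma('624,39'))       # 624.39
print(month_day('2020/09/29 11:06:00'))    # 09/29 (assumed input layout)
```

Because the resulting `MM/DD` strings are zero-padded, plain string comparison such as `fecha_informe>'08/24'` orders the report dates correctly as long as they all fall within the same year.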
gdf=geopandas.read_file('./maps/Municipios/municipios_y_distritos_madrid.shp')
# source http://datos.comunidad.madrid/catalogo/dataset/covid19_tia_muni_y_distritos
gdf[gdf.pob_pad19==gdf.pob_pad19.min()] # municipality with smallest population
codigo_geo pob_pad19 municipio_ geometry
123 0783 48 Madarcos POLYGON ((451455.976 4545468.339, 451953.119 4...
print(gdf.loc[:20,].pob_pad19.sum(), 'population Madrid capital')
3266126 population Madrid capital
print (len(gdf),' municipalities/districts')
199  municipalities/districts
gdf.loc[21:,][gdf.loc[21:,].pob_pad19>100000].sort_values('pob_pad19',ascending=False)[['municipio_','pob_pad19']]
# 9 other municipalities with a population over 100 000
municipio_ pob_pad19
79 Móstoles 209184
178 Alcalá de Henares 195649
59 Fuenlabrada 193700
80 Leganés 189861
195 Getafe 183374
22 Alcorcón 170514
184 Torrejón de Ardoz 131376
86 Parla 130124
177 Alcobendas 117040
gdf.loc[:20].sort_values('pob_pad19',ascending=False)[['municipio_','pob_pad19']]
# population of the 21 districts of Madrid
municipio_ pob_pad19
10 Madrid-Carabanchel 253099
7 Madrid-Fuencarral-El Pardo 245939
9 Madrid-Latina 238218
12 Madrid-Puente de Vallecas 234857
16 Madrid-Ciudad Lineal 216284
17 Madrid-Hortaleza 188187
13 Madrid-San Blas - Canillejas 158142
5 Madrid-Tetuán 158023
3 Madrid-Arganzuela 153803
18 Madrid-Villaverde 148946
1 Madrid-Salamanca 146104
4 Madrid-Chamartín 145849
11 Madrid-Usera 139741
6 Madrid-Chamberí 139379
2 Madrid-Centro 134848
8 Madrid-Moncloa-Aravaca 119419
0 Madrid-Retiro 119317
19 Madrid-Villa de Vallecas 110365
15 Madrid-Moratalaz 94570
20 Madrid-Vicálvaro 72091
14 Madrid-Barajas 48945
munis_df['tasa'] = munis_df.tasa.fillna(0)  # assign the result back rather than filling in place on a column
# allocate to bins of accumulated incidence rate for the last 14 days
cut_labels=['<200','200-400','400-600','600-800','800-1000','>1000']
cut_bins = [-1., 200., 400., 600., 800., 1000., float('inf')]  # open-ended top bin, valid even if no rate exceeds 1000
munis_df['tasa_bin'] = pd.cut(munis_df.tasa, bins=cut_bins, labels=cut_labels)
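
For reference, `pd.cut` uses right-closed intervals by default, so a rate of exactly 200 falls in the `<200` bin. The following pure-Python sketch reproduces that rule with `bisect`; the function name `rate_bin` is hypothetical and it is not part of the notebook's pipeline.

```python
from bisect import bisect_left

CUT_LABELS = ['<200', '200-400', '400-600', '600-800', '800-1000', '>1000']
UPPER_EDGES = [200, 400, 600, 800, 1000]   # right-closed upper edges of the first five bins

def rate_bin(rate):
    """Return the label of the right-closed bin that contains `rate`."""
    return CUT_LABELS[bisect_left(UPPER_EDGES, rate)]

for rate in (0, 200, 624.39, 1000, 1185.83):
    print(rate, rate_bin(rate))
```

A rate of exactly 1000 is labelled `800-1000`; only values strictly above 1000 land in `>1000`.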

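gdf['restricted']='0'
gdf.loc[:20,'restricted']='1'  # rows 0-20: all 21 districts of Madrid capital are restricted
gdf.loc[gdf.pob_pad19>100000,'restricted']='1'  # the 9 other municipalities with over 100 000 inhabitants

munis_df['restricted']='0'
# relies on the rows of the latest report in munis_df sharing gdf's index labels
munis_df.loc[gdf[gdf.restricted=='1'].index,'restricted']='1'

# flag the same areas by name in every weekly report
for muni in gdf[gdf.restricted=='1'].municipio_:
  munis_df.loc[munis_df.municipio_distrito==muni, 'restricted']='1'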

gdf.head()
codigo_geo pob_pad19 municipio_ geometry restricted
0 079603 119317 Madrid-Retiro POLYGON ((443663.017 4473349.384, 443663.267 4... 1
1 079604 146104 Madrid-Salamanca POLYGON ((444067.804 4476571.218, 444057.220 4... 1
2 079601 134848 Madrid-Centro POLYGON ((439586.516 4475753.323, 439594.830 4... 1
3 079602 153803 Madrid-Arganzuela POLYGON ((440345.316 4472954.760, 440386.546 4... 1
4 079605 145849 Madrid-Chamartín POLYGON ((441493.281 4478894.285, 441494.562 4... 1
print (gdf[gdf.restricted=='1'].pob_pad19.sum(), 'people restricted in 10 municipalities,', 
       int(.5+100*gdf[gdf.restricted=='1'].pob_pad19.sum()/gdf.pob_pad19.sum()),'% of the population of the region')
4786948 people restricted in 10 municipalities, 72 % of the population of the region

# facet label helper: turn a 'MM/DD' report date into 'DD/MM'
def col_func(fecha): return fecha[3:5]+fecha[2]+fecha[0:2]
plotnine.options.figure_size = (14, 8)

ggplot(munis_df[munis_df.fecha_informe>'08/24'], aes(x='tasa_bin', fill='restricted')) \
+ geom_histogram(binwidth=1, alpha=0.6, position='stack') \
+ facet_wrap('fecha_informe', labeller=labeller(cols=col_func)) \
+ theme_minimal() \
+ labs(title="Municipalities/districts with restrictions: evolution over the last 6 weeks",
       x='Accumulated incidence rate, last 14 days',
       y="Number of municipalities / districts")
week_df=munis_df[munis_df.fecha_informe==munis_df.fecha_informe.unique().max()][['municipio_distrito','tasa','casos','tasa_bin','restricted']]
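# both frames carry a 'restricted' column, so the merge keeps them as restricted_x (from gdf) and restricted_y (from week_df)
gdf=gdf.merge(week_df,left_index=True, right_index=True)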
print('Restricted area in 10 municipalities')
week_df[week_df.restricted=='1'].sort_values('tasa', ascending=False)[['municipio_distrito',	'tasa',	'casos']]
Restricted area in 10 municipalities
municipio_distrito tasa casos
12 Madrid-Puente de Vallecas 1185.83 2785
59 Fuenlabrada 1165.72 2258
86 Parla 1155.82 1504
11 Madrid-Usera 1049.08 1466
18 Madrid-Villaverde 1005.73 1498
177 Alcobendas 979.15 1146
10 Madrid-Carabanchel 960.49 2431
19 Madrid-Villa de Vallecas 866.22 956
16 Madrid-Ciudad Lineal 814.21 1761
20 Madrid-Vicálvaro 808.70 583
13 Madrid-San Blas - Canillejas 808.13 1278
184 Torrejón de Ardoz 795.43 1045
195 Getafe 773.28 1418
9 Madrid-Latina 767.36 1828
1 Madrid-Salamanca 766.58 1120
22 Alcorcón 753.60 1285
2 Madrid-Centro 753.44 1016
5 Madrid-Tetuán 727.74 1150
80 Leganés 724.74 1376
15 Madrid-Moratalaz 707.41 669
3 Madrid-Arganzuela 683.34 1051
79 Móstoles 629.11 1316
4 Madrid-Chamartín 627.36 915
0 Madrid-Retiro 624.39 745
14 Madrid-Barajas 602.72 295
8 Madrid-Moncloa-Aravaca 584.50 698
7 Madrid-Fuencarral-El Pardo 561.52 1381
6 Madrid-Chamberí 558.19 778
178 Alcalá de Henares 524.92 1027
17 Madrid-Hortaleza 494.72 931
gdf['unrestricted']='0'
gdf.loc[week_df[20:][(week_df[20:].tasa>1000) & (week_df[20:].restricted=='0')].municipio_distrito.index, 'unrestricted']='1'

print((100_000*munis_df.loc[:20].casos.sum()/gdf.loc[:20].pob_pad19.sum()+.5).astype('int'),'rate for Madrid capital')
776 rate for Madrid capital
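
The rate printed above follows directly from the definition of accumulated incidence: cases in the last 14 days per 100_000 inhabitants. A small sketch, with a hypothetical helper name, using figures from the tables in this notebook:

```python
def incidence_rate(cases, population, per=100_000):
    """Cases per `per` inhabitants."""
    return per * cases / population

# Madrid-Retiro: 745 cases in the last 14 days, population 119_317
print(round(incidence_rate(745, 119_317), 2))      # 624.39

# Madrid capital: 25_335 is the sum of the 'casos' column over the 21 districts listed above
print(round(incidence_rate(25_335, 3_266_126)))    # 776
```

The first result matches the `tasa` reported for Madrid-Retiro, which suggests the published rates use the same population register as the shapefile.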
print(len(week_df[20:][(week_df[20:].tasa>1000) & (week_df[20:].restricted=='0')]),'other municipalities with a rate over 1000')
week_df[20:][(week_df[20:].tasa>1000) & (week_df[20:].restricted=='0')].sort_values('tasa', ascending=False)[['municipio_distrito',	'tasa',	'casos']]
19 other municipalities with a rate over 1000
municipio_distrito tasa casos
81 Braojos 2439.02 5
165 Canencia 1789.71 8
148 Rozas de Puerto Real 1698.11 9
29 Puebla de la Sierra 1538.46 0
152 Villa del Prado 1518.40 99
87 Fuentidueña de Tajo 1509.99 31
103 El Berrueco 1447.37 11
115 Humanes de Madrid 1423.29 281
172 Navalagamella 1341.00 35
194 Cervera de Buitrago 1333.33 0
157 Torrelaguna 1323.53 63
197 Villaconejos 1180.64 40
132 Casarrubuelos 1164.64 44
82 Cubas de la Sagra 1138.31 73
192 Orusco de Tajuña 1125.40 14
159 El Atazar 1123.60 0
26 Moraleja de Enmedio 1070.87 55
105 Villar del Olmo 1044.26 21
100 Cobeña 1036.62 77
print('total population of these municipalities is only',gdf.iloc[week_df[20:][(week_df[20:].tasa>1000) & (week_df[20:].restricted=='0')].municipio_distrito.index].pob_pad19.sum())
total population of these municipalities is only 67330
gdf.iloc[week_df[20:][(week_df[20:].tasa>1000) & (week_df[20:].restricted=='0')].municipio_distrito.index].sort_values('pob_pad19', ascending=False)[['municipio_','pob_pad19']]
municipio_ pob_pad19
115 Humanes de Madrid 19743
100 Cobeña 7428
152 Villa del Prado 6520
82 Cubas de la Sagra 6413
26 Moraleja de Enmedio 5136
157 Torrelaguna 4760
132 Casarrubuelos 3778
197 Villaconejos 3388
172 Navalagamella 2610
87 Fuentidueña de Tajo 2053
105 Villar del Olmo 2011
192 Orusco de Tajuña 1244
103 El Berrueco 760
148 Rozas de Puerto Real 530
165 Canencia 447
81 Braojos 205
194 Cervera de Buitrago 150
159 El Atazar 89
29 Puebla de la Sierra 65

Basic Health Areas

zones_df=pd.read_csv('./fastpages/covid19_tia_zonas_basicas_salud_s.csv', delimiter=';', encoding='latin')
# source: http://datos.comunidad.madrid/catalogo/dataset/covid19_tia_zonas_basicas_salud
for col in ['tasa_incidencia_acumulada_activos_ultimos_14dias', 'tasa_incidencia_acumulada_ultimos_14dias', 'tasa_incidencia_acumulada_total']:
    zones_df[col] = pd.to_numeric(zones_df[col].apply(lambda x: re.sub(',', '.', x)))
zones_df.rename(columns={'zona_basica_salud':'zona_basic','tasa_incidencia_acumulada_ultimos_14dias':'tasa', 'casos_confirmados_ultimos_14dias':'casos'}, inplace=True)
zones_df.fecha_informe=zones_df.fecha_informe.apply(lambda x: x[5:10])
print(len(zones_df[(zones_df.tasa>1000) & (zones_df.fecha_informe=='09/29')]),'basic health areas with a rate of over 1 000 per 100 000 population in the last 14 days')
zones_df[(zones_df.tasa>1000) & (zones_df.fecha_informe=='09/29')].sort_values('tasa', ascending=False)[['zona_basic','tasa']]
49 basic health areas with a rate of over 1 000 per 100 000 population in the last 14 days
zona_basic tasa
10 Alicante 1656.24
62 Cuzco 1640.59
215 Puerta Bonita 1525.38
210 Pozo de Tío Raimundo 1486.94
202 Peña Prieta 1483.68
85 Entrevías 1447.75
232 San Blas 1386.96
233 San Cristóbal 1360.38
111 Humanes de Madrid 1350.54
234 San Diego 1340.20
71 Doctor Trueta 1339.11
274 Villa del Prado 1335.76
7 Alcobendas - Chopera 1335.12
231 San Andrés 1326.57
132 Las Calesas 1290.74
206 Pintores 1283.70
161 Martínez de la Riva 1281.28
171 Miraflores 1277.18
285 Zofío 1274.35
115 Isabel II 1271.29
219 Rafael Alberti 1270.91
273 Villa Vallecas 1267.03
135 Las Margaritas 1244.43
224 Reyes Católicos 1202.16
193 Panaderas 1151.16
11 Almendrales 1148.20
169 Miguel Servet 1146.48
121 La Elipa 1137.61
59 Comillas 1128.86
250 Sierra de Guadarrama 1123.82
78 El Naranjo 1118.60
150 Los Rosales 1116.31
237 San Isidro 1114.40
255 Torrelaguna 1107.82
51 Ciudad San Pablo 1101.67
15 Ángela Uriarte 1084.83
268 Valleaguado 1081.95
134 Las Fronteras 1081.33
92 Francia 1078.18
67 Doctor Cirajas 1077.14
38 Campo de la Paloma 1073.69
40 Canillejas 1071.17
185 Numancia 1066.68
97 Gandhi 1065.68
284 Vista Alegre 1060.58
271 Vicálvaro - Artilleros 1030.43
6 Alcalá de Guadaira 1028.15
16 Antonio Leyva 1023.47
162 María Curie 1016.51
zones_df['tasa'] = zones_df.tasa.fillna(0)  # assign the result back rather than filling in place on a column
# allocates rates to bins
cut_labels=['<200','200-400','400-600','600-800','800-1000','>1000']
cut_bins = [-1., 200., 400., 600., 800., 1000., float('inf')]  # open-ended top bin, valid even if no rate exceeds 1000
zones_df['tasa_bin'] = pd.cut(zones_df.tasa, bins=cut_bins, labels=cut_labels)

week_df=zones_df[zones_df.fecha_informe==zones_df.fecha_informe.unique().max()][['zona_basic','tasa','casos','tasa_bin']]
df=(geopandas.read_file('./maps/zonas_basicas_salud/zonas_basicas_salud.shp')).merge(week_df)

# basic health areas with restrictions
df['restricted']='0'
for zone in ['Humanes de Madrid','Reyes Católicos','Villa del Prado']:
  df.loc[df.zona_basic==zone, 'restricted']='1'
df['>1000']=df.tasa.apply(lambda x: 1 if x>1000 else 0)

Maps of municipalities and basic health areas

fig, (ax1, ax2, ax3) = plt.subplots(ncols=3, sharex=True, sharey=True, figsize=(18,9))

gdf.plot(ax=ax1, column=gdf.tasa_bin, cmap='Reds', legend=True)
gdf.plot(ax=ax1, color='white', edgecolor='grey', alpha=0.1)
ax1.set_title('Rate of accumulated incidence\n by municipality')
ax1.axis('off')

gdf.plot(ax=ax2, column=gdf.restricted_x, cmap='Reds', legend=False)
gdf.plot(ax=ax2, color='white', edgecolor='grey', alpha=0.1)
ax2.set_title('Restricted municipalities\npopulation >100 000\nrate of over 500 per 100 000')
ax2.axis('off');

gdf.plot(ax=ax3, column=gdf.unrestricted, cmap='Reds', legend=False)
gdf.plot(ax=ax3, color='white', edgecolor='grey', alpha=0.1)
ax3.set_title('Municipalities with a rate > 1000\npopulation < 100 000')
ax3.axis('off');

fig, (ax1, ax2, ax3) = plt.subplots(ncols=3, sharex=True, sharey=True, figsize=(18,9))

df.plot(ax=ax1, column=df.tasa_bin, cmap='Reds', legend=True)
df.plot(ax=ax1, color='white', edgecolor='grey', alpha=0.1)
ax1.set_title('Rate of accumulated incidence\n by basic health area')
ax1.axis('off')

df.plot(ax=ax2, column=df['>1000'], cmap='Reds', alpha=1, legend=False)
df.plot(ax=ax2, color='white', edgecolor='grey', alpha=0.1)
ax2.set_title('Basic health areas with a rate > 1000')
ax2.axis('off');

df.plot(ax=ax3, column=df.restricted, cmap='Reds', alpha=1, legend=False)
gdf[gdf.restricted_x=='1'].plot(ax=ax3, color='red')
df.plot(ax=ax3, color='white', edgecolor='grey', alpha=0.1)
ax3.set_title('Restricted basic health areas\n(rate > 1 000 per 100 000)\nand restricted 10 large municipalities')
ax3.axis('off');

Closing Remarks

Colour-coded maps of infection rates, whether for municipalities or basic health areas, give a misleading impression because the population of the Madrid region is concentrated in the capital and the surrounding densely populated municipalities. Covid19 affects people, but people are not uniformly distributed across the region.
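
A small sketch with four rows taken from the tables above illustrates the point: ranking areas by rate and ranking them by number of cases give very different pictures. The variable and function names are illustrative.

```python
# (area, population, cases in the last 14 days) taken from the tables above
areas = [
    ('Braojos', 205, 5),
    ('Canencia', 447, 8),
    ('Madrid-Puente de Vallecas', 234_857, 2785),
    ('Fuenlabrada', 193_700, 2258),
]

def rate(population, cases):
    """Cases per 100_000 inhabitants."""
    return 100_000 * cases / population

by_rate = sorted(areas, key=lambda a: rate(a[1], a[2]), reverse=True)
by_cases = sorted(areas, key=lambda a: a[2], reverse=True)

print('highest rate:', by_rate[0][0])    # Braojos
print('most cases:  ', by_cases[0][0])   # Madrid-Puente de Vallecas
```

Braojos tops the rate ranking with 5 cases among 205 residents, while Madrid-Puente de Vallecas, with 2785 cases, occupies a far smaller patch of the map.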

Basic health areas, which are drawn up according to population and access to local health centres, are more homogeneous than municipalities, which vary enormously in population.